Hyphen Partners is seeking an AI Specialist Engineer to enhance the performance of large language and vision models for on-device inference. The role focuses on optimizing and deploying AI solutions, requiring expertise in model distillation, hardware-specific compilation, and strong programming skills.
Job Summary
The role focuses on improving the performance of large language and vision models for on-device inference.
The engineer will build model distillation pipelines and handle hardware-specific compilation.
The position requires benchmarking model performance across a range of NPU and GPU architectures.
Skills & Requirements
Must-have
Model distillation and pruning techniques
4-bit/8-bit quantization expertise
TensorRT and ONNX Runtime experience
Edge deployment and NPU/GPU benchmarking
Strong C++ and Python programming skills
Nice-to-have
Hardware architecture optimization knowledge
Cutting-edge AI solution development
Diverse hardware architecture familiarity
Key Requirements
Expertise in model compression and quantization
Hands-on experience with TensorRT and ONNX Runtime