Hyphen Connect is seeking an AI Specialist Engineer to improve large language and vision models for on-device inference, focusing on performance optimization across various hardware architectures. The ideal candidate will have expertise in model distillation, pruning, and quantization techniques, along with strong programming skills in C++ and Python.
Job Summary
The role focuses on enhancing the performance of large language and vision models specifically for on-device inference.
Candidates will develop pipelines for model distillation and handle hardware-specific compilation tasks.
Performance benchmarking across various NPU and GPU architectures is a core responsibility of this position.