Job Summary
This role leads applied research and strategic definition of machine learning algorithms and quantization methodologies for the Neural Network Development Kit (NDK) roadmap.
The successful candidate will collaborate with cross-functional teams to prototype, benchmark, and validate toolchain concepts against constrained hardware targets.
Key responsibilities include staying at the forefront of the Edge AI research community to ensure the NDK roadmap reflects state-of-the-art model efficiency and compression techniques.
Matching Summary
Match Score: 85
Skills & Requirements
Must-have
8+ years applied machine learning engineering experience
5+ years model optimization and ML compiler development
Expertise in PTQ, QAT, pruning, and knowledge distillation
Hands-on experience with PyTorch and TensorFlow / TensorFlow Lite
Familiarity with MLIR, TVM, or ONNX Runtime frameworks
Ability to translate hardware constraints into toolchain requirements
Nice-to-have
PhD in machine learning, optimization, or computer architecture
Experience with RISC-V ISA and software ecosystem
Background in FPGA-based or simulator-based prototyping
Track record of translating academic advances to product roadmaps
Proficiency with AI-assisted productivity tools
Key Requirements
Bachelor's or Master's degree in Computer Science or related field; PhD preferred
8+ years of experience in applied machine learning engineering
At least 5 years focused on model optimization or on-device inference toolchains
Proven expertise in quantization and model compression on resource-constrained hardware