Job Summary
Architect and implement high-impact compiler subsystems including IR design, lowering pipelines, operator fusion, scheduling, and target code generation for our NPU.
Implement and ship production-grade codegen and runtime integrations that are stable, debuggable, and performant.
Mentor junior and senior engineers, lead design reviews, and raise the team’s technical bar.
Skills & Requirements
Must-have
ML compilers and toolchains
graph lowering, IR design, operator fusion
production-grade codegen and runtime integrations
low-level optimizations and kernel teams
compiler CI and performance regression suites
troubleshoot complex performance issues
mentor senior/junior engineers
Nice-to-have
influence product and silicon roadmap
technical engagements with customers and partners
Key Requirements
Deep expertise in ML compilers and toolchains
Proven experience with graph lowering, IR design, operator fusion, scheduling, and target codegen for accelerators
Strong systems programming skills: expert C++ and solid Python
Demonstrated track record of shipping production compiler components
Excellent debugging and profiling skills
Strong communication skills and cross-functional experience
Passion for mentorship and improving engineering practices