Senior GenAI Algorithms Engineer - Model Optimizations for Inference

NVIDIA

Job Summary

  • Design, implement, and productionize model optimization algorithms for inference and deployment on NVIDIA’s latest hardware platforms.
  • Your work will span multiple layers of the AI software stack—ranging from algorithm design to integration—within NVIDIA’s ecosystem and open-source frameworks.
  • This role offers a unique opportunity to work at the intersection of research and engineering, pushing the boundaries of large-scale AI optimization.

Salary

  • Base (Level 3): 152,000 USD - 218,500 USD
  • Base (Level 4): 184,000 USD - 287,500 USD
  • Bonus/Equity: Equity
  • Benefits: Comprehensive benefits package

Skills & Requirements

Must-have

  • Generative AI model optimization
  • LLM and diffusion model optimization
  • Quantization, speculative decoding, sparsity
  • Software-hardware co-design
  • Deep learning optimization algorithms
  • GPU-level optimization
  • CUDA and Triton kernel development

Nice-to-have

  • Contributions to ML frameworks
  • Large-scale GPU cluster experience
  • NVIDIA deep learning SDKs
  • High-performance GPU kernels

Key Requirements

  • Master’s, PhD, or equivalent experience
  • 5+ years of deep learning experience
  • Strong software design skills
  • Proficiency in Python, PyTorch
  • Proven foundation in algorithms and programming

Work Rights

Not specified
