Manager, Large Language Model Inference

NVIDIA

Multiple Locations
Hybrid

Job Summary

  • NVIDIA is accelerating the AI revolution by developing the industry's fastest and most efficient inference platform for deep learning models on NVIDIA GPUs.
  • This role involves hands-on leadership to architect and guide a team building core LLM inference runtime software, collaborating closely with researchers and GPU architects.
  • The company offers competitive base salaries, equity, and benefits, and fosters a diverse and inclusive work environment with opportunities for professional growth.

Salary

Base: 184,000 USD - 287,500 USD for Level 2; Base: 224,000 USD - 356,500 USD for Level 3; Bonus/Equity: Eligible; Benefits: Eligible

Skills & Requirements

Must-have

  • LLM inference software development
  • C++ or Python programming
  • Technical leadership experience
  • Production-quality software delivery
  • Cross-functional team coordination
  • Kernel development and runtime optimizations

Nice-to-have

  • GPU architecture and CUDA programming
  • System-level performance tuning
  • Experience with TensorRT-LLM or vLLM
  • Building scalable, user-friendly APIs
  • Empowering and growing engineering teams

Key Requirements

  • MS, PhD, or equivalent experience in Computer Science or related field
  • 7+ years software engineering experience
  • 3+ years technical leadership experience
  • Expertise in large language models or vision language models

Work Rights

Not specified
