Principal Software Engineer - AI Inference

NVIDIA

Multiple Locations

Job Summary

  • This role involves driving upstream-first engineering for open-source inference engines like vLLM and SGLang to ensure outstanding performance on NVIDIA GPUs.
  • You will optimize inference runtime features for efficiency, latency, and scalability while collaborating closely with internal teams and the broader community.
  • The position offers a competitive base salary range, equity, and benefits, and emphasizes NVIDIA's commitment to diversity and equal opportunity employment.

Salary

Base: 272,000 USD - 431,250 USD; Bonus/Equity: Eligible for equity; Benefits: Eligible for benefits

Skills & Requirements

Must-have

  • LLM inference and serving systems
  • GPU performance engineering
  • Distributed systems and concurrency
  • Programming in Rust, C++, Python, CUDA
  • Profiling and performance optimization
  • Multi-GPU and multi-node inference
  • Upstream open-source contribution

Nice-to-have

  • Mentoring senior engineers
  • Building benchmarking and regression infrastructure
  • Collaboration across teams
  • Open-source maintainer experience
  • Influencing technical forums

Key Requirements

  • 15+ years of production software experience
  • Expertise in LLM inference/serving systems
  • Strong programming and optimization skills
  • Experience with GPU profiling tools
  • BS/MS in Computer Science or related field

Work Rights

Not specified
