Senior Deep Learning Architect, LLM Inference

NVIDIA

Job Summary

  • The role focuses on workload characterization of the latest Large Language Models and inference servers to ensure NVIDIA maintains its leadership position.
  • Candidates will collaborate with engineers from AI startups to establish standard benchmarking methodologies and contribute to deep learning software projects like PyTorch and vLLM.
  • The position offers a competitive base salary range of $184,000 to $356,500 USD depending on level, along with equity and benefits.

Matching Summary

The role focuses on workload characterization of the latest Large Language Models and inference servers to ensure NVIDIA maintains its leadership position.

Salary

  • Base: $184,000 - $287,500 (Level 4) or $224,000 - $356,500 (Level 5)
  • Bonus/Equity: Eligible for equity
  • Benefits: Comprehensive benefits package included

Skills & Requirements

Must-have

  • Deep learning inference serving expertise
  • PyTorch programming and compiler optimizations
  • CPU and GPU microarchitecture knowledge
  • Experience with the vLLM, SGLang, and TRT-LLM frameworks
  • Client-server LLM application development

Nice-to-have

  • Novel use cases for agentic AI tools
  • Database and visualization tool experience
  • Proactive, independent approach to performance
  • Strong written and verbal communication skills

Key Requirements

  • Master's or PhD in Computer Science or equivalent
  • 6+ years of relevant software development experience
  • Solid understanding of complex software projects like compilers

Work Rights

Not specified
