Senior High-Performance LLM Training Engineer

NVIDIA

CA, United States
Hybrid
LLM training workloads
PyTorch and JAX optimization
GPU architecture fundamentals

Job Summary

  • This position focuses on optimizing NVIDIA’s high-performance LLM software stack in frameworks like PyTorch and JAX for high-performance training on thousands of GPUs, while also helping shape hardware roadmaps for the next generation of GPUs powering the AI revolution.
  • Implement production-quality software in multiple layers of NVIDIA's deep learning platform stack, from drivers to DL frameworks.
  • If you're excited to work across the full hardware & software stack—from GPU architecture to application code—to achieve optimal performance, we want to hear from you!

Salary

Base: 184,000 USD - 356,500 USD; Bonus/Equity: Equity; Benefits: Comprehensive benefits package

Skills & Requirements

Must-have

  • LLM training workloads
  • PyTorch and JAX optimization
  • GPU architecture fundamentals
  • CUDA programming
  • MLPerf Training benchmark suite

Nice-to-have

  • Shaping hardware roadmaps
  • Creative and autonomous work environment
  • Collaboration with forward-thinking people

Key Requirements

  • PhD or MS degree
  • 5+ years of experience
  • 8+ years of experience
  • Proficiency in C++, Python, and CUDA

Work Rights

Not specified
