AI Inference Performance Engineer - New College Grad 2026

NVIDIA Corporation

NVIDIA Corporation is seeking a new college graduate for the role of AI Inference Performance Engineer. The role focuses on optimizing and benchmarking GenAI inference on NVIDIA's latest accelerators, and combines technical leadership, cross-team collaboration, and contributions to open-source projects. It offers a competitive salary and benefits.

Job Summary

  • We optimize and benchmark GenAI inference on NVIDIA's latest accelerators, defining the industry’s performance standards across language models, video generation, and speech workloads.
  • Drive industry benchmark results: own the end-to-end optimization pipeline, and implement and integrate optimizations in quantization, scheduling, memory management, and distributed inference across TensorRT-LLM, SGLang, and vLLM.
  • Partner with architecture, kernel, and compiler teams to shape GPU roadmaps based on real workload data.

Matching Summary

Match Score: 75

Salary

  • Base (Level 2): 124,000 USD - 195,500 USD
  • Base (Level 3): 152,000 USD - 241,500 USD
  • Bonus/Equity: Equity
  • Benefits: Benefits

Skills & Requirements

Must-have

  • Optimize GenAI inference on NVIDIA accelerators
  • End-to-end optimization pipeline
  • Define and optimize cutting-edge workloads
  • Architect distributed inference
  • Establish performance methodology
  • Contribute to open-source projects

Nice-to-have

  • Push performance to its extreme
  • Shape GPU roadmaps
  • Raise the technical bar for the team

Key Requirements

  • BS, MS, or PhD in CS, CE, EE, or equivalent
  • 2+ years of relevant software development experience
  • Strong Python or C++ programming skills
  • Expertise with a DL framework (PyTorch or JAX)
  • Proven track record of performance improvements

Work Rights

Not specified
