Senior Deep Learning Architect, LLM Inference

NVIDIA

CA, United States

Job Summary

  • NVIDIA is at the forefront of the generative AI revolution, with this role focused on inference server performance optimization for Large Language Models.
  • The role involves workload characterization, benchmarking, content creation, and collaboration with AI startups to establish standard benchmarking methodologies.
  • Employees benefit from equity, competitive base salaries, and a diverse and inclusive work environment committed to equal opportunity.

Salary

Base: 184,000 USD - 287,500 USD (Level 4), 224,000 USD - 356,500 USD (Level 5); Bonus/Equity: Eligible for equity; Benefits: Not specified

Skills & Requirements

Must-have

  • Deep learning inference serving
  • PyTorch programming
  • GPU hardware and software performance
  • LLM client-server application development
  • CPU and GPU microarchitecture knowledge
  • AI coding agents proficiency

Nice-to-have

  • Experience with databases and visualization tools
  • Collaborative cross-team communication
  • Proactive and independent work approach
  • Use of latest coding agents and inference technology

Key Requirements

  • Master's or PhD in Computer Science or related fields
  • 6+ years relevant software development experience
  • Experience with OpenAI API or MCP for LLM applications
  • Strong written and verbal communication skills

Work Rights

Not specified
