Senior Performance Engineer - LLM Inference Frameworks

Topjobstoday

Job Summary

  • NVIDIA is hiring exceptional software engineers to build and optimize core inference infrastructure for large language models.
  • Your work will directly shape the frameworks behind state-of-the-art LLM inference used across NVIDIA and the AI community.
  • Join us to redefine what 'fast' means for LLM inference and power the next generation of generative AI at scale.

Matching Summary

Match Score: 75

NVIDIA is seeking a Senior Performance Engineer to enhance and optimize the inference infrastructure for large language models within its TensorRT-LLM team. The ideal candidate will possess extensive software engineering experience, particularly in Python, and have a strong background in deep learning frameworks and performance optimization.

Skills & Requirements

Must-have

  • Python programming skills
  • Experience with deep learning frameworks
  • Performance profiling and debugging skills

Nice-to-have

  • Contributions to inference frameworks
  • Expertise in performance modeling
  • Hands-on experience with NVIDIA tools

Key Requirements

  • Bachelor's or higher degree in a relevant field
  • 5+ years of relevant software development experience
  • Excellent communication skills in English

Work Rights

Not specified
