NVIDIA is seeking a Senior Software Engineer specializing in AI Inference to enhance open-source LLM serving through contributions to upstream inference engines. The role requires hands-on expertise in performance optimization and a strong foundation in software engineering, particularly involving LLM inference stacks and high-performance computing.
Job Summary
Advance open-source LLM serving by contributing directly to upstream inference engines like vLLM and SGLang.
Implement and optimize inference-runtime capabilities to improve throughput and tail latency.
Collaborate with model, platform, and SRE teams to translate production requirements into upstreamable solutions.
Matching Summary
Match Score: 85