Software Engineer, AI Inference Systems - New College Graduate 2026

Topjobstoday

Job Summary

  • Contribute features to vLLM that let the newest models take advantage of the latest NVIDIA GPU hardware features; profile and optimize the inference framework (vLLM) with methods such as speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
  • Develop, optimize, and benchmark GPU kernels (hand-tuned and compiler-generated) using techniques such as fusion, autotuning, and memory/layout optimization; build and extend high-level DSLs and compiler infrastructure to boost kernel developer productivity while approaching peak hardware utilization.
  • Architect the scheduling and orchestration of containerized large-scale inference deployments on GPU clusters across clouds.

Salary

Base: 108,000 USD - 195,500 USD, plus equity and benefits

Skills & Requirements

Must-have

  • high-performance inference stacks
  • GPU kernel and compiler optimization
  • multi-GPU, multi-node, multi-cloud scaling
  • Python and C/C++ programming
  • deep learning theory
  • GPU programming and performance
  • profiling and debugging tools

Nice-to-have

  • Go or Rust experience
  • ML compilers and DSLs
  • containerization/virtualization technologies
  • cloud platforms experience
  • open-source contributions
  • published research papers

Key Requirements

  • Bachelor's, Master's, or PhD degree in Computer Science, Computer Engineering, or Software Engineering
  • Strong CS fundamentals
  • Experience with containers and orchestration
  • Excellent debugging, problem-solving, and communication skills

Work Rights

Not specified
