Senior System Software Engineer - GPU Performance

NVIDIA Corporation

Job Summary

  • The role involves conducting in-depth performance characterization on large multi-GPU and multi-node clusters to influence the roadmap of communication libraries.
  • Candidates will collaborate with a dynamic team across multiple time zones to root-cause performance issues reported by customers.
  • Base salary ranges from $152,000 to $287,500 USD depending on level, plus equity and benefits.

Salary

  • Base: $152,000 - $241,500 (Level 3) or $184,000 - $287,500 (Level 4)
  • Bonus/Equity: Eligible for equity
  • Benefits: Comprehensive benefits package included

Skills & Requirements

Must-have

  • M.S. or PhD in Computer Science
  • 3+ years parallel programming experience
  • Experience with MPI, NCCL, UCX, NVSHMEM
  • Performance benchmarking on large HPC clusters
  • C/C++ micro-benchmark implementation
  • Python scripting proficiency

Nice-to-have

  • InfiniBand/Ethernet RDMA debugging experience
  • CUDA programming knowledge
  • Experience with deep learning frameworks such as PyTorch
  • Kubernetes and SLURM familiarity
  • Cross-timezone collaboration skills

Key Requirements

  • Master's degree or equivalent experience
  • 3+ years in parallel programming
  • Systems software fundamentals knowledge

Work Rights

Not specified