Senior Systems Engineer – High-Performance AI and Networking Applications

NVIDIA

Multiple Locations
Base: 184,000 USD - 287,500 USD for Level 4; Base:...
High-speed networking technologies
AI/HPC infrastructure experience
Performance benchmarking on NVLink and InfiniBand
Join the NVIDIA Deep Learning Frameworks Infrastructure team as a Senior Systems Engineer focusing on High-Performance AI & Networking Applications, committed to ground-breaking AI & Networking Solutions

Job Summary

  • Join the NVIDIA Deep Learning Frameworks Infrastructure team as a Senior Systems Engineer focusing on High-Performance AI & Networking Applications, committed to ground-breaking AI & Networking Solutions.
  • Collaborate with networking teams to plan, implement, and evaluate performance benchmarks on NVLink-, NVSwitch-, and InfiniBand-powered infrastructure, while acting as a primary resource for fixing networking and hardware integration issues.
  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions, with eligibility for equity and benefits.

Salary

Base: 184,000 USD - 287,500 USD for Level 4; Base: 224,000 USD - 356,500 USD for Level 5; Bonus/Equity: Eligible; Benefits: Eligible

Skills & Requirements

Must-have

  • High-speed networking technologies
  • AI/HPC infrastructure experience
  • Performance benchmarking on NVLink and InfiniBand
  • Deep learning frameworks knowledge
  • Multi-node system performance evaluation
  • Technical communication and debugging skills

Nice-to-have

  • Mastery of distributed training systems
  • Datacenter automation experience
  • Advanced network protocols knowledge
  • Distributed storage systems familiarity
  • Cluster management and monitoring tools

Key Requirements

  • BS/MS or PhD in Computer Science or related field
  • 8+ years of experience in AI/HPC infrastructure
  • Familiarity with AI/HPC job schedulers like Slurm or K8s
  • Experience with MPI and NCCL workflows
  • Understanding of PyTorch and Megatron-LM
  • Experience with InfiniBand and NVLink in HPC environments

Work Rights

Not specified
