Principal Systems At-Scale Engineer

NVIDIA

Santa Clara, CA, US

Job Summary

  • As a Principal Systems At-Scale Engineer, you will collaborate with visionary professionals to build, deploy, and optimize large-scale data center clusters and applications.
  • This role offers an outstanding opportunity to employ the latest accelerated computing and deep learning platforms to make a lasting impact on the world.
  • NVIDIA offers highly competitive salaries and a comprehensive benefits package, fostering a diverse and supportive work environment where innovation thrives.

Salary

Base: 272,000 USD - 431,250 USD; Bonus/Equity: Eligible for equity; Benefits: Comprehensive benefits package

Skills & Requirements

Must-have

  • systems debugging at scale
  • large fleet cluster management
  • Linux-based server platforms
  • C/Python/Bash/Lua scripting
  • telemetry and at-scale analytics
  • performance cluster infrastructure

Nice-to-have

  • mentoring engineers
  • leading cross-team task forces
  • promoting innovation and hackathons
  • strong verbal and written communication
  • organizational health initiatives

Key Requirements

  • 15+ years of systems debugging experience
  • BS/MS in Computer Science or a related field
  • experience with performance clusters and workload patterns
  • experience supporting performance engineering or deep learning
  • strong teamwork and communication skills

Work Rights

Not specified
