Software Engineer – Data and Observability Platform

NVIDIA


Job Summary

  • You will build and maintain high-throughput services that collect and process metrics, logs, and hardware telemetry from compute clusters.
  • You will manage the deployment and lifecycle of services on Kubernetes and cloud infrastructure.
  • You will ensure high platform availability through detailed on-call practices and proactive monitoring of complex data dependencies.

Matching Summary

Match Score: 85

NVIDIA is seeking a Software Engineer for its Data & Observability Platform team to build high-throughput services that manage telemetry for GPU/CPU development and AI research. The ideal candidate has strong coding skills, experience with distributed systems, and proficiency in Python or Java/Scala. The position offers a competitive salary and equity, and the posting states a commitment to workplace diversity.

Salary

  • Base (Level 2): 124,000 USD - 195,500 USD
  • Base (Level 3): 152,000 USD - 241,500 USD
  • Equity: Eligible
  • Benefits: Eligible

Skills & Requirements

Must-have

  • High-throughput services for metrics and logs
  • Design and deploy internal web applications
  • Modernizing the data platform
  • Kubernetes and Cloud deployment
  • Refactor data schemas and storage patterns
  • High platform availability and monitoring

Nice-to-have

  • Experience with HPC environments
  • Semiconductor build workflows
  • Handling large-scale hardware telemetry

Key Requirements

  • 2+ years of experience writing production-grade code
  • BS or MS in Computer Science or related field
  • Proficiency in Python or Java/Scala
  • Deep understanding of distributed systems
  • Experience building end-to-end features
  • Proficiency in SQL and data modeling

Work Rights

Not specified
