Software Engineer – Data and Observability Platform

NVIDIA

Not specified (assumed hybrid based on industry standards)

Job Summary

  • Build and maintain high-throughput services that collect and process metrics, logs, and hardware telemetry from compute clusters.
  • Design and deploy internal web applications, tools, and APIs to configure data pipelines, visualize platform health, and integrate AI workflows.
  • Ensure high platform availability through detailed on-call practices and proactive monitoring of complex data dependencies.

Matching Summary

Match Score: 85

NVIDIA is seeking a Software Engineer for their Data & Observability Platform team, responsible for managing telemetry and observability for GPU/CPU development and AI research. The ideal candidate will have experience in full-stack development, distributed systems, and proficiency in programming languages such as Python or Java/Scala.

Salary

Base: 124,000 USD - 195,500 USD (Level 2); 152,000 USD - 241,500 USD (Level 3); Equity: eligible; Benefits: eligible

Skills & Requirements

Must-have

  • High-throughput services
  • Metrics, logs, and hardware telemetry
  • Design and deployment of internal web applications
  • Kubernetes and cloud deployment
  • Refactoring data schemas
  • High platform availability

Nice-to-have

  • HPC environments
  • Semiconductor build workflows
  • Large-scale hardware telemetry
  • AI tools in recruiting

Key Requirements

  • BS or MS in Computer Science or related field
  • 2+ years of production-grade code experience
  • Proficiency in Python or Java/Scala
  • Deep understanding of distributed systems
  • Experience building end-to-end features
  • Proficiency in SQL and data modeling

Work Rights

Not specified
