Work arrangement: Not specified (assumed hybrid based on industry standards)
High-throughput services
Metrics, logs, and hardware telemetry
Design and deploy internal web applications
NVIDIA is seeking a Software Engineer for their Data & Observability Platform team, responsible for managing telemetry and observability for GPU/CPU development and AI research. The ideal candidate will have experience in full-stack development, distributed systems, and proficiency in programming languages such as Python or Java/Scala.
Job Summary
Build and maintain high-throughput services that collect and process metrics, logs, and hardware telemetry from compute clusters.
Design and deploy internal web applications, tools, and APIs to configure data pipelines, visualize platform health, and integrate AI workflows.
Ensure high platform availability through detailed on-call practices and proactive monitoring of complex data dependencies.
Matching Summary
Match Score: 85