Site Reliability Engineer - Hardware Infrastructure

Nvidia Corporation

Job Summary

  • This position merges software and systems engineering to guarantee flawless service operation with consistent reliability and uptime.
  • The role involves developing guidelines for incident management, driving root cause analysis, and crafting high-quality postmortems.
  • Candidates will apply automation and Generative AI solutions to minimize manual activities and boost customer support.

Salary

Base: $184,000 - $356,500 USD; Bonus/Equity: Eligible for equity; Benefits: Comprehensive benefits package included

Skills & Requirements

Must-have

  • 8+ years of SRE/DevOps experience
  • Coding in Python, Go, Perl, Ruby
  • Observability with Prometheus and Grafana

Nice-to-have

  • Expertise in Generative AI and agentic solutions
  • Strong communication skills
  • Adaptability in fast-paced environments

Key Requirements

  • Degree in Computer Science or equivalent
  • 8+ years in SRE or Production Engineering
  • Experience with fault-tolerant distributed systems

Work Rights

Not specified