Manager Site Reliability Engineer

Workday

Hybrid (50% in-office or field with customers/partners)

Workday is seeking a Manager of Site Reliability Engineering (SRE) to lead its Database Reliability Engineering team. The ideal candidate will have substantial experience managing large-scale database environments and a passion for open-source and cloud-native solutions.

Job Summary

  • The role involves leading a visionary team to replace manual database interventions with automated, self-healing platforms.
  • Candidates must have extensive experience managing thousands of production databases across multiple data centers and public clouds.
  • Workday offers a flexible work approach requiring at least half of the time each quarter to be spent in-office or with customers.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • 3+ years leading SRE or Database Engineering teams
  • 8+ years software or systems engineering experience
  • Database internals, tuning, and replication topologies
  • Kubernetes Operators and StatefulSet management
  • AWS RDS/Aurora and GCP Cloud SQL experience
  • Prometheus, Grafana, Datadog, or PMM observability
  • Reducing MTTR and institutionalizing RCA processes

Nice-to-have

  • Passion for open-source and cloud-native solutions
  • Fostering a culture of psychological safety
  • Deep-dive troubleshooting in Linux internals
  • Agile/Scrum and Continual Improvement Process
  • Hybrid Software/Database Engineer mindset

Key Requirements

  • Bachelor's degree in Computer Science or related field
  • 4+ years as an SRE/DBRE designing resilient data infrastructure
  • 5+ years spearheading response for critical data outages

Work Rights

Not specified
