Manager Site Reliability Engineer

Workday

Hybrid (at least 50% in-office time quarterly)
Workday is seeking a Manager Site Reliability Engineer to lead its Database Reliability Engineering team. The ideal candidate will have extensive experience in managing database environments, implementing automated solutions, and fostering a culture of technical excellence.

Job Summary

  • The role involves leading a visionary team to replace manual DBA interventions with automated, self-healing platforms for global-scale applications.
  • Candidates must possess deep expertise in database internals, including engine tuning, replication topologies, and managing workloads across AWS and GCP.
  • Workday offers a flexible work approach requiring at least 50% time in-office or field per quarter while fostering an inclusive culture rooted in integrity and empathy.

Skills & Requirements

Must-have

  • 3+ years leading SRE or Database Engineering teams
  • 8+ years in software or systems engineering
  • Experience with Kubernetes Operators and StatefulSets
  • Expertise in database internals and query optimization
  • Managing databases on AWS RDS/Aurora and GCP Cloud SQL

Nice-to-have

  • Passion for Open-Source and Cloud Native solutions
  • Fostering a culture of psychological safety
  • Spearheading high-stakes response for critical outages
  • Implementing robust observability stacks with Prometheus
  • Mentoring senior engineers to drive technical excellence

Key Requirements

  • Bachelor's degree in Computer Science or related field
  • 4+ years as an SRE/DBRE designing resilient data infrastructure
  • Proven ability to reduce Mean Time to Resolution (MTTR) during incidents

Work Rights

Not specified
