Job Summary
The primary function of the team is to ensure the reliability and availability of the platform to meet desired SLAs while reducing operational load.
Engineers will own reliability for the complete stack across public clouds, which is built on a Kubernetes foundation designed from scratch for the cloud.
Workday offers a flexible work approach requiring at least half of the time each quarter to be spent in-office or with customers, combined with remote flexibility.
Skills & Requirements
Must-have
Kubernetes experience
Public cloud infrastructure (AWS/GCP/Azure)
Go (Golang) programming proficiency
Linux operating system expertise
Distributed systems troubleshooting
CI/CD and code management
SRE operational toil reduction
Nice-to-have
Istio service mesh knowledge
OPA policy enforcement
Prometheus and Grafana monitoring
Scrum agile methodology experience
Cloud Native conference participation
Follow-the-sun on-call support
Runbook automation development
Key Requirements
BS in Computer Science or equivalent experience
3+ years SRE experience in distributed systems
1+ years operating distributed systems in a public cloud
Proficiency in Go, Python, or Ruby
Experience with standard software development methodologies