Site Reliability Engineer - Multicloud Platform

Workday

Fully remote
3+ years SRE experience
Kubernetes expertise required
Public cloud: AWS, GCP, Azure

Job Summary

  • The primary function of the SRE team is to ensure the reliability and availability of the platform to meet desired SLAs while reducing operational load.
  • Engineers will develop and launch effective SLIs and ensure SLOs are met by building an extensible observability architecture and runbook automation.
  • Workday offers a flexible work approach: at least half of each quarter is spent in the office or with customers, combined with remote flexibility.

Matching Summary

The primary function of the SRE team is to ensure the reliability and availability of the platform to meet desired SLAs while reducing operational load.

Skills & Requirements

Must-have

  • 3+ years SRE experience
  • Kubernetes expertise required
  • Public cloud: AWS, GCP, Azure
  • Go, Python, Ruby proficiency
  • Linux and distributed systems

Nice-to-have

  • Passion for automation culture
  • Experience with KubeCon conferences
  • Strong documentation skills
  • Remote collaboration experience
  • Scrum methodology background

Key Requirements

  • BS in Computer Science or equivalent
  • 3+ years handling distributed systems
  • 1+ years SRE experience (Junior role)

Work Rights

Not specified
