Site Reliability Engineer - Multicloud Platform

Workday

Fully remote
Kubernetes experience in public cloud
Linux operating system proficiency
Golang programming language skills
The primary function of the team is to ensure the reliability and availability of the platform to meet desired SLAs while reducing operational load

Job Summary

  • The primary function of the team is to ensure the reliability and availability of the platform to meet desired SLAs while reducing operational load.
  • Engineers own the reliability of the complete stack and tooling that deliver Workday products across public clouds using cloud-native technologies.
  • The company offers a flexible work approach requiring at least half of the time each quarter to be spent in-office or with customers.

Skills & Requirements

Must-have

  • Kubernetes experience in public cloud
  • Linux operating system proficiency
  • Golang programming language skills
  • AWS, GCP, or Azure cloud platforms
  • CI/CD and code management methodologies

Nice-to-have

  • Passion for automating operational toil
  • Experience with distributed systems scaling
  • Ability to work independently in remote teams
  • Strong documentation and runbook creation skills
  • Collaboration with colleagues from diverse global backgrounds

Key Requirements

  • BS in Computer Science or equivalent experience
  • 3+ years SRE experience in distributed systems
  • 1+ years handling distributed systems in public cloud
  • Proficiency in Golang, Python, or Ruby

Work Rights

Not specified
