Site Reliability Engineering Manager, GWCP

Guidewire Software

Hybrid

Job Summary

  • This role combines hands-on technical leadership with managing a team responsible for the reliability and scalability of the Guidewire Cloud Platform.
  • The manager will drive automation across infrastructure provisioning and promote self-healing systems while establishing robust SLOs and error budgets.
  • You will foster a culture of ownership and accountability, balancing short-term operational needs with long-term improvements in developer experience.

Skills & Requirements

Must-have

  • Python or Go programming skills
  • Kubernetes (EKS) expertise
  • Terraform infrastructure as code
  • AWS distributed systems architecture
  • Observability tools (Prometheus, Datadog)
  • SLO, SLI, and error budget frameworks
  • Incident management and postmortems

Nice-to-have

  • Java and Spring Boot experience
  • Familiarity with Okta, SSO, and OAuth
  • TeamCity, GitHub Actions, Jenkins
  • KubeVela and Crossplane knowledge
  • Open source project contributions
  • Interest in exploring AI technologies
  • Kubernetes and AWS certifications

Key Requirements

  • Proven experience directly managing a team of engineers
  • Strong background supporting microservices environments
  • Demonstrated ability to build and evolve processes
  • Experience supporting production systems at scale
  • Ability to make decisions in high-pressure situations

Work Rights

Not specified
