Site Reliability Engineer II (SRE) - Guidewire Cloud Platform (Application)

Guidewire Software

Job Summary

  • This role offers a unique opportunity to apply engineering principles to ensure the reliability of the Guidewire Cloud Platform used by hundreds of insurers worldwide.
  • Candidates will participate in mandatory on-call rotations to respond to incidents and alerts outside regular business hours, ensuring service availability.
  • The position involves developing automated runbooks and optimizing systems to reduce manual tasks while collaborating cross-functionally with development teams.

Skills & Requirements

Must-have

  • Experience with SRE principles and SLI/SLOs
  • Proficiency in AWS or Kubernetes infrastructure
  • Strong troubleshooting skills for distributed systems
  • Familiarity with Datadog monitoring tools
  • Ability to script in Python, Go, Java, or Shell

Nice-to-have

  • Interest in pursuing SRE or AWS certifications
  • Experience with SQL and database performance tuning
  • Knowledge of open-source data processing frameworks
  • Exposure to CI/CD tools such as Jenkins or GitHub Actions
  • Passion for continuous learning and innovation

Key Requirements

  • Basic understanding of SLIs, SLOs, and error budgets
  • Experience deploying infrastructure using Terraform
  • Comfortable with Linux system administration
  • Participation in mandatory on-call rotations

Work Rights

Not specified
