Head of Site Reliability Engineering

acquire.ai

Taguig City, Philippines
AWS production services ownership
SRE team leadership and growth
SLOs and error budget enforcement

Acquire.ai is seeking a Head of Site Reliability Engineering in Taguig City, Philippines, to lead and grow a team responsible for the reliability of AWS-based production services. The role involves strategic leadership, hands-on incident management, and a focus on automation and efficiency in a mission-critical IoT platform environment.

Job Summary

  • Lead and grow a remote team of SREs, fostering a blameless culture and driving platform resilience.
  • Architect and implement Infrastructure-as-Code in Pulumi/TypeScript for AWS resources and lead large-scale migration projects.
  • Champion a data-driven reliability mindset, define SLOs, and ensure actionable post-mortems are tracked to closure.

Matching Summary

Match Score: 75

Skills & Requirements

Must-have

  • AWS production services ownership
  • SRE team leadership and growth
  • SLOs and error budget enforcement
  • Infrastructure-as-Code (Pulumi/TypeScript)
  • CI/CD and observability pipelines
  • Incident response and post-mortems
  • DevSecOps practices

Nice-to-have

  • Blameless culture fostering
  • Continuous improvement mindset
  • Mission-critical IoT platform

Key Requirements

  • Build and lead a team of 3-6 engineers
  • Experience with Pulumi/TypeScript
  • Experience with AWS resources and related data stores (EKS, MSK, SingleStore, MongoDB, S3)
  • Experience with SOC 2 & ISO 27001 evidence collection

Work Rights

Not specified
