Head of Site Reliability Engineering

AIAPL ACQUIRE INTELLIGENCE AUSTRALIA

Hybrid
AWS production services ownership
Infrastructure-as-Code with Pulumi TypeScript
SLO and error budget management

Job Summary

  • This role offers the opportunity to build reliability engineering from the ground up in a mission-critical IoT platform.
  • You will lead and grow a remote team of SREs, with responsibility for hiring, coaching, and fostering a blameless culture.
  • The position is hands-on: you will handle monitoring and incident response, set Service Level Objectives (SLOs), and drive automation through Infrastructure-as-Code.

Skills & Requirements

Must-have

  • AWS production services ownership
  • Infrastructure-as-Code with Pulumi TypeScript
  • SLO and error budget management
  • Remote team leadership and hiring
  • Incident command and post-mortem execution

Nice-to-have

  • Fostering a blameless culture
  • IoT platform experience
  • Championing DevSecOps practices
  • Kubernetes migration projects
  • Cost optimization initiatives

Key Requirements

  • Experience building SRE teams
  • Proficiency in Pulumi and TypeScript
  • Expertise in AWS EKS and MSK
  • Track record in incident response leadership

Work Rights

Not specified
