Site Reliability Engineer (SRE)

Acquire Asia Pacific Pty

Not specified
Service Level Objectives (SLO) management
Infrastructure as Code with Pulumi (TypeScript)
AWS services including EKS and MSK
Acquire Asia Pacific Pty is seeking a Site Reliability Engineer to improve the reliability and performance of its IoT telemetry platform. The role involves defining service level objectives, automating operational processes, and managing incident response to ensure high uptime and data integrity.

Job Summary

  • The Site Reliability Engineer serves as the guardian of production systems, ensuring reliability and scalability for an IoT telemetry platform.
  • You will define Service Level Objectives, automate operational processes using Pulumi, and manage AWS infrastructure including EKS and MSK.
  • The role requires participating in a follow-the-sun on-call rotation to provide 24x7 support across multiple global time zones.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • Service Level Objectives (SLO) management
  • Infrastructure as Code with Pulumi (TypeScript)
  • AWS services including EKS and MSK
  • Incident command and post-mortem leadership
  • Monitoring with Prometheus, Grafana, and PagerDuty

Nice-to-have

  • Security compliance (SOC 2, ISO 27001)
  • Follow-the-sun on-call rotation experience
  • IoT telemetry platform knowledge
  • SingleStore and MongoDB database management
  • Teamwork and innovation culture

Key Requirements

  • Experience defining and enforcing SLOs
  • Proficiency in Pulumi with TypeScript
  • Knowledge of AWS EKS and MSK services
  • Ability to lead incident response procedures
  • Familiarity with SOC 2 and ISO 27001 standards

Work Rights

Not specified
