Not specified (assumed hybrid or remote based on industry standards).
Kubernetes and AWS infrastructure experience
Strong Python programming skills
Distributed systems and networking knowledge
Resmed is seeking a Site Reliability Engineer (SRE) II to enhance the reliability and efficiency of its large-scale digital products through automation and AI/ML techniques. The ideal candidate will possess a strong background in software engineering, SRE, or ML engineering, and have experience with cloud platforms and observability tools.
Job Summary
This role blends software engineering and systems engineering to ensure large-scale digital products are reliable, scalable, and efficient.
You will apply GenAI and AI/ML techniques to automate SRE workflows including incident response, alert triage, and operational analysis.
The company fosters a diverse and inclusive culture, driven by excellence, where innovative ideas are encouraged.
Matching Summary
Match Score: 85
Skills & Requirements
Must-have
Kubernetes and AWS infrastructure experience
Strong Python programming skills
Distributed systems and networking knowledge
CI/CD pipeline management expertise
Observability platform proficiency
Nice-to-have
GenAI and AI/ML automation experience
Prompt-driven system design skills
Responsible AI implementation knowledge
Incident response workflow optimization
Postmortem analysis capabilities
Key Requirements
Experience in Site Reliability or Software Engineering
Proficiency with Datadog or CloudWatch monitoring tools
Background in ML Engineering or AI-enabled automation