Stability AI is seeking a Senior Site Reliability Engineer to enhance its cloud infrastructure, collaborating with various teams to ensure system reliability and innovation. The ideal candidate should possess strong skills in cloud architecture, infrastructure as code, and incident management, with a focus on AWS and related technologies.
Job Summary
Stability AI is seeking a Senior Site Reliability Engineer to shape and improve its evolving cloud infrastructure.
The role involves architecting scalable systems in AWS with a focus on high availability and resilience.
Candidates will collaborate across engineering, IT, and security teams to drive innovation and enforce SRE best practices.
Matching Summary
Match Score: 85
Skills & Requirements
Must-have
AWS cloud environment management
Terraform infrastructure as code
Kubernetes container scaling
Grafana and ELK stack monitoring
CI/CD pipeline enhancement
Nice-to-have
Mentoring junior team members
Championing SRE principles
Driving incident root cause analysis
Cloud security experience
Key Requirements
Experience scaling resource-intensive systems
Background in software development or automation scripting