Job Summary
The team is dedicated to improving platform reliability, observability, and delivering operational success at scale using cloud-native technologies.
Engineers will design and implement automation solutions to reduce manual effort and enable the team to operate large-scale distributed systems.
The role involves participating in a follow-the-sun on-call model across three time zones to ensure continuous platform coverage and rapid incident response.
Skills & Requirements
Must-have
1 to 8 years of SRE or DevOps experience
Hands-on Kubernetes management and troubleshooting
Proficiency in Go (Golang) for automation scripting
Experience with public cloud environments (AWS, GCP)
Solid Linux/Unix operating system background
Nice-to-have
Familiarity with Istio, OPA, Prometheus, and Grafana
Experience contributing to Cloud Native conferences
Knowledge of Agile and Scrum methodologies
Strong documentation and runbook creation skills
Ability to work in a follow-the-sun model
Key Requirements
BS in Computer Science or equivalent practical experience
1 to 8 years of relevant site reliability engineering experience