Adobe Media and Data Science Research (MDSR) Laboratory
Job Summary
This role involves defining the long-term reliability and scalability strategy for the Adobe Pass platform while ensuring zero single points of failure.
The successful candidate will champion advanced automation frameworks to enable zero-touch operations and introduce AI/ML-based predictive monitoring.
Candidates are expected to serve as a technical authority during high-impact incidents and lead blameless postmortems to drive continuous improvement.
Skills & Requirements
Must-have
12+ years of SRE or production engineering experience
Expert proficiency in Python, Go, Java, or Bash
Deep understanding of Kubernetes and microservices
Advanced Infrastructure as Code with Terraform
Mastery of observability stacks like Prometheus and Grafana
Nice-to-have
Experience with chaos engineering and error budgets
Background in high-traffic media streaming systems
Familiarity with big data ecosystems like Kafka and Spark
Hands-on security compliance experience (SOC 2, GDPR)
Cloud or Kubernetes professional certifications
Key Requirements
Bachelor's or Master's degree in Computer Science or Engineering
12+ years of experience in site reliability or distributed systems
Proven track record managing globally distributed cloud-native systems