Job Summary
The role involves defining the long-term reliability and scalability strategy for Adobe's Pass platform while architecting large-scale distributed systems.
The candidate will lead organization-wide reliability initiatives, including chaos engineering and error budgets, and drive measurable improvements in system uptime.
This position requires mentoring SREs and software engineers to cultivate a reliability-first culture across multiple teams.
Skills & Requirements
Must-have
12+ years in site reliability or production engineering
Expert proficiency in Python, Go, Java, or Bash
Deep understanding of Kubernetes and microservices
Advanced experience with Infrastructure as Code
Mastery of observability stacks such as Prometheus and Grafana
Nice-to-have
Experience with AI/ML-based predictive monitoring
Prior work in high-traffic media streaming platforms