Senior Computer Scientist - SRE

Adobe Media and Data Science Research (MDSR) Laboratory

Job Summary

  • The role involves defining the long-term reliability and scalability strategy for the Adobe Pass platform while ensuring zero single points of failure.
  • Candidates will champion advanced automation frameworks to enable zero-touch operations and introduce AI/ML-based predictive monitoring systems.
  • This position requires leading organization-wide reliability initiatives such as chaos engineering and driving measurable improvements in MTTR and MTBF.

Skills & Requirements

Must-have

  • 12+ years in site reliability or production engineering
  • Expert proficiency in Python, Go, Java, or Bash
  • Deep understanding of Kubernetes and microservices
  • Advanced experience with Terraform and CI/CD
  • Mastery of Prometheus, Grafana, Datadog, or OpenTelemetry

Nice-to-have

  • Experience with error budgets and chaos engineering
  • Prior work in high-traffic media streaming platforms
  • Familiarity with Kafka, Spark, or Hadoop ecosystems
  • Hands-on security compliance experience (SOC 2, GDPR)
  • Published contributions on reliability or distributed systems

Key Requirements

  • Bachelor's or Master's degree in Computer Science or Engineering
  • 12+ years of experience in SRE or large-scale distributed systems
  • Proven track record managing globally distributed cloud-native systems
  • Strong expertise in networking, storage, and distributed databases

Work Rights

Not specified