Senior Site Reliability Engineer

Medeloop

San Francisco, United States
On-site

Job Summary

  • Own the reliability, scalability, performance, and operational excellence of Medeloop’s platform.
  • Design, implement, and manage scalable, secure, and highly available cloud infrastructure on AWS.
  • Partner closely with product and engineering teams to embed reliability thinking into the software development lifecycle.

Skills & Requirements

Must-have

  • AWS cloud infrastructure
  • Kubernetes and Docker
  • Observability stacks (Datadog, CloudWatch)
  • Incident response and management
  • CI/CD pipelines (GitHub Actions)
  • Infrastructure as Code (Terraform, CDK)
  • Scripting (Python, Bash)

Nice-to-have

  • Healthcare compliance (HIPAA, SOC 2)
  • Chaos engineering practices
  • Open-source contributions
  • Multi-cloud experience

Key Requirements

  • 7+ years of DevOps/SRE experience
  • 2+ years in a senior capacity
  • Bachelor's or Master's degree
  • AWS services proficiency
  • Observability platforms experience
  • CI/CD pipeline experience
  • Infrastructure as Code expertise
  • Containerization and orchestration experience
  • SLO/SLI and incident response track record
  • Networking and security best practices
  • Authentication and authorization systems experience

Work Rights

Not specified
