Senior Site Reliability Engineer

Medeloop

Montréal, Canada
On-site
AWS cloud infrastructure
Kubernetes and Docker
CI/CD pipelines with GitHub Actions

Job Summary

  • Design, implement, and manage scalable, secure, and highly available cloud infrastructure on AWS.
  • Define, implement, and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs) across all production services.
  • Build self-service tooling and runbooks that reduce toil and empower development teams to ship independently.

Skills & Requirements

Must-have

  • AWS cloud infrastructure
  • Kubernetes and Docker
  • CI/CD pipelines with GitHub Actions
  • Infrastructure as Code (IaC)
  • Observability stacks (Datadog, CloudWatch, Sentry)
  • Incident response and management
  • SLOs and SLIs
  • Python and Bash scripting

Nice-to-have

  • Healthcare compliance (HIPAA, SOC 2)
  • Chaos engineering practices
  • Open-source contributions
  • Advanced certifications

Key Requirements

  • 7+ years of DevOps/SRE experience
  • 2+ years in a senior capacity
  • Bachelor’s or Master’s degree
  • Experience with AWS, Datadog, GitHub Actions, IaC tools, Docker, and Kubernetes
  • SLO/SLI and incident response experience
  • Networking, security, and compliance understanding
  • Authentication/authorization systems experience

Work Rights

Not specified
