Staff Site Reliability Engineer

Replit

Remote, United States
8-10 years of site reliability engineering experience
Strong Python or Go programming skills
Deep understanding of distributed systems
Replit is seeking a Staff Site Reliability Engineer to enhance the reliability and performance of its infrastructure that supports millions of developers worldwide. The ideal candidate will have extensive experience in SRE or similar roles, strong programming skills, and a passion for making software creation accessible. This remote position offers competitive salary and benefits, along with a flexible work environment.

Job Summary

  • Join the team to ensure the reliability, scalability, and performance of Replit's infrastructure serving millions of developers worldwide.
  • You will architect comprehensive monitoring solutions, lead incident response, and automate operational tasks to create step-function improvements.
  • The role offers competitive salary, equity, a 401(k) match, flexible time off, and an autonomous remote work environment.

Salary

Competitive salary and equity (amount not specified). Benefits: 401(k) program with 4% match; health, dental, vision, and life insurance; paid parental leave; flexible time off.

Skills & Requirements

Must-have

  • 8-10 years of Site Reliability Engineering experience
  • Strong Python or Go programming skills
  • Deep understanding of distributed systems
  • Expertise in Kubernetes and container orchestration
  • Experience with infrastructure as code using Terraform or Pulumi
  • Proven track record in monitoring and observability solutions

Nice-to-have

  • Deep experience with Google Cloud Platform services
  • Expert-level knowledge of Prometheus, Grafana, and Datadog
  • Experience designing high-throughput, low-latency systems
  • Significant experience with Go and Terraform specifically
  • Familiarity with rapid-growth startup environments
  • Passion for making software creation accessible

Key Requirements

  • 8-10 years of SRE or similar experience
  • Proficiency in writing high-quality, well-tested code in Python or Go
  • Deep experience with cloud-native technologies and Kubernetes

Work Rights

Not specified
