Staff Site Reliability Engineer

Thrivemarketjobs

Remote
Thrive Market is seeking a Staff Site Reliability Engineer to establish and enhance the reliability infrastructure for their platform. The role involves defining service levels, implementing observability practices, and contributing to the development of scalable systems during a period of rapid growth.

Job Summary

  • We’re looking for a Staff Site Reliability Engineer to help define and build the reliability foundation for Thrive Market’s platform.
  • You’ll be working with a first-class group of engineers to establish our SRE practice from the ground up: defining SLOs, SLIs, and error budgets, building observability into everything we do, and creating the frameworks that ensure our systems scale reliably during our company’s rapid growth.
  • We’ve recently containerized our entire platform on Kubernetes, and we’re evaluating a potential migration to a next-generation ecommerce platform.

Skills & Requirements

Must-have

  • Define and build reliability foundation
  • Establish SRE practice from ground up
  • Build observability into everything
  • Architect and optimize Kubernetes platform
  • Lead incident response efforts
  • Design and implement automated deployment pipelines

Nice-to-have

  • Champion culture of operational excellence
  • Collaborate with product engineering teams
  • Balance hands-on work with strategic thinking

Key Requirements

  • 7+ years of experience in SRE, DevOps, or Infrastructure Engineering
  • Deep expertise in Kubernetes (K8s)
  • Advanced proficiency in Linux administration
  • Extensive experience with core AWS services
  • Strong experience with Infrastructure as Code tools
  • Hands-on experience defining and implementing SLOs, SLIs, and error budgets

Work Rights

Not specified
