Senior Site Reliability Engineer

lto.de

Pune, India
Onsite
Design and build reliable solutions
Define and implement SLIs and SLOs
Build meaningful monitoring and alerting
You work close to both the code and production environments, designing and building solutions that make our platform measurable, predictable, and resilient

Job Summary

  • You work close to both the code and production environments, designing and building solutions that make our platform measurable, predictable, and resilient.
  • You will work primarily from our Pune office, while collaborating closely with the rest of the SRE team in our Dutch office.
  • You work on a mission-critical SaaS platform with real customer impact and influence how reliability and observability are engineered into software.

Skills & Requirements

Must-have

  • design and build reliable solutions
  • define and implement SLIs and SLOs
  • build meaningful monitoring and alerting
  • automate away operational risk
  • experience changing application code
  • design observability as part of system architecture

Nice-to-have

  • think like an engineer when things go wrong
  • question existing reliability practices
  • use data to drive measurable improvements
  • collaborate across teams
  • experience with distributed systems or large SaaS platforms
  • experience in regulated or compliance-sensitive environments

Key Requirements

  • 5+ years of experience as a Site Reliability Engineer
  • Strong experience with monitoring, alerting and observability
  • Proven ability to design and work with SLIs and SLOs
  • Hands-on coding experience
  • Experience building automation
  • Experience working with Microsoft Azure or other major public cloud providers
  • Comfortable working with live production systems

Work Rights

Not specified
