Job Summary
This role is critical to transforming systems from reactive operations to proactive, data-driven reliability engineering across the ServiceTitan platform.
You will lead a high-impact team in Bangalore responsible for defining SLOs, error budgets, and end-to-end observability spanning metrics, logs, and traces.
The position requires deep expertise in managing large-scale distributed data systems, including Azure SQL, PostgreSQL, Cosmos DB, and Kafka, while driving cost efficiency.
Skills & Requirements
Must-have
SQL Reliability & Observability Strategy
Azure Cloud Platform Expertise
Database Performance Optimization
OpenTelemetry Implementation
Distributed Systems Management
SRE Framework Definition
Nice-to-have
AI-driven anomaly detection experience
Kubernetes orchestration knowledge
Mentoring senior engineering talent
Cross-functional leadership skills
NoSQL system familiarity
Key Requirements
10-15+ years of software engineering or SRE experience
5+ years of engineering leadership at the Manager/Director level
BS or MS in Computer Science or equivalent
Deep expertise in SQL and relational database systems