Databricks is seeking a Senior Staff Technical Program Manager for Reliability to lead critical reliability initiatives across its infrastructure and product engineering teams. The role demands extensive experience in cloud infrastructure, distributed systems, and technical program management, focusing on enhancing operational excellence and reliability for Databricks' multi-cloud platform.
Job Summary
Lead the strategy, execution, and continuous improvement of critical Reliability initiatives across infrastructure and product engineering teams.
Partner closely with senior engineering leaders to define Reliability strategy, set long-term goals, and execute multi-quarter programs.
Drive adoption of reliability best practices across engineering teams, including error budgets, incident reviews, and design-for-resilience patterns.
Skills & Requirements
Must-have
large-scale distributed systems
cloud infrastructure
operational excellence
multi-cloud infrastructure
reliability strategy
Nice-to-have
systems thinking
reliability culture
chaos engineering
failure mode analysis
Key Requirements
10+ years of experience
Experience with hyperscale cloud providers
Demonstrated success leading Reliability programs
Strong understanding of infrastructure, distributed systems, or SRE practices
Experience partnering with senior engineering leadership
Ability to translate ambiguous goals into actionable program plans