Job Summary
This strategic role defines, builds, and drives the Observability practice across the enterprise. That includes establishing SRE principles, patterns, and governance frameworks, with Datadog as the primary observability platform.
Own the enterprise Datadog implementation end-to-end: architecture, integrations, cost optimization, and feature enablement.
Define and operationalize a tiered incident severity model with clear escalation paths, SLAs, and communication protocols.
Skills & Requirements
Must-have
Datadog for observability
Observability Center of Excellence (CoE) setup
SRE principles and governance
Cloud platforms (AWS, Azure, GCP)
Automation and configuration management
Nice-to-have
Agile working culture
Collaboration across matrixed organizations
Exposure to regulated industries
Modern service management frameworks
Key Requirements
8+ years in SRE, DevOps, or Infrastructure roles
3+ years in a leadership capacity
Proficiency in automation and configuration management tools