Job Summary
This strategic role defines, builds, and drives the Observability practice across the enterprise, including establishing SRE principles, patterns, and governance frameworks, with Datadog as the primary observability platform.
Own the enterprise Datadog implementation end-to-end: architecture, integrations, cost optimization, and feature enablement.
Define and operationalize a tiered incident severity model with clear escalation paths, SLAs, and communication protocols.
Skills & Requirements
Must-have
Datadog for observability
Cloud platforms (AWS, Azure, GCP)
Automation and configuration management
CI/CD pipelines
Distributed systems and microservices
Performance tuning and capacity planning
Nice-to-have
Agile working culture
Collaboration across matrixed organizations
Modern service management frameworks
Experience in regulated industries
Key Requirements
8+ years in SRE, DevOps, or Infrastructure roles
3+ years in a leadership capacity
Proficiency in scripting/programming languages (Python, Go, Shell, etc.)