Job Summary
The Datacenter Incident Program Manager is responsible for designing, operating, and continuously improving the end-to-end incident management lifecycle across mission-critical data center environments.
This role owns the “before, during, and after” mechanics of incidents: establishing standards and playbooks in steady state, serving as (or designating) the Incident Commander during active events, and driving structured post-incident review and corrective action to closure.
The ideal candidate brings operational credibility in hyperscale or mission-critical infrastructure, demonstrates calm leadership during high-pressure events, and has a strong bias toward structured documentation, process clarity, and measurable improvement.
Skills & Requirements
Must-have
Incident management lifecycle
High-density compute environment experience
Calm leadership under pressure
Structured documentation and process clarity
Post-incident review and corrective action
Nice-to-have
Hyperscale AI compute experience
ISO-based quality systems
Structured operational documentation frameworks
Key Requirements
7+ years in mission-critical infrastructure
Direct experience leading major incidents
Strong familiarity with facilities systems, hardware operations, or network infrastructure
Demonstrated experience running war rooms and executive updates
Experience conducting root cause analysis and corrective action tracking