Job Summary
This role is accountable for improving reliability, observability, and automated recovery across Cloud Infrastructure, Networking, Enterprise Tools, and IAM.
The leader builds and operates the Operations and Reliability Engineering function using AIOps practices to prevent incidents and reduce alert noise.
Genesys empowers organizations to improve loyalty by creating the best experiences for customers and employees through its AI-powered platform.
Skills & Requirements
Must-have
8+ years of infrastructure operations experience
5+ years leading reliability engineering teams
AIOps event correlation and alert management
Cloud infrastructure and networking fundamentals
Incident command and post-incident review leadership
Nice-to-have
Experience implementing SLOs and error budgets
Scripting skills in Python or PowerShell
Familiarity with CMDB dependency modeling
Strong stakeholder communication skills
Experience with automated self-healing workflows
Key Requirements
8+ years in infrastructure operations or SRE
5+ years leading teams in operations or engineering
Proven track record of improving reliability through AIOps practices