Job Summary
This role is accountable for improving reliability, observability, and automated recovery across Cloud Infrastructure, Networking, Enterprise Tools, and IAM.
The leader builds and operates the Operations and Reliability Engineering function using AIOps practices to prevent incidents and reduce alert noise.
Genesys employs more than 6,000 people globally who embrace empathy and cultivate collaboration to succeed while offering independence to make a larger impact.
Skills & Requirements
Must-have
8+ years in infrastructure operations
5+ years leading reliability teams
AIOps event correlation and normalization
Cloud infrastructure and networking fundamentals
Incident command and escalation management
SLO/SLI definition and observability standards
Automation and self-healing workflow design
Nice-to-have
Experience with practical SLOs and error budgets
Familiarity with service mapping and CMDB modeling
Scripting leadership in Python or PowerShell
Strong stakeholder communication skills
Experience driving reliability culture adoption
Key Requirements
8+ years in infrastructure operations or SRE
5+ years leading teams in operations or engineering
Proven track record of designing for reliability using AIOps
Working knowledge of cloud infrastructure and IAM lifecycle