Job Summary
The Sr Staff Reliability Engineer will have end-to-end accountability for the reliability of IT services within a defined application portfolio.
Key measures of success include service stability, deployment quality, technical debt reduction, and preventive maintenance mechanisms.
The role requires guiding best-in-class software engineering standards that enable metrics on technology health, including availability and resiliency.
Skills & Requirements
Must-have
Expert experience with Dynatrace, Splunk, and TrueSight
Deep understanding of Linux, containers, and Kubernetes
Strong hybrid cloud experience (IaaS, PaaS, SaaS)
Experience with Infrastructure as Code (Terraform, CloudFormation)
End-to-end accountability for IT service reliability
Nice-to-have
Understanding of FinOps or cost-optimization practices
Experience with API gateways and network observability
AWS Solutions Architect certification
Experience in regulated insurance environments
Knowledge of GenAI in testing
Key Requirements
Expert experience with Performance and Observability tools
Deep understanding of Linux systems and orchestration tools
Strong solution architecture orientation for hybrid cloud
Experience with continuous integration and DevOps methodologies
Expertise with Infrastructure as Code tools like Terraform