Job Summary
The role involves designing and implementing high-availability, fault-tolerant architectures across on-prem and AWS cloud platforms to support business-critical applications.
Candidates will partner closely with development and security teams to reduce operational toil, improve system reliability, and enable faster delivery of services.
Broadridge fosters a collaborative culture where associates are empowered to be authentic and bring their best to work while driving operational excellence.
Skills & Requirements
Must-have
8+ years of Site Reliability Engineering experience
AWS cloud-native architecture expertise
Terraform Infrastructure as Code proficiency
Python or Java programming skills
Linux/Unix systems administration
High-availability, fault-tolerant design
CI/CD pipeline implementation
Nice-to-have
Experience in financial services industry
EKS/Kubernetes at scale knowledge
Chaos Engineering familiarity
FinOps cost optimization leadership
Legacy system refactoring experience
Mentoring engineers into SRE roles
Blameless postmortem culture advocacy
Key Requirements
8+ years of experience in SRE, Platform Engineering, DevOps, or Systems Engineering
Strong programming experience in Python, Java, or similar languages
Deep hands-on expertise with AWS services, including EKS, EC2, RDS, and Lambda
Proven track record with Terraform and Infrastructure as Code
Strong understanding of networking, security, and distributed systems