Implement Site Reliability Engineering within Risk & Compliance applications, ensuring system reliability, scalability, and performance by defining SLOs, SLIs, and SLAs
Job Summary
Implement Site Reliability Engineering within Risk & Compliance applications, ensuring system reliability, scalability, and performance by defining SLOs, SLIs, and SLAs.
Develop automation tools and scripts using Python and Go, and leverage Infrastructure as Code principles to eliminate manual tasks and provision infrastructure.
Build, lead, and manage a high-performing team of engineers, fostering a culture of reliability and providing career development support.
Skills & Requirements
Must-have
Site Reliability Engineering principles
Implement SRE for Risk & Compliance
Define SLOs, SLIs, SLAs
Error budget management
Develop automation tools (Python, Go)
Infrastructure as Code (IaC)
Monitoring, alerting, logging systems
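For candidates less familiar with how SLOs and error budgets relate, the sketch below shows the usual arithmetic. It is illustrative only: the function names, the 30-day window, and the 99.9% target are assumptions, not values taken from this role.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an availability SLO.

    slo_target is a fraction, e.g. 0.999 for "99.9%" (an assumed example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction strictly between 0 and 1")
    return (1.0 - slo_target) * window_days * 24 * 60


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent, based on a request-level SLI.

    Returns 1.0 when nothing has been spent; a negative value means the
    budget is overspent.
    """
    if total_events == 0:
        return 1.0  # no traffic observed, so no budget consumed
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    print(f"{error_budget_minutes(0.999):.1f} minutes per 30 days")
    print(f"{budget_remaining(0.999, 999_500, 1_000_000):.2f} of budget remaining")
```

Here the SLI is the measured ratio of good events to total events, the SLO is the internal target for that ratio, and the error budget is the shortfall the SLO tolerates over the window.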
Nice-to-have
Thrive in a fast-paced environment
Advocate for reliability-first culture
Customer journey focus
Continuous development
Culture champion
Key Requirements
10+ years of professional engineering experience
4+ years of SRE team leadership experience
Java, Spring Boot, Microservices, Python background