Manager – Site Reliability Engineering

London Stock Exchange Group

London, United Kingdom
On-site

Job Summary

  • Assume end-to-end accountability for the Clearing production environment, ensuring high availability, optimal performance, and robust resilience of business-critical systems.
  • Act as Incident Commander during major incidents, leading resolution efforts, managing stakeholder communications, and driving root cause analysis and remediation.
  • Build and mentor a high-performing SRE team, promoting a culture of accountability, continuous improvement, and blameless postmortems to enhance operational excellence.

Skills & Requirements

Must-have

  • Service Ownership for production environment
  • Incident Commander for major incidents
  • SRE best practices
  • Oracle database expertise
  • Automation at scale
  • Observability practices
  • Cloud and on-premises infrastructure

Nice-to-have

  • Financial markets knowledge
  • Regulatory compliance
  • Chaos engineering
  • AWS experience

Key Requirements

  • 3+ years in a leadership capacity
  • Degree-educated, or equivalent work experience
  • Experience leading teams supporting mixed infrastructure
  • Expertise in automation (Python, Shell, PowerShell, etc.)

Work Rights

Not specified
