Site Reliability Engineer - Incident Management

F5

Multiple Locations
Hybrid
F5 is seeking a Site Reliability Engineer focused on incident management to ensure the availability and performance of critical systems. The role involves leading major incident response efforts, implementing observability tools, and collaborating with cross-functional teams, all while maintaining clear communication during high-pressure situations.

Job Summary

  • The Site Reliability Engineer will lead the resolution of major incidents by managing the end-to-end incident lifecycle, including detection, escalation, troubleshooting, and resolution.
  • This role is responsible for designing, implementing, and managing end-to-end observability solutions such as synthetic monitoring, infrastructure monitoring, tracing, and metrics systems.
  • The ideal candidate must demonstrate strong technical skills, effective communication, and the ability to remain composed and solutions-oriented in high-pressure situations.

Skills & Requirements

Must-have

  • Major incident management experience
  • Observability tools implementation
  • Root cause analysis facilitation
  • Synthetic and infrastructure monitoring
  • High-pressure situation handling

Nice-to-have

  • Blameless postmortem culture
  • Cross-functional collaboration skills
  • Process improvement advocacy
  • Clear communication under pressure
  • Solutions-oriented mindset

Key Requirements

  • Bachelor's degree in Computer Science or related field
  • 3+ years of SRE or DevOps experience
  • Experience with Datadog, Grafana, Splunk, or similar tools
  • Strong understanding of ITIL principles
  • Proficiency in Python, Go, or Bash scripting

Work Rights

Not specified
