Site Reliability Engineer Senior - Cloud

Worldpay (FIS)

Bangalore, India

Job Summary

  • Lead the design and evolution of observability, monitoring, and alerting systems to ensure end-to-end visibility and proactive issue detection.
  • Own incident management processes, including high-severity incident response, root cause analysis, and continuous improvement initiatives.
  • A work environment built on collaboration, flexibility and respect.

Skills & Requirements

Must-have

  • Cloud platforms (AWS, Azure, GCP)
  • Infrastructure as Code (Terraform, CloudFormation)
  • SLOs and SLIs
  • Containerizing legacy apps
  • Monitoring tools (Prometheus, Grafana, DataDog)
  • Logging frameworks (Splunk, ELK Stack)
  • Scripting and automation (Python, Bash, Ansible)
  • CI/CD pipelines (Jenkins, GitLab CI/CD, Azure DevOps)
  • Incident response and post-mortem culture

Nice-to-have

  • Change-agent mindset
  • Influence cross-functional teams
  • Drive change at scale
  • Calm, data-driven approach

Key Requirements

  • 10+ years of proven experience in lead SRE, DevOps, or infrastructure engineering roles
  • Deep expertise in cloud platforms
  • Proven expertise in setting up SLOs and SLIs
  • Experience containerizing legacy applications
  • Strong background in monitoring tools
  • Advanced proficiency in scripting and automation
  • Hands-on experience with CI/CD pipelines
  • Demonstrated leadership in incident response
  • Ability to take incident command

Work Rights

Not specified
