Engineer Lead, Site Reliability

Fidelity National Information Services

Jacksonville, FL, US
3d onsite
Public cloud (AWS) hands-on experience
Infrastructure as code (Terraform)
Kubernetes (EKS) deployment and management

Job Summary

  • Build software solutions and systems to manage platform infrastructure and applications, partnering with development teams to improve services through rigorous testing and release procedures.
  • Improve the reliability, quality, and time-to-market of our suite of software solutions by running the production environment and building monitoring that alerts on symptoms rather than on outages.
  • Balance feature development speed and reliability with well-defined service level objectives, partnering with stakeholders to design and deliver a reliable, scalable, secure, and performant platform.

Skills & Requirements

Must-have

  • Public Cloud (AWS) hands-on experience
  • Infrastructure as Code (Terraform)
  • Kubernetes (EKS) deployment and management
  • Observability and Monitoring tools
  • Scripting and Automation (Python, PowerShell, Bash)
  • Windows and Linux environments
  • DevOps and CI/CD practices

Nice-to-have

  • ServiceNow for ticket and incident management
  • Harness.io for CI/CD deployments
  • Microsoft Azure services exposure
  • AWS or Azure certifications
  • AWS Lambda experience
  • PostgreSQL administration or development
  • Capital Markets and financial services understanding
  • IT event correlation and analysis software
  • Disaster Recovery/Business Continuity planning
  • Leadership and mentoring junior staff

Key Requirements

  • 5+ years of experience in IT operations, infrastructure management, or related technical roles
  • Hands-on experience with public cloud (AWS)
  • Strong experience with Infrastructure as Code (Terraform)
  • Experience deploying and managing Kubernetes (EKS)
  • Proficiency with monitoring tools (CloudWatch, Grafana, Prometheus, Splunk)
  • Automation using Python, PowerShell, and Bash
  • Solid experience with Windows and Linux environments
  • Working knowledge of DevOps and CI/CD pipelines
  • Strong troubleshooting skills for production environments
  • Skilled in diagnosing and resolving application and infrastructure failures
  • Ability to create technical documentation and communicate effectively

Work Rights

Not specified
