Staff Engineer, Site Reliability

LearnUpon

Dublin, Ireland
On-site

Job Summary

  • Identify opportunities to improve and scale infrastructure for performance, observability, maintainability, and cost by creating innovative solutions.
  • Lead efforts to build an observability function incorporating application metrics, transaction tracking, and event log management.
  • Work with other Engineering teams to provide infrastructure solutions that meet their ongoing requirements, and participate in an on-call rota.

Skills & Requirements

Must-have

  • infrastructure scale-out
  • observability function
  • resilient, scalable infrastructure
  • microservice environments
  • containerisation technologies
  • IaC and automation tooling
  • CI/CD pipelines

Nice-to-have

  • customer experience focus
  • collaborative environments
  • friendly, supportive team
  • database scaling experience

Key Requirements

  • 7+ years software or Ops experience
  • 5+ years cloud engineering experience
  • 2+ years AWS experience
  • Kubernetes and Docker experience
  • Observability tech stack design
  • SLO/SLI implementation architecture
  • Cost analysis of observability
  • Large-scale distributed systems
  • IaC, automation, CI/CD experience

Work Rights

Not specified