[S3NS] SRE Monitoring & Observability (M/F)

THALES

Not specified
Prometheus, Mimir, Grafana, Loki stack
Kubernetes on-prem cluster management
SLA, SLO, SLI, error budget management
THALES is seeking a Site Reliability Engineer (SRE) for their S3NS division, which focuses on providing secure cloud solutions in partnership with Google Cloud. The ideal candidate will have a strong background in monitoring and observability, particularly with on-premise Kubernetes environments, and will play a crucial role in maintaining service availability and supporting operational tasks.

Job Summary

  • The role involves maintaining and evolving the monitoring stack for S3NS on-premise infrastructure.
  • You will ensure adherence to availability commitments (SLI, SLO, SLA) towards Google and internal teams.
  • Responsibilities include participating in on-call rotations, incident response, and contributing to blameless post-mortems.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • Prometheus, Mimir, Grafana, and Loki stack
  • Kubernetes on-prem cluster management
  • SLA, SLO, SLI, and error budget management
  • Incident response and post-mortem analysis
  • CI/CD pipeline automation and scripting

Nice-to-have

  • Stress management during incidents
  • Clear communication skills
  • Complex problem-solving mindset

Key Requirements

  • Bac+5 (master's-level) degree in computer science
  • Minimum 3 years of similar experience
  • Mastery of SRE concepts and Kubernetes

Work Rights

Not specified
