Site Reliability Engineer

ioh

ID
On-site
Kubernetes and cloud-native infrastructure
CI/CD pipelines using GitLab CI/CD
GitOps deployment workflows using Argo CD

Job Summary

  • Ensure the reliability, scalability, and performance of hybrid and cloud-native infrastructure.
  • Manage and optimize workloads running on Google Kubernetes Engine (GKE) and OpenShift.
  • Implement centralized logging using Grafana Loki and ELK Stack, and build dashboards and alerts using Grafana and Datadog.

Skills & Requirements

Must-have

  • Kubernetes and cloud-native infrastructure
  • CI/CD pipelines using GitLab CI/CD
  • GitOps deployment workflows using Argo CD
  • Infrastructure as Code using Terraform
  • Python scripting for automation
  • Monitoring with Grafana and Datadog
  • Distributed tracing with OpenTelemetry

Nice-to-have

  • Improving system resilience
  • Supporting mission-critical services
  • Solving complex infrastructure challenges
  • Building automation at scale
  • Go programming language

Key Requirements

  • 0–4 years of experience in SRE, DevOps, Cloud Engineering, or Platform Engineering
  • Bachelor’s degree in Computer Science, Informatics, Information Systems, Electrical Engineering, Mathematics/Statistics, or related field
  • Strong Linux system administration and networking fundamentals
  • Hands-on experience with Kubernetes and containerized environments
  • Infrastructure as Code experience (Terraform, Ansible)
  • Familiarity with Google Cloud Platform (GCP)

Work Rights

Not specified
