Senior Site Reliability Engineer, Tenant Services: Geo
GitLab
Remote
Job Summary
Site Reliability Engineers are responsible for keeping all user-facing services and other GitLab production systems running smoothly. The role blends pragmatic operations work with software craftsmanship.
In this role, you will join the Tenant Services: Geo team, supporting GitLab Dedicated customer migrations and Geo-related escalations.
The company embraces AI as a core productivity multiplier, with all team members expected to incorporate AI into their daily workflows to drive efficiency, innovation, and impact.
Skills & Requirements
Must-have
operating distributed systems at scale
cloud provider experience
Kubernetes and ecosystem
infrastructure as code
programming and scripting
observability systems
incident response and on-call
Nice-to-have
customer engagement during migrations
problem definition and system improvement
self-directed and organized
remote asynchronous collaboration
clear written and verbal communication
alignment with company values
Key Requirements
Experience operating highly available distributed systems
Hands-on experience with major cloud providers
Experience with Kubernetes
Experience with infrastructure as code (IaC) and configuration management
Strong programming skills in Go or Ruby
Experience with observability systems
Practical exposure to data replication/migration
Comfort participating in on-call rotation
Ability to engage directly with enterprise customers
Ability to clearly define problems and improve systems