Red Hat is seeking a Customer Site Reliability Engineer to join their OpenShift Managed Cloud Services team. This fully remote role focuses on ensuring the reliability and performance of cloud services while providing exceptional customer support and leading technical escalations.
Job Summary
The Customer Site Reliability Engineer (CSRE) plays a crucial role in ensuring the availability, reliability, and performance of critical services at scale.
You will partner with Technical Account Managers, Services, Fleet SRE, DevOps, and infrastructure teams to address customer-specific and fleet-wide issues.
Red Hat is the world’s leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies.
Skills & Requirements
Must-have
OpenShift/Kubernetes container platform
container-based technologies on Linux
Linux-based systems in public cloud
enterprise systems monitoring
enterprise configuration management
object-oriented software engineering
TCP/IP networking and common protocols
Nice-to-have
customer-first mindset
technical lead for customer escalations
Knowledge-Centered Support (KCS) champion
AI and automation projects
fluent in additional languages
Key Requirements
Advanced experience with OpenShift/Kubernetes
Proficient with container-based technologies on Linux
Proficient in managing Linux-based systems on AWS, Azure, or GCP
Advanced experience with enterprise systems monitoring (Prometheus preferred)
Advanced experience with enterprise configuration management (Ansible, Terraform)
Software engineering experience using object-oriented languages (Golang preferred)
Superior communication skills and customer presentation experience
Ability to quickly learn new technologies
Demonstrated ability to troubleshoot systems issues