Site Reliability Engineer for HHSC, Austin, TX
Category: Information Technology
Location: Austin, TX
Pedigo Staffing Services
Austin, TX, USA
On-site
Job Summary
Ensures reliability, availability, performance, and scalability of production systems by applying software engineering practices to infrastructure and operations.
Partners with development teams to build resilient, observable, and automated platforms that meet defined service level objectives (SLOs).
Independently performs a variety of complicated tasks with a wide degree of creativity and latitude expected.
Skills & Requirements
Must-have
Linux/Unix systems and system internals
Python, Go, Java, Bash scripting
Highly available, distributed systems
Cloud platforms (AWS, GCP)
Containerization and orchestration (Docker, Kubernetes)
Monitoring, alerting, and logging
SLIs, SLOs, and error budgets
Incident management and root cause analysis (RCA)
Security and compliance integration
Nice-to-have
Observability tools (Prometheus, Grafana)
24x7 production environments
On-call rotations
Key Requirements
8 years of experience in systems engineering, DevOps, or SRE