Own and enhance the reliability, scalability, and security of a complex cloud infrastructure supporting mission-critical workloads
Job Summary
Own and enhance the reliability, scalability, and security of a complex cloud infrastructure supporting mission-critical workloads.
Work hands-on across multi-region AWS/EKS environments, partnering with engineering leads, ML and simulation teams, and customer-facing teams to drive operational excellence.
Lead incident response, implement automated remediation, and guide cloud architecture decisions while optimizing performance, security, and cost.
Skills & Requirements
Must-have
cloud infrastructure
AWS/EKS environments
incident response
automated remediation
cloud architecture decisions
performance optimization
security optimization
cost optimization
Nice-to-have
fast-paced environment
high-autonomy environment
shaping infrastructure strategies
customer success impact
Key Requirements
Deep technical expertise
Strong problem-solving skills
End-to-end ownership of large-scale infrastructure projects