Sr. System Development Engineer, Cloud AI/ML/storage server teams
Amazon
Cupertino, CA, US
On-site
Linux kernel driver debugging
Python, Ruby, Java, C/C++ programming
Hardware failure root cause analysis
Job Summary
You will lead the development of automation software and diagnostic tooling to maintain the health of AWS storage and AI/ML compute fleets.
The role requires decomposing complex server testability and reliability problems into straightforward tasks while driving delivery through hardware, software, and system design knowledge.
You will collaborate with internal engineering teams and external ODMs to ensure new server designs meet rigorous testability and automation requirements throughout the lifecycle.