Systems Development Engineer, Research Compute Platform, Fauna
Amazon
New York, NY, US
142,300.00 - 192,400.00 usd annually py
On-site
On-prem gpu compute management
Job scheduling layer implementation
Slurm ray skypilot experience
This role involves owning the end-to-end physical and virtual infrastructure used by ML scientists to train reinforcement learning policies for real robots
Job Summary
This role involves owning the end-to-end physical and virtual infrastructure used by ML scientists to train reinforcement learning policies for real robots.
The ideal candidate must be equally comfortable diagnosing hardware issues like GPU thermal faults as they are designing complex job scheduling systems.
Fauna Robotics is an Amazon company dedicated to building capable, safe, and delightful robots that people want to interact with in everyday spaces.
Matching Summary
This role involves owning the end-to-end physical and virtual infrastructure used by ML scientists to train reinforcement learning policies for real robots.