Hybrid
Job Summary
The mission of the AVCPE team is to provide input into large-scale ML infrastructure strategy, advise on key decisions affecting our cloud budget, identify and execute optimization projects, and provide capacity planning and engineering expertise to support GM’s efforts in developing autonomous vehicles (AVs).
Conduct deep-dive analyses of production workloads to identify bottlenecks and propose high-impact optimization strategies.
GM offers a variety of health and wellbeing benefit programs, including medical, dental, vision, retirement savings plan, and paid vacation & holidays.
Salary
$144,700 to $261,300; Bonus Potential: Incentive pay program; Benefits: Health and wellbeing benefit programs
Skills & Requirements
Must-have
ML infrastructure strategy
Large-scale ML training and inference
Python and PyTorch ecosystem
Kubernetes for orchestrating workloads
Nvidia DCGM and Grafana
AWS, GCP, or Azure
Nice-to-have
Enterprise-grade Nvidia GPU architectures
Deploying open-source models
BigQuery for data analysis
Nvidia Nsight for performance tuning
Translating complex infrastructure needs
Key Requirements
5+ years of professional experience
Bachelor’s Degree in Computer Science
Expert-level coding skills in Python
Resolving performance issues in large-scale distributed environments
Deep understanding of distributed systems and ML system design
Hands-on experience with Kubernetes
Technical proficiency with Nvidia DCGM, nvidia-smi, and Grafana