Senior Machine Learning Infrastructure Engineer

PlusAI

Santa Clara, CA, United States
On-site

Job Summary

  • Design scalable architectures capable of handling petabytes of data while ensuring optimal performance for both training and inference phases.
  • Build robust MLOps pipelines for model versioning and experiment tracking, which are essential for maintaining reproducibility across experiments.
  • This role offers technical and professional growth opportunities for individuals passionate about solving challenging problems using modern cloud-native technologies.

Skills & Requirements

Must-have

  • Scalable architectures for petabytes of data
  • MLOps pipelines for model versioning
  • Experiment tracking frameworks
  • Large-scale GPU cluster management
  • Docker and Kubernetes orchestration
  • PyTorch or TensorFlow integration

Nice-to-have

  • Experience at a physical AI company
  • Experience with autonomous trucks
  • Cloud-native technologies
  • Cutting-edge solutions

Key Requirements

  • Senior-level experience
  • Experience with Docker
  • Experience with Kubernetes
  • Experience with PyTorch
  • Experience with TensorFlow

Work Rights

Not specified
