Senior AI Ops Engineer | Kubernetes/Docker

KLA

Ann Arbor, MI, USA

Job Summary

  • This role will be pivotal in architecting and delivering the automation layer that enables fast, reproducible, and scalable model development, spanning end-to-end experiment management, model fine-tuning pipelines, and Reinforcement Learning from Human Feedback (RLHF).
  • We encourage you to apply if you’re a systems-minded engineer who loves turning research workflows into reliable production-grade pipelines, setting standards, and mentoring others to raise the bar across the organization.
  • KLA’s total rewards package for employees may also include participation in performance incentive programs and eligibility for additional benefits, including but not limited to: medical, dental, vision, life, and other voluntary benefits; 401(k) with company matching; employee stock purchase program (ESPP); student debt assistance; tuition reimbursement program; development and career growth opportunities and programs; financial planning benefits; wellness benefits, including an employee assistance program (EAP); paid time off and paid company holidays; and family care and bonding leave.

Salary

$134,800.00 - $229,200.00 annually

Skills & Requirements

Must-have

  • Python for ML automation
  • MLflow/W&B for experiment tracking
  • CI/CD for ML pipelines
  • Kubernetes and Docker
  • Distributed GPU training optimization
  • Automated ML model evaluation

Nice-to-have

  • Systems-minded engineer
  • Turning research into production
  • Mentoring and raising standards
  • Experience in semiconductor industry
  • Operating ML platforms with IP security

Key Requirements

  • 5+ years of experience in MLOps/Platform Engineering/DevOps/ML Engineering
  • Bachelor's degree in Computer Science, Software Engineering, or related field
  • Master's-level degree and 6 years of related work experience, OR Bachelor's-level degree and 8 years of related work experience, OR equivalent work experience
  • Experience with containerization (Docker)
  • Experience with orchestration (Kubernetes)
  • Experience with CI/CD
  • Experience with version control (Git)
  • Experience with Infrastructure-as-Code (Terraform/Bicep or equivalent)

Work Rights

Not specified
