MLOps & Agentic Platform Engineer (AI Infrastructure)

Hyphen Connect

Boston, United States
On-site

Job Summary

  • The role involves managing model registries, developing continuous training loops, and implementing A/B testing infrastructure.
  • Candidates will deploy AI agents as scalable microservices on Kubernetes and build observability dashboards to track performance metrics.
  • The ideal candidate possesses a strong DevOps/MLOps background with expertise in tools like MLflow and Terraform.

Skills & Requirements

Must-have

  • Manage model registries and continuous training loops
  • Deploy agents as scalable microservices on Kubernetes
  • Build observability dashboards for token usage and latency

Nice-to-have

  • Experience with LangSmith or Weights & Biases
  • Strong background in DevOps practices
  • Knowledge of A/B testing infrastructure

Key Requirements

  • Strong DevOps/MLOps background
  • Experience with Kubernetes and Docker
  • Knowledge of MLflow or Weights & Biases

Work Rights

Not specified
