MLOps & Agentic Platform Engineer (AI Infrastructure)

Hyphen Partners

Boston, United States
On-site

Job Summary

  • This role involves managing model registries, developing continuous training loops, and implementing A/B testing infrastructure.
  • The ideal candidate will deploy agents as scalable microservices on Kubernetes and build observability dashboards to track token usage and latency.
  • Candidates must possess a strong DevOps/MLOps background with experience in tools like Kubernetes, Docker, and Terraform.

Skills & Requirements

Must-have

  • Manage model registries and continuous training loops
  • Deploy agents as scalable microservices on Kubernetes
  • Build observability dashboards for token usage
  • Strong DevOps/MLOps background with Kubernetes, Docker, and Terraform
  • Experience with MLflow, Weights & Biases, or LangSmith

Nice-to-have

  • Knowledge of building scalable microservice architectures
  • Adept at implementing A/B testing infrastructure
  • Experience tracking agent reasoning paths in dashboards

Key Requirements

  • Strong DevOps/MLOps background
  • Experience with MLflow or Weights & Biases
  • Knowledge of scalable microservice architectures

Work Rights

Not specified
