Engineering Manager, Model Serving

Together AI

San Francisco, CA, United States
Base: $250,000 - $300,000; equity: startup equity ...
On-site
Together AI is building an AI Inference & Model Shaping Platform to bring advanced generative AI models to the world.

Job Summary

  • The role involves owning availability and performance SLAs for production inference services across serverless and dedicated deployments.
  • Candidates will mentor team members, drive reliability improvements, and partner with cross-functional teams to optimize system efficiency.

Salary

Base: $250,000 - $300,000; Equity: Startup equity included; Benefits: Health insurance and other competitive benefits

Skills & Requirements

Must-have

  • 5+ years production ML inference experience
  • Deep expertise with Kubernetes and multi-cluster orchestration
  • Experience with LLM inference frameworks like vLLM or SGLang
  • Proven track record of SLA ownership and incident response

Nice-to-have

  • Experience building internal developer platforms
  • Background in GPU infrastructure cost optimization
  • Contributions to open-source ML infrastructure projects

Key Requirements

  • 5+ years operating production ML systems at scale
  • 2+ years in senior IC or tech lead roles
  • Experience with multi-tenant SaaS platforms

Work Rights

Not specified
