Staff Machine Learning Engineer, Voice AI

Together AI

San Francisco, United States
On-site
8+ years ML engineering experience
TensorRT-LLM or SGLang expertise
Python and PyTorch proficiency
Together AI is building the best inference infrastructure for voice applications to power production-grade, real-time voice agents.

Job Summary

  • The role involves owning the model serving stack for STT, TTS, and speech-to-speech while optimizing latency and throughput on H100s and B200s.
  • Candidates must drive the technical strategy for next-generation audio-native LLMs and end-to-end speech-to-speech systems before they become mainstream.

Skills & Requirements

Must-have

  • 8+ years ML engineering experience
  • TensorRT-LLM or SGLang expertise
  • Python and PyTorch proficiency
  • GPU optimization and CUDA kernels
  • System design at production scale
  • Real-time voice inference architecture

Nice-to-have

  • Audio codec tokenization schemes
  • Speech-to-speech model paradigms
  • Fine-tuning speech models at scale
  • Collaboration with model partners
  • Developer tooling product intuition

Key Requirements

  • Bachelor's or Master's in Computer Science or related field
  • 8+ years of ML engineering experience
  • Deep practical expertise in LLM serving engines

Work Rights

Not specified
