Engineering Manager, Model Inference

Abridge

San Francisco, California, USA
Remote
ML systems and inference frameworks
LLM architecture understanding
Inference optimizations

Job Summary

  • Lead and grow a high-performing team of AI inference engineers focused on building and scaling infrastructure for Abridge’s products and APIs.
  • Own the technical direction of our inference systems—making key decisions around batching, throughput, latency, and GPU utilization.
  • Abridge is transforming healthcare delivery experiences with generative AI, enabling clinicians and patients to connect in deeper, more meaningful ways.

Skills & Requirements

Must-have

  • ML systems and inference frameworks
  • LLM architecture understanding
  • inference optimizations
  • GPU characteristics and performance analysis
  • distributed, real-time systems at scale
  • parallelism strategies

Nice-to-have

  • startup urgency and focus
  • training infrastructure and RL workloads
  • secure, compliant systems on cloud platforms

Key Requirements

  • 5+ years of engineering experience
  • 1+ years in a technical leadership or management role
  • Deep, hands-on experience with ML systems
  • Strong understanding of LLM architecture
  • Experience with inference optimizations
  • Familiarity with GPU characteristics
  • Experience deploying reliable, distributed, real-time systems
  • Experience with parallelism strategies

Work Rights

Not specified
