Job Summary
Lead and grow a high-performing team of AI inference engineers focused on building and scaling infrastructure for Abridge’s products and APIs.
Own the technical direction of our inference systems—making key decisions around batching, throughput, latency, and GPU utilization.
Abridge is transforming healthcare delivery experiences with generative AI, enabling clinicians and patients to connect in deeper, more meaningful ways.
Skills & Requirements
Must-have
ML systems and inference frameworks
LLM architecture understanding
inference optimizations
GPU characteristics and performance analysis
distributed, real-time systems at scale
parallelism strategies
Nice-to-have
startup urgency and focus
training infrastructure and RL workloads
secure, compliant systems on cloud platforms
Key Requirements
5+ years of engineering experience
1+ years in a technical leadership or management role
Deep, hands-on experience with ML systems
Strong understanding of LLM architecture
Experience with inference optimizations
Familiarity with GPU characteristics
Experience deploying reliable, distributed, real-time systems