Software Engineer - Voice AI (Inference Runtime)

Baseten

San Francisco, California, United States
Remote

Job Summary

  • This role involves owning the end-to-end product roadmap and engineering implementation for Baseten's in-house Voice AI inference stack.
  • The team focuses on reducing tail latency and increasing throughput for state-of-the-art open-source voice models in production environments.
  • Baseten offers competitive compensation with meaningful equity, 100% health coverage, and a flexible PTO policy including a company-wide winter break.

Salary

Competitive compensation; Equity included; Benefits: Medical, dental, vision, parental leave, 401(k)

Skills & Requirements

Must-have

  • Production-grade, large-scale real-time systems
  • Tail latency optimization (p95/p99)
  • Python programming proficiency
  • AI coding assistant usage

Nice-to-have

  • Model serving runtime experience (vLLM, TensorRT)
  • Containerization and orchestration (Docker, Kubernetes)
  • Familiarity with speech and audio ML models
  • Systems-level performance profiling skills

Key Requirements

  • Bachelor's degree in Computer Science or related field
  • Proven track record with high-performance real-time systems
  • Strong collaboration and communication skills

Work Rights

Not specified
