Machine Learning Engineer - Inference

Together AI

San Francisco, California, United States
On-site
3+ years of experience writing high-performance production code
Proficiency in Python and PyTorch
Deep understanding of low-level OS concepts

Job Summary

  • The role focuses on designing and building production systems that power the Together AI inference engine to ensure reliability and performance at scale.
  • Candidates will collaborate closely with AI researchers and engineers to optimize runtime inference services for large-scale AI applications.
  • Together AI offers competitive compensation including a base salary range of $160,000 - $230,000 plus equity and benefits.

Salary

Base: $160,000 - $230,000; Bonus/Equity: Startup equity included; Benefits: Health insurance and other competitive benefits

Skills & Requirements

Must-have

  • 3+ years of experience writing high-performance production code
  • Proficiency in Python and PyTorch
  • Deep understanding of low-level OS concepts
  • Experience building high-performance libraries

Nice-to-have

  • Knowledge of inference systems such as TGI, vLLM, or TensorRT-LLM
  • Familiarity with speculative decoding techniques
  • CUDA or Triton programming knowledge
  • Background in Rust, Cython, or compilers

Key Requirements

  • 3+ years of professional software engineering experience
  • Production-quality coding skills with extensive testing
  • Strong grasp of multi-threading and memory management

Work Rights

Not specified
