Job Summary
The role focuses on designing and building production systems that power the Together AI inference engine to ensure reliability and performance at scale.
Candidates will collaborate closely with AI researchers and engineers to optimize runtime inference services for large-scale AI applications.
Together AI offers competitive compensation including a base salary range of $160,000 - $230,000 plus equity and benefits.
Salary
Base: $160,000 - $230,000
Bonus/Equity: Startup equity included
Benefits: Health insurance and other competitive benefits
Skills & Requirements
Must-have
3+ years of experience writing high-performance production code
Proficiency in Python and PyTorch
Deep understanding of low-level OS concepts
Experience building high-performance libraries
Nice-to-have
Knowledge of the TGI, vLLM, or TensorRT-LLM inference systems
Familiarity with speculative decoding techniques
CUDA or Triton programming knowledge
Background in Rust, Cython, or compilers
Key Requirements
3+ years of professional software engineering experience
Production-quality coding skills with extensive testing
Strong grasp of multi-threading and memory management