Job Summary
Together AI is building the AI Acceleration Cloud, an end-to-end platform combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure.
The role involves designing and building the IaaS software layer for a new GB200 data center with thousands of GPUs while working on a global multi-exabyte high-performance object store.
Candidates will create services, tools, and developer documentation while performing architecture and research work for decentralized AI workloads.
Salary
Base: $160,000 - $230,000
Equity: Startup equity included
Benefits: Health insurance and other benefits
Skills & Requirements
Must-have
5+ years of professional software development experience
High-performance, production-quality code
Globally distributed micro-service architectures
Strong fundamental software development skills
Systems knowledge and troubleshooting abilities
Nice-to-have
Deep experience with Kubernetes internals
Experience virtualizing GPUs and InfiniBand
Knowledge of DPUs and SmartNICs
Familiarity with NCCL and CUDA programming
Experience with Cluster API
Key Requirements
5+ years of professional software development experience
Proficiency in at least one backend programming language (Golang desired)
Experience building globally distributed micro-service architectures