Senior Software Engineer - Together Cloud Platform

Together AI

San Francisco, United States
Base: $160,000 - $230,000; equity: startup equity ...
On-site

Job Summary

  • Together AI is building the AI Acceleration Cloud to combine the fastest LLM inference engine with state-of-the-art AI cloud infrastructure.
  • The role involves designing and developing foundational backend services that power a highly available global cloud platform serving internal and external customers.
  • Candidates will work on a distributed GPU scheduling system and manage a global plane for data center compute, networking, and storage.

Salary

Base: $160,000 - $230,000; Equity: Startup equity included; Benefits: Health insurance and remote flexibility

Skills & Requirements

Must-have

  • 5+ years building large-scale distributed systems
  • Expert-level Golang programming skills
  • Experience with Kubernetes and container orchestration
  • Strong knowledge of compute, networking, and storage
  • Proficiency with the PostgreSQL relational database

Nice-to-have

  • Experience with cloud providers such as AWS, Azure, or GCP
  • Familiarity with data infrastructure such as Kinesis, Airflow, and Kafka
  • Background in ML hardware virtualization
  • Experience with Slurm cluster management
  • Knowledge of GB200s/GB300s and BlueField DPUs

Key Requirements

  • Bachelor's or Master's degree in Computer Science or related field
  • 5+ years of experience with fault-tolerant distributed systems
  • Equivalent practical experience accepted in place of a formal degree

Work Rights

Not specified
