FAL is seeking a Software Engineer specializing in distributed systems to develop large-scale computing platforms using Python or Rust. The ideal candidate will have over five years of experience in building complex, reliable systems and will thrive in an environment that emphasizes innovation and continuous improvement.
Job Summary
The role involves building a core Python/Rust platform for request routing, AI workload orchestration, and GPU autoscaling.
Candidates must produce forward-looking designs that scale the system to 100x current traffic while maintaining low latency globally.
The company offers interesting work, significant learning opportunities, and regular team events and offsites.
Matching Summary
Match Score: 85
Skills & Requirements
Must-have
5+ years experience building distributed compute platforms
Deep expertise in Python or Rust programming
Strong understanding of consensus and fault tolerance
Track record of designing systems under real production load
Nice-to-have
Experience with AI/ML inference or training infrastructure
Background in building multi-tenant compute platforms
Understanding of networking fundamentals and performance characteristics
Familiarity with GPU workload characteristics and scheduling constraints