xAI is seeking a Member of Technical Staff for its Compute Infrastructure team to build and optimize a large-scale AI supercomputer. The role requires expertise in GPU clusters, low-level systems programming, and Linux kernel internals, with the aim of enhancing AI training and inference capabilities.
Job Summary
xAI is building one of the world's largest AI supercomputers from the ground up with a mission to create AI systems that understand the universe.
The role involves owning both raw GPU supercomputer hardware and the platform layer, working across the full stack from low-level kernel optimizations to massive-scale orchestration.
Total rewards include a base salary ranging from $180,000 to $440,000 USD plus equity, comprehensive medical coverage, and access to a 401(k) retirement plan.
Matching Summary
Match Score: 85
Salary
Base: $180,000 to $440,000 USD
Bonus/Equity: Equity included in total rewards package
Benefits: Comprehensive medical, vision, dental, 401(k), disability, life insurance
Skills & Requirements
Must-have
Deep low-level systems programming (C/C++, Rust)
Experience building high-performance, exabyte-scale storage
Hands-on work with GPU kernel optimization (CUTLASS)
Strong experience with large-scale GPU clusters
Experience with Linux kernel internals (e.g., scheduling)
Nice-to-have
Ability to reason from first principles
Track record of building AI infrastructure platforms
Ability to optimize for memory-bound and compute-bound scenarios
Appreciation for a flat organizational structure
Strong communication skills for knowledge sharing
Key Requirements
Deep low-level systems programming expertise
Experience with production-scale distributed compute infrastructure
Background in high-performance infrastructure for AI workloads