Member Of Technical Staff — Inference

RadixArk

Palo Alto, CA, United States
On-site
5+ years systems engineering experience
Large-scale LLM inference systems expertise
Deep GPU architecture understanding
RadixArk is seeking a Member of Technical Staff for Inference to optimize large-scale AI inference systems, focusing on performance and efficiency across thousands of GPUs. The ideal candidate will have extensive experience in systems engineering and ML infrastructure, particularly in performance-critical environments.

Job Summary

  • RadixArk is seeking a Member of Technical Staff to push the limits of large-scale AI inference by working on core systems serving frontier models.
  • This role sits at the intersection of systems engineering, ML infrastructure, and performance optimization to shape how state-of-the-art models are deployed worldwide.
  • The company has optimized kernels serving billions of tokens daily and contributed to infrastructure powering leading AI companies and research labs.

Salary

Competitive compensation; meaningful equity included; comprehensive benefits and flexible work arrangements

Skills & Requirements

Must-have

  • 5+ years systems engineering experience
  • Large-scale LLM inference systems expertise
  • Deep GPU architecture understanding
  • Latency and throughput optimization skills
  • Distributed systems and networking fundamentals
  • Proficiency in C++, Rust, Go, or Python

Nice-to-have

  • Experience with vLLM or TensorRT-LLM stacks
  • CUDA and Triton kernel optimization background
  • KV-cache management and scheduling strategies
  • Inference at scale (1000+ GPUs) experience
  • HPC or high-performance systems background
  • Open-source contributions in ML infrastructure

Key Requirements

  • 5+ years of experience in systems engineering or ML infrastructure
  • Strong debugging skills across system layers
  • Experience profiling compute-intensive workloads

Work Rights

Not specified
