TL, Research Inference

OpenAI

San Francisco, United States
High-performance inference runtimes
GPU-centric performance engineering
Distributed inference across multiple GPUs

Job Summary

  • You will build the systems that enable advanced AI models to run efficiently at scale.
  • Your work will directly influence how models are designed, evaluated, and iterated on across the research organization.
  • This is not a product-serving role. It is a research-enabling systems role focused on performance, correctness, and realism, ensuring that AI research is grounded in what can actually scale.

Matching Summary

You will build the systems that enable advanced AI models to run efficiently at scale.

Skills & Requirements

Must-have

  • high-performance inference runtimes
  • GPU-centric performance engineering
  • distributed inference across multiple GPUs
  • inference-critical operators and kernels
  • diagnosing performance bottlenecks

Nice-to-have

  • research-enabling systems role
  • hands-on technical ownership
  • solving hard ambiguous systems problems

Key Requirements

  • experience building production inference systems
  • experience with multi-GPU or distributed systems
  • ability to understand research ideas and implement them

Work Rights

Not specified
