Distributed systems reasoning from first principles
End-to-end inference workload analysis
The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference
Job Summary
The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference.
You will build cost-to-serve estimates from microbenchmarks and create tools that help cross-functional teams reason about latency, capacity, utilization, and cost tradeoffs.
OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
Matching Summary
The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference.
Salary
Not specified; Not specified; Not specified
Skills & Requirements
Must-have
Performance profiling and benchmarking expertise
Distributed systems reasoning from first principles
End-to-end inference workload analysis
Nice-to-have
Collaboration with engineering and research teams
Comfort working across abstraction layers
Deep hardware efficiency knowledge
Key Requirements
Deep expertise with performance profiling and optimization
Experience with kernel and accelerator optimization