Software Engineer, Inference - Performance Optimization

OpenAI

San Francisco, United States
Not specified; not specified; not specified
Performance profiling and benchmarking expertise
Distributed systems reasoning from first principles
End-to-end inference workload analysis
The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference

Job Summary

  • The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference.
  • You will build cost-to-serve estimates from microbenchmarks and create tools that help cross-functional teams reason about latency, capacity, utilization, and cost tradeoffs.
  • OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.

Matching Summary

The team analyzes inference stack performance across application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference.

Salary

Not specified; Not specified; Not specified

Skills & Requirements

Must-have

  • Performance profiling and benchmarking expertise
  • Distributed systems reasoning from first principles
  • End-to-end inference workload analysis

Nice-to-have

  • Collaboration with engineering and research teams
  • Comfort working across abstraction layers
  • Deep hardware efficiency knowledge

Key Requirements

  • Deep expertise with performance profiling and optimization
  • Experience with kernel and accelerator optimization
  • Knowledge of networking and fleet scheduling

Work Rights

Not specified

Tailored Resume

Cover Letter