Software Engineer, Inference AI/ML

CoreWeave

Sunnyvale, CA, US
On-site
Python/Go/C++ development
Model-serving services
GPU platform

Job Summary

  • Join the Inference team to ship production features that improve latency, reliability, and cost for model serving on our GPU platform.
  • Implement well-scoped features and fixes in Python/Go/C++ for model-serving services (e.g., Triton, vLLM, TensorRT-LLM, Ray Serve).
  • CoreWeave offers a comprehensive benefits program including 100% paid medical, dental, and vision insurance, a 401(k) with employer match, and flexible PTO.

Salary

Base: $92,000 to $135,000; Bonus/Equity: discretionary bonus, equity awards; Benefits: comprehensive benefits program

Skills & Requirements

Must-have

  • Python/Go/C++ development
  • model-serving services
  • GPU platform
  • containerization and Kubernetes
  • Linux fundamentals
  • data structures and algorithms

Nice-to-have

  • performance experiments
  • micro-batching, KV cache, streaming
  • Grafana/Prometheus/OpenTelemetry
  • entrepreneurial outlook
  • independent thinking

Key Requirements

  • BS/MS in CS, EE, or related field, or equivalent practical experience
  • Git/CI basics
  • Exposure to containers and Kubernetes
  • Curiosity about GPU inference concepts
  • Internship or project deploying a microservice or an ML inference demo
  • Coursework/research with PyTorch or TensorFlow
  • Simple CUDA projects

Work Rights

Must be a US person (citizen, permanent resident, refugee, or asylee) or eligible for export authorization
