Senior Engineer II, GPU Kernel and Performance

DigitalOcean

Boston, United States
On-site

Job Summary

  • DigitalOcean is seeking a Senior Engineer to maximize throughput and minimize latency for advanced large models in its Inference Cloud.
  • The role involves designing high-performance GPU kernels using Triton and CUDA C++ while implementing state-of-the-art quantization techniques.
  • Employees receive competitive compensation including base salary, potential bonuses, equity grants, and reimbursement for conferences and training.

Salary

  • Base: $167,200 to $209,000
  • Bonus/Equity: potential performance-based bonus; equity grants upon hire; ESPP option available
  • Benefits: competitive array including flexible time off and EAP

Skills & Requirements

Must-have

  • Expert-level Triton or CUDA programming
  • Deep understanding of GPU architectures
  • Experience with FP8, INT8, and FP4 quantization
  • Track record of achieving >80% of hardware peak performance
  • Strong grasp of how linear algebra maps to parallel hardware

Nice-to-have

  • Contributions to the Triton compiler
  • Custom CUDA kernels for major LLMs
  • Growth mindset and big, bold thinking
  • Ability to act as a force multiplier
  • Passion for simplifying cloud and AI

Key Requirements

  • Senior level engineering experience
  • Proven optimization track record on GPU hardware

Work Rights

Not specified
