On-site
Job Summary
DigitalOcean is seeking a Senior Engineer to play a key technical role in maximizing throughput and minimizing latency for advanced large models.
The successful candidate will act as a force multiplier by solving complex bottlenecks in memory bandwidth and guiding the technical roadmap for the high-performance inference fleet.
Employees receive competitive compensation including base salary, potential bonuses, equity grants, and reimbursement for conferences and training.
Salary
Base: $167,200 to $209,000
Bonus/Equity: Performance-based bonus; equity grants upon hire; ESPP available
Benefits: Competitive array including flexible time off and EAP
Skills & Requirements
Must-have
Expert-level Triton or CUDA programming
Deep understanding of GPU architectures
Optimize kernels to achieve >80% peak performance
Implement FlashAttention-4 and TileLang kernels
Develop FP8, INT8, and FP4 quantization techniques
Nice-to-have
Contributed to the Triton compiler
Strong grasp of linear algebra mapping
Growth mindset with big bold thinking
Experience with long-context attention mechanisms
Key Requirements
Track record of optimizing kernels to >80% theoretical hardware peak performance
Expert-level proficiency in Triton or CUDA C++
Deep understanding of streaming multiprocessors (SMs), warp scheduling, and Tensor Cores