Contribute features to vLLM that enable the newest models to use the latest NVIDIA GPU hardware features; profile and optimize the vLLM inference framework with methods such as speculative decoding, data/tensor/expert/pipeline parallelism, and prefill-decode disaggregation.
Base: 108,000 USD - 195,500 USD; Bonus/Equity: equity; Benefits: benefits
Must-have
Nice-to-have
Not specified