Tencent Music Entertainment Group is seeking a Senior AI Inference Systems Engineer in Palo Alto, California, to lead the optimization of inference pipelines for large models and engage in innovative research in heterogeneous computing. The ideal candidate will possess a strong technical background in AI inference optimization, hardware architecture, and distributed systems, complemented by advanced degrees and significant experience in the field.
Job Summary
The role involves leading the optimization of the full inference pipeline for large models such as LLMs and multimodal systems.
Candidates must possess deep expertise in heterogeneous computing and hardware-specific tuning for real-time and batch inference scenarios.
Employees are eligible for a sign-on payment, relocation package, restricted stock units, and comprehensive medical and retirement benefits.
Salary
Base: $120,100.00 to $225,700.00 per year
Bonus/Equity: Sign-on payment and restricted stock units available
Benefits: Medical, dental, vision, life, disability, 401(k), vacation, holidays, sick leave
Skills & Requirements
Must-have
Master's or Ph.D. in Computer Science
Proficient in AI accelerator architectures
Deep understanding of CUDA and Triton
Expertise in multi-level KV Cache management
Experience with PyTorch and TensorFlow frameworks
Nice-to-have
High-level publications or core patents
Experience tuning ultra-large-scale clusters
Strong analytical and cross-team collaboration skills
Key Requirements
Master's or Ph.D. degree required
Significant professional experience in AI inference optimization
Mastery of quantization and intelligent routing techniques