Workload Optimization Intern

Intel Retiree Medical Plan Trust

Shanghai, China
On-site
C++ and Python programming proficiency
Deep learning fundamentals understanding
Master's or Ph.D. student status
Intel Retiree Medical Plan Trust is seeking a Graduate Technical Intern for its AI Engineering team in Shanghai. The role focuses on performance optimization, deployment architecture, kernel development, and innovation in deep learning solutions.

Job Summary

  • The role focuses on optimizing key use cases and models while debugging accuracy and memory management issues.
  • Candidates will design deployment frameworks leveraging new features in vLLM to accelerate inference performance.
  • This position requires developing high-performance kernels specifically for Intel GPU and CPU architectures.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • C++ and Python programming proficiency
  • Deep learning fundamentals understanding
  • Master's or Ph.D. student status

Nice-to-have

  • Experience with LLMs and multimodal models
  • Familiarity with the vLLM inference framework
  • Hands-on GPU kernel development experience

Key Requirements

  • Current Master's or Ph.D. student in CS, AI, or related field
  • Minimum 4 days per week availability
  • Commitment of 6 months or longer

Work Rights

Not specified
