AI Intern – VLA Deployment

XPENG Inc

Santa Clara, CA, United States
On-site

Job Summary

  • This role focuses on optimizing and deploying large-scale multimodal models onto vehicle-grade compute platforms for autonomous driving.
  • Candidates will support model quantization, pruning, and other compression techniques under the guidance of senior engineers.
  • The position involves collaborating with research and platform teams to improve model deployability and to analyze performance metrics such as latency and memory usage.

Skills & Requirements

Must-have

  • Strong C++ and Python programming skills
  • Familiarity with PyTorch deep learning framework
  • Understanding of model inference and deployment workflows
  • Knowledge of ONNX, TensorRT, or similar frameworks
  • Exposure to INT8/FP16 quantization concepts

Nice-to-have

  • Experience with CUDA or GPU programming
  • Background in Transformers or multimodal models
  • Interest in computer architecture and edge systems
  • Previous internship in embedded AI or inference acceleration
  • Contributions to open-source repositories

Key Requirements

  • BS, MS, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field
  • Strong problem-solving skills in a fast-paced engineering environment

Work Rights

Not specified
