Deep Learning Algorithms Engineer - Acot

Sheto

Job Summary

  • NVIDIA is seeking a motivated AI Acceleration & Optimization Engineer to improve the performance, scalability, and efficiency of modern AI models across NVIDIA GPU platforms.
  • The role involves collaborating with the architecture, research, CUDA, compiler, and framework teams to bring next-generation AI workloads from research to production with strong performance and reliability.
  • Responsibilities include optimizing AI models such as LLMs, VLMs, and diffusion models; profiling workloads; supporting optimization techniques; and contributing to benchmarking and internal tooling.

Skills & Requirements

Must-have

  • AI model optimization on GPUs
  • CUDA and parallel computing
  • Deep learning framework experience
  • Performance profiling and analysis
  • Python programming skills

Nice-to-have

  • Experience with TensorRT and vLLM
  • Knowledge of quantization and sparsity
  • Distributed training or inference experience
  • Open-source ML systems contributions
  • C++ proficiency and debugging skills

Key Requirements

  • Bachelor’s or Master’s degree in a relevant field, or equivalent experience
  • 2–4 years of experience in deep learning or high-performance computing
  • Good understanding of deep learning fundamentals and transformer architectures
  • Familiarity with GPU architecture and parallel computing concepts
  • Experience with at least one major ML framework
  • Programming skills in Python

Work Rights

Not specified
