Senior Deep Learning Kernel Software Performance Architect

Sheto


Job Summary

  • Craft GPU-accelerated system architectures that push the boundaries of deep learning performance.
  • Collaborate closely across NVIDIA teams such as CUDA Compiler teams, AI/ML training and inference performance teams, and hardware architecture performance teams.
  • NVIDIA is widely considered one of the technology world’s most desirable employers, with a focus on creativity and on pushing silicon to its highest performance.

Salary

  • Base (Level 3): 152,000 USD - 218,500 USD
  • Base (Level 4): 184,000 USD - 287,500 USD
  • Bonus/Equity: Eligible
  • Benefits: Eligible

Skills & Requirements

Must-have

  • GPU computing and parallel programming
  • High performance kernel development
  • Analytical performance modeling and profiling
  • Deep learning software performance optimization
  • Programming in Python, C, and C++
  • Machine learning and deep learning fundamentals

Nice-to-have

  • Collaboration with cross-functional teams
  • Experience with CUDA compiler and AI/ML performance teams
  • Prototyping high-performance software
  • Visualization and optimization using simulators and test suites

Key Requirements

  • Master's or PhD in Computer Science, Electrical Engineering, or Computer Engineering, or equivalent experience
  • 5+ years of relevant industry or research experience
  • Experience in math library performance analysis and profiling
  • Familiarity with GPU computing and parallel programming models

Work Rights

Not specified
