Senior System Software Engineer - AI Performance and Efficiency Tools

Sheto

Hybrid

Job Summary

  • This role involves building internal profiling, debugging, benchmarking, and simulation tools for AI workloads on GPU clusters.
  • The team collaborates with hardware architects and software teams to improve performance and efficiency of AI systems and GPU clusters.
  • The position offers a competitive base salary, equity, and benefits in a diverse and inclusive work environment.

Salary

  • Base (Level 4): 184,000 USD - 287,500 USD
  • Base (Level 5): 224,000 USD - 356,500 USD
  • Bonus/Equity: Eligible for equity
  • Benefits: Eligible for benefits

Skills & Requirements

Must-have

  • C++ and Python software development
  • Deep Learning frameworks knowledge
  • GPU cluster job scheduling experience
  • NVIDIA GPUs and CUDA programming
  • Profiling and debugging AI workloads

Nice-to-have

  • Strong problem-solving skills
  • Customer-facing communication skills
  • Passion for continuous learning
  • Experience with Linux device drivers
  • Knowledge of GPU and CPU architecture

Key Requirements

  • BS+ in Computer Science or related field
  • 5+ years software development experience
  • Experience with distributed training and inference
  • Knowledge of Slurm or Kubernetes scheduling
  • Experience with NCCL
  • Ability to work with global teams

Work Rights

Not specified
