HPC and AI Cluster Engineer

NVIDIA

Not specified (assumed to be flexible, possibly hybrid or onsite based on industry norms).
3+ years of experience in HPC and AI
Linux networking and OS internals knowledge
Job scheduling tools like Slurm or K8s
NVIDIA is seeking an HPC and AI Cluster Engineer to join their Networking clusters solutions team, focusing on building supercomputers and AI clusters. The ideal candidate will have experience in managing large-scale HPC/AI clusters, as well as solid proficiency in both Linux and Windows environments. This role offers the opportunity to work with cutting-edge technologies in a collaborative and innovative environment.

Job Summary

  • The role involves deploying and maintaining large-scale supercomputers and AI clusters using groundbreaking technologies.
  • Candidates will work closely with scientific researchers and developers to craft improved workflows for accelerated computing.
  • NVIDIA offers highly competitive salaries, an extensive benefits package, and a work environment promoting diversity and inclusion.

Matching Summary

Match Score: 85

Salary

Not specified; highly competitive salaries and an extensive benefits package

Skills & Requirements

Must-have

  • 3+ years of experience in HPC and AI
  • Linux networking and OS internals knowledge
  • Job scheduling tools like Slurm or K8s
  • Python programming and bash scripting skills
  • Experience with CI/CD pipelines and automation

Nice-to-have

  • Knowledge of GPU architecture and CUDA
  • Familiarity with Lustre and GPFS storage
  • Experience with RDMA fabrics like InfiniBand
  • Background in container microservice technologies
  • Experience with DGX hardware platforms

Key Requirements

  • Bachelor's Degree in Computer Science or Engineering
  • 3+ years of relevant professional experience
  • Equivalent experience accepted in place of the degree

Work Rights

Not specified
