Not specified (likely hybrid or onsite based on typical hpc roles)
8+ years experience in large scale compute infrastructure
Experience with job schedulers like slurm and sge
Strong script-writing skills in python and bash
Topjobstoday is seeking a Senior HPC Site Reliability Engineer to enhance their HPC infrastructure, focusing on designing and optimizing private compute clouds for advanced applications in chip modeling and deep learning. The ideal candidate will have extensive experience in large-scale compute environments, job scheduling, and cloud services, along with strong scripting skills
Job Summary
Join NVIDIA's mission to improve HPC infrastructure and innovate technology.
Provide leadership in designing and implementing large-scale compute cloud.
Help tackle strategic challenges in resource utilization and cloud strategy.
Matching Summary
Match Score: 85
Topjobstoday is seeking a Senior HPC Site Reliability Engineer to enhance their HPC infrastructure, focusing on designing and optimizing private compute clouds for advanced applications in chip modeling and deep learning. The ideal candidate will have extensive experience in large-scale compute environments, job scheduling, and cloud services, along with strong scripting skills.
Skills & Requirements
Must-have
8+ years experience in large scale compute infrastructure