HPC Cluster Architect

NexGen Cloud

UK
Remote
NexGen Cloud is seeking an HPC Cluster Architect to own the design and deployment of large-scale GPU clusters. The ideal candidate will have hands-on experience in HPC cluster design and a solid understanding of NVIDIA GPU platforms. The role is remote, within a collaborative company culture.

Job Summary

  • This role owns the full architecture cycle from customer conversation to production deployment for large-scale dedicated GPU cluster contracts.
  • The successful candidate will act as a senior technical authority, translating complex design trade-offs into clear decisions while engaging directly with OEMs and vendors.
  • NexGen Cloud offers real ownership and autonomy in a fast-moving team that equips people with AI at every level to solve harder problems.

Salary

Competitive salary; annual discretionary bonus scheme; employee wellbeing benefits

Skills & Requirements

Must-have

  • NVIDIA GPU platform experience (H100, H200, B-series)
  • InfiniBand/RDMA topology and performance tuning
  • Linux systems, PCIe topology, and NUMA alignment
  • Full-lifecycle design-to-deployment experience for GPU clusters
  • OEM and vendor engagement, and hardware validation

Nice-to-have

  • Spectrum-X or next-generation Ethernet fabric experience
  • Large-scale cluster deployments of more than 1,000 GPUs
  • Liquid-cooled HPC environment exposure
  • Infrastructure-as-code automation skills
  • Performance benchmarking with NCCL and MLPerf

Key Requirements

  • Proven experience designing GPU-based HPC or AI clusters at scale
  • Background from an OEM, hyperscaler, neo-cloud, or enterprise/research HPC environment
  • Confident ability to engage with customers, vendors, and internal engineering teams

Work Rights

Not specified
