Job Summary
Provide technical leadership to HPC engineering staff, coordinate the team, and guide architectural and operational decisions for the HPC environment.
Design, deploy, and maintain the university’s high-performance computing cluster, configure and maintain the workload scheduler, and architect quality-of-service policies.
Administer Linux systems across infrastructure projects, deploy new GPUs for research and teaching, and deploy and support AI workloads.
Matching Summary
Provide technical leadership to HPC engineering staff, coordinate the team, and guide architectural and operational decisions for the HPC environment.
Skills & Requirements
Must-have
Linux systems administration
RHEL/CentOS Linux operating system management
Systems architectures, security, networking, storage
Parallel computing, batch/scheduling systems
Deploy and support AI workloads
Large-scale research computing platforms
Nice-to-have
Technical leadership to HPC staff
Architect quality-of-service policies
Vendor evaluations and proof of concept
Advanced technical support to faculty
Key Requirements
Bachelor's degree in Computer Science
3 years of experience in Linux systems administration
Experience in research computing environment
Experience with programming languages (C, C++, Bash, Perl)
Experience with source control systems (Git)
Experience with log correlation software (Sumo Logic)
Experience with HPC environments (SLURM, GPFS)
Experience with machine learning frameworks (TensorFlow)