Software Engineer, SystemML - AI Networking

Meta

Menlo Park, CA, US
On-site
Meta is seeking a Software Engineer for its AI Networking team to develop software for multi-GPU and multi-node data communication that improves distributed machine learning workloads. Ideal candidates will have strong programming skills in C/C++ and Python, along with experience in machine learning and high-performance computing.

Job Summary

  • The team owns the critical software stack around NCCL that enables multi-GPU and multi-node data communication for nearly every distributed GPU-based ML workload at Meta.
  • Engineers will lead the development of collective communication libraries with a specific focus on improving reliability and performance for large-scale GenAI and LLM training.
  • This role requires deep expertise in machine learning frameworks such as PyTorch and specialized experience in distributed training paradigms such as Data Parallel and Model Parallel.

Skills & Requirements

Must-have

  • C/C++ and Python programming skills
  • Distributed ML Training experience
  • GPU architecture knowledge
  • ML systems and AI infrastructure expertise
  • High-performance computing background

Nice-to-have

  • Experience with the NCCL library
  • PhD in Computer Science or related field
  • CUDA programming proficiency
  • RoCE/InfiniBand performance analysis
  • FSDP and Tensor Parallel implementation

Key Requirements

  • Bachelor's degree in Computer Science or equivalent practical experience
  • Proven track record of leading successful projects
  • Effective leadership and communication skills

Work Rights

Not specified