Job Summary
Work with cutting-edge 400G/800G Ethernet fabrics used in AI and GPU data centers, gaining exposure to IP-routed and VXLAN fabrics and the full NVMe-oF stack.
Configure and validate Linux/Ubuntu servers for RoCEv2 and NVMe-oF traffic generation using 400G NICs, performing system-level tuning to optimize RDMA performance.
Collaborate cross-functionally with hardware, software, and AI platform teams to enhance fabric performance and reliability, ensuring production-ready outcomes.
Skills & Requirements
Must-have
RDMA, RoCEv2, NVMe over IP Fabrics
Linux/Ubuntu systems administration and tuning
Python scripting and automation
PFC and ECN configuration
Traffic generation tools
IP, BGP, VXLAN, TCP congestion control, QoS
Linux debugging utilities
Nice-to-have
Network telemetry and monitoring tools
AI or GPU cluster interconnect testing
Cisco Nexus 9K familiarity
Ansible or Jenkins automation
SPDK/DPDK and user-space NVMe-oF
Key Requirements
8-12 years of experience
Solid understanding of RDMA, RoCEv2, and NVMe over IP Fabrics
Strong hands-on experience with Linux/Ubuntu systems administration and tuning
Proficiency in Python scripting and automation frameworks
Experience configuring PFC and ECN parameters
Familiarity with traffic generation tools
Deep knowledge of network protocols, including IP, BGP, VXLAN, TCP congestion control, and QoS
Strong debugging skills using Linux utilities
Experience tuning 400G NICs
Bachelor’s or Master’s degree in Computer Engineering, Electrical Engineering, Computer Science, or equivalent practical experience