Solutions Architect, Inference Deployments

NVIDIA Corporation

Job Summary

  • The role involves building inference pipelines with NVIDIA Dynamo to distribute tasks among GPU workers for improved efficiency.
  • Candidates will collaborate with DevOps teams to orchestrate disaggregated inference using Kubernetes for complex enterprise workloads.
  • Base salary ranges from $152,000 to $287,500 depending on level (Level 3 or Level 4), plus equity eligibility and a comprehensive benefits package.

Salary

  • Base: $152,000 - $241,500 (Level 3) or $184,000 - $287,500 (Level 4)
  • Bonus/Equity: Eligible for equity
  • Benefits: Comprehensive benefits package included

Skills & Requirements

Must-have

  • 5+ years in Solutions Architecture
  • Distributed systems deployment on Kubernetes
  • AI inference workload experience
  • NVIDIA Dynamo or Triton Inference Server
  • GPU orchestration with NVIDIA operators
  • Low-latency networking and memory hierarchies

Nice-to-have

  • Experience with NVIDIA NIM and NIXL technologies
  • Deep understanding of transformer neural networks
  • Knowledge of quantization and speculative decoding
  • Contributions to open-source AI projects
  • NVIDIA Certified AI Engineer credentials

Key Requirements

  • BS in CS/Engineering or equivalent experience
  • 5+ years in Solutions Architecture
  • Proven track record deploying distributed systems

Work Rights

Not specified
