Join Cisco's CX AI Incubation Team to build and operate scalable AI systems that move from prototype to production for Intelligent Customer Experiences
Job Summary
The role focuses on end-to-end AI DevOps for LLMs/SLMs, including on-prem inference packaging, runtime optimization, and model/service observability.
Candidates will optimize inference performance across CPU and GPU environments using quantization, batching, and KV-cache strategies to ensure cost and latency efficiency.
Salary
Base: $199,700.00 - $254,600.00 (US/Canada)
Bonus/Equity: Eligible for annual bonuses and restricted stock units
Benefits: Medical, dental, vision, 401(k) match, paid time off
Skills & Requirements
Must-have
Python, Java, or C++ for production services
PyTorch or TensorFlow across the ML lifecycle
NLP/Generative AI production deployment
LLM/SLM inference optimization
GPU and CPU runtime tuning
Nice-to-have
vLLM, Triton, or TensorRT-LLM experience
Air-gapped, customer-managed deployments
Edge deployment under resource constraints
Cross-functional team collaboration
Responsible AI behavior integration
Key Requirements
Bachelor's degree with 7+ years of experience, or Master's degree with 4+ years
Experience deploying NLP/Generative AI systems in production
Strong software engineering skills in Python, Java, or C++