Senior AI/ML Platform Engineer (LLM/SLM Inference)

Cisco UK

Job Summary

  • Join Cisco's CX AI Incubation Team to build and operate scalable AI systems that move from prototype to production for Intelligent Customer Experiences.
  • The role focuses on end-to-end AI DevOps for LLMs/SLMs, including on-prem inference packaging, runtime optimization, and model/service observability.
  • Candidates will optimize inference performance across CPU and GPU environments using quantization, batching, and KV-cache strategies to ensure cost and latency efficiency.

Salary

  • Base: $199,700.00 - $254,600.00 (US/Canada)
  • Bonus/Equity: Eligible for annual bonuses and restricted stock units
  • Benefits: Medical, dental, vision, 401(k) match, paid time off

Skills & Requirements

Must-have

  • Production services in Python, Java, or C++
  • PyTorch, TensorFlow, ML lifecycle
  • NLP and generative AI production deployment
  • LLM/SLM inference optimization
  • GPU and CPU runtime tuning

Nice-to-have

  • Experience with vLLM, Triton, or TensorRT-LLM
  • Air-gapped, customer-managed deployments
  • Edge deployment under resource constraints
  • Cross-functional team collaboration
  • Responsible AI behavior integration

Key Requirements

  • Bachelor's degree with 7+ years of experience, or Master's degree with 4+ years
  • Experience deploying NLP/Generative AI systems in production
  • Strong software engineering skills in Python, Java, or C++

Work Rights

Not specified