Job Summary
The role involves designing and implementing CI/CD pipelines for AI and ML model training, evaluation, and the deployment of RAG systems, including LLMs and vector databases.
Candidates will provision and manage AI infrastructure across cloud hyperscalers such as AWS and GCP, using infrastructure-as-code tools such as Terraform.
The position requires collaborating with data scientists and AI engineers to ensure smooth transitions from experimentation to production while maintaining security best practices.
Skills & Requirements
Must-have
4+ years of DevOps or MLOps experience
Cloud-native services: AWS, GCP, Azure ML
CI/CD tools: GitHub Actions, ArgoCD, Jenkins
Scripting languages: Python, Bash, Go
Deep knowledge of Kubernetes and container lifecycle management
Proficiency with infrastructure-as-code (Terraform)
Nice-to-have
Experience with MLflow, Kubeflow, SageMaker Pipelines
Familiarity with prompt engineering and model fine-tuning
Knowledge of compliance frameworks for secure AI deployment
Experience with model versioning, drift detection, and rollback strategies
Exposure to agent orchestration frameworks: LangChain, LangGraph, CrewAI
Key Requirements
4+ years of DevOps, MLOps, or infrastructure engineering experience
2+ years in AI/ML environments preferred
Hands-on experience with GPU infrastructure management