Job Summary
This role involves leading the design and operationalization of complex, production-grade agentic systems that plan, call tools safely, and continuously improve through feedback.
You will define safe tool-use patterns, including structured outputs, permissioning, auditability, and human-in-the-loop approval steps for sensitive actions.
You will establish end-to-end AgentOps/LLMOps practices to ensure release pipelines, canary strategies, and safe rollback mechanisms are in place for agentic systems.
Skills & Requirements
Must-have
Multi-agent system architecture
Production-grade agentic solutions
LLMOps and AgentOps practices
Safe tool-use patterns and guardrails
Distributed tracing and observability
Data drift detection and monitoring
Nice-to-have
Mentoring senior and junior engineers
Cross-functional stakeholder influence
Creation of reusable patterns and playbooks
Continuous improvement of evaluation coverage
Key Requirements
Bachelor's degree in Computer Science or related field
Advanced degree preferred
Sustained ownership of production AI/ML systems
Real-world experience shipping complex agentic systems
Proven ability to define observability and reliability practices