Job Summary
This role focuses on building transferable expertise in agent evaluation engineering for real-world LLM agent systems in user acquisition (UA) performance marketing.
The intern will design metrics for agent reliability, including success rate, tool-call precision, and safety-related failure rates.
There may be opportunities to write research papers and publish findings at scientific conferences.
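As a rough illustration of the reliability metrics named above, the sketch below computes a success rate, a tool-call precision, and a safety-related failure rate from labeled agent runs. The posting does not define these metrics, so everything here is an assumption: the `Episode` and `ToolCall` records, their fields, and the choice to treat precision as the share of tool calls labeled correct are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    correct: bool  # assumed label: the call matched the expected tool and arguments


@dataclass
class Episode:
    succeeded: bool         # assumed label: the agent completed the task
    safety_violation: bool  # assumed label: the run triggered a safety failure
    tool_calls: list[ToolCall] = field(default_factory=list)


def _ratio(numerator: int, denominator: int) -> float:
    # Return 0.0 rather than raising when there is nothing to count.
    return numerator / denominator if denominator else 0.0


def reliability_metrics(episodes: list[Episode]) -> dict[str, float]:
    # Flatten tool calls across all episodes for the precision estimate.
    calls = [call for episode in episodes for call in episode.tool_calls]
    return {
        "success_rate": _ratio(sum(e.succeeded for e in episodes), len(episodes)),
        "tool_call_precision": _ratio(sum(c.correct for c in calls), len(calls)),
        "safety_failure_rate": _ratio(
            sum(e.safety_violation for e in episodes), len(episodes)
        ),
    }


# Hypothetical sample data, not taken from any real system.
sample_episodes = [
    Episode(True, False, [ToolCall("search", True), ToolCall("report", True)]),
    Episode(False, True, [ToolCall("search", True), ToolCall("bid_update", False)]),
    Episode(True, False),
    Episode(False, False),
]
metrics = reliability_metrics(sample_episodes)
```

In practice the labels would come from logs and traces rather than hand-written records, and each metric would need a definition agreed with the team before it is reported.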
Skills & Requirements
Must-have
Strong Python programming fundamentals
Interest in AI systems and LLM agents
Ability to analyze logs and traces
Detail-oriented approach to evaluation
Nice-to-have
Experience with LangChain-style agent frameworks
Familiarity with pytest and observability tools
Prior experience with data analysis
Curiosity about agent failure modes
Key Requirements
Currently pursuing, or recently completed, a Master's or PhD degree
Degree in Computer Science, AI, ML, or related field