42DOT is seeking an LLM Engineer to evaluate and enhance the performance of large language models (LLMs) by developing assessment systems and platforms. The ideal candidate has more than three years of experience in LLM evaluation and deep learning, strong proficiency in Python, and a collaborative mindset.
Job Summary
The role focuses on building a robust evaluation system to ensure the reliability and continuous improvement of Large Language Models (LLMs).
Candidates will design automation pipelines using Argo Workflows and MLflow to manage end-to-end model validation and deployment verification.
This position requires establishing reproducible benchmarks and protocols to detect performance regressions in rapidly changing LLM environments.
Matching Summary
Match Score: 85