Evals Engineer, Applied AI

Scale AI

San Francisco, CA, US
On-site

Job Summary

  • This high-impact role is critical to our mission of delivering the industry's leading GenAI Evaluation Suite.
  • Partner with Scale’s Operations team and enterprise customers to translate ambiguity into structured evaluation data, guiding the creation and maintenance of gold-standard human-rated datasets and expert rubrics that anchor AI evaluation systems.
  • Compensation packages at Scale for eligible roles include base salary, equity, and benefits.

Salary

$179,400—$224,250 USD; Equity: subject to Board of Director approval; Benefits: Comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO

Skills & Requirements

Must-have

  • Large Language Models (LLMs)
  • GenAI Evaluation Suite
  • LLM-as-a-Judge autorater frameworks
  • Python and major ML frameworks
  • AI evaluation systems

Nice-to-have

  • Passion for tackling complex evaluation challenges
  • Dynamic, fast-paced research environment
  • Integrating novel research ideas into workflows
  • Collaboration with operations or external teams

Key Requirements

  • 2+ years of experience in Machine Learning or Applied Research
  • Bachelor’s degree in Computer Science, Electrical Engineering, or related field
  • Hands-on experience with Large Language Models (LLMs)
  • Strong understanding of frontier model evaluation methodologies

Work Rights

Not specified
