Evals Engineer, Applied AI

Scale AI

San Francisco, CA, US
On-site

Job Summary

  • This high-impact role is critical to our mission of delivering the industry's leading GenAI Evaluation Suite.
  • Partner with Scale’s Operations team and enterprise customers to translate ambiguity into structured evaluation data, guiding the creation and maintenance of gold-standard human-rated datasets and expert rubrics that anchor AI evaluation systems.
  • Compensation packages at Scale for eligible roles include base salary, equity, and benefits.

Salary

Base: $179,400 to $224,250 USD; Equity: subject to Board of Directors approval; Benefits: comprehensive health, dental, and vision coverage, retirement benefits, learning and development stipend, generous PTO, commuter stipend

Skills & Requirements

Must-have

  • Large Language Models (LLMs)
  • GenAI Evaluation Suite
  • LLM-as-a-Judge autorater
  • Python and ML frameworks
  • AI evaluation methodologies

Nice-to-have

  • Integrating novel research ideas
  • Experience in dynamic, fast-paced research environments
  • Collaboration with enterprise customers
  • Published ML/AI research

Key Requirements

  • 2+ years of experience in machine learning or applied research
  • Bachelor's degree in computer science or a related field
  • Hands-on experience with LLMs and generative AI
  • Proficiency in Python and major ML frameworks

Work Rights

Not specified
