Member of Technical Staff - Data Quality Engineer (Pre-training)

Reflection AI

San Francisco, United States
On-site

Job Summary

  • Your mission is to ensure that the data used to train our models meets a high bar for quality, reliability, and downstream impact.
  • Working closely with our pre-training teams, you will own upstream data quality for LLM pre-training, and you will design, validate, and scale automated QA methods.
  • We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

Skills & Requirements

Must-have

  • Python for ML/LLM workflows
  • Scalable code development
  • Large dataset experience
  • Automated QA systems
  • LLM training and evaluation understanding

Nice-to-have

  • Deep curiosity about data quality
  • Collaborating with researchers
  • Building reusable QA pipelines

Key Requirements

  • Strong engineering fundamentals
  • Experience building data pipelines
  • Experience building QA systems
  • Experience building evaluation workflows
  • Detail-oriented analytical mindset

Work Rights

Not specified
