Research Engineer, Safeguards Labs

Anthropic

San Francisco, CA, US
On-site

Job Summary

  • Anthropic's mission is to create reliable, interpretable, and steerable AI systems that are safe and beneficial for society.
  • The Safeguards Labs team investigates novel safety methods by prototyping new approaches to safe models and usage safeguards.
  • Candidates will scope their own projects, run experiments end-to-end, and decide when ideas are ready for production handoff.

Salary

  • Base: $350,000 - $850,000 USD
  • Bonus/Equity: Not specified
  • Benefits: Competitive compensation, optional equity donation matching, generous vacation and parental leave

Skills & Requirements

Must-have

  • Track record driving research projects independently
  • Proficiency in Python and in working with large datasets
  • Working familiarity with LLM operations

Nice-to-have

  • Experience building ML classifiers for abuse or fraud
  • Background in trust and safety or integrity
  • Knowledge of agentic environment evaluation methodologies

Key Requirements

  • Bachelor's degree or equivalent experience
  • Years of experience commensurate with the internal job level

Work Rights

Not specified

Sponsorship: Available
