Research Scientist, Agent Robustness

Scale

San Francisco, CA, US
On-site

Job Summary

  • This role focuses on tackling fundamental challenges in building safe and aligned AI agents through rigorous research and benchmarking.
  • The team collaborates across industry, the public sector, and academia to publish findings that help governments and industries mitigate AI risks.
  • Compensation includes a base salary ranging from $216,000 to $270,000 USD along with equity and comprehensive benefits.

Salary

  • Base: $216,000 to $270,000 USD
  • Equity: included, based on Board approval
  • Benefits: comprehensive health, dental, vision, retirement, PTO, and stipends

Skills & Requirements

Must-have

  • Experience with RLHF and DPO techniques
  • Track record of published ML research
  • Designing evaluation harnesses for agents
  • At least three years of advanced ML experience

Nice-to-have

  • Hands-on experience with SWE-bench or WebArena
  • Red-teaming and adversarial testing skills
  • Knowledge of prompt injection vulnerabilities

Key Requirements

  • At least three years of ML experience
  • Published research in machine learning
  • Practical experience with post-training techniques

Work Rights

Not specified
