On-site
Job Summary
The role focuses on developing post-training pipelines to study how training choices affect model safety, robustness, and alignment properties.
Candidates will collaborate with policymakers, engineers, and researchers to translate findings into actionable safety standards and evaluation benchmarks.
Compensation includes a base salary range of $216,000 to $270,000 USD along with equity, comprehensive health benefits, and a learning stipend.
Salary
Base: $216,000 - $270,000 USD
Equity: Included based on Board approval
Benefits: Comprehensive health, dental, vision, retirement, PTO, and learning stipend
Skills & Requirements
Must-have
Experience with post-training techniques such as RLHF, DPO, and GRPO
Track record of published research in generative AI
Three years of experience addressing sophisticated ML problems
Nice-to-have
Experience with mechanistic interpretability or probing
Familiarity with red-teaming and adversarial evaluation
Understanding of reward hacking, sycophancy, and alignment faking
Key Requirements
At least three years of experience in ML research or product development
Published research track record in machine learning and generative AI
Strong written and verbal communication skills for cross-functional teams