On-site
Job Summary
The role involves designing post-training pipelines to study how training choices affect model safety, robustness, and alignment properties.
Candidates will collaborate with policymakers, engineers, and researchers to translate findings into actionable safety standards and evaluation benchmarks.
Compensation includes a base salary range of $216,000 to $270,000 USD along with equity, comprehensive health benefits, and a learning stipend.
Salary
Base: $216,000 - $270,000 USD
Equity: Included based on Board approval
Benefits: Comprehensive health, dental, vision, retirement, learning stipend, PTO
Skills & Requirements
Must-have
Experience with RLHF, DPO, and GRPO techniques
Track record of published ML research
Three years of experience working on sophisticated ML problems
Strong written and verbal communication skills
Nice-to-have
Experience with mechanistic interpretability and probing
Familiarity with red-teaming and adversarial evaluation
Understanding of reward hacking and alignment faking
Key Requirements
At least three years of ML experience
Published research in machine learning or generative AI