Research Engineer/Scientist - Human Alignment, Consumer Devices
OpenAI
San Francisco, United States
Remote
RLHF and post-training methods
Reward modeling and preference learning
Long-horizon evaluation
Job Summary
The Future of Computing Research team is focused on developing new methods, models, and evaluation frameworks for multimodal AI, aiming to create product experiences that are useful, delightful, and worthy of long-term trust.
This role will focus on building the learning and evaluation foundations that help models become more context-aware, adaptive, and useful over time, working on problems such as reward modeling, preference learning, and long-horizon evaluation.
You will collaborate closely with safety researchers to ensure that adaptation and personalization remain aligned, interpretable, and bounded by clear constraints, and help define how OpenAI measures success for personalized AI systems.
Skills & Requirements
Must-have
RLHF and post-training methods
Reward modeling and preference learning
Long-horizon evaluation
Multimodal AI systems
Human-in-the-loop evaluation
Nice-to-have
Pushing beyond one-turn assistant behavior
Systems that improve through feedback
Learning from richer interaction signals
High-stakes, product-shaping research
Key Requirements
Strong background in machine learning research
Experience in RLHF, reward modeling, preference optimization, or post-training for large models
Experience building datasets or eval pipelines grounded in human preferences