Job Summary
The ideal candidate designs and implements post-training pipelines, develops RL environments and reward models, and conducts training runs to improve model capabilities for agentic applications.
Key responsibilities include designing and maintaining post-training pipelines; developing reinforcement learning environments, reward models, and evaluation signals; and debugging, optimizing, and scaling distributed training workloads.
Intel invests in our people and offers a complete and competitive package of benefits for employees and their families through every stage of life.
Salary
$170,500.00-$240,710.00 USD
Skills & Requirements
Must-have
fine-tuning large language models
reinforcement learning environments
reward models and evaluation signals
distributed training workloads
research experiments and ablation studies
Nice-to-have
work independently in ambiguous spaces
strong debugging and problem-solving skills
balance of research rigor and engineering
clear technical communication
demonstrated learning agility
Key Requirements
3+ years of experience in ML engineering
Python/C++ programming
LLM architectures, optimization, and model training
Master's or PhD degree preferred
Hands-on experience with the full post-training pipeline