Staff/Senior Staff AI Engineer, Model Post-Training and Alignment
OKX
San Jose, United States
On-site
Large model post-training
Preference learning and alignment
RLAIF closed-loop systems
Job Summary
This role focuses on designing, executing, and optimizing post-training pipelines to improve model performance, controllability, domain adaptation, and reasoning capabilities.
You will work across the full lifecycle of post-training—from data strategy and reward modeling to reinforcement learning–based optimization and production-grade inference deployment.
The company offers a competitive total compensation package, L&D programs, education subsidy, team building, wellness allowances, and comprehensive healthcare schemes.
Salary
Base: $313,055.00 to $450,000.00; Bonus/Equity: performance bonus and long-term incentives may be provided; Benefits: full range of medical, financial, and/or other benefits
Skills & Requirements
Must-have
large model post-training
preference learning and alignment
RLAIF closed-loop systems
low-latency serving frameworks
domain-specific data strategies
Nice-to-have
crypto and decentralized applications
team building programs
wellness and meal allowances
Key Requirements
8 years of industry experience
Bachelor's degree in Computer Science, AI, or Machine Learning
Deep familiarity with DPO, GRPO, and RL-based post-training
Experience training specialized small models
Experience deploying models in low-latency production