Staff/Senior Staff AI Engineer, Model Post-Training and Alignment

OKX

San Jose, United States
On-site

Job Summary

  • This role focuses on designing, executing, and optimizing post-training pipelines to improve model performance, controllability, domain adaptation, and reasoning capabilities.
  • You will work across the full lifecycle of post-training—from data strategy and reward modeling to reinforcement learning–based optimization and production-grade inference deployment.
  • The company offers a competitive total compensation package, learning and development (L&D) programs, education subsidies, team building, wellness allowances, and comprehensive healthcare schemes.

Salary

Base: $313,055.00 to $450,000.00; Bonus/Equity: performance bonus and long-term incentives may be provided; Benefits: full range of medical, financial, and/or other benefits

Skills & Requirements

Must-have

  • large model post-training
  • preference learning and alignment
  • RLAIF closed-loop systems
  • low-latency serving frameworks
  • domain-specific data strategies

Nice-to-have

  • crypto and decentralized applications
  • team building programs
  • wellness and meal allowances

Key Requirements

  • 8 years of industry experience
  • Bachelor's degree in Computer Science, AI, or Machine Learning
  • Deep familiarity with DPO, GRPO, and RL-based post-training
  • Experience training specialized small models
  • Experience deploying models in low-latency production

Work Rights

Not specified
