Job Summary
The role involves designing domain-specific synthetic data generation pipelines using self-instruct and constitutional prompting methods.
Candidates will implement automated quality scoring and de-duplication systems that keep low-quality and redundant samples out of the training data.
This position manages critical data pipelines that directly feed into Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) training loops.
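For illustration only, the quality scoring and de-duplication step could look roughly like the Python sketch below. Everything in it is an assumption rather than part of the role description: the `prompt`/`response` record layout, the function names, the 20-word length target, and the 0.5 cutoff are hypothetical, and the heuristic score stands in for whatever scorer (for example a trained classifier or an LLM judge) a real pipeline would use. De-duplication here is exact-match after normalization; near-duplicate detection would need a different technique.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash alike."""
    return re.sub(r"\s+", " ", text.strip().lower())


def quality_score(sample: dict) -> float:
    """Toy heuristic score in [0, 1]; a placeholder for a real quality model."""
    words = sample["response"].split()
    if not words:
        return 0.0
    # Hypothetical target: responses of 20+ words get full length credit.
    length_score = min(len(words) / 20.0, 1.0)
    # Fraction of distinct words, which penalizes degenerate repetition.
    diversity = len({w.lower() for w in words}) / len(words)
    return 0.5 * length_score + 0.5 * diversity


def filter_samples(samples: list[dict], min_score: float = 0.5) -> list[dict]:
    """Drop exact duplicates (after normalization), then low-scoring samples."""
    seen: set[str] = set()
    kept = []
    for sample in samples:
        text = normalize(sample["prompt"] + "\n" + sample["response"])
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in seen:
            continue  # same content already seen earlier in the batch
        seen.add(key)
        if quality_score(sample) >= min_score:
            kept.append(sample)
    return kept
```

Records that survive `filter_samples` would then be handed to the SFT or DPO data loaders; in practice the cutoff would be tuned against a held-out, human-reviewed sample.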