Hyphen Connect is looking for a Synthetic Data Engineer to design and implement synthetic data generation pipelines to support data processing and model training. The ideal candidate should have experience with large-scale data pipelines and knowledge of prompt engineering and bias mitigation.
Job Summary
The role focuses on designing domain-specific synthetic data generation pipelines using self-instruct and constitutional prompting.
Candidates will implement automated quality-scoring and de-duplication systems to keep the generated data high quality.
This position is critical for managing data pipelines that directly feed into Supervised Fine-Tuning and Direct Preference Optimization training loops.
Skills & Requirements
Must-have
design domain-specific synthetic data pipelines
implement automated quality scoring systems
manage the data that feeds SFT and DPO training loops
Nice-to-have
experience with dataset distillation techniques
knowledge of bias mitigation strategies
proficiency in self-instruct prompting methods
Key Requirements
Proven experience building large-scale data pipelines