Design and implement novel approaches to optimize datasets for model training, focusing on data curation strategies like coreset selection and embedding-based filtering
Job Summary
Design and implement novel approaches to optimize datasets for model training, focusing on data curation strategies like coreset selection and embedding-based filtering.
Build and maintain large-scale image and video pipelines, orchestrating ingestion, synthetic data generation, and versioned releases to maximize model performance.
Join a cutting-edge defense startup founded by ex-Navy engineers with a strong track record and an engineering-first culture focused on technical excellence.
Skills & Requirements
Must-have
Data infrastructure design
AI/ML principles knowledge
Image and video pipelines
Data curation strategies
Synthetic data generation
Nice-to-have
Coreset selection expertise
Embedding-based filtering
Automated complexity scoring
Continuous learning culture
Key Requirements
Expert-level data infrastructure skills
Strong knowledge of AI & Machine Learning principles