Data Curation Intern

Karya Inc

Bengaluru, India
On-site
Karya is on a mission to provide AI-enabled earning and learning opportunities to communities with high talent but low access to opportunities.

Job Summary

  • You will work with large open-source datasets requiring significant cleaning, structuring, and enrichment before they can be used effectively in model training pipelines.
  • This role offers a clear progression path from text data curation to read-speech and voice data preparation, with mentorship from experienced professionals.

Salary

Not specified

Skills & Requirements

Must-have

  • Strong attention to detail
  • Python programming skills
  • Text data format familiarity

Nice-to-have

  • Prior NLP dataset exposure
  • Knowledge of Indian languages
  • Experience with data versioning tools

Key Requirements

  • Comfort with Python tools such as pandas and regular expressions (the re module)
  • Familiarity with the CSV, JSONL, and Parquet formats
  • Curiosity about AI/ML and language technology

Work Rights

Not specified
