Data Developer

DRW

Montreal, Canada
On-site

Job Summary

  • Design and build data pipelines for RAG systems, including document ingestion, chunking, embedding generation, and vector storage.
  • Build ingestion pipelines for structured and unstructured data sources into a centralized data lake, ensuring data is clean, normalized, and accessible for analytics, research, and AI workloads.
  • Collaborate with ML engineers to optimize data formats and storage patterns for GPU-accelerated inference.

Skills & Requirements

Must-have

  • RAG architectures
  • vector databases
  • Python
  • DAG-based orchestration
  • embedding models
  • semantic search systems
  • distributed data processing
  • Docker
  • containerization

Nice-to-have

  • data quality initiatives
  • low-latency systems
  • high-accuracy retrieval
  • prompt engineering
  • willingness to challenge consensus
  • open-mindedness

Key Requirements

  • 2-5 years of experience building data systems
  • Bachelor's or Master's degree
  • Experience with vector databases
  • Proficiency in Python
  • Experience with DAG orchestration
  • Experience with embedding models
  • Experience with distributed data processing
  • Familiarity with Docker

Work Rights

Not specified
