Develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals
Job Summary
Develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals.
This role blends applied research with strong software engineering: rapid iteration, rigorous evaluation, and production-minded implementation for cloud-scale batch processing and interactive workflows.
Build scalable cloud workflows for batch processing and integrate outputs with APIs and downstream consumers.
Skills & Requirements
Must-have
Computer Vision systems using Python
Deep learning for computer vision
PyTorch framework experience
Turning ML prototypes into reliable pipelines
Cloud or backend workflows
Nice-to-have
Vision-language models experience
Multimodal fusion experience
Video pipelines experience
Real-world datasets experience
Developing reusable platform components
Key Requirements
Bachelor’s degree in CS, EE, Robotics, or a related field
4+ years building computer vision systems
Experience taking ML prototypes into reliable pipelines
Experience building or integrating ML systems into cloud workflows