Develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals
Job Summary
Develop end-to-end pipelines that transform images and video into structured, reliable observations by combining modern vision models with multimodal reasoning and contextual signals.
This role blends applied research with strong software engineering: rapid iteration, rigorous evaluation, and production-minded implementation for cloud-scale batch processing and interactive workflows.
We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.
Skills & Requirements
Must-have
Computer vision pipelines
Deep learning for computer vision
Python
PyTorch
Turning ML prototypes into reliable pipelines
Cloud or backend ML workflows
Nice-to-have
Vision-language models
Multimodal fusion
Video pipelines
Real-world datasets
Reusable platform components
Key Requirements
Bachelor’s degree or equivalent practical experience
4+ years of experience
Strong experience with deep learning for computer vision
Experience taking ML prototypes into reliable pipelines
Experience building or integrating ML systems into cloud or backend workflows