Hyphen Connect is seeking a Multimodal AI Systems Architect to enhance its AI systems by integrating vision and audio models for efficient voice-to-voice interactions and multimodal retrieval. The ideal candidate will have experience with the relevant technologies and architectures, with a focus on optimizing AI interactions and system performance.
Job Summary
The role focuses on developing AI systems that seamlessly integrate vision and audio models to enhance voice-to-voice interactions.
Candidates will be responsible for architecting multimodal RAG systems capable of retrieving insights from videos and PDFs.
This position requires expertise in optimizing streaming latency and integrating advanced models like Whisper and CLIP into core reasoning loops.
Matching Summary
Match Score: 85