Hyphen Partners is looking for a Multimodal AI Systems Architect to enhance AI systems that integrate vision and audio models, with a focus on voice-to-voice interactions and multimodal retrieval. The ideal candidate has experience with Whisper, CLIP, and multimodal LLM integration, along with expertise in streaming architectures and cross-modal alignment.
Must-have
Nice-to-have
Not specified