Multimodal AI Systems Architect (AI Engineering)

Hyphen Partners

China
On-site
Integrate vision encoders and audio-native models
Optimize streaming latency for voice interactions
Architect multimodal RAG systems
Hyphen Partners is looking for a Multimodal AI Systems Architect to enhance AI systems that integrate vision and audio models, with a focus on voice-to-voice interactions and multimodal retrieval. The ideal candidate has experience with Whisper, CLIP, and multimodal LLM integration, along with expertise in streaming architectures and cross-modal alignment.

Job Summary

  • The role focuses on developing AI systems that seamlessly integrate vision and audio models to enhance voice-to-voice interactions.
  • Candidates will be responsible for architecting multimodal RAG systems capable of retrieving insights from videos and PDFs.
  • This position requires expertise in optimizing streaming latency for voice interactions.

Skills & Requirements

Must-have

  • Integrate vision encoders and audio-native models
  • Optimize streaming latency for voice interactions
  • Architect multimodal RAG systems
  • Experience with Whisper and CLIP models
  • Knowledge of WebRTC streaming architectures

Nice-to-have

  • Expertise in cross-modal alignment techniques
  • Innovative system design capabilities
  • Efficient multimodal retrieval strategies

Key Requirements

  • Experience with multimodal LLM integration
  • Knowledge of streaming architectures and WebRTC
  • Expertise in cross-modal alignment

Work Rights

Not specified
