Member of Technical Staff - VLM

Black Forest Labs

Freiburg, Germany
On-site

Job Summary

  • This role involves pioneering the integration of Vision-Language Models directly into the FLUX generation stack to improve controllability and alignment.
  • Candidates must have a proven track record of pretraining or significantly advancing VLMs rather than just fine-tuning existing models.
  • The team values deep scientific understanding, low-ego collaboration, and bold execution, while balancing research excellence with shipping products.

Skills & Requirements

Must-have

  • Experience pretraining or significantly advancing a VLM
  • Production track record in multimodal architectures
  • Deep understanding of vision and language representations
  • Experience with distributed multi-node training

Nice-to-have

  • Experience with diffusion or flow-based generative models
  • Knowledge of how to compose autoregressive and diffusion paradigms
  • Strong publication record in frontier research

Key Requirements

  • Staff or Senior Individual Contributor level experience
  • Proven deployment of pretrained VLMs in production systems
  • Demonstrated ability to push the frontier on multimodal architectures

Work Rights

Not specified
