ML Research Scientist I/II, Multimodal Data Extraction

Lila Sciences

Cambridge, United States
Lila Sciences is seeking an ML Research Scientist specializing in multimodal data extraction to contribute to its vision of scientific superintelligence. The ideal candidate has a strong background in machine learning and related fields and will develop AI systems that unify scientific knowledge across formats.

Job Summary

  • Advance Lila’s vision of scientific superintelligence by developing foundation models that autonomously read, interpret, and structure scientific knowledge across text, images, and experimental data.
  • Research and develop AI systems that extract and structure knowledge from diverse scientific sources, designing and fine-tuning large language, multimodal, and specialized models for factual, interpretable data extraction.
  • Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals, and collaborate with domain experts to align extracted data with real-world discovery workflows.

Salary

Base: $176,000 - $304,000 USD; Bonus/Equity: Bonus potential and generous early equity; Benefits: Not specified

Skills & Requirements

Must-have

  • AI systems development
  • Large language and multimodal models
  • PyTorch and Hugging Face Transformers
  • Scientific data extraction
  • Multimodal fusion architectures

Nice-to-have

  • Document-level understanding
  • Scientific document parsing
  • Knowledge graph construction
  • Noisy or heterogeneous data
  • Advancing AI in physical sciences

Key Requirements

  • PhD or equivalent research experience
  • Machine learning expertise
  • NLP expertise
  • Vision-language modeling expertise
  • Proven ability to train, fine-tune, and evaluate LLMs and multimodal models
  • Strong understanding of physical science data structures

Work Rights

Not specified