Applied Scientist, Document Understanding

Thomson Reuters

New York City, NY, US

Thomson Reuters is seeking an Applied Scientist for Document Understanding to design and implement advanced document processing systems that enhance its legal products. The ideal candidate has a PhD or Master's degree in a relevant field, extensive experience in document understanding, and a strong ability to work independently in a hybrid work environment.

Job Summary

  • This role focuses on building foundational intelligence for Westlaw, Practical Law, and CoCounsel through advanced document understanding systems.
  • Candidates are expected to drive independent technical decisions on chunking strategies, classification approaches, and multi-document reasoning architectures.
  • The position offers a hybrid work model, competitive benefits including mental health days, and opportunities to contribute to published research.

Salary

Base: $136,000 - $253,000 USD; Bonus: eligible for an annual performance-based bonus; Benefits: comprehensive package including 401(k) match, tuition reimbursement, and flexible PTO

Skills & Requirements

Must-have

  • PhD or Master's in Computer Science or AI
  • 3+ years shipping document understanding systems
  • Production experience with PyTorch and Hugging Face
  • Semantic chunking beyond fixed-size methods
  • Knowledge graph construction from unstructured text
  • Model distillation and SLM deployment under latency constraints

Nice-to-have

  • Publications at ACL, EMNLP, NeurIPS, or KDD
  • Experience with legal document structures and taxonomies
  • Familiarity with AzureML or AWS SageMaker
  • Background in RAG and agentic workflows
  • Experience with synthetic data generation for NLP

Key Requirements

  • PhD or Master's degree required
  • 3+ years post-degree industry experience
  • Production depth in document layout analysis
  • Expertise in entity recognition and citation parsing
  • Proven track record of shipping to production

Work Rights

Not specified
