Python Backend Developer

TransPerfect Translations International Inc

Hybrid
Expert-level Python skills
Experience with OpenCV and PyMuPDF
Deep familiarity with Tesseract and PaddleOCR

Job Summary

  • The role involves leading the research and implementation of a document conversion pipeline to transform unstructured PDFs into formatted .docx files.
  • Candidates will perform comparative analysis between commercial and open-source AI solutions while establishing metrics for format fidelity.
  • This is a hybrid role requiring both strategic decision-making and hands-on development within a diverse global team across the USA, Spain, Portugal, and India.

Skills & Requirements

Must-have

  • Expert-level Python skills
  • Experience with OpenCV and PyMuPDF
  • Deep familiarity with Tesseract and PaddleOCR
  • Knowledge of LayoutLMv3, Donut, or Nougat models
  • Understanding of OOXML document formats
  • Experience integrating GPT or Claude models

Nice-to-have

  • Experience with Pandoc AST
  • Background in desktop publishing (DTP), typography, or graphic design
  • Contributions to open-source OCR projects

Key Requirements

  • Expert-level Python proficiency required
  • Deep familiarity with modern Transformer-based document models
  • Architectural vision for API vs custom pipeline decisions

Work Rights

Not specified
