Familiarity with tesseract, paddleocr, layoutlmv3, donut, nougat
Understanding of ooxml document formats
TransPerfect is seeking an experienced Python Backend Developer to join their AI team in Barcelona, Spain. The role involves developing solutions for document processing and requires expertise in Python, OCR technologies, and architectural decision-making
Job Summary
Shape the future of AI in a global organization by joining the innovative Artificial Intelligence (AI) team.
Lead the research and implementation of a document conversion pipeline to convert complex PDFs into editable .docx files.
Solve the 'last mile' of document processing by recreating visual and structural intent, including tables, layouts, and styling.
Matching Summary
Match Score: 85
TransPerfect is seeking an experienced Python Backend Developer to join their AI team in Barcelona, Spain. The role involves developing solutions for document processing and requires expertise in Python, OCR technologies, and architectural decision-making.
Skills & Requirements
Must-have
Python mastery with OpenCV, PyMuPDF, python-docx
Familiarity with Tesseract, PaddleOCR, LayoutLMv3, Donut, Nougat
Understanding of OOXML document formats
Experience using GPT or Claude models
Ability to decide between API vs. custom pipeline
Nice-to-have
Pandoc AST experience
Background in DTP, Typography, or Graphic Design
Contributions to open-source OCR or PDF manipulation projects
Key Requirements
Expert-level Python skills
Deep familiarity with Transformer-based document models
Experience with LLM integration
Ability to decide between off-the-shelf API or custom pipeline