Build, scale, and operate a data platform that allows users to discover and query across petabytes of biological, chemical, and patient-centric datasets
Job Summary
Build, scale, and operate a data platform that allows users to discover and query across petabytes of biological, chemical, and patient-centric datasets.
Work with biologists, chemists, and data scientists to make diverse datasets relatable and queryable for future drug discovery questions.
Join the Data Lake team responsible for relational and object storage, ensuring all data flows to the Data Lake and is discoverable, queryable, and relatable.
Salary
Base: £75,900 - £101,900
Bonus/Equity: Eligible for annual bonus and equity compensation
Benefits: Comprehensive benefits package
Skills & Requirements
Must-have
Python and SQL expertise
Containerization (Docker, Kubernetes)
Infrastructure as Code (Terraform)
Agentic Development tools
Relational Databases (Postgres, MySQL)
Data container files (Parquet, Avro)
Medallion Architecture principles
Cloud provider experience (GCP, AWS, Azure)
DevOps capacity
CI/CD and system maintenance
Nice-to-have
Experience with Search (Elasticsearch)
Experience with vector and graph databases
Experience scaling within the GCP ecosystem
Drug discovery domain knowledge
Intellectual curiosity
Key Requirements
5+ years of deep experience in modern, cloud-based data engineering
Proven track record of building and maintaining robust platforms
Hands-on implementation experience
Collaborative problem-solving skills
People-first mindset and empathetic teammate
Drive to learn the domain and explore new technologies