Job Summary
We are seeking Data Architects to design and build the data infrastructure that makes AI-native drug discovery possible, transforming raw scientific data into machine-actionable, FAIR-compliant assets.
The role involves partnering cross-functionally with scientific software engineers, Methods4Insight, and Tech@Lilly to ensure scalable, compliant, and production-grade data architectures.
Eli Lilly offers a comprehensive benefits program including 401(k), pension, medical, dental, vision, life insurance, flexible benefits, and well-being programs.
Salary
Base: $151,500 - $222,200
Bonus/Equity: Company bonus based on performance
Benefits: Comprehensive benefits including 401(k), pension, medical, dental, vision, life insurance, flexible benefits, and well-being programs
Skills & Requirements
Must-have
data modeling and ontology engineering
data platform and lakehouse architecture
knowledge graph and specialized database systems
semantic web technologies implementation
ETL/ELT pipeline development
real-time and streaming data integration
FAIR-compliant data frameworks
Nice-to-have
strong communication skills
familiarity with cloud platforms
pharmaceutical research industry experience
experience with laboratory instrument data integration
experience with vector and array databases
scientific data standards knowledge
performance-critical data processing skills
Key Requirements
M.S. or PhD in a related STEM field
6+ years (M.S.) or 2+ years (PhD) of data architecture experience
experience with Databricks, Snowflake, Spark or equivalent
experience with Neo4j, Neptune, MongoDB, TileDB or equivalent
knowledge of semantic web technologies (RDF, OWL, SPARQL)
experience with streaming data platforms (Kafka, Kinesis)
understanding of scientific data types in life sciences