Build and scale the data infrastructure that powers Formation Bio's drug development platform, supporting the Data Science team with coherent data models and feature engineering pipelines.
Job Summary
Build and scale the data infrastructure that powers Formation Bio's drug development platform, supporting the Data Science team with coherent data models and feature engineering pipelines.
Design and build scalable ingestion pipelines for diverse data sources, develop robust data models in dbt, and orchestrate pipelines using Dagster.
Implement data quality checks, validation frameworks, and monitoring to ensure datasets are trustworthy. Collaborate with Data Scientists to translate data needs into reusable models.
Salary
$177,500 - $232,000
Skills & Requirements
Must-have
Snowflake and dbt fluency
Data modeling best practices
Large and complex datasets
Diverse data ingestion patterns
Data quality and engineering rigor
Scalable ingestion pipelines
dbt model development and maintenance
Orchestration with Dagster
Data quality checks and monitoring
Collaboration with Data Scientists
Nice-to-have
Data quality and observability tooling
Spark for large-scale data processing
Large-scale data transfer tooling
Healthcare or life sciences data experience
Key Requirements
5+ years of experience in data engineering
Hands-on expertise with Snowflake
Experience with modern orchestration tools (Dagster, Airflow, Prefect)
Experience with large datasets (TB to PB scale)
Strong data modeling skills
Experience with data quality and observability tooling (Elementary, Great Expectations)
Experience with Spark, Databricks
Experience with large-scale data transfer tooling (AWS DataSync)
Experience in healthcare or life sciences data environments