Senior Data Engineer - Real World Data

Formation Bio

New York, NY, USA
On-site

Job Summary

  • Model and transform raw EHR and claims data into clean, canonical, and analytics-ready datasets using SQL, Python, and clinical standards like OMOP.
  • Build and manage scalable data pipelines using Dagster for orchestration, dbt for transformation, and Snowflake as the primary compute and storage engine.
  • Conduct hands-on RWD analyses to answer scientific and strategic research questions, including disease epidemiology, treatment patterns, patient journey characterization, and comparative effectiveness.

Salary

Base: $204,500 - $267,000; Bonus/Equity: Equity; Benefits: Comprehensive benefits and generous perks

Skills & Requirements

Must-have

  • EHR and claims data transformation
  • Scalable data pipelines with Dagster and dbt
  • Snowflake for compute and storage
  • Hands-on RWD analysis
  • Generative AI for data structuring
  • Clinical standards like OMOP

Nice-to-have

  • Familiarity with RWD study design
  • Causal inference frameworks
  • Working with commercial RWD vendors

Key Requirements

  • 5+ years of data engineering experience
  • 2+ years in healthcare or life sciences
  • Experience with EHR or claims datasets
  • Fluent in SQL and Python
  • Experience building longitudinal patient cohorts
  • Familiarity with modern data infrastructure (Snowflake, dbt, Dagster)

Work Rights

Not specified
