Data Engineer, Analytics Data Products

Wirecutter Inc

New York, NY, United States
On-site

Job Summary

  • Design, model, and implement complex ELT/ETL pipelines for cleansed and curated data layers in the medallion architecture, taking full ownership of the data product's structure, partitioning, documentation, and performance characteristics.
  • Manage the physical data storage across both GCP and AWS, selecting optimal file formats and designing efficient partitioning and clustering strategies.
  • Implement centralized data quality checks and observability mechanisms within the data pipeline to proactively identify and resolve data issues.

Salary

Base: $110,000 - $130,000 USD; Bonus/Equity: Not specified; Benefits: Medical, dental, vision, 401(k) match, PTO

Skills & Requirements

Must-have

  • ELT/ETL pipeline implementation
  • dbt for data modeling
  • PySpark for large-scale processing
  • Cloud data storage management (GCP/AWS)
  • Spark compute resource administration
  • Data quality checks and observability

Nice-to-have

  • Dual-cloud environment experience
  • Infrastructure-as-Code tools
  • Advanced Lakehouse file formats
  • Data support for experimentation and A/B testing
  • CI/CD pipeline integration

Key Requirements

  • 2+ years of data engineering experience
  • Proficiency in SQL
  • Production-level data modeling experience
  • End-to-end data product development
  • Cloud Data Warehouse experience (BigQuery)
  • Python and PySpark proficiency
  • Cloud services familiarity (GCP or AWS)
  • Workflow orchestration tools experience
  • Version control systems (Git) experience

Work Rights

Not specified
