Design, develop, and maintain a core Python ETL framework by writing reusable, well-tested modules that power data transformations across client pipelines
Job Summary
Design, develop, and maintain a core Python ETL framework by writing reusable, well-tested modules that power data transformations across client pipelines.
Build robust, testable, and reusable Python integrations with external systems (SFTP, third-party APIs, client platforms). Identify and eliminate manual bottlenecks in data onboarding and analysis through well-designed automation.
Contribute to an internal metadata management application (FastAPI backend, React/TypeScript frontend), building API endpoints, writing database migrations, and occasionally developing frontend features.
Salary
$140-160K
Skills & Requirements
Must-have
Python ETL framework development
AWS Batch, Lambda, Step Functions, EventBridge
Python integrations with external systems
FastAPI, SQLAlchemy, PostgreSQL
PySpark DataFrame API
AWS S3, Lambda, Batch, SageMaker, Step Functions
PyArrow and columnar data formats (Parquet)
Nice-to-have
React/TypeScript frontend development
Data governance and metadata management
Mentoring junior engineers
Key Requirements
3+ years of experience or 5+ years of relevant experience
Bachelor's degree in a related field
Experience designing and maintaining production ETL/ELT pipelines
Advanced proficiency in Python, Pandas, and PySpark