Workforcity is seeking a Python Data Engineer with over 5 years of experience to design and optimize data pipelines, ensuring data quality and integrity. The role involves collaboration with cross-functional teams and requires strong proficiency in Apache Spark and SQL.
Job Summary
Design, develop, and optimize scalable data pipelines and ETL/ELT processes using Apache Spark (preferably with Scala or Python) to ingest, transform, and load large datasets from diverse sources.
Collaborate closely with data scientists, data analysts, business intelligence developers, and application teams to understand data requirements and deliver appropriate data solutions.
Manage the data lifecycle, including data archival, retention, and compliance with data governance policies and security standards.
Skills & Requirements
Must-have
Apache Spark (Scala or Python)
ETL/ELT processes
SQL queries and stored procedures
Data warehousing and data lakes
Python or Scala programming
Cloud platforms (AWS, Azure, GCP)
Nice-to-have
Agile/Scrum development
Linux/Unix environments
Data lifecycle management
Code reviews and documentation
Key Requirements
5+ years of professional experience
Bachelor’s degree/University degree or equivalent experience