Design, develop, and maintain scalable ETL pipelines using AWS Glue, PySpark, and Spark
Job Summary
Design, develop, and maintain scalable ETL pipelines using AWS Glue, PySpark, and Spark.
Orchestrate workflows using Apache Airflow to automate data processing tasks.
Build and implement data pipelines on distributed data platforms, including warehouses, databases, data lakes, and cloud lakehouses, using data integration tools and frameworks to support predictive modelling, reporting, and visualisation.
Skills & Requirements
Must-have
AWS Glue
PySpark
Spark
Apache Airflow
Python programming
SQL queries
Hadoop or Teradata
Nice-to-have
Experience working in cross-functional teams
Ability to work in dynamic environments
Problem-solving abilities
Critical thinking abilities
Communication skills
Documentation skills
Key Requirements
12+ years of experience in Data Engineering
Bachelor’s or Master’s degree in Computer Science or Information Technology