Data Engineer - PySpark

Barclays

Pune, India

Job Summary

  • Build and maintain systems that collect, store, process, and analyse data, such as data pipelines, data warehouses and data lakes to ensure that all data is accurate, accessible, and secure.
  • Design and implement scalable and efficient data transformation/storage solutions using Snowflake and build reusable components using Snowflake and AWS Tools/Technology.
  • Collaborate with data scientists to build and deploy machine learning models, and implement a cloud-based enterprise data warehouse spanning multiple data platforms, including Snowflake and NoSQL environments.

Skills & Requirements

Must-have

  • PySpark (DataFrames, RDDs, Spark SQL)
  • AWS Cloud development and testing
  • AWS Data Analytics stack
  • Snowflake data transformation and storage
  • ELT pipeline development with DBT
  • Advanced SQL and PL/SQL programs
  • Reusable components built with Snowflake and AWS

Nice-to-have

  • Stakeholder engagement and requirements elicitation
  • Infrastructure setup solutions
  • Data marts and data warehousing concepts
  • Analytical and interpersonal skills
  • Cloud-based enterprise data warehouse
  • Data movement in NoSQL environments
  • Data governance or lineage tools
  • Orchestration tools such as Apache Airflow

Key Requirements

  • Two major project implementations
  • Exposure to data governance or lineage tools
  • Experience using orchestration tools
  • Knowledge of the Ab Initio ETL tool

Work Rights

Not specified
