Lead Engineer Big Data - PySpark

Citi Handlowy

Not specified
Citi Handlowy is seeking a Lead Engineer specialized in Big Data and PySpark to join its Big Data Analytics team. The ideal candidate will have extensive experience in Python programming and Apache Spark, with a focus on designing and optimizing scalable data pipelines.

Job Summary

  • The role involves designing, developing, and maintaining efficient, scalable, and reliable data pipelines using PySpark.
  • Candidates will collaborate with stakeholders to translate data requirements into technical specifications and optimize jobs for performance.
  • The position requires mentoring junior developers and contributing to the continuous improvement of the team's technical capabilities.

Skills & Requirements

Must-have

  • PySpark for large-scale data processing
  • Python programming with object-oriented design
  • Apache Spark architecture and DataFrame API
  • Distributed file systems and object storage such as HDFS or S3
  • Relational databases and SQL proficiency

Nice-to-have

  • Cloud platform experience (AWS, Azure, GCP)
  • Workflow orchestration tools (Apache Airflow)
  • Streaming data processing with Kafka
  • Containerization technologies (Docker, Kubernetes)
  • Data warehousing concepts and modeling

Key Requirements

  • 8-12 years of relevant experience
  • 5+ years of professional Big Data development experience
  • 5+ years of hands-on PySpark experience
  • Bachelor's or Master's degree in Computer Science or a related field

Work Rights

Not specified
