Senior PySpark Data Engineer

Citi Handlowy

Not specified
PySpark and advanced Python programming
Apache Airflow workflow orchestration
Cloudera or Databricks experience
Citi Handlowy is seeking a Senior PySpark Data Engineer to join its data engineering team, focusing on designing and maintaining scalable data pipelines and workflows. The ideal candidate will have extensive experience with PySpark, Big Data technologies, and data processing systems.

Job Summary

  • The role involves designing, developing, and maintaining robust, scalable, and high-performance data pipelines using PySpark.
  • Candidates must have extensive hands-on experience with Big Data ecosystems including Cloudera and/or Databricks.
  • The position requires strong expertise in SQL, data warehousing concepts, and the ability to mentor junior engineers.

Matching Summary

Match Score: 85

Skills & Requirements

Must-have

  • PySpark and advanced Python programming
  • Apache Airflow workflow orchestration
  • Cloudera or Databricks experience
  • Starburst, Trino, and Presto query engines
  • SQL, with relational and non-relational databases
  • Linux/Unix environment proficiency
  • CI/CD pipeline implementation

Nice-to-have

  • Mentoring junior data engineers
  • Data modeling for optimal storage
  • Collaboration with data scientists
  • Promoting engineering best practices
  • Troubleshooting complex data issues

Key Requirements

  • 6+ years of professional data engineering experience
  • Bachelor's degree or equivalent experience
  • Proficiency with distributed query engines such as Starburst
  • Experience with GitHub and CI/CD pipelines

Work Rights

Not specified
