Data Engineer - PySpark

Workforcity

Job Summary

  • This is a data engineer position responsible for the design, development, implementation, and maintenance of data flow channels and data processing systems.
  • Develop and optimize scalable Spark Java-based data pipelines for processing and analyzing large-scale financial data, and design distributed computing solutions.
  • Work with business stakeholders and business analysts to understand requirements, and with data scientists to interpret complex datasets.

Skills & Requirements

Must-have

  • Spark Java development expertise
  • Python and Apache Spark
  • Big Data processing
  • Data flow channels design
  • Scalable data pipelines
  • Real-time processing systems
  • Confluent Kafka

Nice-to-have

  • Interpersonal and communication skills
  • Fast-paced financial environment
  • Mathematical and analytical mindset
  • Problem-solving skills
  • Code inspection processes

Key Requirements

  • 5-8 years of experience in data ecosystems
  • 4-5 years of hands-on experience with Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, and Unix scripting
  • 3+ years of experience with relational SQL and NoSQL databases
  • Strong proficiency in Python and Spark Java
  • Data integration, migration, and large-scale ETL experience
  • Data Modeling experience
  • Experience with cloud platforms (AWS, GCP)
  • Experience with container technologies (Docker, Kubernetes)

Work Rights

Not specified
