IN_Senior Associate_PySpark Developer_Data & Analytics_Advisory_Kolkata

Virtualspaces PwC

Kolkata, India
Onsite

Job Summary

  • Leverage data to drive insights and make informed business decisions using advanced analytics techniques.
  • Develop and implement innovative solutions to optimize business performance and enhance competitive advantage.
  • Be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for clients and communities.

Skills & Requirements

Must-have

  • Hadoop/Spark ecosystem data pipelines
  • Spark (Scala/PySpark) and Hive/Impala SQL
  • Kafka for streaming ingestion
  • NiFi for batch/near-real-time flows
  • Cloudera Manager, YARN/Tez, HDFS
  • Job orchestration using Oozie/Airflow
  • Data warehousing concepts
  • Linux/Unix, Shell scripting, Git, CI/CD
  • SQL and data modelling for BFSI

Nice-to-have

  • Cloudera Data Platform (CDP)
  • Apache Ranger, Atlas, Kerberos
  • Cloud data services (AWS, Azure, GCP)
  • Databricks experience
  • Containerization and orchestration
  • Python for data processing
  • Monitoring/observability tools

Key Requirements

  • 4-7 years of experience
  • B.E./B.Tech or M.E./M.Tech
  • ETL Testing

Work Rights

Not specified
