IN_Senior Associate_Spark Scala_Data and Analytics_Advisory_Bangalore
PwC PricewaterhouseCoopers GmbH
Bengaluru, India
Spark Scala data processing pipelines
Hive-based data models and ETL
Oozie / Airflow workflow scheduling
Job Summary
Design, develop, and optimize large-scale data processing pipelines using Spark (Scala) and implement robust, reusable, and efficient data transformation logic.
Develop and maintain Hive-based data models and ETL processes, schedule and manage workflows using Oozie / Airflow, and implement CI/CD pipelines using GitHub Actions.
At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities, powered by technology in an environment that drives innovation.
Skills & Requirements
Must-have
Spark Scala data processing pipelines
Hive-based data models and ETL
Oozie / Airflow workflow scheduling
CI/CD pipelines using GitHub Actions
Maven or SBT dependency management
Spark job optimization and debugging
End-to-end data-ingestion workflows
Nice-to-have
Docker containerization
Kubernetes deployment and scaling
Kafka / streaming pipelines
Cross-functional collaboration
Leveraging data for insights
Optimizing business performance
Key Requirements
4-8 years of experience
B.E., B.Tech., M.E., MCA, or M.Tech. education
Master of Engineering, Bachelor of Engineering, MBA preferred