Onsite
Job Summary
The role involves building data pipelines that clean, transform, and aggregate data from disparate sources while collaborating with stakeholders in a cross-functional Agile team.
Candidates must hold current eDV (enhanced Developed Vetting) clearance due to the sensitive nature of the work supporting defense, intelligence, and cyber sectors.
Benefits include a contributory pension scheme with up to 10.5% company contribution, a discretionary company bonus scheme, 25 days' holiday plus public holidays, and an early finish on Fridays.
Salary
Not specified; Benefits: Contributory Pension Scheme (up to 10.5% company contribution); Bonus: Discretionary company bonus scheme
Skills & Requirements
Must-have
Strong analytic skills with unstructured datasets
Python programming (PySpark, Pandas, PyArrow)
Distributed data processing using Apache Spark
Data ETL tools (Airflow, AWS Step Functions, NiFi)
Cloud services experience (AWS, Azure, or GCP)
Messaging and streaming technologies (Kafka, SQS)
SQL and NoSQL database management
Containerization with Docker and Kubernetes
Nice-to-have
Experience with HDFS, Iceberg, Elastic, S3, Data Lake