Job Summary
The role involves designing and architecting scalable data engineering pipelines that handle diverse data types including text, images, video, and audio.
Candidates will leverage AWS S3 and Snowflake to build robust data ingestion, transformation, and storage solutions while ensuring data governance and security.
HPE offers a hybrid work model requiring an average of two days per week onsite at an HPE office, along with comprehensive benefits and professional development programs.
Skills & Requirements
Must-have
7+ years of data engineering experience
Python, PySpark, and SQL development
AWS S3 and Snowflake architecture
Web crawling and scraping expertise
Salesforce API and Bulk API integration
End-to-end pipeline design for unstructured data
Nice-to-have
Databricks platform familiarity
Kafka and Spark Streaming knowledge
GitHub Actions workflow automation
Cross-functional collaboration skills
Passion for ethical data handling
Comfort with ambiguity and change
Key Requirements
Bachelor's degree in Computer Science or related quantitative field
7+ years of data analysis and engineering experience
Hands-on experience with Salesforce API and web crawling