Job Summary
The Data Engineer collaborates with the existing team of data scientists, data engineers and analysts to create data tools, develop data ingestion and processing pipelines, optimize data processing, and ensure that data systems meet STB's business requirements.
Prepare, process, cleanse and verify the integrity of data collected for analysis; design, develop and implement self-managed data processing and compilation pipelines related to key enterprise data domains.
Work closely with vendors and internal stakeholders to project-manage and coordinate data ingestion and data processing pipelines for Data Science & Analytics (DS&A) across platforms.
Skills & Requirements
Must-have
Develop data ingestion and processing pipelines
Deploy data pipelines from development to production
Work with structured and unstructured datasets
Proficient in R, Python, and SQL
Design and implement data processing pipelines
Enhance data reliability and quality
Nice-to-have
Experience with DataOps and DevOps processes
Comfortable in a dynamic, fast-paced environment
Good presentation and communication skills
Aptitude for solving engineering problems
Key Requirements
1-2 years of work experience in a related field
Trained in a quantitative discipline (Computer Science, Engineering, Math, Statistics)
Experience deploying data pipelines to production is advantageous