Software Engineer, Data Infrastructure & Acquisition - Paris, France
Speechify
Paris, France
On-site
Job Summary
The role is responsible for all aspects of data collection to support model training operations at petabyte scale.
Candidates will operate and extend the cloud infrastructure for the ingestion pipeline, currently running on GCP and managed with Terraform.
Speechify offers a competitive salary, a friendly atmosphere, and an opportunity to build products that directly impact people with learning differences.
Skills & Requirements
Must-have
Proficiency with Bash and Python scripting
Experience with Docker and Infrastructure-as-Code
Professional experience with Google Cloud Platform (GCP)
Nice-to-have
Experience with web crawlers
Large-scale data processing workflows
Ability to handle multiple tasks and adapt to changing priorities
Key Requirements
BS/MS/PhD in Computer Science or related field
5+ years of industry experience in software development