Data Scientist

PPD (Thermo Fisher)

Bangalore, India
On-site

Job Summary

  • Design, develop, and maintain scalable data pipelines and data processing solutions using Python, SQL, and PySpark in AWS environments.
  • Build, train, evaluate, and deploy machine learning and AI models to solve business problems and improve decision-making.
  • Apply Generative AI (GenAI) techniques (e.g., LLMs, prompt engineering, embeddings) to develop innovative data products and automation solutions.

Skills & Requirements

Must-have

  • Python and PySpark for data pipelines
  • Machine learning and AI model development
  • Generative AI techniques (LLMs, prompt engineering)
  • AWS cloud environments (S3, SageMaker)
  • SQL for data manipulation
  • MLOps best practices for deployment
  • Data exploration and feature engineering

Nice-to-have

  • Collaborating with cross-functional teams
  • Optimizing data processing workflows
  • Staying current with AI/ML advancements
  • Participating in knowledge-sharing sessions

Key Requirements

  • 3–5 years of experience in data science
  • Experience with AWS services
  • Experience with machine learning frameworks
  • Experience with Generative AI
  • Strong knowledge of SQL
  • Understanding of data modeling and ETL
  • Experience with Git

Work Rights

Not specified
