Principal Engineer, Machine Learning, SMAI

MICRON SEMICONDUCTOR ASIA OPERATIONS PTE. LTD.

Yishun, Singapore
Not specified
Micron Semiconductor Asia Operations is seeking a Principal Engineer in Machine Learning for its Smart Manufacturing and AI team. The role involves designing and implementing scalable AI/ML solutions to enhance Micron’s manufacturing processes, and it requires extensive experience in data technologies and machine learning frameworks.

Job Summary

  • The role focuses on delivering industry-winning machine learning, custom GenAI, and Agentic AI solutions to power Micron's dominance in the memory solutions market.
  • Candidates will architect large-scale custom model training jobs on multi-node clusters and optimize training throughput using distributed strategies like FSDP and DeepSpeed.
  • The position requires designing autonomous AI agents capable of multi-step reasoning to automate complex manufacturing workflows using frameworks such as LangChain and LangGraph.

Matching Summary

Match Score: 85


Skills & Requirements

Must-have

  • Distributed training strategies (FSDP, DeepSpeed, Megatron-LM)
  • Fine-tuning large language models (PEFT, LoRA, QLoRA)
  • Building autonomous AI agents (LangChain, LangGraph, CrewAI)
  • GPU architecture optimization (memory hierarchy, tensor cores)
  • Proficiency in Python and the PyTorch framework
  • CI/CD pipelines (Jenkins, Docker, Kubernetes)
  • 9+ years building scalable ETL pipelines

Nice-to-have

  • Experience with HPC job schedulers (Slurm)
  • CUDA programming, Triton kernels, and custom C++ extensions
  • Multi-agent systems orchestration and collaboration
  • Computer vision and signal processing techniques
  • Snowflake and Google Cloud Platform knowledge
  • Ray and Kubeflow for GPU workload orchestration
  • Strong analytical thinking and communication skills

Key Requirements

  • Technical Degree required
  • Computer Science or Statistics background highly desired
  • 9+ years of experience with big data processing
  • Deep understanding of GPU architecture and resource management
  • Proficiency in Python preferred over Java

Work Rights

Not specified
