R&D Engineer, HPC Systems

KLA

Chennai, India

Job Summary

  • KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem, investing 15% of sales back into R&D.
  • The Advanced Computing Labs (ACL) in India focuses on delivering advanced parallel computing research and software architectures for AI + HPC + Cloud solutions.
  • The role involves identifying the limitations of existing solutions and developing distributed frameworks that scale AI and image-processing workloads across multi-node clusters.

Skills & Requirements

Must-have

  • Distributed frameworks and system-level solutions
  • Deploying AI-based solutions at scale
  • Deep learning frameworks (TensorFlow, PyTorch)
  • Modern and advanced C++ concepts
  • Operating systems, computer networks
  • Modern distributed systems architecture

Nice-to-have

  • Heterogeneous programming languages (CUDA, Triton)
  • Open-source operating systems and software stack
  • Container infrastructure (Docker, Singularity, Kubernetes)
  • Active participation in C++ standards bodies

Key Requirements

  • Master's or PhD in Computer Science or a related field
  • Relevant experience and an extraordinary track record
  • Experience deploying deep-learning frameworks
  • Strong scripting skills in Bash and Python

Work Rights

Not specified
