STAFF / SENIOR ENGINEER, AI SYSTEMS & LLM INFERENCE OPTIMIZATION | AI INFRA PLATFORM | SINGAPORE-BASED

GK CONSULTING PTE. LTD.

Singapore, Singapore

GK Consulting Pte. Ltd. is seeking a Staff/Senior Engineer specializing in AI Systems and LLM Inference Optimization, with a focus on improving large-scale AI infrastructure performance. The ideal candidate will possess a strong background in computer architecture or software systems, along with expertise in programming and deep learning.

Job Summary

  • This role focuses on optimizing latency, throughput, and cost efficiency across large-scale AI systems for next-generation foundation models.
  • Candidates will work closely with research and engineering teams to improve model efficiency using techniques such as quantization and sparsity.
  • The position is open to two profiles: Computer Architecture, for hardware-software co-optimization, or Software/AI Systems, for runtime and compiler optimization.

Skills & Requirements

Must-have

  • C++/CUDA/Python programming skills
  • Deep learning systems experience
  • LLM inference optimization techniques

Nice-to-have

  • Experience with PyTorch or TensorRT
  • Knowledge of the vLLM framework
  • Strong collaboration and performance mindset

Key Requirements

  • Master's or Ph.D. in a relevant field
  • Hands-on experience in large model inference
  • Strong background in high-performance kernels

Work Rights

Not specified
