Senior ML Engineer - Kimchi (LLM Inference Optimization)
Cast AI Group Inc
Austria
Competitive salary depending on experience; equity...
Fully remote
5+ years building real ML systems
Production Python services development
Experience with vLLM, SGLang, or TensorRT-LLM
Cast AI is seeking a Senior ML Engineer for its Kimchi team, focusing on optimizing inference for large language models (LLMs) within cloud-native environments. The ideal candidate will have extensive experience in building ML systems, particularly in performance tuning and infrastructure optimization.
Job Summary
This role focuses on optimizing throughput, latency, and KV cache utilization to improve customer inference speed and company margins.
The successful candidate will lead the technical direction of the Kimchi system, which automatically matches workloads to the most cost-efficient LLM configurations.
Employees enjoy a flexible remote-first environment with equity options, a learning budget, and dedicated time for personal projects.
Matching Summary
Match Score: 85