Firmus Technologies is a global leader pioneering efficient AI infrastructure across Asia Pacific with a mission to combine cutting-edge technology with sustainability.
Job Summary
The role involves owning the control plane that powers AI workload submission by designing unified APIs, CLIs, and web interfaces for training, inference, and fine-tuning on Kubernetes and Slurm.
Engineers will implement intelligent scheduling policies including priority classes, preemption, fairness algorithms, and multi-tenant isolation while wiring observability pipelines for per-job GPU metrics and cost tracking.
Skills & Requirements
Must-have
Deep Kubernetes expertise
Hands-on Slurm experience
Production API development
Distributed systems knowledge
RBAC and resource quotas
Nice-to-have
Python, Go, or Java proficiency
LLM engineering collaboration
Sustainability commitment
Multi-generational liquid cooling context
Cost tracking implementation
Key Requirements
5–7 years of backend engineering experience
Production API and distributed systems background
Kubernetes Job controllers and Pod specs knowledge