Site Reliability Engineer, Machine Learning Systems - Singapore
BYTEDANCE PTE. LTD.
Singapore
Hybrid
Bachelor's degree in computer science
Proficiency in Go, Python, or shell
Kubernetes and container experience
ByteDance is seeking a Site Reliability Engineer for Machine Learning Systems in Singapore, focused on maintaining and optimizing large-scale ML systems. The ideal candidate will have a strong background in programming, operations of distributed systems, and a passion for innovation within a diverse team.
Job Summary
The ByteDance Large Model Team is committed to developing the most advanced AI large model technology in the industry.
You will build large-scale heterogeneous systems integrating GPU/NPU/RDMA/Storage and ensure they run stably and reliably.
The role offers a positive team atmosphere with career growth opportunities and paid leave within a flat organization.