Site Reliability Engineer (SRE)

TRULYYY PTE. LTD.

Singapore
5+ years SRE or DevOps experience
JVM memory management and GC troubleshooting
Distributed systems middleware expertise

Job Summary

  • This role supports large-scale, high-availability internet systems across Southeast Asia with a focus on system reliability and incident response.
  • The successful candidate will drive system improvements including high availability, fault tolerance, and disaster recovery mechanisms.
  • Candidates must possess strong analytical thinking and the ability to communicate effectively in Mandarin to support regional technical teams.

Skills & Requirements

Must-have

  • 5+ years SRE or DevOps experience
  • JVM memory management and GC troubleshooting
  • Distributed systems middleware expertise
  • High-concurrency production environment support
  • Python, Shell, Go, or Java scripting proficiency

Nice-to-have

  • SRE operational frameworks knowledge
  • Capacity planning and service governance
  • Mandarin language communication skills
  • Disaster recovery planning experience
  • Office network (LAN/Wi-Fi) troubleshooting

Key Requirements

  • Minimum 5 years of SRE/DevOps experience
  • Proficiency in JVM and distributed systems
  • Experience with monitoring tools like Grafana and Prometheus
  • Ability to troubleshoot Java process issues
  • Basic networking knowledge for office and cloud environments

Work Rights

Not specified
