Application Support Engineer (SRE)

KNOVEL ENGINEERING PTE. LTD.

Singapore

Job Summary

  • The role involves owning the reliability of an AI-native ecosystem by leading high-impact incident resolution.
  • Candidates will perform deep-dive root-cause analysis across cloud, containerized environments, and ML model behaviors.
  • The company offers a flat hierarchy with minimal bureaucracy and opportunities to work with cutting-edge technologies.

Salary

Competitive remuneration (amount not specified); benefits included

Skills & Requirements

Must-have

  • Cloud infrastructure proficiency (AWS or Azure)
  • Kubernetes and Docker container troubleshooting
  • Observability tools (Datadog, Grafana, Prometheus)
  • Scripting skills (Python, Golang, Bash)
  • Root cause analysis for production incidents

Nice-to-have

  • Machine learning system support experience
  • Infrastructure as Code (Terraform, Ansible, Helm)
  • Knowledge of the ITIL incident management framework
  • Ability to mentor junior support staff

Key Requirements

  • Degree in Information Systems, Computer Science, or a related field
  • Experience supporting complex cloud-native software systems
  • Formal experience with incident management frameworks like ITIL

Work Rights

Not specified
