Job Summary
You will be responsible for maintaining high service availability, implementing monitoring and driving continuous improvements in our production environment.
After 1 year, you will drive strategic initiatives: play an instrumental role in defining the long-term reliability roadmap and integrating new tools and practices to further stabilize our services.
Benefits include Nationale Nederlanden life insurance, Edenred lunch card, Luxmed private health care, and a Multisport card.
Skills & Requirements
Must-have
OpenStack internal tools
Python / Golang / Bash proficiency
IaC management (Puppet, Ansible, Terraform)
Kubernetes and Docker experience
Monitoring, logging, and alerting systems (Prometheus, Grafana)
SRE methodologies
Fluent English
Nice-to-have
Collaborative mindset
Performance tuning skills
Continuous improvement mindset
Key Requirements
Proficiency in development using Python / Golang / Bash or a similar language
Hands-on experience managing and optimizing IaC
Skilled in overseeing and maintaining cloud infrastructure
Experience with monitoring, logging, and alerting systems, combined with automating repetitive tasks
Expertise in applying SRE practices by maintaining and monitoring large software systems