Site Reliability Engineer I

Sumo Logic

San Jose, Costa Rica
On-site

Job Summary

  • The role involves owning the availability of planet-scale observability and security products by striving for sustained operational excellence.
  • Candidates will write code and automation to reduce operational workload, eliminate toil, and enable developers to deliver features more rapidly.
  • The position requires participating in blame-free root cause analysis meetings and driving issue resolution across global teams.

Skills & Requirements

Must-have

  • Cloud native application development experience
  • Strong debugging and troubleshooting skills
  • AWS networking, compute, storage, and managed services
  • Modern CI/CD tooling (Kubernetes, Terraform, Ansible, Jenkins)
  • Infrastructure as Code practices (Terraform, CloudFormation)
  • Production-ready code in Java, Scala, or Go
  • Linux systems and command line proficiency

Nice-to-have

  • Experience with Sumo Logic products
  • Planet-scale product development experience
  • Expert-level AWS SaaS operations
  • Streaming technologies (Kafka, KSQL)
  • Advanced JVM workload tuning at scale
  • Flexibility and willingness to step into new roles

Key Requirements

  • Bachelor's or Master's Degree in Computer Science or related field
  • 1+ years of industry experience
  • Ability to author production-ready code in Java, Scala, or Go

Work Rights

Not specified
