Senior Site Reliability Engineer

Navan (TripActions)

Austin, TX, US
On-site

Job Summary

  • The role involves designing and developing tooling, automation, and infrastructure services to power Navan's scalable systems used by thousands of travelers daily.
  • Candidates must identify reliability anti-patterns, automate toil, and leverage AI tools to achieve autonomous operations and improve system observability.
  • The position requires defining system reliability standards including SLO/SLI frameworks and driving the adoption of blameless post-mortem practices across engineering teams.

Skills & Requirements

Must-have

  • 5+ years SRE or DevOps Lead experience
  • Production 24x7 environment experience
  • Java application operational experience
  • AWS public cloud infrastructure
  • Infrastructure as Code (Terraform)
  • CI/CD pipeline management
  • Microservice architecture patterns

Nice-to-have

  • Python, Node.js, or Go scripting
  • Mentoring junior engineers
  • AI-assisted developer tools
  • Fast-paced startup environment
  • Strong ownership culture

Key Requirements

  • 5+ years progressive SRE or DevOps experience
  • 2+ years production 24x7 environment experience
  • Hands-on Java JVM profiling and tuning
  • Experience with distributed systems in AWS
  • Proficiency in Terraform or CloudFormation
  • Proven track record of mentoring engineers

Work Rights

Not specified
