Staff Site Reliability Engineer

AlphaSense

Delhi, India
On-site
8+ years Site Reliability Engineering experience
Production SaaS systems at scale
Python or Go programming proficiency

Job Summary

  • The role involves architecting core reliability platforms and leading by example in incident response to achieve 99.99% uptime.
  • Candidates will drive the AIOps strategy by automating diagnostics, remediation, and proactive failure prevention.
  • This position requires acting as a force multiplier by mentoring fellow engineers and influencing architectural decisions across the global organization.

Skills & Requirements

Must-have

  • 8+ years Site Reliability Engineering experience
  • Production SaaS systems at scale
  • Python or Go programming proficiency
  • AWS, GCP, or Azure cloud expertise
  • Kubernetes container orchestration
  • TCP/IP, DNS, HTTP/S networking fundamentals
  • Prometheus, Grafana, Datadog monitoring tools

Nice-to-have

  • Advanced observability with OpenTelemetry (OTel)
  • Continuous profiling experience
  • AIOps automation strategy
  • Blameless postmortem culture leadership
  • Mentoring and force multiplier skills

Key Requirements

  • 8+ years of SRE or DevOps experience
  • 3+ years in a Senior+ SRE position
  • Hands-on production SaaS scaling background

Work Rights

Not specified
