Site Reliability Engineering (SRE) Tech Lead

Obsidian Security

Palo Alto, United States
On-site

Job Summary

  • Define and build the reliability foundation for a complex, multi-tenant SaaS platform serving enterprise and financial customers.
  • Architect and implement systems that handle real-world complexity: upstream SaaS dependencies, sparse and noisy data, and mission-critical enterprise workloads.
  • Build a detection and reliability platform from the ground up, solving for multi-tenant systems with upstream dependencies and sparse data.

Salary

Base: $250,000 to $280,000 USD; Bonus/Equity: Not specified; Benefits: Not specified

Skills & Requirements

Must-have

  • multi-tenant SaaS platform
  • instrument critical system paths
  • connector health models
  • tiered incident communication
  • SLI/SLO standards
  • self-service observability tooling
  • anomaly detection
  • incident response processes

Nice-to-have

  • customer-facing status pages
  • third-party SaaS connector ingestion
  • B2B SaaS enterprise/financial customers

Key Requirements

  • 7+ years of experience in SRE or production engineering
  • 2+ years as a technical lead
  • AWS and/or GCP
  • Kubernetes, Helm
  • Prometheus, Grafana
  • GitLab CI/CD, ArgoCD
  • multi-tenant SaaS systems
  • distributed microservices

Work Rights

Not specified
