Senior Site Reliability Engineer, Observability

Webflow

Remote, Argentina

Job Summary

  • The role focuses on improving the reliability and stability of Webflow's customer-facing production infrastructure, which serves millions of page views per hour.
  • Engineers will own and evolve the observability stack using OpenTelemetry and Datadog to provide actionable metrics, traces, and logs across services.
  • Webflow offers comprehensive benefits including equity (RSUs), full health coverage, flexible vacation, and a 401(k) with a 100% employer match.

Salary

Base salary not specified; equity (RSUs); eligible for the annual WIN bonus program

Skills & Requirements

Must-have

  • 5+ years distributed systems experience
  • Datadog or Grafana observability tools
  • OpenTelemetry instrumentation frameworks
  • SLO/SLI definition and operationalization
  • AWS or GCP cloud environment scaling
  • Kubernetes container architecture experience
  • Terraform or Pulumi infrastructure as code

Nice-to-have

  • AI-powered agent development for observability
  • Proactive embrace of emerging AI technologies
  • Experience improving incident response processes
  • Full-stack application debugging in TypeScript
  • Building automated root cause analysis tools

Key Requirements

  • BS/BA degree or relevant experience
  • Business-level English fluency
  • 5+ years in customer-facing distributed systems

Work Rights

Not specified
