Senior Site Reliability Engineer, Observability

Webflow

Remote, Argentina

Job Summary

  • This role focuses on improving the reliability and stability of Webflow's customer-facing production infrastructure serving millions of page views per hour.
  • The successful candidate will own and evolve the observability stack using OpenTelemetry and Datadog to provide actionable metrics, traces, and logs.
  • Webflow offers comprehensive benefits including equity (RSUs), full health coverage, flexible vacation, and a 401(k) with 100% employer match.

Salary

Base salary not specified; equity (RSUs); eligible for the annual WIN bonus program

Skills & Requirements

Must-have

  • 5+ years distributed systems experience
  • Hands-on observability platform expertise
  • OpenTelemetry instrumentation knowledge
  • SLO/SLI definition and operationalization
  • AWS or GCP cloud environment scaling
  • Kubernetes container architecture experience
  • Infrastructure-as-code tool proficiency

Nice-to-have

  • AI-powered agent development experience
  • Pulumi infrastructure tool usage
  • Incident response process improvement
  • Proactive embrace of emerging technologies
  • Full-stack application debugging skills

Key Requirements

  • BS/BA degree or relevant experience
  • Business-level English fluency
  • 5+ years in customer-facing distributed systems

Work Rights

Not specified
