Senior Site Reliability Engineer

DraftKings

Boston, Massachusetts, United States

Job Summary

  • This role involves building and scaling critical infrastructure across global data centers, multiple cloud platforms, and on-premise systems to drive stability at scale.
  • You will implement automation for self-healing, fault-tolerant infrastructure using declarative configurations and event-driven workflows while developing internal tools to eliminate repetitive tasks.
  • The position requires participating in an on-call rotation and incident reviews, identifying root causes, and writing Root Cause Analysis (RCA) reports to maintain the highest level of uptime.

Skills & Requirements

Must-have

  • 3+ years distributed cloud experience
  • Kubernetes administration and troubleshooting
  • Go or Python software development
  • Terraform and Ansible automation
  • Linux networking and kernel debugging

Nice-to-have

  • Creative problem-solving skills
  • Contribution to a collaborative team culture
  • Experience with multiple public clouds
  • On-premise system management

Key Requirements

  • Bachelor's degree in Computer Science or relevant education
  • At least 3 years of experience managing distributed cloud environments
  • Deep expertise in container orchestration with Kubernetes

Work Rights

Not specified
