Nebius is seeking a Senior Network Site Reliability Engineer (NetSRE) to enhance its network infrastructure, which is crucial for supporting its AI-driven cloud services. The role is remote and emphasizes building reliability through automation and operational excellence.
Job Summary
Nebius is leading a new era in cloud computing to serve the global AI economy by providing tools without massive infrastructure costs.
The role involves defining reliability goals, driving improvements across the whole network, and owning incident response to turn failures into durable fixes.
Candidates will build and evolve observability, design safer change workflows, and work closely with network engineers to embed operability into designs.
Matching Summary
Match Score: 85
Salary
Competitive salary; comprehensive benefits package; flexible working arrangements
Skills & Requirements
Must-have
Strong production Linux fundamentals
Solid understanding of networking basics
Hands-on experience operating high-availability systems
Ability to write software automation in Go or Python
Experience with modern infrastructure tooling like IaC and CI/CD
Nice-to-have
Experience with high-throughput traffic processing
Background in low-level network performance debugging
Experience building network-safe delivery pipelines
Background with large-scale network observability
Key Requirements
Structured approach to debugging complex systems
Software/automation development skills (Go preferred)