Senior Software Engineer - Fleet Management

Nscale Operations UK Ltd

UK, United Kingdom
On-site

Job Summary

  • Build foundational Python-based automation systems that manage the entire lifecycle of our compute infrastructure.
  • Drive technical strategy for reliability, observability, incident response, and operational excellence.
  • Join a collaborative, supportive, and innovative environment where your contributions spark real impact.

Skills & Requirements

Must-have

  • Python-based automation systems
  • Distributed systems at scale
  • Infrastructure reliability and scalability
  • Workflow automation for GPU nodes
  • Hardware lifecycle operations automation
  • AI tools for development workflow

Nice-to-have

  • Workflow orchestration tools
  • Infrastructure tooling experience
  • Bare metal provisioning and automation
  • GPU infrastructure experience
  • HPC and networking knowledge
  • Kubernetes and IaC experience

Key Requirements

  • 5+ years software engineering experience
  • Production systems experience
  • Infrastructure automation or workflow tooling focus
  • Hands-on day 2 operations experience

Work Rights

Not specified
