Site Reliability Engineer | AI Infrastructure

JLL

Tel Aviv, Israel
Hybrid

Job Summary

  • You will own the platform layer for AI agents, including deployment architecture, observability, and production reliability.
  • The challenge involves designing deployment and observability for LLM-backed services, tracking output quality, cost per invocation, and model drift.
  • You will create tooling that makes it easy for the team to build, test, and deploy, setting the team's default patterns.

Matching Summary

You will own the platform layer for AI agents, including deployment architecture, observability, and production reliability.

Salary

Base: Not specified; Bonus/Equity: RSUs (standard 4-year vest), annual bonus; Benefits: Keren Hishtalmut (study fund)

Skills & Requirements

Must-have

  • Cloud platforms (Azure or AWS)
  • Containerization (Docker, Kubernetes)
  • CI/CD pipelines
  • Infrastructure-as-code (Terraform, CDK)
  • Monitoring and observability tools
  • Incident management experience
  • Linux, networking, security fundamentals

Nice-to-have

  • AI/ML infrastructure experience
  • Production code in TypeScript or Python
  • Self-service developer tooling
  • Cost optimization for cloud workloads
  • Enterprise security engineering

Key Requirements

  • 5+ years in SRE, platform engineering, DevOps, or infrastructure roles
  • Experience owning infrastructure end-to-end
  • Experience with on-call rotations, production incidents, and post-mortems
  • Comfortable working independently with broad ownership
  • Strong written and verbal English

Work Rights

Not specified
