Site Reliability Engineer (Infrastructure Applications) - AVP / Director P3 - ETS

Morgan Stanley

Hybrid

Job Summary

  • Join Morgan Stanley’s Application Services Infrastructure team to keep a set of business-critical infrastructure applications reliable for technologists across the firm.
  • You’ll combine deep Linux troubleshooting with automation and reliability engineering: improving monitoring, reducing toil, leading upgrades, and driving root-cause fixes that prevent repeat incidents.
  • After onboarding, you’ll join a rotating on-call roster with periodic weekend coverage (~1 weekend/month).

Skills & Requirements

Must-have

  • Linux troubleshooting and automation
  • Production reliability ownership
  • Incident response and root-cause analysis
  • Improving monitoring and alerting quality
  • Building self-service workflows and documentation
  • Bash/shell scripting and Python automation

Nice-to-have

  • Cloud-native deployment and support
  • Observability tooling experience
  • Linux administration and performance tuning
  • Workflow and scheduling platforms experience

Key Requirements

  • At least 7 years of experience in production support / reliability
  • Strong command-line troubleshooting skills
  • Production-ready automation in bash/shell plus one other programming language
  • Strong written communication skills
  • Working understanding of distributed architecture
  • AI-assisted development and operational automation

Work Rights

Not specified
