Job Summary
Operate and support multi-tenant SaaS workloads across multiple AWS accounts, ensuring high availability and resilience through proactive monitoring, troubleshooting, and incident response.
Own and enhance CI/CD pipelines (ArgoCD, GitHub Actions) for reliable, repeatable deployments, automate operational workflows, and support engineering teams with smooth delivery and self-service tooling.
Implement and manage monitoring and logging with Splunk, Grafana, OpenTelemetry, and AWS CloudWatch, defining SLIs and SLOs and driving continuous improvements in availability and performance.
Skills & Requirements
Must-have
AWS services expertise
Kubernetes and Docker proficiency
ArgoCD and GitHub Actions CI/CD
AWS CDK and CloudFormation IaC
AWS networking services
IAM, AWS KMS, AWS WAF
OpenTelemetry, Splunk, Grafana monitoring
TypeScript and Bash scripting
Multi-account AWS management
Nice-to-have
Mentoring team members
Thriving in a fast-paced environment
Leveraging technology for efficiency
Building infrastructure that delivers customer value
AWS CDK construct development
Projen familiarity
Helm charts and Kubernetes manifests
AI tools and frameworks
Key Requirements
7+ years in DevOps or cloud infrastructure roles
Significant experience in SaaS and multi-tenant platforms
Proven track record of mentoring team members
Expert knowledge of AWS services
Deep proficiency in Docker, Kubernetes, Helm
Expertise in CI/CD tools
Advanced experience with AWS CDK
Strong understanding of AWS networking services
In-depth knowledge of security and compliance
Extensive experience with monitoring and alerting tools
Familiarity with AWS Glue and Amazon MSK (Managed Streaming for Apache Kafka)
Strong scripting and automation skills
Experience managing multiple AWS accounts
Exceptional communication and collaboration skills