Infrastructure as code with terraform or cloudformation
Site reliability engineering principles implementation
The role involves administering and engineering Big Data platforms across multiple Hadoop clusters in the cloud to ensure scalability and reliability
Job Summary
The role involves administering and engineering Big Data platforms across multiple Hadoop clusters in the cloud to ensure scalability and reliability.
Candidates will apply Site Reliability Engineering principles to design robust tooling and automated response mechanisms for proactive risk mitigation.
This position requires owning the architecture of Platform, PaaS, and SaaS solutions while providing technical leadership and mentorship to the team.
Matching Summary
The role involves administering and engineering Big Data platforms across multiple Hadoop clusters in the cloud to ensure scalability and reliability.
Skills & Requirements
Must-have
Hadoop cluster administration on AWS EMR
Infrastructure as Code with Terraform or CloudFormation
Site Reliability Engineering principles implementation
Python, SQL, Spark, and Linux proficiency
Incident triaging and service restoration ownership
Nice-to-have
Experience with Splunk, Dynatrace, or CloudWatch
Mentorship of junior and mid-level engineers
Knowledge of machine learning algorithms
Self-service data wrangling enablement
Data visualization and metadata management
Key Requirements
7+ years of experience in platform administration and big data technologies
3+ years of hands-on AWS Infrastructure as Code engineering experience
Bachelor's degree in Computer Science or related field
Architectural guidance experience for developers and administrators