Manage and maintain workspaces, clusters, pools, and jobs in Databricks or other big data systems, monitoring performance, usage, and costs to optimize cluster configurations
Job Summary
Manage and maintain workspaces, clusters, pools, and jobs in Databricks or other big data systems, monitoring performance, usage, and costs to optimize cluster configurations.
Configure role-based access control (RBAC) and audit logging, and ensure compliance with data governance and security standards.
Develop CI/CD pipelines for notebooks, jobs, and libraries using tools like GitLab, GitHub Actions, Azure DevOps, or Jenkins, and automate cluster lifecycle management and job scheduling.
Skills & Requirements
Must-have
Cloud-based data platforms
Apache Spark
Big data technologies
Databricks administration
Python, SQL, shell scripting
CI/CD pipelines
Infrastructure-as-code
Nice-to-have
Enable data teams to work efficiently
Passion for continuous improvement
Collaborative and creative culture
Commitment to sustainability
Key Requirements
Proven experience administering big data platforms
Strong understanding of Apache Spark
Experience with cloud platforms (Azure preferred)
Familiarity with Unity Catalog, Delta Lake, and Lakehouse architecture