Harden Databricks with Immuta’s Policy-As-Code Framework

Databricks provides a powerful, Spark-centric, cloud-based analytics platform that lets users rapidly process, transform, and explore data. However, because the platform is so flexible, its preconfigured security can be insufficient for regulating or monitoring access to confidential information. This is a particular concern for highly regulated enterprises, such as financial and healthcare companies. Policy-as-code is a newer paradigm that can help these companies manage the additional technical overhead of compliance and governance as they migrate more and more sensitive data to Databricks.

Data Security and Privacy Controls

Policy-as-code is a way to automate data security and privacy controls, allowing organizations to apply centralized policies to data science and analytics projects. Immuta’s Policy-As-Code feature is a simple and efficient way to add protection on top of Databricks’ security model while giving the entire organization improved visibility into data use and access. With Immuta and Databricks, policy-as-code can be implemented quickly. For example, you can create data sources, policies, projects, and purposes through an API with defined endpoints, methods, query parameters, and payloads. Immuta also lets you audit data access for compliance and create data governance policies without writing code.

Setting up the Integration

To set up the integration, Immuta provides an API that can be used to write and deploy policies as JSON. These policies define who has access to which datasets and for how long, as well as the granularity of access granted on each dataset. Immuta also lets users audit data usage and enforce data governance rules across all Databricks clusters. A sample policy might look something like this:

{
    "name": "My Databricks Dataset",
    "actions": {
        "read": [
            "group: Databricks"
        ],
        "write": []
    },
    "time_restrictions": {
        "start": <start time>,
        "end": <end time>
    }
}
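
To keep a policy like the sample above in version control, it can help to generate and check it in code rather than edit the JSON by hand. The following is a minimal Python sketch under stated assumptions, not Immuta's actual API: the helper names `build_policy` and `can_read`, the ISO 8601 timestamp format, and the example dates are all hypothetical, and the payload shape simply mirrors the illustrative sample.

```python
import json
from datetime import datetime, timezone


def build_policy(name, read_groups, start, end, write_groups=()):
    """Return a policy dict shaped like the illustrative sample.

    Hypothetical helper: the field names come from the sample policy,
    not from a published Immuta schema.
    """
    if end <= start:
        raise ValueError("end must be later than start")
    return {
        "name": name,
        "actions": {
            "read": [f"group: {group}" for group in read_groups],
            "write": [f"group: {group}" for group in write_groups],
        },
        "time_restrictions": {
            # Assumption: timestamps are stored as ISO 8601 strings.
            "start": start.isoformat(),
            "end": end.isoformat(),
        },
    }


def can_read(policy, group, when):
    """Check whether `group` may read at time `when` under `policy`."""
    window = policy["time_restrictions"]
    start = datetime.fromisoformat(window["start"])
    end = datetime.fromisoformat(window["end"])
    in_group = f"group: {group}" in policy["actions"]["read"]
    return in_group and start <= when <= end


# Example dates are placeholders chosen for illustration only.
policy = build_policy(
    "My Databricks Dataset",
    ["Databricks"],
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 12, 31, tzinfo=timezone.utc),
)
print(json.dumps(policy, indent=4))
```

Validating the time window when the policy is built means a malformed policy fails before it is ever deployed, which is the main benefit of treating policies as code.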

Immuta’s Policy-As-Code feature gives Databricks a comprehensive, industry-tested security model that helps enterprises ensure confidential data is properly managed and secured. With it, organizations can quickly deploy centralized policies for data science and analytics projects; define data sources, policies, projects, and purposes along with their endpoints, methods, query parameters, and payloads; and audit data access for compliance and create data governance policies without writing code.

David Callaghan, Solutions Architect

As a solutions architect with Perficient, I bring twenty years of development experience and I'm currently hands-on with Hadoop/Spark, blockchain and cloud, coding in Java, Scala and Go. I'm certified in and work extensively with Hadoop, Cassandra, Spark, AWS, MongoDB and Pentaho. Most recently, I've been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud with an emphasis on integrating modern data products such as HBase, Cassandra and Neo4j as the off-blockchain repository.