Achieving end-to-end lineage in Databricks while allowing external users to access raw data can be a challenging task. In Databricks, leveraging Unity Catalog for end-to-end lineage is a best practice. However, enabling external users to access raw data while maintaining security and lineage integrity requires a well-thought-out architecture. This blog outlines a reference architecture to […]
Posts Tagged ‘data governance’
Deletion Vectors in Delta Live Tables: Identifying and Remediating Compliance Risks
Deletion Vectors will be enabled by default in Delta Live Tables (DLTs) for materialized views and streaming tables starting April 28, 2025. Predictive Optimization for DLT maintenance will also be enabled by default. This could provide both cost savings and performance improvements. Our Databricks Practice holds FinOps as a core architectural tenet, but sometimes compliance […]
How Automatic Liquid Clustering Supports Databricks FinOps at Scale
Perficient has a FinOps mindset with Databricks, so the Automatic Liquid Clustering announcement grabbed my attention. I’ve mentioned Liquid Clustering before when discussing the advantages of Unity Catalog beyond governance use cases. Unity Catalog: come for the data governance, stay for the predictive optimization. I am usually a fan of being able to tune the dials […]
SAP and Databricks: Better Together
SAP Databricks is important because convenient access to governed data to support business initiatives is important. Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint […]
A New Era of AI Agents in the Enterprise?
In a move that has sparked intense discussion across the enterprise software landscape, Klarna announced its decision to drop both Salesforce Sales Cloud and Workday, replacing these industry-leading platforms with its own AI-driven tools. This announcement, led by CEO Sebastian Siemiatkowski, may signal a paradigm shift toward using custom AI agents to manage critical business […]
Agentic AI: The New Frontier in GenAI
In the rapidly evolving landscape of digital transformation, businesses are constantly seeking innovative ways to enhance their operations and gain a competitive edge. While Generative AI (GenAI) has been the hot topic since OpenAI introduced ChatGPT to the public in November 2022, a new evolution of the technology is emerging that promises to revolutionize how […]
Maximize Your Data Management with Unity Catalog
Databricks Unity Catalog is a unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. Unity Catalog offers a comprehensive solution for enhancing data governance, operational efficiency, and technological performance. By centralizing metadata management, access controls, and data lineage tracking, it simplifies compliance, reduces complexity, and improves query performance […]
Risk Management Data Strategy – Insights from an Inquisitive Overseer
We are witnessing a sea-change in the way data is managed by banks and financial institutions all over the world. Data being commoditized and, in some cases, even monetized by banks is the order of the day. Though this seems to be at a stage where some more push is required in terms of adoption […]
Data Lake Governance with Tagging in Databricks Unity Catalog
The goal of Databricks Unity Catalog is to provide centralized security and management to data and AI assets across the data lakehouse. Unity Catalog provides fine-grained access control for all the securable objects in the lakehouse: databases, tables, files, and even models. Gone are the limitations of the Hive metadata store. The Unity Catalog metastore […]
Let’s Meet at Informatica World 2023 #InformaticaWorld
Informatica World takes place May 8-11 at the Venetian Resort Las Vegas and we can’t wait to meet you there! Perficient is a proud sponsor of Informatica’s largest event, which brings together customers and partners from across the globe. Perficient is a global digital consultancy, an Informatica Platinum Enterprise Partner, and the 2022 Cloud Modernization […]
Transform Your Business with Amazon DataZone
Amazon recently released a new data tool called DataZone, which allows companies to share, search, and discover data at scale across organizational boundaries. It offers many features such as the ability to search for published data, request access, collaborate with teams through data assets, manage and monitor data assets across projects, access analytics with a […]