These days, data is pretty much everywhere, and come 2025, it’s more valuable, more regulated, and honestly more complicated than ever. But with all this data comes responsibility. That’s where data governance steps in. Whether you’re running a business, working in IT, or just really care about your privacy, understanding data governance is key to […]
Posts Tagged ‘data governance’
Understanding Clean Rooms: A Comparative Analysis Between Databricks and Snowflake
“Clean rooms” have emerged as a pivotal data-sharing innovation, with both Databricks and Snowflake offering enterprise options. Clean rooms are secure environments designed to let multiple parties collaborate on data analysis without exposing sensitive details of the underlying data. They serve as a sandbox where participants can perform computations on shared datasets while keeping raw […]
Master Data Management: The Key to Improved Analytics Reporting
In today’s data-driven business environment, organizations rely heavily on analytics to make strategic decisions. However, the effectiveness of analytics reporting depends on the quality, consistency, and reliability of data. This is where Master Data Management (MDM) plays a crucial role. By establishing a single, authoritative source of truth for critical data domains, MDM ensures that […]
Avoiding Metadata Contention in Unity Catalog
Metadata contention in Unity Catalog can occur in high-throughput Databricks environments, slowing down user queries and impacting performance across the platform. Our FinOps strategy shifts left on performance. However, we have found scenarios where clients still experience intermittent query slowdowns, even on optimized queries. As our client’s lakehouse footprint grows, we are seeing […]
Why Do Organizations Need Data Governance?
It is well known that data is a crucial asset to an organization when it is managed appropriately, and data governance is what helps organizations do that. Some customers say data governance is an optional best practice rather than a mandatory implementation strategy. If so, ask your customer a few questions: Is your data reliable or trustworthy? Is your […]
End-to-End Lineage and External Raw Data Access in Databricks
Achieving end-to-end lineage in Databricks while allowing external users to access raw data can be a challenging task. In Databricks, leveraging Unity Catalog for end-to-end lineage is a best practice. However, enabling external users to access raw data while maintaining security and lineage integrity requires a well-thought-out architecture. This blog outlines a reference architecture to […]
Deletion Vectors in Delta Live Tables: Identifying and Remediating Compliance Risks
Deletion Vectors will be enabled by default in Delta Live Tables (DLTs) for materialized views and streaming tables starting April 28, 2025. Predictive Optimization for DLT maintenance will also be enabled by default. This could provide both cost savings and performance improvements. Our Databricks Practice holds FinOps as a core architectural tenet, but sometimes compliance […]
How Automatic Liquid Clustering Supports Databricks FinOps at Scale
Perficient has a FinOps mindset with Databricks, so the Automatic Liquid Clustering announcement grabbed my attention. I’ve mentioned Liquid Clustering before when discussing the advantages of Unity Catalog beyond governance use cases. Unity Catalog: come for the data governance, stay for the predictive optimization. I am usually a fan of being able to tune the dials […]
SAP and Databricks: Better Together
SAP Databricks is important because convenient access to governed data to support business initiatives is important. Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint […]
A New Era of AI Agents in the Enterprise?
In a move that has sparked intense discussion across the enterprise software landscape, Klarna announced its decision to drop both Salesforce Sales Cloud and Workday, replacing these industry-leading platforms with its own AI-driven tools. This announcement, led by CEO Sebastian Siemiatkowski, may signal a paradigm shift toward using custom AI agents to manage critical business […]
Agentic AI: The New Frontier in GenAI
In the rapidly evolving landscape of digital transformation, businesses are constantly seeking innovative ways to enhance their operations and gain a competitive edge. While Generative AI (GenAI) has been the hot topic since OpenAI introduced ChatGPT to the public in November 2022, a new evolution of the technology is emerging that promises to revolutionize how […]