David CallaghanSenior Solutions Architect

Databricks Champion | Center of Excellence Lead | Data Privacy & Governance Expert | Speaker & Trainer | 30+ Yrs in Enterprise Data Architecture

Blogs from this Author

Avoiding Metadata Contention in Unity Catalog

Metadata contention in Unity Catalog can occur in high-throughput Databricks environments, slowing user queries and degrading performance across the platform. Our FinOps strategy shifts left on performance. However, we have found scenarios where clients still experience intermittent query slowdowns, even on optimized queries. As our clients' lakehouse footprints grow, we are seeing […]

End-to-End Lineage and External Raw Data Access in Databricks

Achieving end-to-end lineage in Databricks while allowing external users to access raw data can be a challenging task. In Databricks, leveraging Unity Catalog for end-to-end lineage is a best practice. However, enabling external users to access raw data while maintaining security and lineage integrity requires a well-thought-out architecture. This blog outlines a reference architecture to […]

Top 5 Mistakes That Make Your Databricks Queries Slow (and How to Fix Them)

I wanted to discuss the top 5 mistakes that make your Databricks queries slow as a prequel to some of my FinOps blogs. Premature optimization may or may not be the root of all evil, but we can all agree optimization without a solid foundation is not an effective use of time and resources. Predictive optimization […]

Deletion Vectors in Delta Live Tables: Identifying and Remediating Compliance Risks

Deletion Vectors will be enabled by default in Delta Live Tables (DLTs) for materialized views and streaming tables starting April 28, 2025. Predictive Optimization for DLT maintenance will also be enabled by default. This could provide both cost savings and performance improvements. Our Databricks Practice holds FinOps as a core architectural tenet, but sometimes compliance […]

Unlocking the Future of Enterprise AI: Databricks Announces Anthropic Partnership

The recent strategic partnership between Databricks and Anthropic is a big step forward for enabling enterprises to build, deploy, and govern AI agents that reason over proprietary data with accuracy, security, and governance. The landscape of enterprise AI is evolving rapidly, and we’re excited to share how our practice is positioned to help businesses maximize […]

Delta Live Tables and Great Expectations: Better Together

Modern data platforms like Databricks enable organizations to process massive volumes of batch and streaming data—but scaling reliably requires more than just compute power. It demands data observability: the ability to monitor, validate, and trace data through its lifecycle. This blog compares two powerful tools—Delta Live Tables and Great Expectations—that bring observability to life in […]
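The expectation-based validation pattern that Great Expectations formalizes can be sketched in plain Python. This is a toy illustration only; the function names below mirror the style of the Great Expectations API but are hypothetical stand-ins, not the real library calls, and real pipelines would run these checks against DataFrames rather than lists of dicts:

```python
def expect_column_values_not_null(rows, column):
    """Return a validation result flagging rows where `column` is null."""
    failures = [i for i, row in enumerate(rows) if row.get(column) is None]
    return {"success": not failures, "failed_rows": failures}

def expect_column_values_between(rows, column, min_value, max_value):
    """Check that every non-null value in `column` falls within [min_value, max_value]."""
    failures = [
        i for i, row in enumerate(rows)
        if row.get(column) is not None
        and not (min_value <= row[column] <= max_value)
    ]
    return {"success": not failures, "failed_rows": failures}

# A tiny sample batch with one null and one out-of-range amount.
orders = [
    {"order_id": 1, "amount": 42.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": -5.0},
]

results = [
    expect_column_values_not_null(orders, "amount"),
    expect_column_values_between(orders, "amount", 0, 10_000),
]
all_passed = all(r["success"] for r in results)
```

The value of the pattern is that each expectation returns a structured result rather than throwing, so a pipeline can aggregate failures, quarantine bad rows, or halt, which is the same set of choices DLT expectations expose.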

How Automatic Liquid Clustering Supports Databricks FinOps at Scale

Perficient has a FinOps mindset with Databricks, so the Automatic Liquid Clustering announcement grabbed my attention. I’ve mentioned Liquid Clustering before when discussing the advantages of Unity Catalog beyond governance use cases. Unity Catalog: come for the data governance, stay for the predictive optimization. I am usually a fan of being able to tune the dials […]

SAP and Databricks: Better Together

SAP Databricks is important because convenient access to governed data that supports business initiatives is important. Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint […]

Databricks on Azure versus AWS

As a Databricks Champion working for Perficient’s Data Solutions team, I spend most of my time installing and managing Databricks on Azure and AWS. The decision on which cloud provider to use is typically outside my scope since the organization has already made it. However, there are occasions when the client uses both hyperscalers or […]

Optimizing Costs and Performance in Databricks: A FinOps Approach

As organizations increasingly rely on Databricks for big data processing and analytics, managing costs and optimizing performance become crucial for maximizing ROI. A FinOps strategy tailored to Databricks can help teams strike the right balance between cost control and efficient resource utilization. Below, we outline key practices in cluster management, data management, query optimization, coding, […]

Integrate Salesforce and Databricks

90% of Fortune 500 companies use Salesforce as their Customer Relationship Management tool. I have ingested data from Salesforce into almost every database using almost every ETL tool. Every integration tool out there has a Salesforce connector; Salesforce even owns MuleSoft. The integration always worked, but it was rarely smooth. It's just something that you […]
