
David Callaghan, Solutions Architect

As a solutions architect with Perficient, I bring twenty years of development experience and am currently hands-on with Hadoop/Spark, blockchain, and cloud, coding in Java, Scala, and Go. I'm certified in and work extensively with Hadoop, Cassandra, Spark, AWS, MongoDB and Pentaho. Most recently, I've been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud, with an emphasis on integrating modern data products such as HBase, Cassandra and Neo4J as the off-blockchain repository.


Blogs from this Author


SAP and Databricks: Better Together

SAP Databricks matters because convenient access to governed data is essential to supporting business initiatives. Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint […]


Databricks on Azure versus AWS

As a Databricks Champion working for Perficient’s Data Solutions team, I spend most of my time installing and managing Databricks on Azure and AWS. The decision on which cloud provider to use is typically outside my scope since the organization has already made it. However, there are occasions when the client uses both hyperscalers or […]

Optimizing Costs and Performance in Databricks: A FinOps Approach

As organizations increasingly rely on Databricks for big data processing and analytics, managing costs and optimizing performance become crucial for maximizing ROI. A FinOps strategy tailored to Databricks can help teams strike the right balance between cost control and efficient resource utilization. Below, we outline key practices in cluster management, data management, query optimization, coding, […]



Integrate Salesforce and Databricks

90% of Fortune 500 companies use Salesforce as their Customer Relationship Management tool. I have ingested data from Salesforce into almost every database using almost every ETL tool. Every integration tool out there has a Salesforce connector; Salesforce even owns Mulesoft. The integration always worked, but it was rarely smooth. It's just something that you […]

Unity Catalog, the Well-Architected Lakehouse and Operational Excellence

I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]

Unity Catalog, the Well-Architected Lakehouse and Performance Efficiency

I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]


Unity Catalog, the Well-Architected Lakehouse and Cost Optimization

I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]

Unity Catalog and the Well-Architected Lakehouse in Databricks

I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. While Unity Catalog is a foundational component, it should be part of a broader strategic initiative to realign some of your current practices that may be less than optimal with newer, better practices. One comprehensive […]


Maximize Your Data Management with Unity Catalog

Databricks Unity Catalog is a unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. Unity Catalog offers a comprehensive solution for enhancing data governance, operational efficiency, and technological performance. By centralizing metadata management, access controls, and data lineage tracking, it simplifies compliance, reduces complexity, and improves query performance […]

The Technical Power of Unity Catalog – Beyond Governance

If you use Databricks, you probably know that Databricks Unity Catalog is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. With Unity Catalog, organizations can seamlessly govern both structured and unstructured data in any format, as well as machine learning models, notebooks, dashboards and […]


Reducing Technical Debt with Databricks System Tables

Databricks system tables are currently in Public Preview, which means they are accessible but some details may still change. This is how Databricks describes system tables: "System tables are a Databricks-hosted analytical store of your account's operational data found in the system catalog. System tables can be used for historical observability across your account." I'm going to […]
