Across industries like manufacturing, energy, life sciences, and retail, data drives decisions on durability, resilience, and sustainability. A significant share of this critical data resides in SAP systems, which is why so many businesses have invested in SAP Datasphere. SAP Datasphere is a comprehensive data service that enables seamless access to mission-critical business data across SAP […]
David Callaghan – Solutions Architect
As a solutions architect with Perficient, I bring twenty years of development experience and I'm currently hands-on with Hadoop/Spark, blockchain and cloud, coding in Java, Scala and Go. I'm certified in and work extensively with Hadoop, Cassandra, Spark, AWS, MongoDB and Pentaho. Most recently, I've been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud with an emphasis on integrating Modern Data products such as HBase, Cassandra and Neo4j as the off-blockchain repository.
Blogs from this Author
Integrate Salesforce and Databricks
90% of Fortune 500 companies use Salesforce as their Customer Relationship Management tool. I have ingested data from Salesforce into almost every database using almost every ETL tool. Every integration tool out there has a Salesforce connector; Salesforce even owns Mulesoft. The integration always worked, but it was rarely smooth. It's just something that you […]
Unity Catalog, the Well-Architected Lakehouse and Operational Excellence
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]
Unity Catalog, the Well-Architected Lakehouse and Performance Efficiency
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]
Unity Catalog, the Well-Architected Lakehouse and Cost Optimization
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]
Unity Catalog and the Well-Architected Lakehouse in Databricks
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. While Unity Catalog is a foundational component, it should be part of a broader strategic initiative to realign some of your current practices that may be less than optimal with newer, better practices. One comprehensive […]
Maximize Your Data Management with Unity Catalog
Databricks Unity Catalog is a unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. Unity Catalog offers a comprehensive solution for enhancing data governance, operational efficiency, and technological performance. By centralizing metadata management, access controls, and data lineage tracking, it simplifies compliance, reduces complexity, and improves query performance […]
The Technical Power of Unity Catalog – Beyond Governance
If you use Databricks, you probably know that Databricks Unity Catalog is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. With Unity Catalog, organizations can seamlessly govern both structured and unstructured data in any format, as well as machine learning models, notebooks, dashboards and […]
Reducing Technical Debt with Databricks System Tables
Databricks system tables are currently in Public Preview, which means they are accessible but some details may still change. This is how Databricks describes system tables: "System tables are a Databricks-hosted analytical store of your account’s operational data found in the system catalog. System tables can be used for historical observability across your account." I’m going to […]
Databricks strengthens MosaicAI with Lilac
Databricks has acquired LilacAI as it continues to strengthen its end-to-end data intelligence platform. The 2023 acquisition of MosaicML gave Databricks significant capabilities in the Generative AI space with the ability to train and deploy Large Language Models (LLMs) at scale. Next, Databricks purchased Arcion to provide native real-time data ingestion into their […]
GCP Container Registry to Artifact Registry Migration
I got an email from Google Cloud Platform today entitled "[Action Required] Upgrade to Artifact Registry before March 18, 2025". This is not the first time Google has discontinued a product I use. They gave a full year of lead time, but I knew I would forget about it before then. I decided to look into […]
Using Snowflake and Databricks Together
This is not another comparison between Databricks and Snowflake; they’re not hard to find. This is a practical guide about using Databricks and Snowflake together in your organization. Many companies have both products implemented. Sometimes the data stored in one does not match the data stored in the other, creating new data silos. The Databricks […]