Posts Tagged ‘Databricks’
Unity Catalog, the Well-Architected Lakehouse and Performance Efficiency
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]
Unity Catalog, the Well-Architected Lakehouse and Cost Optimization
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. Any migration exercise implies movement from a current to a future state. A migration from the Hive Metastore to Unity Catalog will require planning around workspaces, catalogs and user access. This is also an opportunity […]
Unity Catalog and the Well-Architected Lakehouse in Databricks
I have written about the importance of migrating to Unity Catalog as an essential component of your Data Management Platform. While Unity Catalog is a foundational component, it should be part of a broader strategic initiative to realign some of your current practices that may be less than optimal with newer, better practices. One comprehensive […]
Maximize Your Data Management with Unity Catalog
Databricks Unity Catalog is a unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. Unity Catalog offers a comprehensive solution for enhancing data governance, operational efficiency, and technological performance. By centralizing metadata management, access controls, and data lineage tracking, it simplifies compliance, reduces complexity, and improves query performance […]
The Technical Power of Unity Catalog – Beyond Governance
If you use Databricks, you probably know that Databricks Unity Catalog is the industry’s only unified and open governance solution for data and AI, built into the Databricks Data Intelligence Platform. With Unity Catalog, organizations can seamlessly govern both structured and unstructured data in any format, as well as machine learning models, notebooks, dashboards and […]
Reducing Technical Debt with Databricks System Tables
Databricks system tables are currently in Public Preview, which means they are accessible, but some details may still change. This is how Databricks describes system tables: System tables are a Databricks-hosted analytical store of your account’s operational data found in the system catalog. System tables can be used for historical observability across your account. I’m going to […]
Perficient Colleague Attains Champion Status
Databricks has recognized David Callaghan as a Partner Champion. As the first Perficient colleague to receive inclusion in the program, David is paving the way for others to get their footing with the partner. Program Overview: To be a Databricks Partner Champion, one must display thought leadership, harness technical expertise, become a community leader, demonstrate innovation […]
Data & Dragons: Perficient Attends Data + AI Summit
Dancing with Data It was but a fortnight into 2024 AC (After Conquest) when the great council gathered to decide who would succeed Perficient’s 2023 Data & AI Summit attendees. Many claims were heard, but only a few were considered. The council was assembled to prevent a war from being fought over the succession, for […]
Einstein Personalization and Salesforce Connections 2024: AI Integration at the Forefront
Attending Salesforce Connections 2024 at McCormick Place in Chicago was an energizing experience, highlighting the forefront of AI integration in business operations. The event gathered industry leaders to explore the latest advancements in artificial intelligence, data integration, and commerce, with a clear focus on how these technologies are reshaping the business landscape. One of the […]
Apache Spark: Merging Files using Databricks
In data engineering and analytics workflows, merging files is a common task when managing large datasets distributed across multiple files. Databricks, a powerful platform for processing big data, supports Scala. In this blog post, we’ll delve into how to merge files efficiently using Scala on Databricks. Introduction: Merging files entails combining the […]
Introduction to Star and Snowflake schema
In the world of data warehousing and business intelligence, two schema designs are fundamental: the star schema and the snowflake schema. These designs play a pivotal role in building effective data models for analyzing large volumes of data efficiently. Let’s delve into what the snowflake and star schemas are and how they are used in the realm of data […]