Databricks Articles

Perficient Colleague Attains Champion Status

Databricks has recognized David Callaghan as a Partner Champion. As the first Perficient colleague to be included in the program, David is paving the way for others to get their footing with the partner. Program Overview: To be a Databricks Partner Champion, one must display thought leadership, harness technical expertise, become a community leader, demonstrate innovation […]

Data & Dragons: Perficient Attends Data + AI Summit

Dancing with Data: It was but a fortnight into 2024 AC (After Conquest) when the great council gathered to decide who would succeed Perficient’s 2023 Data & AI Summit attendees. Many claims were heard, but only a few were considered. The council was assembled to prevent a war from being fought over the succession, for […]

Databricks

Einstein Personalization and Salesforce Connections 2024: AI Integration at the Forefront

Attending Salesforce Connections 2024 at McCormick Place in Chicago was an energizing experience, highlighting the forefront of AI integration in business operations. The event gathered industry leaders to explore the latest advancements in artificial intelligence, data integration, and commerce, with a clear focus on how these technologies are reshaping the business landscape. One of the […]

Customer Experience + Design Data + Intelligence Digital Marketing Platforms and Technology Salesforce Strategy and Transformation

Apache Spark: Merging Files using Databricks

In data engineering and analytics workflows, merging files is a common task when managing large datasets distributed across multiple files. Databricks, a powerful platform for processing big data, offers prominent support for Scala. In this blog post, we’ll delve into how to merge files efficiently using Scala on Databricks. Introduction: Merging files entails combining the […]

Cloud Databricks Platforms and Technology Technology Partners

Introduction to Star and Snowflake schema

In the world of data warehousing and business intelligence, two concepts are fundamental: the Snowflake schema and the Star schema. They play a pivotal role in designing data models that can analyze large volumes of data efficiently. Let’s delve into what the Snowflake and Star schemas are and how they are used in the realm of data […]

Databricks Platforms and Technology

Databricks strengthens MosaicAI with Lilac

Databricks has acquired LilacAI as it continues to strengthen its end-to-end data intelligence platform. The 2023 acquisition of MosaicML gave Databricks significant capabilities in the Generative AI space, with the ability to train and deploy Large Language Models (LLMs) at scale. Next, Databricks purchased Arcion to provide native real-time data ingestion into their […]

Data + Intelligence

Using Snowflake and Databricks Together

This is not another comparison between Databricks and Snowflake; those are not hard to find. This is a practical guide to using Databricks and Snowflake together in your organization. Many companies have both products implemented. Sometimes the data stored in the two diverges, creating new data silos. The Databricks […]

Data + Intelligence

Writing Testable Python Objects in Databricks

I’ve been writing about Test-Driven Development in Databricks and some of the interesting issues that you can run into with Python objects. It’s always been my opinion that code that is not testable is detestable. Admittedly, it’s been very difficult getting to where I wanted to be with Databricks and TDD. Unfortunately, it’s hard to […]

Data + Intelligence

Understanding the role of Py4J in Databricks

I mentioned that my attempt to implement TDD with Databricks was not totally successful. Setting up the local environment was not a problem, and getting a service ID for the CI/CD component was more of an administrative problem than a technical one. Using mocks to test Python objects that are serialized to Spark is the actual issue. […]

Data + Intelligence

Test Driven Development with Databricks

I don’t like testing Databricks notebooks and that’s a problem. I like Databricks. I like Test Driven Development. Not in an evangelical, 100%-code-coverage-or-fail kind of way. I just find that a reasonable amount of code coverage gives me a reasonable amount of confidence. Databricks has documentation for unit testing. I tried […]

Data + Intelligence

Spark DataFrame: Writing into Files

This blog post explores how to write a Spark DataFrame to various file formats, saving data to external storage for further analysis or sharing. Before diving in, have a look at my other blog posts on creating a DataFrame, manipulating a DataFrame, and writing a DataFrame into tables and views. […]

Cloud Databricks Platforms and Technology Technology Partners

Spark SQL Properties

The spark.sql.* properties are a set of configuration options specific to Spark SQL, the Apache Spark module for processing structured data using SQL queries, the DataFrame API, and Datasets. These properties allow users to customize various aspects of Spark SQL’s behavior, optimization strategies, and execution environment. Here’s a brief introduction to some common spark.sql.* […]

Databricks Platforms and Technology