Databricks
Databricks on Azure versus AWS
As a Databricks Champion working for Perficient’s Data Solutions team, I spend most of my time installing and managing Databricks on Azure and AWS. The decision on which cloud provider to use is typically outside my scope since the organization has already made it. However, there are occasions when the client uses both hyperscalers or […]
Is it really DeepSeek FTW?
So, DeepSeek just dropped their latest AI models, and while it’s exciting, there are some cautions to consider. Because of the US export controls around advanced hardware, DeepSeek has been operating under a set of unique constraints that have forced them to get creative in their approach. This creativity seems to have yielded real progress […]
SAP and Databricks: Better Together
SAP Databricks is important because convenient access to governed data to support business initiatives is important. Breaking down silos has been a drumbeat of data professionals since Hadoop, but this SAP <-> Databricks initiative may help to solve one of the more intractable data engineering problems out there. SAP has a large, critical data footprint […]
Omnichannel Analytics Simplified – Optimizely Acquires Netspring
Recently, the news broke that Optimizely acquired Netspring, a warehouse-native analytics platform. I’ll admit, I hadn’t heard of Netspring before, but after taking a closer look at their website and capabilities, it became clear why Optimizely made this strategic move. Simplifying Omnichannel Analytics for Real Digital Impact: Netspring is not just another analytics platform. It […]
Dreamforce 2024 Session Recap: Data Cloud + Databricks: As Good Together as PB&J
At Dreamforce 2024, Perficient explored the integration of Databricks and Salesforce Data Cloud, focusing on an insurance industry use case. This session showcased data processing, customer engagement, and AI-driven insights, offering real-world value to enterprises. Here’s a comprehensive recap of the session, highlighting the key takeaways and technical depth discussed. Speakers: Two of Perficient’s top […]
Perficient Colleague Attains Champion Status
Databricks has recognized David Callaghan as a Partner Champion. As the first Perficient colleague to receive inclusion in the program, David is paving the way for others to get their footing with the partner. Program Overview: To be a Databricks Partner Champion, one must display thought leadership, harness technical expertise, become a community leader, demonstrate innovation […]
The Quest for Spark Performance Optimization: A Data Engineer’s Journey
In the bustling city of Tech Ville, where data flows like rivers and companies thrive on insights, there lived a dedicated data engineer named Tara. With over five years of experience under her belt, Tara had navigated the vast ocean of data engineering, constantly learning and evolving with the ever-changing tides. One crisp morning, Tara was […]
Data & Dragons: Perficient Attends Data + AI Summit
Dancing with Data: It was but a fortnight into 2024 AC (After Conquest) when the great council gathered to decide who would succeed Perficient’s 2023 Data & AI Summit attendees. Many claims were heard, but only a few were considered. The council was assembled to prevent a war from being fought over the succession, for […]
Salesforce Data Cloud – What Does noETL / noELT Mean for Me?
In the realm of data management and analytics, the terms ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) have been commonplace for decades. They describe the processes involved in moving data from one system to another, transforming it as needed along the way. However, with the advent of technologies like Salesforce Data Cloud, a […]
ELT IS DEAD. LONG LIVE ZERO COPY.
Imagine a world where we can skip Extract and Load and just run our data Transformations by connecting directly to sources, no matter what data platform we use. Salesforce has taken significant steps over the last two years with Data Cloud to streamline how you get data in and out of their platform, and we’re excited to […]
Apache Spark: Merging Files using Databricks
In data engineering and analytics workflows, merging files is a common task when managing large datasets distributed across multiple files. Databricks, a powerful platform for processing big data, makes prominent use of Scala. In this blog post, we’ll delve into how to merge files efficiently using Scala on Databricks. Introduction: Merging files entails combining the […]