Across industries like manufacturing, energy, life sciences, and retail, data drives decisions on durability, resilience, and sustainability. A significant share of this critical data resides in SAP systems, which is why so many businesses have invested in SAP Datasphere. SAP Datasphere is a comprehensive data service that enables seamless access to mission-critical business data across SAP […]
Databricks
Omnichannel Analytics Simplified – Optimizely Acquires Netspring
Recently, the news broke that Optimizely acquired Netspring, a warehouse-native analytics platform. I’ll admit, I hadn’t heard of Netspring before, but after taking a closer look at their website and capabilities, it became clear why Optimizely made this strategic move. Simplifying Omnichannel Analytics for Real Digital Impact Netspring is not just another analytics platform. It […]
Dreamforce 2024 Session Recap: Data Cloud + Databricks: As Good Together as PB&J
At Dreamforce 2024, Perficient explored the integration of Databricks and Salesforce Data Cloud, focusing on an insurance industry use case. This session showcased data processing, customer engagement, and AI-driven insights, offering real-world value to enterprises. Here’s a comprehensive recap of the session, highlighting the key takeaways and technical depth discussed. Speakers Two of Perficient’s top […]
Perficient Colleague Attains Champion Status
Databricks has recognized David Callaghan as a Partner Champion. As the first Perficient colleague to receive inclusion in the program, David is paving the way for others to get their footing with the partner. Program Overview: To be a Databricks Partner Champion, one must: Display Thought Leadership, Harness Technical Expertise, Become Community Leader, Demonstrate Innovation […]
The Quest for Spark Performance Optimization: A Data Engineer’s Journey
In the bustling city of Tech Ville, where data flows like rivers and companies thrive on insights, there lived a dedicated data engineer named Tara. With over five years of experience under her belt, Tara had navigated the vast ocean of data engineering, constantly learning and evolving with the ever-changing tides. One crisp morning, Tara was […]
Data & Dragons: Perficient Attends Data + AI Summit
Dancing with Data It was but a fortnight into 2024 AC (After Conquest) when the great council gathered to decide who would succeed Perficient’s 2023 Data & AI Summit attendees. Many claims were heard, but only a few were considered. The council was assembled to prevent a war from being fought over the succession, for […]
Salesforce Data Cloud – What Does noETL / noELT Mean for Me?
In the realm of data management and analytics, the terms ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) have been commonplace for decades. They describe the processes involved in moving data from one system to another, transforming it as needed along the way. However, with the advent of technologies like Salesforce Data Cloud, a […]
ELT IS DEAD. LONG LIVE ZERO COPY.
Imagine a world where we can skip Extract and Load and just do our data Transformations by connecting directly to sources, no matter what data platform we use. Salesforce has taken significant steps over the last two years with Data Cloud to streamline how you get data in and out of its platform, and we’re excited to […]
Apache Spark: Merging Files using Databricks
In data engineering and analytics workflows, merging files is a common task when managing large datasets distributed across multiple files. Databricks, a powerful platform for processing big data, supports Scala among its languages. In this blog post, we’ll delve into how to merge files efficiently using Scala on Databricks. Introduction: Merging files entails combining the […]
Introduction to Star and Snowflake schema
In the world of data warehousing and business intelligence, two key concepts are fundamental: Snowflake and Star Schema. These concepts play a pivotal role in designing effective data models for analyzing large volumes of data efficiently. Let’s delve into what Snowflake and Star Schema are and how they are used in the realm of data […]
Spark DataFrame: Writing into Files
This blog post explores how to write a Spark DataFrame into various file formats to save data to external storage for further analysis or sharing. Before diving in, have a look at my other blog posts on creating and manipulating DataFrames and on writing a DataFrame into tables and views. […]
Spark SQL Properties
The spark.sql.* properties are a set of configuration options specific to Spark SQL, a module within Apache Spark designed for processing structured data using SQL queries, DataFrame API, and Datasets. These properties allow users to customize various aspects of Spark SQL’s behavior, optimization strategies, and execution environment. Here’s a brief introduction to some common spark.sql.* […]