When working with datasets in Excel, you might encounter situations where multiple values are stored in a single cell, separated by a newline character (added using Alt + Enter). This can make data analysis challenging. In this blog, we’ll walk you through how to split such data into separate rows using Power Query, a powerful […]
Data + Intelligence
Perficient Included in IDC Market Glance: Payer, 1Q25
Health insurers today are navigating intense technological and regulatory requirements, along with rising consumer demand for seamless digital experiences. Leading organizations are investing in advanced technologies and automations to modernize operations, streamline experiences, and unlock reliable insights. By leveraging scalable infrastructures, you can turn data into a powerful tool that accelerates business success. IDC Market […]
End-to-End Lineage and External Raw Data Access in Databricks
Achieving end-to-end lineage in Databricks while allowing external users to access raw data can be a challenging task. In Databricks, leveraging Unity Catalog for end-to-end lineage is a best practice. However, enabling external users to access raw data while maintaining security and lineage integrity requires a well-thought-out architecture. This blog outlines a reference architecture to […]
Fine-Tuning LLaMA 70B Using Hugging Face Accelerate & DeepSpeed on Multiple Nodes
By Luis Pacheco, Uday Yallapragada, and Cristian Muñoz. Large language models (LLMs) like Meta’s LLaMA 70B are revolutionizing natural language processing tasks, but training or fine-tuning them requires massive computational and memory resources. To address these challenges, we employ distributed training across multiple GPU nodes using DeepSpeed and Hugging Face Accelerate. This blog walks you […]
Meet Perficient at the Optimized AI Conference
Are you ready to transform how you work? Perficient is excited to announce our participation at this year’s Optimized AI Conference in Atlanta, where I’ll be presenting on how to double your productivity at work using cutting-edge AI strategies. Why the Optimized AI Conference Matters: The Optimized AI Conference creates a unique intersection where AI […]
Top 5 Mistakes That Make Your Databricks Queries Slow (and How to Fix Them)
I wanted to discuss the top 5 mistakes that make your Databricks queries slow as a prequel to some of my FinOps blogs. Premature optimization may or may not be the root of all evil, but we can all agree optimization without a solid foundation is not an effective use of time and resources. Predictive optimization […]
Deletion Vectors in Delta Live Tables: Identifying and Remediating Compliance Risks
Deletion Vectors will be enabled by default in Delta Live Tables (DLTs) for materialized views and streaming tables starting April 28, 2025. Predictive Optimization for DLT maintenance will also be enabled by default. This could provide both cost savings and performance improvements. Our Databricks Practice holds FinOps as a core architectural tenet, but sometimes compliance […]
Unlocking the Future of Enterprise AI: Databricks announces Anthropic Partnership
The recent strategic partnership between Databricks and Anthropic is a big step forward for enabling enterprises to build, deploy, and govern AI agents that reason over proprietary data with accuracy, security, and governance. The landscape of enterprise AI is evolving rapidly, and we’re excited to share how our practice is positioned to help businesses maximize […]
7 Steps to Define a Data Governance Structure for a Mid-Sized Bank (Without Losing Your Mind)
A mid-sized bank I was consulting with for their data warehouse modernization project finally realized that data isn’t just some necessary but boring stuff the IT department hoards in their digital cave. It’s the new gold, the ticking time bomb of risk, and the bane of every regulatory report that’s ever come back with more […]
Delta Live Tables and Great Expectations: Better Together
Modern data platforms like Databricks enable organizations to process massive volumes of batch and streaming data—but scaling reliably requires more than just compute power. It demands data observability: the ability to monitor, validate, and trace data through its lifecycle. This blog compares two powerful tools—Delta Live Tables and Great Expectations—that bring observability to life in […]
Perficient Achieves AWS Glue Service Delivery Designation
Perficient has earned the AWS Glue Service Delivery Designation, demonstrating our deep technical expertise and proven success in delivering scalable, cost-effective, and high-performance data integration, data pipeline orchestration, and data catalog solutions. What is the AWS Service Delivery Program? The AWS Service Delivery Program is an AWS Specialization Program designed to validate AWS Partners with […]
How Automatic Liquid Clustering Supports Databricks FinOps at Scale
Perficient has a FinOps mindset with Databricks, so the Automatic Liquid Clustering announcement grabbed my attention. I’ve mentioned Liquid Clustering before when discussing the advantages of Unity Catalog beyond governance use cases. Unity Catalog: come for the data governance, stay for the predictive optimization. I am usually a fan of being able to tune the dials […]