Last week, I attended TrailblazerDX in San Francisco, where the content was all about Salesforce Data Cloud and AI! There were over 300 sessions to attend, from technical talks to hands-on workshops where attendees could learn how to build copilots and use the latest Salesforce platform features directly from product managers, architects, and […]
Writing Testable Python Objects in Databricks
I’ve been writing about Test-Driven Development in Databricks and some of the interesting issues that you can run into with Python objects. It’s always been my opinion that code that is not testable is detestable. Admittedly, it’s been very difficult getting to where I wanted to be with Databricks and TDD. Unfortunately, it’s hard to […]
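A minimal sketch of the idea the excerpt argues for, with hypothetical names: keep transformation logic in a plain Python class so it can be unit tested without a notebook or a cluster.

```python
# A plain Python class, independent of any notebook or SparkSession,
# so the logic can be tested in isolation. Names are illustrative.
class PriceNormalizer:
    def __init__(self, rate: float):
        self.rate = rate

    def normalize(self, amount: float) -> float:
        return round(amount * self.rate, 2)


# A pytest-style test that never touches Databricks.
def test_normalize():
    assert PriceNormalizer(rate=0.5).normalize(10.0) == 5.0
```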
The Event-Driven Data Layer: Unifying Analytics and Development Teams
One common way of implementing tags through Adobe Launch is using a data layer: a JSON object of key/value pairs that is loaded onto the page and from which attributes are passed as the user navigates the website or completes certain objectives. This can be extended further into using an Event-Driven Data Layer […]
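As a rough illustration of the structure described above, shown here in Python rather than the JavaScript/JSON a real page would use; the event and attribute names are invented.

```python
# Illustrative only: the key/value shape of an event-driven data layer,
# where each user action pushes an event object a tag manager can consume.
data_layer = []

def push_event(event_name: str, attributes: dict) -> None:
    """Append an event with its key/value attributes to the data layer."""
    data_layer.append({"event": event_name, **attributes})

push_event("pageView", {"pageName": "home", "language": "en"})
push_event("purchase", {"orderId": "12345", "revenue": 49.99})
```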
AI (Artificial Intelligence) Powered Product People
Have you ever wondered how artificial intelligence can transform not only our lives but also our professions? As someone passionate about technology and digital product development, I have always been intrigued by new tools and how they can improve our lives and careers. In recent times, artificial intelligence (AI) […]
Understanding the role of Py4J in Databricks
I mentioned that my attempt to implement TDD with Databricks was not totally successful. Setting up the local environment was not a problem, and getting a service ID for the CI/CD component was more of an administrative problem than a technical one. Using mocks to test Python objects that are serialized to Spark is actually the issue. […]
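A minimal sketch of the failure mode hinted at here, under the assumption that Spark pickles whatever a UDF closes over before shipping it to executors: objects from unittest.mock generally cannot be pickled, so a mocked dependency breaks at serialization time.

```python
import pickle
from unittest.mock import MagicMock

# A stand-in dependency, mocked the way a unit test would mock it.
mock_service = MagicMock()
mock_service.lookup.return_value = "mocked"

# Spark serializes UDF closures with pickle; doing that step by hand
# shows why the mock never makes it to the executors.
try:
    pickle.dumps(mock_service)
except Exception as exc:
    print(f"serialization failed: {exc}")
```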
Test Driven Development with Databricks
I don’t like testing Databricks notebooks, and that’s a problem. I like Databricks. I like Test-Driven Development. Not in an evangelical, 100%-code-coverage-or-fail kind of way. I just find that a reasonable amount of code coverage gives me a reasonable amount of confidence. Databricks has documentation for unit testing. I tried […]
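One common way to make notebook code testable, sketched with hypothetical names: pull the logic out into a function of DataFrames and run it against a local SparkSession under pytest.

```python
# A minimal sketch: the function takes and returns DataFrames, so it can be
# imported into both a notebook and a test module. Requires pyspark locally.
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

def add_total(df: DataFrame) -> DataFrame:
    return df.withColumn("total", F.col("price") * F.col("quantity"))

def test_add_total():
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(2.0, 3)], ["price", "quantity"])
    assert add_total(df).first()["total"] == 6.0
```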
LinkedIn open sources a control plane for lake houses
LinkedIn open sources a lot of code. Kafka, of course, but also Samza and Voldemort and a bunch of Hadoop tools like DataFu and Gobblin. Open-source projects tend to be created by developers to solve engineering problems while commercial products … Anyway, LinkedIn has a new open-source data offering called OpenHouse, which is billed as […]
Talend ESB – tRestRequest and tRestResponse
This article covers the configuration of tRestRequest and tRestResponse, and how to create HTTP listeners that can be tested with Postman. Create a job in the Talend tool (either Talend ESB or Talend Data Fabric). Once you have created the job, place the tRestRequest, tJavaRow, and tRestResponse components in the designer. Once you have placed all the […]
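As an alternative to Postman, the HTTP listener exposed by a tRestRequest component can be exercised with any HTTP client; a sketch with an assumed endpoint URL and payload:

```python
# The URL and JSON body below are hypothetical; substitute the endpoint
# configured in your tRestRequest component.
import requests

resp = requests.post(
    "http://localhost:8088/services/orders",  # assumed tRestRequest endpoint
    json={"orderId": "12345"},
    timeout=10,
)
print(resp.status_code, resp.text)  # body returned by tRestResponse
```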
Maintaining Your Adobe Launch Implementation
Adobe Launch is a valuable tool to help you manage the tags placed across your website, including Facebook, Pinterest, and Bing pixels, as well as Adobe Analytics and Target, depending on your property. Many of these tags are deployed either via custom code or via one of the many extensions in Launch’s extension catalog to help […]
Ready for Microsoft Copilot for Microsoft 365?
Organizations want to leverage the productivity enhancements Microsoft Copilot for Microsoft 365 may enable, while avoiding unintentional over-exposure of organizational information as users access these Copilot experiences. Our Microsoft team is fielding many questions from customers about how to secure and govern Microsoft Copilot for Microsoft 365. These organizations want to ensure […]
Databricks Lakehouse Federation Public Preview
Sometimes, it’s nice to be able to skip a step. Most data projects involve data movement before data access. Usually this is not an issue; everyone agrees that the data must be made available before it can be available. There are use cases where the data movement part is a blocker because of time, cost, […]
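A sketch of what skipping the movement step can look like with Lakehouse Federation, assuming a Databricks notebook where spark is predefined; the connection name, host, credentials, and catalog names are placeholders.

```python
# Register an external Postgres database and query it in place,
# without copying the data into the lakehouse first.
spark.sql("""
    CREATE CONNECTION IF NOT EXISTS pg_conn TYPE postgresql
    OPTIONS (host 'db.example.com', port '5432',
             user 'reader', password 'REDACTED')
""")
spark.sql("""
    CREATE FOREIGN CATALOG IF NOT EXISTS pg_sales
    USING CONNECTION pg_conn OPTIONS (database 'sales')
""")
spark.sql("SELECT * FROM pg_sales.public.orders LIMIT 10").show()
```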
Data Lake Governance with Tagging in Databricks Unity Catalog
The goal of Databricks Unity Catalog is to provide centralized security and management for data and AI assets across the data lakehouse. Unity Catalog provides fine-grained access control for all the securable objects in the lakehouse: databases, tables, files, and even models. Gone are the limitations of the Hive metastore. The Unity Catalog metastore […]
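A short sketch of tagging in practice, assuming a Databricks notebook with a predefined spark session; the catalog, schema, table, and tag names are hypothetical.

```python
# Attach governance tags to a table; tags can drive discovery and policy.
spark.sql("""
    ALTER TABLE main.sales.customers
    SET TAGS ('pii' = 'true', 'domain' = 'sales')
""")

# Tags are queryable through information_schema for auditing.
spark.sql("""
    SELECT catalog_name, schema_name, table_name, tag_name, tag_value
    FROM main.information_schema.table_tags
""").show()
```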