Databricks Databricks provides a powerful, spark-centric, cloud-based analytics platform that enables users to rapidly process, transform and explore data. However, its preconfigured security can be insufficient in regulating or monitoring confidential information due to the flexibility it offers. This can be of particular concern to highly regulated enterprise, such a financial and health-care companies. Policy-as-code […]
David Callaghan – Solutions Architect
As a solutions architect with Perficient, I bring twenty years of development experience and I'm currently hands-on with Hadoop/Spark, blockchain and cloud, coding in Java, Scala and Go. I'm certified in and work extensively with Hadoop, Cassandra, Spark, AWS, MongoDB and Pentaho. Most recently, I've been bringing integrated blockchain (particularly Hyperledger and Ethereum) and big data solutions to the cloud with an emphasis on integrating Modern Data produces such as HBase, Cassandra and Neo4J as the off-blockchain repository.
Connect with David
Blogs from this Author
Next-Generation Data Cleanrooms with Delta Sharing
Data-driven companies are finding more and more use cases where their internal data could be supplemented with external datasets to deliver more business value. At the same time, there are legitimate data privacy concerns that need to be addressed, particularly among regulated enterprises in the financial and healthcare sector. There are opportunities here for a […]
Delta Sharing for Modern Secure Data Sharing
Delta Lake is an open-source framework under the Linux Foundation used to build Lakehouse architectures. A new project is Delta Sharing, which is an open protocol for secure real-time exchange of large datasets. Databricks provides production-grade implementations of the projects under delta-io, including Databricks Delta Sharing. Understanding the open-source foundation of the enterprise offering can […]
Beyond Encryption: Implementing anonymity algorithms
Businesses and organizations now hold more personal information than ever before. Storing a lot of data may be useful in a variety of ways, such as reporting and analytics, which might expose PII that is linked to the data being analyzed. When data is being transmitted or stored, encryption is useful for protecting it, whereas […]
Beyond Encryption: Protect sensitive data using t-closeness
Businesses and organizations now hold more personal information than ever before. Storing a lot of data may be useful in a variety of ways, such as reporting and analytics, which might expose PII that is linked to the data being analyzed. When data is being transmitted or stored, encryption is useful for protecting it, whereas […]
Beyond Encryption: Protect sensitive data using l-diversity
Businesses and organizations now hold more personal information than ever before. Storing a lot of data may be useful in a variety of ways, such as reporting and analytics, which might expose PII that is linked to the data being analyzed. When data is being transmitted or stored, encryption is useful for protecting it, whereas […]
How will the sale of Watson affect healthcare?
IBM is selling their AI for healthcare, known as Watson Health, to Francisco Partners for an estimated $1 billion. Watson Health is an artificial intelligence system created by IBM. The system is designed to work with healthcare providers and pharmaceutical companies to help improve the efficacy of treatments and reduce healthcare costs. Francisco Partners has […]
cURL to get a JSON makeover
cURL is frequently used by developers working with REST API’s to send and receive data using JSON notation. This has been a common pattern for years, but it has never been seamless. There have been a number of times when I’ve been trying to get a JSON payload to work against an endpoint for a […]
Understanding Searchable Encryption in the Cloud
Many companies struggle with storing PII in the cloud. When you store sensitive data in the cloud, it’s critical to guarantee that it remains private. Encrypting the data before sending it to the cloud storage server is one approach to do this. This will protect your information and ensure that no one can access it. […]
Protect PII with anonymized datasets for Data Scientists with differential privacy
Businesses and organizations now hold more personal information than ever before. Storing large amounts of structured and unstructured data may be useful in a variety of ways, such as reporting and analytics, but it might expose PII that is linked to the data being analyzed.As organizations are increasingly under pressure to comply with data privacy […]
Beyond Encryption: Protect sensitive data using k-anonymity
Businesses and organizations now hold more personal information than ever before. Storing a lot of data may be useful in a variety of ways, such as reporting and analytics, which might expose PII that is linked to the data being analyzed. When data is being transmitted or stored, encryption is useful for protecting it, whereas […]
Real-time Retail with Databrick’s Lakehouse Accelerators
Databricks has announced Lakehouse for Retail, a collection of more than twenty free, open-source Retail Solution Accelerators. Solution accelerators are tools that help companies in constructing a solution for their data and AI problem. They can be used to show the feasibility of a prototype and then the business can use that as support for […]