The Best Way to Limit the Value of Big Data

A few years back I worked for a client that was implementing cell-level security on every data structure within their data warehouse. They had nearly 1,000 tables and 200,000 columns. Yikes! Talk about administrative overhead. The logic was that data access should only be given on a need-to-know basis: users would have to request access to specific tables and columns.

Need-to-know is a term frequently used in military and government institutions that refers to granting access to sensitive information to cleared individuals. This is a good concept, but the key here is the part about “granting access to SENSITIVE information.” The information has to be classified first; only then is need-to-know (for cleared individuals) applied.

Most government documents are not sensitive. This allows administrative resources to focus on the sensitive, classified information. The system for classifying information as Top Secret, Secret, or Confidential has relatively stringent rules, but it also discourages the overclassification of information. This is because when a document is classified, its use becomes limited.

The same phenomenon holds in the corporate world. The more a set of data is locked down, the less it will be used. Unnecessarily limiting information workers’ access to data obviously does not help the overall objectives of the organization. Big Data just magnifies this dynamic, and unnecessarily restricting access to Big Data is the best way to limit its value. Unreasonably lock down Big Data, and its value will be severely limited.

Now, this is not to say that certain data should not be restricted. Social Security Numbers (SSNs), HIPAA-governed data elements, and account numbers are a few examples. We do need solutions to restrict access to this critical information, but those solutions should avoid escalating the controls to information that does not need to be as tightly controlled.

A Classify, Separate, and Secure strategy is quite effective for securing only critical data elements. Classify information, if possible at the field/column level, using specific, consistent guidelines that do not unnecessarily restrict information. When we load information into a data reservoir (or data lake), we Separate sensitive information from unrestricted information. This should be executed at the column level in tables. For example, if a table has a field containing SSNs, physically separate that field into another table. Masking may also be appropriate, and depending on the other data elements, we may not want to load the sensitive data columns into our cluster at all. This prevents the security escalation effect that happens when we classify a whole table as sensitive because of just one column of sensitive data. Lastly, we Secure the sensitive information. This may be in another directory or system (like Apache Accumulo). The objective is to focus our efforts on locking down the sensitive information and to minimize the administrative overhead.
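As a rough illustration of the Classify and Separate steps, here is a minimal Python sketch. The column names, the two classification labels, and the helper functions `separate` and `mask_ssn` are all hypothetical, made up for this example. The Secure step (where the sensitive table actually lives and who can read it) is left out, since that depends on the platform.

```python
# Hypothetical column-level classification. The column names and the
# "unrestricted"/"sensitive" labels are assumptions made for this sketch.
CLASSIFICATION = {
    "customer_id": "unrestricted",
    "state": "unrestricted",
    "ssn": "sensitive",
}


def separate(rows, classification, key):
    """Split each row into an unrestricted part and a sensitive part.

    The key column is copied into both outputs so that users cleared for
    the sensitive table can join the two back together.
    """
    open_rows, sensitive_rows = [], []
    for row in rows:
        open_part = {key: row[key]}
        sensitive_part = {key: row[key]}
        for column, value in row.items():
            if column == key:
                continue
            if column not in classification:
                # Classify first: refuse to load a column nobody has reviewed.
                raise ValueError(f"column {column!r} has not been classified")
            if classification[column] == "sensitive":
                sensitive_part[column] = value
            else:
                open_part[column] = value
        open_rows.append(open_part)
        # Only emit a sensitive row when it holds more than the key.
        if len(sensitive_part) > 1:
            sensitive_rows.append(sensitive_part)
    return open_rows, sensitive_rows


def mask_ssn(ssn):
    """Mask an SSN so that only its last four digits remain visible."""
    digits = [c for c in ssn if c.isdigit()]
    return "***-**-" + "".join(digits[-4:])


rows = [{"customer_id": 1, "state": "CO", "ssn": "123-45-6789"}]
open_rows, sensitive_rows = separate(rows, CLASSIFICATION, key="customer_id")
```

With this split, only the small sensitive table needs tight controls, while the unrestricted table can be opened up broadly. If some users need a hint of the value, a masked copy (via something like `mask_ssn`) could sit in the unrestricted table instead of the raw column.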

Thoughts on “The Best Way to Limit the Value of Big Data”

  1. “Unnecessarily limiting information workers’ access to data obviously does not help the overall objectives of the organization.”

    Maybe everyone doesn’t need access to every single piece of data, but if you keep all the data locked in an ivory tower, then how are people supposed to use it to make better decisions?

Bill Busch

Bill is a Director and Senior Data Strategist leading Perficient's Big Data Team. Over his 27 years of professional experience he has helped organizations transform their data management, analytics, and governance tools and practices. As a veteran in analytics, Big Data, data architecture and information governance, he advises executives and enterprise architects on the latest pragmatic information management strategies. He is keenly aware of how to advise and lead companies through developing data strategies, formulating actionable roadmaps, and delivering high-impact solutions. As one of Perficient’s prime thought leaders for Big Data, he provides the visionary direction for Perficient’s Big Data capability development and has led many of our clients’ largest Data and Cloud transformation programs. Bill is an active blogger and can be followed on Twitter @bigdata73.
