5 Ways to Build a Data Lake and not a Data Swamp - Perficient Blogs

In the last 6 months, my customers and I have been on a journey with some of the largest cloud data lake vendors, open source Big Data vendors, and a team of the smartest Big Data architects I’ve worked with in my career. Along the way, many of our clients have set out to build a data lake. Here are the most common use cases:

  • Capture healthcare EMR/EHR datasets
  • Capture restaurant point-of-sale, menu, and feedback data
  • Capture claims and patient information for provider data
  • Integrate benchmarks and social media (Twitter) data into a data warehouse

When clients treat data lakes and the surrounding technologies as a “cool new toy” to challenge their status quo, the data lake quickly becomes a data swamp, full of “dark data” that must be cleaned up before anyone can consume it. Sure, storage is cheap, and you can usually afford to store that data swamp. But if you truly see data as an asset and a source of competitive advantage, it’s in your company’s best interest to put some structure and planning around your data lake.

Here are 5 ways to think about a Big Data project, whether you are a beginner ready to invest in Big Data or already halfway through an implementation:

  1. Start with the Business Value: Never think of a Big Data project as a technology project. There are many technologies to choose from, and Perficient can help you choose among them. Without a business value proposition, however, this is just another IT project.
  2. Think Long Term: Big Data projects are never a one-time investment. With large volumes of data in hand, consider investing in data scientists and data stewards who can test your data and find new ways to change your business models (such as sales, marketing, service, and manufacturing). As Bernard Marr put it, “a good data scientist is a good journalist.”
  3. It’s All About Analytics: Big Data projects should ALWAYS end in analytics. A picture is worth a thousand words.
  4. Learn Storytelling: Visualization is a fantastic way of representing data. However, the story needs to be coherent. If the charts jump from sales per territory to customer service for a single customer to manufacturing and logistics (all in one chart), there is no common thread, and little value to management.
  5. Wrap All of the Above with Governance: What is data without a lineage of where it is coming from, without good quality, and without a clear glossary of definitions?
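
To make the governance point a little more concrete, here is a minimal Python sketch of a catalog entry that records the three things named above: lineage, quality, and a glossary. Everything in it is hypothetical and for illustration only (the DatasetEntry class, its field names, and the find_dark_data helper are my own assumptions, not part of any particular data lake product). The idea is simply that a dataset missing this metadata is a candidate for the swamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """One dataset registered in a hypothetical data lake catalog."""
    name: str
    # Lineage: the systems this dataset was loaded from.
    source_systems: List[str] = field(default_factory=list)
    # Quality: human-readable descriptions of the checks applied.
    quality_checks: List[str] = field(default_factory=list)
    # Glossary: business term -> agreed definition.
    glossary: Dict[str, str] = field(default_factory=dict)

    def governance_gaps(self) -> List[str]:
        """Return the governance elements this dataset is missing."""
        gaps = []
        if not self.source_systems:
            gaps.append("lineage")
        if not self.quality_checks:
            gaps.append("quality checks")
        if not self.glossary:
            gaps.append("glossary")
        return gaps


def find_dark_data(catalog: List[DatasetEntry]) -> Dict[str, List[str]]:
    """Map each under-governed dataset name to its missing elements."""
    report = {}
    for entry in catalog:
        gaps = entry.governance_gaps()
        if gaps:
            report[entry.name] = gaps
    return report
```

In this sketch, a point-of-sale dataset registered with its source system, a quality check, and a glossary term would pass cleanly, while a social media feed dumped into the lake with nothing but a name would be reported as missing all three. A real governance program involves far more than this, but even a simple check like this one keeps undocumented data visible.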

Perficient has extensive experience with industry-leading Big Data technologies. In addition, Perficient can help you govern your data with an emphasis on business value, using our proven Enable Methodologies. Reach out to us for more information.
