This is the second blog in a series that dives into how organizations become data-driven, with insights and strategy from Perficient’s Senior Data Strategist and Solutions Architect, Dr. Chuck Brooks. To read the first blog, click here.
Companies with outdated technology such as on-premises data warehouses, Excel spreadsheets as a data source, and email as the primary means of communication often breed cultures in which data is protected and hoarded. While having the right technology in place is not necessarily a shortcut to creating a culture of data, technology and culture are two sides of the same coin, as both depend on the company being open to innovation.
In the first blog, Give Employees and Customers Access to the Data, we discussed data silos.
In many organizations, data is siloed in repositories that are controlled by one department or business unit and isolated from the rest of the organization. Silos make it hard for knowledge workers in other parts of the organization to access and use the data, hindering business operations and the data analytics initiatives that support them. They limit the ability of knowledge workers to use data to manage business processes and make informed business decisions. So, if silos are so bad, why do so many organizations have them?
Old Technology and Old Paradigms
There are several reasons that silos develop, but two of the biggest are the continued use of old technologies and outdated paradigms.
In the early days of data management, storage was extremely expensive. Cost prevented organizations from collecting and storing the volume and variety of data that was generated across the company. It was not cost-effective to collect data and store it without clearly understanding how it would be used. Different departments across the organization had different ideas about what data should be collected and could be effectively used. This led to different departments and organizational units funding and collecting the data that they felt was useful and storing it in their own isolated data repositories.
Early relational databases were not capable of handling large amounts of data for analysis while maintaining acceptable performance levels. Keeping performance acceptable meant limiting data volume. To reduce the amount of data that the database management system had to manage, the data mart was introduced. A data mart is a subject-oriented database that is often a segment (silo) of an organization’s data. The subset of data held in a data mart typically aligns with a particular business unit like sales, finance, or marketing. Because a data mart only contains the data applicable to a certain business area, it was commonly believed that the data mart was a cost-effective way to gain actionable insights quickly.
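To make the silo problem concrete, here is a minimal sketch in Python. The department names, fields, and values are invented for illustration: a sales data mart holds only the sales slice of the enterprise data, so a question that spans sales and finance cannot be answered from either mart alone.

```python
# Hypothetical enterprise records as they might exist across the company.
enterprise_data = [
    {"dept": "sales",   "customer": "Acme", "order_total": 1200},
    {"dept": "sales",   "customer": "Beta", "order_total": 800},
    {"dept": "finance", "customer": "Acme", "outstanding_invoices": 2},
    {"dept": "finance", "customer": "Beta", "outstanding_invoices": 0},
]

def build_data_mart(records, dept):
    """A data mart is a subject-oriented subset: one department's slice."""
    return [r for r in records if r["dept"] == dept]

sales_mart = build_data_mart(enterprise_data, "sales")

# The sales mart answers sales questions...
total_revenue = sum(r["order_total"] for r in sales_mart)

# ...but a cross-unit question ("which high-revenue customers have overdue
# invoices?") cannot be answered from the sales mart alone, because the
# finance records were never loaded into it.
finance_fields = [r for r in sales_mart if "outstanding_invoices" in r]
print(total_revenue)   # 2000
print(finance_fields)  # [] -- the silo contains no finance data
```

The silo is not a bug in the mart; it is the mart's design. Any question outside the mart's subject area requires stitching together data from other silos, if that is possible at all.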
New Technology and New Paradigms
Fast forward to 2022, when data storage technologies are fast and inexpensive, so much so that I now believe it is important to collect and store most or all of the data available and then determine how to use it later. Furthermore, we have learned that the data needed to solve business problems often is not contained within just one business unit, or even within one organization.
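This "store now, model later" paradigm is often called schema-on-read: land the raw data cheaply first, and apply structure at query time. The following Python sketch uses invented event records to illustrate the idea; the field names are hypothetical.

```python
import json

# Land everything as raw records first (storage is cheap), without deciding
# upfront which fields matter.
raw_landing_zone = [
    json.dumps({"event": "page_view", "user": "u1", "ms_on_page": 5300}),
    json.dumps({"event": "purchase",  "user": "u1", "amount": 49.99}),
    json.dumps({"event": "page_view", "user": "u2"}),  # fields may vary
]

def read_with_schema(raw_records, fields):
    """Apply structure at read time (schema-on-read): each consumer
    projects only the fields its question needs."""
    rows = []
    for raw in raw_records:
        rec = json.loads(raw)
        rows.append({f: rec.get(f) for f in fields})
    return rows

# A question invented months after collection can still be answered,
# because nothing was filtered out when the data was stored.
purchases = [r for r in read_with_schema(raw_landing_zone, ["event", "amount"])
             if r["event"] == "purchase"]
print(purchases)  # [{'event': 'purchase', 'amount': 49.99}]
```

Contrast this with the old paradigm, where any field not modeled at collection time was simply lost.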
This fact means the data mart is dead. Organizations still using data marts have not embraced new data management technologies and/or have not recognized that many business problems and business decisions require data from across the organization and from external processes and external organizations. In 2022, data management systems exist that can collect and integrate the large volume and variety of data needed to make data-driven business decisions, and they do so at impressive performance levels.
Embrace New Technology
I believe the Google Cloud Platform (GCP) is “the best data management platform on the planet.” The new technologies that comprise the GCP data platform provide several data management systems. Cloud SQL and Cloud Spanner provide online transaction processing (OLTP) capability, Bigtable and Cloud Firestore provide NoSQL capability, and BigQuery provides OLAP capabilities. All of the GCP data management systems provide for the separation of storage and compute. By decoupling these components, the GCP data management platform provides:
- Inexpensive, virtually unlimited, and seamlessly scalable storage
- Stateless, resilient compute
- Data sharing
- ACID-compliant storage operations
- A logical storage model, rather than physical
- Serverless and scalable performance
The GCP platform provides best-of-breed performance and inexpensive storage costs. This platform gives the data team the new technologies and tools that it needs to remove silos and collect, integrate, and deliver data to enterprise knowledge workers.
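The decoupling described above can be sketched in miniature. This is a conceptual toy model, not the GCP API: "storage" is one shared, durable table, and each "compute" worker is stateless, keeping nothing between queries, which is why compute can be added, removed, or restarted independently of the data.

```python
# Toy model of separated storage and compute (sample rows are invented).
STORAGE = [  # the durable, shared storage layer
    {"region": "east", "sales": 100},
    {"region": "west", "sales": 250},
    {"region": "east", "sales": 75},
]

def stateless_worker(storage, region):
    """Reads from shared storage, computes, and returns -- holds no state,
    so losing or replacing this worker loses no data."""
    return sum(row["sales"] for row in storage if row["region"] == region)

# Any number of workers can run queries against the same storage, and
# storage can grow without resizing compute (or vice versa).
results = [stateless_worker(STORAGE, r) for r in ("east", "west")]
print(results)  # [175, 250]
```

In a real serverless system like BigQuery the same principle applies at scale: queries are dispatched to ephemeral compute that reads from shared, independently scaled storage.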
Read the next blog in the series, here.
Perficient’s Cloud Data Expertise
The world’s leading brands choose to partner with us because we are large enough to scale major cloud projects, yet nimble enough to provide focused expertise in specific areas of your business.
Our cloud, data, and analytics team can assist with your entire data and analytics lifecycle, from data strategy to implementation. We will help you make sense of your data and show you how to use it to solve complex business problems. We’ll assess your current data and analytics issues, and develop a strategy to guide you to your long-term goals.
Download the guide, Becoming a Data-Driven Organization With Google Cloud Platform, to learn more about Dr. Chuck’s GCP data strategy.
Learn more about our Google Data capabilities, here.