As the volume of information that can be scrutinized and dissected grows and the number of potential correlations multiplies, the potential for false information and false correlations increases as well. Greater quantities and varieties of data do not in themselves ensure that our analysis of that data will produce better results. In fact, the larger number of variables we bring to bear makes it harder to weed out relationships that are misleading, irrelevant, or otherwise deserve to be excluded.
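A quick simulation makes the point concrete: when many candidate variables are screened against one outcome, some will look correlated purely by chance. This is only an illustrative sketch; the data here are random noise, and the 0.3 threshold is an arbitrary example, not a recommended cutoff.

```python
import random

random.seed(42)

def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

n_obs, n_vars = 30, 200  # small sample, many candidate variables
outcome = [random.gauss(0, 1) for _ in range(n_obs)]

# Every candidate variable is pure noise, so any apparent
# relationship with the outcome is spurious by construction.
spurious = 0
for _ in range(n_vars):
    var = [random.gauss(0, 1) for _ in range(n_obs)]
    if abs(corr(var, outcome)) > 0.3:  # a level many would call "notable"
        spurious += 1

print(f"{spurious} of {n_vars} pure-noise variables appear correlated with the outcome")
```

With 200 noise variables and only 30 observations, a handful of them will typically clear the threshold, which is exactly the trap of screening ever more variables without adjusting one's standards of evidence.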
The situation also calls for more precise statistical methods and greater knowledge and judgment in interpreting the data. More variables and greater data volume mean more hypotheses to test. Testing can be made more efficient by using smaller representative samples, provided the team has the skill to define subsets of the data that still qualify as representative. There is also the problem of deciding when the searching and testing are complete. It is easy for biases to creep in, slanting the data toward certain outcomes without carrying the process through to its full and logical conclusion. Even if the process is in theory a scientific one, much is still left to human judgment.
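One common way to keep a smaller sample representative is stratified sampling: draw the same fraction from each meaningful subgroup so the sample preserves the proportions of the full data set. The sketch below assumes hypothetical customer records with a made-up "segment" field; the function name and data are illustrative, not a prescribed method.

```python
import random
from collections import defaultdict

random.seed(7)

def stratified_sample(records, key, fraction):
    """Draw `fraction` of the records from each stratum defined by `key`,
    so the sample keeps the strata proportions of the full data set."""
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(random.sample(group, k))
    return sample

# Hypothetical customer base: 70% "retail", 30% "wholesale".
customers = ([{"id": i, "segment": "retail"} for i in range(700)]
             + [{"id": i, "segment": "wholesale"} for i in range(700, 1000)])

sample = stratified_sample(customers, key=lambda r: r["segment"], fraction=0.1)
print(len(sample))  # 100 records, in the original 70/30 split
```

A simple random draw of the same size could easily over- or under-represent the smaller segment; stratifying removes that source of distortion, though it still depends on choosing strata that actually matter to the question being tested.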
Additionally, the added complexity means that fewer people can fully understand both the methodology and the conclusions. Marketing and sales decisions have traditionally involved some science, but also a great deal of intuition and gut feeling, based on the experiences and predilections of those drawing the conclusions and setting the course of action. One can lapse unknowingly into seeking out the statistical results that most readily confirm one’s preconceived notions and beliefs, and giving minimal weight to those that do not. This is especially likely when deadlines loom and not every hypothesis can be tested.
Another issue is trying to solve every problem through analytics. Many problems have other solutions, such as improving communication and teamwork, eliminating bottlenecks, or improving efficiency, and are poor candidates for data analysis. A further danger is excessive zeal in exploring this voluminous data and the statistical analysis associated with it, where the search for correlations turns into an open-ended exploration or academic exercise rather than the solution of real business problems. One must resist going down blind alleys that will not lead to actionable results.
This is not to say that the plan must be followed so strictly that it leaves no room for the serendipitous discovery of unexpected correlations or trends that can have a real business impact. The trick is to stay alert to such findings while keeping the primary focus on the plan.
A problem some organizations encounter is that while management has bought into the broad concepts and benefits of big data analytics, the organization as a whole may not be ready to make changes and act on the new recommendations. Pilot programs may be needed to lend credibility and prove the value of the supporting analysis. Initial inertia or resistance may need to be overcome among those whose approaches have been more ad hoc in the past. Presentation of the recommendations should be tailored to this reality. Too much talk of statistical correlations, algorithms, probabilities, and mathematical models will not necessarily win over the decision makers. More talk of benefits, return on investment, customer satisfaction, expansion of the customer base, and increased revenue will be a more effective catalyst and spur to action.