The only thing I have found disappointing about Coveo’s analytics-based machine learning service, Reveal, is that it wasn’t available sooner. I have struggled to help clients adjust and tune the relevancy of their search engines for many years. After spending the better part of a week in-person with the Coveo founders and leadership team, I can say that Reveal is the most effective and straightforward solution I have seen to the problem of poor relevancy in search results. Period.
Let me pause for a minute and describe the basics of Coveo Reveal. Reveal actually begins with a cloud-based service called Coveo Usage Analytics, which captures and stores vast quantities of user query and click-stream data, including actions that users take before or after running a search. Usage Analytics can capture information about how users arrive at a site, where they navigate before running a search, and what they do after clicking a search result. The Reveal service then takes all of that information and builds models that predict the best search results for particular queries. When Coveo executes a search query and ranks the results, the Reveal service provides a scoring value that is added to the other relevancy signals, influencing the final order of the results.
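To make that last step concrete, here is a minimal sketch of what "a scoring value that is added to the other relevancy signals" could look like. This is not Coveo's implementation or API; the `Result` class, the `rerank` function, the document IDs, and the score values are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    doc_id: str
    base_score: float  # hypothetical stand-in for the engine's other relevancy signals


def rerank(results, learned_boost):
    """Add a learned per-document boost to each base score, then sort best-first."""
    return sorted(
        results,
        key=lambda r: r.base_score + learned_boost.get(r.doc_id, 0.0),
        reverse=True,
    )


# Invented numbers: the text-based ranking prefers "manual", but the usage
# data suggests "forum-thread" tends to lead to better outcomes for this query.
results = [Result("manual", 0.9), Result("faq", 0.7), Result("forum-thread", 0.6)]
boost = {"forum-thread": 0.5}

ranked = [r.doc_id for r in rerank(results, boost)]
print(ranked)
```

The point of the additive design is that the learned signal nudges the ranking rather than replacing it: a document with no usage history simply keeps its original score.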
I must admit, at this point in my understanding I wasn’t completely sold on the feature. We have been telling our customers to look at their search logs and click logs for years. But it was often hollow advice. First, most clients didn’t end up doing the analysis or didn’t stick with it long-term. Second, there was usually too much information for a mere human to make sense of it and determine the best course of action. The human brain is very good at recognizing patterns, but it has its limits. If there are too many different signals or inputs, and the events are spread out over time, it can be nearly impossible for a human analyst to spot emerging trends or subtle problems.
Therein lies the beauty of Reveal. Machine learning algorithms do not get tired or overwhelmed. If you provide adequate computational resources, and enough training data, these algorithms can very accurately find patterns in complex data from multiple sources over long periods of time.
Let me give an admittedly silly example to help explain the concept. Imagine that you run a convenience store, and you are trying to figure out the trick to getting more kids to buy a pack of gum when they shop at your store (bear with me here – maybe it’s a high-margin item?). You try profiling the kids themselves (tall, short, freckles, hair color) – but you don’t see any pattern that predicts whether or not they will end up buying a pack of gum. You consider the time of day – morning, afternoon or evening – but that doesn’t necessarily predict the outcome either. In reality, it turns out the biggest predictor is whether or not the kid is coming straight from some physical activity on a hot day, which means they are thirsty. Placing a rack of gum next to the cold sports drinks is the secret. But there were so many possibilities that you couldn’t single out this particular correlation.
Like I said, it’s a silly example, but the same idea applies to promoting the most ideal search results. Reveal can see all the different things people did or tried on the website, and which of those activities produced desirable outcomes and which ones produced undesirable outcomes. We, the mere humans, don’t have to figure out the winning correlations ourselves. The machine learning models can do that faster and more accurately than we can, across extremely large datasets.
In practice, this produces some remarkable results. Coveo tells a great story about one of their customers’ implementations and a simple, unexpected success. The client sells consumer electronics, some of which use a small wireless USB dongle that plugs into your computer. The USB dongle was an OEM part from an overseas manufacturer and had a part number printed on it that did not correspond to any data in the client’s own product catalog. But some users would type this part number into the search box and, wait for it, not find any results. Not very surprising. Short of a sixth sense or a meticulous synonym dictionary, most search engines would choke on this example.
But Reveal was able to piece together the bigger story. Users who searched for the USB dongle part number often tried other ways to find the product after their initial failure. Perhaps they noticed that the other half of the device had a different part number printed on the bottom. Eventually the users would succeed, and a positive event was logged in the Usage Analytics database. Reveal “crunched the numbers” and realized that when users searched for the obscure USB dongle part number, it would be a good idea to show the other product in the search results because it would lead to those positive outcome events. Like ants following a trail to sugar, machine learning exploits the fact that positive outcomes are reinforcing. If something works for one person, and then another, soon the model learns to predict this outcome and put it into practice for future visitors.
Coveo has found similar success in a variety of knowledge base and self-help implementations. This doesn’t just work for selling products. For example, avoiding the creation of a help ticket can be considered a successful outcome. If two users search for help on the same technical issue but click on different articles in the knowledge base, and one user ultimately ends up opening a help ticket and the other doesn’t, Reveal will learn that one article was likely better than the other and move it higher in the rankings in the future.
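The help-ticket scenario can be sketched in a few lines. To be clear, this is a deliberately simple stand-in (a smoothed success rate per query and article), not the model Reveal actually uses; the `learn_boosts` function, the event format, the query text, and the article IDs are all assumptions invented for this example.

```python
from collections import defaultdict


def learn_boosts(events, prior_successes=1.0, prior_trials=2.0):
    """Estimate, per (query, article), how often a click avoided a help ticket.

    Each event is a hypothetical (query, doc_id, ticket_opened) tuple. The
    prior keeps an article with very few clicks from jumping to 0.0 or 1.0.
    """
    counts = defaultdict(lambda: [0, 0])  # (query, doc_id) -> [successes, trials]
    for query, doc_id, ticket_opened in events:
        entry = counts[(query, doc_id)]
        entry[1] += 1
        if not ticket_opened:
            entry[0] += 1  # no ticket opened counts as a successful outcome
    return {
        key: (successes + prior_successes) / (trials + prior_trials)
        for key, (successes, trials) in counts.items()
    }


# Invented sessions: same query, different articles, different outcomes.
events = [
    ("vpn error 809", "kb-101", False),
    ("vpn error 809", "kb-101", False),
    ("vpn error 809", "kb-202", True),
]

boosts = learn_boosts(events)
print(boosts)
```

With these made-up sessions, the article whose readers did not go on to open a ticket ends up with the higher score, which is the behavior described above. A real system would weigh far more signals than a single click and outcome.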
In retrospect, Reveal is a solution that I’m surprised is just coming to fruition (I suspect that the large amounts of analytical data required for a solution like this delayed some on-premises search vendors; Coveo’s cloud-based approach scales nicely and appears to be ready for prime time). Reveal has roots in simple pattern matching and cause-and-effect relationships. To date, the industry has been largely focused on PageRank, text analytics, and textbook algorithms like TF-IDF to deterministically rank content. But taking a step back and looking at real, live patterns in user behavior turns out to have a remarkable impact on the quality of search results. I am optimistic that we are nearing the end of “10 results per page, and 10 pages of results”. With a system like Reveal, we are getting much closer to being able to give users the one best answer to their query.
For more information about Coveo Reveal, or the Coveo search platform in general, feel free to reach out to me by leaving a comment or using the links at the bottom of the page.