Machine Learning Pt. 2: Applications in Enterprise Search

In last week’s article, What is Machine Learning?, I gave an overview of Machine Learning and provided a link to a very informative visual guide that describes how the technology works.  I also hinted at some useful applications within Enterprise Search.  I would like to cover a few of those that I have seen in recent months.

Query Analysis with Clustering

There are often many ways to search for the same concept or idea. When we review search logs and try to draw conclusions from what people are searching for, slight variations from query to query can make a comprehensive analysis difficult. For example, we recently worked with a major airline, and one of their top queries was “bereavement fares”, i.e. people looking for discounted fares to attend a funeral on short notice. We knew that this particular query was popular and that it needed some tuning to best serve the customers, but further analysis revealed other, differently worded queries suffering from the same problem. Those other forms of the query were less common, and if treated individually, they would have flown under the radar (no pun intended) in our analysis.
Clustering techniques, using machine learning, can group together variations of the same query and allow us to analyze and tune all of them at once.  For example, a well-trained tool can recognize that all of the following queries are fundamentally about the same thing:  bereavement fares, death in the family, funeral fares, funeral discounted fares, bereavement discounted tickets, etc.  Here is one other common example we have seen in the field: holiday calendar, vacation calendar, corporate holidays, etc.  By clustering the queries, we can appropriately deal with all of the variations at once, instead of treating only the most popular form of the query.
There are many ways to train a machine learning engine to cluster queries. We can teach it to cluster synonyms or related words using a thesaurus (e.g. death = loss = mourning). That would correlate some of the queries but not all of them: ‘funeral’, for example, is related to those words but is not a direct synonym of any of them. In addition to the straight dictionary approach, we can work backwards from real-world behavior. If the airline has a specific page that deals with bereavement fares, we can first find all of the queries that people have used to find that page in the past, and then provide those queries as input to train our machine learning engine. Machine learning is a game of numbers, and the more input we provide, the better the results can get. As the visual guide explained, the more concretely we demonstrate to the engine what constitutes a query about bereavement fares vs. a query not about bereavement fares, the better it will be able to recognize that distinction in the future.
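To make the "work backwards from real-world behavior" idea concrete, here is a minimal sketch in Python. It assumes the search log is available as (query, clicked URL) pairs; the `CLICK_LOG` rows, the URL paths, and the `queries_by_page` helper are all hypothetical and only illustrate the grouping step, not any particular product's log format.

```python
from collections import Counter, defaultdict

# Hypothetical search-log rows: (query text, URL the user clicked afterwards).
CLICK_LOG = [
    ("bereavement fares", "/fares/bereavement"),
    ("Funeral fares", "/fares/bereavement"),
    ("death in the family", "/fares/bereavement"),
    ("bereavement fares", "/fares/bereavement"),
    ("holiday calendar", "/hr/holidays"),
    ("corporate holidays", "/hr/holidays"),
    ("funeral  fares", "/fares/bereavement"),
]


def queries_by_page(click_log):
    """Group normalized queries under the page they led to, with counts."""
    groups = defaultdict(Counter)
    for query, url in click_log:
        # Lowercase and collapse whitespace so trivial variants merge.
        normalized = " ".join(query.lower().split())
        groups[url][normalized] += 1
    return groups


groups = queries_by_page(CLICK_LOG)
for url, counts in groups.items():
    print(url, counts.most_common())
```

Each group is a ready-made set of positive examples for one concept (here, every query that ended on the bereavement page), and queries from the other groups can serve as the negative examples.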
A nice side effect of this training is the elimination of noise around the critical words or concepts in the query.  A brute-force approach to stripping out noise, like ‘tickets’, ‘discounted’, ‘fares’, ‘family’, etc., will be hit or miss.  A well-trained machine learning engine can make a statistical guess about queries that it has never seen before, avoiding the need to program it for every possible variation or noise-word in the future.
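As a rough illustration of that "statistical guess", the sketch below trains a tiny naive Bayes classifier on a handful of labeled queries and then classifies queries it has not seen. Naive Bayes is my choice for the example, not necessarily the technique used in the projects described here, and the training data, label names, and the `train` and `classify` functions are hypothetical.

```python
import math
from collections import Counter

# Hypothetical labeled queries, e.g. harvested from click logs.
TRAINING = [
    ("bereavement fares", "bereavement"),
    ("funeral fares", "bereavement"),
    ("death in the family", "bereavement"),
    ("bereavement discounted tickets", "bereavement"),
    ("holiday calendar", "holidays"),
    ("vacation calendar", "holidays"),
    ("corporate holidays", "holidays"),
]


def train(examples):
    """Count words per label; these counts are the whole 'model'."""
    word_counts = {}
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(query, model):
    """Return the most probable label using add-one (Laplace) smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n / total)  # prior for this label
        for word in query.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
print(classify("discounted funeral tickets", model))
print(classify("company holiday calendar", model))
```

Note that a word such as ‘fares’ or ‘tickets’ is never stripped by hand; it simply carries whatever weight the training data gives it, which is the point made above about noise words.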

Query Auto-complete / Search as you Type

Search as you type, or query auto-completion, can be a very interesting and useful tool on a search form, but you have a very limited opportunity to provide useful information for your users.  At best, you have just a couple of letters or words as input, and only a few milliseconds to provide a response.  Take a look at the example below, which quickly identifies medical conditions and clinic facilities as I begin to type:
[Screenshot: auto-complete suggestions showing medical conditions and clinic facilities]
Apple.com and Adobe.com also provide very nice examples when starting to type the name of one of their products or services.
[Screenshot: auto-complete suggestions on Apple.com and Adobe.com]
Adobe and Apple have trained their auto-complete to recognize common queries almost instantaneously.  Machine learning allows us to pre-train an engine to very quickly recognize words or phrases and classify them in a meaningful way.  A well-trained machine learning engine can quickly identify important entities in the query, like people, places or things, without having to run a full search to figure out what is what.
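One way to picture the "pre-train so the answer is nearly instantaneous" idea is a prefix index built ahead of time from past queries, so that answering a keystroke is a single dictionary lookup. The sketch below is only an assumption about how such an index could look; it is not how Apple or Adobe implement their auto-complete, and the query counts and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical query frequencies taken from past search logs.
QUERY_COUNTS = {
    "iphone": 900,
    "ipad": 700,
    "imac": 400,
    "ipad pro": 300,
    "itunes": 250,
}


def build_prefix_index(query_counts, max_suggestions=3):
    """Precompute the top suggestions for every prefix of every query."""
    by_prefix = defaultdict(list)
    # Visit queries from most to least popular so each list is already ranked.
    ranked = sorted(query_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for query, _count in ranked:
        for end in range(1, len(query) + 1):
            bucket = by_prefix[query[:end]]
            if len(bucket) < max_suggestions:
                bucket.append(query)
    return dict(by_prefix)


def suggest(prefix, index):
    """Look up suggestions for what the user has typed so far."""
    return index.get(prefix.lower().lstrip(), [])


index = build_prefix_index(QUERY_COUNTS)
print(suggest("ip", index))
```

All of the ranking work happens in `build_prefix_index`, offline; the per-keystroke cost is one lookup, which is what makes a response time of a few milliseconds realistic.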
For example, imagine I search for the word “Target” on a financial services website. I might have meant the bull's-eye in a game of archery, or the retail giant. If I were a financial institution offering customers information about companies and other investments, it would be helpful to recognize Target, the company, and provide a quick link to the company’s profile for my users. We are doing similar implementations that recognize people’s names or locations and special forms of queries (like a comparison between item A vs. item B). In the end, we shave off seconds, or extra clicks, and over the course of a day or a month, it adds up to significant savings.
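The routing step described above can be sketched as follows. To keep the example short, a plain dictionary lookup and a regular expression stand in for a trained entity recognizer; the company list, the profile paths, and the `interpret` function are hypothetical.

```python
import re

# Hypothetical entity dictionary for a financial services site.
KNOWN_COMPANIES = {
    "target": "/companies/target",
    "apple": "/companies/apple",
    "adobe": "/companies/adobe",
}

# Matches special-form queries such as "A vs B", "A vs. B", or "A versus B".
COMPARISON = re.compile(r"^\s*(.+?)\s+(?:vs\.?|versus)\s+(.+?)\s*$", re.IGNORECASE)


def interpret(query):
    """Decide how to handle a query before running a full search."""
    match = COMPARISON.match(query)
    if match:
        return {"type": "comparison", "items": [match.group(1), match.group(2)]}
    key = query.strip().lower()
    if key in KNOWN_COMPANIES:
        # Recognized entity: offer a quick link to the company profile.
        return {"type": "company", "link": KNOWN_COMPANIES[key]}
    return {"type": "search", "query": query.strip()}


print(interpret("Target"))
print(interpret("Apple vs. Adobe"))
print(interpret("target practice"))
```

A trained model earns its keep on the cases this lookup cannot handle, such as misspellings, partial names, or a word like “target” used in a non-company sense.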
 


Chad Johnson

Chad is a Principal of Search and Knowledge Discovery at Perficient. He was previously the Director of Perficient's national Google for Work practice.
