The Great Knowledge Box Showdown: Google Now vs. Siri vs. Cortana

Google’s Knowledge Graph has been the center of much attention lately. We have been hearing a lot about another concept called the Knowledge Vault as well. But just how extensive is Google’s capability? And what about Siri and Bing/Cortana? How do they stack up? To find out, we loaded the Google App onto an iPhone (Google Now is part of the Google App), tested out Siri, got our hands on a Windows phone so we could test Cortana, and took them all for an extended test drive.
UPDATE! (20 September 2015) With the introduction of Siri for iOS 9, Re/code Magazine asked Perficient Digital to rerun our question set to see how much Siri has improved (if at all). Read the results here.
Great Knowledge Box Showdown: Google vs Siri vs Cortana
These are the things we set out to measure in this study. To do that, we took 3086 different queries and compared them across all three platforms. These were not random queries. In fact, they were picked because we felt they were likely to trigger a knowledge panel.
In addition, this was a straight-up knowledge box comparison, not a personal assistant comparison. Please also note that Cortana is in beta, and is promoting itself as a personal assistant. For purposes of this study, a “knowledge box” or “knowledge panel” is defined as content in the search results that attempts to directly answer a question asked in a search query. Others in the industry sometimes refer to these as “Answer Boxes”. Here is a simple example of one:
How Many Quarts in a Gallon
Knowledge boxes can show up in many forms, including:

  1. On the right rail of the search results
  2. As step by step instructions above the regular web search results
  3. As a structured snippet incorporated into the regular web search results
  4. In the form of a carousel above the search results

All queries in this test were done using voice commands via their respective apps, even when using Google and Bing. The reason we did this is that there are many commands in Google and Bing that behave differently when the search query is typed in, and we wanted to do a straight apples-to-apples comparison. The devices used were:

  1. Cortana running on a Nokia Lumia 635 Windows Phone
  2. Siri running on the iPhone 4s and iPhone 5
  3. The Google App (of which Google Now is a part) running on the iPhone 4s and iPhone 5

You can see Perficient Digital staff members Caitlin O’Connell and Justin Markuson demonstrate some basic queries in this short three-minute video.

Types of Results

Google uses many sources of data for the Knowledge Graph. Here is what Google’s Amit Singhal told us about that back in May 2012:

Google’s Knowledge Graph isn’t just rooted in public sources such as Freebase, Wikipedia and the CIA World Factbook. It’s also augmented at a much larger scale because we’re focused on comprehensive breadth and depth. It currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. And it’s tuned based on what people search for, and what we find out on the web.

Note: When this study was first published, the screenshots we showed here were from their desktop search, and we have now changed that. However, ALL searches performed in the test were performed via phone and voice search as detailed above. This was simply an author error in taking the screenshots.
Not only does Google pull from many sources, but it also has many different ways of presenting results. Let’s look at a few interesting examples:
How Tall is the Eiffel Tower
Not only do we get our answer, but Google offers us info on three other tall buildings. Google noticed that people who search on the height of the Eiffel Tower often want to know the height of other tall buildings. If you click on the “Burj Khalifa” link, it becomes even more interesting. Here is what you get:
Burj Khalifa Height Showing Carousel
The fascinating part about this result is that it has dramatically expanded the number of options with regard to other famous buildings, by presenting us a carousel (the common industry name for the strip of results up at the top) of results. I don’t get that result if I simply search on “burj khalifa height.” Instead, I get a much simpler variation as follows:
Burj Khalifa Height
As you see here, the results are quite different. The version with the carousel reflects the fact that I clicked on the link for Burj Khalifa when viewing the Eiffel Tower result. As soon as Google saw I was interested in more than one building, they gave me an even larger set. They could potentially figure out which buildings to show by seeing which queries typically follow your current query.
I.e., of all the people who search “how tall is the Eiffel Tower,” how many of them then search on some other building? Chances are, the most popular follow on queries relate to Burj Khalifa, the Empire State Building, and the Statue of Liberty, which is why these are shown in the Eiffel Tower result above. However, it seems that not many people do follow on queries for other buildings after searching on “burj khalifa height.”
It’s important to note that this is speculation, and it could also be that Google is simply testing different variants to see what works best. But in the long run, you can anticipate that a combination of statistics, testing, and UI design will significantly increase the variety of possible search results that you might see, based on the order in which you perform the searches.
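To make the follow-on query idea concrete, here is a minimal Python sketch of how one could count which queries tend to come right after a given query. This is only an illustration of the speculation above, not a description of what Google actually does. The session data, the function names `follow_on_counts` and `top_follow_ons`, and the cutoff `k` are all invented for this example.

```python
from collections import Counter, defaultdict


def follow_on_counts(sessions):
    """Count, for each query, which queries immediately follow it in a session."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair each query with the one issued right after it.
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def top_follow_ons(counts, query, k=3):
    """Return up to k of the most frequent follow-on queries for `query`."""
    return [q for q, _ in counts[query].most_common(k)]


# Hypothetical session logs, invented for illustration only.
sessions = [
    ["how tall is the eiffel tower", "burj khalifa height"],
    ["how tall is the eiffel tower", "burj khalifa height"],
    ["how tall is the eiffel tower", "empire state building height"],
    ["burj khalifa height"],
]

counts = follow_on_counts(sessions)
print(top_follow_ons(counts, "how tall is the eiffel tower"))
print(top_follow_ons(counts, "burj khalifa height"))
```

In this toy data, the Eiffel Tower query has popular follow-on queries and the Burj Khalifa query has none, which mirrors the difference between the two results shown above: one gets suggestions for other buildings and the other does not.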
You can also see other types of results. Some of these are extracted from third party web sites, such as this one:
Who is on the Five Dollar Bill?
This one is drawn from Wikipedia, but Google may also draw information from other web sites. One example of this is shown here:
How to Clean Suede
This result is an example of what we call step-by-step instructions, where you can actually receive a full procedure in the search results. These also divide into three different types:

  1. All the required steps are presented in the results, so visiting the website is not needed to get the requested info.
  2. Only some of the steps are provided, and therefore getting the complete process requires going to the source site.
  3. All of the steps are provided, but some of the steps are not completely detailed, so this still requires going to the specified site to get all the info needed by the searcher.

Another type of result is what we refer to as a “structured snippet.” These are results that look like the following:
How Old is the Colosseum
Notice how we see a regular search result, but some information has been extracted directly from that result and shown inline. For now, these are relatively rare. They may just be something that Google is testing for the moment, and which could expand significantly in the future if Google likes the results.
One last consideration that we examined is accuracy, or whether or not Google provides the answer to the question asked. Here is one fun example of a query where Google did not answer the question, shown from their desktop search when we first tried it:
How Much is a Quarter Cup of Butter Search Result
Note that when we originally did the study, we saw a similar result in the phone results, but it appears that Google has fixed it, as you can see here:
How Much is a Quarter Cup of Butter
At least they left the snarky part in.

What About Siri?

We tested all the same queries on Siri as well. It is well known that Siri sources data from Wolfram Alpha (a knowledge-based search engine that was at one time touted as a Google killer), but our testing showed that it also pulls in results from Wikipedia, Yahoo, and Bing. Here is a sample result that pulls in data from Wolfram Alpha:
Siri Why is the Sky Blue Result
Here is a sample result using Wikipedia as a source:
Siri How do you Make a Black Russian
In case you were wondering, I happen to like Black Russians. Next up is a sample result from Yahoo:
Siri When is Sunrise Result
Note that the question on this one was “when is the sunrise,” and the answer I get is when the sun rose this morning here in Southborough, Massachusetts. It also appears that Siri draws its image search results from Bing, as shown here:
Siri Show Me Pictures of a Doorknob
Like Google, Siri does make mistakes too. For example, when I ask “what does a cardiologist do,” I get this answer:
Siri What Does a Cardiologist Do
As you can see, Siri provides me information on two cardiologists located near where I am, which does not relate to the question asked at all. Last, but not least, Siri provides some very entertaining results as well. Here is one of the more fun ones:
Siri Why are Firetrucks Red Result
So now you know.

And Bing/Cortana?

As with Google, there are queries that respond differently when spoken (using Cortana) than when simply entered into the search box using your keyboard. For that reason, all queries tested were spoken. Here is an example of a simple direct answer query:
Cortana What is 2 plus 2 Result
For a large number of the tested queries, Cortana returned YouTube videos that purport to answer the questions. We did not count these as knowledge panel results. Cortana also drew upon the Oxford dictionaries to get definition type terms, such as you can see in this result:
Cortana What is a Bond Result
Cortana also appears to draw data from Wikipedia, Freebase, the New York Times, and other website sources. Here is an example of a query that appears to be drawn from the Facebook.com web site:
Cortana Who Invented Facebook
You can also find some fun stuff using Cortana. Here is the answer to the classic question “what is love?”:
Cortana What is Love Result
Here is hoping that Cortana can speed up its investigation on this matter. ;->

Some Notes on Bing vs. Cortana

It was interesting to note that in many cases Cortana would not return knowledge panels when a text-based search in Bing would. Google actually tended to do the opposite (voice search would bring up results that a regular text search would not). We did a spot check of scenarios where Cortana returned some type of knowledge result that did not fully answer the question, to see in how many of those cases Bing returned a more enhanced result.

We checked a total of 234 of these, and 78 of these (33 percent) provided fully complete answers in Bing. So Bing is further along in what they are doing than what is integrated into Cortana at this point.

Detailed Study Results

The study data shown below is as of October 4, 2014. Note that the engines all make changes in the results on an ongoing basis, and we do intend to monitor these results over time. With that said, let’s get to it:

Percent of Queries Showing Some Type of Enhanced Result

This includes knowledge boxes on the right, knowledge panels in the main column, and/or structured snippets. Here is what we found:
Google Siri Cortana Results Comparison
Google Now (this was the Google App running on the iPhone) returns twice as many results as Siri and nearly three times as many results as Cortana. This is clear evidence that Google is much further down the path with this type of work than either Apple or Microsoft. As noted above, Bing, using text-based search queries, returns knowledge boxes for more types of results than Cortana does at this time.

Do Enhanced Results Fully Answer the Question?

This section focused on whether or not the returned result fully addressed the question. The scoring here was harsh. If you asked “how old is the great wall of China” and the knowledge panel result showed that the Great Wall was completed in 206 BC, you got no credit. In addition, even if the first regular web search result shows the answer in its description or title, you also got no credit. Keep in mind, this was a knowledge panel test.
Google Now Siri Cortana Accuracy Comparison
Looking at the scores here, one might conclude that Cortana and Siri are genuinely bad. However, please bear in mind that this was a knowledge base test. The enhanced results returned in both systems had a far higher rate of being at least somewhat helpful, and in Cortana’s case, had a high rate of improving the standard search results. But you still need to click to see what you were really looking for.
Here is an example of a query for which Cortana returns a result, but which does not directly answer the question:
Cortana When is the Next Stanley Cup
Here is one for Siri for the phrase “who has the most patents”:
Siri Who has the Most Patents Result
Last, but not least, here is one for Google Now that when we first tried it did not really get the job done:
How Long Was the Gettysburg Address Result
Note that we saw a similar result in our phone query, but I mistakenly took a desktop screenshot for this post. It appears that Google has now fixed this problem, as shown by this phone screenshot:
How Long is the Gettysburg Address?
Each of these examples shows the struggles that each vendor has in truly nailing down a definitive answer to the question. The information is potentially helpful, but the answer we requested was not included.
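For readers who want to run a similar test, the scoring described in this section can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not our actual tooling: the `QueryResult` record, its field names, and the `summarize` helper are hypothetical, the sample grades are invented, and the choice to report complete answers as a share of enhanced results (rather than of all queries) is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class QueryResult:
    """Hypothetical record of one graded query; field names are assumptions."""
    query: str
    has_enhanced_result: bool  # a knowledge box, panel, or structured snippet appeared
    fully_answers: bool        # the panel itself states the answer that was asked for


def summarize(results):
    """Return the share of queries with an enhanced result and, of those,
    the share where the panel fully answers the question."""
    total = len(results)
    enhanced = [r for r in results if r.has_enhanced_result]
    # No credit unless the panel itself contains the requested answer.
    complete = [r for r in enhanced if r.fully_answers]
    return {
        "pct_enhanced": 100 * len(enhanced) / total if total else 0.0,
        "pct_complete_of_enhanced": (
            100 * len(complete) / len(enhanced) if enhanced else 0.0
        ),
    }


# Invented grades for illustration only; these are not study data.
results = [
    QueryResult("how many quarts in a gallon", True, True),
    # The panel shows a completion date, not an age, so it gets no credit.
    QueryResult("how old is the great wall of china", True, False),
    # The answer appears only in a regular web result, so it gets no credit.
    QueryResult("placeholder query without a panel (a)", False, False),
    QueryResult("placeholder query without a panel (b)", False, False),
]

summary = summarize(results)
print(summary)
```

Keeping "an enhanced result appeared" and "the panel fully answered the question" as two separate flags is what lets a result be helpful without earning credit, which is the distinction this section is drawing.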

More Specifics on Google Now

Google presents many results without providing attribution. These are generally in the form of well-established facts, such as “what is the capital of Maine?” The split works out roughly to 75/25, as shown here:
Knowledge Box Results that Show Links
Also of interest is a closer look at the step-by-step instructions. We actually found 276 examples of step-by-step instructions. One concern that many have expressed is that this might steal traffic from the publisher’s website from which the information was taken. However, we only found 59 different scenarios where the complete instruction set was provided.
I am betting that for those other 217 web sites, being the identified authority on answering this type of query is absolutely awesome:
Step by Step Complete Results
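As a quick check on the step-by-step numbers above, here is the arithmetic in a few lines of Python. The two counts come straight from the study; the variable names are just for illustration.

```python
total_step_by_step = 276  # step-by-step results found in the study
complete = 59             # results where the complete instruction set was shown

# Results that still require a visit to the source site.
incomplete = total_step_by_step - complete
pct_complete = 100 * complete / total_step_by_step
pct_incomplete = 100 * incomplete / total_step_by_step

print(incomplete)                 # 217
print(round(pct_complete, 1))     # 21.4
print(round(pct_incomplete, 1))   # 78.6
```

In other words, only about one in five of the step-by-step results gave away the whole procedure.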

Final Thoughts

So there you have it. As of October 4, Google Now has a clear lead in terms of the sheer volume of queries addressed, and more complete accuracy with its queries than either Siri or Cortana. All three parties will keep investing in this type of technology, but the cold hard facts are that Google is progressing the fastest on all fronts.
Please check out some of our other studies:

  1. Google Plus Impact on SEO
  2. Facebook Impact on SEO
  3. How Does Google Index Tweets?

Study Credits: Thanks to Caitlin O’Connell and Justin Markuson for their hard work on this study, and to Mark Traphagen for creating the opening image.
Here is the full set of queries used in the study

Thoughts on “The Great Knowledge Box Showdown: Google Now vs. Siri vs. Cortana”

  1. Someone needs to note that the Siri result about patents was for the question “Who has the most patents?”. It’s not written in the article anywhere, so as far as the reader knows, you may have asked “What is a patent?” and gotten the correct answer!

  2. Interesting study. I use both heavily. On Monday, I asked both Siri and Google Now this question “who is playing on Monday night football tonight?” Siri gave me the right answer straight away. Google Now gave me Sunday’s game results. I tried asking Google Now a few other ways and got nothing even close to the right answer.
    That said, I personally find both Google Now and Siri to be relatively similar on these kinds of questions. I do like the immediacy of cards in Google Now as it runs on an Android and wish Apple would support something similar at the home screen level (like Dashboard on OS X Mavericks).

  3. As a former Android user, I can say anecdotally that Google Now is far superior to Siri. Not only is Siri truly “dumber,” interacting with her is extremely frustrating. It’s well beneath Apple. That’s the problem with a company that says it always sweats the details and act purely — when they have a hunk of junk, it’s harder for them to realize it.

  4. You forgot to use Wolfram Alpha.
    Ask W.A. for the height of the Eiffel Tower and it also tells you the distance to the horizon, the *ratio* of height to the Burj Khalifa, and roughly how many stories high it is.
    As for your “quarter cup of butter” question, I think Wolfram’s answer kicks the Google answer off the field. I got a thorough nutritional profile, and links to alternative answers based on international variations on the “cup” measurement as well as to five specific types of butter.
    Try “Dive calculator”. Or “oscars won by Meryl Streep”. Or “How many m&ms fit in the Grand Canyon?”
    Wolfram Alpha makes the others look simple.

  5. Wilhem von Hapsburg

    Were there any other digital personal assistants worth looking at? How about S Voice by Samsung, HTC’s Hidi, or Voice Mate by LG?

  6. Hi Kosh – Siri uses Wolfram Alpha as its data source, which is why we did not test it separately. But you are right, we could have done that as well.

  7. Hi Wilhem – we had limited resources, so we chose these 3. Given that our focus was on knowledge boxes (or instant answers that are knowledge focused) type solutions, we chose the search engine based ones.

  8. Great article Eric.
    Finally a comprehensive benchmark that doesn’t seem to be biased towards a specific system.
    What about the speech recognition, though? Any thoughts on that? I know that is very hard to measure objectively, but do you favor a specific assistant over the other? How sensitive are they to surrounding noises? How well do they recognize sentences? That’s also something I find very important about a personal assistant.

  9. this is weird. for “How long is the Gettysburg address”, Google Now gives me the answer “President Lincoln delivered the 272 words Gettysburg Address on November 19, 1863” and then proceeds to give the first words of it.

  10. Sina Samangooei

    This is a really interesting comparison. Will you share the list of questions you were using for your analysis?

  11. Back in Dec 2011 I had a new Samsung Galaxy Nexus and I tried comparing it with a friend’s iPhone on a few questions. It was very informal, and I couldn’t conclude either one was better at the time, but one question highlighted an interesting behavior.
    I asked both phones “Where have all the flowers gone?”
    Google gave me a link to a video of Peter Paul & Mary singing the song, plus a link to a lyrics web page; while Apple gave me a list of nearby florists.
    It occurred to me that Apple was probably trying to monetize the query – choosing results that might allow them to collect advertising revenue or brag about targeted ad results or something like that.
    The cardiologist query might indicate the same thing. I asked Google Now what a cardiologist does and got a complete answer plus citation, while Apple didn’t seem to understand the question and gave results that might be monetized.

  12. Your results are already outdated. It appears that you aren’t using iOS 8’s update to Siri, which provides much more functionality (and provides the correct answer to the last query in the embedded youtube video).

  13. I agree with Goran. This article seems pretty biased towards Google and lacking basic information like number of queries used, what queries were used, were the same queries issued to all platforms, etc.

  14. Thanks, just found the number of queries above. Would still love to hear what queries were actually used.

  15. How’s this for a curveball? I just asked the same question, and just got a list of sources to click on. Not actual voice response.

  16. It wasn’t mentioned here, but Google will give you nutrition information if you ask for it. It can also do side by side comparisons and change the quantity to see how it will affect the nutrition info. This even works with foods that don’t typically come with nutrition labels (like produce).

  17. What is the “quarter cup of butter” query? I can’t find reference to it in the article and am interested to compare the results myself.
    I, too, am often impressed by the results I get from Wolfram Alpha. I often begin my Siri inquiries with “Wolfram” to ensure that I get results from that data point.

  18. Never mind. I see it, now. I was searching for the text and only after posting realized that it might have only been in a screenshot that I actually had to parse manually. 😀

  19. Itamar – we will indeed share the query set shortly. Getting it formatted for publication, and flat out at a conference. As soon as I get a couple of hours we will add a link to it into the post.

  20. Hi Mir – We will indeed share the query set shortly. Getting it formatted for publication, and flat out at a conference. As soon as I get a couple of hours we will add a link to it into the post.

  21. John – I am currently at a conference, but plan to publish the queries used and link to it from this post before end of day tomorrow.

  22. Interesting. My own results tightly correspond to yours in that Google Now handles queries much more efficiently. Also, I’ve found that the way you interact with the assistant varies from person to person. For example, I will say to Google Now “Navigate to the mall” and it works, but my girlfriend will say “Hi Google, could you please take me to the food court, love you,” and it complies. She almost sees it as another person, but conversely I interact with it like a tool. Even though this has nothing to do with the study you guys did, I found this interesting and thought I should share.

  23. I’m assuming you’re referring to American football. You can just say “Show me the NFL Schedule” and Google Now will display an interactive card showing the upcoming games, positioned to the current week. This is the same display you probably got, but you didn’t notice that you could swipe left and right to display other weeks.

  24. I just asked Google Now the final question from the video (“How long is the Lincoln Tunnel?”), and it told me (with voice) that it is 1.5 miles long. Seemingly another interesting example that it is either being improved as time and/or usage progresses, or that it simply isn’t entirely consistent in the results it gives.

  25. It’s reporting the value from Wikipedia. If you have a better answer, feel free to change it there (but I have a feeling you’ll be challenged on it, so be prepared to back up your answer). I’ve found a lot of other sources that support Wikipedia’s number.

  26. Methodology? How were the questions asked? I ask because just in the video referencing this report, 3 questions were asked and the wording was different between devices in one of the questions. It seems, to be truly fair, a recording of each question should be made and played back to ensure that each device would get the exact same question, wording, intonation, and volume.

  27. I tried the same thing for tonight’s game “Who is playing on Thursday Night Football tonight?” and was given last Thursday night’s score vocally, and was shown all the scores from last weekend’s games.
    However, when I asked “Who is playing on Monday Night Football next week?” I received no vocal response but was shown the schedule of games for this weekend (including tonight’s game).

  28. There are actually 6 different versions of the Address, each with slight variations, so there is no single “right” answer to your question. The version you probably memorized was the “Bliss” version.

  29. And this is why phrasing is key. Ask anyone who knows how to search Google and you can take the exact same set of words, ordered differently, and either get exactly what you need, or get nothing close to what you were looking for.

  30. Why would your phone’s OS upgrade have anything to do with Siri’s answers? Your phone is not deriving the answers. It’s just sending them out to a processor which does all the work, and then receives the answer. All three of them work the same way.

  31. You think it’s biased, and then complain that you don’t have any of the information needed to determine if it’s actually biased. So basically, you’re basing your belief that it’s biased, on the fact that Apple didn’t win.

  32. Eric, when you publish the details of the study can you be sure to include what devices you used, and what version of the operating systems you used during the study.
    Why are all of the screenshots for Cortana and Siri directly from the cellphone, but everything for Google Now is from a computer?
    This was a good study and hopefully Microsoft, Apple, and Google are notified with the findings so that they may further improve their products.

  33. I’d be interested in seeing an updated version with iOS8’s Siri.
    Also, I’d be curious to know whether you used Google Now on a mobile device for the testing, or just used the voice recognition in-browser on a PC (as the information displayed on a full browser is obviously going to be a bit more robust than that displayed on a mobile device, which could easily lead to the desired information being shown on a full browser, but truncated on a mobile browser).
    The included video shows queries done on a mobile device, but the screenshots of google are all in-browser.

  34. If you ask Google Now how many words are in the Gettysburg Address it correctly answers – so I guess there is still some work in understanding the different ways people can ask the same question…

  35. I find it troubling that the answer to “who is on the five dollar bill” is touted as a good “combined” answer. I am sure that the picture shown is not Abraham Lincoln.

  36. Comparing a neural network with deep learning (Google Now) to Apple’s programmed Siri is like putting a heavy weight against a feather boxer. They’re not even in the same league.

  37. Cortana is still in beta, yet when actually comparing the three in everyday use, Cortana is by far more advanced than Siri and Google Now. Siri and Google Now don’t have as much access to the phone and personal data, as Cortana actually learns the user. Google Now is the fastest but can only handle searches and simple tasks. Siri and Cortana are truly personal assistants, but Cortana is a lot faster than Siri and actually learns the user to make it a little more personal. Not biased at all, as I have a Nokia 1020, LG G3 and iPhone 6. Cortana is definitely a step above the rest.

  38. Who paid for this study? You’re an online marketing company… what are you marketing here?
    Also, I have an iPhone 6 Plus with iOS 8.0.2 and an Xperia Z1S and I’m seeing different results with my queries. Additionally, why have you left out Wolfram Alpha results? That’s a pretty huge chunk of Siri’s capability.
    I have to say that prima facie, this ‘study’ appears to be bunk. Sorry.

  39. Leon – the Wolfram Alpha results were part of the bulk of what Siri returned in the study, so they were included.
    We paid for the study with our own funds. No 3rd party money was involved. In terms of the outcome of this study, we are selling nothing.
    Sorry you did not like the study.

  40. Omar – as noted in the beginning of this study, it was focused on knowledge boxes, not the value of these apps as personal assistants.

  41. Good point that the bills shown are earlier versions of the 5 dollar bill. However, the text clearly calls out that it’s Abraham Lincoln.

  42. It may be worth noting W.A. is a computational search engine that is used by just about all of the other search engines when some type of mathematics, logic, is involved. W.A. is an offshoot/follow-up? of Mathematica , IMO, the greatest math/sci information tool available.(W/O a high level security clearance that is. =o) Great for what it is designed for. Computational Search Engine. Not real good in the abstract> Just an opinion, have a great day. Chief

  43. From what I have seen in advertising, these tools are not being touted as search aids, but as a means to voice control your device to add appointments, call someone, dictate, etc. Was this type of use taken into consideration for each vendor’s functionality?

  44. Hi Warren – as noted in the study, this was really meant to test the knowledge box / answer box capabilities of each tool, not their personal assistant capabilities. So the appointment, call and dictation capabilities were not included in the test. Our intention is to rerun the test again sometime soon.

  45. I’m assuming you’re confusing the results of our study for a clear and present bias towards Google. To be honest, I’m not sure where this notion originates. Whenever there is a clear winner in any study (or even sports event), there are always rumors of bias/conspiracy. In this study, there was none. We had no reason to provide preference. All in all, it would have been nice to see one of the underdog command systems match Google’s results, but the data wasn’t even close. That’s not a fact of bias, that’s a testament to how well Google Now operates as a command system in comparison to Siri and Cortana.

  46. The question was not “How long is the Lincoln Tunnel”. If you watch the video, we asked Google Now, “How old is the Lincoln Tunnel”, and the result gave us its length. We chose the query because the answer was seemingly incorrect on all 3 devices.

  47. Google Now was tested using a mobile device. It appears the screenshots came from a laptop/desktop. We apologize for the confusion.

  48. This study was of moderate interest, but my main use of Siri is to actually do things: set timers, wake me up in the morning, create appointments, add to my groceries list, etc. I use these features several times each day with very satisfactory results. I would be much more interested in a comparison between Siri and Google Now on performing these kinds of tasks.

  49. I’m half convinced that Richard there was being sarcastic. How else exactly can *facts* be *biased*?

  50. Google Now does far more than just provide search results. Either you haven’t really used it, or haven’t used it in a very long time.

  51. It’s predictable, but still amusing that those who don’t like the results, because they don’t favor the device they use, assume the study is flawed or biased. We seem to have turned into a people who believe that facts can be changed if only you have strong enough faith in an alternate set of beliefs.

  52. Cortana is what 3 months old? and is more of a personal assistant.
    Not quite apples to oranges, but Cortana is much better as a personal assistant.
    I can click on a search link and read just fine once it’s provided. I don’t need a “HAL9000” answer where I have to filter through the speech that Google is giving me.

  53. How can this test be “Google” biased? If you noticed, the tests were done on an iOS device, with the exception of Cortana. Correct me if I am wrong, but Siri should have the distinct advantage here as it is integrated into the OS.

  54. Hi Mike – as we acknowledged in the start of the write-up, we were purposefully focused on mapping out the knowledge base of the 3 participants, not the personal assistant capabilities of them. Cortana could well be a much better personal assistant, we don’t know, as we were not testing for that.

  55. Eric, what you list as a Cortana failure is a desktop image, where the knowledge box isn’t a direct answer but two query refinement boxes. I’d not have counted that as Cortana. While I’m pretty sure Google Now still would have done great, did you have a lot of these cases? That wouldn’t seem to count Cortana correctly.

  56. Hi Danny – edited this comment after your email – I removed the example. We actually did count it right in the study itself, and I made a mistake of including this as an example the way I did. Thanks for pointing that out.
    FYI – we plan to release the entire query set early this week. As part of that, we are re-running all the queries for which we determined that Siri and Cortana failed, to make sure we did not miss anything. The differences we are seeing so far from our published results are not statistically significant (< 1% change in our reported results).

  57. Microsoft is running Cortana ads panning Siri. I don’t think it gets a pass for being “in beta”.

  58. Can you describe how these ~3100 queries were selected? How did you know they may fire a knowledge box? Did you test the queries against Bing/Google to pick them?

  59. Hi – we did not pre-test the queries before picking them. We focused on picking queries about places, people, and processes, where it would be likely that the engine might be able to return the result from a database of some sort.

  60. It may be interesting to note that if one alters the question to “Who HOLDS the most patents?” Siri answers the question correctly (Thomas Edison; or IBM for “Which company holds the most patents?”).

  61. Nathanael – that IS interesting. There are lots of these little language twists in the questions. Speaks to what types of phrases each engine gets. Note that the phrases we used were not picked based on prior knowledge that they worked with Google, they were just picked based on our belief that they could trigger a response.

  62. I agree. It’s been my experience that all three personal assistants are more than occasionally plagued with such phrasing issues, to the point where it’s become habitual, when I don’t get the answer I’m looking for, to simply rephrase the question until I do.
    Having said that, I don’t personally often use Siri for her knowledge content. I find myself more dependent on her for voice dictation (I dictated this post, for example), and limited facilities such as schedule management, alarms, quick emails, voice-initiated calls, weather reports, and so forth.

  63. Eric, I’m interested to see the full query set, but it doesn’t appear to be posted yet. Is that still coming soon?

  64. I have seen the same behavior, Nathanael, across services and even with ChaCha over the last several years. Matching algorithms using page rank actually suck for surfacing answers. The NLP involved needs to be much more complex and exact phrasing is crucial at times.
    Often times users of ChaCha would rephrase their questions until they got the answer they were seeking, just like we see happening now with these personal assistants.
    We never solved the problem perfectly, but we found that the ‘shape’ of the question was an important component of the match. I think Google is doing this now, along with named entity matching, to really create a powerful system.

  65. He has an LG G3. How old do you think it is that you’d question how long ago he used Google Now?
    Obvious Google fanboi is obvious.

  66. You published the questions, thank you, but it’s kind of hard to reproduce the test without the full results, e.g. the result of individual questions on each platform.

  67. Exactly right. Googlers, such as Amit Singhal, openly speak about creating the “Star Trek Computer”: one where you can address the computer conversationally, and it can return any piece of information you ask of it.

  68. One little thing that needs to be clarified is that this has nothing to do with Google Now. In fact, Google Now isn’t even available on iOS or any other OS besides Android for that matter. What you referred to as Google Now throughout the post is actually just Google search. Google Now is the beast of a personal assistant that attempts to give you information before you ask based on your personal usage habits.

Eric Enge

Eric Enge is part of the Digital Marketing practice at Perficient. He designs studies and produces industry-related research to help prove, debunk, or evolve assumptions about digital marketing practices and their value. Eric is a writer, blogger, researcher, teacher, and keynote speaker and panelist at major industry conferences. Partnering with several other experts, Eric served as the lead author of The Art of SEO.
