Google search is truly one of the great marvels of modern technology. We can ask it almost any question, and in less than a second get a well-ordered list of potential answer sources. But how does that happen?
In this episode of our Here’s Why digital marketing series, Eric Enge pulls back the curtain and reveals the fundamentals of how search works.
Don’t miss a single episode of Here’s Why. Click the subscribe button below to be notified via email each time a new video is published.
Transcript
Mark: So, Eric, just how big is Google?
Eric: Well, Mark, if by how big you mean how much information does it contain, the last we know, Google told us that it knows of about 130,000,000,000,000 webpages, but I think that’s just the ones that it’s actually crawled, and the real number is probably much higher.
Mark: Wow, that’s a lot. I mean, how does Google keep track of it all and deliver us results so fast?
Eric: Well, it boils down to a three-step process: crawling, indexing, and ranking.
How Does Google Rank Web Pages?
Mark: Let’s go through each of those one at a time.
Step 1: Crawling Pages
Eric: Okay. Well, first comes crawling. Search engines have web bots known as spiders that crawl the World Wide Web to discover webpages and what they’re about in order to evaluate the usefulness for answering users’ queries. Spiders move through the web via links, links from one webpage to the next, and then down through the link structure of the site itself.
Mark: Okay. So, we’re looking at a webpage now, in this case the home page of us.gov. Is this what the Google spider sees?
Eric: Well, not exactly. The spider sees the rendered HTML and JavaScript code of the page, known as the DOM, or document object model, and that’s where it discovers the links to other parts of the site and to other sites that the page links to. All the links discovered are loaded into a queue for later crawling.
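To make the link-discovery step concrete, here is a minimal sketch in Python. It is only an illustration, not how Google's spider is built: it parses static HTML rather than a fully rendered DOM, and the class name `LinkExtractor`, the function `discover_links`, and the sample page are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, href))


def discover_links(base_url, html, queue, seen):
    """Parse one page and queue every link that has not been seen before."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    for link in parser.links:
        if link not in seen:
            seen.add(link)
            queue.append(link)


# Hypothetical page content, for illustration only.
html = (
    '<a href="/about">About</a> '
    '<a href="https://example.org/">Partner</a> '
    '<a href="/about">About again</a>'
)
queue, seen = deque(), set()
discover_links("https://example.com/", html, queue, seen)
print(list(queue))
```

The `seen` set keeps the repeated `/about` link from being queued twice, so the queue ends up holding one internal link and one external link for later crawling.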
Mark: So, Google goes through all those crawlable links every day?
Eric: Well, no. Even for Google, that would be neither possible nor necessary. They actually spread each crawl over several weeks, using a set of priorities. Priorities let them know that there are some pages that probably never need to be crawled, and some that need to be crawled less often than others.
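A crawl schedule driven by priorities could be sketched roughly as follows. The priority labels and recrawl intervals are assumptions made up for this example; Google's actual priorities are not described in this conversation.

```python
import heapq

# Hypothetical recrawl intervals in days; None means "never crawl".
RECRAWL_DAYS = {"high": 1, "medium": 7, "low": 30, "skip": None}


def build_schedule(pages, horizon_days):
    """Return (day, url) crawl visits inside the horizon, earliest first."""
    heap = []
    for url, priority in pages.items():
        if RECRAWL_DAYS[priority] is not None:
            heapq.heappush(heap, (0, url))

    visits = []
    while heap:
        day, url = heapq.heappop(heap)
        if day >= horizon_days:
            continue  # Falls outside the planning window; drop it.
        visits.append((day, url))
        # Re-queue the page for its next visit based on its priority.
        heapq.heappush(heap, (day + RECRAWL_DAYS[pages[url]], url))
    return visits


# Hypothetical pages and priorities, for illustration only.
pages = {
    "https://example.com/": "high",
    "https://example.com/archive": "low",
    "https://example.com/login": "skip",
}
visits = build_schedule(pages, horizon_days=3)
for day, url in visits:
    print(day, url)
```

Over a three-day window, the high-priority home page is visited every day, the low-priority archive page once, and the skipped login page not at all.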
Step 2: Indexing Pages
Mark: Okay, so, our spiders have crawled the World Wide Web. Now what?
Eric: Well, next comes the process of indexing. All that information collected by the spiders has to be placed into an index to make it useful for searches. The index for each page includes things like detailed data on the nature of the content and topical relevance of each webpage, a map of all the pages that each page links to, the clickable or anchor text of each link, and other pertinent information about each link, such as whether it’s an ad or not, where it is on the page, and more things like that.
The index is the database where search engines like Google store data and then retrieve it when a user types a query into the search engine. Before deciding which pages to show from the index, and in which order, search engines apply algorithms to rank the webpages.
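As a rough illustration of what such an index might hold, the Python sketch below builds a tiny inverted index (term to pages) alongside a link map that keeps each link's anchor text and an ad flag. The data layout, the function names `build_index` and `lookup`, and the sample pages are assumptions for this example, not a description of Google's real index.

```python
from collections import defaultdict


def build_index(pages):
    """Build a term -> URLs index and a URL -> outgoing links map."""
    inverted = defaultdict(set)
    link_map = {}
    for url, page in pages.items():
        for term in page["text"].lower().split():
            inverted[term].add(url)
        # Each link is (target URL, anchor text, is_ad).
        link_map[url] = list(page["links"])
    return inverted, link_map


def lookup(inverted, query):
    """Return the URLs that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(inverted.get(terms[0], set()))
    for term in terms[1:]:
        results &= inverted.get(term, set())
    return results


# Hypothetical crawled pages, for illustration only.
pages = {
    "a.example/coffee": {
        "text": "how to brew coffee at home",
        "links": [("b.example/beans", "coffee beans", False)],
    },
    "b.example/beans": {
        "text": "buy fresh coffee beans",
        "links": [],
    },
}
inverted, link_map = build_index(pages)
print(lookup(inverted, "coffee"))
print(lookup(inverted, "coffee beans"))
print(link_map["a.example/coffee"])
```

Looking up terms in a prebuilt structure like this, instead of scanning every page at query time, is what makes retrieval fast enough to happen while the user waits.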
Step 3: Ranking Pages
Mark: And there’s that word “rank.” Now, talk about how Google ranks the search results we see.
Eric: Sure. Well, a lot goes into this step, and it has to happen in fractions of a second.
- First, Google has to interpret the intent of the searcher. Is she looking for information, or looking to make a purchase? Or, in the case of voice search now, is she perhaps just asking a further question about the topic she previously asked about?
- Next, Google has to identify which pages in its index are relevant to the query, and their degrees of relevancy.
- Finally, Google ranks and returns those pages in order of their importance and relevance.
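The last two steps can be sketched in a few lines of Python. This is a toy under stated assumptions: relevance is approximated by simple word overlap, each page's importance score is supplied by hand, and the 70/30 weighting is arbitrary. None of these choices come from Google, and the intent-interpretation step is left out entirely.

```python
def relevance(query, text):
    """Fraction of query terms found in the page text (a crude proxy)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def rank(query, pages, weight=0.7):
    """Order relevant pages by a weighted mix of relevance and importance."""
    scored = []
    for url, page in pages.items():
        rel = relevance(query, page["text"])
        if rel == 0.0:
            continue  # Not relevant at all, so it is never returned.
        score = weight * rel + (1.0 - weight) * page["importance"]
        scored.append((score, url))
    scored.sort(reverse=True)
    return [url for _, url in scored]


# Hypothetical pages with hand-assigned importance scores between 0 and 1.
pages = {
    "shop.example/espresso-machines": {
        "text": "buy espresso machines online",
        "importance": 0.4,
    },
    "blog.example/espresso-guide": {
        "text": "espresso brewing guide",
        "importance": 0.9,
    },
    "news.example/weather": {
        "text": "weather forecast today",
        "importance": 0.95,
    },
}
results = rank("espresso machines", pages)
print(results)
```

In this example the weather page has the highest importance score but never appears, because it has no relevance to the query. The shop page outranks the blog page because it matches both query terms.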
Mark: Now, you just said importance and relevance, but is there a distinguishing factor between those two?
Eric: Sure. So, let’s take relevance first. Relevance is the degree to which any given web page matches the intent of the searcher. It’s no small task to figure that out on a massive scale. And just so you’re clear, there are degrees of relevance. So, something can be highly relevant, or a little relevant, or kind of in between. And importance has to do with how often a web page is cited by other pages, and to some degree, with how relevant and important those citing pages are. Typically, that citation comes in the form of a link, but it can come in other ways.
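The idea that a citation counts for more when it comes from an important page can be illustrated with a simplified link-scoring loop, loosely in the spirit of the publicly described PageRank idea. This is a sketch only: the damping value, the iteration count, and the four-page link graph are assumptions, and it should not be read as Google's current ranking method.

```python
def importance_scores(links, damping=0.85, iterations=50):
    """Iteratively score pages by the links (citations) pointing at them."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among the pages it cites.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical link graph: a, b, and d all cite c; c cites a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = importance_scores(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Page `c` comes out on top because three pages cite it. Page `a` has only one citation, but it outranks `b` and `d` because that citation comes from the highly scored `c`.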
Mark: Okay, let’s make this practical. What does all that mean for someone trying to do SEO? Someone trying to help their site rank higher in Google for the queries that are important to them?
Eric: I think it means the same two things that are important to Google in ranking. Relevance and importance need to be important to SEO. As an SEO, you should be asking yourself about any page, “How relevant is the content on this page to what my target visitor would be looking for, and what have I done to make it a good enough and useful enough experience that other sites will wanna link to it and cite its importance?”
Mark: And, of course, wrapped up in that simple-sounding explanation is all the complexity and artistry that a truly good SEO puts into his or her work.
Eric: Yes. So, of course, there’s really a lot more to it, such as understanding how a search engine views, understands, and evaluates your content.
Mark: Which you go into in great depth in your guide to how Google search results work on our blog.