We started with the theory that many sites use blogs as publishing platforms where they don’t make sense. Our reasoning is that the majority of content published in blogs is evergreen, and that two main characteristics of a blog don’t work well with evergreen content:
- The site structure of a blog contains a lot of “overhead” pages that don’t make good targets for indexation and ranking by Google, such as category pages, paginated pages, tag pages, and date archives. This is counterproductive, as it results in Google having to crawl through pages that contain no SEO value, rather than focusing on pages that do.
- Blog posts “age” over time. What this means is that, as newer content is published, these posts become more distant (in terms of clicks) from the blog home page. This does not make sense for evergreen content, as its value may not age; thus, its placement in a site hierarchy shouldn’t be dependent upon the age of the post, but should be dependent upon the post’s current value.
To investigate just how much of an issue these concerns are, we conducted a study of 100 randomly selected blogs. We took each of these blogs and crawled every page from top to bottom, so we could fully map out their structure and the content on them.
Blog Structure Study Results
The first area we examined is whether or not the structure of a blog is fundamentally flawed. To do this, we evaluated what percentage of the pages in a blog were blog posts vs. other types of “overhead” pages. This is what we found:
We can also present that data a little differently by focusing on the percentage of pages that are NOT posts:
What we see is that in more than half of the blogs studied, 70% or more of the pages were not blog posts. In fact, posts made up 50% or more of the pages in only 29% of the blogs we looked at. This is significant because category or tag pages whose only content is links to posts, full posts, or post extracts are not likely to rank in search engines.
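To make the "percentage of pages that are NOT posts" measure concrete, here is a minimal sketch of how it could be computed from a list of crawled URL paths. The URL patterns below are assumptions modeled on a typical WordPress-style layout, and the `classify` and `non_post_share` helpers are hypothetical names; this is not the tooling used in the study, and a real blog may need different patterns.

```python
import re
from collections import Counter

# Hypothetical URL patterns for "overhead" pages (assumed WordPress-style paths).
OVERHEAD_PATTERNS = {
    "category": re.compile(r"^/category/"),
    "tag": re.compile(r"^/tag/"),
    "pagination": re.compile(r"/page/\d+/?$"),
    "date_archive": re.compile(r"^/\d{4}(/\d{2})?/?$"),
}


def classify(path):
    """Label a URL path as 'home', an overhead page type, or 'post'."""
    if path == "/":
        return "home"
    for label, pattern in OVERHEAD_PATTERNS.items():
        if pattern.search(path):
            return label
    # Anything that matches no overhead pattern is assumed to be a post.
    return "post"


def non_post_share(paths):
    """Return the fraction of crawled pages that are not blog posts."""
    counts = Counter(classify(p) for p in paths)
    total = sum(counts.values())
    return 1 - counts["post"] / total


# Made-up crawl of a tiny blog: 2 posts and 6 non-post pages.
paths = [
    "/",
    "/category/seo/",
    "/category/seo/page/2/",
    "/tag/links/",
    "/page/2/",
    "/2017/05/",
    "/evergreen-guide/",
    "/another-post/",
]
share = non_post_share(paths)
```

In this made-up crawl, six of the eight pages are non-post pages, which would put the blog in the "70% or more" group described above.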
We also found many other types of pages that held very low value and had very little potential for ranking, such as paginated pages and comment pages.
The data for link depth is equally interesting. This is the data for the average link depth of posts across the 100 blogs we sampled:
A stunning two-thirds of the blogs had an average link depth greater than 5 (where link depth means the number of clicks from the home page of the blog to the content). The problem is that content this many clicks away from the blog’s home page (a link depth of 6 clicks or more) is unlikely to rank for anything unless it has earned external links.
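Link depth as defined here (the fewest clicks from the home page to a page) can be computed with a breadth-first search over the site's internal links. The following is a minimal sketch, assuming the crawl has already been reduced to a dictionary that maps each page to the pages it links to; the `link_depths` function and the toy link graph are hypothetical and only illustrate the definition.

```python
from collections import deque


def link_depths(links, home):
    """Return {page: fewest clicks from home} using breadth-first search."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Toy blog (made up): each listing page links to one post and the next listing page.
links = {
    "/": ["/post-5", "/page/2"],
    "/page/2": ["/post-4", "/page/3"],
    "/page/3": ["/post-3"],
}
depths = link_depths(links, "/")

# Average link depth of the posts only, mirroring the per-blog metric above.
post_depths = [d for page, d in depths.items() if page.startswith("/post-")]
average_post_depth = sum(post_depths) / len(post_depths)
```

In this toy graph the newest post is one click from the home page and the oldest is three clicks away, so the average post depth is 2.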
Looking at this data from the perspective of the distribution of the link depth across all the posts, the results are as follows:
That’s fascinating to see. In some cases, the posts were more than 1,000 clicks away from the home pages – we discovered 3,488 posts in our sampling that were buried that deeply in the overall site hierarchy. In fact, 31.5% of the posts we found were 21 clicks or more from the blogs’ home pages, and 9.5% were 50 or more clicks away.
We did find examples of blogs that were very well structured. These examples contained very little waste in the site hierarchy and did not show excessive link depth. What this tells us is that it’s possible to configure blogs in a way that works well, without large quantities of crawlable pages that offer no value to the Google index.
A bigger issue to address is that blogs are inherently temporal in nature. For example, over time, a post’s placement in the blog hierarchy ages in a manner that is solely determined by the mechanics of the blog. The result is that it falls further and further away from the blog’s home page over time. For evergreen content, this makes no sense. A post’s placement on your site is something you will want to control more directly.
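A simplified model shows how this aging plays out. The sketch below assumes a worst case: posts are listed newest first, each listing page shows a fixed number of posts, and the only path to older posts is an "older posts" link to the next listing page. The `post_depth` function and the default of 10 posts per page are assumptions for illustration; blogs with category links, numbered pagination, or related-post links will have shorter paths.

```python
def post_depth(rank, posts_per_page=10):
    """Clicks from the home page to a post, under the worst-case model.

    rank is the post's position in reverse-chronological order (1 = newest).
    """
    # Listing page that holds the post (page 1 is the home page); ceiling division.
    listing_page = (rank + posts_per_page - 1) // posts_per_page
    # (listing_page - 1) clicks to reach that listing page, plus 1 click to the post.
    return listing_page


# The same post, as more and more newer posts are published above it.
depth_when_new = post_depth(1)         # on the home page
depth_after_100 = post_depth(101)      # 100 newer posts published since
depth_after_1000 = post_depth(1001)    # 1,000 newer posts published since
```

Under these assumptions, a post that starts one click from the home page is 11 clicks away after 100 newer posts and 101 clicks away after 1,000, even though nothing about its value has changed.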
Of course, some blogs are very news-oriented and publish content that is temporal in nature. We evaluated our 100 test blogs and determined that 92% of the posts examined were evergreen rather than temporal. This suggests that a large share of the blogs reviewed were primarily publishing content that would likely yield better results if it were published outside of a blog.
Building out your own content hub (instead of a blog) is what we recommend for such sites. This will allow you to manually control the placement of the content on your site and ensure that the content of most value to your site’s users appears near the top of your site hierarchy.
For sites investing heavily in content, this approach will help to get the best results from that investment.