
Web Analytics Shootout – Final Report – a 2011 Perspective

Perspective

Four years later, the conclusions drawn in this groundbreaking report still stand the test of time. Perficient Digital was the first company to scientifically demonstrate just how much the results from different analytics packages can differ:
Chart Showing Unique Visitor Data
The industry has gone through some changes (Clicktracks, IndexTools, and HBX have all been acquired), but the variation in measurement has not.
Some of the key findings in this report are:

  1. Analytics Data will Vary by as Much as 50% – making it impossible to compare the output from one tool to another.
  2. The Value of Analytics is in Relative Measurement – focus on the data your own tool shows you, and build your internet marketing optimization campaigns around growing that traffic.
  3. Definition of the Differences – in the report we discuss some of the differences in implementation that drive differences in measurement.

While information in the report may be dated, the lessons outlined above still apply. And if you are looking for help building your website traffic and the conversions that come from it, please check out our SEO and Internet Marketing Services.

Introduction to the Web Analytics Shoot Out – by Jim Sterne

In this updated 2007 Analytics Shoot Out, Perficient Digital takes the same approach of head-to-head comparisons of major web analytics packages on real websites. Yes, they evaluate things like ease of implementation, use, and reporting. Yes, they look at the strengths and weaknesses of each package. But then they dig deeper into the conundrum of accuracy in web analytics data and discuss where accuracy matters. They look harder at first-party versus third-party cookies. They measure how JavaScript placement on the web page affects the resulting data. They also get practical, identifying which analytics tools are best for which types of websites.
Are you trying to compare and contrast the different tools out there? This is a great resource.
Jim Sterne
eMetrics Marketing Optimization Summit

Overview of the 2007 Analytics Shoot Out

The 2007 Analytics Shoot Out is targeted at evaluating the performance, accuracy, and capabilities of 7 different analytics packages as implemented across 4 different sites. The goals of the project are as follows:

  1. Evaluate ease of implementation
  2. Evaluate ease of use
  3. Understand the basic capabilities of each package
  4. Solve specific problems on each web site
  5. Discover the unique strengths of each package
  6. Discover the unique weaknesses of each package
  7. Learn about the structural technology elements of each package that affect its capabilities
  8. Learn how to better match a customer’s needs to the right analytics package

How the results of the Shoot Out are delivered

The results of the Shoot Out have been delivered in two stages:

  1. The interim report was officially released at the eMetrics Summit in San Francisco on May 6, 2007.
  2. This report, the final report, contains all of the material in the interim report, along with more comprehensive results and analysis.

What you get in this report

Section 1. An executive summary of the report, key findings, and key takeaways
Section 2. Information about how the study was conducted, and its methodology
Section 3. An analysis of how the user deletion rates of third-party cookies and first-party cookies differ
Section 4. *** Content Updated in the Final Report ***: Comparative data showing:

  1. Visitors
  2. Unique Visitors
  3. Page Views
  4. Site-specific segments, as defined for 2 of the sites

These numbers have been updated and expanded from the Interim Report.
Section 5. *** All New Content ***: A section on “Why Accuracy Matters”

  1. An overall commentary on accuracy in analytics
  2. A discussion of scenarios where accuracy matters
  3. What this means for how you use analytics to help manage your business
  4. How the analytics vendors measure sessions

Section 6. *** All New Content ***: A detailed study of the effect that location of the JavaScript on the web page has on traffic data:

  1. Test results showing how JavaScript placement affects the reported traffic data
  2. A discussion of what this means for website owners and marketers

Section 7. *** All New Content ***: A qualitative review of the major strengths and weaknesses of all of the packages we worked with during the study. As all of the packages have strong customer bases, we did not set out to pick winners and losers per se, and we frankly don’t feel that is the pertinent output of such an examination.
Picking a single winner would imply that one package is best at all things for all people, and this is not the case. Each package has different strengths and weaknesses that ultimately make it a better fit for some types of web sites than others. For many webmasters, cost is also a large factor that needs to be considered.

Section 1: Executive Summary

I have participated in countless discussions with people who have been concerned about the accuracy of their analytics solutions. I have also had the chance to talk with, and interview, many of the leading players in the analytics industry. These leaders have all indicated that accuracy was not a problem, provided that the tools are implemented and used properly.
While I’ve used analytics tools extensively, and followed this business with great interest for quite some time, pursuing this project ultimately required a spark. That spark was provided by Rand Fishkin in a blog post he did in November 2006, titled: Free Linkbait Idea. Basically, Rand suggested that someone do a study based on placing multiple analytics packages simultaneously on multiple web sites, recording the data, and then analyzing and publishing the results.
I signed Perficient Digital up to do the job, and this study is the result.
As for whether or not the packages are accurate, you’ll see that this is not a simple question. The pundits are right – and they are also wrong. Ultimately, web analytics packages are like any other tool. Used properly, they can certainly help you grow and understand your business. However, it is easy to use them improperly, and it takes a sophisticated level of expertise to use them in an optimal fashion.
Web analytics, done right, is hard. However, done right, web analytics can provide an outstanding ROI on the time and money you put into it, and doing it well provides you with a major advantage over your competitors who do it less well.

Key Findings

  1. Web analytics packages, installed on the same web site, configured the same way, produce different numbers. Sometimes radically different numbers. In some cases the package showing the highest numbers reported more than 150% of the traffic reported by the package showing the least.

  2. By far the biggest source of error in analytics is implementation error. A Web analytics implementation needs to be treated like a software development project, and must be subjected to the same scrutiny and testing to make sure it has been done correctly.

Note that we had the support of the analytics vendors themselves in the implementations done for the 2007 Web Analytics Shootout, so we believe that this type of error was not a factor in any of the data in our report, except where noted.

  3. Two other major factors drive differences in the results. One is the placement of the JavaScript on the page: JavaScript placed far down on a page may not execute before some users leave the page. Traffic that goes uncounted for this reason can be considered an error, because the data for that visit is lost (or at least the data regarding the original landing page and, if the visitor came from a search engine, the keyword data).

The other factor is differences in the definition of what each package is counting. The way that analytics packages count visitors and unique visitors is based on the concept of sessions. There are many design decisions made within an analytics package that will cause it to count sessions differently, and this has a profound impact on the reported numbers.
Note that this should not be considered a source of error. It’s just that the packages are counting different things, equally well for the most part.

  4. Page views tend to have a smaller level of variance. The variance in ways an analytics package can count page views is much smaller. JavaScript placement will affect page views, but differences in sessionization algorithms will not. Simply put, if the tracking JavaScript on a page executes, it counts as a page view.
  5. There are scenarios in which these variances and errors matter, particularly if you are trying to compare traffic between sites, or numbers between different analytics packages. This is, generally speaking, an almost fruitless exercise.

  6. To help address these accuracy problems, you should calibrate with other tools and measurement techniques when you can. This helps quantify the nature of any inaccuracies and makes your analytics strategy more effective.

  7. One of the basic lessons is learning what analytics software packages are good at, and what they are not good at. Armed with this understanding, you can take advantage of the analytics capabilities that are strong and reliable, and pay less attention to the other aspects. Some examples of where analytics software is accurate and powerful are:

  1. A/B and multivariate testing
  2. Optimizing PPC Campaigns
  3. Optimizing Organic SEO Campaigns
  4. Segmenting visitor traffic
There are many other examples that could be listed. The critical lesson is that while the absolute numbers may not be accurate, the relative measurements are worth their weight in gold.

In other words, if your analytics package tells you that Page A converts better than Page B, that’s money in the bank. Or if the software tells you which keywords offer the best conversion rates, that’s also money in the bank. Or, if it says that European visitors buy more blue widgets than North American visitors – you got it – more money in the bank.
Enter the world of analytics accuracy below, and hopefully you will emerge with a better appreciation of how to use these tools to help your business, as I did.

Section 2: 2007 Analytics Shoot Out Details

Analytics Packages

The following companies actively contributed their time and effort to this project:

  1. Clicktracks
  2. Google Analytics
  3. IndexTools
  4. Unica Affinium NetInsight
  5. Visual Sciences’ HBX Analytics

Each of these analytics packages was installed on multiple web sites, and each of these companies contributed engineering support resources to assist us during the project.
We were also able to evaluate the following analytics packages because they were already on one of the sites we used in the project:

  1. Omniture SiteCatalyst
  2. WebTrends

Participating Web Sites

  1. AdvancedMD (AMD)
  2. City Town Info (CTI)
  3. Home Portfolio (HPort)
  4. Tool Parts Direct (TPD)

Each of these sites installed multiple analytics packages on their sites per our instructions and made revisions as requested by us. Here is a matrix of Web Sites and Analytics Packages that were tested in the Shoot Out:

Site  | Clicktracks | Google Analytics | IndexTools | Omniture | Unica NetInsight | HBX Analytics | WebTrends
AMD   | Y           | Y                | Y          | Y        | Y                | Y             | N
CTI   | Y           | Y                | Y          | N        | Y                | Y             | N
HPort | Y           | Y                | Y          | N        | N                | Y             | Y
TPD   | Y           | Y                | Y          | N        | N                | Y             | N

Additional Contributors

Thanks are also due to the following companies, who contributed to this project:

  1. Alchemist Media
  2. SEOmoz
  3. Market Motive
  4. IndexTools

And a special thanks to Jim Sterne of Target Marketing, and the eMetrics Marketing Optimization Summit for his support of the Shoot Out.

Methodology

The major aspects of the Shoot Out methodology are as follows:

  1. For each package, except WebTrends, we installed JavaScript on the pages of the participating sites. WebTrends was already installed on one of the sites participating in the project, and the implementation used a combination of JavaScript tags and log file analysis.
  2. All the JavaScript was added to website pages through include files. As a result, we eliminated the possibility of JavaScript coverage varying from package to package.
  3. All packages were run concurrently.
  4. All packages used first-party cookies.
  5. A custom analytics plan was tailored to the needs of each site.
  6. Visitors, Unique Visitors, and Page Views were recorded daily for each site.
  7. Content Groups and Segments were set up for each site. Numbers related to these were recorded daily.
  8. On one site, City Town Info, we varied the order of the JavaScript on the page for a period of time, to see how this altered the comparative statistics for the 5 analytics packages we had running on it.
  9. Also on City Town Info, we placed a tracking pixel at the top of the page, to see how that placement affected the counting of traffic.
  10. We measured the execution time of each of the analytics packages across 3 of the sites.
  11. Detailed ad hoc analysis was done with each analytics package on each site.
  12. Critical strengths and weaknesses of each package were noted and reviewed with each vendor for comment.
  13. Each vendor was given an opportunity to present their product’s strongest features and benefits.

Section 3: First Party Cookies vs. Third Party Cookies

Using Visual Sciences’ HBX Analytics running on CityTownInfo.com, we ran the software for a fixed period of time using third-party cookies (TPCs). We then ran the software for the same amount of time using first-party cookies (FPCs).
During that same period we ran 3 of the other analytics packages (Clicktracks, Google Analytics, and IndexTools), all using first-party cookies.
The results were then compared by examining the relationship of the HBX reported volumes to the average of the volumes of the three other packages, and then seeing how that relationship changed when we switched from third-party cookies to first-party cookies. In theory, this should give us an estimate of how user blocking and deletion of third-party cookies compares to user blocking and deletion of first-party cookies.
Here are the results we obtained while HBX Analytics was running third-party cookies:

Visitors Uniques Page Views
Clicktracks 72,224 66,335 120,536
Google Analytics 66,866 64,975 118,230
IndexTools 67,365 65,212 123,279
WebSideStory’s HBX Analytics 48,990 47,813 102,534
Average of all but HBX Analytics 68,818 65,507 120,682
HBX Analytics % of Average 71.19% 72.99% 84.96%

Visitor and unique visitor totals for HBX Analytics are 71 – 73% of the average of the other 3 packages. On the other hand, page views are roughly 85% of the average of the other 3 packages.
Now let’s take a look at the same type of information over the time period when HBX Analytics was making use of first party cookies:

Visitors Uniques Page Views
Clicktracks 71,076 65,314 114,966
Google Analytics 65,906 64,030 112,436
IndexTools 67,117 64,621 119,049
WebSideStory’s HBX Analytics 55,871 54,520 96,453
Average of all but HBX Analytics 68,033 64,655 115,484
HBX Analytics % of Average 82.12% 84.32% 83.52%
Relative Traffic Growth with FPCs (*) 13.32% 13.44%
  (*) Calculated as 1 – (the HBX Analytics % of Average during the third-party cookie period / the HBX Analytics % of Average during the first-party cookie period)

With first-party cookies, the visitor and unique visitor totals for HBX Analytics are now 82 – 84% of the average of the other 3 packages. The pageviews relationship did not change significantly and was roughly 84%.
By observing how the traffic reported by HBX Analytics increased with respect to the average of the other 3 packages, we can estimate how third-party cookie blocking and deletion differs from first-party cookie blocking and deletion.
According to this data, the third-party cookie blocking and deletion rate exceeds the first-party cookie blocking and deletion rate by a little more than 13%. Visual Sciences also reported to Perficient Digital that it saw a 15-20% third-party cookie blocking and deletion rate across sites it monitored during a 2-week period in January, and about a 2% first-party cookie blocking and deletion rate.
This data is fairly consistent with past industry data that estimates the third party cookie deletion rate at about 15%. Visual Sciences reported to me recently that they see a 12% to 15% deletion rate on TPCs and about 1% on FPCs.
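To make the arithmetic concrete, here is a minimal Python sketch of the comparison described above, using the visitor figures from the two tables; the function name is ours, not part of any analytics package:

```python
# Minimal sketch of the TPC vs. FPC comparison described above.
# Figures are the visitor counts from the two HBX Analytics tables in this section.

def pct_of_average(hbx_count, other_counts):
    """HBX count expressed as a percentage of the average of the other packages."""
    average = sum(other_counts) / len(other_counts)
    return 100.0 * hbx_count / average

# Third-party cookie period: Clicktracks, Google Analytics, IndexTools
tpc_pct = pct_of_average(48_990, [72_224, 66_866, 67_365])   # ~71.2%

# First-party cookie period
fpc_pct = pct_of_average(55_871, [71_076, 65_906, 67_117])   # ~82.1%

# Relative traffic growth with FPCs, as defined in the footnote to the table above:
relative_growth = 1 - (tpc_pct / fpc_pct)
print(f"TPC period: {tpc_pct:.2f}% of average")
print(f"FPC period: {fpc_pct:.2f}% of average")
print(f"Implied extra blocking/deletion with TPCs: {relative_growth:.2%}")  # ~13.3%
```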
Note that the page view numbers do not vary much, because the process of counting page views is not dependent on cookies, so it is irrelevant whether an FPC or a TPC is used.
comScore recently reported that more than 30% of cookies are deleted overall, and its data also seemed to show that the difference between TPC and FPC deletions is significantly smaller. There are, however, many concerns about the accuracy of these numbers given the methods comScore used to collect its data. In any event, our data above should provide a reasonable indication of how TPC deletions differ from FPC deletions.

Why Cookie Deletion Rates Matter

Cookie deletion rates are of great concern when evaluating web analytics. Every time a cookie is deleted it impacts the visitor and unique visitor counts of the tool. In particular, counting of unique visitors is significantly affected. If a user visits a site in the morning, deletes their cookies, and then visits again in the afternoon, this will show up as 2 different daily unique visitors in the totals for that day, when in fact one user made multiple visits and should be counted only as one unique visitor.
It should be noted that the packages use different methods for setting their cookies. For example, HBX Analytics requires you to set up a CNAME record in your DNS configuration file (note that DNS A records can also be used) to remap a sub-domain of your site to one of their servers.
While this requires someone who is familiar with configuring DNS records, it does provide some advantages. For example, simple first-party cookie implementations still pass data directly back to the servers of the analytics vendor, and memory-resident anti-spyware software can intercept and block these communications.
Using the CNAME record bypasses this problem, because all the memory-resident anti-spyware software sees is a communication with a sub-domain of your site; the redirection of the data stream to the HBX Analytics server happens at the DNS level.
Unica provides the option of either using a DNS A record based approach for first-party cookies or going with a simpler first-party cookie implementation. Note that an A record can be used to do the same thing as a CNAME record, with only some subtle differences.
Other analytics packages used in this test (Clicktracks, Google Analytics, and IndexTools) have chosen a simple first-party cookie approach for their initial configuration, which requires no special DNS setup and allows a less technical user to get up and running.
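As an illustration of the DNS-level approach described above, here is a rough Python sketch (standard library only) of how you might check that a tracking sub-domain has been remapped to a vendor's collection server; the domain names are hypothetical:

```python
import socket

# Hypothetical example: metrics.example.com is CNAMEd to a vendor collection host.
# gethostbyname_ex() returns (canonical_name, alias_list, ip_list); if the CNAME is
# in place, the canonical name resolves to the vendor's domain rather than your own.
tracking_host = "metrics.example.com"

canonical, aliases, addresses = socket.gethostbyname_ex(tracking_host)
print(f"{tracking_host} -> canonical: {canonical}, aliases: {aliases}")

if not canonical.endswith("example.com"):
    print("Sub-domain appears to be remapped to a third-party collection server.")
else:
    print("No remapping detected; data would be sent to your own domain.")
```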

Section 4: Visitors, Unique Visitors, and Page Views (aka “traffic numbers”)

For each participating site, we show two sets of results below. First is the set of numbers presented in the Interim report published in May of 2007. The second set of numbers is completely new traffic data for the same sites but over a different period of time. There was no overlap in the two time periods.
The goal with the second set of data is to determine if there were any major shifts in the data over time.

Notes

  1. The Uniques column is the summation of Daily Unique Visitors over a period of time. The resulting total is therefore not an actual unique visitor count for the time period (because some of the visitors may have visited the site multiple times, and have been counted as a Daily Unique Visitor for each visit).

This was done because not all of the packages readily permitted us to obtain Unique Visitor totals over an arbitrary period of time. For example, for some packages, it is not trivial to pull the 12-day Unique Visitor count.
Regardless, the Uniques data in the tables below remains a meaningful measurement of how the analytics packages compare in calculating Daily Unique Visitors (a short illustrative sketch follows these notes).

  2. The time period is not being disclosed, in order to obscure the actual daily traffic numbers of the participating sites. In addition, the time period used for each site differed.
  3. One factor that we examined in detail was the effect of JavaScript order on the results. The details of this will be discussed in a later section of this report, but you can see a table of the placement of the JavaScript for each of the sites in Appendix A.
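To illustrate note 1 above, here is a small hypothetical sketch of why summing Daily Unique Visitors overstates the true unique visitor count for a period:

```python
# Hypothetical example: three users visiting over three days.
# Each (user, day) pair counts as one Daily Unique Visitor.
visits = [
    ("alice", "day1"), ("alice", "day2"), ("alice", "day3"),  # returns every day
    ("bob",   "day1"),
    ("carol", "day2"), ("carol", "day3"),
]

daily_uniques = len(set(visits))                      # 6 Daily Unique Visitors summed
true_uniques = len({user for user, _ in visits})      # 3 actual unique visitors

print(f"Sum of Daily Unique Visitors: {daily_uniques}")
print(f"True unique visitors for the period: {true_uniques}")
```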

Traffic Data

1. City Town Info Table 1. The following data is the summary visitor, unique visitor, and page view data for CityTownInfo.com that was presented in the Interim Report:

CityTownInfo.com Analytics Data – Interim Report Data Visitors Uniques Page Views
Clicktracks 645,380 587,658 1,042,604
Google Analytics 600,545 583,199 1,038,995
IndexTools 614,600 595,163 1,099,786
Unica Affinium NetInsight 607,475 593,871 1,027,445
WebSideStory HBX Analytics 524,055 510,882 910,809
Average 598,411 574,155 1,023,928
Clicktracks % 107.85% 102.35% 101.82%
Google Analytics % 100.36% 101.58% 101.47%
IndexTools % 102.71% 103.66% 107.41%
Unica Affinium NetInsight % 101.51% 103.43% 100.34%
WebSideStory HBX Analytics% 87.57% 88.98% 88.95%
Standard Deviation 40209 31930 61868
Clicktracks Std Deviations 1.17 0.42 0.30
Google Analytics Std Deviations 0.05 0.28 0.24
IndexTools Std Deviations 0.40 0.66 1.23
Unica Affinium NetInsight Std Deviations 0.23 0.62 0.06
WebSideStory HBX Analytics Std Deviations -1.85 -1.98 -1.83

2. City Town Info Table 2. The following data is the summary visitor, unique visitor, and page view data for CityTownInfo.com that was recorded for the Final Report:

CityTownInfo.com Analytics Data – Final Report Data Visitors Uniques Page Views
Clicktracks 663,803 609,511 1,071,589
Google Analytics 603,619 586,580 1,045,327
IndexTools 638,602 618,376 1,138,659
Unica Net Insight 627,072 614,512 1,062,493
Visual Sciences HBX Analytics 525,038 513,020 922,692
Average 611,627 588,400 1,048,152
Clicktracks % 108.53% 103.59% 102.24%
Google Analytics % 98.69% 99.69% 99.73%
IndexTools % 104.41% 105.09% 108.63%
Unica Net Insight % 102.53% 104.44% 101.37%
Visual Sciences HBX Analytics% 85.84% 87.19% 88.03%
Standard Deviation 47435 39272 70278
Clicktracks Std Deviations 1.3 0.66 0.38
Google Analytics Std Deviations -0.2 -0.06 -0.05
IndexTools Std Deviations 0.67 0.94 1.46
Unica Affinium Net Insight Std Deviations 0.38 0.82 0.23
Visual Sciences HBX Analytics Std Deviations -2.15 -2.36 -2.03

3. Home Portfolio Table 1: The following data is the summary visitor, unique visitor, and page view data for HomePortfolio.com that was presented in the Interim Report:

HomePortfolio.com Analytics Data – Interim Report Data Visitors Uniques Page Views
Google Analytics 754,446 707,358 7,209,828
IndexTools 731,218 686,518 7,078,720
WebSideStory HBX Analytics 701,895 662,411 6,439,982
WebTrends 804,012 778,280 7,483,154
Average 747,893 708,642 7,052,921
Google Analytics % 100.88% 99.82% 102.22%
IndexTools % 97.77% 96.88% 100.37%
WebSideStory HBX Analytics % 93.85% 93.48% 91.31%
WebTrends % 124.83% 127.53% 106.10%
Standard Deviation 37370 43237 382779
Google Analytics Std Deviations 0.18 -0.03 0.41
IndexTools Std Deviations -0.45 -0.51 0.07
WebSideStory HBX Analytics Std Deviations -1.23 -1.07 -1.60
WebTrends Std Deviations 1.50 1.61 1.12

4. Home Portfolio Table 2: The following data is the summary visitor, unique visitor, and page view data for HomePortfolio.com that was recorded for the Final Report. Note that Clicktracks was not present in the first phase, but was included in the second phase.

HomePortfolio.com Analytics Data – Final Report Data Visitors Uniques Page Views
Clicktracks 906,264 767,128 6,761,954
Google Analytics 800,608 756,164 7,055,278
IndexTools 780,043 734,082 6,794,242
Visual Sciences HBX Analytics 778,789 750,734 6,451,555
WebTrends 1,003,683 964,480 7,312,397
Average 853,877 794,518 6,875,085
Clicktracks % 106.14% 96.55% 98.35%
Google Analytics % 93.76% 95.17% 102.62%
IndexTools % 91.35% 92.39% 98.82%
Visual Sciences HBX Analytics % 91.21% 94.49% 93.84%
WebTrends % 117.54% 121.39% 106.36%
Standard Deviation 88446 85648 290662
Clicktracks Std Deviations 0.59 -0.32 -0.39
Google Analytics Std Deviations -0.6 -0.45 0.62
IndexTools Std Deviations -0.83 -0.71 -0.28
Visual Sciences HBX Analytics Std Deviations -0.85 -0.51 -1.46
WebTrends Std Deviations 1.69 1.98 1.5

5. Tool Parts Direct Table 1: The following data is the summary visitor, unique visitor, and page view data for ToolPartsDirect.com that was presented in the Interim Report:

ToolPartsDirect.com Analytics Data – Interim Report Data Visitors Uniques Page Views
Clicktracks 129,900 91,879 639,892
Google Analytics 159,955 103,260 939,373
IndexTools 108,486 92,070 687,544
WebSideStory HBX Analytics 103,724 91,847 582,887
Average 125,516 94,764 712,424
Clicktracks % 103.49% 96.96% 89.82%
Google Analytics % 127.44% 108.97% 131.86%
IndexTools % 86.43% 97.16% 96.51%
WebSideStory HBX Analytics % 82.64% 96.92% 81.82%
Standard Deviation 22193 4906 136167
Clicktracks Std Deviations 0.20 -0.59 -0.53
Google Analytics Std Deviations 1.55 1.73 1.67
IndexTools Std Deviations -0.77 -0.55 -0.18
WebSideStory HBX Analytics Std Deviations -0.98 -0.59 -0.95

6. Tool Parts Direct Table 2: The following data is the summary visitor, unique visitor, and page view data for ToolPartsDirect.com that was recorded for the Final Report:

ToolPartsDirect.com Analytics Data – Final Report Data Visitors Uniques Page Views
Clicktracks 318,189 222,270 1,568,546
Google Analytics 399,784 249,788 2,262,553
IndexTools 261,691 222,248 1,653,576
Visual Sciences HBX Analytics 249,067 220,813 1,417,426
Average 307,183 228,780 1,725,525
Clicktracks % 103.58% 97.15% 90.90%
Google Analytics % 130.15% 109.18% 131.12%
IndexTools % 85.19% 97.14% 95.83%
Visual Sciences HBX Analytics % 81.08% 96.52% 82.14%
Standard Deviation 59462 12144 321381
Clicktracks Std Deviations 0.19 -0.54 -0.49
Google Analytics Std Deviations 1.56 1.73 1.67
IndexTools Std Deviations -0.77 -0.54 -0.22
Visual Sciences HBX Analytics Std Deviations -0.98 -0.66 -0.96

7. AdvancedMD Table 1: The following data is the summary visitor, unique visitor, and page view data for AdvancedMD.com that was presented in the Interim Report:

AdvancedMD.com Analytics Data – Interim Report Data Visitors Uniques Page Views
Clicktracks 155,396 63,339 234,930
Google Analytics 148,665 63,554 231,511
IndexTools 116,757 52,949 225,859
Omniture SiteCatalyst 110,211 64,016 237,108
Unica Affinium Net Insight 101,419 57,739 196,277
WebSideStory HBX Analytics 110,824 63,156 222,732
Average 123,878 60,792 224,736
Clicktracks % 125.44% 104.19% 104.54%
Google Analytics % 120.01% 104.54% 103.01%
IndexTools % 94.25% 87.10% 100.50%
Omniture Site Catalyst % 88.97% 105.30% 105.51%
Unica Affinium Net Insight % 81.87% 94.98% 87.34%
WebSideStory HBX Analytics % 89.46% 103.89% 99.11%
Standard Deviation 20494 4101 13651
Clicktracks Std Deviations 1.54 0.62 0.75
Google Analytics Std Deviations 1.21 0.67 0.50
IndexTools Std Deviations -0.35 -1.91 0.08
Omniture SiteCatalyst Std Deviations -0.67 0.79 0.91
Unica Affinium Net Insight Std Deviations -1.10 -0.74 -2.08
WebSideStory HBX Analytics Std Deviations -0.64 0.58 -0.15

8. AdvancedMD Table 2: The following data is the summary visitor, unique visitor, and page view data for AdvancedMD.com that was recorded for the Final Report:

AdvancedMD.com Analytics Data – Final Report Data Visitors Uniques Page Views
Clicktracks 1,398,365 600,855 2,039,587
Google Analytics 1,345,801 603,627 2,012,420
IndexTools 1,067,819 489,605 1,960,184
Omniture SiteCatalyst 1,016,563 605,550 2,094,566
Unica Net Insight 944,008 540,424 1,717,584
Visual Sciences HBX Analytics 1,023,003 594,677 1,920,104
Average 1,132,593 572,456 1,957,408
Clicktracks % 123.47% 104.96% 104.20%
Google Analytics % 118.82% 105.45% 102.81%
IndexTools % 94.28% 85.53% 100.14%
Omniture Site Catalyst % 89.76% 105.78% 107.01%
Unica Net Insight % 83.35% 94.40% 87.75%
Visual Sciences HBX Analytics % 90.32% 103.88% 98.09%
Standard Deviation 173842 43316 120766
Clicktracks Std Deviations 1.53 0.66 0.68
Google Analytics Std Deviations 1.23 0.72 0.46
IndexTools Std Deviations -0.37 -1.91 0.02
Omniture SiteCatalyst Std Deviations -0.67 0.76 1.14
Unica Affinium Net Insight Std Deviations -1.08 -0.74 -1.99
Visual Sciences HBX Analytics Std Deviations -0.63 0.51 -0.31

Initial Observations

There were significant differences in the traffic numbers revealed by the packages. While we might be inclined to think that this is a purely mechanical counting process, it is, in fact, a very complex process.
There are dozens (possibly more) of implementation decisions made in putting together an analytics package that affect the method of counting used by each package. The discussion we provided above about different types of first-party cookie implementation is just one example.
Another example is the method used by analytics packages to track user sessions. It turns out that this is done somewhat differently by each package. You can see more details on what these differences are in Appendix B.
Other examples include: whether the configuration of the package is done primarily in the JavaScript or in the UI, and how a unique visitor is defined (e.g., is a daily unique visitor counted over a rolling 24-hour window, or for a specific calendar day?).
If we look at the standard deviations in the above data, the distribution appears to be pretty normal. Note that for a normal distribution, 68% of scores should be within 1 standard deviation, and 95% of the scores should be within 2 standard deviations. In our data above, this indeed appears to be holding roughly true.
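For readers who want to reproduce the "% of average" and "Std Deviations" rows in the tables above, here is a minimal sketch of the calculation, shown with the CityTownInfo.com interim visitor counts; it assumes the population standard deviation (dividing by N), which matches the published value of 40,209:

```python
from math import sqrt

# CityTownInfo.com interim visitor counts from Table 1 above.
visitors = {
    "Clicktracks": 645_380,
    "Google Analytics": 600_545,
    "IndexTools": 614_600,
    "Unica Affinium NetInsight": 607_475,
    "WebSideStory HBX Analytics": 524_055,
}

average = sum(visitors.values()) / len(visitors)
# Population standard deviation (divide by N).
std_dev = sqrt(sum((v - average) ** 2 for v in visitors.values()) / len(visitors))

print(f"Average: {average:,.0f}   Std deviation: {std_dev:,.0f}")
for name, count in visitors.items():
    pct = 100.0 * count / average               # "% of average" column
    z = (count - average) / std_dev             # "Std Deviations" column
    print(f"{name:<28} {pct:6.2f}%  {z:+.2f}")
```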

Charting the Data

The following three charts provide a graphical representation of the tables above. In order to give them more meaning, we have normalized the data to the same scale.
Here is a summary of the visitor data in a chart:
Visitor Data Chart
Here is a summary of the raw unique visitor data in a chart:
Chart Showing Unique Visitor Data
Here is a summary of the raw page view data in a chart:
Page View Data Chart

  1. While HBX Analytics tended to report the lowest numbers of all the packages, this was not always the case. For example, on AdvancedMD.com, HBX was higher than 2 packages for visitors and unique visitors. In particular, note the scenario labeled “CTI2” (City Town Info, Scenario 2), which corresponds to the time when the JavaScript order was changed on CTI. HBX Analytics became the first JavaScript in the HTML after the change, and the HBX results were on the higher side after the change.
  2. Google Analytics appears to count significantly higher than any of the other vendors on Tool Parts Direct (TPD). However, on TPD, the Google Analytics code is present in the HTML header, and all the other vendors are placed immediately before the closing body tag at the bottom of the HTML for the page.

We measured the average time between the completed execution of Google Analytics, and the completed execution of IndexTools (the next analytics package to execute), and that time delay was about 3.3 seconds (Google was finished at 0.7 seconds after the page began loading, and IndexTools was finished at around 4 seconds).
In another test, for which the results are shown in Section 6, we showed that an execution delay of 1.4 seconds would result in a loss of 2% to 4% of the data. It is our theory that as the delay in execution grows, the amount of lost data increases.
The loss occurs because the user sees the link they want, and clicks on it before the analytics software ever executes. On TPD, because the Google Analytics JavaScript is in the header, it always executes before the page is displayed. IndexTools was not finished for another 3.3 seconds. It is reasonable to project that a significant number of users will have moved on by that point in time. In Section 6, we speculate that this number may be 12.2% of the users.

  3. Clicktracks reported the highest numbers on AdvancedMD.com and the second highest numbers on ToolPartsDirect.com. Our later analysis shows reasons why Clicktracks may tend to count quite a bit higher on PPC-driven sites (which is the case for AMD and TPD).

Clicktracks uses a shorter inactivity timeout for sessions (see Appendix B for more details on this), and will also treat any new PPC visit to a site as a new session; a simplified sketch of how such rules affect visit counts follows these observations. Clicktracks is more heavily optimized for the management of PPC campaigns than the other packages, and this is one of the results of that.

  4. On HomePortfolio.com, WebTrends reported significantly more visitors and unique visitors than the other vendors (about 20% more). This is the only site for which we were able to look at WebTrends numbers at this stage in the project.

Google Analytics reported the second highest numbers on this site.

  5. On CityTownInfo.com, the highest numbers were reported by IndexTools.
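To illustrate how sessionization rules like these can move visit counts on their own, here is a simplified Python sketch; the 30-minute timeout, the shorter 10-minute timeout, and the "new PPC click starts a new session" rule are illustrative assumptions, not the exact algorithms of any vendor (see Appendix B for the real details):

```python
# Simplified, hypothetical sessionization: counts visits from a list of
# (timestamp_in_minutes, came_from_ppc_click) page views for a single visitor.

def count_sessions(pageviews, timeout_minutes=30, new_session_on_ppc=False):
    sessions = 0
    last_time = None
    for timestamp, is_ppc_click in pageviews:
        timed_out = last_time is None or (timestamp - last_time) > timeout_minutes
        if timed_out or (new_session_on_ppc and is_ppc_click):
            sessions += 1            # start a new session
        last_time = timestamp
    return sessions

# One visitor: two PPC clicks 20 minutes apart, plus some browsing in between.
pageviews = [(0, True), (5, False), (20, True), (25, False), (60, False)]

print(count_sessions(pageviews, timeout_minutes=30))                           # 2 visits
print(count_sessions(pageviews, timeout_minutes=10))                           # 3 visits
print(count_sessions(pageviews, timeout_minutes=30, new_session_on_ppc=True))  # 3 visits
```

The same clickstream yields two or three visits depending only on the rules applied, which is exactly the kind of design decision that drives the differences in visitor counts discussed above.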

Content Group Data

  1. Here is the form completion and content group page view data for each of the analytics packages on CityTownInfo.com:
Form 1 Form 2 Form 3 Group 1 Views Group 2 Views Group 3 Views
Clicktracks 169 567 69 45,646 3,833 9,423
Google Analytics 172 543 59 59,638 4,695 12,255
IndexTools 177 616 68 67,166 4,891 14,461
Unica Affinium NetInsight 172 572 70 60,699 4,713 12,291
WebSideStory HBX Analytics 162 560 69 54,889 4,274 14,763
Average 170 572 67 57,608 4,481 12,639
Clicktracks % 99.18% 99.20% 102.99% 79.24% 85.54% 74.56%
Google Analytics % 100.94% 95.00% 88.06% 103.52% 104.77% 96.96%
IndexTools % 103.87% 107.77% 101.49% 116.59% 109.14% 114.42%
Unica Affinium NetInsight % 100.94% 100.07% 104.48% 105.37% 105.17% 97.25%
WebSideStory HBX Analytics % 95.07% 97.97% 102.99% 95.28% 95.38% 116.81%
  2. Here is the content group page view data for each of the analytics packages on HomePortfolio.com:
Group 1 Views Group 2 Views Group 3 Views Group 4 Views
Google Analytics 4,878,899 514,704 448,355 11,823
IndexTools 4,844,642 520,521 457,857 11,540
WebSideStory HBX Analytics 2,222,843 161,922 317,307 10,787
Average 3,982,128 399,049 407,840 11,383
Google Analytics % 122.52% 128.98% 109.93% 103.86%
IndexTools % 121.66% 130.44% 112.26% 101.38%
WebSideStory HBX Analytics % 55.82% 40.58% 77.80% 94.76%

Analysis and commentary on Content Group data

  1. Interestingly, this data has less variation across packages than the traffic data (we discuss the exception of HBX Analytics running on HomePortfolio.com below). This is largely because it is page view based, and page views are inherently easier to track accurately than visitors.

The reason for this is that page views are, generally speaking, easy to count, and there is less variance in the algorithms that the web analytics packages use. Basically, every time the JavaScript runs, the page view count is updated.
Tracking visitors and unique visitors is quite a bit more complicated. In Appendix B, we explain why in more detail, but basically, session tracking relies on cookies and post-processing to count visitors and unique visitors. There are a number of heuristics and basic implementation decisions that each vendor makes that have a major impact on the visitor and unique visitor totals.

  2. As an exception to this, the HBX Analytics content group data for HomePortfolio is quite a bit lower than that of the other packages. However, we discovered that this is due to an implementation error by our team.

Note that this is not a reflection of the difficulty in implementing HBX Analytics. Instead, it’s a reflection of how important it is to understand exactly what you want the analytics software to do, to specify it accurately, and then to double-check that you are measuring what you think you are measuring.
In this case, we set up HBX Analytics to track people who initially entered at pages in the content group, rather than tracking all the page views for the content group, which is what we wanted.
There is a key lesson in this. Implementation of an analytics package requires substantial forethought and planning. And, when you are done with that, you have to check and recheck your results, to make sure they make sense. Here is a summary of some of the issues you face in setting up your implementation correctly:

  1. Tagging errors – an error in tagging (placing JavaScript on) your pages can really throw you for a loop. These errors are easy to make: tagging pages is basically a programming task, you need to remember to tag every page, and the job gets considerably harder as you begin customizing the JavaScript. You need to do a comprehensive job of setting the software up for success.
  2. Understanding the terminology – each package uses terms in different ways, and it’s important to understand them.
  3. Learning the software, and how it does things – each software package has its own way of doing things.
  4. Learning your requirements – this is a process all by itself. If you are implementing analytics for the first time it may be many months before you truly understand how to use it most effectively on your site.
  5. Learning the requirements of others in your organization – these are not necessarily the same as your personal requirements. For example, your CEO may need one set of information, your VP of Sales something else, and your business analyst something else entirely.
  6. Validating the data – even if you are not running more than one analytics package, you need to have a method of testing the quality of your data and making sure it makes sense.

One way to reduce many of these risks is to install multiple analytics packages. We often put Google Analytics on sites, even if they already have an analytics package on them. This is not to say that Google Analytics is the gold standard. With this approach, however, if you spot substantial differences (30% or more, for example) between the two packages, that would provide you a visible clue that something may have gone wrong in your tagging or setup!
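As a rough sketch of that kind of sanity check, the comparison might look something like this; the daily numbers, date keys, and tool names are hypothetical, and the 30% threshold echoes the figure mentioned above:

```python
# Hypothetical daily visitor counts exported from two packages running on the same site.
primary_tool = {"2007-05-01": 21_400, "2007-05-02": 22_150, "2007-05-03": 9_800}
google_analytics = {"2007-05-01": 20_100, "2007-05-02": 21_700, "2007-05-03": 19_450}

THRESHOLD = 0.30  # flag days where the packages disagree by 30% or more

for day in sorted(primary_tool):
    a, b = primary_tool[day], google_analytics[day]
    difference = abs(a - b) / max(a, b)
    flag = "  <-- check tagging/setup" if difference >= THRESHOLD else ""
    print(f"{day}: primary={a:,} GA={b:,} diff={difference:.0%}{flag}")
```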

Section 5: Why Accuracy Matters

As Jim Sterne is fond of saying, if your yardstick measures 39 inches instead of 36 inches, it’s still great to have a measurement tool. The 39-inch yardstick will still help you measure changes with a great deal of accuracy. So if tomorrow your 39-inch yardstick tells you that you are at 1 yard and 1 inch (i.e., 40 inches), you know you have made some progress.
Having explained the value of a 39-inch yardstick, it is worthwhile to take a moment and consider the value of accuracy in analytics. To evaluate how far apart our yardsticks are getting, we looked a bit further at our data to see, for each site, how large the difference was between the package reporting the most traffic and the package reporting the least:
Max Differential Per Site – Visitors

AMD 153.22% Clicktracks / Unica Net Insight
TPD 154.21% Google Analytics / HBX Analytics
HP 114.55% WebTrends / HBX Analytics
CTI 123.15% Clicktracks / HBX Analytics

Max Differential Per Site – Unique Visitors

AMD 120.90% Omniture / IndexTools
TPD 112.42% Google Analytics / HBX Analytics
HP 136.43% WebTrends / HBX Analytics
CTI 116.50% IndexTools / HBX Analytics

Max Differential Per Site – Page Views

AMD 120.80% Omniture / Unica Affinium NetInsight
TPD 161.15% Google Analytics / HBX Analytics
HP 116.20% WebTrends / HBX Analytics
CTI 120.75% IndexTools / HBX Analytics

As you can see, the differences in the above data between the low counting software and the highest counting software are substantial.
Given the notion of a 39-inch yardstick, how much does this matter? Actually, in some situations, it matters a lot. Here are a few example scenarios I have heard about recently:

  1. Company A acquires company B’s web site, and one of the key metrics discussed during the acquisition is the traffic level to the site. One reason that traffic may be a key metric, for example, is that you may know that you have an ability to achieve a certain amount of revenue per visitor, based on the way your analytics package counts visitors.

But if the site you just acquired is running a different analytics package that reports 50% more traffic on the acquired site than your analytics package does, you are going to be extremely unhappy once you set up your analytics package on the acquired site and see the “real numbers”.
This is a clear scenario where you need to calibrate your analytics. Ideally, you should get your analytics software installed on the site to be acquired prior to finalizing the acquisition, so you can see the traffic numbers in real terms that you are familiar with.
A backup plan would be to take one of the free packages, such as Google Analytics or Clicktracks Appetizer, and place it on both the site to be acquired and your own site, so you can get a clear reference point on the traffic.

  2. Company A has been running one analytics software package for a long time but decides to switch to another one. Perhaps there is a limitation in the first package causing them to make the switch.

They get it running, and they find the traffic numbers vary wildly by category of data. In some cases, the discrepancy is quite large. Now management has lost all confidence in the analytics data they are dealing with. The team that has done the implementation is in all kinds of hot water.
Not having any confidence in the metrics for your business is a small-scale disaster in its own right. Consider this: if you are a senior manager in the business and you don’t believe the numbers coming from your analytics software, wouldn’t you consider not spending any more money on it?

  3. Company A is running a PPC campaign. They know from other tracking mechanisms that they have in place (such as a parameter on the URL) that they are getting a 27% margin on their PPC campaigns. Now they want to use their analytics solution to give them the insight to further optimize and improve their campaign.

The problem occurs when they start seeing a different set of results from their analytics data. This causes them to lose confidence in the data that they are looking at, and therefore they may choose not to proceed with using the analytics software to help tune their PPC campaigns.

  4. Company A is running a PPC campaign. They are comparing their incoming click data reports from the search engine with the data they see in their analytics, and they don’t match up. They are wondering if the search engine is ripping them off.
  5. Company A is selling impression based advertising to Company B. They are using Company A’s web analytics software to measure the number of impressions generated. Both companies want to make sure that the count is accurate.

Accuracy Summary

The stories above are not uncommon. However, analytics solutions can be extremely effective in helping you tune your web site. The first thing you should know is that, when using a JavaScript-based solution, the largest source of error in web analytics is often an implementation problem.
During the Shoot Out, we went to great pains to make sure that we had correct implementations for all the tools, and the analytics vendors helped us with this. But there are many different sources of error. In our test, some of these errors would affect all the analytics packages tested equally. For example, a user who uses multiple computers would likely be seen as multiple users by all the packages.
Here is a summary of some factors that would potentially cause the analytics packages we tested to report different results:

  1. Placement of the JavaScript on the page – as we will see in Section 6, this does affect counting significantly.
  2. Session tracking timeouts algorithm used (see Appendix B for more on this).
  3. Other factors that drive the initiation of new sessions, such as beginning a new session on any new visit from a PPC search engine (see Appendix B for more on this).
  4. Aggressiveness with which questionable sessions are discarded (see Appendix B for more on this).
  5. Cookie blocking – when cookies are blocked, some packages fall back on a combination of pixel tracking and/or IP address and user-agent detection to still count those visitors, while others do not. In addition, there are multiple ways for them to implement this fallback.
  6. Spyware blocking communications with the analytics server. This will not affect implementations where first-party cookies are set up at the DNS level.
  7. Analytics server downtime (rare).
  8. Network problems preventing communication with the analytics server.
  9. Analytics servers being blocked by firewalls (e.g. a corporate firewall).

Here is a summary of some factors that would affect the results of the analytics packages we tested equally:

  1. Multiple users on one computer will be treated as a single user.
  2. One user who uses multiple computers will be counted as multiple users.
  3. JavaScript is disabled on the user’s browser.

In rough terms, for two of our sites, our data showed that the highest count was about 20% above the average of all the packages, and the lowest showed data about 20% below the average of all the packages.
This variance is largely attributable to design and implementation decisions made by the software development teams that created each package, resulting in greater or lesser accuracy (but there is no way to know which one was most accurate).

What to do about it

Now we get back to our 39-inch yardstick. Perhaps based on our data we should be referring to this as a 43-inch yardstick (120% of 1 yard). Should we be alarmed at this level of variance in the results? Not really, but it is important to understand that these sources of error exist, and it’s important to understand how to deal with them.
First of all, some of the largest sources of error, such as those that relate to session management and counting, do cause a variance in the traffic results returned by the packages, but they do not affect the ability of the program to monitor the key performance indicators (KPIs) for your site. For example, one large potential source of error is the aggressiveness with which questionable sessions are filtered out.
An example of a potentially questionable session is a visit from a new IP address with an unknown user agent, that views only 1 page, and that has no referrer. One package might throw this out, and another might leave it in. This type of decision-making could have a large effect on traffic counts, but the visitors we are talking about are untraceable.
The analytics packages that are throwing this data out have made the decision that the data is not useful or relevant. There is a chance that they are wrong in some cases. However, even if they do throw out some relevant data, the analytics package is measuring the behavior of the great majority of your users.
Even if an analytics package is measuring the behavior of only 80% of your users, it remains highly relevant and valuable data. By contrast, the traditional print industry relies on subscriber surveys and feels lucky to get a 20% response rate. They would die for data on 80% of their customers.
The fact is that some percentage of the questionable data is bad, and some of it may actually relate to a real user. The package that throws out too little gets skewed in one direction, and the package that throws out too much gets skewed a little bit in the other direction.
Neither of these changes the ability of these packages to measure trends, or to help you do sophisticated analysis about what users are doing on your site.
Here are some suggestions on what you can do to deal with the natural variances in analytics measurement techniques:

  1. Realize that there are variances and errors, and get comfortable with the fact that the tools all provide very accurate relative data measurement. As we said before, if your 43-inch yardstick tells you that page A is converting better than page B, or that visitors from Europe buy more blue widgets than visitors from North America, that is solid and dependable information.

Similarly, if your 29-inch yardstick told you that you had 500,000 unique visitors two months ago, and tells you that you received 600,000 unique visitors last month, you can feel comfortable that your business grew by approximately 20%.

  2. Don’t get hung up on the basic traffic numbers. The true power of web analytics comes into play when you begin doing A/B testing, multivariate testing, visitor segmentation, search engine marketing performance tracking and tuning, search engine optimization, etc.
  3. Calibrate whenever you can. For example, if you have a PPC campaign, use some other mechanism to see how your results compare at a global level. This other mechanism will help you cross-check the accuracy of your analytics data, and help ferret out any implementation errors.

Note that the analytics package will be able to do many other extremely valuable things that other tracking mechanisms can’t, such as matching up conversions with landing pages, navigation paths, search terms, and search engines.
Or, using the acquisition example we talked about above, use a common analytics package on two different sites to get a better idea of how the data from the two sites compares. For example, part of the due diligence process could be installing Google Analytics on both your site and the site you are looking at acquiring, and then comparing the numbers from Google Analytics side by side.
Comparing two sites using the same analytics tool will remove the largest source of error beyond your control, namely, the specific design and implementation decisions made in building the tool.

  4. Realize that the biggest sources of error are JavaScript implementation errors. This could be as simple as pages that are missing the JavaScript, pages with malformed JavaScript, or problems that crop up as pages get added to the web site, moved, or removed from the web site.

This is an error completely within your control, and one that is potentially more devastating than any variance in the counting techniques used by the packages.
Note that we do not believe that this affected our report, except where noted, because we had the active help of the analytics vendors in setting up their JavaScript and making sure that we were error-free.
In addition, the JavaScript was installed using include files, so any failure to place one vendor’s tag on a given page would result in all the vendors not being on that page.
This is the end of Part 1 of the Final Report of the 2007 Web Analytics Shoot Out. You can see Part 2 of the Report here. Part 2 includes an analysis of JavaScript placement and how it matters, as well as detailed qualitative reviews of 5 of the analytics packages.


Eric Enge

Eric Enge is part of the Digital Marketing practice at Perficient. He designs studies and produces industry-related research to help prove, debunk, or evolve assumptions about digital marketing practices and their value. Eric is a writer, blogger, researcher, teacher, and keynote speaker and panelist at major industry conferences. Partnering with several other experts, Eric served as the lead author of The Art of SEO.
