
Logfile analysis

Mach5 Analyzer report

ABOVE: A logfile analyzer's monthly report, showing traffic after subtraction of visits by search-engine crawlers.

by Durant Imboden, Europe for Visitors

In the early days of the Web, traffic measurement was nearly always based on server logs, which are the usage files that a Web server generates automatically each night. "Logfile analysis" is still around, despite competition from newer Web-based "client-side" tools such as Google Analytics. The screen shot above shows a typical month's report using Mach5 FastStats Analyzer, which turns our Web server's raw usage data into usable statistics.

Logfile analysis has one major advantage over Google Analytics and similar tools:

  • You don't need to add code to your pages. You simply download your server logs and process them with a program like Mach5 FastStats Analyzer or Sawmill. Or, if your Web host provides Web statistics with Analog, Webalizer, AWStats, or another server-based application, you can view your statistics online.

However, logfile analysis also has some important disadvantages, especially if your hosting service doesn't provide online traffic reports:

  • Server logs can be huge. (Even with gzip compression, our daily logs average 8 to 9 MB, or 85 to 110 MB after they've been decompressed by our logfile analyzer.) If you want to save logfiles offline for more than a few months, you'll need a massive hard drive.
  • Once you have a significant amount of traffic, you'll need a powerful PC with lots of RAM to analyze downloaded logfiles. Even then, processing a month's worth of files can take a while.
  • Logfile analyzers vary greatly in their "look and feel"--and in how quickly and reliably they process data. Also, some programs (Sawmill, for example) can be confusing for non-technical users. If possible, download trial versions before you buy. 
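
If you're comfortable with a little scripting, one way to ease the storage problem is to read gzip-compressed logs directly instead of unpacking them to disk first. The Python sketch below is only an illustration, not how Mach5 FastStats Analyzer or Sawmill works internally. It assumes that your server writes the common Apache-style log format, with a bracketed timestamp such as [10/Oct/2008:13:55:36 -0700], and the function name is our own invention.

```python
import gzip
from collections import Counter


def count_requests_per_day(path):
    """Tally requests by date from a gzip-compressed log, one line at a time."""
    per_day = Counter()
    # gzip.open in text mode decompresses on the fly, so the full
    # uncompressed log never has to be written to the hard drive.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as log:
        for line in log:
            # Assumed timestamp layout: [10/Oct/2008:13:55:36 -0700]
            start = line.find("[")
            end = line.find(":", start)
            if start != -1 and end != -1:
                per_day[line[start + 1:end]] += 1
    return per_day
```

Calling count_requests_per_day("access.log.gz") (a hypothetical filename) returns a counter keyed by date. Because the file is read line by line, memory use stays small even when the decompressed log would be very large.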

To use a logfile analyzer effectively, you need to understand its reports. For example:

  • "Hit counts" are useless in terms of measuring human traffic, because every text and image file on a page generates a separate "hit." (If you brag about your site's "hits" without defining those hits, you'll brand yourself as a newbie.)
  • Logfile analyzers may count people who view images as "visitors," even if they're merely viewing pictures in Google Image Search or seeing photos that people have inserted into forum messages via links to your image files. That can help to inflate your traffic statistics, but it won't give you--or advertisers and PR people--a true understanding of your readership.
  • Logfiles include visits from search-engine crawlers and other "bots," so you'll want to filter those visits out before calculating unique visitors and page views. Here are examples of two reports, one with crawler visits and the other without:

Statistics without search-crawler visits  
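
To make the difference between hits, page views, and unique visitors concrete, here is a rough Python sketch of the kind of filtering described above. It is not the method used by any particular analyzer. It assumes Apache's "combined" log format, treats a short, hypothetical list of user-agent keywords as a sign of a crawler, treats a handful of file extensions as images or other page components, and uses IP addresses as a crude stand-in for unique visitors.

```python
import re

# Assumed layout: Apache "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative lists only; real analyzers use far longer ones.
BOT_MARKERS = ("bot", "crawler", "spider", "slurp")
ASSET_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def summarize(lines):
    """Count raw hits, then page views and visitors with bots and assets removed."""
    hits = 0
    page_views = 0
    visitors = set()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that don't fit the assumed format
        hits += 1  # every logged request is a "hit"
        agent = match["agent"].lower()
        if any(marker in agent for marker in BOT_MARKERS):
            continue  # search-engine crawler or other bot
        path = match["path"].split("?", 1)[0].lower()
        if path.endswith(ASSET_SUFFIXES) or match["status"] != "200":
            continue  # image, stylesheet, script, or failed request
        page_views += 1
        visitors.add(match["ip"])  # crude: one IP address = one visitor
    return {"hits": hits, "page_views": page_views, "unique_visitors": len(visitors)}
```

In this sketch, a request for an image never counts as a page view, so someone who sees one of your photos in a forum post adds to the hit count but not to the visitor total. Counting by IP address is a simplification: several people can share one address, and one person can appear under several.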

My advice: 

Add Google Analytics to your site and don't worry about analyzing server logs. Logfile analysis has its place, but for most do-it-yourself Web publishers, it's an unneeded hassle.


