A frequent problem I see in evaluating web site traffic is that some site pages are missing Google Analytics code, preventing them from being tracked. Very often, these are form submission “thank you” pages (which are ideal Goal Pages in Analytics!) or other pages that have “funny” templates relative to the rest of a site.

The procedure I use to track these down works like this:

  1. Spider the site to capture all files, ignoring images:
    wget --recursive --random-wait -w 1 --force-html -R gif,png,jpg,pdf,css,js,mov http://www.yoursite.com/
  2. Look for pages that DON’T have Analytics, again ignoring image files, css, etc.:
    find . -type f -print0 | xargs -0 grep -c UA-12341234 | grep -Ev 'png|gif|jpg|images|pdf|css|robots\.txt|\.js' | grep ':0$'

Works like a champ to ferret out weird pages. You can of course use the same approach to grep for other strings that need to be on all site pages.
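If you'd rather skip the chain of grep -v filters, here is a minimal sketch of the same check using grep -L, which prints only the names of files that contain no match. The function name find_missing is my own invention for illustration, and it assumes a grep that supports -L (GNU and BSD grep both do). UA-12341234 is still just a placeholder ID.

```shell
#!/bin/sh
# find_missing is a hypothetical helper name, not part of any tool.
# Usage: find_missing STRING [DIRECTORY]
# Prints every file under DIRECTORY (default: .) that does NOT contain STRING,
# skipping the same kinds of non-page files filtered out in the commands above.
find_missing() {
    needle=$1
    dir=${2:-.}
    # -L lists files with no match; -F treats the needle as a literal string.
    find "$dir" -type f \
        ! -path '*/images/*' \
        ! -name '*.png' ! -name '*.gif' ! -name '*.jpg' \
        ! -name '*.pdf' ! -name '*.css' ! -name '*.js' \
        ! -name 'robots.txt' \
        -exec grep -L -F -e "$needle" {} +
}
```

After spidering, something like find_missing UA-12341234 www.yoursite.com should list the untracked pages, since wget saves a recursive download under a directory named for the host. Swap in any other string that needs to appear on every page.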


I was emailing with scrum expert and all-around great guy Dan Greening and thought this might be useful for others as well.

When considering search engine optimization, remember to focus first on goals, then traffic, then phrases, then rank. Ranking for irrelevant phrases won’t get you more or better leads.

Early on, be sure to set up goal/conversion tracking in Analytics. Most sites have several goals:

  1. Site visit duration. Long visits (1 minute longer than average?) = goal worth perhaps $5?
  2. Site visit pages. Many pages (1 or 2 more than average?) = goal worth another $5?
  3. Newsletter subscribes. $20 value?
  4. Contact Us submissions. These are probably worth $100?

Here’s a related post if you’re having trouble setting goal values.

Once you have goal and conversion data, you can trace it back to search phrases and find out which ones work best for you. This is a bit of a rosy outlook, since many sites won’t have enough high-value conversions for the results to be statistically significant, which is why I emphasize soft goals like time-on-site and visit-depth.

Folks who spend a lot of time on your site or look at lots of your pages are pretty likely to subscribe to your newsletter or submit your contact form later on, so look at the phrases driving that kind of traffic and put your SEO time and money into those.
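To make the phrase comparison concrete, here is a rough sketch of the arithmetic. It assumes you have already put your numbers into a CSV with a header row and the columns phrase,goal,completions. That layout, the goal names, and the function name phrase_value are all made up for illustration and are not an Analytics export format. The dollar values are the example guesses from the list above.

```shell
#!/bin/sh
# phrase_value is a hypothetical helper. The CSV layout (phrase,goal,completions)
# and the goal names are assumptions for this sketch.
# Usage: phrase_value FILE.csv
# Prints "total<TAB>phrase" lines, highest total value first.
phrase_value() {
    awk -F, '
        BEGIN {
            # Example goal values from the list above; adjust to your own site.
            value["long_visit"] = 5
            value["deep_visit"] = 5
            value["newsletter"] = 20
            value["contact"]    = 100
        }
        # Skip the header row and any goal name we have no value for.
        NR > 1 && ($2 in value) { total[$1] += value[$2] * $3 }
        END { for (p in total) printf "%d\t%s\n", total[p], p }
    ' "$1" | sort -rn
}
```

A phrase with 10 long visits and 1 contact form submission would total 10 × $5 + 1 × $100 = $150, which puts it ahead of a phrase with 2 newsletter signups at $40. With low conversion counts, treat the ranking as a hint rather than proof.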