Google Analytics Is Great, But The Data Can Be Wonky
Google Analytics is great! You get amazing data that can help you maximize the ROI on your website, reduce your marketing spend, and improve your customer service experience on your website. However, like all things related to the web, you will encounter SPAM. Yup, no matter where you go, them darn interwebs are gonna SPAM ya. Additionally, your data can be subject to reporting inconsistencies as a result of how people type your URLs, and even how your web servers are configured.
Google Analytics Filters Can Save Your Bacon
The good news is that there are some solutions you can deploy in Google Analytics to help eliminate the SPAM and ghost referral traffic you will see showing up in your reports. In this post, I will share with you the standard filters I implement for my clients and why I implement them.
Include Only Hostname
Why do I want to only include traffic from the hostname?
SPAM and ghost referral traffic will show up in your analytics reports and throw off your numbers. In short, SPAM and ghost referral traffic come from blackhat digital marketers and various bots that hit your site, either by actually visiting it or by sending hits straight to the server. Either way, they can throw off the accuracy of your reports. In some cases they can account for as much as 5% of your traffic. They are relatively harmless and don't mean that your site has been hacked, but you should filter them out nevertheless. "Why should I filter them out if they are only 5% of my traffic? Isn't that more trouble than it's worth?" Well, consider this example. If you are considering an AdWords PPC campaign and using your current site traffic to estimate your current conversion rate, you are at risk of overbidding on your AdWords ad groups and thus reducing your ROI on a PPC campaign. In my opinion, it's worth setting up the filter.
How To Configure an Include Only Hostname Filter
This is a pretty straightforward filter, but it will require you to use a regular expression. Rather than ask you to puzzle this out, here is an example that will fit websites whether or not they use www. as part of their domain.
IMPORTANT: My domain uses .net, so make sure you use .com, .org, .edu, or whatever top-level domain your domain uses.
Check out this screenshot to see this filter in action.
While this filter's recipe can be found in multiple places, I want to give credit to the Lunametrics Blog for sharing this little nugget of wisdom.
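If you want to sanity-check a hostname pattern before pasting it into the filter, here is a minimal Python sketch of how an include-only rule behaves. The pattern below is my own assumption of what a typical hostname expression looks like (it is not necessarily the exact recipe credited above), and the sample hostnames are made up for illustration.

```python
import re

# Assumed include pattern: the bare domain, with or without a leading "www.".
# Swap ".net" for your own top-level domain.
HOSTNAME_PATTERN = re.compile(r"^(www\.)?argyleanalytics\.net$")


def is_valid_hostname(hostname: str) -> bool:
    """Return True if a hit's hostname would pass the include filter."""
    return HOSTNAME_PATTERN.match(hostname) is not None


# Hypothetical hostnames as they might appear in a Hostname report.
hits = [
    "argyleanalytics.net",
    "www.argyleanalytics.net",
    "spam-referrer.example",
    "argyleanalytics.net.evil.example",
]

# Only hits whose hostname matches the pattern are kept.
kept = [h for h in hits if is_valid_hostname(h)]
print(kept)
```

The `^` and `$` anchors matter here: without them, a hostname that merely contains your domain somewhere in the string would slip through.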
Consolidate Homepage URLs
Why do I want to consolidate homepage URLs?
A common problem that lots of websites deal with is seeing homepage hits counted in different ways because the homepage technically has two URLs. Most of you will recognize your homepage as just being your domain, such as www.mysite.com, but websites are software that uses a hierarchical structure to organize pages, and that structure is often represented as a site map. Each page in your website must fit into your hierarchy, including your homepage, and must have a unique name so it can be managed appropriately by a computer. Lots of homepages will have a name such as /index.php or /index.html, though you will rarely see it. So you may see hits to your homepage laid out like this in your reports:
This can be maddening when you are trying to report on the bounce rate for your homepage and have to manually calculate your stats. Fortunately, there is a solution that will prevent you from ever having to deal with this again.
How To Configure a Filter That Consolidates Homepage URLs
First, it's important to note that this filter also uses a regular expression and can be a bit hard to interpret, but before we discuss that, review this screenshot to get an idea of how to start configuring this filter.
The credit for this filter goes again to the Lunametrics Blog, though I put my own twist on this regular expression to make it accommodate websites that use different programming languages and configurations. Use the following regular expression as your search string:
This will turn pages that might end with...
- or any combination of the above...
...into "/" in your Google Analytics reports.
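To see the idea behind a search-and-replace on the request URI, here is a small Python sketch. The pattern is an assumption on my part that covers a few common default document names and extensions; it is not the exact expression from this post, and your server may use other names.

```python
import re

# Assumed pattern: a default document at the site root, in a few common
# flavors (index.html, index.htm, index.php, default.aspx, and so on).
HOMEPAGE_PATTERN = re.compile(r"^/(index|default)\.(html?|php|aspx?|jsp|cfm)$")


def consolidate_homepage(request_uri: str) -> str:
    """Rewrite root-level default documents to "/"; leave other URIs alone."""
    return HOMEPAGE_PATTERN.sub("/", request_uri)


for uri in ["/", "/index.php", "/index.html", "/default.aspx", "/blog/index.php"]:
    print(uri, "->", consolidate_homepage(uri))
```

Because the pattern is anchored to the root with `^/`, a page like `/blog/index.php` is left untouched in this sketch. Whether you want subfolder index pages rewritten too is a choice you would make in your own expression.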
Add or Remove www. From URLs
Why do I want to add or remove www from URLs in Google Analytics?
First and foremost, you want to remove or add www. to your URLs for consistency in your reporting. For the same reason you want to consolidate your homepage URLs to avoid having to manually calculate your stats, you want to add or remove www. because sometimes your users, or creators of referral links, will be inconsistent with how they type the URL. While this won't impact your site traffic, it will impact your analytics.
As for whether or not to include www., that's mostly a style decision. I personally prefer to remove it from my reports because most of my clients don't say 'www.' out loud when discussing their website and it helps reduce noise in the reports by reducing the number of characters that a client's eyes must scan when reading the report.
How To Configure a Filter That Adds or Removes www from URLs
This filter is pretty easy, but it also requires another regular expression. Review this screenshot to see how I set it up.
Again, this filter uses a regular expression to locate all instances where 'www.argyleanalytics.net' shows up in my Google Analytics reports and replaces each one with 'argyleanalytics.net'. Here is the regular expression you can use in your reports.
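As a rough illustration of the search-and-replace logic, this Python sketch strips a leading www. from a hostname. The pattern is an assumed, generic one rather than the exact expression I use in my own filter.

```python
import re

# Assumed pattern: a literal "www." at the very start of the hostname.
WWW_PREFIX = re.compile(r"^www\.")


def strip_www(hostname: str) -> str:
    """Remove a leading "www." so both spellings report as one hostname."""
    return WWW_PREFIX.sub("", hostname)


print(strip_www("www.argyleanalytics.net"))
print(strip_www("argyleanalytics.net"))
```

Note the escaped dot (`\.`). An unescaped `.` matches any character, so the pattern would also strip prefixes you never intended to touch.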
Force Lowercase On Key Fields
Why do I want to force lowercase on certain fields in Google Analytics?
Yet again, this will help consolidate a lot of page counts that would otherwise show up separately because people type URLs with inconsistent capitalization. Here are a few examples that all point to the same page, but will show up as separate page counts in Google Analytics.
There are multiple cases where this will be a problem, ranging from how users type in a URL to how marketers configure UTM parameters for tracking social media and content marketing campaigns. Here is a list of the fields I apply a 'force lowercase' filter to in order to prevent these issues from coming up in the analytics reports:
- Force a lowercase URI - use the Request URI field.
- Force lowercase campaign source - use the Campaign Source field.
- Force lowercase campaign medium - use the Campaign Medium field.
- Force lowercase campaign term - use the Campaign Term field.
- Force lowercase campaign name - use the Campaign Name field.
- Force lowercase for referrals - use the Referral field.
How To Configure Force Lower Case Filters in Google Analytics
Review the following filter configuration and then apply the same configuration to the different fields listed above.
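To make the effect concrete, here is a short Python sketch that counts page hits before and after lowercasing. The URIs are hypothetical examples I made up for illustration; the same idea applies to the campaign and referral fields listed above.

```python
from collections import Counter

# Hypothetical request URIs that all point to the same page, plus one other.
raw_hits = [
    "/Services/Analytics",
    "/services/analytics",
    "/SERVICES/ANALYTICS",
    "/contact",
]

# Without the filter, each capitalization is reported as its own row.
before = Counter(raw_hits)

# With a force-lowercase step, the variants collapse into a single row.
after = Counter(uri.lower() for uri in raw_hits)

print(len(before), "rows before,", len(after), "rows after")
print(after)
```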
Remember Filter Order Is Important
For the most part, the filters described in this article are pretty self-contained, but not all filters are. As you learn to apply filters to standardize the data you're reviewing in your reports, you need to keep in mind the different effects they may have. Google Analytics will apply the filters you set in the order you set them, so the output of one filter becomes the input of the next. It's important to always test your filters on your 'Test View' prior to deploying them to your 'Master Reporting View'. Remember, once a filter has processed your data, the change can't be undone for that data. Google Analytics always gives you the option of testing and verifying filters before you deploy them. Take advantage of this, and then allow them a few days to run on your test view to make sure they are working as intended.
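Here is a simplified Python sketch of why order matters. It chains two stand-in filters, a lowercase step and a homepage rewrite, and runs them in both orders. The function names and the pattern are hypothetical, and the pattern is deliberately case-sensitive to make the effect visible; your actual filter settings may behave differently.

```python
import re
from typing import Callable, List

Filter = Callable[[str], str]


def lowercase(uri: str) -> str:
    """Stand-in for a force-lowercase filter on the request URI."""
    return uri.lower()


def consolidate_homepage(uri: str) -> str:
    """Stand-in for a homepage rewrite; case-sensitive on purpose."""
    return re.sub(r"^/index\.(html?|php)$", "/", uri)


def apply_filters(uri: str, filters: List[Filter]) -> str:
    """Run each filter in order, feeding the result into the next one."""
    for f in filters:
        uri = f(uri)
    return uri


hit = "/Index.PHP"
lower_first = apply_filters(hit, [lowercase, consolidate_homepage])
rewrite_first = apply_filters(hit, [consolidate_homepage, lowercase])

print(lower_first)    # "/"
print(rewrite_first)  # "/index.php"
```

When the lowercase step runs first, the rewrite sees `/index.php` and collapses it to `/`. In the reverse order, the rewrite misses `/Index.PHP` and the hit ends up reported as `/index.php`, which is exactly the kind of surprise a test view helps you catch.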
How Do You Think You Will Deploy Filters?
There are lots of different ways to solve data standardization issues in Google Analytics. I would love to hear about some of your challenges and how you have attempted to solve them. Please feel free to share this article with others and apply the lessons learned to your own work, or use this as a starting point to ask questions of your peers and open up a dialog for knowledge sharing.