As a data-driven agency, we rely on data to drive our decisions. But what if that data is wrong? With analytics tools such as Google Analytics, we know that our data will never be 100% accurate, because discrepancies arise from many different factors: ad blockers, user consent choices, cross-device tracking, ad fraud, bot traffic, internal traffic, etc. Still, there are some things you can implement to keep your analytics data as clean as possible. That’s what this article explains.
On a side note, while Google is pushing for Google Analytics 4, we see that some features that are available in Universal Analytics (UA) are not (yet?) available in Google Analytics 4 (GA4).
1. Use filters to configure how data is displayed in the reports
Filters can be created at the account and view levels in UA, but only at the property level in GA4, since GA4 properties don’t include views. They are used to include data, exclude data, or modify how data appears in the reports.
How to use them? In UA, you can create filters in Admin > Account > All filters or in Admin > View > Filters. The available filter types let you exclude, include, lowercase, uppercase, or search and replace values, or build advanced filters.
Here are some examples of filters that can be implemented:
- Exclude traffic coming from internal IP addresses
- Exclude traffic coming from staging website
- Lowercase source, medium, campaign name, etc.
- Rewrite some source, medium
- Remove parameters
These are just suggestions; the list is not exhaustive and should be adapted to your needs.
These UA filters cannot be applied to GA4 properties. The only “filter” that can be set there is a definition of the IP addresses whose traffic should be marked as internal. It has to be configured in the data stream, as explained here.
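To make the UA filter examples above more concrete, here is a minimal Python sketch of the logic such filters apply to each hit. It is not Google Analytics code: the hit fields (`ip`, `hostname`, `source`, `medium`, `campaign`, `page`), the IP range, the staging hostname, the parameter names and the rewrite table are all hypothetical values chosen for illustration.

```python
from ipaddress import ip_address, ip_network
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values, for illustration only. Replace with your own.
INTERNAL_NETWORKS = [ip_network("203.0.113.0/24")]
STAGING_HOSTS = {"staging.example.com"}
PARAMS_TO_REMOVE = {"sessionid", "fbclid"}
SOURCE_REWRITES = {"fb": "facebook"}


def strip_params(url, params_to_remove=PARAMS_TO_REMOVE):
    """Drop unwanted query parameters from a page URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in params_to_remove
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def apply_filters(hit):
    """Return a cleaned copy of the hit, or None if it should be excluded."""
    # Exclude traffic coming from internal IP addresses.
    if any(ip_address(hit["ip"]) in net for net in INTERNAL_NETWORKS):
        return None
    # Exclude traffic coming from the staging website.
    if hit["hostname"].lower() in STAGING_HOSTS:
        return None
    cleaned = dict(hit)
    # Lowercase source, medium and campaign name.
    for field in ("source", "medium", "campaign"):
        cleaned[field] = cleaned[field].lower()
    # Rewrite some sources to a single canonical name.
    cleaned["source"] = SOURCE_REWRITES.get(cleaned["source"], cleaned["source"])
    # Remove unwanted parameters from the page URL.
    cleaned["page"] = strip_params(cleaned["page"])
    return cleaned
```

Running the exclusions before the rewrites mirrors a common setup in which hits are discarded first and only the remaining ones are normalized; the order you choose for your own filters may differ.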
2. Make sure to exclude payment gateways from your traffic sources
Have you ever seen something similar in your data?
The first source is a typical example of a payment gateway that should be excluded from your traffic sources. Here is why. Imagine any ecommerce website: if you want to buy something, you will need to pay at some point. Very often you are redirected to a payment portal and then redirected back to the ecommerce website. If you don’t exclude the payment gateway from your traffic sources, the conversion will be attributed to it in UA, since it was the last interaction before the conversion.
How to spot it? The best way is to regularly review the sources that generated conversions. If a source name contains “payment”, “pay”, “3dsecure”, “3ds”, etc., or the name of a bank, it should be excluded.
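This check can be partly automated. The sketch below assumes you have exported conversions per source into a simple dictionary (the data structure, the function name and the example domains are hypothetical); it flags sources whose name contains one of the keywords mentioned above. Bank names are not covered and would have to be added to the pattern yourself.

```python
import re

# Keywords from the paragraph above; extend the pattern with the names
# of the banks and payment providers relevant to your market.
GATEWAY_PATTERN = re.compile(r"payment|pay|3dsecure|3ds", re.IGNORECASE)


def suspected_gateways(conversions_by_source):
    """Return converting sources whose name looks like a payment gateway."""
    return sorted(
        source
        for source, conversions in conversions_by_source.items()
        if conversions > 0 and GATEWAY_PATTERN.search(source)
    )


report = {
    "google": 120,
    "secure.payment-provider.example": 45,
    "3ds.bank.example": 12,
    "newsletter": 30,
}
print(suspected_gateways(report))
```

A keyword as short as “pay” can also match legitimate sources, so we would treat the result as a list of candidates to review by hand rather than as an automatic exclusion list.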
How to exclude it? In UA, you can do it in Admin > Property > Tracking Info > Referral Exclusion List > + Add Referral Exclusion. In GA4, it has to be configured in the data stream, as explained here.
3. Keep an eye open to spot bot traffic
Bot traffic usually generates a massive increase in users that drops off after a while, without generating any conversions or revenue.
How to spot it? You’ll never know for sure, but there are several dimensions that you can use to confirm your suspicions:
- Geography. Bot traffic usually comes from the same country/city.
- Source. Bot traffic often shows up as direct or organic traffic.
- Device. You can drill down to the screen resolution to check whether most of the suspicious traffic uses the same device.
- Operating system or browser. You can also check if most of the traffic uses the same operating system or browser.
On top of that, in most cases, bot traffic has a high bounce rate and a very low average session duration.
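The checks above can be combined into a quick screening script. The following is a rough sketch under stated assumptions, not a bot detector: it expects the sessions from a suspicious spike as a list of dictionaries with hypothetical field names (`city`, `source`, `screen_resolution`, `browser`, `bounced`, `duration`), and the thresholds are arbitrary starting points to tune against your own baseline.

```python
from collections import Counter

# Hypothetical thresholds; tune them against your normal traffic.
MIN_TOP_SHARE = 0.8      # share of sessions with the same value in a dimension
MIN_BOUNCE_RATE = 0.9    # share of bounced sessions
MAX_AVG_DURATION = 5.0   # average session duration in seconds

DIMENSIONS = ("city", "source", "screen_resolution", "browser")


def bot_signals(sessions):
    """Return the warning signs found in a batch of sessions."""
    if not sessions:
        return []
    total = len(sessions)
    signals = []
    # Does a single value dominate a dimension (same city, same browser...)?
    for dim in DIMENSIONS:
        value, count = Counter(s[dim] for s in sessions).most_common(1)[0]
        if count / total >= MIN_TOP_SHARE:
            signals.append(f"{dim}={value}")
    # High bounce rate and very low average session duration.
    bounce_rate = sum(1 for s in sessions if s["bounced"]) / total
    if bounce_rate >= MIN_BOUNCE_RATE:
        signals.append("high bounce rate")
    avg_duration = sum(s["duration"] for s in sessions) / total
    if avg_duration <= MAX_AVG_DURATION:
        signals.append("low session duration")
    return signals
```

The more signals a spike collects, the stronger the suspicion, but none of them is proof on its own: a successful local campaign can also concentrate traffic in a single city or source.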
How to exclude it? In UA, go to Admin > View > View settings and, under Bot Filtering, tick the box “Exclude all hits from known bots and spiders”. Be careful, though: activating this feature doesn’t guarantee that Google will filter out all bot traffic, so we advise you to remain critical. You can also exclude unwanted traffic by creating custom filters in your view (as explained above).
In GA4, bot traffic is supposed to be excluded automatically, as explained here. Unfortunately, Google is not able to identify all bots, and we have already seen bot traffic in some GA4 accounts. Let’s hope that this will change in the future.
Conclusion
Keeping your Google Analytics data healthy is key for business decision-making. We recommend that you remain proactive and plan regular checks to minimize issues that might pollute your data and lead to wrong decisions. In this article we focused on just 3 tips to keep your data clean, but there are plenty of others. Feel free to reach out to Semetis if you have any analytics data issues!