#376 Everything You Didn’t Know About Google Analytics: Interview with Tom Capper

In Analytics, Internet Marketing Podcast, The Digital Marketing Blog by Sean

In this week’s podcast Andy is joined by Tom Capper, Consultant at Distilled to discuss everything you didn’t know about Google Analytics.

Tom discusses some of the hidden quirks in Google Analytics, things that we ought to know but are hidden, including how GA splits up sessions with things like change of date or periods of inactivity, which is logically intuitive and has parallels in the real world, but has some undesirable consequences. He discusses how attribution in GA is very messy, with a lot of non-direct traffic being reported as direct, and vice versa, and explains how it’s easier than ever to fake data in your own account or anybody else’s, even without any kind of access. (To help combat this, check out our recent blog post, Donald Trump is in Your Google Analytics).

He goes on to discuss misleading metrics and explains why he feels that time on page is an awful metric, why bounce rate can be misleading and how you should consider scroll depth and micro conversions as more reliable ways of measuring engagement.

Finally, Tom provides some of his tips and tricks for using GA including tips for using Advanced Segments.

Tom also spoke on the same topic at November’s MeasureFest conference, and you can find the slides here. If you’d like to follow Tom on Twitter, you can do so here.

Transcript of the show below:

Andy White:       This is internet marketing!

Brought to you by SiteVisibility, at SiteVisibility dot com and today I’m joined by Tom Capper who’s a consultant at Distilled.

Tom, how are you?

Tom Capper:       Not bad, yourself Andy?

Andy White:       Yeah, yeah, pretty good. Let’s start off, surprise surprise, [00:00:30] tell us a little about yourself Tom and a little bit about Distilled.

Tom Capper:       Sure, so I’ll start with Distilled. So, Distilled started out as a web dev agency, 11 years ago, way before my time, then … But over time, sort of became better known for technical SEO expertise. And these days, we’re a broader digital marketing agency, so we do, as well as the technical SEO that we’ve always done, we do CRO, analytics, PPC, creative and [00:01:00] digital PR. And we run three annual conferences around digital marketing and SEO. We have a training programme and a blog as well.

In terms of myself, I studied economics at university and found a Distilled careers fair in 2012. And, so I originally worked with them as a data analysis intern before my final year of study, and I liked it so much that I came back here in 2013, [00:01:30] joined the consulting team, which is where I am now.

Andy White:       I thought you were going to say for a minute, “I liked it so much I bought the company.”

Tom Capper:       *Laughs* If only!

Andy White:       Now, Google Analytics, it’s a topic that we’ve covered quite a lot recently. I wanted your take on it because there’s a … The title of today’s show is “Everything you didn’t know about Google Analytics”. There’s some hidden quirks, aren’t there? Things that we ought to know that aren’t that well known. Tell us a bit about that.

Tom Capper:       Yeah absolutely. I think it’s [00:02:00] … Well we all use Google Analytics so much and it’s so ubiquitous we sort of just take it completely for granted and, as you know, they’re the source of how everything’s sort of supposed to be done and we don’t really question it that much. But yeah, as you say there are quite a lot of things that sort of go on beneath the surface that we don’t really think about that we should, that have quite meaningful effects on decisions we actually make in the real world.

I’ve seen Google Analytics data presented [00:02:30] without any caveats in, you know, pitch meetings, in funding rounds, in discussions around whether someone is going to get promoted or fired. Certainly about whether an agency contract is going to continue, that kind of thing, you know and, on a basic sort of every day basis, you know a lot of companies have enormous ad spends, and they justify how they do that spending with Google Analytics data. So, you know, it’s pretty important stuff to get right and understand actually, [00:03:00] but we just sort of take it at face value instead.

So, yeah I mean, probably the biggest things are around how Google Analytics has tried to mimic, sort of, real world patterns and how we visit shops and things, and convert that into its schema for how people visit websites. So it has this, you know, system of hits, and sessions, and users, [00:03:30] which has just, sort of, become the standard. I mean, I don’t think Google Analytics was the first platform to use that kind of system but actually, how you define those types of interactions, sort of, levels of interaction, is kind of arbitrary in a way.

So, for example, in the real world if I visit a shop one day, and then I visit the same shop the next day, you would probably want to call that two visits.

Andy White:       Yes.

Tom Capper:       To the shop.

[00:04:00] So Google Analytics, you know, follows that logic, and calls it two visits or two sessions these days. And the way it does that, it just goes by calendar date. That has a massive unintended consequence, which is that if you visit the website at 11:59 p.m. and start your session, you’re actually going to have two sessions, one of which starts on whatever page you’re on, you know, whatever the first page you clicked onto was after midnight.

So, for websites which are normally 24/7, [00:04:30] a lot of them do get a lot of traffic during the night, a lot of them will be international, and have their time zone set in the country they’re based, which isn’t necessarily where all the visitors come from, so that can actually have quite a significant effect. There’s also a similar parallel with real world stores, in terms of people leaving and coming back in. You know, if I walk out of Tesco’s and go to Costa, and then come back to Tesco’s half an hour later, Tesco’s probably wants to call that two visits and, you know, it’s the same with a website. So we have this inactivity [00:05:00] thing, except the trouble is with a website, when we come back, we’re going to come back to wherever we left off potentially, in the same tab it might be, and that can end up with us having a loss of attribution data, because I’m going to go on and convert after that period of inactivity in a second session, and it will be attributed to some page on our website rather than, you know, the PPC spend that brought us around in the first place.
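
Editor’s note: the two session-splitting rules Tom describes (a change of calendar date, and a period of inactivity) can be sketched roughly as below. This is an illustration under stated assumptions, not Google’s implementation: the 30-minute timeout is the figure Tom mentions later in the show, and the function name and sample timestamps are made up for the example.

```python
from datetime import datetime, timedelta

# Assumed timeout: 30 minutes of inactivity, as mentioned later in the show.
SESSION_TIMEOUT = timedelta(minutes=30)


def split_into_sessions(hit_times):
    """Group one visitor's hit timestamps into sessions.

    A new session starts when the calendar date changes or when the gap
    since the previous hit exceeds SESSION_TIMEOUT.
    """
    sessions = []
    for hit in sorted(hit_times):
        if sessions:
            previous = sessions[-1][-1]
            same_day = hit.date() == previous.date()
            still_active = hit - previous <= SESSION_TIMEOUT
            if same_day and still_active:
                sessions[-1].append(hit)
                continue
        sessions.append([hit])
    return sessions


# A visitor who starts at 11:59 p.m. and keeps clicking past midnight.
hits = [
    datetime(2016, 11, 30, 23, 59),
    datetime(2016, 12, 1, 0, 1),
    datetime(2016, 12, 1, 0, 5),
]
sessions = split_into_sessions(hits)
print(len(sessions))  # 2
```

The second session begins on whatever page was loaded first after midnight, which is why it carries no record of how the visitor originally arrived.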

Andy White:       Now, attribution’s a bit tricky in Google Analytics, [00:05:30] isn’t it? Tell us a bit about that, sort of, non-direct traffic and stuff like that.

Tom Capper:       Right, so there’s been a bit of a thing recently around so called, “dark traffic”, which is where search traffic or social traffic, particularly from secure search some people say, but also from … The majority of cases, I would say, from social apps and email apps, you get traffic that isn’t actually direct, reporting [00:06:00] as direct and then on the other side, because, although people assume Google Analytics uses last click attribution, it actually uses last non direct click attribution, which, is probably quite un-coincidentally, quite flattering to Google’s own channels. So, on the one hand, you have this data that, you have these visits that aren’t direct being attributed to direct and then on the other hand, we have these visits that are actually direct. [00:06:30] You know, it could be someone using a bookmark or typing into their browser, being attributed to whatever their previous session was because of last non direct click, which is the attribution model that Google Analytics uses.

So, you know, the whole system of measuring how people got to a website and made these conversions, which is, you know, a huge part of Google Analytics’ use case, obviously, is actually ridiculously fuzzy.
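
Editor’s note: a minimal sketch of the last non-direct click idea Tom describes, where a direct visit is credited to whatever the visitor’s previous non-direct source was. The function name and the source labels are hypothetical, and the sketch ignores any lookback window or campaign timeout a real platform would apply.

```python
def last_non_direct_click(session_sources):
    """Return the source credited to each session, in order.

    A "direct" session inherits the most recent non-direct source seen
    for the same visitor; if there is none, it stays "direct".
    """
    credited = []
    last_known = None
    for source in session_sources:
        if source != "direct":
            last_known = source
        credited.append(last_known or "direct")
    return credited


visits = ["direct", "google / cpc", "direct", "direct"]
print(last_non_direct_click(visits))
# ['direct', 'google / cpc', 'google / cpc', 'google / cpc']
```

Only the very first visit is reported as direct; the two later bookmark or type-in visits are credited to the paid click that preceded them.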

Andy White:       Now, how easy is it, Tom, to [00:07:00] effectively fake data in Google Analytics? I’m interested in this, because you mentioned earlier Board decisions are made on Google Analytics interpretations.

Tom Capper:       So, I mean, that’s definitely something where I would urge a little bit of caution. Especially if you are in a vertical where you think maybe your competitors aren’t the nicest people. But, it’s actually very easy to fake Google Analytics [00:07:30] data in your own account or in anybody else’s account.

You know, it’s always been possible to take your own tracking code or someone else’s tracking code, put it on some high traffic page and just generate a load of bumf that will end up in that account, and to fake data that way, but, since the Measurement Protocol was introduced with Universal Analytics, a few years ago now, it’s now ridiculously easy to do that and you can fake it in very convincing ways as well.

[00:08:00] You know, over time you can simulate pretty much anything you want. You could, you know, try to muck around with your competitor’s ad spend, which is often driven, at least in part, by decisions based on analytics data. Or you could make it look like they needed to invest in some product or country that’s actually not doing anything at all, or you could just falsify data in pitches so it looks like your business is [00:08:30] more valuable than it actually is. Obviously, it goes without saying, I encourage you not to do any of these things but, you should know that they’re possible and, this is why I say, you shouldn’t take analytics data at face value. There is almost no way of getting data about your website that is completely bullet proof, but if you are worried about this kind of thing a good place to start would be logs.

Andy White:       Yeah.

Tom Capper:       People will find ways of mucking around with that, you know, by spamming [00:09:00] your server or whatever it might be, but, at least you can compare the two and try and spot the inconsistencies.
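
Editor’s note: comparing the two sources, as Tom suggests, might look something like the sketch below. It assumes you have already reduced both your server logs and your analytics export to daily pageview counts; the function name, the 25 percent tolerance and the sample figures are all invented for illustration.

```python
def flag_inconsistent_days(log_pageviews, ga_pageviews, tolerance=0.25):
    """Return days where analytics and server logs disagree by more than `tolerance`.

    Both arguments map a date string to a pageview count. The relative
    difference is measured against the server-log figure.
    """
    flagged = {}
    for day in sorted(set(log_pageviews) | set(ga_pageviews)):
        from_logs = log_pageviews.get(day, 0)
        from_ga = ga_pageviews.get(day, 0)
        baseline = max(from_logs, 1)  # avoid dividing by zero
        difference = abs(from_ga - from_logs) / baseline
        if difference > tolerance:
            flagged[day] = (from_logs, from_ga)
    return flagged


logs = {"2016-11-01": 1000, "2016-11-02": 1100, "2016-11-03": 950}
ga = {"2016-11-01": 920, "2016-11-02": 4300, "2016-11-03": 900}
print(flag_inconsistent_days(logs, ga))
# {'2016-11-02': (1100, 4300)}
```

Some gap between the two is normal (bots, blocked scripts and so on), so the tolerance would need tuning against your own history; the point is only to surface days that deserve a closer look.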

Andy White:       Now, this whole subject of misleading metrics is an interesting one. I know that there’s certain metrics to be wary of, aren’t there? I mean, time on page is one of them, isn’t it Tom?

Tom Capper:       You’re absolutely right, yeah. So it’s actually a massive pet hate of mine. This is moving away from ways in which Google Analytics [00:09:30] doesn’t work quite how you’d expect and over to it just not doing its job very well. Engagement metrics are a really interesting subject because, you know, obviously I come from an SEO background, and the big craze for some time in SEO has been the, sort of, top of funnel content. You know, content that’s all about getting links and engagement and this kind of thing. But engagement is not that easy to [00:10:00] measure. You can tell how many people saw something but not whether they actually spent any time looking at it, and whether it made some kind of lasting brand impression.

Andy White:       Yeah, yep.

Tom Capper:       Time on page, you would think, is one of the best ways to look at this, you know, intuitively. Except there’s sort of a problem here, not just with Google Analytics, but with time on page in general and that’s that analytics platforms have no idea how long you’re going to linger on the last page [00:10:30] you visit. So, say I come to your website, I land on some amazing blog post you’ve written, and then I spend 10 minutes reading that blog post, then I close the tab and continue with my day. What the analytics platform has seen in that, you know, good engaged visit, is one hit when you first loaded the page and they don’t know whether you looked at it for zero seconds …

Andy White:       Yeah.

Tom Capper:       Or … You know, five years.

Andy White:       Yes.

Tom Capper:       There’s absolutely no [00:11:00] way for them to tell the difference, if no other hits are being fired on that page.

Andy White:       Yeah.

Tom Capper:       And, Google Analytics actually tried to bodge this by, instead of, you know, you’d think time on page would just be the total time on site divided by the number of page views, you know, it gets an average in effect.

Andy White:       Yeah.

Tom Capper:       But actually, they do time on page divided by page views minus exits. So that’s basically [00:11:30] to try and dis-include that last interaction, where they’re just assuming you spent zero seconds on a page you might have been viewing for 10 minutes. So, they’ve just removed that from the metric altogether, which means that then even when people do put in extra interactions, to try and capture how long people were on that page that they were on when they eventually left.

Andy White:       Yes.

Tom Capper:       It actually ends up even more misleading than when it started, because Google Analytics is then mucking with the average to try and bodge its original solution. [00:12:00] So essentially, in our experience, time on page can actually be … Reported time on page, in Google Analytics, can actually just be completely uncorrelated and quite wildly wrong, versus any intuitive measurement of time on page. You know, I’ve seen sites that claimed an average time on page of 17 minutes where it was closer to 10 seconds and pretty much vice versa.
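
Editor’s note: the calculation Tom describes, time on page divided by page views minus exits, can be illustrated with a small worked example. This is a sketch assuming pageview timestamps are the only hits fired; the function name and the sample sessions are hypothetical.

```python
def reported_avg_time_on_page(sessions):
    """Average time on page as pageview timestamps alone would imply.

    Each session is a non-empty, ordered list of pageview timestamps in
    seconds. Time on a page is the gap to the next pageview, so the final
    page of every session contributes nothing, and the divisor is
    pageviews minus exits.
    """
    measured_time = 0
    pageviews = 0
    exits = 0
    for timestamps in sessions:
        pageviews += len(timestamps)
        exits += 1  # every session ends on exactly one exit page
        measured_time += timestamps[-1] - timestamps[0]
    non_exit_views = pageviews - exits
    if non_exit_views == 0:
        return None  # nothing measurable: every view was an exit
    return measured_time / non_exit_views


# Nine readers each spend ten minutes on a post and leave: one hit each.
# One visitor clicks through to a second page after 5 seconds.
sessions = [[0]] * 9 + [[0, 5]]
print(reported_avg_time_on_page(sessions))  # 5.0
```

The nine ten-minute reads are invisible to the metric, so the reported average rests entirely on the one visitor who clicked away after five seconds.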

The [00:12:30] … And time on site is no better, I’m afraid. It’s just an almost completely broken metric, because of that inability to capture later interactions.

There are some alternative engagement metrics. Obviously, bounce rate is a very popular one, but that can be quite challenging as well, because if you’re looking at a piece of information on your site, a lower bounce rate might not always be better. Because, maybe you’re looking for people to find the answer [00:13:00] to their questions straight away and that’s what a good brand experience would be. So, bounce rate is useful but you have to be kind of careful with it and then, the other more robust engagement metrics like scroll depth, like micro conversions such as email sign up, they require you to do some set up yourself.

Andy White:       Just remind us, Tom, the definition of scroll depth.

Tom Capper:       Right, of course, [00:13:30] sorry. So, scroll depth basically is how far down a page you got before you left it so, obviously, most pages continue beneath the fold. So, you can set it up so that as you scroll down it fires events into Google Analytics which you can actually record as goals if you wanted. So, for example, you’d have a goal for someone getting right to the bottom of a blog post.
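
Editor’s note: in practice scroll tracking runs in the browser and fires one event per threshold, but the underlying threshold logic is simple enough to sketch here. The function name, the pixel measurements and the 25/50/75/100 percent thresholds are assumptions chosen for illustration, not anything Google Analytics prescribes.

```python
def scroll_depth_events(max_scroll_bottom, page_height, thresholds=(25, 50, 75, 100)):
    """Return the percentage thresholds a visitor has scrolled past.

    `max_scroll_bottom` is the lowest pixel of the page that entered the
    viewport; `page_height` is the full height of the page in pixels.
    """
    if page_height <= 0:
        return []
    percent_seen = min(100.0, 100.0 * max_scroll_bottom / page_height)
    return [t for t in thresholds if percent_seen >= t]


print(scroll_depth_events(1800, 3000))  # [25, 50]
print(scroll_depth_events(3000, 3000))  # [25, 50, 75, 100]
```

Each threshold returned would correspond to one event, and the 100 percent event is the one you might record as a goal for reaching the bottom of a blog post.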

Andy White:       And, micro conversions. Are these a bit like when you set sort of like, mini goals or …

Tom Capper:       Yeah, so [00:14:00] like … On a site where a macro conversion might be, I mean, a really simple case … E commerce, a macro conversion is buying a product.

Andy White:       Yeah.

Tom Capper:       A micro conversion, might be, signing up to some kind of reminder about when this product is on offer. Or signing up to their newsletter or creating an account. Something like that.

Andy White:       So, this is all useful stuff. It’s worth knowing about these weirdnesses with Google Analytics. So, further … our listeners love [00:14:30] tips and tricks. Further tips and tricks. I’m especially thinking about when you’re going through your Google Analytics reports and things. What sort of things should you be aware of? What sort of things should you be looking at?

Tom Capper:       That’s a good question. I think that, obviously, analytics audits are a whole topic in their own right. I think, on theme with some of the stuff I’ve mentioned, today, [00:15:00] I would encourage you to have a look at your landing page report and see if you can find one or both of your check out page, listed as a landing page. Or, in brackets, not set. Because those are both, kind of, interesting cases where obviously it makes absolutely no sense for people to be arriving via the checkout, or have a session with no landing page at all.

[00:15:30] But, most analytics accounts that I see feature these quite prominently. Often as some of the highest converting landing pages, in fact and, there are some quite common and easily fixed reasons why these come about. So, in the case of the landing page being the check out for a whole bunch of conversions, that can often be people just having [00:16:00] a think in the middle of their buying process, maybe going away, having a cup of tea, particularly for high value products.

Andy White:       Yeah.

Tom Capper:       So that can just be, you know, an inactivity problem. So, something you could consider doing there is extending the inactivity period. But more commonly, it can happen when people have separate payment sub-domains, or when you take people off site to pay on something like PayPal. And, if you don’t have your cross domain or cross sub-domain tracking set up properly, or if you don’t have PayPal [00:16:30] excluded as a referrer, then that can trigger the start of a new session and you end up with a session that contains the check out, and the conversion, but no attribution data and another session with the attribution data, but no conversion.

Not set is a more interesting one because that’s a session with no pages in it. So, there are two, sort of, interesting ways that can come about. One of them is if people on the last page they’re on on your website … If they [00:17:00] go inactive on that for 30 minutes, then they come back and scroll around, maybe click on some stuff without loading a new page, if they set off an event, you’ll get a session with an event but no page view. You could also, if you’re doing some tracking of offline interactions, it could trigger that. That’s, sort of, hopefully what happens, but in reality, 90 percent of the times we see this, it’s because people have some legacy of their old tracking code hanging around on their website.

So, [00:17:30] for example, we’ll see people who have updated to Universal Analytics tracking code years ago, but there’ll be some weird widget on their site that still has old fashioned tracking code and it’s just creating sessions in its own right, with just these events in it, and no landing page. It’s completely useless, and lost micro conversions.

Andy White:       So, it’s worth keeping your websites clean and up to date, isn’t it?

Tom Capper:       Well, basically yeah. I mean that’s the lesson in a lot of these things isn’t it?

Andy White:       Yeah. I’m glad that I’m not the only one that spends about five or ten minutes, sort of, [00:18:00] wiggling my fingers and taking deep breaths before I press “buy” on expensive items!

Tom Capper:       *Laughs* That’s me! I’ve spent days sometimes, I think.

Andy White:       Yeah, yeah! You just leave, and come back a week later, and you find it’s gone up in price. Especially … Anyway, let’s not go into Amazon. Whoops! I mentioned Amazon.

I know you’ve got some interesting thoughts on the topic of tips and tricks on advanced segments, haven’t you? And sequences.

Tom Capper:       Yeah, absolutely. So, basically I just think that advanced segments are [00:18:30] … Pretty much everyone that uses Google Analytics a lot uses advanced segments, but they’re kind of an underutilised feature at the same time. Because of this feature of them, which is a feature within a feature, which is sequences which just allow you to do stuff that you couldn’t do anywhere else in Google Analytics.

So, for example, you could make an advanced segment using the sequences menu that said something like, [00:19:00] “a session starts with any user interaction and then at some point later on there’s the checkout page, and then at some point later after the checkout page there’s the FAQ page all within one session.” So what you’ve done there, is you’ve created a segment of sessions in which potential buyers got cold feet. Because they looked at … Or buyer’s remorse. Because they looked at the check out, and in the same session, after [00:19:30] they made it to the check out, they looked at the FAQ page. There’s almost always some of these on an ecommerce website that has an FAQ page …

Andy White:       Yes.

Tom Capper:       And, it can be really useful for your CRO to try and figure out, you know, what … How people got to the site and what products they were looking at, and what they did between the check out and the FAQ page, you know, what site search terms they used, that kind of thing. Try and figure out, you know, what was it that led them to have those doubts.
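
Editor’s note: the “checkout, then at some point later the FAQ, all within one session” condition is an ordered-subsequence test. A minimal sketch, assuming each session is available as an ordered list of page paths; the function name, session IDs and paths are hypothetical, and this is not how the segment builder itself is implemented.

```python
def matches_sequence(pages, steps):
    """True if `steps` occur in `pages` in order, not necessarily adjacent."""
    remaining = iter(pages)
    # `step in remaining` consumes the iterator up to the match, so each
    # later step is only searched for after the previous one was found.
    return all(step in remaining for step in steps)


sessions = {
    "a": ["/", "/product/1", "/checkout", "/faq"],
    "b": ["/faq", "/product/2", "/checkout", "/thanks"],
    "c": ["/", "/product/3"],
}
cold_feet = [sid for sid, pages in sessions.items()
             if matches_sequence(pages, ["/checkout", "/faq"])]
print(cold_feet)  # ['a']
```

Session "b" visits both pages but in the wrong order, so it is excluded; that ordering is what a sequence adds over an ordinary “visited both pages” filter.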

But that’s a basic use case. If you want to be [00:20:00] a bit fancier you could say, you know, I’m interested in users who arrived at the site via Google, but they subsequently arrived at the site via email and, basically what you’ve done there is you’ve created a segment of your marketing strategy having gone right, so it’s a great way of figuring out which of your landing pages are really working for you, in terms of converting one time organic visitors into people who want to receive emails from you, and feel positive [00:20:30] about you as a website.

Andy White:       Now, user flow reports, again, are there some customizations you can do there to help with this issue of misrepresentation of data?

Tom Capper:       I think user flow reports are just another example of this commonly used yet underused feature. Because people often, sort of, take a glance at them but don’t really play around with them that much. If you [00:21:00] go into a user flow report, and then you click on one of the big green blocks, and click on “explore through this page”, then that’s something that people do reasonably often and it just shows you all the traffic moving in and out of an individual page on your website.

It’d be quite useful for understanding which are the most prominent aspects of the nav, what people are trying to do on that page, how people get to that page, that kind of thing. But when you do that, an extra button appears above the block you’re focusing on, which looks like a little pen. [00:21:30] And it’s actually kind of glitchy, and it often overlaps with other text and stuff. It seems like Google themselves have forgotten about it. But if you click on that pen, it will allow you to use regex to define a page group.

So, for example, you could then define all of your product pages, and then you could say, “I want to look at how people come in and out of all of my product pages rather than just individual URLs”, and that makes user flow reports [00:22:00] immensely more powerful as a feature.
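
Editor’s note: as an illustration of the kind of pattern you might type into that box, here is a regex that groups product pages, tested in Python. The URL scheme (product pages under /products/ followed by a slug) is purely an assumption, and Python’s regex dialect may differ in small ways from the one Google Analytics accepts.

```python
import re

# Hypothetical URL scheme: product pages live under /products/<slug>.
PRODUCT_PAGES = re.compile(r"^/products/[^/]+/?$")

paths = [
    "/",
    "/products/red-shoes",
    "/products/blue-hat/",
    "/products/",
    "/blog/products/news",
]
product_group = [p for p in paths if PRODUCT_PAGES.search(p)]
print(product_group)  # ['/products/red-shoes', '/products/blue-hat/']
```

Anchoring the pattern with ^ and $ keeps the category index and unrelated blog URLs out of the group; trying a pattern against a sample of your own URLs first is a cheap way to catch surprises.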

Andy White:       Wow. Well, best of luck using regex to define a page!

Tom Capper:       Yeah, right.

Andy White:       We all love regex’s, don’t we?

Well, Tom, thanks so much for coming on. Some really good tips there. How can our listeners find out more about you, and more about Distilled?

Tom Capper:       Well, Distilled of course has a website, where you can find our blog, newsletter, conferences, training platform, etc. And that’s Distilled.net. But, me personally, you can follow on Twitter, I’m at T H [00:22:30] Capper. T H C A P P E R.

Andy White:       Fantastic. And thank you to our fantastic listeners, for listening!

Show notes are in the usual place, SiteVisibility.com forward slash, I am pod cast. I’m still saying forward slash, aren’t I? It’s just slash, we all know which way the slash goes on the internet, don’t we?

If you want to connect with me personally, doctor pod, D O C T O R P O D, on Twitter and LinkedIn. And, if you’ve got any questions that you want to ask, please send those to podcast [00:23:00] at SiteVisibility dot com, that’s the email, or the magic telephone line is plus 44 1273 256 150. Well, that’s all from me, Andy, and it’s all from Tom.

Tom Capper:       Thanks Andy.

Andy White:       And we’ll see you next time on internet marketing.

  1. Hi Tom, great podcast – I enjoyed your talk at #measurefest too. Lots more caveats to factor in 🙂 Did you publish anything on ‘dark traffic’ indicators?
