Optimizing page timings for Google Analytics

Many of you will have one of those recreational text pads where you jot down some notes you wanted to share with others. Like a blog. Sometimes, or in fact very often, you might want to track just how successful these quick notes are. Well you'll probably already know about Google Analytics.

It's one of those tools that everyone uses, it's free, it's easy, but it doesn't solve the problem completely. What I'm talking about is that Google Analytics' page timings are just plain bad when it comes to blogs.

Let's consider the situation : Someone visits your blog from a link aggregator, via an RSS reader or from lumpa-land. What will happen is, the visitor reads your article, hopefully likes it, and then leaves your site. Now this is the ideal situation for showcasing where Google Analytics is lacking somewhat.

This legitimite visitor, that may have even liked your blog post! will be recorded. Of course, you've got Google Analytics installed, that's why, but he will be saved as a bounce off of your site with zero pageview time. So the Google Analytics dashboard will tell you something like this:

Now this surely can't be right, it's impossible to read my blog posts in just 44 seconds, you may think. Well, often you're correct, it's Google Analytics' fault that the average session duration is so low. But instead of complaining about this, let's fix it!

1st attempt

Now let's consider why Google Analytics' page timings are incorrect. Well the root of the cause is that the way Google Analytics measures the time spent on a page, they calculate it by taking the difference in time between visiting different pages.

Ah, see here's the problem! On a blog, like in our scenario, a visitor will only be visiting the one blog post he navigated to in the first place. So there's no opportunity for Google Analytics to calculate a proper page timing! We can easily fix that.

Because Google Analytics will also consider events as pageviews when it comes to calculating the session duration. All we need to do is send off an event at the appropriate time... but when is the appropriate time?

This is a tough question and it's very hard to answer, so let's send one every ten seconds, just to make sure. Here's the adjusted Google Analytics snippet:

ga('create', 'UA-XXXXXXXX-X', 'example.com');
ga('send', 'pageview');

setTimeout((function(timing) {
    ga('send', 'event', 'time', 'log', timing);

    // Bind another timeout to run in 10s
    // to fire off the next event
    setTimeout(arguments.callee.bind(null, timing+10), 10000);
}).bind(null, 10), 10000);

Now what does that give us? Let's see:

Ah, that's what I like to see! We've got a 3,331.25% increase in the average session duration and the bounce rate dropped by 91.54%. So this solution seems to work, or does it? Because I think 26 mins is a bit too long for reading my relatively short blog posts.

Maybe we can find an explanation by adjusting our situation: Now, the visitor visits your page from a link aggregator and if I think about the way I use link aggregators, I'll exit them with 20 new tabs open.

See, there might be the cause! The visitor opens your blog post, but he may not actually view it until he's finished dealing with the other 19 tabs. This wasn't a problem when Google Analytics had to figure the average time out by itself, because most recorded timings were zero.

But now this has changed! Once a user opens the tab with your page loaded in it, we're telling Google Analytics that the visitor is on your site! So that's the next problem to fix!

2nd attempt

As I said, we need to find out, when the user actually starts viewing your blog post. But how are we going to do that? Well, this is the ideal situation to show off the relatively new Page Visibility API. It can tell us if the browser's frontmost tab contains our site, or not.

This should deliver a more accurate measure of when the page is actually viewed and read. But we can't just wait until a user views our page, i.e. the visibility event fires, because the user might just flit briefly over the tab, when navigating between them, for example.

ga('create', 'UA-XXXXXXXX-X', 'example.com');
track_ga();

function track_ga() {
    // We can be quite sure that the user _is_ on our page,
    // once he's there for more than 10s
    var waiting_time = 10000;

    // Just send the pageview if page visibility is not supported
    if (!get_hidden_prop()) {
        ga('send', 'pageview');
        return;
    }

    if (!is_hidden()) {
        // If the page is visible, setup a timeout to send
        // the pageview, that gets cancelled when the user
        // navigates away from the page too soon.
        var timeout = setTimeout(function() {
            ga('send', 'pageview');
            setTimeout(event_poller.bind(null, 0, waiting_time), waiting_time);
        }, waiting_time);

        document.addEventListener('visibilitychange', function() {
            document.removeEventListener('visibilitychange', arguments.callee);
            if (is_hidden()) clearTimeout(timeout);
        });
    } else {
        // Otherwise, wait until the page becomes visible again
        // and do everything over again.
        document.addEventListener('visibilitychange', function() {
            document.removeEventListener('visibilitychange', arguments.callee);
            track_ga();
        });
    }
}

// Same as our previous periodic GA event sender.
function event_poller(timing, wait) {
    ga('send', 'event', 'time', 'log', String(timing));
    setTimeout(event_poller.bind(null, timing + wait / 1000, wait), wait);
}

The functions get_hidden_prop() & is_hidden() are from, and can be found on HTML5Rocks.com get_hidden_prop() checks to see if the Page Visibility API is supported and for any vendor specific prefixes. is_hidden(), obviously, returns true/false depending on if the page is visible.

As you see, this version is a lot more complicated than our initial snippet, but will it improve our measurements?

Hmm, no not really. So it seems, that's the best we can get with this technique.

Final thoughts

Although our second attempt didn't yield the expected results, I can still confidently say, that the second snippet is better than the first. For one, it saves resources (network), due to the fact that we only start sending events once the user is actually active on your site, which is especially important if you've got a lot of mobile vistors reading your blog. And the page timings are definitely more accurate than with the default page timing mechanism. But with the second snippet we also get insight into when our pages are just prerendered.

So as you see, there doesn't seem to be a perfect solution, yet.