Friday, December 15, 2006

Lies, Damned Lies, And Statistics

Whoever coined the phrase, it is often used to point out how unreliable conclusions drawn from statistics can be. I'm usually pretty wary of them myself, but sometimes your server stats are your best friend.

I have recently been involved with the redesign of a website which has been online since the end of 2003. It was originally written by another team, and it contained many nested tables and a few styles, but was basically not terribly semantic. The team I work with have been looking after the site's content since its original launch, and a few months ago, the site owner came to us to ask if we could give it a fresh new look, and a bit of a re-organisation. It had grown organically since its inception, and things had got a little muddled. It was felt that documents in certain areas of the site just weren't being found.

We undertook some user-centred design, testing our new proposals with paper wireframes and some open and closed card sorting. On the basis of these results, we tinkered a bit more and tested again. Then we set about reorganising the content and making much more semantic pages - lists of documents were coded as lists. I'll admit that one table remains for the basic layout, but this was pretty much prescribed by the templating system in use on the server. Everything else has been pared down to keep tag soup to a minimum.

On 1st November, the new-look site was relaunched. Fast forward a month, and I ran a statistics check on the site, comparing results from October 2006 (old style) and November 2006 (after relaunch). The results were startling.

October - Access Statistics

  • Total page impressions: 98,037
  • Top URL was the site root (no surprise)
  • 4th was the search page, with 1,530 hits.
  • That suggested people weren't finding what they were looking for.
  • "Responses" section (where the bulk of the answers to the public's FOI requests were published) was at 46th, with only 262 hits.
  • We were regularly publishing responses to very similar questions - because users didn't find them on the site before making their own request.

November - Access Statistics

  • Total page impressions: 108,632
  • Top URL was site root, 2nd was the new "responses" index page, with 1,658 hits.
  • Search page had plummeted to 437th - with a mere 24 hits!
  • Other (new) subpages of the responses section were getting plenty of traffic as people explored the new way of accessing the documents.
  • We are still publishing lots of responses to requests, but the number of near-duplicate queries has dropped significantly.
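
As a rough illustration of the month-on-month comparison, here is a minimal Python sketch using only the figures quoted above. The dictionary layout and the `percent_change` helper are assumptions made for the example, not part of the original stats package. The "responses" figures are left out of the comparison because the October number refers to the old section and the November number to the new index page, so they may not be like-for-like.

```python
# Figures quoted in the post; the data layout is purely illustrative.
october = {"impressions": 98_037, "search_hits": 1_530}
november = {"impressions": 108_632, "search_hits": 24}


def percent_change(before, after):
    """Signed percentage change going from `before` to `after`."""
    return (after - before) / before * 100


def share_of_traffic(hits, impressions):
    """Hits on one page as a percentage of all page impressions."""
    return hits / impressions * 100


for key in october:
    change = percent_change(october[key], november[key])
    print(f"{key}: {change:+.1f}%")

for name, month in (("October", october), ("November", november)):
    share = share_of_traffic(month["search_hits"], month["impressions"])
    print(f"{name}: search page was {share:.2f}% of impressions")
```

Looking at the search page as a share of all impressions, rather than as a raw hit count, helps rule out the possibility that the drop is only a side effect of overall traffic changing between the two months.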

Page Weight Statistics

Some pages were completely restructured in terms of their content, but about 20 pretty much retained their original information - it was just recoded from tables to lists. I did some analysis on these as a before and after comparison too.

The smallest page started out at 16KB, and went down to 7KB (56% reduction); the largest page was originally 119KB and dropped to 20KB (83% reduction). On average these 20 files' sizes were reduced by 73%. Not bad in itself, but when you multiply that by the number of page impressions, you get an idea of the considerable reduction in bandwidth being used.
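
The arithmetic behind those percentages can be sketched in a few lines of Python. The two page sizes come from the figures above; the function names and the impression count of 1,000 are hypothetical, chosen only to show how a per-page saving scales with traffic (per-page impression counts are not given here).

```python
def reduction_percent(before_kb, after_kb):
    """Percentage by which a page shrank, relative to its original size."""
    return (before_kb - after_kb) / before_kb * 100


def bandwidth_saved_kb(before_kb, after_kb, impressions):
    """Total KB no longer transferred when the page is served `impressions` times."""
    return (before_kb - after_kb) * impressions


# Page sizes quoted in the post (KB).
print(f"smallest page: {reduction_percent(16, 7):.0f}% smaller")
print(f"largest page: {reduction_percent(119, 20):.0f}% smaller")

# Hypothetical traffic figure, for illustration only.
example_impressions = 1_000
saved = bandwidth_saved_kb(119, 20, example_impressions)
print(f"largest page over {example_impressions} views: {saved} KB saved")
```

This ignores caching, compression and images, so treat it as a back-of-the-envelope estimate of the HTML payload only.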

Conclusions

So there you have it - some numbers to back up the principles of good user-centred design. I felt that the search page statistic was the most significant - and it certainly backed up the old adage that if you have decent navigation and a clear information hierarchy, people won't need to use the search but will naturally find things themselves.
