Using analytics – News Numbers

Search

Google’s new mobile, social media and cloud storage products have made it more omnipresent than usual. That’s why I was surprised to hear Brian Schmidt, Google’s Americas sales director, say that Google still considered itself a search company.

This seemed a little disingenuous at first, just something Schmidt would say to sooth me and the other Online News Association conference field trippers at Google’s Boston office last week.

But then Schmidt explained that everything Google does is “dependent on users opting in and finding value in the experience.” With so much information available on so many different devices (see my New York Times Room for Debate forum commentary on paying for device-based convenience), Google’s success is based on people making a conscious choice to use Google products to find and link them to what they want.

“Search” means more than people typing keywords in a little box. “Search” now means “I’m coming to you – and not someone else – to make my life better by finding what I need and connecting me to it.”

Thus, news organizations should be search companies, too. But they’re not going to get there by counting page views and unique visitors. For some news orgs it seems to be a point of pride that the bulk of the visits to their sites start with their all-things-to-all-people home pages. How about defining success with metrics that indicate whether people are finding what they want?

The humanity of web analytics

This snippet from the Economist’s review of Brian Christian’s book about artificial intelligence – The Most Human Human: What Talking With Computers Teaches Us About What It Means To Be Alive – reminds me that web analytics is about people, not data:

“People produce timely answers, correctly if possible, whereas computers produce correct answers, quickly if possible. Chatbots are also extraordinarily tenacious: such a machine has nothing better to do and it never gets bored.”

This spring the nine hardy students who took my USC Annenberg web analytics class came up with wonderful insights that could never have come just from reading a report straight from Google Analytics, Omniture or any other chatbot. Part of their grade was based on whether the (equally hardy) participating news and nonprofit organizations were actually going to use their analyses for decision-making. This meant each student had to really understand the organization’s strategies, goals and personalities before he/she dug into the data. Here are some of the things we learned.

Content is indeed king, but only if it’s coded
None of the organizations coded site content with enough detail to make decisions about what to do with their sites. Data coming straight out of Google Analytics or Omniture was coded only by date published and by broad categories such as “News.” This is the equivalent of marking a box of books “MISC” – or putting in “stuff” in any search engine.

For example, let’s say an organization believes it can build and engage audiences by adding “more local politics and government coverage.” To track whether it did indeed produce “more,” and what coverage did result in increased visits and engagement, the org needs to track how many politics stories it has, by topic and local geographic area, and how much traffic each topic and/or area gets.

Each student developed a taxonomy of codes the organization could use to classify its content, and then manually (I told you they were hardy!) coded sample data pulled out from chatbots, er, Google Analytics or Omniture.

Track traffic by day, week and by topic, to determine if the site is getting the traffic it should

Many organizations look at monthly data, and celebrate traffic spikes. Hidden in monthly data, however, are clues to where to build targeted audiences and advertisers. Health/fitness section traffic, for example, should increase the second week in January, perhaps fall off after Valentine’s Day (!), and increase before swimsuit season.

More content = more traffic
Looking at visits by day of week, we saw radical drops in visits on the weekends. This seemed to be due to little unique local content being posted on Saturdays or Sundays. In this age of the 24/7 newsroom and increased Internet access through mobile, can news orgs afford to make resource decisions based on non-audience-based, chicken-and-egg logic (“We don’t get much traffic on the weekends so we can’t justify adding weekend staff.”)?

Sometimes you should have separate sites for each audience segment….

Josh Podell, an MBA student, focused on analyzing the e-commerce donation functions on the nonprofit sites. He observed that it’s hard to understand what works and what doesn’t when donors are coming to the site to find out more about the organization but residents are coming to find out about programs and services. Josh’s suggestion: Have a completely separate site – and Google Analytics account – for donors. An org could have much more focused content for each audience, and metrics such as visits per unique visitor, page views per visit and the percent of people who left the site after looking at just the home page (home page bounce rate) would give much more clear indicators for both sites.

….but sometimes you shouldn’t.

One of the organizations had its main site on one Google Analytics account, and its blog on another. Dan Lee, a graduate Strategic Public Relations student, noticed extremely high home page bounce rates from returning visitors compared to new visitors.

With the question of why burning in his head, Dan looked in detail at the site content and structure, and hypothesized that returning visitors were most likely to go to the home page, see the teaser about the latest blog entry, and immediately “leave” the site to go to the blog. The Google Analytics account for the blog did indeed show that its top referring site was the main site.

Google Analytics was “correct,” but only a human could have produced the right answer for the organization.

So here’s to Brian Christian for explaining how to be “the most convincingly human human,” and how the competition for the Loebner prizes continues to result in what the Economist calls “a resounding victory for humanity.” And here’s to the most adventurous and tenacious students at USC Annenberg – the journalism, public relations and business majors who took an experimental elective class with “web analytics” in the title!

The content’s there but the data often isn’t

Neil Heyside’s Oct. 18 story on PBS MediaShift about how newspapers should analyze their content by source type – staff-produced, wire/syndicated or free from citizen journalists – got me thinking about other ways content should be analyzed to craft audience-driven hyper-local and paid-content strategies.

Most news sites have navigation that mimics traditional media products – News, Sports, Business, Opinion, and Entertainment. However, those types of broad titles don’t work well with digital content because people consume news and information in bits and pieces rather than in nicely packaged 56-page print products and 30-minute TV programs.

Each chunk, each piece of content – story, photo, video, audio, whatever – should be tagged or classified with a geographic area and a niche topic so a news org can determine how much content it has for each of its highest priority audience segments – and how much traffic each type of content is getting.

By geographic area I mean hyper-local. East Cyberton, not Cyberton. Maybe even more hyper – East Cyberton north of Main Street, for example, or wherever there’s a distinct audience segment that has different characteristics and thus different news needs and news consumption patterns.

Similarly, news orgs need hyper-topic codes, especially for hyper-local topics. The Cyberton community orchestra – not Classical Music, Music, or Entertainment. If a news org is looking at web traffic data for “Music” it should know whether that traffic is for rock music or classical, and whether the content was about a local, regional, national or international group.

Oh, and there’s one more aspect to this hyper-coding. Content should be coded across the site. Ruthlessly. For example, to really understand whether it needs to add or cut coverage in East Cyberton, a news org needs to add up those East Cyberton stories in Local News, plus those East Cyberton Main St. retail development stories in Business, and those editorials and op-eds in Opinion about how ineffective the East Cyberton neighborhood council is, and….

Sometimes these hyper-codes are in content management systems but not in web analytics systems like Omniture or Google Analytics. Knowing what you’ve got is great – but knowing how much traffic each hyper-coded chunk of content is equally if not more important.

Whether the hyper-codes and thus the data are there only makes a difference if a news org is willing to take a hard, nontraditional look at itself. The data may suggest it needs to radically change what it covers and the way it allocates its news resources so it can produce “relevant, non-commodity local news that differentiates” it, as Neil Heyside’s PBS MediaShift story points out.

Heyside’s study of four U.K. newspaper chains has some interesting ideas about how a news org can cut costs but still maintain quality by changing the ways it uses staff-produced, wire, and free, citizen journalist content. The news orgs in the study “had already undergone extensive staff reductions. In the conventional sense, all the costs had been wrung out. But newspapers have to change the way they think in order to survive. If you’ve wrung out all the costs you can from the existing content creation model, then it’s time to change the model itself.”

If a news org doesn’t know, in incredibly painful detail, what type of content it has and how much traffic each type is getting, then it’s not arming itself with everything it can to mitigate the risks of making radical changes such as investing what it takes to succeed in hyperlocal news and in setting up pay walls. Both are pretty scary, and it’s going to take a lot of bold experimentation – and data – to get it right.

Two million users? Not enough…

Xmarks had lots of users but little engagement. Apparently it forgot to figure out what the users needed or wanted.

From WebNewser:

“For the popular bookmark syncing service Xmarks, 2 million users was apparently not popular enough. Co-founder and CTO Todd Agulnick announced on the company blog Tuesday that, despite growing by 3,000 users each day, the startup was floundering and would shut down its service in 90 days….

Unfortunately, users who tried the system were looking for answers to questions rather than topical lists of sites.”

Believing in time-on-site

A Sept. 12 Pew Center study found that “Americans are spending more time following the news than over the past decade.” Great news – or is this yet another misleading key performance indicator, as my previous blog post about time spent on site might suggest?

No – I like the Pew Center study. It’s a study of attitudes and feelings. Good old-fashioned survey research (with all of its mind-numbing statistical sampling), is an essential component of web analytics. Web site traffic data is audience behavior – the “what.” News orgs have to have attitudinal research to understand the “why” so they can attract audiences that aren’t coming to their sites.

The data you get from Google Analytics or Omniture is enticing, isn’t it? (Work with me, here….) Oh wow, we can track every click! Ah, yes, we can track every click – but we can ONLY track clicks on OUR site, not on anyone else’s.

A time-on-site calculation can only be harvested for you if someone clicks on a page in your site and generates a page view that’s counted by your Google Analytics/Omniture account. Time-on-site is the time in between the first page clicked on YOUR site and the last page clicked on YOUR site.

This means:

1. If someone clicks on YOUR site and then immediately goes to another site (a bounce),
it’s not included by Google Analytics/Omniture in the time- on-site calculation. It’s like it never existed, time-on-site-wise.

It IS counted as one visit and as one page view. So that means that all of those people who come to your site regularly (you know, the ones we really like) just to get the latest on a story aren’t counted in time-on-site – and they should be.

2. If someone is on his/her third page in your site and opens another tab and goes to another site for twenty minutes before returning to your site and clicking on another two pages, those twenty minutes are included in time-on-site – and they shouldn’t be.

3. The time a person spends on the last page of your site isn’t counted.
If someone clicks through a few pages on your site and spends 15
minutes utterly absorbed in a story before leaving your site to go pay
bills online, those 15 minutes aren’t included in time-on-site – and they should be.

So, time-spent-on-site is always either over- or under-counted. And you’ll never know which – this makes this metric utterly unreliable as an indicator of success or failure. So you can’t make a decision with this data because you can’t know whether your action – a section added, the number of long-form videos reduced – caused time spent to go up or down.

More importantly, these days it really doesn’t matter how much time people actually spend on a news site. What matters much, much more is whether people are engaged with the news, whether they believe news sites are an essential component to their lives, so much so that they come back repeatedly, rate a story with five stars, leave comments, click on an ad, and otherwise use the site. It doesn’t matter whether they spend three seconds or three hours.

That’s what makes the Pew Center finding so exciting (surely you’re still with me on how great web analytics is?). People actually said they’re spending more time with the news now than they did a decade ago. It doesn’t matter whether they actually are (!) – they believe they are.
I wish every news org could afford its own Pew Center-like attitudinal research study so it could track how engaged its own targeted audiences are (or aren’t), and to understand how to get and retain new audiences. The information wouldn’t always be pretty, but at least it would be data that could make a difference.

Decision-making with data

Wow – a web analytics story in the New York Times! And one that probably didn't spark mass hysteria by saying that looking at web traffic data is the end of journalism as we know it.

Jeremy Peters' September 5 story, "Some Newspapers, Tracking Readers Online, Shift Coverage," nicely framed the issues about using web analytics to make decisions about coverage. At one end you have journalists who completely ignore audiences and report whatever they want. On the other there are those who "pander to the most base reader interests."

This was one of the few mainstream media stories I've seen that recognized the new role that audiences have in informing – not dictating – news decisions.

Actually what was the most interesting to me was that Raju Narisetti, the Washington Post's managing editor of online operations, said that he used "reader metrics as a tool to help him better determine how to use online resources….the data has proved highly useful in today's world of shrinking newsroom budgets."

In other words, Narisetti didn't just look at overall site traffic numbers like total visits or unique visitors and say "Oh, that's nice," or "Whatever! I've got a paper to put out." He had to cut his budget, he saw that long-form videos had little traffic compared to other types of content, and then he decided to cut "a couple of people" from that department.

Narisetti made the decision – the data didn't. He used the data. In my rather narrowly focused web analytics world that is a really, really big deal. Data just wants to be useful. Seriously. If it could talk, it would say that it doesn't want to be any part of a daily e-mail "about 120 people in The Post's newsroom" get that lays "out how the Web site performed in the closely watched metrics – 46 in all."

Forty-six!

For data to be useful you have to ask it specific questions that you know it can answer. A metrics report won't tell you what department to cut, and by how much. However, it could tell you what types of stories get the most and least audiences, and which story types appear to have growing or declining audiences. You can also get indicators – such as the percent of long-form videos people watched to the end, rated and/or commented on – of whether you're engaging audiences.

Then the hard part – the decision – is up to you.

Number wars

What does the war in Iraq have in common with the war news organizations are fighting for their survival?

Nothing at all, unequivocally.

However, I was struck by Owen Bennett-Jones’ closing of his August 27 BBC story about the disparities in the reports of the numbers of civilians killed in Iraq since the war started in 2003:

“It remains true that people tend to cite a number that reflects not their view of the quality of the research but rather their view of the war.”

In other words, getting someone to use your numbers – your definition of success – is just as hard as getting them to change their thinking.

In the world of web analytics, people “tend to present the metric that’s most likely to work in their favor. They’re tracking the wrong way or they don’t want to look at a particular set of data. They are wrongly using analytics as what [Robert Rose of Big Blue Moose] calls a Weapon of Mass Delusion….Worst of all, they are not learning to apply insight to action.”

— from “A Web Analytics Trap,” by Jim Ericson, Information Management, August 13, 2010

Page views: bad metric #3

Why do news organizations persist in using total page views as a measure of success? Perhaps because if you're afraid of numbers then you're even more afraid of bad numbers, or numbers that tell you that your site isn't as successful as you want.

As with unique visitors and time on site, page views is a deeply flawed metric for understanding how a news organization is growing and retaining audiences.

If the number of page views goes up, it could be a good thing. Or, it could be bad.

If page views go down, it could be a bad thing. Or – you guessed it – it could be good.

We would all like to think that a soaring number of page views means lots of people are eagerly pawing through our sites reading everything that's written. However, how many times have you gone to a site and clicked on, say, 12 pages, fruitlessly looking for something?

This is counted as:

1.    One unique visitor
2.    One visit
3.    12 page views

And one dissatisfied person who may not come back.

The page views metric rewards the bad design and navigation that many news sites have (sorry). Most news sites persist in using section titles that are the same as their legacy media product (e.g., "Local News," "Life"), leaving audiences – if they're so inclined – to have to click numerous times before landing on a story about a particular city or activity like gardening.

Or, a site breaks up a story into multiple pages, which can be annoying to a reader and reduces the possibility the reader will read the entire story and rate it, e-mail it or leave a comment. What could be counted as one page view with a comment is counted as, say, five page views.

If a site is redesigned and readers can find what they want with fewer clicks, total page views will – should – go down.

And, dynamic content, or content that changes on the page without the reader clicking anything – isn't counted at all. Streaming stories, videos, widgets, Flash applications, podcasts? One page view.

To truly grow, a news org must understand every action its audiences are taking on its sites. These are challenging times that require news sites to experiment and try many different things. Not all things will work, which means sometimes the numbers will be bad. But – you guessed it – that's a good thing. We have to know if something's not working so we can fix it.

“A trend is a friend”

Someday I’ll give up web analytics and move on to something real like pottery or something, but until then I’ll keep fighting the good fight to get news orgs to stop using monthly unique visitors as an indicator of success.

It’s tempting, I know, to count UVs because the number of monthly subscribers is the standard for print, and the number of people Nielsen says watched a program is the standard for TV.

But technology has made everything different. Strategy is more important than ever, and understanding audiences, not just counting them, is essential.

UVs are counted by counting the number of cookies, or computers,
that go to a site. This means UVs are always significantly overcounted
or undercounted.

If one person uses three computers, it’s counted as
three unique visitors.

Conversely, a number of people going to one computer – for example, at a school or library – means that UVs will be undercounted.

Neil Mason, a web analytics guru who does some pretty thorough research for his clients, noted in a recent ClickZ column that UVs are usually overcounted.

(So that’s the reason why news orgs use total UVs – would they use this number if it were consistently undercounted? Don’t think so…)

Mason notes that while the UV metric is “particularly important for those sites that are dependent on advertising revenues as a major source of
income,” it “must always be treated with caution and never taken at face
value.”

Believe me, it’s rare to hear a web analytics expert use the word “never.”

The advice from the ever-pragmatic Mason: “A trend is a friend.” Analyzing significant increases or decreases over time will give news orgs the information needed to build audiences.

(Trends are indeed friends, but don’t even think about using counts of Facebook friends and Twitter followers!)

Mobile muddle

How do you measure mobile? It’s a mess, even for web analytics gurus like Judah Phillips at Monster.com, who said as much in an interview with IQWorkforce:

“The mobile space is interesting to me too, but it’s very much like traditional web analytics on smaller screens with some absolutely crazy data collection, sessionization, and visitorization challenges.”

Huh? Let’s start at the beginning.

Just as millions of other people, I have a mobile phone. However, I just discovered that I don’t have a “phone.” I have a “device,” or just simply, “a mobile.” That’s right – “mobile” is now a noun.

I guess the definition of “phone” is now limited to something on which you only make or take calls. And, it turns out, even the lowest end mobiles – or lean mobiles – at least have texting capabilities. (If you want to sound like a techie, use SMS, or short message service. Sheesh.)

Up until last year I had a lean mobile with a camera. I loved using the camera but didn’t send photos to anyone because it would have cost me $15, just like that. I didn’t send any texts because each one made or received cost $0.20 each, which meant every time I got an unsolicited text from a company or an unknowing friend I was a little annoyed. It wasn’t the cost. It was just the principle of it all. I neither wanted nor needed these texts, and I had no control over receiving them.

I’m on a smartphone now, an iPhone 3GS, for no particular reason other than it sounded fun. I’m paying $5/month extra to make or get 200 texts, and I’ve found texting pretty useful – so useful that if I find myself doing more than 200 texts/month I’ll probably pay the AT&T rip-off unlimited-text fee of $20/month. I’ve also been doing everything else everyone else does – reading news stories, tweeting, updating my Facebook status, checking in for flights, buying things.

And I have no idea whether I’ve been using mobile apps or the mobile web.

It turns out “There are ‘three worlds’ in mobile: apps, mobile Web and SMS. In the
case of smartphone owners, they will use all three to varying degrees.” (From Internet2Go.net, March 2009.)

You know what “three worlds” means. Three different sets of metrics.

And, guess what? Apps are device-specific, which means there are different sources of metrics for iPhone apps (which, contrary to popular belief, hasn’t taken over the world), BlackBerry, Palm Pre, etc.

And….

Mobile web browsers (e.g., Safari on an iPhone) are also device-specific.

And….

All of those mobile usage numbers from comScore, Ground Truth (a mobile measurement firm) and the like only measure one mobile world, the world of mobile web browsers. They don’t measure usage from apps. And how many people use mobile apps rather than the mobile web, especially for Facebook and Twitter? I dunno – a lot?

The mobile usage numbers may all be flawed, but they all do point to mobile’s continuing rapid growth. So, unfortunately, that means we’ve got to understand what new nouns like “sessionization” and “visitorization” mean.