A cautionary tale from my humble beginnings.

I conducted one of my first usability tests while working at a large interactive agency in Boston in 1998. Our client was a large pharmaceutical company, and I had designed an online glucose diary to help diabetes patients keep track of their glucose levels.

I conducted the 2 days of usability testing with 8 members of the target audience, both Type 1 and Type 2 diabetics (we were able to recruit from patients at the Joslin Diabetes Center in Boston). Not surprisingly, the results uncovered several “gotcha” problems with the interface, but they were easy to fix.

We changed the design, went into development and the online glucose diary launched within a month.

About a year later

I was still working on the team assigned to the pharmaceutical company account. For some reason, the online glucose diary popped into my head as I was having lunch with the client.

“Hey, how’s the online glucose diary doing?” I casually asked.

“Oh, THAT thing,” the client said sardonically. “We took it down.”

“Why?” I asked, flabbergasted.

“The log files were abysmal. Nobody was using it.”

“But it tested so well!” I said.

“Well, I guess it tested well, but I guess we forgot to ask folks if they would actually use the thing,” he chuckled. “There are so many great electronic portable glucose meters now that automatically tabulate your daily, weekly, and yearly levels and your overall treatment of diabetes. There’s really no need for any manual entry.”

He paused, patted me on the shoulder, and added with a grin: “Nice design though, buddy.”

Even though the client seemed to take it in stride, the conversation triggered a series of panic attacks that perhaps only a UX researcher could understand. I was furious at myself. I mentally berated myself incessantly for several days, talking to my researcher alter-ego:

“Why didn’t you ask about actual utility, you dimwit! You spent 2 full days with diabetes patients and you never asked if they would actually use the thing? And you call yourself a researcher!? You’re gonna get fired; that was a huge blunder!”

Well, I didn’t get fired

We continued to produce good work for the client, who remained happy. At the end of the day, nobody seemed to be bothered by the online glucose diary that wasn’t. Only me.

But in retrospect, I shouldn’t have beaten myself up over failing to measure utility. Why? Because with a sample size of 8, it’s highly improbable I would have been able to answer the question of usefulness, or utility, with any credible degree of confidence in the first place. Simply put, I didn’t have enough participants.

Let me explain.

If you’ve been conducting UX research for any amount of time, I’m sure you’re aware of the “Magic Number 5,” which holds that the first 4-5 users in a usability test will find 80 percent of the severe usability problems in an interface, and that additional testing is less likely to reveal new problems. (A previous Verizon colleague of mine, Bob Virzi, wrote one of the seminal papers on that assertion way back in ’92.)

And if you’ve ever conducted a good old-fashioned usability study, you know that you usually start to see the same problems repeating over and over again after the first couple of participants.
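The arithmetic behind that pattern can be sketched in a few lines. If a given problem trips up a share p of users, the chance of seeing it at least once among n participants is 1 - (1 - p)^n. The sketch below is only an illustration: the function name is made up for this example, and p = 0.31 is an assumed value, not a number from my study. Real detection rates vary by product and by problem.

```python
def discovery_rate(p: float, n: int) -> float:
    """Chance that a problem affecting a share p of users
    shows up at least once among n test participants."""
    return 1 - (1 - p) ** n


# ASSUMPTION: p = 0.31 is an illustrative per-participant detection rate.
ASSUMED_P = 0.31

for n in (1, 3, 5, 8, 15):
    print(f"{n:>2} participants: {discovery_rate(ASSUMED_P, n):.0%} chance of seeing the problem")
```

Under that assumption, five participants already give you a better than 80 percent chance of seeing the problem, and each additional participant adds less than the one before.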

However, a sample that small will very rarely be sufficient to ascertain the utility, or usefulness, of a product or feature. Why? The answer gets to the very heart of probability and statistics. But before we go there, let’s first consider what utility is in the first place.

The elusive, difficult-to-measure concept of utility

In my career as a usability specialist, I’ve heard the terms “usability” and “utility” used interchangeably so often that I can’t help but think folks think they are completely synonymous, like “road” and “street” or “car” and “automobile.”

And you can hardly blame them, because utility is a hard concept to define. Indeed, most dictionaries use ambiguous definitions like “fitness for some purpose or worth to some end” (Merriam-Webster) or “the state of being useful, profitable, or beneficial” (Oxford dictionary). Put even more broadly, something is considered to have utility if it simply has the ability to satisfy any human need.

Regarding a website, features and functionality probably contribute the most to utility, but other factors related to marketing, brand perception, or the specific offering of the company will have an impact. And then, there’s the value (and thus utility) that simply comes from being highly used by others.

For example, if your friends use a certain social network, you will probably attach a higher perceived value to its site or app (Are you going to get more traction on Facebook or MySpace?). In addition, studies have shown that participants associate a higher level of utility with websites that are aesthetically pleasing to them.

Usability, on the other hand, is the degree to which a website can be used by specified consumers to achieve quantified objectives with effectiveness, efficiency, and satisfaction in a specific context of use. Ergo, a site may provide great utility, but not be usable. (Think of the software you use to create an expense report after a business trip.) Conversely, the site may be usable, but provide no utility. (Be honest: have you used the web-based version of MapQuest since you upgraded to a phone with a GPS, probably sometime during the Bush administration?)

Wrapping Up

Fortunately, UserZoom offers a wide range of tools to measure both usability and utility. Furthermore, because of the large number of participants you’re able to reach with your study, UserZoom allows you to measure utility with statistical significance.
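To get a feel for why the number of participants matters so much, here is a rough sketch of a 95 percent confidence interval around a “would you use this?” percentage, using the Wilson score interval. The numbers are entirely hypothetical: I never asked the question in my glucose diary study, so “6 of 8” and “300 of 400” are made up for illustration, and the function name is mine.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# HYPOTHETICAL data: 75% of participants say they would use the product,
# once with a lab-sized sample and once with a large online sample.
for yes, total in ((6, 8), (300, 400)):
    low, high = wilson_interval(yes, total)
    print(f"{yes}/{total} said yes: plausible range {low:.0%} to {high:.0%}")
```

With 8 participants, the same 75 percent answer is consistent with anything from a minority to nearly everyone wanting the product. With 400, the plausible range narrows to within a few points of 75 percent.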

In my next post, I’ll explain specifically how to measure utility, introduce the crucial concept of confidence intervals, and show how to make them shrink, making your projections of utility ironclad.