Friday, May 23, 2008

Questioning Tog: Does the Mouse Really Rule?

Did you hear this one? According to Tog, the keyboard feels faster than the mouse, but it's actually slower. (This came via Tim Bray and John Gruber.)

Tog says Apple spent $50 million on user interface related R&D by 1989, and learned, among (hopefully numerous) other things, that:

  • Test subjects consistently report that keyboarding is faster than mousing.
  • The stopwatch consistently proves mousing is faster than keyboarding.
Tog's explanation: using the mouse is easy and boring, so it feels slower than recalling keyboard shortcuts, even though "the stopwatch will tell you" that it's actually faster.

No matter how many times I've read this very general statement, it never sounded right to me. Nobody seemed to ever mention any details. When I finally took the plunge and read Tog's actual analysis (parts 1, 2 and 3), I found out that the details still weren't there.

I would really like to know the answers to the following questions, for example:
  1. How many different functions did the subjects have to perform in these tests?
  2. What exactly did accomplishing a task by "mousing" and by "keyboarding" involve?
  3. How many key combinations did the user have to remember for a test? How difficult were these key combinations to enter? How easy were they to remember?
  4. How many mouse targets needed to be acquired, how big were they, and how far apart were they positioned?
  5. Did the users have any previous experience with acquiring similar mouse targets to those featured in the test?
  6. Did the users have any previous experience with using similar (or identical) keyboard combinations to those in the test?
  7. Was the user manipulating an object (like a text box or an image) between these "tasks"? If so, how big was the visual representation of this object? Did the user have to return to the object after acquiring the target needed for the task? If so, was it also timed?
I think Tog's "mousing is faster than keyboarding" statement is akin to saying that "walking is faster than driving," and neglecting to mention whether you were running your tests on a speedway or on a staircase.


Tog's test revisited

Tog does, however, give us some insight into the methodology of at least one of his tests in the third part of his discussion:
The test I did I did several years ago, frankly, I entered into for the express purpose of letting cursor keys win, just to prove they could in some cases be faster than the mouse. Using Microsoft Word on a Macintosh, I typed in a paragraph of text, then replaced every instance of an "e" with a vertical bar (|). The test subject's task was to replace every | with an "e." Just to make it even harder, the test subjects, when using the mouse, were forbidden to just drop the cursor to the right of the | and then use the delete key to get rid of it. Instead, they had to actually drag the mouse pointer across the one-pixel width of the character to select it, then press the "e" key to replace it.

The average time for the cursor keys was 99.43 seconds, for the mouse, 50.22 seconds. I also asked the test subjects which method was faster, and to a person they reported that the cursor keys were much, much faster. This was a classic example of the difference between subjective time (the passage of time the user experiences) and objective time (the passage of time the clock experiences). Put simply, the more mentally engaging the task, the shorter the time appears.
While I'm not a user interface expert, this passage leaves me scratching my head. It does seem to prove an interesting point on perception vs. reality, but it definitely fails to prove that the mouse is faster in general.

All it proves is that acquiring targets (vertical bar characters) scattered over a large area of the screen is more efficient with the mouse than with the keyboard. As we'll see in a minute, moving to smaller distances (especially single steps) would be much easier with the keyboard.

The same way, if you need to pick up 20 parcels which are miles apart, you'll probably take a car or a bike, but not if the same parcels are placed on the first 20 steps of a staircase.

As the vertical bars can be pretty far from each other, moving your pointer between them constitutes a significant part of the test, so it should favor the mouse, the faster vehicle – so I thought. But when I ran the test* on myself as a (somewhat questionable) test subject, using TextEdit and a paragraph from an Ask Tog article, I got the following results:
  • Mouse: 143 s
  • Keyboard: 133.5 s
I simply kept pressing the right arrow key until a vertical bar came up, at which point I selected it using Shift-Right arrow, and replaced it by pressing the "e" key. Surprisingly, and contrary to Tog's data, it was actually the keyboard that turned out to be marginally faster in this test.
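To make the procedure concrete, here is a minimal Python sketch that replays the plain arrow-key method on a string and counts key presses. The function name and the counting rule (one press per character moved, even though in practice the arrow key was held down) are assumptions made for illustration; press counts say nothing about elapsed time.

```python
def replace_bars_with_arrows(text, bar="|", letter="e"):
    """Replay the plain arrow-key method and count key presses.

    Returns (edited_text, presses). Hypothetical helper, for illustration only.
    """
    chars = list(text)
    presses = 0
    last_bar = text.rfind(bar)  # nothing left to do past the final bar
    for cursor in range(last_bar + 1):
        if chars[cursor] == bar:
            chars[cursor] = letter
            presses += 2        # Shift-Right to select, then the letter key
        else:
            presses += 1        # Right arrow over an ordinary character
    return "".join(chars), presses


edited, presses = replace_bars_with_arrows("Th| qu|st")
print(edited, presses)
```

The count grows with the length of the text rather than with the number of bars, which is why long stretches without a bar are where this method loses ground.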

By the way, having performed this test, I really fail to see how much it has to do with what Tog calls the "high-level cognitive function" of deciding "upon which special-function key to press." Maybe I'm really smart, but I didn't find that holding down an arrow key would tax my cognitive capabilities that much.

I also noticed how unnatural it felt to be using the mouse for the task. I was struggling to exert the right amount of force needed for precision alignment. It felt unbelievably tedious and even physically painful – my right wrist actually hurt afterwards. (Could all this suffering be another reason why using the mouse for the task seems longer than it actually is?)

*These test results are meant to serve little more than entertainment purposes. One (biased) subject, who happens to be the same person who devised the test, is less than statistically convincing. Yet I've made efforts to make these tests as meaningful as possible. I performed every task several times, and averaged the results. When I felt that fatigue, proficiency achieved by practice, or other factors interfered with my results, I kept re-doing the tests, switching their order, and sometimes taking long breaks from them, until I found that my results stabilized around a value, and that any difference between the values I got for the two or three tasks I was comparing was representative rather than accidental. I also honestly tried to achieve a reasonably good score in each test, without going into extremes. (I basically pretended that I had to perform actual work, in an office setting, on a looming deadline. I'm an average typist, and a power mouse user, having spent a decade working in Photoshop, Illustrator, InDesign, and Final Cut Pro, all particularly mouse-heavy applications.)


The mouse fights back

It occurred to me later that my results may have been affected by the font size I happened to be using. It was 13 points (Verdana) on a 17" screen at a 1440 x 900 resolution; Tog might have used a larger font and/or a smaller screen resolution almost two decades earlier.

Sure enough, setting the font size to 22 points turned the results around, improving the speed of the mouse by a large margin, and letting it score a narrow victory over the keyboard (though nothing like the double speed that Tog reported). Keyboard times also improved a bit:
  • Mouse, large font: 113 s
  • Keyboard, large font: 122.5 s
So Tog seems to have neglected a very important factor in his keyboard vs. mouse test: font size. Larger mouse targets are obviously easier to acquire, so the point size of the text you're editing can decide which method is better or more efficient.
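A common way to reason about target size is Fitts's law, which predicts pointing time from the distance to a target and its width. It is not something I measured against here, so treat the sketch below as an outside illustration: the constants `a` and `b` and the pixel values are placeholders, not figures from these tests.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts's index of difficulty in bits (Shannon formulation)."""
    return math.log2(distance / width + 1)


def fitts_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds.

    a and b are placeholder constants; real values have to be fitted
    to measurements for a given user and pointing device.
    """
    return a + b * index_of_difficulty(distance, width)


# Same travel distance, hypothetical bar widths for a small and a larger font.
small_font = fitts_time(distance=300, width=4)
large_font = fitts_time(distance=300, width=8)
print(f"narrow target: {small_font:.2f} s, wide target: {large_font:.2f} s")
```

In this model, doubling the target width at a fixed distance always lowers the predicted time, which matches the direction of the mouse results at 13 and 22 points, though not necessarily their size.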

That got me curious. What happens if we increase the font size to a ridiculous 200 points? I found out that the difference between the mouse and the keyboard almost disappeared in this dreadfully impossible task. In addition, the test results varied greatly, and repeating the test again and again helped improve times by leaps and bounds as I learned new tricks on the way. The keyboard started to win again, though mouse performance improved by a small margin every time I repeated the test. I averaged the results when I felt that the curve of the mouse improvements was flattening out:
  • Mouse, huge font: 155.5 s
  • Keyboard, huge font: 133 s

Beating the test

But there's more. Trying to speed up my "keyboarding," I tried the Alt-Right Arrow key combination, which jumps one whole word ahead, making navigation faster – and stumbled upon a "cheat": TextEdit considers the vertical bar a word boundary, thus whenever one shows up, Alt-Right arrow will stop the pointer right before it. Using this cheat (which involved more cognitive brain functionality, as well as some keyboard acrobatics), I needed more concentration than before, so the task felt much more difficult, and I also had to stop and correct some errors (something I didn't have to do using the other two methods). However, it cut my keyboarding time by a fourth, beating the hell out of the mouse at both normal and large font sizes:
  • Keyboard "cheat," normal font: 95.5 s
  • Keyboard "cheat," large font: 94.5 s
Notably, increasing the font size did nothing here. My theory is that finding each vertical bar was more or less automated by the cheat, so better visibility wasn't much help.
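The cheat can be sketched in Python as well. This is a simplified model, not TextEdit's actual word-boundary logic: it assumes a "word" is a run of alphanumeric characters and that a word jump always stops right before a vertical bar. Both function names are hypothetical.

```python
def word_jump(chars, cursor, bar):
    """Assumed model of Alt-Right: move to the next word end,
    stopping early if the cursor ends up right before a bar."""
    n = len(chars)
    # Skip separators such as spaces and punctuation.
    while cursor < n and not chars[cursor].isalnum():
        if chars[cursor] == bar:
            return cursor
        cursor += 1
    # Skip the letters of the word itself; a bar is not a letter, so it ends the run.
    while cursor < n and chars[cursor].isalnum():
        cursor += 1
    return cursor


def replace_bars_with_word_jumps(text, bar="|", letter="e"):
    """Replay the word-jump method; returns (edited_text, presses)."""
    chars = list(text)
    cursor = 0
    presses = 0
    while bar in chars[cursor:]:
        if chars[cursor] != bar:
            cursor = word_jump(chars, cursor, bar)
            presses += 1        # one Alt-Right
            continue
        chars[cursor] = letter
        presses += 2            # Shift-Right to select, then the letter key
        cursor += 1
    return "".join(chars), presses


print(replace_bars_with_word_jumps("Th| qu|st"))
```

Under this model the number of presses depends on the number of words and bars rather than on the number of characters, which fits the observation that the cheat did most of the searching for me.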

One side effect of the cheat was that I ended up inadvertently typing a lot of uppercase "E"s. I think the reasons are that here (1) I had to use two modifier keys (Alt and Shift), (2) I had to switch between them pretty fast, and (3) almost every second navigational keystroke (i.e. Alt-Arrow) was followed by a selection (Shift-Arrow), then immediately by a replacement ("e") keystroke, requiring me to switch between three different key commands very often, making me mess up one of them a lot.

Using a huge font size slowed down this task as well, but it was still the fastest way at that size:
  • Keyboard "cheat," huge font: 123 s
The table and graph below summarize how the font size change affects the three input methods tested. The straight lines connecting the dots should in no way be interpreted as implying continuity; they are provided mostly for better visibility (and as hints at very vague trends).
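For anyone who wants to play with the numbers, the averages reported above can be collected in a few lines of Python. The figures are copied from this post (normal is 13 points, large is 22 points, huge is 200 points); the names `TIMES`, `format_table` and `fastest` are made up for this sketch.

```python
# Average times in seconds, copied from the results reported in this post.
TIMES = {
    "Mouse": {"normal": 143.0, "large": 113.0, "huge": 155.5},
    "Keyboard": {"normal": 133.5, "large": 122.5, "huge": 133.0},
    'Keyboard "cheat"': {"normal": 95.5, "large": 94.5, "huge": 123.0},
}
SIZES = ["normal", "large", "huge"]


def format_table(times, sizes):
    """Render the timings as a fixed-width text table."""
    rows = [f"{'Method':<18}" + "".join(f"{size:>8}" for size in sizes)]
    for method, by_size in times.items():
        cells = "".join(f"{by_size[size]:>8.1f}" for size in sizes)
        rows.append(f"{method:<18}" + cells)
    return "\n".join(rows)


def fastest(times, size):
    """Name of the method with the lowest average time at a font size."""
    return min(times, key=lambda method: times[method][size])


print(format_table(TIMES, SIZES))
for size in SIZES:
    print(f"fastest at {size} size: {fastest(TIMES, size)}")
```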

I could come up with several pages' worth of analysis, comparing and contrasting how font size affects each test case, but suffice it to say that without specifying everything down to the smallest details, it's just about impossible to decide whether the keyboard or the mouse is faster even in a very simple test like this. Tog seems to have oversimplified things a lot.



The keyboard's turn

In order to investigate things a bit further, I also came up with a test of my own, one where I felt the keyboard would have the edge, as it required smaller, more precise navigational movements.

In my test, the subject needed to replace every second letter of every word in a (much shorter) paragraph with the letter "p." I was the test subject again, and these were my results:
  • Mouse, normal font: 137 s
  • Keyboard, normal font: 90 s
  • Mouse, large font: 118 s
  • Keyboard, large font: 76.5 s
Boy, did I manage to contrive a test that favors the keyboard! As expected, it performed better as a means for selecting every second character of a word. Rhythmically pushing the right arrow twice would always do the trick, whereas with the mouse, I had to move past small characters of varying width*, and had to strain myself to position the pointer accurately. It seemed unnatural.

Replacing the letters was identical in both cases, so it didn't influence the results. (Or did it? We'll get back to that very soon.)

Somewhat surprisingly, a larger font size helped both tasks equally well. I expected mousing to benefit more, as the targets became larger, thus easier to acquire. Investigating the reason for this anomaly is beyond the scope of my blog post, but it, again, proves my main point: that things are complicated.

If you think by now that I'm some sort of a keyboard evangelist and mouse hater, rigging tests so that they will always let the keyboard win, then well, you've got that mouse pointer hovering over the wrong guy. No, I'm the guy who keeps rigging tests in small ways that change the results in big ways, trying to show that simplistic statements like Tog's are simply wrong. And ultimately, I just want to help the case of redundancy and freedom of user choice in user interface design.

So then I changed a little aspect of my test: instead of the letter "p," the subject now had to enter the "¶" character in place of every second letter of every word. This turned the test on its head, making the previously victorious keyboard lose, at first by a Tog-uesque margin of nearly 50%.

What gives? Well, it's simple. While the subject (i.e. yours truly) could easily rest his fingers on the "p" key while navigating with the keyboard in the previous version, that was no longer possible with the hard-to-reach Alt-"7" combination needed for the "¶" character. Thus, every time I had to switch between entering that symbol and navigating/selecting, I had to lift my fingers off the keyboard and reposition them, losing a lot of time. (With practice, my times improved greatly. From an initial dismal 184 seconds, which I excluded from my calculations as an anomaly, I reached a much more respectable average of 149 seconds by the time I finally decided to give the whole frustrating mess a rest. As always, the number I provide is an average. However, an interesting trend to note is that this is an initially huge margin that gets significantly smaller with practice.)
  • Mouse, large font: 126 s
  • Keyboard, large font: 149 s

So, guess what? You can't say that "the keyboard is faster." You can't even say, "The keyboard is faster when replacing every second letter of every word in a text." No, you even have to say what character you're replacing it with. It gets as specific as that.
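To put rough numbers on that swing, here is a small Python calculation using the averages reported above for the three variants of this test. `percent_slower` is a helper made up for the sketch; it expresses the gap as a percentage of the faster time, which is only one of several reasonable ways to state a margin.

```python
def percent_slower(slower, faster):
    """How much longer the slower time is, as a percentage of the faster one."""
    return (slower - faster) / faster * 100


# (mouse seconds, keyboard seconds), as reported in this post.
RESULTS = {
    '"p", normal font': (137.0, 90.0),
    '"p", large font': (118.0, 76.5),
    '"¶", large font': (126.0, 149.0),
}

for label, (mouse, keyboard) in RESULTS.items():
    if mouse > keyboard:
        print(f"{label}: mouse {percent_slower(mouse, keyboard):.0f}% slower")
    else:
        print(f"{label}: keyboard {percent_slower(keyboard, mouse):.0f}% slower")
```

By this measure the mouse is a little over 50% slower in both "p" variants, while the keyboard is about 18% slower once the replacement character becomes "¶".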

*Switching to a monospaced font didn't help much, though. I ran the test with Monaco, and didn't notice any statistically significant improvements.


Conclusions

After testing some very narrow areas of the huge field of "user interaction by keyboard vs. mouse," all I can say is that both input methods have their strengths and weaknesses, and their optimum fields of use should be analyzed and investigated much more carefully than Tog appears to have done.

Personally, I think it's best to let the user decide which method(s) to use, and to provide both keyboard- and mouse-based functionality wherever it makes even a little sense. But providing several methods isn't enough: you also have to get them right. Key commands and mouse targets are easy to mess up, leaving the user with useless input methods even in the contexts where they would prefer them.

Also, even though some specific input methods may have intimidating learning curves, in some cases, they may be worth the effort (e.g. in the case of the iPhone keypad). Obviously, users may be a bit tough to sell on such user interface choices, so they must be explained in a careful and convincing way.

Finally, I also wonder whether it's always speed that matters. It may easily be the case that the faster way of performing a task is more exhausting, less natural, or simply less fun. Speed may or may not be important; the developer simply can't imagine every possible way his or her product will be used. Maybe a task you imagined to be pretty rare as a developer will end up being the one that your user needs to perform a thousand times in the course of two hours – so there had better be a fast way of doing it. And while you're at it, why not also create an easy way, and a fun way? Redundancy can be a great thing in the world of user interfaces.

That is, unless you can find a perfect way, one which is easy, fun, fast and natural in every situation – I'm hard-pressed to find too many examples, though.

1 comment:

Anonymous said...

Excellent analysis. The 20 year old Apple test might be significant for deciding which interaction is most useful for new users without a specific task, something Apple products are tuned for. It doesn't do a good job of addressing the best techniques for skilled users with specific tasks.

Thanks for taking a shot at analyzing other scenarios.