Of Modes and Metaphors
In The Humane Interface, Jef Raskin hammers on one particular message: modes result in mode errors, and must be eliminated wherever possible. Along with this, he argues that any gesture must retain the same meaning (or effect) in as many places as possible, and preferably in all places. This anti-modality view is built on the rather inarguable premise that any digital artifact must first be built with the cognitive limitations of people in mind. It follows that, since we have only one “locus of attention”, requiring a user to shift that locus from the content to the context is both inefficient and prone to cause errors, particularly when the context is deep in the periphery of our attention.
Computer interfaces are still far from “humane” in many cases, so a focus on cognitive ergonomics, or cognetics, is hardly problematic even today. However, some things strike me as particularly… peculiar.
A multiplicity of layered, changing or even conflicting meanings in human gestures is the rule of our daily lives. It’s at the heart of a passive-aggressive act, for example. We humans have a fantastic ability to move between models of structuring the world. We are also fantastic at misunderstanding the contexts within which actions are meant to be taken. Computers are no different. It seems their biggest problem is an exaggerated inability to understand the context within which our actions are meant to take place.
The simple solution, of course, is to remove the potential for different contexts. A communicative flat land. This is Raskin’s ideal. No applications, no files, no modes. One context, no ambiguity.
While the elimination of preventable errors is noble, I’m not convinced that a paradigm for designing interfaces that results in the removal of metaphors is either desirable or even the most efficient. Moreover, forcing users into a land of a single context smacks of the very bend-the-user-to-the-computer practice that Raskin is himself trying to eliminate in his book.
Perhaps some manipulations are universal, or close to it: Cut and Paste, Copy, Undo and Redo. The pervasiveness of these actions is due to the reconfigurable qualities of computation as much as to the structure of human thought. Outside of them, there are plenty of actions that might have very beneficial contextual interpretations. I don’t want auto-indentation when typing a letter, but I do want it when writing code. A parenthesis is an individual character, but if I type it while a line of text is selected, I’d rather it bracket the content than replace it with “(”. We might want windows to be in the same place when waking a computer from sleep, but I doubt everyone wants their browser, on launching, to display the last web page they were on when sitting in a coffee shop.
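To make the parenthesis and auto-indentation examples concrete, here is a minimal sketch of a keystroke handler whose result depends on context. Everything in it is an assumption made for illustration: the `Buffer` class, its `kind` label ("letter" or "code"), and the `type_char` and `press_enter` functions are hypothetical names, not anything taken from Raskin or from a real editor.

```python
from dataclasses import dataclass

# Hypothetical bracket pairs that wrap a selection instead of replacing it.
PAIRS = {"(": ")", "[": "]", "{": "}"}


@dataclass
class Buffer:
    """An illustrative text buffer; the selection is the slice [start, end)."""
    text: str
    start: int            # equals `end` when nothing is selected
    end: int
    kind: str = "letter"  # assumed context label: "letter" or "code"


def type_char(buf: Buffer, ch: str) -> Buffer:
    """Type one character; an opening bracket wraps a non-empty selection."""
    selected = buf.text[buf.start:buf.end]
    if ch in PAIRS and selected:
        replacement = ch + selected + PAIRS[ch]
        # Keep the original content selected inside the new brackets.
        new_start, new_end = buf.start + 1, buf.end + 1
    else:
        # Ordinary typing: the character replaces whatever was selected.
        replacement = ch
        new_start = new_end = buf.start + 1
    text = buf.text[:buf.start] + replacement + buf.text[buf.end:]
    return Buffer(text, new_start, new_end, buf.kind)


def press_enter(buf: Buffer) -> Buffer:
    """Insert a newline; carry over indentation only in a "code" buffer."""
    indent = ""
    if buf.kind == "code":
        line_start = buf.text.rfind("\n", 0, buf.start) + 1
        line = buf.text[line_start:buf.start]
        indent = line[:len(line) - len(line.lstrip(" \t"))]
    inserted = "\n" + indent
    text = buf.text[:buf.start] + inserted + buf.text[buf.end:]
    caret = buf.start + len(inserted)
    return Buffer(text, caret, caret, buf.kind)
```

The same key does different things here, but in each case the deciding context (a visible selection, the kind of document being written) is something the user is already attending to.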
The discrepancy between a computer’s understanding of context and our own persists even when modes are removed. It is especially visible in the social faux pas on Facebook in which new users unintentionally publish, publicly, information meant to be private. Raskin might argue that this stems from the modality created by applications, perhaps applications such as an email client. But no matter how ideally ignorant a user should be of the computer’s mode, ignorance on the system’s part of the context of the user will always cause problems.
The issue seems to lie not in designing for a universal context (though this can also help in avoiding bad design choices), but in understanding how people float between contexts and how this can be facilitated by a computer, not ignored.
The idea of computers responding to emotions is briefly mentioned. Raskin quickly sets it aside as a nice thought, since at the time of writing the ability of computers to do this successfully, or of designers to account for it, was nil. Fair enough. However, he never addresses the emotions that users have, regardless of whether the computer reads them or not. Raskin talks of designing humane interfaces, polite and respectable interfaces, but his guidelines account for people as if they were machinery.
When computers are capable of perceiving the human’s emotional state, then it will be time enough to consider how they should respond. I expect that to be sensible within two or three decades, but it seems harsh to criticize Jef for not addressing that when we still don’t ban modal dialog boxes, which are guaranteed to break the user’s train of thought.
I nitpicked several versions of the manuscript of THI and still took a long time to realize how fundamental the singular focus of conscious attention is to how we ought to design programs to work for people.
Jef succeeded amazingly well with the Cat in 1987 and didn’t even crow about the fact that twenty thousand users never found a single bug. Everyone around computers knows that such a thing has never happened, but it did, and Jef made it so.
When he discussed the zoom world he designed for the hospital information system described in THI, he failed to mention that utter novices were comfortable and competent with less than one minute of training. With such a system we could teach grandparents or professors to use computers, easily. Indeed, even computer experts got it in less than two minutes of training. Such is the power of well-designed modeless systems.
@Richard Karpinski
Thanks for your comments.
Let me first say that I do believe that Jef Raskin had some fantastic insights that resulted not only in the successes of the Macintosh and the Cat, as you mention, but also served as the foundation for applications like Enso and Quicksilver, which I couldn’t go without.
Secondly, I’d like to clarify myself. I don’t criticize Raskin for not designing systems that respond to emotion. This is, of course, ridiculously unfeasible to do, at least in any generalized fashion, even now, a decade later. However, I do hold that his views relegated users to the status of thinking, task-centric executives devoid of other, characteristically human qualities.
While I think Raskin’s approach resulted in useful insights, I believe our relationships with computing technology are far more complex, nuanced, and messy. This reality is either overlooked or ignored when approaching human-centered design as design-for-a-single-locus-of-attention. It’s telling that when arguing for monotony in interfaces Raskin describes how this approach is useful even when machines are acting as users…
I’m not saying that non-modality is a bad idea. I’m saying that if truly “Humane” computer systems are going to be designed, then Raskin’s strictly cognition-centric views will need to be sublimated into a framework that takes a more sophisticated view of the qualities that define us humans. I don’t think Raskin should be tossed out, but I refuse to believe that THI is the last word on the design of interactive artifacts.
Indeed, I quite agree with your response. I would make one point. The context of, for example, a phrase in italics forces any characters entered there to be in italics as well. This kind of mode, which is visible at the focus of the typist’s attention, causes no problems for users or for Raskin.
What he was seeking was automaticity, so that the fingers know what to do and the typist’s attention can remain on the content rather than the mechanism being used to modify the content. Switching attention to the mechanism and then back to the content is surprisingly time consuming, and thoughts can suddenly be driven from the mind and lost, sometimes forever.
Since we live with graphical input devices as well as keyboards, monotony is compromised whenever something can be accomplished with the user’s choice between the two devices. Some people learn one way of doing such a thing so well that they use it even when their hand is already on the other device. That takes more time than is necessary, just to move to the other device and then back.
You might be interested in the author’s summary of the rules and principles from The Humane Interface. You can see that on my website and it might even clarify some of the ideas to see them so succinctly summarized.
Thanks for providing another opportunity for me to think about these ideas.
I’m familiar with these summations, but I’m not sure how much relevance they bear on my argument. For various reasons, I think computers should facilitate seemingly homonymous actions. I think a singular prescribed metaphor (or input device) is where a designer should start, but it should not be the goal of an ideal computer system.
Switching between mouse and keyboard is problematic in its mechanical efficacy, not its cognitive ergonomics. Both pointing and typing are usable, useful, and enjoyable, and neither is going away. The problem is not how to remove the cognitive load, but how to facilitate the metaphorical switches in thinking that happen anyway: for example, the switch between TEXT is ONE DIMENSIONAL and TEXT is TWO DIMENSIONAL. Each metaphor frames the situation in ways that allow for actions whose results are sometimes novel and sometimes similar, if not identical. If I’m typing and want to copy my previous sentence, I should be able to do it with the keyboard. If I’m programming and want to select a column of text, it’s cognitively easier to switch to a two-dimensional metaphor and use a pointing device than to rethink the action in one-dimensional terms. The switch to a pointing device is only a problem when it is a problem. On my laptop, for example, my trackpad is right next to my keyboard, and I can tell you the design is tremendously efficient in allowing me to move from one-dimensional input (typing) to two-dimensional input (pointing). Even so, if I’d then like to move my selected column, it’s cognitively easiest to keep the same metaphoric framing (TEXT is TWO DIMENSIONAL). The system should support this, even if there’s a keyboard-based action with identical results.
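As a rough illustration of the two framings, the sketch below expresses one column selection both ways: as a single rectangle (TEXT is TWO DIMENSIONAL) and as the list of character-offset ranges a purely linear model would need (TEXT is ONE DIMENSIONAL). The function name `column_to_ranges` and its parameters are made up for this example and assume newline-separated text with end-exclusive bounds.

```python
def column_to_ranges(text, top, bottom, left, right):
    """Translate a rectangle (rows top..bottom, columns left..right,
    both end-exclusive) into one-dimensional (start, end) offset ranges."""
    ranges = []
    offset = 0  # offset of the current line's first character
    for row, line in enumerate(text.split("\n")):
        if top <= row < bottom:
            # Clamp to the line length so short lines yield shorter ranges.
            start = offset + min(left, len(line))
            end = offset + min(right, len(line))
            ranges.append((start, end))
        offset += len(line) + 1  # +1 accounts for the newline
    return ranges


text = "ab12\ncd34\nef56"
ranges = column_to_ranges(text, top=0, bottom=3, left=2, right=4)
column = [text[start:end] for start, end in ranges]
```

One rectangle in the two-dimensional framing becomes one range per row in the linear framing, which is roughly the rethinking a keyboard-only column selection asks of the user.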
Does this mean that having more ways of doing the same thing is good? Of course not. The idea is that computers should not only allow for a user’s mental model, but also facilitate switches between models of the same entity. If designed improperly, the result will of course be disaster. Being able to do two things well is ideal, but being able to do one thing well is preferable to doing two things poorly.