Monday, May 3, 2010

Interactive computing systems and the relationship between embodiment and teachability

In this post, I am going to talk about the notion of "embodiment" in human-computer interaction. In particular, I will explain how my understanding of the term has changed and how experiences with teaching my Dad how to use certain programs (Picasa, the Indian Railways online website, etc.) revealed to me another facet of the term: the relationship between the embodiment and teachability of an interactive system. If one wants to teach someone how to perform a certain task on an embodied interactive system, then written instructions (without pictures or visual aids) are almost always insufficient. In other words, one measure of the embodiment of an interactive system can be found by looking at the efficacy of written instructions in helping a novice perform a certain task.

Paul Dourish's "Where the Action Is" is possibly one of my favorite books. The book's thesis is that as computers have developed, our interactions with them have changed in nature and have progressively become more "embodied." Dourish divides the history of computer systems into four successive periods based on the mode of interaction:
  • Electrical: To program a computer, one had to rewire its hardware.
  • Symbolic: Abstraction was introduced and separated hardware from software. Abstraction meant that coding got progressively simpler: from machine language to assembly language to high-level Fortran-like languages.
  • Textual: This refers to the development of command-line interfaces. These, for the first time, made interacting with the computer seem like a "conversation."
  • Graphical: Finally, there was the development of the graphical user interface (GUI) with its desktop metaphor and 2-dimensional arrangement of files and icons. The 2-dimensionality of the GUI meant that users were able to exploit "further areas of human ability as part of the interactive experience." This meant the use of faculties such as peripheral attention, pattern recognition and spatial reasoning.
Dourish sees each successive stage as involving more and more of the distinctively "human" capabilities, i.e., the skills we use most in our everyday interactions with other human beings. Embodied interaction, as Dourish defines it, therefore means interaction that draws on more and more of these distinctively human skills. These could be bodily skills (like pointing, gesturing, moving, pattern recognition) or social skills (our workplace habits, our everyday assumptions, etc.). In other words, embodied interaction is a move towards integrating more and more of our real-world practices (at home, at work, at play and so on) into our interactions with computing systems.

There is another aspect to embodied interaction that has slowly come to my attention as I have been trying to teach my Dad to use computer software (email, Picasa, certain websites, etc.). It involves what I call its teachability.

Let me give some background. My father worked during a time when computers were not such a ubiquitous part of the everyday work environment. In particular, he worked at a time when there were special people who did computer work, so not everybody needed to use the computer. Moreover, the computer was used for what can be called high-tech scientific stuff; it wasn't used at all for the mundane things at work: sending emails and memos, filling out time-sheets, submitting vouchers for reimbursement, etc.

Consequently, my Dad never interacted with computers in any sustained way when he worked. But he was certainly aware of them. That was back when the command line (Dourish's 3rd stage) was the primary mode of interacting with computers; MS-DOS ruled. When my Dad did start interacting with computers in a sustained way (primarily to keep in touch with me here in the US), he was dealing with the GUI, a new and much more embodied stage of human-computer interaction.

So he would ask me to write down instructions for how to do a certain action. I would agree, but I found writing the instructions extraordinarily hard. Here's an example of what I mean. Copying a file from one directory to another in DOS takes one simple command, of the general form:

    copy [source directory]\[file name] [target directory]

Now of course, even here, there are variations. You don't have to specify the directory if you want to copy something from or to the same directory you are in, etc. But the command itself is fairly simple and easy to understand.

Now consider doing the same thing in Windows. There turn out to be quite a few ways to copy a file from one folder to another (drag-and-drop, the Edit menu, the context menu). I'm going to consider one of them here to illustrate its complexity:
Go to the folder you want to copy from. Select the file and copy it (this can be done either by right-clicking the file and selecting "Copy" or by selecting the file and selecting "Copy" from the Edit menu). Then go to your target folder and paste the file there (again, this can be done in two ways, right-clicking in the target folder and selecting "Paste" or selecting "Paste" from the Edit menu).
Just the sheer complexity of what I wrote above makes it clear that, when it comes to giving written instructions, the command-line interface is far simpler than the GUI and involves fewer tacit assumptions (phrases like "go to the folder" and "select the file" above).

Talking about the GUI involves an almost taken-for-granted use of metaphors. Consider, for example:

"Go to the folder" / "Be in the folder" / "Be in some application": "Go to a folder" implies that the folder is a place. "Go in a folder" implies that the folder is like a house which you can go inside. I also frequently ask him to "be inside" some folder or some application.

"Select the file": This is not really a metaphor, although it comes up frequently in my discussions with my father. Many times, when I instruct him over the telephone to "select the file," he doesn't get what I mean, or he asks me whether I mean to left-click or right-click on the file icon. I also frequently ask him to "select the text" (e.g. to copy a hyperlink that I sent him from Skype into Firefox and open it).

How does one teach people to use embodied interactive systems (GUIs), particularly when they have very little experience with computers, don't use the computer frequently and, most importantly, are distant from us, so that we can only communicate with them through spoken and written instructions? Some points and observations:

Written instructions without pictures are almost completely useless.

Written instructions with pictures (screen-shots) can be useful.

However, it is best to give spoken instructions, especially during the task itself*. That way, your instructions can be followed in real time and you will get immediate feedback about their intelligibility (i.e. whether or not they're understood) and efficacy (whether or not they worked).

Having a shared visual reference is even better. That is, if you are teaching someone how to use the Indian Railways ticketing website, it is best to have it open in front of you and to carry out each instruction yourself, so that you know exactly what the person you are instructing is looking at. This is easy to do for websites but much harder if you are instructing someone about how to use Picasa or how to copy files from a flash drive to the hard disk.

This leads to the best possible and most embodied form of instruction: screen-sharing along with a voice communication channel. This way, you know exactly what the other person sees, and you can modify an instruction depending on how you see it being interpreted. E.g. if you ask someone to copy a file to a certain folder and this doesn't seem to get through, you can immediately supply them with instructions on how to copy files and take them through it step by step.

I guess the broader point that I am trying to build to is this: embodied interaction uses many bodily and social skills, and these are almost always tacit; it is hard to translate them into words. Verbal instructions (spoken and written) need to be backed up with visual aids like screen-shots and screen-sharing. It follows that one rough measure of the embodiment of an interactive system is the efficacy of written and spoken instructions on their own, especially when the student and the teacher are distant and the student is a novice at using computers: the less such instructions suffice, the more embodied the system.

*Whether your instructions will be remembered is another question.
