This paper discusses the design of a speech user interface (SUI), reports how users reacted to and performed with it, and summarizes four design challenges.
- For speech systems:
- it is important to adhere to conversational conventions;
- they should be designed from scratch rather than directly translated from their graphical counterparts.
- (Conventional speech-based interfaces) are often characterized by a labyrinth of invisible and tedious hierarchies which result when menu options outnumber telephone keys or when choices overload users’ short-term memory.
- Since physical space presents no constraint for a speech system, the number of commands is virtually unlimited.
- Speaking and listening are two parts of a collective activity.
- For example, after a mail header is read, users hear a prompt tone. Almost all users comfortably take the lead and say something appropriate such as “read the message,” or “skip it.” In these cases, we adequately establish a common ground and therefore are rewarded with a conversation that flows naturally without the use of explicit prompts.
- Users had a strong preference for using their voice to interrupt the synthesizer.
- Prosody and pacing are two important factors for simulating conversational speech.
- The field study and the formative study both indicate that it is unlikely users will have success interacting with a system that uses graphical items as speech buttons or spoken commands.
- Recognition errors can be divided into three categories:
- Rejection – occurs when the recognizer has no hypothesis about what the user said;
- Substitution – occurs when the recognizer mistakes the user’s utterance for a different legal utterance;
- Insertion – occurs when the recognizer interprets noise as a legal utterance.
- For users to succeed with a SUI, they must rely on a different set of mental abilities than is necessary for successful GUI interactions. For example, short-term memory, the ability to maintain a mental model of the system’s state, and the capacity for visualizing the organization of information are all more important cognitive skills for SUI interactions than for GUI interactions.
- Most importantly, the paper identifies four design challenges:
- Simulating conversation;
- Transforming GUIs to SUIs;
- Recognition errors;
- The nature of speech.
- An interesting question remains: how can ordinary computer users (other than the deliberately chosen ‘travelling people’) benefit from speech user interfaces?
- Two issues I don’t think were solved in the paper:
- Overview: how to give an overview of the interface by speech?
- List of UIs: how to help users go through the (long) list of menu items, controls, options, etc.?
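
To make the three recognition error categories noted above concrete, here is a minimal Python sketch that labels a single recognition attempt. It is only an illustration of the definitions, not anything taken from the paper: the names `Outcome` and `classify`, and the idea of representing "noise only" and "no hypothesis" as `None`, are my own assumptions.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    CORRECT = "correct"
    REJECTION = "rejection"
    SUBSTITUTION = "substitution"
    INSERTION = "insertion"
    IGNORED_NOISE = "ignored noise"


def classify(spoken: Optional[str], hypothesis: Optional[str]) -> Outcome:
    """Label one recognition attempt (hypothetical helper).

    spoken:     what the user actually said, or None if there was only noise.
    hypothesis: the legal utterance the recognizer returned, or None if it
                returned nothing.
    """
    if spoken is None:
        # Noise only: an insertion happens if the recognizer still
        # reports a legal utterance.
        if hypothesis is not None:
            return Outcome.INSERTION
        return Outcome.IGNORED_NOISE
    if hypothesis is None:
        # The user spoke, but the recognizer has no hypothesis.
        return Outcome.REJECTION
    if hypothesis != spoken:
        # The user's utterance was mistaken for a different legal one.
        return Outcome.SUBSTITUTION
    return Outcome.CORRECT


if __name__ == "__main__":
    attempts = [
        ("read the message", None),
        ("read the message", "skip it"),
        (None, "skip it"),
        ("skip it", "skip it"),
    ]
    for spoken, hypothesis in attempts:
        print(spoken, "->", hypothesis, ":", classify(spoken, hypothesis).value)
```

The distinction matters for design because each category is visible to the system in a different way: a rejection is known to the system and can trigger a reprompt, whereas substitutions and insertions look like valid input unless the user notices and corrects them.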