Error Handling Of Spoken Dialogue System
Essay by 24 • November 30, 2010 • 3,813 Words (16 Pages) • 1,449 Views
Abstract
The widespread application of spoken dialogue systems will make people's lives considerably easier and will also create enormous business opportunities. However, due to imperfect recognition components, the occurrence of errors is unavoidable. This inherent defect has hindered the development of the technology.
In this paper, I give an overview of error handling techniques, covering the importance of error handling, the sources of errors, and some error handling approaches. In addition, some active research directions are summarized.
1 Introduction
Speech is the most natural way for human beings to communicate. Spoken dialogue systems make it possible for humans to communicate with machines using speech. Spoken dialogue system technology has grown rapidly in recent years, thanks to advances in natural language processing techniques. This trend is also driven by economic interests. Spoken dialogue systems have a wide range of applications, including computer-based call centers, interactive enquiry systems, and other hands-free applications. Although the development of spoken dialogue systems has been impressive, some technical obstacles still prevent them from reaching their full potential. One of the most serious problems is how to deal with errors.
Designing a robust spoken dialogue system is a challenging task. Although automatic speech recognition (ASR) has reached a relatively successful level, it is still imperfect and is often blamed for the uncertainty in spoken dialogue systems. Moreover, a spoken dialogue system can never know for certain what the user has said; it merely makes decisions based on the values of some parameters. This is the second source of uncertainty.
Miscommunication often happens in human-human dialogue, let alone in human-computer dialogue. Errors in spoken dialogue systems have become a bottleneck that constrains the adoption of spoken dialogue system technology in real life. Since the occurrence of errors in a spoken dialogue system is unavoidable, the system should be equipped with well-designed error handling mechanisms.
In this paper, I give an overview of error handling techniques. The first two parts discuss the importance of error handling and the sources of errors in spoken dialogue systems. After this, I review some related studies from the following aspects: human-human recovery strategies, studies on the human side, and the application of machine learning theory to error handling.
2 Why is error handling important?
A spoken dialogue system is a dialogue system delivered through voice [1].
Basically, a spoken dialogue system should combine the following modules:
1. Speech recognizer: This module is the core of the system; it transforms speech signals into digital data. The quality of speech recognition influences the performance of the whole spoken dialogue system.
2. Response generator: This module goes beyond a traditional speech synthesizer and must take task-domain information into account.
3. Dialogue manager: This module is the brain of a spoken dialogue system. It decides which dialogue strategy to adopt based on the current system status.
4. Natural language understanding: This module tries to understand the user's intention by employing natural language processing techniques.
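As a rough illustration of how the four modules might fit together in a single dialogue turn, the following Python sketch chains them in one common order: recognition, understanding, dialogue management, response generation. The ordering, the function name run_turn, and all the toy stand-in functions are assumptions made for illustration; they are not taken from any particular system.

```python
from typing import Callable, Dict


def run_turn(
    audio: bytes,
    recognize: Callable[[bytes], str],
    understand: Callable[[str], Dict[str, str]],
    decide: Callable[[Dict[str, str]], str],
    generate: Callable[[str], str],
) -> str:
    """Process one user turn by passing it through the four modules in order."""
    text = recognize(audio)       # speech recognizer: audio -> text
    intent = understand(text)     # natural language understanding: text -> intent
    action = decide(intent)       # dialogue manager: intent -> system action
    return generate(action)       # response generator: action -> system reply


# Toy stand-ins (hypothetical): a real system would use an ASR engine, a parser, etc.
def toy_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio is already text


def toy_understand(text: str) -> Dict[str, str]:
    return {"intent": "greet"} if "hello" in text.lower() else {"intent": "unknown"}


def toy_decide(intent: Dict[str, str]) -> str:
    return "say_hello" if intent["intent"] == "greet" else "ask_repeat"


def toy_generate(action: str) -> str:
    replies = {
        "say_hello": "Hello! How can I help?",
        "ask_repeat": "Sorry, could you repeat that?",
    }
    return replies[action]
```

Because every stage consumes the output of the previous one, a mistake made early in the chain (for example, by the recognizer) propagates to all later stages, which is one reason error handling matters so much.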
The occurrence of errors significantly influences the performance and success of a spoken dialogue system. Between 25% and 35% of utterances generated by the system are used to correct system mistakes [2]. Furthermore, miscommunication leads to a higher error rate in subsequent utterances. For example, if users detect that the system cannot understand their intention, they tend to switch to another form of language (for example, longer sentences, marked word order, or repeated information), which makes the recognition results even worse. In addition, when errors occur, users tend to correct them by hyperarticulating (characterized by longer duration and a raised voice). This leads to even worse recognition results, since the recognizers are trained on normal speech. What is worse, users may become bored by repeating their utterances again and again. If they find that the system still cannot understand their intention, they may choose another input method instead.
3 Sources of errors
To develop better error handling strategies, the first step is to investigate where the errors in spoken dialogue systems come from.
Miscommunication is usually divided into two categories: misunderstanding and non-understanding.
A misunderstanding means that the system constructs an incorrect interpretation of the user's turn; it typically arises when the system is overconfident in its hypotheses. In a non-understanding, the system fails entirely to construct an interpretation [3]; this usually happens because the system underestimates its hypotheses.
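The two categories can be illustrated with a single confidence threshold. The sketch below is a simplification under stated assumptions: the Hypothesis type, the classify_outcome function, the confidence range of 0 to 1, and the default threshold of 0.5 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One interpretation of the user's turn plus the system's confidence (hypothetical type)."""
    interpretation: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def classify_outcome(hyp: Hypothesis, intended: str, threshold: float = 0.5) -> str:
    """Label a turn as understood, misunderstanding, or non-understanding."""
    if hyp.confidence < threshold:
        # The system rejects its own hypothesis, so no interpretation is constructed.
        return "non-understanding"
    if hyp.interpretation != intended:
        # The system accepts a hypothesis that does not match what the user meant.
        return "misunderstanding"
    return "understood"
```

Under these assumptions, lowering the threshold makes the system trust its hypotheses more and so trades non-understandings for misunderstandings, while raising it does the opposite.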
Bohus et al. (2005) conducted an empirical study of errors in spoken dialogue systems. They analyzed error sources using the grounding model previously used in the Conversational Architectures project by Paek and Horvitz (2000). According to this model, communication between users and spoken dialogue systems is broken down into four levels: Conversation Level, Intention Level, Signal Level, and Channel Level. Errors are classified according to these levels:
1. Out-of-Application: The user's intention is outside the system's application. This type of error can be further divided into out-of-domain utterances and out-of-application-scope utterances.
2. Out-of-Grammar: The user's utterance is outside the system's semantic grammar.
3. ASR Error: The user's utterance is not recognized correctly due to acoustic or statistical language modeling mismatches.
4. End-pointer Error: The end-pointer does not correctly segment the incoming audio signal.
The classification above may not include all of the sources of errors, and the occurrence of errors may vary between specific task domains. Nevertheless, it shows that errors in a spoken dialogue system can happen anywhere. Misunderstandings are rare, but the cost of resolving them is much larger than that of non-understandings. It's difficult to recover from
...
...