Toward Constructing a Cognitive Model for the Process of Spoken Dialogue

Hiroya FUJISAKI and Sumio OHNO

Department of Applied Electronics, Science University of Tokyo
2641 Noda, 278 Japan
e-mail: fujisaki@te.noda.sut.ac.jp

Spoken language is clearly the primary means of human communication, and will no doubt be the major means of man-machine communication in the future. Compared with written language, however, spoken messages are less well-formed, with frequent omissions, repetitions, word-order changes, errors, etc. Yet these so-called ill-formed features do not seem to present any obstacle to human-to-human spoken dialogue. This leads us to believe that these features have their own reason for existence and are actually utilized by the participants of a spoken dialogue to facilitate smooth and efficient information exchange. From this point of view, the present study analyzes human spoken dialogues and then attempts to construct a model for the process of spoken dialogue, with the aim of applying it to human-machine dialogues.
The results of our preliminary analysis of human dialogues have been summarized in terms of
(1) Prerequisites to a spoken dialogue -- purposeful exchange of information
(2) Basic constraints of a spoken dialogue -- parsimony and real-time production
(3) Principles and methods for a smooth spoken dialogue -- cooperation and reliability.

Most of these findings, however, have been obtained by studying recorded spoken dialogues, which directly reflect the speaker's processes of selecting and emitting information. The listener's information processing, by contrast, is only implicitly reflected in the recorded dialogues. For a clear understanding of the human processes involved in a spoken dialogue, further investigation is necessary into the perceptual and cognitive capabilities of the listener.
The above-mentioned results have been further examined from the point of view of their possible utilization in designing a man-machine spoken dialogue system. Our investigations clearly indicate that, in order to conduct an intelligent and smooth dialogue, the machine must have appropriate models for the knowledge and intention of its human counterpart, both in understanding the spoken messages uttered by a human speaker and in generating the spoken messages to be given to a human listener.
A study has also been started on the criterion and the methodology for assessing the process of spoken dialogue. As a criterion for the assessment, we introduce an index of achievement of the intended goal(s) of a spoken dialogue, defined at any point during its course. As for the cost function, we may adopt the weighted sum of the time and the amount of information exchanged to attain a given level of achievement. Further work is under way to develop a more detailed theory and method of evaluating spoken dialogues.
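The cost function described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not specified in the study: the per-turn trace format, the units of time and information, the weight values, and the function names `dialogue_cost` and `cost_to_reach` are all hypothetical.

```python
def dialogue_cost(elapsed_seconds, information_units,
                  time_weight=1.0, info_weight=1.0):
    """Weighted sum of elapsed time and amount of information exchanged.

    The weights and units are hypothetical; the study does not fix them.
    """
    return time_weight * elapsed_seconds + info_weight * information_units


def cost_to_reach(trace, target_achievement,
                  time_weight=1.0, info_weight=1.0):
    """Cost at the first point where the achievement index reaches the target.

    `trace` is an assumed format: a list of tuples
    (elapsed_seconds, cumulative_information_units, achievement_index),
    one per dialogue turn, in chronological order.
    Returns None if the target level is never attained.
    """
    for elapsed, info, achievement in trace:
        if achievement >= target_achievement:
            return dialogue_cost(elapsed, info, time_weight, info_weight)
    return None


# Hypothetical three-turn dialogue with made-up measurements.
example_trace = [(2.0, 5, 0.2), (5.0, 12, 0.6), (9.0, 20, 1.0)]
half_goal_cost = cost_to_reach(example_trace, 0.5,
                               time_weight=1.0, info_weight=0.5)
```

Under this sketch, two dialogues that attain the same level of achievement could be compared by their costs, with the weights expressing how much time matters relative to the amount of information exchanged.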

Keywords: spoken dialogue, cognitive model, parsimony, real-time production, principle of cooperation, principle of reliability, models for the knowledge and intention of speaker/listener, assessment of spoken dialogue