Cognitive Model of Speech Dialogue:

Modeling of Speech Dialogue Processing with Nonverbal Information.

Yuichiro ANZAI and Michita IMAI

Department of Computer Science, Keio University

Kohoku-ku, Yokohama 223, Japan

e-mail: anzai@aa.cs.keio.ac.jp

A characteristic of speech dialogue in natural environments is that not only verbal but also nonverbal information greatly influences dialogue processing. Semantic processing in dialogue depends on the mental models and inferences of the humans taking part in the conversation. The mental models used in most speech dialogue research are limited mainly to verbal information, and almost no models of dialogue `fields' that integrate both nonverbal and verbal information have been proposed. In this research, we construct a model of dialogue fields for speech dialogue processing, and describe how such a model can be realized on a machine. Furthermore, we build models of semantic inference in dialogue on top of this mental model, and implement a speech dialogue system based on them.

Concretely, we implement an utterance generator called Linta-II on an autonomous mobile robot for flexible human-robot interaction developed in our laboratory. Using what we call the Attention Mechanism, Linta-II can generate situated utterances from external-world information obtained through the robot's sensors. The Attention Mechanism consists of entities that perform simple symbolic manipulation and pay involuntary and voluntary attention. Involuntary attention acquires an external-world event immediately when it occurs. Voluntary attention is activated by the robot's action control unit and performs symbolic manipulation in a top-down fashion. With these two mechanisms, Linta-II can generate situated utterances without declarative constraints.

We are currently implementing Linta-II on our autonomous mobile robot ASPIRE. ASPIRE has various kinds of sensors, such as ultrasonic and touch sensors, which are controlled by multiple processors connected to each other by a bus. Each of ASPIRE's processor boards runs PULSER, an operating system developed in our laboratory, and Linta-II is being implemented using what we call the `direct interrupt mechanism' offered by PULSER. To evaluate Linta-II, we have it generate utterances and measure how often declarative constraints are invoked. The present results suggest that integrating the proposed Attention Mechanism into the utterance generator tends to reduce the number of declarative constraints that must be used.
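As a rough illustration of the two attention modes described above, the following Python sketch models involuntary attention as an interrupt-style handler that registers an external-world event as soon as it occurs, and voluntary attention as a top-down query issued by the action controller. All class and function names here are hypothetical, not the actual Linta-II or PULSER interfaces.

```python
# Hypothetical sketch of the two attention modes; names are illustrative
# and do not correspond to the real Linta-II / PULSER implementation.

from dataclasses import dataclass
from typing import Callable


@dataclass
class SensorEvent:
    """A symbolic description of an external-world event."""
    sensor: str   # e.g. "ultrasonic-3", "touch-left"
    symbol: str   # e.g. "obstacle-ahead", "bumped"


class AttentionMechanism:
    def __init__(self) -> None:
        self.attended: list[str] = []    # symbols currently in focus
        self.world: dict[str, str] = {}  # latest symbol per sensor

    # --- involuntary attention: bottom-up, driven by sensor interrupts ---
    def on_sensor_interrupt(self, event: SensorEvent) -> None:
        """Acquire an external-world event immediately when it occurs."""
        self.world[event.sensor] = event.symbol
        self.attended.append(event.symbol)

    # --- voluntary attention: top-down, issued by the action controller ---
    def attend(self, predicate: Callable[[str], bool]) -> list[str]:
        """Select world symbols that match a goal-directed query."""
        found = [s for s in self.world.values() if predicate(s)]
        self.attended.extend(found)
        return found


def generate_utterance(attention: AttentionMechanism) -> str:
    """Turn the currently attended symbols into a situated utterance."""
    if not attention.attended:
        return "..."
    focus = attention.attended[-1]  # most recently attended symbol wins
    return f"I notice: {focus}."


# Example: an obstacle event grabs involuntary attention, and the
# generator produces an utterance grounded in the current situation.
attn = AttentionMechanism()
attn.on_sensor_interrupt(SensorEvent("ultrasonic-3", "obstacle-ahead"))
print(generate_utterance(attn))  # -> "I notice: obstacle-ahead."
```

In this sketch, the utterance is grounded directly in whatever the mechanism is currently attending to, so no separate table of declarative constraints has to be consulted for a bottom-up event; this mirrors, under the stated assumptions, the claim that the Attention Mechanism reduces the generator's reliance on declarative constraints.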

Keywords: attention, utterance generation, human-robot interaction