Model of Dialog
-- Prediction of the Next Topic Based on Utterance Motivation --
Riichiro MIZOGUCHI and Yoichi YAMASHITA
The Institute of Scientific and Industrial Research, Osaka University
8-1, Mihogaoka, Ibaraki, Osaka, 567 JAPAN
e-mail: miz@ei.sanken.osaka-u.ac.jp
Predicting the user's next utterance is an important technique for
understanding spoken dialog with a medium or large vocabulary.
Various kinds of dialog knowledge, such as utterance pairs and topics,
are indispensable for utterance prediction. Last year we proposed a
basic mechanism for utterance prediction. It was based on a topic
transition model named TPN (topic packet network). Topic information
proved very useful for predicting utterances. A TPN is a static model
of topic transitions represented as a kind of network. In a TPN, topic
packets (TPs) bundle several topics and are linked to each other a
priori. However, the topic of an utterance changes dynamically as the
dialog proceeds. It is not easy to describe all possible topic
transitions in a static network, especially for dialog tasks with a
large vocabulary. A dynamic mechanism is more appropriate for modeling
topic transitions. This year we discuss a new mechanism for
enumerating the possible topics of the next utterance by introducing
a model of utterance motivation.
In a goal-oriented dialog, the participants talk in order to exchange
focused information. Goals are achieved step by step through repeated
interactions. A speaker usually has a definite motivation for making an
utterance in such a dialog. This motivation is closely connected with
the meaning of the utterance. A model of utterance motivation is
therefore expected to predict topics in a dialog flexibly.
We analyzed several dialogs on two tasks, route direction and trip
reservation, in order to investigate the motivations of utterances.
Since a stimulus and a response in an utterance pair were assumed to
have the same topic, only the stimuli were analyzed. For example,
'to make a comparison because an attribute has multiple values' is a
motivation. Motivations can be divided into two levels: communication
and problem solving. The motivation of communication is triggered by
the state of information exchange. The state 'an attribute has
multiple values' is an example of a motivation at this level. The
motivation of problem solving is related to how the obtained
information is used in the process of problem solving. 'To make a
comparison' is an example of a motivation of problem solving.
The motivations of communication are classified into the following
8 categories.
    a. A value is unknown
    b. A value is ambiguous
    c. An attribute has multiple values
    d. An attribute or an action is unknown
    e. Attributes or actions are exhausted
    f. An object or an action sequence is ambiguous
    g. To confirm something uncertain
    h. An object or an action sequence is unknown
Knowledge about the dialog domain is necessary for the dialog manager to
keep track of the information transferred in a dialog. The dialog manager
predicts the motivations of communication based on the state of
transmission of each piece of information. We introduced two kinds of
information packets to organize the domain knowledge. An 'action
sequence' represents a sequence of actions that are executed in order to
achieve a goal. An 'object' describes a collection of information other
than action sequences.
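How a motivation of communication could be read off the transmission
state of one slot of an information packet can be sketched as follows.
This is a minimal illustration, not the implemented dialog manager: the
`Slot` class, its `ambiguous` flag, and the decision order are
assumptions introduced here, and only categories a, b, and c are covered.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CommMotivation(Enum):
    """Three of the eight communication-level categories (a, b, c)."""
    VALUE_UNKNOWN = "a"
    VALUE_AMBIGUOUS = "b"
    MULTIPLE_VALUES = "c"


@dataclass
class Slot:
    """One attribute of an information packet (hypothetical structure)."""
    name: str
    values: List[str] = field(default_factory=list)
    # Assumed flag: set when the single known value is underspecified.
    ambiguous: bool = False


def communication_motivation(slot: Slot) -> Optional[CommMotivation]:
    """Map the transmission state of a slot to a communication motivation."""
    if not slot.values:
        return CommMotivation.VALUE_UNKNOWN
    if len(slot.values) > 1:
        return CommMotivation.MULTIPLE_VALUES
    if slot.ambiguous:
        return CommMotivation.VALUE_AMBIGUOUS
    # The value is known and unique: no motivation is triggered by this slot.
    return None
```

A slot that already holds exactly one unambiguous value yields `None`,
which reflects the assumption that fully transmitted information does not
trigger a new utterance at the communication level.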
The motivations of problem solving are classified into the following
10 categories.
    A. To compare
    B. To select
    C. To sort
    D. To know/inform the reason
    E. To know/inform the condition
    F. To know/inform the related information
    G. To correct
    H. To satisfy the constraints
    I. To know/inform the goal
    J. To know/inform the completion
A motivation at this level is invoked by the user's purpose in the
problem solving process. User modeling is therefore a very important
technique for predicting the motivation of problem solving; this is
left as future work.
The motivation of an utterance is modeled as a combination of the two
levels of motivation mentioned above. We investigated the frequency of
the combined motivations using 6 simulated dialogs. The dialogs form two
sets, one for route direction and one for trip reservation. Not all
combinations of motivations occurred, and the frequency distributions
of the two tasks differed only slightly.
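Counting the combined motivations amounts to tallying pairs of category
codes. The sketch below assumes each analyzed stimulus has been annotated
with one lowercase communication code (a to h) and one uppercase problem
solving code (A to J). The function name is hypothetical and the sample
annotations are made up for illustration; they are not the data from the
6 simulated dialogs.

```python
from collections import Counter
from typing import Iterable, Tuple


def combination_frequencies(annotations: Iterable[Tuple[str, str]]) -> Counter:
    """Count how often each combined motivation, e.g. 'cA', occurs."""
    return Counter(comm + solve for comm, solve in annotations)


# Made-up annotations, purely to show the input format.
sample = [("c", "A"), ("a", "F"), ("c", "A"), ("g", "G")]
freq = combination_frequencies(sample)

# Combinations that never occur simply have a count of zero.
unseen = freq["hJ"]
```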
The mechanism for predicting the topics of the next utterance is
described separately for each utterance motivation. For example, assume
that a slot of an information packet contains multiple values and that
'to compare' is predicted as the motivation of problem solving. The
motivation 'c' is predicted from the state of communication. Thus, the
combination of 'c' and 'A' is identified as the motivation of the next
utterance. The topic prediction pattern for the motivation 'cA' is to
enumerate the topics in the information packet subordinate to the slot
that contains multiple values.
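One possible reading of the 'cA' pattern is sketched below. It assumes
that each value of a multi-valued slot may itself be an information
packet, and that the slot names of those subordinate packets are the
candidate topics. The `Packet` class, the function name, and the example
values are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Packet:
    """An information packet: slot name -> list of values (str or Packet)."""
    name: str
    slots: Dict[str, list] = field(default_factory=dict)


def predict_topics_cA(packet: Packet) -> List[str]:
    """Enumerate topics under every slot that holds multiple values."""
    topics: List[str] = []
    for values in packet.slots.values():
        if len(values) <= 1:
            continue  # motivation 'c' applies only to multi-valued slots
        for value in values:
            if isinstance(value, Packet):
                for topic in value.slots:
                    if topic not in topics:  # keep first-seen order, no repeats
                        topics.append(topic)
    return topics


# Invented example: two candidate routes to be compared.
bus = Packet("bus", {"fare": ["200 yen"], "time": ["30 min"]})
train = Packet("train", {"fare": ["350 yen"], "time": ["15 min"],
                         "transfer": ["once"]})
trip = Packet("trip", {"destination": ["Osaka"], "route": [bus, train]})

predicted = predict_topics_cA(trip)  # ['fare', 'time', 'transfer']
```

The single-valued 'destination' slot contributes nothing, so the
prediction is limited to the attributes along which the competing values
can be compared.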
The earlier mechanism based on the TPN model adapted poorly to dynamic
changes in a dialog. A model of utterance motivation enables utterance
prediction that adapts to the situation in a dialog. Evaluation of the
proposed mechanism remains an issue to be addressed.
Keywords: utterance prediction, utterance motivation, topic transition model, spoken dialog recognition