Incremental and Coherence Preserving Rhetorical Structure Analysis on Dialog Processing

Naoyoshi TAMURA

Department of Electrical and Computer Eng., Yokohama National University
Tokiwadai 156, Hodogaya-ku, Yokohama 240, JAPAN

{I. Introduction}
Last year we proposed an LR-parser-like method for analyzing the text structure of dialog, in which the analyzer extracts part of the structure of a dialog text. The analyzer determines its next action from the partial structure built so far and the input sentence: (1) a shift action, which pushes the input sentence onto a stack, or (2) a reduce action, which constructs a new substructure bottom-up from the topmost element of the stack and the input sentence.
The effectiveness of this method was demonstrated by an observation: the rhetorical structure and some other information were extracted from written texts, and the extracted structure corresponded considerably well to the original sectioning of the texts.
However, this analyzer produces a binary tree as the structure of a dialog. Such a tree is not expressive enough to represent a structure in which the speakers alternate in turn. In this report, we propose the Dual Finite State Automata (DualFSA) as a framework for processing dialog incrementally. A DualFSA consists of two FSAs, each of which drives the other: the state transitions of one FSA are regarded as the input sequence of the other.
In this report, we first classify dialog sentences into several classes. We then describe each class as a scheme in the DualFSA. Each scheme describes a dialog situation and the exchange of initiative. Our analyzer either changes its state in the DualFSA according to the current scheme or selects a new scheme. The history of state transitions describes a kind of context structure of the dialog. The possible transitions are used to generate the system's utterances and to understand the speaker's utterances.

{II. Results}

{II.(1) Classification of Sentence}
We classified the sentences of 16 examples in the database (ONKYOUGAKKAI) into 4 major classes: (1) Information Request, (2) Dialog Prompt, (3) Will Transmission and (4) Statement/Assertion. Each major class is further divided into subclasses: (1-1) Yes/No Question, (1-2) Wh Question, (1-3) Reconfirmation, (2-1) Prompting, (3-1) Assertion, (3-2) Proposal, (3-3) Request, (3-4) Agreement, (4-1) Yes/No Answer, (4-2) Wh Answer, (4-3) Statement, (4-4) Assertion, (4-5) Impression and (4-6) Conclusion. We will describe them in the following sections.
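As a rough illustration, the classification above could be held in a small lookup table. The sketch below is only one possible encoding; the names `MajorClass`, `SUBCLASSES` and `majors_of` are hypothetical and do not come from the report. Because "Assertion" is listed under two major classes, the lookup returns a list rather than a single class.

```python
from enum import Enum


class MajorClass(Enum):
    """The four major sentence classes named in the text."""
    INFORMATION_REQUEST = 1
    DIALOG_PROMPT = 2
    WILL_TRANSMISSION = 3
    STATEMENT_ASSERTION = 4


# Subclasses grouped under their major class, in the order listed in the text.
SUBCLASSES = {
    MajorClass.INFORMATION_REQUEST: ["Yes/No Question", "Wh Question", "Reconfirmation"],
    MajorClass.DIALOG_PROMPT: ["Prompting"],
    MajorClass.WILL_TRANSMISSION: ["Assertion", "Proposal", "Request", "Agreement"],
    MajorClass.STATEMENT_ASSERTION: [
        "Yes/No Answer", "Wh Answer", "Statement",
        "Assertion", "Impression", "Conclusion",
    ],
}


def majors_of(subclass):
    """Return every major class that lists the given subclass name."""
    return [major for major, subs in SUBCLASSES.items() if subclass in subs]
```

For example, `majors_of("Prompting")` yields only the Dialog Prompt class, while `majors_of("Assertion")` yields two classes, so a real classifier would need more than the label to disambiguate.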

{II.(2) DualFSA}
The DualFSA = $<A_1, A_2>$ consists of the following two automata:
$A_1 = < Q, P, F_1, s_1, T_1>$
$A_2 = < P, Q, F_2, s_2, T_2>$
where the elements of each automaton are the set of states, the set of inputs, the transition function, the initial state and the set of final states, respectively.
We aim to apply our method to systems such as a sightseeing consultation system. In this case, P and Q correspond to the states of human and computer utterances, respectively.
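The coupling of the two automata can be sketched in a few lines of Python. This is a minimal sketch under several assumptions that are not in the report: the transition functions are deterministic dictionaries (the report describes the DualFSA nondeterministically), the input consumed by one automaton is the current state of the other, the two automata move alternately with the human side first, and the state labels `"W"`, `"YNQ"` and `"YNA"` as well as the names `FSA` and `run_dual` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FSA:
    """One automaton <states, inputs, delta, initial state, final states>."""
    states: frozenset
    inputs: frozenset
    delta: dict        # (state, input) -> next state (deterministic here by assumption)
    state: str         # current state; starts as the initial state
    finals: frozenset

    def step(self, symbol):
        """Consume one input symbol and move to the next state."""
        self.state = self.delta[(self.state, symbol)]
        return self.state


def run_dual(a1, a2, moves):
    """Let a2 and a1 move alternately, each reading the other's current state.

    Returns the list of (a1 state, a2 state) pairs after each single move.
    """
    trace = [(a1.state, a2.state)]
    for i in range(moves):
        if i % 2 == 0:
            a2.step(a1.state)   # human side reacts to the computer's state
        else:
            a1.step(a2.state)   # computer side reacts to the human's state
        trace.append((a1.state, a2.state))
    return trace


# Toy instance: P = human states, Q = computer states.
P = frozenset({"W", "YNQ"})
Q = frozenset({"W", "YNA"})
a1 = FSA(Q, P, {("W", "YNQ"): "YNA", ("YNA", "W"): "W"}, "W", frozenset({"W"}))
a2 = FSA(P, Q, {("W", "W"): "YNQ", ("YNQ", "YNA"): "W"}, "W", frozenset({"W"}))
trace = run_dual(a1, a2, 4)
```

In this toy run the human asks a yes/no question, the computer answers, and both sides return to the wait state, so `trace` records one complete question and answer cycle.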

{II.(3) Macroscopic Expansion of Dialog}
The macroscopic expansion of a dialog can also be represented in the DualFSA as follows: when the system is in the wait state (W), it is woken up by the user's presentation of an object, and each state transition of one FSA triggers a state transition of the other FSA.

{II.(4) Microscopic Expression}
To describe actual utterance classes, we extend the DualFSA with two additional states: the "don't care state", which matches any other state, and the "wait state", which is the state with no utterance.
In this report, the DualFSA is described nondeterministically; in the actual system, however, semantic processing such as planning for mutually cooperative dialog helps the automaton to make its state transitions deterministically.
We show several utterance types in the DualFSA. Each description is selected as a scheme by the planner, which uses it to understand the speaker's utterance and to generate the system's response.
A plan used in our method looks as follows:

Type: Yes/No Question
Object: to obtain from the hearer H the information whether the proposition P is true or not, or whether H knows P.

Precondition: (1) The speaker S doesn't know whether P is true or not.
              (2) S wants to know whether P is true or not.
              (3) S thinks that, unless he asks, he cannot find out from H whether P is true or not.
              (4) S believes H knows whether P is true or not.

Focus: truth or falsity of the proposition P.
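The plan above could be stored as a simple record whose preconditions are checked against what the speaker believes. The following is a sketch only: the `Plan` class, the `applicable` method and the belief keys such as `"S_knows_P"` are hypothetical names introduced for illustration, and a flat dictionary of booleans is an assumed stand-in for whatever belief representation the actual planner uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Beliefs = Dict[str, bool]  # hypothetical belief store of the speaker S


@dataclass(frozen=True)
class Plan:
    plan_type: str                                   # the "Type" slot
    objective: str                                   # the "Object" slot
    preconditions: Tuple[Callable[[Beliefs], bool], ...]
    focus: str                                       # the "Focus" slot

    def applicable(self, beliefs: Beliefs) -> bool:
        """A plan is applicable only if every precondition holds."""
        return all(check(beliefs) for check in self.preconditions)


YES_NO_QUESTION = Plan(
    plan_type="Yes/No Question",
    objective="learn from H whether P is true, or whether H knows P",
    preconditions=(
        lambda b: not b["S_knows_P"],                     # (1) S does not know P
        lambda b: b["S_wants_to_know_P"],                 # (2) S wants to know P
        lambda b: not b["S_can_learn_P_without_asking"],  # (3) asking is necessary
        lambda b: b["S_believes_H_knows_P"],              # (4) S believes H knows P
    ),
    focus="truth value of P",
)

beliefs = {
    "S_knows_P": False,
    "S_wants_to_know_P": True,
    "S_can_learn_P_without_asking": False,
    "S_believes_H_knows_P": True,
}
```

With these beliefs `YES_NO_QUESTION.applicable(beliefs)` holds; if S already knew P, the first precondition would fail and the plan would not be selected.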

We will show DualFSA schemes for some utterance types:

(1) Yes/No Question Scheme: In this scheme, the speaker's Yes/No question causes the hearer's state to change from the Wait state to the Yes/No Answer state. In principle, the hearer's state returns to the Wait state after the Yes/No answer.

(2) Wh Question Scheme: This scheme is basically used like the Yes/No Question Scheme; however, it also permits the hearer's state to change to the Insist state in response to the speaker's wh question. In this case, an exchange of initiative occurs: the speaker moves into the Wait state, while the hearer moves into the Insist state and continues its transitions.

(3) Prompting Scheme: This scheme is used by the hearer to prompt the speaker to continue in the Assertion, Insist or Conclusion state. In principle, the speaker's state doesn't change. The hearer only prompts the speaker's utterance; the hearer's previous and next states are both the Wait state. The speaker, of course, has the initiative.

(4) Insisting Scheme: In this scheme, the speaker's state changes to the Wait state after his assertion utterances, and the initiative is passed to the hearer. The hearer is in the Wait state at this point and then moves into the Assertion, Insisting or Agreement state as a reply to the speaker. The speaker then gets the initiative again and continues his utterances.
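The first two schemes can be sketched as nondeterministic transition rules for the hearer. This is an illustrative encoding, not the notation of the report: the rule format, the state labels (`"YNQ"`, `"YNA"`, `"WhQ"`, `"WhA"`, `"Insist"`), the use of the don't care state on the return to Wait, and the function names are all assumptions.

```python
WAIT = "W"   # wait state: the state with no utterance
ANY = "*"    # don't care state: matches any other state


def matches(pattern, state):
    """A pattern matches a state if it is the don't care state or equal to it."""
    return pattern == ANY or pattern == state


# scheme -> rules of the form
# (hearer state, speaker state pattern, set of possible next hearer states)
SCHEMES = {
    "Yes/No Question": [
        (WAIT, "YNQ", {"YNA"}),   # question moves hearer from Wait to Yes/No Answer
        ("YNA", ANY, {WAIT}),     # after answering, hearer returns to Wait
    ],
    "Wh Question": [
        (WAIT, "WhQ", {"WhA", "Insist"}),  # hearer may answer or start insisting
        ("WhA", ANY, {WAIT}),
    ],
}


def next_hearer_states(scheme, hearer_state, speaker_state):
    """Collect every hearer state reachable under the given scheme."""
    result = set()
    for h_state, s_pattern, targets in SCHEMES[scheme]:
        if h_state == hearer_state and matches(s_pattern, speaker_state):
            result |= targets
    return result


def initiative_exchanged(scheme, new_hearer_state):
    """In the Wh Question scheme, moving to Insist hands the initiative to the hearer."""
    return scheme == "Wh Question" and new_hearer_state == "Insist"
```

A wh question received in the Wait state leaves two candidate states, which is the nondeterminism that the planner is expected to resolve; choosing `"Insist"` corresponds to the exchange of initiative described in scheme (2).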

{III. Concluding Remarks}
In this report, we described the state transitions of the communication between user and computer in a sightseeing consultation system within the framework of the DualFSA. The mutual relation of the state transitions is clearly described as automata, in which the exchange of initiative is also represented through the introduction of the wait state.
However, it is difficult to eliminate the nondeterminism of actions. The system will need to be driven by some knowledge, such as plans.
Although the description is still insufficient for a real system, we are going to discuss and implement the system from the viewpoint that even a question in reply to the user's utterance is allowed in a real-time system.

Keywords: dialog modeling, scheme, dual finite state automata, type of sentence