• K.Kato, K.Inoue, D.Lala, K.Ochi, and T.Kawahara.
    Real-time generation of various types of nodding for avatar attentive listening system.
    In Proc. ICMI, p. (accepted for presentation), 2025.

  • T.Tsuda, S.Yamashita, K.Inoue, T.Kawahara, and R.Higashinaka.
    Constructing a multi-party conversational corpus focusing on interlocutor relationships.
    In Proc. SemDial, p. (accepted for presentation), 2025.

  • K.Inoue, M.Elmers, Y.Fu, Z.H.Pang, D.Lala, K.Ochi, and T.Kawahara.
    Prompt-guided turn-taking prediction.
    In Proc. SIGdial Meeting Discourse & Dialogue, p. (accepted for presentation), 2025.

  • K.Inoue, Y.Okafuji, J.Baba, Y.Ohira, K.Hyodo, and T.Kawahara.
    A noise-robust turn-taking system for real-world dialogue robots: A field experiment.
    In Proc. IROS, p. (accepted for presentation), 2025.

  • R.Magoshi, S.Sakai, J.Lee, and T.Kawahara.
    Multi-lingual and zero-shot speech recognition by incorporating classification of language-independent articulatory features.
    In Proc. INTERSPEECH, p. (accepted for presentation), 2025.

  • M.Elmers, K.Inoue, D.Lala, and T.Kawahara.
    Triadic multi-party voice activity projection for turn-taking in spoken dialogue systems.
    In Proc. INTERSPEECH, p. (accepted for presentation), 2025.

  • W.Zhou, T.Du, C.Xu, S.Li, Y.Zhao, and T.Kawahara.
    Simple and effective content encoder for singing voice conversion via dimension reduction.
    In Proc. INTERSPEECH, p. (accepted for presentation), 2025.

  • M.Mimura, J.Lee, and T.Kawahara.
    Switch Conformer with universal phonetic experts for multilingual ASR.
    In Proc. INTERSPEECH, p. (accepted for presentation), 2025.

  • W.Zhou, T.Du, W.Guan, M.Xiao, C.Xu, Y.Zhao, and T.Kawahara.
    InvoxSVC: Zero-shot singing voice conversion with in-context learning in latent diffusion.
    In Proc. IEEE Int'l Conf. Multimedia and Expo (ICME), p. (accepted for presentation), 2025.

  • K.Inoue, D.Lala, M.Elmers, K.Ochi, and T.Kawahara.
    An LLM benchmark for addressee recognition in multi-modal multi-party dialogue.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), pp.330--334, 2025.

  • K.Inoue, D.Lala, M.Elmers, and T.Kawahara.
    Why do we laugh? Annotation and taxonomy generation for laughable contexts in spontaneous text conversation.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), pp.318--323, 2025.

  • D.Lala, M.Elmers, K.Inoue, Z.H.Pang, K.Ochi, and T.Kawahara.
    ScriptBoard: Designing modern spoken dialogue systems through visual programming.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, pp.176--182, 2025.

  • K.Inoue, D.Lala, T.Kawahara, and G.Skantze.
    Yeah, un, oh: Continuous and real-time backchannel prediction with fine-tuning of voice activity projection.
    In Proc. NAACL, pp.7171--7181, 2025.

  • Z.H.Pang, Y.Fu, D.Lala, M.Elmers, K.Inoue, and T.Kawahara.
    Does the appearance of autonomous conversational robots affect user spoken behaviors in real-world conference interactions?
    In Proc. CHI, Vol. Late Breaking Work, 2025.

  • J.Lee, M.Mimura, and T.Kawahara.
    Leveraging IPA and articulatory features as effective inductive biases for multilingual ASR training.
    In Proc. IEEE-ICASSP, 2025.

  • Z.H.Pang, Y.Fu, D.Lala, M.Elmers, K.Inoue, and T.Kawahara.
    Human-like embodied AI interviewer: Employing android ERICA in real international conference.
    In Proc. COLING, Vol. System Demonstrations, pp.136--150, 2025.

  • J.Chen, C.Chu, S.Li, and T.Kawahara.
    Data selection using spoken language identification for low-resource and zero-resource speech recognition.
    In Proc. APSIPA ASC, 2024.

  • D.Lala, K.Inoue, and T.Kawahara.
    Prediction of negative user reactions towards system responses during attentive listening.
    In Proc. APSIPA ASC, 2024.

  • D.Lala, K.Inoue, H.Kawai, Z.H.Pang, M.Elmers, and T.Kawahara.
    Development and evaluation of a semi-autonomous parallel attentive listening system.
    In Proc. APSIPA ASC, 2024.

  • H.Shi, Y.Gao, Z.Ni, and T.Kawahara.
    Serialized speech information guidance with overlapped encoding separation for multi-speaker automatic speech recognition.
    In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.198--204, 2024.

  • M.Elmers, K.Inoue, D.Lala, K.Ochi, and T.Kawahara.
    Analysis and detection of differences in spoken user behaviors between autonomous and wizard-of-oz systems.
    In Proc. Oriental-COCOSDA Workshop, 2024.

  • Y.Fu, C.Chu, and T.Kawahara.
    StyEmp: Stylizing empathetic response generation via multi-grained prefix encoder and personality reinforcement.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.172--185, 2024.

  • T.Honda, S.Sakai, and T.Kawahara.
    Efficient and robust long-form speech recognition with hybrid H3-Conformer.
    In Proc. INTERSPEECH, pp.2985--2989, 2024.

  • H.Shi and T.Kawahara.
    Dual-path adaptation of pretrained feature extraction module for robust automatic speech recognition.
    In Proc. INTERSPEECH, pp.2850--2854, 2024.

  • Y.Gao, H.Shi, C.Chu, and T.Kawahara.
    Speech emotion recognition with multi-level acoustic and semantic information extraction and interaction.
    In Proc. INTERSPEECH, pp.1060--1064, 2024.

  • K.Ochi, K.Inoue, D.Lala, and T.Kawahara.
    Entrainment analysis and prosody prediction of subsequent interlocutor's backchannels in dialogue.
    In Proc. INTERSPEECH, pp.462--466, 2024.

  • K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
    Multilingual turn-taking prediction using voice activity projection.
    In Proc. COLING, pp.11873--11883, 2024.

  • M.Masuyama, T.Kawahara, and K.Matsuda.
    Video retrieval system using automatic speech recognition for the Japanese Diet.
    In Proc. ParlaCLARIN IV Workshop, pp.145--148, 2024.

  • T.Kawahara.
    Quantitative analysis of editing in transcription process in Japanese and European Parliaments and its diachronic changes.
    In Proc. ParlaCLARIN IV Workshop, pp.66--69, 2024.

  • H.Shi, K.Shimada, M.Hirano, T.Shibuya, Y.Koyama, Z.Zhong, S.Takahashi, T.Kawahara, and Y.Mitsufuji.
    Diffusion-based speech enhancement with joint generative and predictive decoders.
    In Proc. IEEE-ICASSP, pp.12951--12955, 2024.

  • Y.Gao, H.Shi, C.Chu, and T.Kawahara.
    Enhancing two-stage finetuning for speech emotion recognition using adapters.
    In Proc. IEEE-ICASSP, pp.11316--11320, 2024.

  • W.Zhou, Z.Yang, C.Chu, S.Li, R.Dabre, Y.Zhao, and T.Kawahara.
    MOS-FAD: Improving fake audio detection via automatic mean opinion score prediction.
    In Proc. IEEE-ICASSP, pp.876--880, 2024.

  • K.Shimada, K.Uchida, Y.Koyama, T.Shibuya, S.Takahashi, Y.Mitsufuji, and T.Kawahara.
    Zero- and few-shot sound event localization and detection.
    In Proc. IEEE-ICASSP, pp.636--640, 2024.

  • K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
    An analysis of user behaviours for objectively evaluating spoken dialogue systems.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.

  • H.Kawai, D.Lala, K.Inoue, K.Ochi, and T.Kawahara.
    Evaluation of a semi-autonomous attentive listening system with takeover prompting.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.

  • K.Yamamoto, S.Kawano, T.Kawahara, and K.Yoshino.
    Data augmentation for robust natural language generation based on phrase alignment and sentence structure.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.

  • Y.Fu, H.Song, T.Zhao, and T.Kawahara.
    Enhancing personality recognition in dialogue by data augmentation and heterogeneous conversational graph networks.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.

  • Z.H.Pang, Y.Fu, D.Lala, K.Ochi, K.Inoue, and T.Kawahara.
    Acknowledgment of emotional states: Generating validating responses for empathetic dialogue.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2024.

  • K.Inoue, B.Jiang, E.Ekstedt, T.Kawahara, and G.Skantze.
    Real-time and continuous turn-taking prediction using voice activity projection.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, 2024.

  • S.Yamashita, K.Inoue, A.Guo, S.Mochizuki, T.Kawahara, and R.Higashinaka.
    RealPersonaChat: A realistic persona chat corpus with interlocutors' own personalities.
    In Proc. PACLIC, 2023.

  • K.Inoue, D.Lala, K.Ochi, T.Kawahara, and G.Skantze.
    Towards objective evaluation of socially-situated conversational robots: Assessing human-likeness through multimodal user behaviors.
    In Proc. ICMI (Companion; Late Breaking Results), pp.86--90, 2023.

  • Y.Fu, K.Inoue, C.Chu, and T.Kawahara.
    Reasoning before responding: Integrating commonsense-based causality explanation for empathetic response generation.
    In Proc. SIGdial Meeting Discourse & Dialogue, 2023.

  • S.Kobuki, K.Seaborn, S.Tokunaga, K.Fukumori, S.Hidaka, K.Tamura, K.Inoue, T.Kawahara, and M.Otake-Matsuura.
    Robotic backchanneling in online conversation facilitation: A cross-generational study.
    In Proc. RO-MAN, 2023.

  • Y.Gao, C.Chu, and T.Kawahara.
    Two-stage finetuning of wav2vec 2.0 for speech emotion recognition with ASR and gender pretraining.
    In Proc. INTERSPEECH, pp.3635--3639, 2023.

  • J.Lee, M.Mimura, and T.Kawahara.
    Embedding articulatory constraints for low-resource speech recognition based on large pre-trained model.
    In Proc. INTERSPEECH, pp.1392--1396, 2023.

  • H.Shi, M.Mimura, L.Wang, J.Dang, and T.Kawahara.
    Time-domain speech enhancement assisted by multi-resolution frequency encoder and decoder.
    In Proc. IEEE-ICASSP, 2023.

  • K.Soky, S.Li, C.Chu, and T.Kawahara.
    Domain and language adaptation using heterogeneous datasets for wav2vec2.0-based speech recognition of low-resource language.
    In Proc. IEEE-ICASSP, 2023.

  • K.Yamamoto, K.Inoue, and T.Kawahara.
    Character adaptation of spoken dialogue systems based on user personalities.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.

  • Y.Fu, K.Inoue, D.Lala, K.Yamamoto, C.Chu, and T.Kawahara.
    Improving empathetic response generation with retrieval based on emotion recognition.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.

  • Y.Muraki, H.Kawai, K.Yamamoto, K.Inoue, D.Lala, and T.Kawahara.
    Semi-autonomous guide agents with simultaneous handling of multiple users.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2023.

  • D.Lala, K.Inoue, T.Kawahara, and K.Sawada.
    Backchannel generation model for a third party listening agent.
    In Proc. Human-Agent Interaction (HAI), pp.114--122, 2022.

  • H.Shi, Y.Shu, L.Wang, J.Dang, and T.Kawahara.
    Fusing multiple bandwidth spectrograms for improving speech enhancement.
    In Proc. APSIPA ASC, pp.1935--1940, 2022.

  • H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
    Subband-based spectrogram fusion for speech enhancement by combining mapping and masking approaches.
    In Proc. APSIPA ASC, pp.286--292, 2022.

  • H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
    Non-autoregressive error correction for CTC-based ASR with phone-conditioned masked LM.
    In Proc. INTERSPEECH, pp.3889--3893, 2022.

  • S.Kawano, M.Arioka, A.Yuguchi, K.Yamamoto, K.Inoue, T.Kawahara, S.Nakamura, and K.Yoshino.
    Multimodal persuasive dialogue corpus using teleoperated android.
    In Proc. INTERSPEECH, pp.2308--2312, 2022.

  • J.Nozaki, T.Kawahara, K.Ishizuka, and T.Hashimoto.
    End-to-end speech-to-punctuated-text recognition.
    In Proc. INTERSPEECH, pp.1811--1815, 2022.

  • K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
    Leveraging simultaneous translation for enhancing transcription of low-resource language via cross attention mechanism.
    In Proc. INTERSPEECH, pp.1362--1366, 2022.

  • H.Shi, L.Wang, S.Li, J.Dang, and T.Kawahara.
    Monaural speech enhancement based on spectrogram decomposition for convolutional neural network-sensitive feature extraction.
    In Proc. INTERSPEECH, pp.221--225, 2022.

  • H.Kawai, Y.Muraki, K.Yamamoto, D.Lala, K.Inoue, and T.Kawahara.
    Simultaneous job interview system using multiple semi-autonomous agents.
    In Proc. SIGdial Meeting Discourse & Dialogue, Vol. Demo. Paper, pp.107--110, 2022.

  • S.Ueno and T.Kawahara.
    Phone-informed refinement of synthesized mel spectrogram for data augmentation in speech recognition.
    In Proc. IEEE-ICASSP, pp.8572--8576, 2022.

  • H.Zhang, M.Mimura, T.Kawahara, and K.Ishizuka.
    Selective multi-task learning for speech emotion recognition using corpora of different styles.
    In Proc. IEEE-ICASSP, pp.7707--7711, 2022.

  • M.Mimura, S.Sakai, and T.Kawahara.
    An end-to-end model from speech to clean transcript for parliamentary meetings.
    In Proc. APSIPA ASC, pp.465--470, 2021.

  • H.Shi, L.Wang, S.Li, C.Fan, J.Dang, and T.Kawahara.
    Spectrograms fusion-based end-to-end robust automatic speech recognition.
    In Proc. APSIPA ASC, pp.438--442, 2021.

  • K.Soky, S.Li, M.Mimura, C.Chu, and T.Kawahara.
    On the use of speaker information for automatic speech recognition in speaker-imbalanced corpora.
    In Proc. APSIPA ASC, pp.433--437, 2021.

  • H.Futami, H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
    ASR rescoring and confidence estimation with ELECTRA.
    In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.380--387, 2021.

  • S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
    Data augmentation for ASR using TTS via a discrete representation.
    In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.68--75, 2021.

  • K.Soky, M.Mimura, T.Kawahara, S.Li, C.Ding, C.Chu, and S.Sam.
    Khmer speech translation corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC).
    In Proc. Oriental-COCOSDA Workshop, pp.122--127, 2021.

  • H.Inaguma, M.Mimura, and T.Kawahara.
    VAD-free streaming hybrid CTC/Attention ASR for unsegmented recording.
    In Proc. INTERSPEECH, pp.4049--4053, 2021.

  • H.Inaguma, M.Mimura, and T.Kawahara.
    StableEmit: Selection probability discount for reducing emission latency of streaming monotonic attention ASR.
    In Proc. INTERSPEECH, pp.1817--1821, 2021.

  • K.Inoue, H.Sakamoto, K.Yamamoto, D.Lala, and T.Kawahara.
    A multi-party attentive listening robot which stimulates involvement from side participants.
    In Proc. SIGdial Meeting Discourse & Dialogue, Vol. Demo. Paper, pp.261--264, 2021.

  • E.Ishii, G.I.Winata, S.Cahyawijaya, D.Lala, T.Kawahara, and P.Fung.
    ERICA: An empathetic android companion for Covid-19 quarantine.
    In Proc. SIGdial Meeting Discourse & Dialogue, Vol. Demo. Paper, pp.257--260, 2021.

  • T.Zhao and T.Kawahara.
    Multi-referenced training for dialogue response generation.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.190--201, 2021.

  • H.Inaguma, T.Kawahara, and S.Watanabe.
    Source and target bidirectional knowledge distillation for end-to-end speech translation.
    In Proc. NAACL-HLT, pp.1872--1881, 2021.

  • H.Inaguma, Y.Higuchi, K.Duh, T.Kawahara, and S.Watanabe.
    Non-autoregressive end-to-end speech translation with dual-decoder.
    In Proc. IEEE-ICASSP, pp.7488--7492, 2021.

  • D.Lala, K.Inoue, K.Yamamoto, and T.Kawahara.
    Findings from human-android dialogue research with ERICA.
    In Proc. IJCAI-2020 workshop on ROBOT-DIAL, 2020.

  • S.Zhang, T.Zhao, and T.Kawahara.
    Topic-relevant response generation using optimal transport for an open-domain dialog system.
    In Proc. COLING, pp.4067--4077, 2020.

  • J.Woo, M.Mimura, K.Yoshii, and T.Kawahara.
    End-to-end music-mixed speech recognition.
    In Proc. APSIPA ASC, pp.800--804, 2020.

  • M.Togami, Y.Masuyama, T.Komatsu, K.Yoshii, and T.Kawahara.
    Integration of semi-blind speech source separation and voice activity detection for flexible spoken dialogue.
    In Proc. APSIPA ASC, pp.788--793, 2020.

  • M.Wake, M.Togami, K.Yoshii, and T.Kawahara.
    Integration of semi-blind speech source separation and voice activity detection for flexible spoken dialogue.
    In Proc. APSIPA ASC, pp.775--780, 2020.

  • D.Lala, K.Inoue, and T.Kawahara.
    Prediction of shared laughter for human-robot dialogue.
    In Proc. ICMI (Companion; Late Breaking Results), pp.62--66, 2020.

  • K.Inoue, K.Hara, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
    Job interviewer android with elaborate follow-up question generation.
    In Proc. ICMI, pp.324--332, 2020.

  • K.Yamamoto, K.Inoue, and T.Kawahara.
    Semi-supervised learning for character expression of spoken dialogue systems.
    In Proc. INTERSPEECH, pp.4188--4192, 2020.

  • T.V.Dang, T.Zhao, S.Ueno, H.Inaguma, and T.Kawahara.
    End-to-end speech-to-dialog-act recognition.
    In Proc. INTERSPEECH, pp.3910--3914, 2020.

  • H.Futami, H.Inaguma, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
    Distilling the knowledge of BERT for sequence-to-sequence ASR.
    In Proc. INTERSPEECH, pp.3635--3639, 2020.

  • K.Matsuura, M.Mimura, S.Sakai, and T.Kawahara.
    Generative adversarial training data adaptation for very low-resource automatic speech recognition.
    In Proc. INTERSPEECH, pp.2737--2741, 2020.

  • H.Inaguma, M.Mimura, and T.Kawahara.
    Enhancing monotonic multihead attention for streaming ASR.
    In Proc. INTERSPEECH, pp.2137--2141, 2020.

  • H.Inaguma, M.Mimura, and T.Kawahara.
    CTC-synchronous training for monotonic attention model.
    In Proc. INTERSPEECH, pp.571--575, 2020.

  • H.Feng, S.Ueno, and T.Kawahara.
    End-to-end speech emotion recognition combined with acoustic-to-word ASR model.
    In Proc. INTERSPEECH, pp.501--505, 2020.

  • Y.Du, K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, K.Yoshii, and T.Kawahara.
    Semi-supervised multichannel speech separation based on a phone- and speaker-aware deep generative model of speech spectrograms.
    In Proc. EUSIPCO, pp.870--874, 2020.

  • T.Zhao, D.Lala, and T.Kawahara.
    Designing precise and robust dialogue response evaluators.
    In Proc. ACL, pp.26--33, 2020.

  • K.Inoue, D.Lala, K.Yamamoto, S.Nakamura, K.Takanashi, and T.Kawahara.
    An attentive listening system with android ERICA: Comparison of autonomous and WOZ interactions.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.118--127, 2020.

  • S.Nakamura, C.T.Ishi, and T.Kawahara.
    Analysis and modeling of between-sentence pauses in news speech by Japanese newscasters.
    In Proc. Int'l Conf. Speech Prosody, pp.680--684, 2020.

  • S.Isonishi, K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
    Response generation to out-of-database questions for example-based dialogue systems.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020.

  • K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    A character expression model affecting spoken dialogue behaviors.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2020.

  • K.Matsuura, S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
    Speech corpus of Ainu folklore and end-to-end speech recognition for Ainu language.
    In Proc. Int'l Conf. Language Resources & Evaluation (LREC), pp.2622--2628, 2020.

  • H.Inaguma, K.Duh, T.Kawahara, and S.Watanabe.
    Multilingual end-to-end speech translation.
    In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.570--577, 2019.

  • K.Soky, S.Li, T.Kawahara, and S.Seng.
    Multi-lingual transformer training for Khmer automatic speech recognition.
    In Proc. APSIPA ASC, pp.1893--1896, 2019.

  • D.Lala, K.Inoue, and T.Kawahara.
    Smooth turn-taking by a robot using an online continuous model to generate turn-taking cues.
    In Proc. ICMI, pp.226--234, 2019.

  • S.Li, R.Dabre, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
    Improving transformer-based speech recognition systems with compressed structure and speech attributes augmentation.
    In Proc. INTERSPEECH, pp.4400--4404, 2019.

  • D.Lala, S.Nakamura, and T.Kawahara.
    Analysis of effect and timing of fillers in natural turn-taking.
    In Proc. INTERSPEECH, pp.4175--4179, 2019.

  • K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
    Turn-taking prediction based on detection of transition relevance place.
    In Proc. INTERSPEECH, pp.4170--4174, 2019.

  • Y.Li, T.Zhao, and T.Kawahara.
    Improved end-to-end speech emotion recognition using self attention mechanism and multitask learning.
    In Proc. INTERSPEECH, pp.2803--2807, 2019.

  • S.Li, X.Lu, C.Ding, P.Shen, T.Kawahara, and H.Kawai.
    Investigating radical-based end-to-end speech recognition systems for Chinese dialects and Japanese.
    In Proc. INTERSPEECH, pp.2200--2204, 2019.

  • S.Li, C.Ding, X.Lu, P.Shen, T.Kawahara, and H.Kawai.
    End-to-end articulatory attribute modeling for low-resource multilingual speech recognition.
    In Proc. INTERSPEECH, pp.2145--2149, 2019.

  • D.Lala, G.Wilcock, K.Jokinen, and T.Kawahara.
    ERICA and WikiTalk.
    In Proc. IJCAI, Vol. Demo. Paper, pp.6533--6535, 2019.

  • S.Nakamura, C.T.Ishi, and T.Kawahara.
    Prosodic characteristics of Japanese newscaster speech for different speaking situations.
    In Proc. Int'l Congress Phonetic Sciences (ICPhS), 2019.

  • S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
    Multi-speaker sequence-to-sequence speech synthesis for data augmentation in acoustic-to-word speech recognition.
    In Proc. IEEE-ICASSP, pp.6161--6165, 2019.

  • H.Inaguma, J.Cho, M.K.Baskar, T.Kawahara, and S.Watanabe.
    Transfer learning of language-independent end-to-end ASR with language model fusion.
    In Proc. IEEE-ICASSP, pp.6096--6100, 2019.

  • K.Inoue, K.Hara, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
    A job interview dialogue system with autonomous android ERICA.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, 2019.

  • K.Inoue, D.Lala, K.Yamamoto, K.Takanashi, and T.Kawahara.
    Engagement-based adaptive behaviors for laboratory guide in human-robot dialogue.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019.

  • K.Tanaka, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    End-to-end modeling for selection of utterance constructional units via system internal states.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2019.

  • M.Mimura, S.Ueno, H.Inaguma, S.Sakai, and T.Kawahara.
    Leveraging sequence-to-sequence speech synthesis for enhancing acoustic-to-word speech recognition.
    In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.477--484, 2018.

  • H.Inaguma, M.Mimura, S.Sakai, and T.Kawahara.
    Improving OOV detection and resolution with external language models in acoustic-to-word ASR.
    In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.212--218, 2018.

  • S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
    Improving very deep time-delay neural network with vertical-attention for effectively training CTC-based ASR systems.
    In Proc. IEEE Spoken Language Technology Workshop (SLT), pp.77--83, 2018.

  • K.Yamamoto, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    Dialogue behavior control model for expressing a character of humanoid robots.
    In Proc. APSIPA ASC, pp.1732--1737, 2018.

  • K.Sekiguchi, Y.Bando, K.Yoshii, and T.Kawahara.
    Bayesian multichannel speech enhancement with a deep speech prior.
    In Proc. APSIPA ASC, pp.1233--1239, 2018.

  • T.Kawahara.
    Human-like conversational robot.
    In Proc. APSIPA ASC, p. (overview talk), 2018.

  • D.Lala, K.Inoue, and T.Kawahara.
    Evaluation of real-time deep learning turn-taking models for multiple dialogue scenarios.
    In Proc. ICMI, pp.78--86, 2018.

  • K.Yoshii, K.Kitamura, Y.Bando, E.Nakamura, and T.Kawahara.
    Independent low-rank tensor analysis for audio source separation.
    In Proc. EUSIPCO, pp.1671--1675, 2018.

  • S.Li, X.Lu, R.Takashima, P.Shen, T.Kawahara, and H.Kawai.
    Improving CTC-based acoustic model with very deep residual time-delay neural networks.
    In Proc. INTERSPEECH, pp.3708--3712, 2018.

  • S.Ueno, T.Moriya, M.Mimura, S.Sakai, Y.Yamaguchi, Y.Aono, and T.Kawahara.
    Encoder transfer for attention-based acoustic-to-word speech recognition.
    In Proc. INTERSPEECH, pp.2424--2428, 2018.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Forward-backward attention decoder.
    In Proc. INTERSPEECH, pp.2232--2236, 2018.

  • K.Hara, K.Inoue, K.Takanashi, and T.Kawahara.
    Prediction of turn-taking using multitask learning with prediction of backchannels and fillers.
    In Proc. INTERSPEECH, pp.991--995, 2018.

  • K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
    Engagement recognition in spoken dialogue via neural network by aggregating different annotators' models.
    In Proc. INTERSPEECH, pp.616--620, 2018.

  • T.Zhao and T.Kawahara.
    A unified neural architecture for joint dialog act segmentation and recognition in spoken dialog system.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.201--208, 2018.

  • K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
    Latent character model for engagement recognition based on multimodal behaviors.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018.

  • R.Nakanishi, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    Generating fillers based on dialog act pairs for smooth turn-taking by humanoid robot.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2018.

  • T.Kawahara.
    Spoken dialogue system for a human-like conversational robot ERICA.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), p. (keynote speech), 2018.

  • S.Ueno, H.Inaguma, M.Mimura, and T.Kawahara.
    Acoustic-to-word attention-based model complemented with character-level CTC-based model.
    In Proc. IEEE-ICASSP, pp.5804--5808, 2018.

  • Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
    Statistical speech enhancement based on probabilistic integration of variational autoencoder and non-negative matrix factorization.
    In Proc. IEEE-ICASSP, pp.716--720, 2018.

  • K.Shimada, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
    Unsupervised beamforming based on multichannel nonnegative matrix factorization for noisy speech recognition.
    In Proc. IEEE-ICASSP, pp.5734--5738, 2018.

  • R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
    Efficient learning of articulatory models based on multi-label training and label correction for pronunciation learning.
    In Proc. IEEE-ICASSP, pp.6239--6243, 2018.

  • H.Inaguma, M.Mimura, K.Inoue, K.Yoshii, and T.Kawahara.
    An end-to-end approach to joint social signal detection and automatic speech recognition.
    In Proc. IEEE-ICASSP, pp.6214--6218, 2018.

  • T.Kawahara, K.Inoue, D.Lala, and K.Takanashi.
    Audio-visual conversation analysis by smart posterboard and humanoid robot.
    In Proc. IEEE-ICASSP, pp.6573--6577, 2018.

  • T.Hagiya, K.Hoashi, and T.Kawahara.
    Voice input tutoring system for older adults using input stumble detection.
    In Proc. ACM Int'l Conf. Intelligent User Interfaces (IUI), pp.415--419, 2018.

  • S.Li, X.Lu, P.Shen, R.Takashima, T.Kawahara, and H.Kawai.
    Incremental training and constructing the very deep convolutional residual network acoustic models.
    In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.222--227, 2017.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks.
    In Proc. IEEE Workshop Automatic Speech Recognition & Understanding (ASRU), pp.134--140, 2017.

  • Y.Li, C.T.Ishi, N.Ward, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    Emotion recognition by combining prosody and sentiment analysis for expressing reactive emotion by humanoid robot.
    In Proc. APSIPA ASC, 2017.

  • T.Kawahara.
    Automatic meeting transcription system for the Japanese Parliament (Diet).
    In Proc. APSIPA ASC, p. (overview talk), 2017.

  • T.Zhao and T.Kawahara.
    Joint learning of dialog act segmentation and recognition in spoken dialog using neural networks.
    In Proc. IJCNLP, pp.704--712, 2017.

  • T.Kawahara.
    Modeling difficulties of second language learners using speech technology.
    In Proc. Seoul International Conference on Speech Sciences (SICSS), p. 11 (keynote speech), 2017.

  • D.Lala, K.Inoue, P.Milhorat, and T.Kawahara.
    Detection of social signals for recognizing engagement in human-robot interaction.
    In Proc. AAAI Fall Sympo. Natural Communication for Human-Robot Collaboration, 2017.

  • M.Wake, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
    Semi-blind speech enhancement based on recurrent neural network for source separation and dereverberation.
    In Proc. IEEE Workshop Machine Learning for Signal Processing (MLSP), 2017.

  • M.Mirzaei, K.Meshgi, and T.Kawahara.
    Detecting listening difficulty for second language learners using automatic speech recognition errors.
    In Proc. Workshop Speech & Language Technology for Education (SLaTE), pp.164--168, 2017.

  • R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
    Transfer learning based non-native acoustic modeling for pronunciation error detection.
    In Proc. Workshop Speech & Language Technology for Education (SLaTE), pp.50--54, 2017.

  • M.Mirzaei, K.Meshgi, and T.Kawahara.
    Listening difficulty detection to foster second language listening with the partial and synchronized caption system.
    In Proc. EUROCALL, pp.211--216, 2017.

  • M.Mimura, Y.Bando, K.Shimada, S.Sakai, K.Yoshii, and T.Kawahara.
    Combined multi-channel NMF-based robust beamforming for noisy speech recognition.
    In Proc. INTERSPEECH, pp.2451--2455, 2017.

  • S.Nakamura, R.Nakanishi, K.Takanashi, and T.Kawahara.
    Analysis of the relationship between prosodic features of fillers and its forms or occurrence positions.
    In Proc. INTERSPEECH, pp.1726--1730, 2017.

  • H.Inaguma, K.Inoue, M.Mimura, and T.Kawahara.
    Social signal detection in spontaneous dialogue using bidirectional LSTM-CTC.
    In Proc. INTERSPEECH, pp.1691--1695, 2017.

  • D.Lala, P.Milhorat, K.Inoue, M.Ishida, K.Takanashi, and T.Kawahara.
    Attentive listening system with backchanneling, response generation and flexible turn-taking.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.127--136, 2017.

  • P.Milhorat, D.Lala, K.Inoue, T.Zhao, M.Ishida, K.Takanashi, S.Nakamura, and T.Kawahara.
    A conversational dialogue manager for the humanoid robot ERICA.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2017.

  • R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
    Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data.
    In Proc. IEEE-ICASSP, pp.5815--5819, 2017.

  • S.Li, X.Lu, S.Sakai, M.Mimura, and T.Kawahara.
    Semi-supervised ensemble DNN acoustic model training.
    In Proc. IEEE-ICASSP, pp.5270--5274, 2017.

  • K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
    Bayesian multichannel nonnegative matrix factorization for audio source separation and localization.
    In Proc. IEEE-ICASSP, pp.551--555, 2017.

  • D.Lala, Y.Li, and T.Kawahara.
    Utterance behavior of users while playing basketball with a virtual teammate.
    In Proc. ICAART, pp.28--38, 2017.

  • R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
    Multi-lingual and multi-task DNN learning for articulatory error detection.
    In Proc. APSIPA ASC, 2016.

  • M.Mirzaei, K.Meshgi, and T.Kawahara.
    ASR errors as predictor of L2 listening difficulties and PSC enhancement.
    In Proc. Coling Workshop on Computational Linguistics for Linguistic Complexity (CL4LC), pp.192--201, 2016.

  • K.Inoue, D.Lala, S.Nakamura, K.Takanashi, and T.Kawahara.
    Annotation and analysis of listener's engagement based on multi-modal behaviors.
    In Proc. ICMI Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI), 2016.

  • H.Inaguma, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
    Prediction of ice-breaking between participants using prosodic features in the first meeting dialogue.
    In Proc. ICMI Workshop on Advancements in Social Signal Processing for Multimodal Interaction (ASSP4MI), 2016.

  • D.Lala, P.Milhorat, K.Inoue, T.Zhao, and T.Kawahara.
    Multimodal interaction with the autonomous android ERICA.
    In Proc. ICMI, Vol.Demo. Paper, pp.417--418, 2016.

  • R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
    Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning.
    In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), 2016.

  • S.Li, X.Lu, S.Mori, Y.Akita, and T.Kawahara.
    Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data.
    In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), 2016.

  • D.Lala and T.Kawahara.
    Managing dialog and joint actions for virtual basketball teammates.
    In Proc. IVA, Vol.Poster, 2016.

  • K.Inoue, P.Milhorat, D.Lala, T.Zhao, and T.Kawahara.
    Talking with ERICA, an autonomous android.
    In Proc. SIGdial Meeting Discourse \& Dialogue, Vol.Demo. Paper, pp.212--215, 2016.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Joint optimization of denoising autoencoder and DNN acoustic model based on multi-target learning for noisy speech recognition.
    In Proc. INTERSPEECH, pp.3803--3807, 2016.

  • T.Kawahara, T.Yamaguchi, K.Inoue, K.Takanashi, and N.Ward.
    Prediction and generation of backchannel form for attentive listening systems.
    In Proc. INTERSPEECH, pp.2890--2894, 2016.

  • D.F.Glas, T.Minato, C.T.Ishi, T.Kawahara, and H.Ishiguro.
    ERICA: The ERATO Intelligent Conversational Android.
    In Proc. RO-MAN, pp.22--29, 2016.

  • M.Mirzaei, K.Meshgi, and T.Kawahara.
    Leveraging automatic speech recognition errors to detect challenging speech segments in TED talks.
    In Proc. EUROCALL, pp.313--318, 2016.

  • N.Ward, Y.Li, T.Zhao, and T.Kawahara.
    Interactional and pragmatics-related prosodic patterns in Mandarin dialog.
    In Proc. Int'l Conf. Speech Prosody, 2016.

  • S.Li, Y.Akita, and T.Kawahara.
    Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training.
    In Proc. IEEE-ICASSP, pp.5875--5879, 2016.

  • T.Yamaguchi, K.Inoue, K.Yoshino, K.Takanashi, N.Ward, and T.Kawahara.
    Analysis and prediction of morphological patterns of backchannels for attentive listening agents.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2016.

  • T.Kawahara, T.Yamaguchi, M.Uesato, K.Yoshino, and K.Takanashi.
    Synchrony in prosodic and linguistic features between backchannels and preceding utterances in attentive listening.
    In Proc. APSIPA ASC, pp.392--395, 2015.

  • Y.Akita, N.Kuwahara, and T.Kawahara.
    Automatic classification of usability of ASR result for real-time captioning of lectures.
    In Proc. APSIPA ASC, pp.19--22, 2015.

  • S.Li, Y.Akita, and T.Kawahara.
    Discriminative data selection for lightly supervised training of acoustic model using closed caption texts.
    In Proc. INTERSPEECH, pp.3526--3530, 2015.

  • K.Inoue, Y.Wakabayashi, H.Yoshimoto, K.Takanashi, and T.Kawahara.
    Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations.
    In Proc. INTERSPEECH, pp.3086--3090, 2015.

  • S.Li, X.Lu, Y.Akita, and T.Kawahara.
    Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation.
    In Proc. INTERSPEECH, pp.2892--2896, 2015.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Speech dereverberation using long short-term memory.
    In Proc. INTERSPEECH, pp.2435--2439, 2015.

  • M.Mirzaei and T.Kawahara.
    ASR technology to empower partial and synchronized caption for L2 listening development.
    In Proc. Workshop Speech \& Language Technology for Education (SLaTE), pp.65--70, 2015.

  • M.Mirzaei, K.Meshgi, Y.Akita, and T.Kawahara.
    Errors in automatic speech recognition versus difficulties in second language listening.
    In Proc. EUROCALL, pp.410--415, 2015.

  • T.Sasada, S.Mori, T.Kawahara, and Y.Yamakata.
    Named entity recognizer trainable from partially annotated data.
    In Proc. PACLING, pp.10--17, 2015.

  • Y.Akita, Y.Tong, and T.Kawahara.
    Language model adaptation for academic lectures using character recognition result of presentation slides.
    In Proc. IEEE-ICASSP, pp.5431--5435, 2015.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Deep autoencoders augmented with phone-class feature for reverberant speech recognition.
    In Proc. IEEE-ICASSP, pp.4365--4369, 2015.

  • T.Kawahara, M.Uesato, K.Yoshino, and K.Takanashi.
    Toward adaptive generation of backchannels for attentive listening agents.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015.

  • K.Yoshino and T.Kawahara.
    News navigation system based on proactive dialogue strategy.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2015.

  • Y.Wakabayashi, K.Inoue, H.Yoshimoto, and T.Kawahara.
    Speaker diarization based on audio-visual integration for smart posterboard.
    In Proc. APSIPA ASC, 2014.

  • M.Mimura and T.Kawahara.
    Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription.
    In Proc. APSIPA ASC, 2014.

  • M.Mirzaei, Y.Akita, and T.Kawahara.
    Partial and synchronized caption generation to develop second language listening skill.
    In ICCE Workshop on Natural Language Processing Techniques for Educational Applications (NLP-TEA), pp.13--23, 2014.

  • K.Sudoh, M.Nagata, S.Mori, and T.Kawahara.
    Japanese-to-English patent translation system based on domain-adapted word segmentation and post-ordering.
    In Proc. Assoc. for Machine Translation in the Americas (AMTA), Vol.1, pp.234--248, 2014.

  • K.Inoue, Y.Wakabayashi, H.Yoshimoto, and T.Kawahara.
    Speaker diarization using eye-gaze information in multi-party conversations.
    In Proc. INTERSPEECH, pp.562--566, 2014.

  • S.Li, Y.Akita, and T.Kawahara.
    Corpus and transcription system of Chinese Lecture Room.
    In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), pp.442--445, 2014.

  • M.Mirzaei, Y.Akita, and T.Kawahara.
    Partial and synchronized captioning: A new tool for second language listening development.
    In Proc. EUROCALL, pp.230--236, 2014.

  • K.Yoshino and T.Kawahara.
    Information navigation system based on POMDP that tracks user focus.
    In Proc. SIGdial Meeting Discourse \& Dialogue, pp.32--40, 2014.

  • M.Mimura, S.Sakai, and T.Kawahara.
    Exploring deep neural networks and deep autoencoders in reverberant speech recognition.
    In Workshop on Hands-free Speech Communication \& Microphone Arrays (HSCMA), 2014.

  • T.Kawahara.
    Smart posterboard: Multi-modal sensing and analysis of poster conversations.
    In Proc. APSIPA ASC, p. (plenary overview talk), 2013.

  • K.Yoshino, S.Mori, and T.Kawahara.
    Predicate argument structure analysis using partially annotated corpora.
    In Proc. IJCNLP, pp.957--961, 2013.

  • T.Kawahara, S.Hayashi, and K.Takanashi.
    Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations.
    In Proc. INTERSPEECH, pp.1882--1885, 2013.

  • K.Yoshino, S.Mori, and T.Kawahara.
    Incorporating semantic information to selection of web texts for language model of spoken dialogue system.
    In Proc. IEEE-ICASSP, pp.8252--8256, 2013.

  • K.Yoshino, S.Mori, and T.Kawahara.
    Language modeling for spoken dialogue system based on filtering using predicate-argument structures.
    In Proc. COLING, pp.2993--3002, 2012.

  • C.Lee and T.Kawahara.
    Hybrid vector space model for flexible voice search.
    In Proc. APSIPA ASC, 2012.

  • K.Yoshino, S.Mori, and T.Kawahara.
    Language modeling for spoken dialogue system based on sentence transformation and filtering using predicate-argument structures.
    In Proc. APSIPA ASC, 2012.

  • Y.Akita, M.Watanabe, and T.Kawahara.
    Automatic transcription of lecture speech using language model based on speaking-style transformation of proceeding texts.
    In Proc. INTERSPEECH, 2012.

  • R.Gomez and T.Kawahara.
    Dereverberation based on wavelet packet filtering for robust automatic speech recognition.
    In Proc. INTERSPEECH, 2012.

  • T.Kawahara, T.Iwatate, and K.Takanashi.
    Prediction of turn-taking by combining prosodic and eye-gaze information in poster conversations.
    In Proc. INTERSPEECH, 2012.

  • T.Kawahara, T.Iwatate, T.Tsuchiya, and K.Takanashi.
    Can we predict who in the audience will ask what kind of questions with their feedback behaviors in poster conversation?
    In Proc. Interdisciplinary Workshop on Feedback Behaviors in Dialog, pp.35--38, 2012.

  • T.Kawahara.
    Transcription system using automatic speech recognition for the Japanese Parliament (Diet).
    In Proc. AAAI/IAAI, pp.2224--2228, 2012.

  • G.Neubig, T.Watanabe, S.Mori, and T.Kawahara.
    Machine translation without words through substring alignment.
    In Proc. ACL, pp.165--174, 2012.

  • T.Kawahara.
    Multi-modal sensing and analysis of poster conversations toward smart posterboard.
    In Proc. SIGdial Meeting Discourse \& Dialogue, pp.1--9 (keynote speech), 2012.

  • M.Ablimit, T.Kawahara, and A.Hamdulla.
    Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language.
    In Proc. IEEE-ICASSP, pp.5009--5012, 2012.

  • M.Ablimit, A.Hamdulla, and T.Kawahara.
    Morpheme concatenation approach in language modeling for large-vocabulary Uyghur speech recognition.
    In Proc. Oriental-COCOSDA Workshop, 2011.

  • R.Gomez and T.Kawahara.
    Optimized wavelet-based speech enhancement for speech recognition in noisy and reverberant conditions.
    In Proc. APSIPA ASC, 2011.

  • M.Mimura and T.Kawahara.
    Fast speaker normalization and adaptation based on BIC for meeting speech recognition.
    In Proc. APSIPA ASC, 2011.

  • M.Ablimit, T.Kawahara, and A.Hamdulla.
    Lexicon optimization for automatic speech recognition based on discriminative learning.
    In Proc. APSIPA ASC, 2011.

  • H.Wang, T.Kawahara, and Y.Wang.
    Improving non-native speech recognition performance by discriminative training for language model in a CALL system.
    In Proc. APSIPA ASC, 2011.

  • T.Hirayama, Y.Sumi, T.Kawahara, and T.Matsuyama.
    Info-concierge: Proactive multi-modal interaction through mind probing.
    In Proc. APSIPA ASC, 2011.

  • C.Lee, T.Kawahara, and A.Rudnicky.
    Combining slot-based vector space model for voice book search.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), pp.27--35, 2011.

  • Y.Akita and T.Kawahara.
    Automatic comma insertion of lecture transcripts based on multiple annotations.
    In Proc. INTERSPEECH, pp.2889--2892, 2011.

  • R.Gomez and T.Kawahara.
    Denoising using optimized wavelet filtering for automatic speech recognition.
    In Proc. INTERSPEECH, pp.1673--1676, 2011.

  • G.Neubig, T.Watanabe, E.Sumita, S.Mori, and T.Kawahara.
    An unsupervised model for joint phrase alignment and extraction.
    In Proc. ACL-HLT, pp.632--641, 2011.

  • K.Yoshino, S.Mori, and T.Kawahara.
    Spoken dialogue system based on information extraction using similarity of predicate argument structures.
    In Proc. SIGdial Meeting Discourse \& Dialogue, pp.59--66, 2011.

  • T.Kawahara, H.Wang, Y.Tsubota, and M.Dantsuji.
    English and Japanese CALL systems developed at Kyoto University.
    In Proc. APSIPA ASC, pp.804--810, 2010.

  • R.Gomez and T.Kawahara.
    Optimizing wavelet parameters for dereverberation in automatic speech recognition.
    In Proc. APSIPA ASC, pp.446--449, 2010.

  • T.Kawahara.
    Automatic transcription of parliamentary meetings and classroom lectures -- a sustainable approach and real system evaluations --.
    In Proc. Int'l Sympo. Chinese Spoken Language Processing (ISCSLP), pp.1--6 (keynote speech), 2010.

  • M.Ablimit, G.Neubig, M.Mimura, S.Mori, T.Kawahara, and A.Hamdulla.
    Uyghur morpheme-based language models and ASR.
    In Proc. Int'l Conf. Signal Processing, pp.581--584, 2010.

  • K.Yoshino and T.Kawahara.
    Spoken dialogue system based on information extraction from web text.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), Vol. Demo. Paper, pp.196--197, 2010.

  • T.Kawahara, K.Sumi, Z.Q.Chang, and K.Takanashi.
    Detection of hot spots in poster conversations based on reactive tokens of audience.
    In Proc. INTERSPEECH, pp.3042--3045, 2010.

  • G.Neubig, M.Mimura, S.Mori, and T.Kawahara.
    Learning a language model from continuous speech.
    In Proc. INTERSPEECH, pp.1053--1056, 2010.

  • Y.Itoh, H.Nishizaki, X.Hu, H.Nanjo, T.Akiba, T.Kawahara, S.Nakagawa, T.Matsui, Y.Yamashita, and K.Aikawa.
    Constructing Japanese test collections for spoken term detection.
    In Proc. INTERSPEECH, pp.677--680, 2010.

  • T.Kawahara, N.Katsumaru, Y.Akita, and S.Mori.
    Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures.
    In Proc. INTERSPEECH, pp.626--629, 2010.

  • R.Gomez and T.Kawahara.
    An improved wavelet-based dereverberation for robust automatic speech recognition.
    In Proc. INTERSPEECH, pp.578--581, 2010.

  • Y.Akita, M.Mimura, G.Neubig, and T.Kawahara.
    Semi-automated update of automatic transcription system for the Japanese national congress.
    In Proc. INTERSPEECH, pp.338--341, 2010.

  • T.Kawahara, Z.Q.Chang, and K.Takanashi.
    Analysis on prosodic features of Japanese reactive tokens in poster conversations.
    In Proc. Int'l Conf. Speech Prosody, 2010.

  • G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
    Improved statistical models for SMT-based speaking style transformation.
    In Proc. IEEE-ICASSP, pp.5206--5209, 2010.

  • R.Gomez and T.Kawahara.
    Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions.
    In Proc. IEEE-ICASSP, pp.4566--4569, 2010.

  • D.Cournapeau, S.Watanabe, A.Nakamura, and T.Kawahara.
    Using online model comparison in the Variational Bayes framework for online unsupervised voice activity detection.
    In Proc. IEEE-ICASSP, pp.4462--4465, 2010.

  • T.Kawahara.
    New perspectives on spoken language understanding: Does machine need to fully understand speech?
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.46--50 (invited paper), 2009.

  • T.Misu, K.Sugiura, T.Kawahara, K.Ohtake, C.Hori, H.Kashioka, and S.Nakamura.
    Online learning of Bayes risk-based optimization of dialogue management.
    In Proc. Int'l Workshop Spoken Dialogue Systems (IWSDS), 2009.

  • R.Gomez and T.Kawahara.
    Tight integration of dereverberation and automatic speech recognition.
    In Proc. APSIPA ASC, pp.639--643, 2009.

  • T.Akiba, K.Aikawa, Y.Itoh, T.Kawahara, H.Nanjo, H.Nishizaki, N.Yasuda, Y.Yamashita, and K.Itou.
    Developing an SDR test collection from Japanese lecture audio data.
    In Proc. APSIPA ASC, pp.324--330, 2009.

  • K.Katsurada, A.Lee, T.Kawahara, T.Yotsukura, S.Morishima, T.Nishimoto, Y.Yamashita, and T.Nitta.
    Development of a toolkit for spoken dialog systems with an anthropomorphic agent: Galatea.
    In Proc. APSIPA ASC, pp.148--153, 2009.

  • A.Lee and T.Kawahara.
    Recent development of open-source speech recognition engine Julius.
    In Proc. APSIPA ASC, pp.131--137, 2009.

  • G.Neubig, S.Mori, and T.Kawahara.
    A WFST-based log-linear framework for speaking-style transformation.
    In Proc. INTERSPEECH, pp.1495--1498, 2009.

  • R.Gomez and T.Kawahara.
    Optimization of dereverberation parameters based on likelihood of speech recognizer.
    In Proc. INTERSPEECH, pp.1223--1226, 2009.

  • K.Sumi, T.Kawahara, J.Ogata, and M.Goto.
    Acoustic event detection for spotting hot spots in podcasts.
    In Proc. INTERSPEECH, pp.1143--1146, 2009.

  • Y.Akita, M.Mimura, and T.Kawahara.
    Automatic transcription system for meetings of the Japanese national congress.
    In Proc. INTERSPEECH, pp.84--87, 2009.

  • K.Komatani, T.Kawahara, and H.G.Okuno.
    A model of temporally changing user behaviors in a deployed spoken dialogue system.
    In Proc. Int'l Conf. User Modeling, Adaptation, and Personalization (UMAP) (LNCS 5535), pp.409--414, 2009.

  • T.Kawahara, M.Mimura, and Y.Akita.
    Language model transformation applied to lightly supervised training of acoustic model for congress meetings.
    In Proc. IEEE-ICASSP, pp.3853--3856, 2009.

  • M.Ablimit, M.Eli, and T.Kawahara.
    Partly supervised Uighur morpheme segmentation.
    In Proc. Oriental-COCOSDA Workshop, pp.71--76, 2008.

  • T.Shinozaki, S.Furui, and T.Kawahara.
    Aggregated cross-validation and its efficient application to Gaussian mixture optimization.
    In Proc. INTERSPEECH, pp.2382--2385, 2008.

  • T.Sasada, S.Mori, and T.Kawahara.
    Extracting word-pronunciation pairs from comparable set of text and speech.
    In Proc. INTERSPEECH, pp.1821--1824, 2008.

  • H.Wang and T.Kawahara.
    A Japanese CALL system based on dynamic question generation and error prediction for ASR.
    In Proc. INTERSPEECH, pp.1737--1740, 2008.

  • T.Kawahara, M.Toyokura, T.Misu, and C.Hori.
    Detection of feeling through back-channels in spoken dialogue.
    In Proc. INTERSPEECH, p. 1696, 2008.

  • T.Kawahara, H.Setoguchi, K.Takanashi, K.Ishizuka, and S.Araki.
    Multi-modal recording, analysis and indexing of poster sessions.
    In Proc. INTERSPEECH, pp.1622--1625, 2008.

  • K.Komatani, T.Kawahara, and H.G.Okuno.
    Predicting ASR errors by exploiting barge-in rate of individual users for spoken dialogue systems.
    In Proc. INTERSPEECH, pp.183--186, 2008.

  • K.Ishizuka, S.Araki, and T.Kawahara.
    Statistical speech activity detection based on spatial power distribution for analyses of poster presentations.
    In Proc. INTERSPEECH, pp.99--102, 2008.

  • T.Misu and T.Kawahara.
    Bayes risk-based dialogue management for document retrieval system with speech interface.
    In Proc. COLING, Vol.Posters \& Demo., pp.59--62, 2008.

  • H.Wang and T.Kawahara.
    Effective error prediction using decision tree for ASR grammar network in CALL system.
    In Proc. IEEE-ICASSP, pp.5069--5072, 2008.

  • T.Kawahara, Y.Nemoto, and Y.Akita.
    Automatic lecture transcription by exploiting presentation slide information for language model adaptation.
    In Proc. IEEE-ICASSP, pp.4929--4932, 2008.

  • S.Sakai, T.Kawahara, and S.Nakamura.
    Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis.
    In Proc. IEEE-ICASSP, pp.4613--4616, 2008.

  • D.Cournapeau and T.Kawahara.
    Using Variational Bayes Free Energy for unsupervised voice activity detection.
    In Proc. IEEE-ICASSP, pp.4429--4432, 2008.

  • T.Shinozaki and T.Kawahara.
    GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.
    In Proc. IEEE-ICASSP, pp.4405--4408, 2008.

  • T.Shinozaki and T.Kawahara.
    HMM training based on CV-EM and CV Gaussian mixture optimization.
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.318--322, 2007.

  • H.Setoguchi, K.Takanashi, and T.Kawahara.
    Multi-modal conversational analysis of poster presentations using multiple sensors.
    In Proc. ICMI Workshop on Tagging, Mining and Retrieval of Human Related Activity Information, pp.44--47, 2007.

  • D.Cournapeau and T.Kawahara.
    Evaluation of real-time voice activity detection based on high order statistics.
    In Proc. INTERSPEECH, pp.2945--2948, 2007.

  • T.Misu and T.Kawahara.
    Bayes risk-based optimization of dialogue management for document retrieval system with speech interface.
    In Proc. INTERSPEECH, pp.2705--2708, 2007.

  • C.Waple, H.Wang, T.Kawahara, Y.Tsubota, and M.Dantsuji.
    Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance.
    In Proc. INTERSPEECH, pp.2177--2180, 2007.

  • T.Shinozaki and T.Kawahara.
    Gaussian mixture optimization for HMM based on efficient cross-validation.
    In Proc. INTERSPEECH, pp.2061--2064, 2007.

  • Y.Akita, Y.Nemoto, and T.Kawahara.
    PLSA-based topic detection in meetings for adaptation of lexicon and language model.
    In Proc. INTERSPEECH, pp.602--605, 2007.

  • K.Komatani, T.Kawahara, and H.G.Okuno.
    Analyzing temporal transition of real user's behaviors in a spoken dialogue system.
    In Proc. INTERSPEECH, pp.142--145, 2007.

  • T.Misu and T.Kawahara.
    An interactive framework for document retrieval and presentation with question-answering function in restricted domain.
    In Proc. Int'l Conf. Industrial, Engineering \& Other Applications of Applied Intelligent Systems (IEA/AIE) (LNAI 4570), pp.126--134, 2007.

  • T.Misu and T.Kawahara.
    Speech-based interactive information guidance system using question-answering technique.
    In Proc. IEEE-ICASSP, Vol.4, pp.145--148, 2007.

  • T.Kawahara, M.Saikou, and K.Takanashi.
    Automatic detection of sentence and clause units using local syntactic dependency.
    In Proc. IEEE-ICASSP, Vol.4, pp.125--128, 2007.

  • Y.Akita and T.Kawahara.
    Topic-independent speaking-style transformation of language model for spontaneous speech recognition.
    In Proc. IEEE-ICASSP, Vol.4, pp.33--36, 2007.

  • T.Kawahara.
    Intelligent transcription system based on spontaneous speech processing.
    In Proc. Int'l Conference on Informatics Research for Development of Knowledge Society Infrastructure, pp.19--26, 2007.

  • Y.Kida and T.Kawahara.
    Evaluation of voice activity detection by combining multiple features with weight adaptation.
    In Proc. INTERSPEECH, pp.1966--1969, 2006.

  • S.Sakai and T.Kawahara.
    Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis.
    In Proc. INTERSPEECH, pp.1746--1749, 2006.

  • D.Cournapeau, T.Kawahara, K.Mase, and T.Toriyama.
    Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm.
    In Proc. INTERSPEECH, pp.1201--1204, 2006.

  • Y.Akita, M.Saikou, H.Nanjo, and T.Kawahara.
    Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines.
    In Proc. INTERSPEECH, pp.1033--1036, 2006.

  • C.Waple, Y.Tsubota, M.Dantsuji, and T.Kawahara.
    Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints.
    In Proc. INTERSPEECH, pp.821--824, 2006.

  • R.Hamabe, K.Uchimoto, T.Kawahara, and H.Isahara.
    Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
    In Proc. INTERSPEECH, pp.729--732, 2006.

  • T.Misu and T.Kawahara.
    A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts.
    In Proc. INTERSPEECH, pp.9--12, 2006.

  • R.Hamabe, K.Uchimoto, T.Kawahara, and H.Isahara.
    Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
    In Proc. COLING-ACL, Vol.Poster Sessions, pp.324--330, 2006.

  • Y.Akita, C.Troncoso, and T.Kawahara.
    Automatic transcription of meetings using topic-oriented language model adaptation.
    In Proc. Western Pacific Acoustics Conference (WESPAC), 2006.

  • H.Nanjo, Y.Akita, and T.Kawahara.
    Computer assisted speech transcription system for efficient speech archive.
    In Proc. Western Pacific Acoustics Conference (WESPAC), 2006.

  • Y.Akita and T.Kawahara.
    Efficient estimation of language model statistics of spontaneous speech via statistical transformation model.
    In Proc. IEEE-ICASSP, Vol.1, pp.1049--1052, 2006.

  • T.Misu and T.Kawahara.
    Speech-based information retrieval system with clarification dialogue strategy.
    In Proc. Human Language Technology Conf. (HLT/EMNLP), pp.1003--1010, 2005.

  • Y.Kida and T.Kawahara.
    Voice activity detection based on optimally weighted combination of multiple features.
    In Proc. INTERSPEECH, pp.2621--2624, 2005.

  • C.Troncoso and T.Kawahara.
    Trigger-based language model adaptation for automatic meeting transcription.
    In Proc. INTERSPEECH, pp.1297--1300, 2005.

  • T.Misu and T.Kawahara.
    Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
    In Proc. INTERSPEECH, pp.637--640, 2005.

  • H.Nanjo, T.Misu, and T.Kawahara.
    Minimum Bayes-risk decoding considering word significance for information retrieval system.
    In Proc. INTERSPEECH, pp.561--564, 2005.

  • I.R.Lane and T.Kawahara.
    Utterance verification incorporating in-domain confidence and discourse coherence measures.
    In Proc. INTERSPEECH, pp.421--424, 2005.

  • C.Troncoso, T.Kawahara, H.Yamamoto, and G.Kikui.
    Trigger-based language model construction by combining different corpora.
    In Proc. Pacific Assoc. Computational Linguistics (PACLING), pp.340--344, 2005.

  • H.Nanjo and T.Kawahara.
    A new ASR evaluation measure and minimum Bayes-risk decoding for open-domain speech understanding.
    In Proc. IEEE-ICASSP, Vol.1, pp.1053--1056, 2005.

  • I.R.Lane and T.Kawahara.
    Incorporating dialogue context and topic clustering in out-of-domain detection.
    In Proc. IEEE-ICASSP, Vol.1, pp.1045--1048, 2005.

  • Y.Akita and T.Kawahara.
    Generalized statistical modeling of pronunciation variations using variable-length phone context.
    In Proc. IEEE-ICASSP, Vol.1, pp.689--692, 2005.

  • T.Kawahara, A.Lee, K.Takeda, K.Itou, and K.Shikano.
    Recent progress of open-source LVCSR engine Julius and Japanese model repository.
    In Proc. INTERSPEECH, pp.3069--3072, 2004.

  • K.Shitaoka, H.Nanjo, and T.Kawahara.
    Automatic transformation of lecture transcription into document style using statistical framework.
    In Proc. INTERSPEECH, pp.2881--2884, 2004.

  • S.Ueno, I.R.Lane, and T.Kawahara.
    Example-based training of dialogue planning incorporating user and situation models.
    In Proc. INTERSPEECH, pp.2837--2840, 2004.

  • I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
    Topic classification and verification modeling for out-of-domain utterance detection.
    In Proc. INTERSPEECH, pp.2197--2200, 2004.

  • T.Kitade, H.Nanjo, and T.Kawahara.
    Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers.
    In Proc. INTERSPEECH, pp.2169--2172, 2004.

  • Y.Tsubota, M.Dantsuji, and T.Kawahara.
    Practical use of English pronunciation system for Japanese students in the CALL classroom.
    In Proc. INTERSPEECH, pp.1689--1692, 2004.

  • K.Shitaoka, K.Uchimoto, T.Kawahara, and H.Isahara.
    Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
    In Proc. INTERSPEECH, pp.1353--1356, 2004.

  • Y.Akita and T.Kawahara.
    Language model adaptation based on PLSA of topics and speakers.
    In Proc. INTERSPEECH, pp.1045--1048, 2004.

  • T.Misu, K.Komatani, and T.Kawahara.
    Confirmation strategy for document retrieval systems with spoken dialog interface.
    In Proc. INTERSPEECH, pp.45--48, 2004.

  • K.Komatani, T.Misu, T.Kawahara, and H.G.Okuno.
    Efficient confirmation strategy for large-scale text retrieval systems with spoken dialogue interface.
    In Proc. COLING, pp.1100--1106, 2004.

  • K.Shitaoka, K.Uchimoto, T.Kawahara, and H.Isahara.
    Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
    In Proc. COLING, pp.1107--1113, 2004.

  • Y.Akita, M.Hasegawa, and T.Kawahara.
    Automatic audio archiving system for panel discussions.
    In Proc. IEEE Int'l Conf. Multimedia and Expo (ICME), 2004.

  • Y.Tsubota, M.Dantsuji, and T.Kawahara.
    Practical use of autonomous English pronunciation learning system for Japanese students.
    In Proc. InSTIL/ICALL -- NLP and Speech Technologies in Advanced Language Learning Systems, pp.139--142, 2004.

  • A.Lee, K.Shikano, and T.Kawahara.
    Real-time word confidence scoring using local posterior probabilities on tree trellis search.
    In Proc. IEEE-ICASSP, Vol.1, pp.793--796, 2004.

  • I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
    Out-of-domain detection based on confidence measures from multiple topic classification.
    In Proc. IEEE-ICASSP, Vol.1, pp.757--760, 2004.

  • H.Nanjo, T.Kitade, and T.Kawahara.
    Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers.
    In Proc. IEEE-ICASSP, Vol.1, pp.449--452, 2004.

  • M.Nishida and T.Kawahara.
    Speaker indexing and adaptation using speaker clustering based on statistical model selection.
    In Proc. IEEE-ICASSP, Vol.1, pp.353--356, 2004.

  • K.Komatani, R.Ito, T.Kawahara, and H.G.Okuno.
    Recognition of emotional states in spoken dialogue with a robot.
    In Proc. Int'l Conf. Industrial \& Engineering Applications of Artificial Intelligence \& Expert Systems (IEA/AIE) (LNAI 3029), pp.413--423, 2004.

  • T.Kawahara.
    Automatic speech transcription and archiving system using the Corpus of Spontaneous Japanese.
    In Proc. Int'l Congress Acoustics (ICA), pp.161--164, 2004.

  • T.Kawahara.
    Spoken language processing for audio archives of lectures and panel discussions.
    In Proc. Int'l Conference on Informatics Research for Development of Knowledge Society Infrastructure, pp.23--30, 2004.

  • T.Kawahara, T.Kitade, K.Shitaoka, and H.Nanjo.
    Efficient access to lecture audio archives through spoken language processing.
    In Proc. Special Workshop in Maui (SWIM), 2004.

  • T.Kawahara, K.Shitaoka, T.Kitade, and H.Nanjo.
    Automatic indexing of key sentences for lecture archives.
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), 2003.

  • Y.Tsubota, M.Dantsuji, and T.Kawahara.
    An English pronunciation learning system for Japanese students based on diagnosis of critical pronunciation errors.
    In Proc. EUROCALL, p. 204, 2003.

  • Y.Akita and T.Kawahara.
    Unsupervised speaker indexing using anchor models and automatic transcription of discussions.
    In Proc. EUROSPEECH, pp.2985--2988, 2003.

  • M.Nishida and T.Kawahara.
    Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation.
    In Proc. EUROSPEECH, pp.1849--1852, 2003.

  • T.Kawahara, R.Ito, and K.Komatani.
    Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy.
    In Proc. EUROSPEECH, pp.1701--1704, 2003.

  • K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
    User modeling in spoken dialogue systems for flexible guidance generation.
    In Proc. EUROSPEECH, pp.745--748, 2003.

  • I.R.Lane, T.Matsui, S.Nakamura, and T.Kawahara.
    Hierarchical topic classification for dialog speech recognition based on language model switching.
    In Proc. EUROSPEECH, pp.429--432, 2003.

  • K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
    Flexible guidance generation using user model in spoken dialogue systems.
    In Proc. ACL, pp.256--263, 2003.

  • Y.Kiyota, S.Kurohashi, T.Misu, K.Komatani, T.Kawahara, and F.Kido.
    Dialog navigator: A spoken dialog Q-A system based on large text knowledge base.
    In Proc. ACL, Vol.Interactive Poster \& Demo., pp.149--152, 2003.

  • K.Komatani, F.Adachi, S.Ueno, T.Kawahara, and H.G.Okuno.
    Flexible spoken dialogue system based on user models and dynamic generation of VoiceXML scripts.
    In Proc. SIGdial Meeting Discourse \& Dialogue, pp.87--96, 2003.

  • T.Kawahara, H.Nanjo, T.Shinozaki, and S.Furui.
    Benchmark test for speech recognition using the Corpus of Spontaneous Japanese.
    In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.135--138, 2003.

  • H.Nanjo, K.Shitaoka, and T.Kawahara.
    Automatic transformation of lecture transcription into document style using statistical framework.
    In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.215--218, 2003.

  • Y.Akita, M.Nishida, and T.Kawahara.
    Automatic transcription of discussions using unsupervised speaker indexing.
    In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.79--82, 2003.

  • H.Nanjo and T.Kawahara.
    Unsupervised language model adaptation for lecture speech recognition.
    In Proc. ISCA \& IEEE Workshop on Spontaneous Speech Processing and Recognition, pp.75--78, 2003.

  • I.R.Lane, T.Kawahara, and T.Matsui.
    Language model switching based on topic detection for dialog speech recognition.
    In Proc. IEEE-ICASSP, Vol.1, pp.616--619, 2003.

  • M.Nishida and T.Kawahara.
    Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion.
    In Proc. IEEE-ICASSP, Vol.1, pp.172--175, 2003.

  • K.Okuda, T.Kawahara, and S.Nakamura.
    Speaking rate compensation based on likelihood criterion in acoustic model training and decoding.
    In Proc. ICSLP, pp.2589--2592, 2002.

  • Y.Tsubota, T.Kawahara, and M.Dantsuji.
    Recognition and verification of English by Japanese students for computer-assisted language learning system.
    In Proc. ICSLP, pp.1205--1208, 2002.

  • K.Imoto, Y.Tsubota, A.Raux, T.Kawahara, and M.Dantsuji.
    Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system.
    In Proc. ICSLP, pp.749--752, 2002.

  • A.Raux and T.Kawahara.
    Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learning.
    In Proc. ICSLP, pp.737--740, 2002.

  • Y.Yamakata, T.Kawahara, and H.G.Okuno.
    Belief network based disambiguation of object reference in spoken dialogue system for robot.
    In Proc. ICSLP, pp.177--180, 2002.

  • K.Komatani, T.Kawahara, R.Ito, and H.G.Okuno.
    Efficient dialogue strategy to find users' intended items from information query results.
    In Proc. COLING, pp.481--487, 2002.

  • Y.Yamakata, T.Kawahara, and H.G.Okuno.
    Belief network based disambiguation of object reference in spoken dialogue system for robot.
    In Proc. ISCA Workshop on Multi-Modal Dialogue in Mobile Environments, 2002.

  • A.Lee, T.Kawahara, K.Takeda, M.Mimura, A.Yamada, A.Ito, K.Itou, and K.Shikano.
    Continuous speech recognition consortium -- an open repository for CSR tools and models --.
    In Proc. Int'l Conf. Language Resources \& Evaluation (LREC), pp.1438--1441, 2002.

  • T.Kawahara and M.Hasegawa.
    Automatic indexing of lecture speech by extracting topic-independent discourse markers.
    In Proc. IEEE-ICASSP, pp.1--4, 2002.

  • H.Nanjo and T.Kawahara.
    Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition.
    In Proc. IEEE-ICASSP, pp.725--728, 2002.

  • A.Raux and T.Kawahara.
    Optimizing computer-assisted pronunciation instruction by selecting relevant training topics.
    In InSTIL Advanced Workshop, 2002.

  • Y.Tsubota, T.Kawahara, and M.Dantsuji.
    CALL system for Japanese students of English using pronunciation error prediction and formant structure estimation.
    In InSTIL Advanced Workshop, 2002.

  • T.Kawahara, H.Nanjo, and S.Furui.
    Automatic transcription of spontaneous lecture speech.
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), 2001.

  • H.Nanjo, K.Kato, and T.Kawahara.
    Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition.
    In Proc. EUROSPEECH, pp.2531--2534, 2001.

  • A.Lee, T.Kawahara, and K.Shikano.
    Julius -- an open source real-time large vocabulary recognition engine.
    In Proc. EUROSPEECH, pp.1691--1694, 2001.

  • K.Komatani, K.Tanaka, H.Kashima, and T.Kawahara.
    Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model.
    In Proc. EUROSPEECH, pp.1319--1322, 2001.

  • A.Lee, T.Kawahara, and K.Shikano.
    Gaussian mixture selection using context-independent HMM.
    In Proc. IEEE-ICASSP, pp.69--72, 2001.

  • T.Kawahara, A.Lee, T.Kobayashi, K.Takeda, N.Minematsu, S.Sagayama, K.Itou, A.Ito, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
    Free software toolkit for Japanese large vocabulary continuous speech recognition.
    In Proc. ICSLP, Vol.4, pp.476--479, 2000.

  • Y.Tsubota, M.Dantsuji, and T.Kawahara.
    Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures.
    In Proc. ICSLP, Vol.3, pp.566--569, 2000.

  • K.Imoto, M.Dantsuji, and T.Kawahara.
    Modelling of the perception of English sentence stress for computer-assisted language learning.
    In Proc. ICSLP, Vol.3, pp.175--178, 2000.

  • H.Nanjo, A.Lee, and T.Kawahara.
    Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems.
    In Proc. ICSLP, Vol.2, pp.1027--1030, 2000.

  • K.Komatani and T.Kawahara.
    Generating effective confirmation and guidance using two-level confidence measures for dialogue systems.
    In Proc. ICSLP, Vol.2, pp.648--651, 2000.

  • K.Kato, H.Nanjo, and T.Kawahara.
    Automatic transcription of lecture speech using topic-independent language modeling.
    In Proc. ICSLP, Vol.1, pp.162--165, 2000.

  • H.Fujisaki, K.Shirai, S.Doshita, S.Nakagawa, K.Hirose, S.Itahashi, T.Kawahara, S.Ohno, H.Kikuchi, K.Abe, and S.Kiriyama.
    Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language.
    In Proc. ICSLP, Vol.1, pp.70--73, 2000.

  • T.Kawahara, K.Komatani, and S.Doshita.
    Dialogue management using concept-level confidence measures of speech recognition.
    In Proc. Int'l Sympo. on Spoken Dialogue, 2000.

  • K.Komatani and T.Kawahara.
    Flexible mixed-initiative dialogue management using concept-level confidence measures of speech recognizer output.
    In Proc. COLING, pp.467--473, 2000.

  • A.Lee, T.Kawahara, K.Takeda, and K.Shikano.
    A new phonetic tied-mixture model for efficient decoding.
    In Proc. IEEE-ICASSP, pp.1269--1272, 2000.

  • T.Kawahara, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
    Japanese dictation toolkit -- plug-and-play framework for speech recognition R\&D --.
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.393--396, 1999.

  • C.-H.Jo, T.Kawahara, and S.Doshita.
    The use of duration similarity templates in speech rhythm training.
    In Proc. IEEE Region 10 Conference (TENCON), pp.146--149, 1999.

  • C.-H.Jo, T.Kawahara, and S.Doshita.
    Mora-timed speech rhythm training system using rhythm pattern templates.
    In Proc. Int'l Conf. on Speech Processing, pp.129--134, 1999.

  • T.Kawahara, K.Tanaka, and S.Doshita.
    Domain-independent platform of spoken dialogue interfaces for information query.
    In Proc. ESCA Workshop on Interactive Dialogue in Multi-Modal Systems, pp.69--72, 1999.

  • T.Kawahara, K.Tanaka, and S.Doshita.
    Virtual fitting room with spoken dialogue interface.
    In Proc. ESCA Workshop on Interactive Dialogue in Multi-Modal Systems, pp.5--8, 1999.

  • T.Kawahara, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
    Japanese dictation toolkit -- free software repository for speech recognition --.
    In Proc. ESCA Workshop on Interactive Dialogue in Multi-Modal Systems, p. 161, 1999.

  • T.Kawahara and S.Doshita.
    Topic independent language model for key-phrase detection and verification.
    In Proc. IEEE-ICASSP, pp.685--688, 1999.

  • T.Kawahara, K.Ishizuka, S.Doshita, and C.-H.Lee.
    Speaking-style dependent lexicalized filler model for key-phrase detection and verification.
    In Proc. ICSLP, pp.3253--3256, 1998.

  • T.Kawahara, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, M.Yamamoto, T.Utsuro, and K.Shikano.
    Sharable software repository for Japanese large vocabulary continuous speech recognition.
    In Proc. ICSLP, pp.3257--3260, 1998.

  • F.M.Quimbo, T.Kawahara, and S.Doshita.
    Prosodic analysis of fillers and self-repair in Japanese speech.
    In Proc. ICSLP, pp.3313--3316, 1998.

  • A.Lee, T.Kawahara, and S.Doshita.
    An efficient two-pass search algorithm using word trellis index.
    In Proc. ICSLP, pp.1831--1834, 1998.

  • C.-H.Jo, T.Kawahara, S.Doshita, and M.Dantsuji.
    Automatic pronunciation error detection and guidance for foreign language learning.
    In Proc. ICSLP, pp.2639--2642, 1998.

  • S.Itahashi, M.Yamamoto, and T.Kawahara.
    Speech corpus by ``spoken dialogue'' project.
    In Proc. Oriental-COCOSDA Workshop, pp.156--161, 1998.

  • T.Kawahara, A.Lee, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, A.Ito, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
    Common platform of Japanese large vocabulary continuous speech recognizer assessment -- proposal and initial results --.
    In Proc. Oriental-COCOSDA Workshop, pp.117--122, 1998.

  • T.Kawahara, S.Doshita, and C.-H.Lee.
    Phrase language models for detection and verification-based speech understanding.
    In Proc. IEEE Workshop Automatic Speech Recognition \& Understanding (ASRU), pp.49--56, 1997.

  • C.-H.Jo, T.Kawahara, S.Doshita, and M.Dantsuji.
    Japanese pronunciation training system with HMM segmentation and distinctive feature classification.
    In Proc. Int'l Conf. on Speech Processing, pp.341--346, 1997.

  • C.-H.Jo, T.Kawahara, and S.Doshita.
    Computer-aided foreign language pronunciation learning system under virtual environment.
    In Proc. World Conf. on Artificial Intelligence in Education (AI-ED), pp.601--603, 1997.

  • T.Kawahara, C.-H.Lee, and B.-H.Juang.
    Combining key-phrase detection and subword-based verification for flexible speech understanding.
    In Proc. IEEE-ICASSP, pp.1159--1162, 1997.

  • H.Masataki, Y.Sagisaka, K.Hisaki, and T.Kawahara.
    Task adaptation using MAP estimation in n-gram language modeling.
    In Proc. IEEE-ICASSP, pp.783--786, 1997.

  • T.Kawahara, C.-H.Lee, and B.-H.Juang.
    Key-phrase detection and verification for flexible speech understanding.
    In Proc. ICSLP, pp.861--864, 1996.

  • T.Kawahara, N.Kitaoka, and S.Doshita.
    Concept-based phrase spotting approach for spontaneous speech understanding.
    In Proc. IEEE-ICASSP, pp.291--294, 1996.

  • T.Kawahara, M.Araki, and S.Doshita.
    Comparison of parsing and spotting approaches for spoken dialogue understanding.
    In Proc. ESCA Workshop on Spoken Dialogue Systems, pp.21--24, 1995.

  • T.Kawahara, T.Munetsugu, N.Kitaoka, and S.Doshita.
    Keyword and phrase spotting with heuristic language model.
    In Proc. ICSLP, Vol.2, pp.815--818, 1994.

  • T.Kawahara, M.Araki, and S.Doshita.
    Heuristic search integrating syntactic, semantic and dialog-level constraints.
    In Proc. IEEE-ICASSP, Vol.2, pp.25--28, 1994.

  • T.Kawahara, M.Araki, and S.Doshita.
    Reducing syntactic perplexity of user utterances with automaton dialogue model.
    In Proc. Int'l Sympo. on Spoken Dialogue, pp.65--68, 1993.

  • T.Kawahara, S.Matsumoto, and S.Doshita.
    A*-admissible context-free parsing on HMM trellis for speech understanding.
    In Proc. Pacific Rim Int'l Conf. on Artificial Intelligence, Vol.2, pp.1203--1208, 1992.

  • T.Kawahara and S.Doshita.
    HMM based on pair-wise Bayes classifiers.
    In Proc. IEEE-ICASSP, Vol.1, pp.365--368, 1992.

  • P.Fung, T.Kawahara, and S.Doshita.
    Unsupervised speaker normalization by speaker Markov model converter for speaker-independent speech recognition.
    In Proc. EUROSPEECH, pp.1111--1114, 1991.

  • T.Kawahara and S.Doshita.
    Phoneme recognition by combining discriminant analysis and HMM.
    In Proc. IEEE-ICASSP, pp.557--560, 1991.

  • T.Kawahara, T.Ogawa, S.Kitazawa, and S.Doshita.
    Phoneme recognition by combining Bayesian linear discriminations of selected pairs of classes.
    In Proc. ICSLP, pp.229--232, 1990.