Speech Recognition of Lectures and Meetings

  • T.Kawahara.
    Automatic meeting transcription system for the Japanese Parliament (Diet).
    In Proc. APSIPA ASC, (overview talk), 2017. (PDF file)
  • S.Li, Y.Akita, and T.Kawahara.
    Semi-supervised acoustic model training by discriminative data selection from multiple ASR systems' hypotheses.
    IEEE/ACM Trans. Audio, Speech & Language Process., Vol.24, No.9, pp.1524--1534, 2016. (text) (PDF file) (KURENAI)

Robust Speech Recognition

  • M.Mimura, S.Sakai, and T.Kawahara.
    Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with phone-class feature.
    EURASIP J. Advances in Signal Processing, Vol.2015, No.62, pp.1--13, 2015. (text) (PDF file) (KURENAI)

Source Separation and Speech Enhancement

  • K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
    Bayesian multichannel audio source separation based on integrated source and spatial models.
    IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26, No.4, pp.831--846, 2018. (text) (PDF file)
  • Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, T.Kawahara, and H.G.Okuno.
    Speech enhancement based on Bayesian low-rank and sparse decomposition of multichannel magnitude spectrograms.
    IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26, No.2, pp.215--230, 2018. (text) (PDF file)

Dialogue Systems

  • D.Lala, P.Milhorat, K.Inoue, M.Ishida, K.Takanashi, and T.Kawahara.
    Attentive listening system with backchanneling, response generation and flexible turn-taking.
    In Proc. SIGdial Meeting Discourse & Dialogue, pp.127--136, 2017. (PDF file)
  • K.Yoshino and T.Kawahara.
    Conversational system for information navigation based on POMDP with user focus tracking.
    Computer Speech and Language, Vol.34, No.1, pp.275--291, 2015. (text) (PDF file)

Interaction Analysis and Model

  • T.Kawahara, T.Yamaguchi, K.Inoue, K.Takanashi, and N.Ward.
    Prediction and generation of backchannel form for attentive listening systems.
    In Proc. INTERSPEECH, pp.2890--2894, 2016. (PDF file)

Multi-modal Conversation Analysis

  • T.Kawahara, T.Iwatate, K.Inoue, S.Hayashi, H.Yoshimoto, and K.Takanashi.
    Multi-modal sensing and analysis of poster conversations with smart posterboard.
    APSIPA Trans. Signal & Information Process., Vol.5, No.e2, pp.1--12, 2016. (text)

CALL (Computer Assisted Language Learning)

  • M.Mirzaei, K.Meshgi, and T.Kawahara.
    Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening.
    Computer Speech and Language, Vol.49, pp.17--36, 2018. (text)
  • H.Wang, C.J.Waple, and T.Kawahara.
    Computer assisted language learning system based on dynamic question generation and error prediction for automatic speech recognition.
    Speech Communication, Vol.51, No.10, pp.995--1005, 2009. (text) (PDF file)
  • Y.Tsubota, T.Kawahara, and M.Dantsuji.
    An English pronunciation learning system for Japanese students based on diagnosis of critical pronunciation errors.
    ReCALL Journal, Vol.16, No.1, pp.173--188, 2004. (text) (PDF file)

Speech Understanding

  • T.Zhao and T.Kawahara.
    Joint learning of dialog act segmentation and recognition in spoken dialog using neural networks.
    In Proc. IJCNLP, pp.704--712, 2017. (PDF file)
  • T.Kawahara, C.-H.Lee, and B.-H.Juang.
    Flexible speech understanding based on combined key-phrase detection and verification.
    IEEE Trans. Speech & Audio Process., Vol.6, No.6, pp.558--568, 1998. (text) (PDF file)

Natural Language Processing for Rich Transcription

  • G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
    A monotonic statistical machine translation approach to speaking style transformation.
    Computer Speech and Language, Vol.26, No.5, pp.349--370, 2012. (text) (PDF file)
  • T.Kawahara, M.Hasegawa, K.Shitaoka, T.Kitade, and H.Nanjo.
    Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers.
    IEEE Trans. Speech & Audio Process., Vol.12, No.4, pp.409--419, 2004. (text) (PDF file) (KURENAI)

Large Vocabulary Continuous Speech Recognition Platform

  • A.Lee and T.Kawahara.
    Recent development of open-source speech recognition engine Julius.
    In Proc. APSIPA ASC, pp.131--137, 2009. (PDF file)
  • T.Kawahara, A.Lee, K.Takeda, K.Itou, and K.Shikano.
    Recent progress of open-source LVCSR engine Julius and Japanese model repository.
    In Proc. ICSLP, pp.3069--3072, 2004. (PDF file)
  • A.Lee, T.Kawahara, and K.Shikano.
    Julius -- an open source real-time large vocabulary recognition engine.
    In Proc. EUROSPEECH, pp.1691--1694, 2001. (PDF file)