
FY 2025

Y.Gao, H.Shi, Y.Fu, C.Chu, and T.Kawahara.
Bridging speech emotion recognition and personality: Dataset and temporal interaction condition network.
IEEE Transactions on Affective Computing, (accepted for publication), 2025. (text)
H.Shi, X.Lu, K.Shimada, and T.Kawahara.
Combining deterministic enhanced conditions with dual-streaming encoding for diffusion-based speech enhancement.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.33, pp.4253--4266, 2025. (text)
Y.Gao, H.Shi, C.Chu, and T.Kawahara.
Multi-attribute learning for multi-level emotion recognition from speech.
APSIPA Trans. Signal & Information Process., Vol.14, No.e20, pp.1--29, 2025. (text)
K.Shimada, K.Uchida, Y.Koyama, T.Shibuya, S.Takahashi, Y.Mitsufuji, and T.Kawahara.
Open-vocabulary sound event localization and detection with joint learning of CLAP embedding and activity-coupled cartesian DOA vector.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.33, pp.2946--2960, 2025. (text)
K.Ochi, D.Lala, K.Inoue, T.Kawahara, and H.Kumazaki.
Robot-mediated multi-party conversation aimed at affect improvement for psychiatric patients.
IEEE Transactions on Affective Computing, (accepted for publication), 2025. (text)
Tatsuya Kawahara, Yuya Akita, and Mikitaka Masuyama.
Captioning parliamentary meeting videos using official meeting transcripts.
The Journal of Professional Reporting and Transcription (Tiro), No.1, 2025. (text)

FY 2024

S.Ueno, A.Lee, and T.Kawahara.
Refining synthesized speech using speaker information and phone masking for data augmentation of speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.32, pp.3924--3933, 2024. (text) (KURENAI)
H.Shi, M.Mimura, and T.Kawahara.
Waveform-domain speech enhancement using spectrogram encoding for robust speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.32, pp.3049--3060, 2024. (text) (KURENAI)
K.Soky, S.Li, C.Chu, and T.Kawahara.
Finetuning pretrained model with embedding of domain and language information for ASR of very low-resource settings.
International Journal of Asian Language Processing, Vol.33, No.4, pp.2350024:1--16, 2024. (text) (KURENAI)

FY 2023

K.Yamamoto, K.Inoue, and T.Kawahara.
Character expression of a conversational robot for adapting to user personality.
Advanced Robotics, Vol.38, No.4, pp.256--266, 2024. (text)
Y.Fu, K.Inoue, D.Lala, K.Yamamoto, C.Chu, and T.Kawahara.
Dual variational generative model and auxiliary retrieval for empathetic response generation by conversational robot.
Advanced Robotics, Vol.37, No.21, pp.1406--1418, 2023. (text) (KURENAI preprint)
K.Ochi, K.Inoue, D.Lala, T.Kawahara, and H.Kumazaki.
Effect of attentive listening robot on pleasure and arousal change in psychiatric daycare.
Advanced Robotics, Vol.37, No.21, pp.1382--1391, 2023. (text) (KURENAI) (KURENAI preprint)
K.Yamamoto, K.Inoue, and T.Kawahara.
Character expression for spoken dialogue systems with semi-supervised learning using variational auto-encoder.
Computer Speech and Language, Vol.79, No.101469, pp.1--14, 2023. (text)

FY 2022

三村正人, 河原達也.
End-to-end conversion from speech to written text for parliamentary meeting records (in Japanese).
自然言語処理, Vol.30, No.1, pp.88--124, 2023. (text)
K.Inoue, D.Lala, and T.Kawahara.
Can a robot laugh with you?: Shared laughter generation for empathetic spoken dialogue.
Frontiers in Robotics and AI (Computational Intelligence in Robotics), Vol.9, No.933261, pp.1--11, 2022. (text) (KURENAI)
K.Soky, M.Mimura, T.Kawahara, C.Chu, S.Li, C.Ding, and S.Sam.
TriECCC: Trilingual corpus of the Extraordinary Chambers in the Courts of Cambodia for speech recognition and translation studies.
International Journal of Asian Language Processing, Vol.31, No.3&4, pp.225007:1--21, 2022. (text) (KURENAI)
K.Sekiguchi, Y.Bando, A.A.Nugraha, M.Fontaine, K.Yoshii, and T.Kawahara.
Autoregressive moving average jointly-diagonalizable spatial covariance analysis for joint source separation and dereverberation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.30, pp.2368--2382, 2022. (text)

FY 2021

Y.Du, R.Scheibler, M.Togami, K.Yoshii, and T.Kawahara.
Computationally-efficient overdetermined blind source separation based on iterative source steering.
IEEE Signal Processing Letters, Vol.29, pp.927--931, 2021. (text)
H.Inaguma and T.Kawahara.
Alignment knowledge distillation for online streaming attention-based speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.31, pp.1371--1385, 2021. (text)
S.Ueno, M.Mimura, S.Sakai, and T.Kawahara.
Synthesizing waveform sequence-to-sequence to augment training data for sequence-to-sequence speech recognition.
Acoustical Science & Technology, Vol.42, No.6, pp.333--343, 2021. (text) (PDF file)
松浦孝平, 三村正人, 河原達也.
Speech recognition for Ainu folklore archives (in Japanese).
自然言語処理, Vol.28, No.3, pp.824--846, 2021. (text) (PDF file)
井上昂治, ララディベッシュ, 山本賢太, 中村静, 高梨克也, 河原達也.
Attentive listening dialogue system of the android ERICA --comparative evaluation with human attentive listening-- (in Japanese).
人工知能学会論文誌, Vol.36, No.5, pp.H--L51_1--12, 2021. (text) (PDF file)
E.Nakamura and K.Yoshii.
Musical Rhythm Transcription Based on Bayesian Piece-Specific Score Models Capturing Repetitions.
Information Sciences, Vol.572, pp.482--500, 2021. (text)
K.Shibata, E.Nakamura, and K.Yoshii.
Non-Local Musical Statistics as Guides for Audio-to-Score Piano Transcription.
Information Sciences, Vol.566, pp.262--280, 2021. (text)
Tatsuya Kawahara.
Captioning software using automatic speech recognition for online lectures.
The Journal of Professional Reporting and Transcription (Tiro), No.1, 2021. (text)
T.Kawahara, N.Muramatsu, K.Yamamoto, D.Lala, and K.Inoue.
Semi-autonomous avatar enabling unconstrained parallel conversations --seamless hybrid of WOZ and autonomous dialogue systems--.
Advanced Robotics, Vol.35, No.11, pp.657--663, 2021. (text)
R.Nishikimi, E.Nakamura, M.Goto, and K.Yoshii.
Audio-to-Score Singing Transcription Based on a CRNN-HSMM Hybrid Model.
APSIPA Trans. Signal & Information Process., Vol.10, No.e7, pp.1--13, 2021. (text)

FY 2020

A.A.Nugraha, K.Sekiguchi, M.Fontaine, Y.Bando, and K.Yoshii.
Flow-Based Independent Vector Analysis for Blind Source Separation.
IEEE Signal Processing Letters, Vol.28, pp.2173--2177, 2020. (text)
Y.Wu, T.Carsault, E.Nakamura, and K.Yoshii.
Semi-Supervised Neural Chord Estimation Based on a Variational Autoencoder With Latent Chord Labels and Features.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.2956--2966, 2020. (text)
井上昂治, 原康平, ララディベッシュ, 山本賢太, 中村静, 高梨克也, 河原達也.
Implementation and evaluation of a job interview dialogue system that asks follow-up questions on an autonomous android (in Japanese).
人工知能学会論文誌, Vol.35, No.5, pp.D--K43_1--10, 2020. (text) (PDF file)
K.Sekiguchi, Y.Bando, A.A.Nugraha, K.Yoshii, and T.Kawahara.
Fast multichannel nonnegative matrix factorization with directivity-aware jointly-diagonalizable spatial covariance matrices for blind source separation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.2610--2625, 2020. (text)
R.Nishikimi, E.Nakamura, M.Goto, K.Itoyama, and K.Yoshii.
Bayesian Singing Transcription Based on a Hierarchical Generative Model of Keys, Musical Notes, and F0 Trajectories.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.1678--1691, 2020. (text)
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Bayesian Melody Harmonization Based on a Tree-Structured Generative Model of Chord Sequences and Melodies.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.1644--1655, 2020. (text)
A.A.Nugraha, K.Sekiguchi, and K.Yoshii.
A Flow-Based Deep Latent Variable Model for Speech Spectrogram Modeling and Enhancement.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.1104--1117, 2020. (text)
Tatsuya Kawahara, Shoko Ueno, and Masaya Morikawa.
Transcription system using automatic speech recognition in the Japanese Parliament.
The Journal of Professional Reporting and Transcription (Tiro), No.1, 2020. (text)
柴田剛, 錦見亮, 中村栄太, 吉井和佳.
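Statistical music structure analysis based on a hierarchical hidden semi-Markov model considering homogeneity, repetitiveness, and regularity (in Japanese).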
情報処理学会論文誌, Vol.61, No.4, pp.757--767, 2020. (text)

FY 2019

R.Duan, T.Kawahara, M.Dantsuji, and H.Nanjo.
Cross-lingual transfer learning of non-native acoustic modeling for pronunciation error detection and diagnosis.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.28, pp.391--401, 2020. (text) (KURENAI)
K.Sekiguchi, Y.Bando, A.A.Nugraha, K.Yoshii, and T.Kawahara.
Semi-supervised multichannel speech enhancement with a deep speech prior.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.27, No.12, pp.2197--2212, 2019. (text)
Y.Li, C.T.Ishi, K.Inoue, S.Nakamura, K.Takanashi, and T.Kawahara.
Expressing reactive emotion based on multimodal emotion recognition for natural conversation in human-robot interaction.
Advanced Robotics, Vol.33, No.20, pp.1030--1041, 2019. (text)
T.Zhao and T.Kawahara.
Joint dialog act segmentation and recognition in human conversations using attention to dialog context.
Computer Speech and Language, Vol.57, pp.108--127, 2019. (text) (KURENAI)
K.Shimada, Y.Bando, M.Mimura, K.Itoyama, K.Yoshii, and T.Kawahara.
Unsupervised speech enhancement based on multichannel NMF-informed beamforming for noise-robust automatic speech recognition.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.27, No.5, pp.960--971, 2019. (text) (KURENAI)

FY 2018

Y.Ojima, E.Nakamura, K.Itoyama, and K.Yoshii.
Chord-aware automatic music transcription based on hierarchical Bayesian integration of acoustic and language models.
APSIPA Trans. Signal & Information Process., Vol.7, No.e14, pp.1--14, 2018. (text)
E.Nakamura and K.Yoshii.
Statistical piano reduction controlling performance difficulty.
APSIPA Trans. Signal & Information Process., Vol.7, No.e13, pp.1--12, 2018. (text)
K.Inoue, D.Lala, K.Takanashi, and T.Kawahara.
Engagement recognition by a latent character model based on multimodal listener behaviors in spoken dialogue.
APSIPA Trans. Signal & Information Process., Vol.7, No.e9, pp.1--16, 2018. (text)
山本賢太, 井上昂治, 中村静, 高梨克也, 河原達也.
Dialogue behavior control model for character expression of a humanoid robot (in Japanese).
人工知能学会論文誌, Vol.33, No.5, pp.C--I37_1--9, 2018. (text) (PDF file)
H.Tsushima, E.Nakamura, K.Itoyama, and K.Yoshii.
Generative Statistical Models with Self-Emergent Grammar of Chord Sequences.
Journal of New Music Research, 2018. (text)
M.Mirzaei, K.Meshgi, and T.Kawahara.
Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening.
Computer Speech and Language, Vol.49, pp.17--36, 2018. (text) (KURENAI)
T.Hagiya, T.Horiuchi, T.Yazaki, and T.Kawahara.
Typing Tutor: Individualized tutoring in text entry for older adults based on statistical input stumble detection.
J. Information Processing, Vol.26, No.4, 2018. (text)
K.Itakura, Y.Bando, E.Nakamura, K.Itoyama, K.Yoshii, and T.Kawahara.
Bayesian multichannel audio source separation based on integrated source and spatial models.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26, No.4, pp.831--846, 2018. (text) (PDF file)

FY 2017

Y.Bando, K.Itoyama, M.Konyo, S.Tadokoro, K.Nakadai, K.Yoshii, T.Kawahara, and H.G.Okuno.
Speech enhancement based on Bayesian low-rank and sparse decomposition of multichannel magnitude spectrograms.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.26, No.2, pp.215--230, 2018. (text) (PDF file) (Errata)
井上昂治, Divesh Lala, 吉井和佳, 高梨克也, 河原達也.
Estimation of dialogue engagement based on listener behaviors using a latent character model (in Japanese).
人工知能学会論文誌, Vol.33, No.1, pp.DSH--F_1--12, 2018. (text) (PDF file)
R.Duan, T.Kawahara, M.Dantsuji, and J.Zhang.
Articulatory modeling for pronunciation error detection without non-native training data based on DNN transfer learning.
IEICE Trans., Vol.E100-D, No.9, pp.2174--2182, 2017. (text)
T.Hagiya, T.Horiuchi, T.Yazaki, T.Kato, and T.Kawahara.
Assistive typing application for older adults based on input stumble detection.
J. Information Processing, Vol.25, No.6, 2017. (text)

FY 2016

M.Mirzaei, K.Meshgi, Y.Akita, and T.Kawahara.
Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill.
ReCALL Journal, Vol.29, No.2, pp.178--199, 2017. (text) (PDF file)
M.Ohkita, Y.Bando, Y.Ikemiya, E.Nakamura, K.Itoyama, and K.Yoshii.
Audio-visual beat tracking based on a state-space model for a robot dancer performing with a human dancer.
Journal of Robotics & Mechatronics, Vol.29, No.1, pp.125--136, 2017. (text)
K.Sekiguchi, Y.Bando, K.Itoyama, and K.Yoshii.
Layout optimization of cooperative distributed microphone arrays based on estimation of source separation performance.
Journal of Robotics & Mechatronics, Vol.29, No.1, pp.83--93, 2017. (text)
K.Youssef, K.Itoyama, and K.Yoshii.
Simultaneous identification and localization of still and mobile speakers based on binaural robot audition.
Journal of Robotics & Mechatronics, Vol.29, No.1, pp.59--71, 2017. (text)
Y.Ikemiya, K.Itoyama, and K.Yoshii.
Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.24, No.11, pp.2084--2095, 2016. (text) (PDF file)
S.Li, Y.Akita, and T.Kawahara.
Semi-supervised acoustic model training by discriminative data selection from multiple ASR systems' hypotheses.
IEEE/ACM Trans. Audio, Speech & Language Process., Vol.24, No.9, pp.1524--1534, 2016. (text) (PDF file) (KURENAI)
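山口貴史, 井上昂治, 吉野幸一郎, 高梨克也, Nigel G. Ward, 河原達也.
Generation of diverse forms of backchannels based on linguistic and prosodic information for an attentive listening dialogue system (in Japanese).
人工知能学会論文誌, Vol.31, No.4, pp.C--G31_1--10, 2016. (text) (PDF file)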

FY 2015

T.Kawahara, T.Iwatate, K.Inoue, S.Hayashi, H.Yoshimoto, and K.Takanashi.
Multi-modal sensing and analysis of poster conversations with smart posterboard.
APSIPA Trans. Signal & Information Process., Vol.5, No.e2, pp.1--12, 2016. (text)
井上昂治, 若林佑幸, 吉本廣雅, 河原達也.
Speaker segment detection integrating acoustic and eye-gaze information in multi-party conversation (in Japanese).
電子情報通信学会論文誌, Vol.J99-D, No.3, pp.348--357, 2016. (text)
若林佑幸, 井上昂治, 中山雅人, 西浦敬信, 山下洋一, 吉本廣雅, 河原達也.
Estimation of the number of sound sources and speaker diarization based on integration of audio-visual information (in Japanese).
電子情報通信学会論文誌, Vol.J99-D, No.3, pp.326--336, 2016. (text)
三村正人, 河原達也.
Unsupervised adaptation of DNN-HMM based on similar speaker selection for lecture speech recognition (in Japanese).
電子情報通信学会論文誌, Vol.J98-D, No.11, pp.1411--1418, 2015. (text)
K.Yoshino and T.Kawahara.
Conversational system for information navigation based on POMDP with user focus tracking.
Computer Speech and Language, Vol.34, No.1, pp.275--291, 2015. (text)
I.Nishimuta, K.Yoshii, K.Itoyama, and H.G.Okuno.
Toward a Quizmaster Robot for Speech-based Multiparty Interaction.
Advanced Robotics, Vol.29, No.18, pp.1205--1219, 2015. (text)
S.Li, Y.Akita, and T.Kawahara.
Automatic lecture transcription based on discriminative data selection for lightly supervised acoustic model training.
IEICE Trans., Vol.E98-D, No.8, pp.1545--1552, 2015. (text)
R.Gomez, T.Kawahara, and K.Nakadai.
Optimized wavelet-domain filtering under noisy and reverberant conditions.
APSIPA Trans. Signal & Information Process., Vol.4, No.e3, pp.1--12, 2015. (text)
M.Mimura, S.Sakai, and T.Kawahara.
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with phone-class feature.
EURASIP J. Advances in Signal Processing, Vol.2015, No.62, pp.1--13, 2015. (text) (PDF file) (KURENAI)
笹田鉄郎, 前田浩邦, 森信介, 河原達也, 山肩洋子.
Definition of recipe terms and construction of an annotated corpus for their automatic recognition (in Japanese).
自然言語処理, Vol.22, No.2, pp.107--131, 2015. (text) (PDF file)

FY 2014

T.Tung, R.Gomez, T.Kawahara, and T.Matsuyama.
Multi-party interaction understanding using smart multimodal digital signage.
IEEE Trans. Human-Machine Systems, Vol.44, No.5, pp.625--637, 2014. (text) (PDF file)
M.Ablimit, T.Kawahara, and A.Hamdulla.
Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language.
Speech Communication, Vol.60, pp.78--87, 2014. (text) (PDF file)

FY 2013

吉野幸一郎, 森信介, 河原達也.
Construction of a language model for spoken dialogue based on sentence selection via predicate-argument structures (in Japanese).
人工知能学会論文誌, Vol.29, No.1, pp.53--59, 2014. (text) (PDF file)
S.Sakai and T.Kawahara.
Admissible stopping in Viterbi beam search for unit selection speech synthesis.
IEICE Trans., Vol.E96-D, No.6, pp.1359--1367, 2013. (text)
G.Neubig, T.Watanabe, S.Mori, and T.Kawahara.
Substring-based machine translation.
Machine Translation, Vol.27, No.2, pp.139--166, 2013. (text) (PDF file)

FY 2012

秋田祐哉, 河原達也.
Automatic insertion of commas in lectures based on multiple annotations (in Japanese).
情報処理学会論文誌, Vol.54, No.2, pp.463--470, 2013. (text)
伊藤慶明, 西崎博光, 中川聖一, 秋葉友良, 河原達也, 胡新輝, 南條浩輝, 松井知子, 山下洋一, 相川清明.
Construction and analysis of a test collection for spoken term detection (in Japanese).
情報処理学会論文誌, Vol.54, No.2, pp.471--483, 2013. (text)
H.Nishizaki, T.Akiba, K.Aikawa, T.Kawahara, and T.Matsui.
Evaluation framework design of spoken term detection study at the NTCIR-9 IR for spoken documents task.
自然言語処理, Vol.19, No.4, pp.329--350, 2012. (text) (PDF file)
G.Neubig, Y.Akita, S.Mori, and T.Kawahara.
A monotonic statistical machine translation approach to speaking style transformation.
Computer Speech and Language, Vol.26, No.5, pp.349--370, 2012. (text) (PDF file)
三村正人, 河原達也.
Fast BIC-based speaker normalization and speaker adaptation for meeting speech recognition (in Japanese).
電子情報通信学会論文誌, Vol.J95-D, No.7, pp.1467--1475, 2012. (text)
G.Neubig, T.Watanabe, E.Sumita, S.Mori, and T.Kawahara.
Joint phrase alignment and extraction for statistical machine translation.
J. Information Processing, Vol.20, No.2, pp.512--523, 2012. (text)

FY 2011

G.Neubig, M.Mimura, S.Mori, and T.Kawahara.
Bayesian learning of a language model from continuous speech.
IEICE Trans., Vol.E95-D, No.2, pp.614--625, 2012. (text)
吉野幸一郎, 森信介, 河原達也.
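Spoken dialogue system that performs information extraction and recommendation based on predicate-argument similarity (in Japanese).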
情報処理学会論文誌, Vol.52, No.12, pp.3386--3397, 2011. (text)
河原達也, 須見康平, 緒方淳, 後藤真孝.
Detection of acoustic events and hot spots based on audience reactions in spoken conversation content (in Japanese).
情報処理学会論文誌, Vol.52, No.12, pp.3363--3373, 2011. (text)
森信介, G.Neubig, 坪井祐太.
Word segmentation by pointwise prediction (in Japanese).
情報処理学会論文誌, Vol.52, No.10, pp.2944--2952, 2011. (text)
S.Sakai, T.Kawahara, and H.Kawai.
Probabilistic concatenation modeling for corpus-based speech synthesis.
IEICE Trans., Vol.E94-D, No.10, pp.2006--2014, 2011. (text)
森信介, 中田陽介, Neubig Graham, 河原達也.
Morphological analysis by pointwise prediction (in Japanese).
自然言語処理, Vol.18, No.4, pp.367--381, 2011. (text) (PDF file)

FY 2010

森信介, 笹田鉄郎, Neubig Graham.
Language model construction from a stochastically tagged corpus (in Japanese).
自然言語処理, Vol.18, No.2, pp.71--87, 2011. (text) (PDF file)
三村正人, 秋田祐哉, 河原達也.
Lightly supervised training of acoustic models using statistical language model transformation (in Japanese).
電子情報通信学会論文誌, Vol.J94-D, No.2, pp.460--468, 2011. (text)
D.Cournapeau, S.Watanabe, A.Nakamura, and T.Kawahara.
Online unsupervised classification with model comparison in the Variational Bayes framework for voice activity detection.
IEEE J. Selected Topics in Signal Processing, Vol.4, No.6, pp.1071--1083, 2010. (text) (PDF file) (KURENAI)
秋田祐哉, 三村正人, 河原達也.
Speech recognition system for parliamentary deliberations to support creation of meeting records (in Japanese).
電子情報通信学会論文誌, Vol.J93-D, No.9, pp.1736--1744, 2010. (text)
R.Gomez and T.Kawahara.
Robust speech recognition based on dereverberation parameter optimization using acoustic model likelihood.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.7, pp.1708--1716, 2010. (text) (PDF file) (KURENAI)
Y.Akita and T.Kawahara.
Statistical transformation of language and pronunciation models for spontaneous speech recognition.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.6, pp.1539--1549, 2010. (text) (PDF file) (KURENAI)
K.Ishizuka, S.Araki, and T.Kawahara.
Speech activity detection for multi-party conversation analyses based on likelihood ratio test on spatial magnitude.
IEEE Trans. Audio, Speech & Language Process., Vol.18, No.6, pp.1354--1365, 2010. (text) (PDF file)
笹田鉄郎, 森信介, 河原達也.
Kana-kanji conversion using automatically acquired pronunciations and contextual information of unknown words (in Japanese).
自然言語処理, Vol.17, No.4, pp.131--153, 2010. (text) (PDF file)
T.Shinozaki, S.Furui, and T.Kawahara.
Gaussian mixture optimization based on efficient cross-validation.
IEEE J. Selected Topics in Signal Processing, Vol.4, No.3, pp.540--547, 2010. (text) (PDF file)

FY 2009

T.Misu and T.Kawahara.
Bayes risk-based dialogue management for document retrieval system with speech interface.
Speech Communication, Vol.52, No.1, pp.61--71, 2010. (text) (PDF file)
H.Wang and T.Kawahara.
Effective prediction of errors by non-native speakers using decision tree for speech recognition-based CALL system.
IEICE Trans., Vol.E92-D, No.12, pp.2462--2468, 2009. (text)
小窪浩明, 畑岡信夫, 李晃伸, 河原達也, 鹿野清宏.
Reducing the computational cost of the continuous speech recognition software Julius for deployment on SuperH microcomputers (in Japanese).
情報処理学会論文誌, Vol.50, No.11, pp.2597--2606, 2009. (text)
H.Wang, C.J.Waple, and T.Kawahara.
Computer assisted language learning system based on dynamic question generation and error prediction for automatic speech recognition.
Speech Communication, Vol.51, No.10, pp.995--1005, 2009. (text) (PDF file)

FY 2008

西光雅弘, 秋田祐哉, 高梨克也, 尾嶋憲治, 河原達也.
Estimation of clause and sentence boundaries in spoken language using local dependency information (in Japanese).
情報処理学会論文誌, Vol.50, No.2, pp.544--552, 2009. (text)
T.Akiba, K.Aikawa, Y.Itoh, T.Kawahara, H.Nanjo, H.Nishizaki, N.Yasuda, Y.Yamashita, and K.Itou.
Construction of a test collection for spoken document retrieval from lecture audio data.
情報処理学会論文誌, Vol.50, No.2, pp.501--513, 2009. (text)
河原達也, 根本雄介, 勝丸徳浩, 秋田祐哉.
Lecture speech recognition by language model adaptation using slide information (in Japanese).
情報処理学会論文誌, Vol.50, No.2, pp.469--476, 2009. (text)
浜辺良二, 内元清貴, 河原達也, 井佐原均.
Automatic identification of quoted and inserted clauses in spoken language and its application to dependency parsing (in Japanese).
自然言語処理, Vol.16, No.1, pp.3--23, 2009. (text) (PDF file)
D.Cournapeau and T.Kawahara.
Voice activity detection based on high order statistics and online EM algorithm.
IEICE Trans., Vol.E91-D, No.12, pp.2854--2861, 2008. (text)
南條浩輝, 河原達也, 七里崇.
Speech recognition based on a minimum Bayes risk framework oriented toward speech understanding (in Japanese).
電子情報通信学会論文誌, Vol.J91-D, No.5, pp.1314--1324, 2008. (text)

FY 2007

翠輝久, 河原達也, 正司哲朗, 美濃導彦.
Speech-based information guidance system with question answering and information recommendation functions (in Japanese).
情報処理学会論文誌, Vol.48, No.12, pp.3602--3611, 2007. (text)
翠輝久, 河原達也.
Construction of a language model for spoken dialogue systems by selecting web texts considering domain and style (in Japanese).
電子情報通信学会論文誌, Vol.J90-D, No.11, pp.3024--3032, 2007. (text)

FY 2006

I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Out-of-domain utterance detection using classification confidences of multiple topics.
IEEE Trans. Audio, Speech & Language Process., Vol.15, No.1, pp.150--161, 2007. (text) (PDF file) (KURENAI)
T.Misu and T.Kawahara.
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
Speech Communication, Vol.48, No.9, pp.1137--1150, 2006. (text) (PDF file)
木田祐介, 河原達也.
Noise-robust voice activity detection by weighted integration of multiple features (in Japanese).
電子情報通信学会論文誌, Vol.J89-DII, No.8, pp.1820--1828, 2006. (text)

FY 2005

C.Troncoso and T.Kawahara.
Trigger-based language model adaptation for automatic transcription of panel discussions.
IEICE Trans., Vol.E89-D, No.3, pp.1024--1031, 2006. (text)
I.R.Lane and T.Kawahara.
Verification of speech recognition results incorporating in-domain confidence and discourse coherence measures.
IEICE Trans., Vol.E89-D, No.3, pp.931--938, 2006. (text)
秋田祐哉, 河原達也.
Generalized statistical pronunciation variation model for spontaneous speech recognition (in Japanese).
電子情報通信学会論文誌, Vol.J88-DII, No.9, pp.1780--1789, 2005. (text)
下岡和也, 内元清貴, 河原達也, 井佐原均.
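Improving accuracy through interaction between dependency parsing and sentence boundary estimation for spoken Japanese (in Japanese).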
自然言語処理, Vol.12, No.3, pp.3--17, 2005. (text) (PDF file)
M.Nishida and T.Kawahara.
Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing.
IEEE Trans. Speech & Audio Process., Vol.13, No.4, pp.583--592, 2005. (text) (PDF file) (KURENAI)

FY 2004

K.Komatani, S.Ueno, T.Kawahara, and H.G.Okuno.
User modeling in spoken dialogue systems to generate flexible guidance.
User Modeling and User-Adapted Interaction, Vol.15, No.1, pp.169--183, 2005. (text) (PDF file)
翠輝久, 駒谷和範, 清田陽司, 河原達也.
Efficient confirmation strategy for software support tasks via spoken dialogue (in Japanese).
電子情報通信学会論文誌, Vol.J88-DII, No.3, pp.499--508, 2005. (text)
I.R.Lane, T.Kawahara, T.Matsui, and S.Nakamura.
Dialogue speech recognition by combining hierarchical topic classification and language model switching.
IEICE Trans., Vol.E88-D, No.3, pp.446--454, 2005. (text)
Y.Akita and T.Kawahara.
Language model adaptation based on PLSA of topics and speakers for automatic transcription of panel discussions.
IEICE Trans., Vol.E88-D, No.3, pp.439--445, 2005. (text)
駒谷和範, 上野晋一, 河原達也, 奥乃博.
User model for adaptive response generation in spoken dialogue systems (in Japanese).
電子情報通信学会論文誌, Vol.J87-DII, No.10, pp.1921--1928, 2004. (text)
T.Kawahara, M.Hasegawa, K.Shitaoka, T.Kitade, and H.Nanjo.
Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers.
IEEE Trans. Speech & Audio Process., Vol.12, No.4, pp.409--419, 2004. (text) (PDF file) (KURENAI)
H.Nanjo and T.Kawahara.
Language model and speaking rate adaptation for spontaneous presentation speech recognition.
IEEE Trans. Speech & Audio Process., Vol.12, No.4, pp.391--400, 2004. (text) (PDF file) (KURENAI)
Y.Tsubota, T.Kawahara, and M.Dantsuji.
An English pronunciation learning system for Japanese students based on diagnosis of critical pronunciation errors.
ReCALL Journal, Vol.16, No.1, pp.173--188, 2004. (text) (PDF file)
下岡和也, 南條浩輝, 河原達也.
Style formatting of lecture transcripts using statistical methods (in Japanese).
自然言語処理, Vol.11, No.2, pp.67--83, 2004. (text) (PDF file)

FY 2003

西田昌史, 河原達也.
Unsupervised speaker indexing by statistical speaker model selection based on BIC (in Japanese).
電子情報通信学会論文誌, Vol.J87-DII, No.2, pp.504--512, 2004. (text)
秋田祐哉, 河原達也.
Unsupervised speaker indexing of discussion speech using a multi-speaker model (in Japanese).
電子情報通信学会論文誌, Vol.J87-DII, No.2, pp.495--503, 2004. (text)
山肩洋子, 河原達也, 奥乃博, 美濃導彦.
Disambiguation using belief networks for object reference in spoken dialogue systems (in Japanese).
人工知能学会論文誌, Vol.19, No.1, pp.47--56, 2004. (text) (PDF file)
駒谷和範, 鹿島博晶, 田中克明, 河原達也.
General-purpose spoken dialogue platform for database retrieval using key-phrase detection based on combined linguistic constraints (in Japanese).
情報処理学会論文誌, Vol.44, No.5, pp.1333--1342, 2003. (text)
井本和範, 坪田康, 河原達也, 壇辻正剛.
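Modeling and automatic detection of English sentence stress for an English prosody pronunciation learning support system (in Japanese).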
日本音響学会誌, Vol.59, No.4, pp.183--191, 2003. (PDF file)
南條浩輝, 加藤一臣, 李晃伸, 河原達也.
Lecture speech recognition using a large-scale database of spontaneous Japanese (in Japanese).
電子情報通信学会論文誌, Vol.J86-DII, No.4, pp.450--459, 2003. (text)

FY 2002

Y.Tsubota, T.Kawahara, and M.Dantsuji.
Formant structure estimation using vocal tract length normalization for CALL system.
Acoustical Science & Technology, Vol.24, No.2, pp.93--96, 2003. (text) (PDF file)
奥田浩三, 河原達也, 中村哲.
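Speaking rate compensation and acoustic model construction using likelihood-based automatic selection of frame period and window length (in Japanese).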
電子情報通信学会論文誌, Vol.J86-DII, No.2, pp.204--211, 2003. (text)
駒谷和範, 河原達也.
Dialogue management for efficient confirmation and guidance using confidence measures of speech recognition results (in Japanese).
情報処理学会論文誌, Vol.43, No.10, pp.3078--3086, 2002. (text)
長谷川将宏, 秋田祐哉, 河原達也.
Automatic indexing of lecture speech based on extraction of discourse markers (in Japanese).
情報処理学会論文誌, Vol.43, No.7, pp.2222--2229, 2002. (text)
李晃伸, 河原達也, 鹿野清宏.
Reducing acoustic likelihood computation by Gaussian mixture selection using context-independent phone HMMs (in Japanese).
情報処理学会論文誌, Vol.43, No.7, pp.2214--2221, 2002. (text)
伊藤亮介, 駒谷和範, 河原達也.
Spoken dialogue help system using the knowledge and structure of device operation manuals (in Japanese).
情報処理学会論文誌, Vol.43, No.7, pp.2147--2154, 2002. (text)

FY 2001

M.Mimura and T.Kawahara.
Difference of acoustic modeling for read speech and dialogue speech.
Acoustical Science & Technology, Vol.22, No.5, pp.373--374, 2001. (text) (PDF file)

FY 2000

河原達也, 李晃伸, 小林哲則, 武田一哉, 峯松信明, 嵯峨山茂樹, 伊藤克亘, 伊藤彰則, 山本幹雄, 山田篤, 宇津呂武仁, 鹿野清宏.
Japanese dictation toolkit (FY1999 version) (in Japanese).
日本音響学会誌, Vol.57, No.3, pp.210--214, 2001.
李晃伸, 河原達也, 武田一哉, 鹿野清宏.
Phonetic Tied-Mixtureモデルを用いた大語彙連続音声認識.
電子情報通信学会論文誌, Vol.J83-DII, No.12, pp.2517--2525, 2000. (text)
C.-H.Jo, T.Kawahara, S.Doshita, and M.Dantsuji.
Japanese pronunciation instruction system using speech recognition methods.
IEICE Trans., Vol.E83-D, No.11, pp.1960--1968, 2000. (text)
河原達也, 李晃伸, 小林哲則, 武田一哉, 峯松信明, 伊藤克亘, 山本幹雄, 山田篤, 宇津呂武仁, 鹿野清宏.
Japanese dictation toolkit (FY1998 version) (in Japanese).
日本音響学会誌, Vol.56, No.4, pp.255--259, 2000.

FY 1999

T.Kawahara, A.Lee, T.Kobayashi, K.Takeda, N.Minematsu, K.Itou, A.Ito, M.Yamamoto, A.Yamada, T.Utsuro, and K.Shikano.
Japanese dictation toolkit -- 1997 version --.
J. Acoust. Soc. Japan (E), Vol.20, No.3, pp.233--239, 1999. (text) (PDF file)