Kazuyoshi Yoshii

International Journals

Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii, Tatsuya Kawahara. Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 2368–2382, 2022. [DOI]
Yiming Wu, Kazuyoshi Yoshii. Joint Chord and Key Estimation Based on a Hierarchical Variational Autoencoder with Multi-Task Learning. APSIPA Transactions on Signal and Information Processing, Vol. 11, No. 1, pp. 1–27, 2022. [DOI]
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii. Generalized Fast Multichannel Nonnegative Matrix Factorization Based on Gaussian Scale Mixtures for Blind Source Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 30, pp. 1734–1748, 2022. [DOI] [arXiv]
Yicheng Du, Robin Scheibler, Masahito Togami, Kazuyoshi Yoshii, Tatsuya Kawahara. Computationally-Efficient Overdetermined Blind Source Separation Based on Iterative Source Steering. IEEE Signal Processing Letters, Vol. 29, pp. 927–931, 2022. [DOI]
Yoshiaki Bando, Kouhei Sekiguchi, Yoshiki Masuyama, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii. Neural Full-Rank Spatial Covariance Analysis for Blind Source Separation. IEEE Signal Processing Letters, Vol. 28, pp. 1670–1674, 2021. [DOI]
Ryoto Ishizuka, Ryo Nishikimi, Kazuyoshi Yoshii. Global Structure-Aware Drum Transcription Based on Self-Attention Mechanisms. Signals, Vol. 2, No. 3, pp. 508–526, 2021. [DOI]
Pasrawin Taechawattananant, Kazuyoshi Yoshii, Yasushi Ishihama. Peak Identification and Quantification by Proteomic Mass Spectrogram Decomposition. Journal of Proteome Research, Vol. 20, No. 5, pp. 2291–2298, 2021. [DOI] [bioRxiv]
Kentaro Shibata, Eita Nakamura, Kazuyoshi Yoshii. Non-Local Musical Statistics as Guides for Audio-to-Score Piano Transcription. Information Sciences, Vol. 566, pp. 262–280, 2021. [DOI] [arXiv]
Takayuki Nakatsuka, Kazuyoshi Yoshii, Yuki Koyama, Satoru Fukayama, Masataka Goto, Shigeo Morishima. MirrorNet: A Deep Reflective Approach to 2D Pose Estimation for Single-Person Images. Journal of Information Processing, Vol. 29, pp. 406–423, 2021. [DOI]
Eita Nakamura, Kazuyoshi Yoshii. Musical Rhythm Transcription Based on Bayesian Piece-Specific Score Models Capturing Repetitions. Information Sciences, Vol. 572, pp. 482–500, 2021. [DOI] [arXiv]
Ryo Nishikimi, Eita Nakamura, Masataka Goto, Kazuyoshi Yoshii. Audio-to-Score Singing Transcription Based on a CRNN-HSMM Hybrid Model. APSIPA Transactions on Signal and Information Processing, Vol. 10, No. e7, pp. 1–13, 2021. [DOI]
Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. Flow-Based Independent Vector Analysis for Blind Source Separation. IEEE Signal Processing Letters, Vol. 27, pp. 2173–2177, 2020. [DOI]
Yiming Wu, Tristan Carsault, Eita Nakamura, Kazuyoshi Yoshii. Semi-supervised Neural Chord Estimation Based on a Variational Autoencoder with Latent Chord Labels and Features. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 2956–2966, 2020. [DOI] [arXiv]
Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara. Fast Multichannel Nonnegative Matrix Factorization with Directivity-Aware Jointly-Diagonalizable Spatial Covariance Matrices for Blind Source Separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 2610–2625, 2020. [DOI]
Ryo Nishikimi, Eita Nakamura, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii. Bayesian Singing Transcription Based on a Hierarchical Generative Model of Keys, Musical Notes, and F0 Trajectories. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1678–1691, 2020. [DOI]
Hiroaki Tsushima, Eita Nakamura, Kazuyoshi Yoshii. Bayesian Melody Harmonization Based on a Tree-Structured Generative Model of Chord Sequences and Melodies. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1644–1655, 2020. [DOI]
Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii. A Flow-Based Deep Latent Variable Model for Speech Spectrogram Modeling and Enhancement. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, pp. 1104–1117, 2020. [DOI]
Eita Nakamura, Yasuyuki Saito, Kazuyoshi Yoshii. Statistical Learning and Estimation of Piano Fingering. Information Sciences, Vol. 517, pp. 68–85, 2020. [DOI] [arXiv]
Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii, Tatsuya Kawahara. Semi-supervised Multichannel Speech Enhancement with a Deep Speech Prior. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 27, No. 12, pp. 2197–2212, 2019. [DOI] [PDF]
Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Speech Recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 27, No. 5, pp. 960–971, 2019. [DOI] [arXiv]
Yuta Ojima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Chord-Aware Automatic Music Transcription Based on Hierarchical Bayesian Integration of Acoustic and Language Models. APSIPA Transactions on Signal and Information Processing, Vol. 7, No. e14, pp. 1–14, 2018. [DOI] [PDF]
Eita Nakamura, Kazuyoshi Yoshii. Statistical Piano Reduction Controlling Performance Difficulty. APSIPA Transactions on Signal and Information Processing, Vol. 7, No. e13, pp. 1–12, 2018. [DOI] [PDF]
Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Generative Statistical Models with Self-Emergent Grammar of Chord Sequences. Journal of New Music Research, pp. 1–23, 2018. [DOI] [arXiv]
Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 4, pp. 831–846, 2018. [DOI] [PDF]
Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno. Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 2, pp. 215–230, 2018. [DOI] [PDF]
Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon. Note Value Recognition for Rhythm Transcription Using a Markov Random Field Model for Musical Scores and Performances of Piano Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 9, pp. 1846–1858, 2017. [DOI] [PDF]
Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama. Rhythm Transcription of Polyphonic Music Based on Merged-Output HMM for Multiple Voices. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 4, pp. 794–806, 2017. [DOI] [PDF]
Karim Youssef, Katsutoshi Itoyama, Kazuyoshi Yoshii. Simultaneous Identification and Localization of Immobile and Moving Speakers Based on Binaural Sound Acquisition. Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 59–71, 2017. [DOI] [PDF]
Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii. Layout Optimization of Cooperative Distributed Microphone Arrays Based on Estimation of Source Separation Performance. Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 83–93, 2017. [DOI] [PDF]
Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Audio-Visual Beat Tracking Based on a State-Space Model for a Robot Dancer Performing with a Human Dancer. Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 125–136, 2017. [DOI] [PDF]
Yoshiaki Bando, Hiroshi Saruwatari, Nobutaka Ono, Shoji Makino, Katsutoshi Itoyama, Daichi Kitamura, Masaru Ishimura, Moe Takakusaki, Narumi Mae, Kouei Yamaoka, Yutaro Matsui, Yuichi Ambe, Masashi Konyo, Satoshi Tadokoro, Kazuyoshi Yoshii, Hiroshi G. Okuno. Low-Latency and High-Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot. Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 198–212, 2017. [DOI] [PDF]
Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii. Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, No. 11, pp. 2084–2095, 2016. [DOI] [PDF]
Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto. Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models. International Journal of Semantic Computing, Vol. 10, No. 1, pp. 27–52, 2016. [DOI] [PDF]
Izaya Nishimuta, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Toward a Quizmaster Robot for Speech-Based Multiparty Interaction. Advanced Robotics, Vol. 29, No. 18, pp. 1205–1219, 2015. [DOI]
Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Nonparametric Bayesian Dereverberation of Power Spectrograms Based on Infinite-Order Autoregressive Processes. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 12, pp. 1918–1930, 2014. [DOI] [PDF]
Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto. AutoMashUpper: Automatic Creation of Multi-Song Music Mashups. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 12, pp. 1726–1737, 2014. [DOI] [PDF]
Kazuyoshi Yoshii, Masataka Goto. A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 3, pp. 717–730, 2012. [DOI] [PDF]
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. An Efficient Hybrid Music Recommender System Using an Incrementally Trainable Probabilistic Generative Model. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 16, No. 2, pp. 435–447, 2008. Funai Research Incentive Award. [DOI] [PDF]
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectrogram Templates with Harmonic Structure Suppression. IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 1, pp. 333–345, 2007. Telecom System Technology Student Award by the Telecommunications Advancement Foundation (TAF). [DOI] [PDF]
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. Drumix: An Audio Player with Real-Time Drum-Part Rearrangement Functions for Active Music Listening. IPSJ Digital Courier, Vol. 3, pp. 134–144, 2007. IPSJ Digital Courier Funai Young Researcher Encouragement Award. [DOI] [PDF]

International Conferences

Tsung-Ping Chen, Kazuyoshi Yoshii. Learning Multifaceted Self-Similarity over Time and Frequency for Music Structure Analysis. International Society for Music Information Retrieval Conference (ISMIR), pp. XXX–XXX, November 2024. [PDF]
Weixing Wei, Jiahao Zhao, Yulun Wu, Kazuyoshi Yoshii. Streaming Piano Transcription Based on Consistent Onset and Offset Decoding with Sustain Pedal Detection. International Society for Music Information Retrieval Conference (ISMIR), pp. XXX–XXX, November 2024. [PDF]
Yoshiaki Sumura, Diego Di Carlo, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii. Joint Audio Source Localization and Separation With Distributed Microphone Arrays Based on Spatially-Regularized Multichannel NMF. IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. XXX–XXX, September 2024. [DOI] [PDF]
Liam Kelley, Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. RIR-in-a-Box: Estimating Room Acoustics from 3D Mesh Data through Shoebox Approximation. Annual Conference of the International Speech Communication Association (Interspeech), pp. 3255–3259, September 2024. [DOI] [PDF]
Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. Neural Steerer: Novel Steering Vector Synthesis with a Causal Neural Field over Frequency and Direction. IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 740–744, April 2024. [DOI] [PDF]
Jiahao Zhao, Kazuyoshi Yoshii. Multimodal Multifaceted Music Emotion Recognition Based on Self-Attentive Fusion of Psychology-Inspired Symbolic and Acoustic Features. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1641–1645, October 2023. [DOI] [PDF]
Tsung-Ping Chen, Li Su, Kazuyoshi Yoshii. Learning Multifaceted Self-Similarity for Musical Structure Analysis. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 165–172, October 2023. [DOI] [PDF]
Yoto Fujita, Yoshiaki Bando, Keisuke Imoto, Masaki Onishi, Kazuyoshi Yoshii. DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 2061–2067, October 2023. [DOI] [PDF]
Tengyu Deng, Eita Nakamura, Kazuyoshi Yoshii. Audio-to-Score Singing Transcription Based on Joint Estimation of Pitches, Onsets, and Metrical Positions With Tatum-Level CTC Loss. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 583–590, October 2023. [DOI] [PDF]
Daichi Kamakura, Eita Nakamura, Kazuyoshi Yoshii. CTC2: End-to-End Drum Transcription Based on Connectionist Temporal Classification With Constant Tempo Constraint. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 158–164, October 2023. [DOI] [PDF]
Daichi Kamakura, Eita Nakamura, Takehisa Oyama, Kazuyoshi Yoshii. Joint Drum Transcription and Metrical Analysis Based on Periodicity-Aware Multi-Task Learning. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 151–157, October 2023. [DOI] [PDF]
Takuto Nabeoka, Eita Nakamura, Kazuyoshi Yoshii. Automatic Orchestration of Piano Scores for Wind Bands with User-Specified Instrumentation. International Symposium on Computer Music Multidisciplinary Research (CMMR), pp. 1–8, November 2023. [PDF]
Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii. Time-Domain Audio Source Separation Based on Gaussian Processes with Deep Kernel Learning. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1–5, October 2023. [DOI] [PDF]
Yoshiaki Bando, Yoshiki Masuyama, Aditya Arie Nugraha, Kazuyoshi Yoshii. Neural Fast Full-Rank Spatial Covariance Analysis for Blind Source Separation. European Signal Processing Conference (EUSIPCO), pp. 51–55, September 2023. [DOI] [PDF]
Moyu Terao, Eita Nakamura, Kazuyoshi Yoshii. Neural Band-to-Piano Score Arrangement with Stepless Difficulty Control. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 1–5, June 2023. [DOI] [PDF]
Tengyu Deng, Eita Nakamura, Kazuyoshi Yoshii. End-to-End Lyrics Transcription Informed by Pitch and Onset Estimation. International Society for Music Information Retrieval Conference (ISMIR), pp. 633–639, December 2022. [PDF]
Florian Thalmann, Eita Nakamura, Kazuyoshi Yoshii. Tracking the Evolution of a Band's Performances over Decades. International Society for Music Information Retrieval Conference (ISMIR), pp. 850–857, December 2022. [PDF]
Keitaro Tanaka, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima. Unsupervised Disentanglement of Timbral, Pitch, and Variation Features From Musical Instrument Sounds with Random Perturbation. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 709–716, November 2022. [DOI] [PDF]
Kouhei Sekiguchi, Aditya Arie Nugraha, Yicheng Du, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii. Direction-Aware Adaptive Online Neural Speech Enhancement with an Augmented Reality Headset in Real Noisy Conversational Environments. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 9266–9273, October 2022. [DOI] [PDF]
Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii. Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments. Annual Conference of the International Speech Communication Association (Interspeech), pp. 2918–2922, September 2022. [DOI] [PDF]
Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. DNN-Free Low-Latency Adaptive Speech Enhancement Based on Frame-Online Beamforming Powered by Block-Online FastMNMF. IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 1–5, September 2022. [DOI] [PDF]
Yoshiaki Sumura, Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Kazuyoshi Yoshii. Joint Localization and Synchronization of Distributed Camera-Attached Microphone Arrays for Indoor Scene Analysis. IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 1–5, September 2022. [DOI] [PDF]
Mathieu Fontaine, Diego Di Carlo, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii. Elliptically Contoured Alpha-Stable Representation for MUSIC-Based Sound Source Localization. European Signal Processing Conference (EUSIPCO), pp. 26–30, August 2022. [DOI] [PDF]
Moyu Terao, Yuki Hiramatsu, Ryoto Ishizuka, Yiming Wu, Kazuyoshi Yoshii. Difficulty-Aware Neural Band-to-Piano Score Arrangement Based on Note- and Statistic-Level Criteria. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 196–200, May 2022. [DOI] [PDF]
Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii. Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 501–505, May 2022. [DOI] [PDF]
Takehisa Oyama, Ryoto Ishizuka, Kazuyoshi Yoshii. Phase-Aware Joint Beat and Downbeat Estimation Based on Periodicity of Metrical Structure. International Society for Music Information Retrieval Conference (ISMIR), pp. 493–499, November 2021. [PDF]
Yuki Hiramatsu, Eita Nakamura, Kazuyoshi Yoshii. Joint Estimation of Note Values and Voices for Audio-to-Score Piano Transcription. International Society for Music Information Retrieval Conference (ISMIR), pp. 278–284, November 2021. [PDF]
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii. Alpha-Stable Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Speech Enhancement and Dereverberation. Annual Conference of the International Speech Communication Association (Interspeech), pp. 661–665, August 2021. [DOI] [PDF]
Yoshiaki Bando, Kouhei Sekiguchi, Kazuyoshi Yoshii. Gamma Process FastMNMF for Separating an Unknown Number of Sound Sources. European Signal Processing Conference (EUSIPCO), pp. 291–295, August 2021. [DOI] [PDF]
Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii. Autoregressive Fast Multichannel Nonnegative Matrix Factorization for Joint Blind Source Separation and Dereverberation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 511–515, June 2021. [DOI] [PDF]
Yuki Hiramatsu, Go Shibata, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. Statistical Correction of Transcribed Melody Notes Based on Probabilistic Integration of a Music Language Model and a Transcription Error Model. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 256–260, June 2021. [DOI] [PDF]
Keitaro Tanaka, Ryo Nishikimi, Yoshiaki Bando, Kazuyoshi Yoshii, Shigeo Morishima. Pitch-Timbre Disentanglement of Musical Instrument Sounds Based on VAE-Based Metric Learning. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 111–115, June 2021. [DOI] [PDF]
Kazuyoshi Yoshii, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Aditya Arie Nugraha. Fast Multichannel Correlated Tensor Factorization for Blind Source Separation. European Signal Processing Conference (EUSIPCO), pp. 306–310, January 2021. [DOI] [PDF]
Yicheng Du, Kouhei Sekiguchi, Yoshiaki Bando, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii, Tatsuya Kawahara. Semi-supervised Multichannel Speech Separation Based on a Phone- and Speaker-Aware Deep Generative Model of Speech Spectrograms. European Signal Processing Conference (EUSIPCO), pp. 870–874, January 2021. [DOI] [PDF]
Ryoto Ishizuka, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. Tatum-Level Drum Transcription Based on a Convolutional Recurrent Neural Network with Language Model-Based Regularized Training. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 359–364, December 2020. [PDF]
Yiming Wu, Eita Nakamura, Kazuyoshi Yoshii. A Variational Autoencoder for Joint Chord and Key Estimation from Audio Chromagrams. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 500–506, December 2020. [PDF]
Masaya Wake, Masahito Togami, Kazuyoshi Yoshii, Tatsuya Kawahara. Integration of Semi-blind Speech Source Separation and Voice Activity Detection for Flexible Spoken Dialogue. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 775–780, December 2020. [PDF]
Masahito Togami, Yoshiki Masuyama, Tatsuya Komatsu, Kazuyoshi Yoshii, Tatsuya Kawahara. Computer-Resource-Aware Deep Speech Separation with a Run-Time-Specified Number of BLSTM Layers. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 788–793, December 2020. [PDF]
Jeongwoo Woo, Masato Mimura, Kazuyoshi Yoshii, Tatsuya Kawahara. End-to-End Music-Mixed Speech Recognition. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 800–804, December 2020. [PDF]
Yoshiaki Bando, Kouhei Sekiguchi, Kazuyoshi Yoshii. Adaptive Neural Speech Enhancement with a Denoising Variational Autoencoder. Annual Conference of the International Speech Communication Association (Interspeech), pp. 2437–2441, October 2020. [DOI] [PDF]
Mathieu Fontaine, Kouhei Sekiguchi, Aditya Arie Nugraha, Kazuyoshi Yoshii. Unsupervised Robust Speech Enhancement Based on Alpha-Stable Fast Multichannel Nonnegative Matrix Factorization. Annual Conference of the International Speech Communication Association (Interspeech), pp. 4541–4545, October 2020. [DOI] [PDF]
Go Shibata, Ryo Nishikimi, Kazuyoshi Yoshii. Music Structure Analysis Based on an LSTM-HSMM Hybrid Model. International Society for Music Information Retrieval Conference (ISMIR), pp. 15–22, October 2020. [PDF]
Keitaro Tanaka, Takayuki Nakatsuka, Ryo Nishikimi, Kazuyoshi Yoshii, Shigeo Morishima. Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams. International Society for Music Information Retrieval Conference (ISMIR), pp. 327–334, October 2020. [PDF]
Florian Thalmann, Kazuyoshi Yoshii, Geraint Wiggins, Mark B. Sandler. A Method for Analysis of Shared Structure in Large Music Collections Using Techniques from Genetic Sequencing and Graph Theory. International Society for Music Information Retrieval Conference (ISMIR), pp. 343–350, October 2020. [PDF]
Andrew McLeod, James Owers, Kazuyoshi Yoshii. The MIDI Degradation Toolkit: Symbolic Music Augmentation and Correction. International Society for Music Information Retrieval Conference (ISMIR), pp. 846–850, October 2020. [PDF]
Go Shibata, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. Statistical Music Structure Analysis Based on a Homogeneity-, Repetitiveness-, and Regularity-Aware Hierarchical Hidden Semi-Markov Model. International Society for Music Information Retrieval Conference (ISMIR), pp. 268–275, November 2019. [PDF]
Adrien Ycart, Andrew McLeod, Emmanouil Benetos, Kazuyoshi Yoshii. Blending Acoustic and Language Model Predictions for Automatic Music Transcription. International Society for Music Information Retrieval Conference (ISMIR), pp. 454–461, November 2019. [PDF]
Ryo Nishikimi, Eita Nakamura, Masataka Goto, Kazuyoshi Yoshii. End-to-End Melody Note Transcription Based on a Beat-Synchronous Attention Mechanism. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 26–30, October 2019. [PDF]
Tomoyasu Nakano, Kazuyoshi Yoshii, Yiming Wu, Ryo Nishikimi, Kin Wah Edward Lin, Masataka Goto. Joint Singing Pitch Estimation and Voice Separation Based on a Neural Harmonic Structure Renderer. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 160–164, October 2019. [PDF]
Tristan Carsault, Andrew McLeod, Philippe Esling, Jérôme Nika, Eita Nakamura, Kazuyoshi Yoshii. Multi-Step Chord Sequence Prediction Based on Aggregated Multi-Scale Encoder-Decoder Networks. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, October 2019. [PDF]
Yoshiaki Bando, Yoko Sasaki, Kazuyoshi Yoshii. Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, October 2019. [PDF]
Aaron Chau, Kouhei Sekiguchi, Aditya Arie Nugraha, Kazuyoshi Yoshii, Kotaro Funakoshi. Audio-Visual SLAM towards Human Tracking and Human-Robot Interaction in Indoor Environments. IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), pp. 1–8, October 2019. Best Conference Paper Award. [PDF]
Yiming Wu, Tristan Carsault, Kazuyoshi Yoshii. Automatic Chord Estimation Based on a Frame-wise Convolutional Recurrent Neural Network with Non-Aligned Annotations. European Signal Processing Conference (EUSIPCO), pp. 1–5, September 2019. [PDF]
Kouhei Sekiguchi, Aditya Arie Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii. Fast Multichannel Source Separation Based on Jointly Diagonalizable Spatial Covariance Matrices. European Signal Processing Conference (EUSIPCO), pp. 1–5, September 2019. [PDF]
Mathieu Fontaine, Aditya Arie Nugraha, Roland Badeau, Kazuyoshi Yoshii, Antoine Liutkus. Cauchy Multichannel Speech Enhancement with a Deep Speech Prior. European Signal Processing Conference (EUSIPCO), pp. 1–5, September 2019. [PDF]
Ryo Nishikimi, Eita Nakamura, Satoru Fukayama, Masataka Goto, Kazuyoshi Yoshii. Automatic Singing Transcription Based on Encoder-Decoder Recurrent Neural Networks with a Weakly-Supervised Attention Mechanism. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 161–165, May 2019. [PDF]
Kentaro Shibata, Ryo Nishikimi, Satoru Fukayama, Masataka Goto, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Joint Transcription of Lead, Bass, and Rhythm Guitars Based on a Factorial Hidden Semi-Markov Model. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 236–240, May 2019. [PDF]
Shun Ueda, Kentaro Shibata, Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii. Bayesian Drum Transcription Based on Nonnegative Matrix Factor Decomposition with a Deep Score Prior. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 456–460, May 2019. [PDF]
Eita Nakamura, Kentaro Shibata, Ryo Nishikimi, Kazuyoshi Yoshii. Unsupervised Melody Style Conversion. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 196–200, May 2019. [PDF]
Andrew McLeod, Eita Nakamura, Kazuyoshi Yoshii. Improved Metrical Alignment of MIDI Performance Based on a Repetition-Aware Online-Adapted Grammar. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 186–190, May 2019. [PDF]
Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii. A Deep Generative Model of Speech Complex Spectrograms. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 905–909, May 2019. [PDF]
Kouhei Sekiguchi, Yoshiaki Bando, Kazuyoshi Yoshii, Tatsuya Kawahara. Bayesian Multichannel Speech Enhancement with a Deep Speech Prior. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1233–1239, November 2018. [PDF]
Eita Nakamura, Ryo Nishikimi, Simon Dixon, Kazuyoshi Yoshii. Probabilistic Sequential Patterns for Singing Transcription. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1905–1912, November 2018. [PDF]
Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Sequential Generation of Singing F0 Contours from Musical Note Sequences Based on WaveNet. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 983–989, November 2018. [PDF]
Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Interactive Arrangement of Chords and Melodies Based on a Tree-Structured Generative Model. International Society for Music Information Retrieval Conference (ISMIR), pp. 145–151, September 2018. [PDF]
Kazuyoshi Yoshii, Koichi Kitamura, Yoshiaki Bando, Eita Nakamura, Tatsuya Kawahara. Independent Low-Rank Tensor Analysis for Audio Source Separation. European Signal Processing Conference (EUSIPCO), pp. 1671–1675, September 2018. [PDF] [Slides]
Eita Nakamura, Emmanouil Benetos, Kazuyoshi Yoshii, Simon Dixon. Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 101–105, April 2018. [PDF]
Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Nonnegative Matrix Factorization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 716–720, April 2018. [PDF]
Kazuyoshi Yoshii. Correlated Tensor Factorization for Audio Source Separation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 731–735, April 2018. [PDF] [Poster]
Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5734–5738, April 2018. [PDF]
Hirofumi Inaguma, Masato Mimura, Koji Inoue, Kazuyoshi Yoshii, Tatsuya Kawahara. An End-to-End Approach to Joint Social Signal Detection and Speech Recognition. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6214–6218, April 2018. [PDF]
Izaya Nishimuta, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Multi-Party Interactions by Robot Quiz Master in Speech-Based Jeopardy!-Like Games. International Conference on Computational Science and Computational Intelligence (CSCI), pp. 1787–1792, December 2017. [PDF]
Eita Nakamura, Kazuyoshi Yoshii, Haruhiro Katayose. Performance Error Detection and Post-Processing for Fast and Accurate Symbolic Music Alignment. International Society for Music Information Retrieval Conference (ISMIR), pp. 347–353, October 2017. [PDF]
Ryo Nishikimi, Eita Nakamura, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii. Scale- and Rhythm-Aware Musical Note Estimation for Vocal F0 Trajectories Based on a Semi-Tatum-Synchronous Hierarchical Hidden Semi-Markov Model. International Society for Music Information Retrieval Conference (ISMIR), pp. 376–382, October 2017. [PDF]
Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Function- and Rhythm-Aware Melody Harmonization Based on Tree-Structured Parsing and Split-Merge Sampling of Chord Sequences. International Society for Music Information Retrieval Conference (ISMIR), pp. 502–508, October 2017. [PDF]
Kazuyoshi Yoshii, Eita Nakamura, Katsutoshi Itoyama, Masataka Goto. Infinite Probabilistic Latent Component Analysis for Audio Source Separation. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017. [PDF] [Poster]
Antoine Liutkus, Kazuyoshi Yoshii. A Diagonal Plus Low-Rank Covariance Model for Computationally Efficient Source Separation. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017. [PDF] [Slides]
Masaya Wake, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Semi-Blind Speech Enhancement Based on Recurrent Neural Network for Source Separation and Dereverberation. IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017. Student Paper Award Nominee. [PDF]
Masato Mimura, Yoshiaki Bando, Kazuki Shimada, Shinsuke Sakai, Kazuyoshi Yoshii, Tatsuya Kawahara. Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition. Annual Conference of the International Speech Communication Association (Interspeech), pp. 2451–2455, August 2017. [PDF]
Yusuke Wada, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. An Adaptive Karaoke System that Plays Accompaniment Parts of Music Audio Signals Synchronously with Users' Singing Voices. Sound and Music Computing Conference (SMC), pp. 110–116, July 2017. [PDF]
Yuta Ojima, Tomoyasu Nakano, Satoru Fukayama, Jun Kato, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii. A Singing Instrument for Real-Time Vocal-Part Arrangement of Music Audio Signals. Sound and Music Computing Conference (SMC), pp. 443–449, July 2017. [PDF]
Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara. Bayesian Multichannel Nonnegative Matrix Factorization for Audio Source Separation and Localization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 551–555, March 2017. [PDF]
Yoshiaki Bando, Hiroki Suhara, Motoyasu Tanaka, Tetsushi Kamegawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Fumitoshi Matsuno, Hiroshi G. Okuno. Sound-Based Online Localization for an In-Pipe Snake Robot. IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 207–213, October 2016. [PDF]
Kouhei Sekiguchi, Yoshiaki Bando, Keisuke Nakamura, Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii. Online Simultaneous Localization and Mapping of Multiple Sound Sources and Asynchronous Microphone Arrays. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1973–1979, October 2016. [PDF]
Koichi Kitamura, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii. Student's t Multichannel Nonnegative Matrix Factorization for Blind Source Separation. IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 1–5, September 2016. [PDF]
Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama. Rhythm Transcription of Polyphonic MIDI Performances Based on a Merged-Output HMM for Multiple Voices. Sound and Music Computing Conference (SMC), pp. 338–343, September 2016. [PDF]
Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno. Variational Bayesian Multi-Channel Robust NMF for Human-Voice Enhancement with a Deformable and Partially-Occluded Microphone Array. European Signal Processing Conference (EUSIPCO), pp. 1018–1022, August 2016. [PDF]
Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Rhythm Transcription of MIDI Performances Based on Hierarchical Bayesian Modelling of Repetition and Modification of Musical Note Patterns. European Signal Processing Conference (EUSIPCO), pp. 1946–1950, August 2016. [PDF]
Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. A Unified Bayesian Model of Time-Frequency Clustering and Low-Rank Approximation for Multi-Channel Source Separation. European Signal Processing Conference (EUSIPCO), pp. 2280–2284, August 2016. [PDF]
Yuta Ojima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis. International Society for Music Information Retrieval Conference (ISMIR), pp. 309–315, August 2016. [PDF]
Ryo Nishikimi, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii. Musical Note Estimation for F0 Trajectories of Singing Voices Based on a Bayesian Semi-Beat-Synchronous HMM. International Society for Music Information Retrieval Conference (ISMIR), pp. 461–467, August 2016. [PDF]
Tomoyasu Nakano, Daichi Mochihashi, Kazuyoshi Yoshii, Masataka Goto. Musical Typicality: How Many Similar Songs Exist? International Society for Music Information Retrieval Conference (ISMIR), pp. 695–701, August 2016. [PDF]
Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto. Student's t Nonnegative Matrix Factorization and Positive Semidefinite Tensor Factorization for Single-Channel Audio Source Separation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 51–55, March 2016. [PDF]
Eita Nakamura, Masatoshi Hamanaka, Keiji Hirata, Kazuyoshi Yoshii. Tree-Structured Probabilistic Model of Monophonic Written Music Based on the Generative Theory of Tonal Music. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 276–280, March 2016. [PDF]
Masataka Goto, Kazuyoshi Yoshii, Tomoyasu Nakano. Songle Widget: Making Animation and Physical Devices Synchronized with Music Videos on the Web. IEEE International Symposium on Multimedia (ISM), pp. 85–88, December 2015. [PDF]
Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto. Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models. IEEE International Symposium on Multimedia (ISM), pp. 197–204, December 2015. [PDF]
Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto. Infinite Superimposed Discrete All-Pole Modeling for Source-Filter Decomposition of Wavelet Spectrograms. International Society for Music Information Retrieval Conference (ISMIR), pp. 86–92, October 2015. [PDF] [Poster]
Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno. Human-Voice Enhancement Based on Online RPCA for a Hose-Shaped Rescue Robot with a Microphone Array. IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 1–6, October 2015. Most Innovative Paper Award & People's Choice Demo Award. [PDF]
Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Unified Inter- and Intra-Recording Duration Model for Multiple Music Audio Alignment. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1–5, October 2015. [PDF]
Karim Youssef, Katsutoshi Itoyama, Kazuyoshi Yoshii. Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context. IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 407–412, October 2015. [PDF]
Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii. Optimizing the Layout of Multiple Mobile Robots for Cooperative Sound Source Separation. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5548–5554, September 2015. [PDF]
Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii. Audio-Visual Beat Tracking Based on a State-Space Model for a Music Robot Dancing with Humans. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5555–5560, September 2015. [PDF]
Yoshiaki Bando, Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii, Masashi Konyo, Hiroshi G. Okuno, Satoshi Tadokoro. Microphone-Accelerometer Based 3D Posture Estimation for a Hose-Shaped Rescue Robot. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5580–5586, September 2015. [PDF]
Kousuke Itakura, Izaya Nishimuta, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii. Bayesian Integration of Sound Source Separation and Speech Recognition: A New Approach to Simultaneous Speech Recognition. Annual Conference of the International Speech Communication Association (Interspeech), pp. 736–740, September 2015. [PDF]
Ayaka Dobashi, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii. A Music Performance Assistance System Based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals. Sound and Music Computing Conference (SMC), pp. 99–104, July 2015. [PDF]
Tsubasa Fukuda, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii. A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification. Sound and Music Computing Conference (SMC), pp. 105–110, July 2015. [PDF]
Tatsunori Hirai, Yukara Ikemiya, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima. Automatic Singing Voice to Music Video Generation via Mashup of Singing Video Clips. Sound and Music Computing Conference (SMC), pp. 153–159, July 2015. [PDF]
Satoshi Maruo, Kazuyoshi Yoshii, Katsutoshi Itoyama, Matthias Mauch, Masataka Goto. A Feedback Framework for Improved Chord Recognition Based on NMF-Based Approximate Note Transcription. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 196–200, May 2015. [PDF]
Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. Singing Voice Analysis and Editing Based on Mutually Dependent F0 Estimation and Source Separation. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 574–578, May 2015. [PDF]
Yoshiaki Bando, Takuma Otsuka, Katsutoshi Itoyama, Kazuyoshi Yoshii, Yoko Sasaki, Satoshi Kagami, Hiroshi G. Okuno. Challenges in Deploying a Microphone Array to Localize and Separate Sound Sources in Real Auditory Scenes. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 723–727, May 2015. [PDF]
Yoshiaki Bando, Takuma Otsuka, Ikkyu Aihara, Hiromitsu Awano, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Recognition of In-Field Frog Chorusing Using Bayesian Nonparametric Microphone Array Processing. AAAI Workshop on Computational Sustainability, pp. 2–6, January 2015. [PDF]
Izaya Nishimuta, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno. Development of a Robot Quizmaster with Auditory Functions for Speech-Based Multiparty Interaction. IEEE/SICE International Symposium on System Integration (SII), pp. 328–333, December 2014. [PDF]
Izaya Nishimuta, Naoki Hirayama, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno. A Robot Quizmaster that can Localize, Separate, and Recognize Simultaneous Utterances for a Fastest-Voice-First Quiz Game. IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 967–972, November 2014. [PDF]
Yoshiaki Bando, Katsutoshi Itoyama, Satoshi Tadokoro, Masashi Konyo, Kazuhiro Nakadai, Kazuyoshi Yoshii. A Sound-Based Online Method for Estimating the Time-Varying Posture of a Hose-Shaped Robot. IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 1–6, October 2014. Best Student Paper Award. [PDF]
Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima. Spotting a Query Phrase from Polyphonic Music Audio Signals Based on Semi-Supervised Nonnegative Matrix Factorization. International Society for Music Information Retrieval Conference (ISMIR), pp. 227–232, October 2014. [PDF] [Video] [Video (Local)]
Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno. Bayesian Audio Alignment Based on a Unified Generative Model of Music Composition and Performance. International Society for Music Information Retrieval Conference (ISMIR), pp. 233–238, October 2014. [PDF] [Video] [Video (Local)]
Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima. LyricsRadar: A Lyrics Retrieval System Based on Latent Topics of Lyrics. International Society for Music Information Retrieval Conference (ISMIR), pp. 585–590, October 2014. [PDF] [Poster]
Kazuyoshi Yoshii, Hiromasa Fujihara, Tomoyasu Nakano, Masataka Goto. Cultivating Vocal Activity Detection for Music Audio Signals in a Circulation-Type Crowdsourcing Ecosystem. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 624–628, May 2014. [PDF] [Poster]
Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto. Vocal Timbre Analysis Using Latent Dirichlet Allocation and Cross-Gender Vocal Timbre Similarity. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5239–5243, May 2014. [PDF]
Tomohiko Nakamura, Hirokazu Kameoka, Kazuyoshi Yoshii, Masataka Goto. Timbre Replacement of Harmonic and Drum Components for Music Audio Signals. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7520–7524, May 2014. [PDF]
Philippe Hamel, Matthew E. P. Davies, Kazuyoshi Yoshii, Masataka Goto. Transfer Learning in MIR: Sharing Learned Latent Representations for Music Audio Classification and Similarity. International Society for Music Information Retrieval Conference (ISMIR), pp. 9–14, November 2013. [PDF]
Kazuyoshi Yoshii, Ryota Tomioka, Daichi Mochihashi, Masataka Goto. Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction. International Society for Music Information Retrieval Conference (ISMIR), pp. 369–374, November 2013. Best Oral Presentation Award. [PDF] [Code]
Satoru Fukayama, Kazuyoshi Yoshii, Masataka Goto. ChordSequenceFactory: A Chord Arrangement System Modifying Factorized Chord Sequence Probabilities. International Society for Music Information Retrieval Conference (ISMIR), pp. 457–462, November 2013. [PDF]
Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto. AutoMashUpper: An Automatic Multi-Song Mashup System. International Society for Music Information Retrieval Conference (ISMIR), pp. 575–580, November 2013. [PDF]
Yoko Sasaki, Kazuyoshi Yoshii, Satoshi Kagami. Nested iGMM Recognition and Multiple Hypothesis Tracking of Moving Sound Sources for Mobile Robot Audition. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2013. [PDF]
Yoko Sasaki, Kazuyoshi Yoshii, Satoshi Kagami. A Nested Infinite Gaussian Mixture Model for Identifying Known and Unknown Audio Events. International Workshop on Image and Audio Analysis for Multimedia Interactive Services (WIA2MIS), July 2013. [PDF] [Poster]
Kazuyoshi Yoshii, Ryota Tomioka, Daichi Mochihashi, Masataka Goto. Infinite Positive Semidefinite Tensor Factorization for Source Separation of Mixture Signals. International Conference on Machine Learning (ICML), pp. 576–584, June 2013. [PDF] [Supplementary] [Spotlight] [Poster] [Video]
Kazuyoshi Yoshii, Masataka Goto. Infinite Kernel Linear Prediction for Joint Estimation of Spectral Envelope and Fundamental Frequency. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 463–467, May 2013. [PDF] [Poster]
Kazuyoshi Yoshii, Masataka Goto. Infinite Composite Autoregressive Models for Music Signal Analysis. International Society for Music Information Retrieval Conference (ISMIR), pp. 79–84, October 2012. [PDF] [Slide] [Poster]
Masataka Goto, Jun Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano. PodCastle and Songle: Crowdsourcing-Based Web Services for Retrieval and Browsing of Speech and Music Content. International Workshop on Crowdsourcing Web Search (CrowdSearch), pp. 1–6, April 2012. [PDF] [PodCastle] [Songle]
Kazuyoshi Yoshii, Masataka Goto. Unsupervised Music Understanding Based on Nonparametric Bayesian Models. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5353–5356, March 2012. [PDF]
Masataka Goto, Jun Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano. PodCastle and Songle: Crowdsourcing-Based Web Services for Spoken Document Retrieval and Active Music Listening. Information Theory and Applications Workshop (ITA), pp. 298–299, February 2012. [PDF] [PodCastle] [Songle]
Kazuyoshi Yoshii, Matthias Mauch, Masataka Goto. A Unified Probabilistic Model of Note Combinations and Chord Progressions. International Workshop on Music and Machine Learning (MML), December 2011. [PDF] [Video]
Kazuyoshi Yoshii, Masataka Goto. A Vocabulary-Free Infinity-Gram Model for Nonparametric Bayesian Chord Progression Analysis. International Society for Music Information Retrieval Conference (ISMIR), pp. 645–650, October 2011. [PDF]
Masataka Goto, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano. Songle: A Web Service for Active Music Listening Improved by User Contributions. International Society for Music Information Retrieval Conference (ISMIR), pp. 311–316, October 2011. [PDF] [Web Service]
Matthias Mauch, Hiromasa Fujihara, Kazuyoshi Yoshii, Masataka Goto. Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music. International Society for Music Information Retrieval Conference (ISMIR), pp. 233–238, October 2011. [PDF]
Kazuyoshi Yoshii, Masataka Goto. Infinite Latent Harmonic Allocation: A Nonparametric Bayesian Approach to Multipitch Analysis. International Society for Music Information Retrieval Conference (ISMIR), pp. 309–314, August 2010. [PDF]
Kazuyoshi Yoshii, Masataka Goto. Continuous pLSI and Smoothing Techniques for Hybrid Music Recommendation. International Society for Music Information Retrieval Conference (ISMIR), pp. 339–344, October 2009. [PDF]
Kazuyoshi Yoshii, Masataka Goto. MusicCommentator: Generating Comments Synchronized with Musical Audio Signals by a Joint Probabilistic Model of Acoustic and Textual Features. International Conference on Entertainment Computing (ICEC), pp. 85–97, September 2009. [PDF] [Demo]
Kazuyoshi Yoshii, Masataka Goto. MusicThumbnailer: Visualizing Musical Pieces in Thumbnail Images Based on Acoustic Features. International Conference on Music Information Retrieval (ISMIR), pp. 211–216, September 2008. [PDF] [Demo]
Kouhei Sumi, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. Automatic Chord Recognition Based on Probabilistic Integration of Chord Transition and Bass Pitch Estimation. International Conference on Music Information Retrieval (ISMIR), pp. 39–44, September 2008. [PDF]
Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino. A Robot Singer with Music Recognition Based on Real-Time Beat Tracking. International Conference on Music Information Retrieval (ISMIR), pp. 199–204, September 2008. [PDF]
Takehiro Abe, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. Analysis-and-Manipulation Approach to Pitch and Duration of Musical Instrument Sounds without Distorting Timbral Characteristics. International Conference on Digital Audio Effects (DAFX), pp. 249–256, September 2008. [PDF] [Slides] [Demo]
Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino. A Robot Uses Its Own Microphone to Synchronize Its Steps to Musical Beats while Scatting and Singing. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2459–2464, September 2008. Award for Entertainment Robots and Systems (NTF Award) Nomination Finalist (4/649). [PDF]
Takeshi Mizumoto, Ryu Takeda, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. A Robot Listens to Music and Counts Its Beats Aloud by Separating Music from Counting Voice. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1538–1543, September 2008. Award for Entertainment Robots and Systems (NTF Award) Nomination Finalist (4/649). [PDF]
Kazuyoshi Yoshii, Kazuhiro Nakadai, Toyotaka Torii, Yuji Hasegawa, Hiroshi Tsujino, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. A Biped Robot that Keeps Steps in Time with Musical Beats while Listening to Music with Its Own Ears. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1743–1750, October 2007. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. Improving Efficiency and Scalability of Model-Based Music Recommender System Based on Incremental Training. International Conference on Music Information Retrieval (ISMIR), pp. 89–94, September 2007. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. Hybrid Collaborative and Content-Based Music Recommendation Using Probabilistic Model with Latent User Preferences. International Conference on Music Information Retrieval (ISMIR), pp. 296–301, October 2006. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno. An Error Correction Framework Based on Drum Pattern Periodicity for Improving Drum Sound Detection. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. V, pp. 237–240, May 2006. IEEE Signal Processing Society Japan Chapter Student Paper Award. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. INTER:D: A Drum Sound Equalizer for Controlling Volume and Timbre of Drums. European Workshop on the Integration of Knowledge, Semantic and Digital Media Technologies (EWIMT), pp. 205–212, November 2005. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. Automatic Drum Sound Description for Real-World Music Using Template Adaptation and Matching Methods. International Conference on Music Information Retrieval (ISMIR), pp. 184–191, October 2004. [PDF]
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods. ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA), October 2004. [PDF]

International Competitions

Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. MIREX 2016: Audio Melody Extraction. Annual Music Information Retrieval Evaluation eXchange (MIREX), August 2016. [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset] [ORCHSET15 Dataset]
Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. MIREX 2015: Singing Voice Separation. Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2015. [PDF] [Results]
Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. MIREX 2015: Audio Melody Extraction. Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2015. [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset] [ORCHSET15 Dataset]
Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. MIREX 2014: Singing Voice Separation. Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2014. Winner of the Singing Voice Separation Track. [PDF] [Results]
Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama. MIREX 2014: Audio Melody Extraction. Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2014. [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset]
Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno. MIREX 2005: Audio Drum Detection. Annual Music Information Retrieval Evaluation eXchange (MIREX), September 2005. Winner of the Audio Drum Detection Track (MIREX 2005 Best-in-Class Award). [PDF] [Results]