International Journals

  1. Masataka Goto, Yuta Kawasaki, Takahiro Inoue, Kazuyoshi Yoshii, Tomoyasu Nakano.   Songle: A Web Service for Enriching Music Listening Experiences Based on Music-Understanding Technologies.   Applied Sciences, Accepted, 2018.  
  2. Eita Nakamura, Kazuyoshi Yoshii.   Statistical Piano Reduction Controlling Performance Difficulty.   APSIPA Transactions on Signal and Information Processing, Accepted, 2018.  
  3. Yuta Ojima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Chord-Aware Automatic Music Transcription Based on Hierarchical Bayesian Integration of Acoustic and Language Models.   APSIPA Transactions on Signal and Information Processing, Accepted, 2018.  
  4. Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Speech Recognition.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Major Revision (RQ), 2018.  
  5. Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Generative Statistical Models with Self-Emergent Grammar of Chord Sequences.   Journal of New Music Research, pp. 1–23, 2018.  
  6. Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 4, pp. 831–846, 2018.   [PDF]
  7. Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Tatsuya Kawahara, Hiroshi G. Okuno.   Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 26, No. 2, pp. 215–230, 2018.   [PDF]
  8. Eita Nakamura, Kazuyoshi Yoshii, Simon Dixon.   Note Value Recognition for Rhythm Transcription Using a Markov Random Field Model for Musical Scores and Performances of Piano Music.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 9, pp. 1846–1858, 2017.   [PDF]
  9. Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama.   Rhythm Transcription of Polyphonic Music Based on Merged-Output HMM for Multiple Voices.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 4, pp. 794–806, 2017.   [PDF]
  10. Karim Youssef, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Simultaneous Identification and Localization of Immobile and Moving Speakers Based on Binaural Sound Acquisition.   Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 59–71, 2017.   [PDF]
  11. Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Layout Optimization of Cooperative Distributed Microphone Arrays Based on Estimation of Source Separation Performance.   Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 83–93, 2017.   [PDF]
  12. Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Audio-Visual Beat Tracking Based on a State-Space Model for a Robot Dancer Performing with a Human Dancer.   Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 125–136, 2017.   [PDF]
  13. Yoshiaki Bando, Hiroshi Saruwatari, Nobutaka Ono, Shoji Makino, Katsutoshi Itoyama, Daichi Kitamura, Masaru Ishimura, Moe Takakusaki, Narumi Mae, Kouei Yamaoka, Yutaro Matsui, Yuichi Ambe, Masashi Konyo, Satoshi Tadokoro, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Low-Latency and High-Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot.   Journal of Robotics and Mechatronics, Vol. 29, No. 1, pp. 198–212, 2017.   [PDF]
  14. Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 24, No. 11, pp. 2084–2095, 2016.   [PDF]
  15. Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto.   Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models.   International Journal of Semantic Computing, Vol. 10, No. 1, pp. 27–52, 2016.   [PDF]
  16. Izaya Nishimuta, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Toward a Quizmaster Robot for Speech-Based Multiparty Interaction.   Advanced Robotics, Vol. 29, No. 18, pp. 1205–1219, 2015.  
  17. Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Nonparametric Bayesian Dereverberation of Power Spectrograms Based on Infinite-Order Autoregressive Processes.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 12, pp. 1918–1930, 2014.   [PDF]
  18. Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto.   AutoMashUpper: Automatic Creation of Multi-Song Music Mashups.   IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 22, No. 12, pp. 1726–1737, 2014.   [PDF]
  19. Kazuyoshi Yoshii, Masataka Goto.   A Nonparametric Bayesian Multipitch Analyzer Based on Infinite Latent Harmonic Allocation.   IEEE Transactions on Audio, Speech, and Language Processing, Vol. 20, No. 3, pp. 717–730, 2012.   [PDF]
  20. Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   An Efficient Hybrid Music Recommender System Using an Incrementally Trainable Probabilistic Generative Model.   IEEE Transactions on Audio, Speech, and Language Processing, Vol. 16, No. 2, pp. 435–447, 2008.   Funai Research Incentive Award.   [PDF]
  21. Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno.   Drum Sound Recognition for Polyphonic Audio Signals by Adaptation and Matching of Spectrogram Templates with Harmonic Structure Suppression.   IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 1, pp. 333–345, 2007.   Telecom System Technology Student Award by the Telecommunications Advancement Foundation (TAF).   [PDF]
  22. Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   Drumix: An Audio Player with Real-Time Drum-Part Rearrangement Functions for Active Music Listening.   IPSJ Digital Courier, Vol. 3, pp. 134–144, 2007.   IPSJ Digital Courier Funai Young Researcher Encouragement Award.   [PDF] [Demo] [Demo (Japanese)]

International Conferences

  1. Ryo Nishikimi, Eita Nakamura, Satoru Fukayama, Masataka Goto, Kazuyoshi Yoshii.   Automatic Singing Transcription Based on Encoder-Decoder Recurrent Neural Networks with a Weakly-Supervised Attention Mechanism.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  2. Kentaro Shibata, Ryo Nishikimi, Satoru Fukayama, Masataka Goto, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Joint Transcription of Lead, Bass, and Rhythm Guitars Based on a Factorial Hidden Semi-Markov Model.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  3. Shun Ueda, Kentaro Shibata, Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii.   Bayesian Drum Transcription Based on Nonnegative Matrix Factor Decomposition with a Deep Score Prior.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  4. Lisa Zahray, Eita Nakamura, Kazuyoshi Yoshii.   Beat and Downbeat Detection with Chord Recognition Based on Multi-Task Learning of Recurrent Neural Networks.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  5. Eita Nakamura, Kentaro Shibata, Ryo Nishikimi, Kazuyoshi Yoshii.   Unsupervised Melody Arrangement for Style Conversion.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  6. Andrew McLeod, Eita Nakamura, Kazuyoshi Yoshii.   Improved Metrical Alignment of MIDI Performance Based on a Repetition-Aware Online-Adapted Grammar.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  7. Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii.   A Deep Generative Model of Speech Complex Spectrograms.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Submitted, May 2019.
  8. Kouhei Sekiguchi, Yoshiaki Bando, Kazuyoshi Yoshii, Tatsuya Kawahara.   Bayesian Multichannel Speech Enhancement with a Deep Speech Prior.   Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Accepted, December 2018.
  9. Eita Nakamura, Ryo Nishikimi, Simon Dixon, Kazuyoshi Yoshii.   Probabilistic Sequential Patterns for Singing Transcription.   Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Accepted, December 2018.
  10. Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Sequential Generation of Singing F0 Contours from Musical Note Sequences Based on WaveNet.   Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Accepted, December 2018.
  11. Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Interactive Arrangement of Chords and Melodies Based on a Tree-Structured Generative Model.   International Society for Music Information Retrieval Conference (ISMIR), pp. 145–151, September 2018. [PDF]
  12. Kazuyoshi Yoshii, Koichi Kitamura, Yoshiaki Bando, Eita Nakamura, Tatsuya Kawahara.   Independent Low-Rank Tensor Analysis for Audio Source Separation.   European Signal Processing Conference (EUSIPCO), pp. 1671–1675, September 2018. [PDF] [Slides]
  13. Eita Nakamura, Emmanouil Benetos, Kazuyoshi Yoshii, Simon Dixon.   Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 101–105, April 2018. [PDF]
  14. Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Nonnegative Matrix Factorization.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 716–720, April 2018. [PDF]
  15. Kazuyoshi Yoshii.   Correlated Tensor Factorization for Audio Source Separation.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 731–735, April 2018. [PDF] [Poster]
  16. Kazuki Shimada, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5734–5738, April 2018. [PDF]
  17. Hirofumi Inaguma, Masato Mimura, Koji Inoue, Kazuyoshi Yoshii, Tatsuya Kawahara.   An End-to-End Approach to Joint Social Signal Detection and Speech Recognition.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 6214–6218, April 2018. [PDF]
  18. Izaya Nishimuta, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Multi-Party Interactions by Robot Quiz Master in Speech-Based Jeopardy!-Like Games.   International Conference on Computational Science and Computational Intelligence (CSCI), pp. 1787–1792, December 2017. [PDF]
  19. Eita Nakamura, Kazuyoshi Yoshii, Haruhiro Katayose.   Performance Error Detection and Post-Processing for Fast and Accurate Symbolic Music Alignment.   International Society for Music Information Retrieval Conference (ISMIR), pp. 347–353, October 2017. [PDF]
  20. Ryo Nishikimi, Eita Nakamura, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Scale- and Rhythm-Aware Musical Note Estimation for Vocal F0 Trajectories Based on a Semi-Tatum-Synchronous Hierarchical Hidden Semi-Markov Model.   International Society for Music Information Retrieval Conference (ISMIR), pp. 376–382, October 2017. [PDF]
  21. Hiroaki Tsushima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Function- and Rhythm-Aware Melody Harmonization Based on Tree-Structured Parsing and Split-Merge Sampling of Chord Sequences.   International Society for Music Information Retrieval Conference (ISMIR), pp. 502–508, October 2017. [PDF]
  22. Kazuyoshi Yoshii, Eita Nakamura, Katsutoshi Itoyama, Masataka Goto.   Infinite Probabilistic Latent Component Analysis for Audio Source Separation.   IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017. [PDF] [Poster]
  23. Antoine Liutkus, Kazuyoshi Yoshii.   A Diagonal Plus Low-Rank Covariance Model for Computationally Efficient Source Separation.   IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017. [PDF] [Slides]
  24. Masaya Wake, Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Semi-Blind Speech Enhancement Based on Recurrent Neural Network for Source Separation and Dereverberation.   IEEE International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6, September 2017.   Student Paper Award Nominee.   [PDF]
  25. Masato Mimura, Yoshiaki Bando, Kazuki Shimada, Shinsuke Sakai, Kazuyoshi Yoshii, Tatsuya Kawahara.   Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition.   Annual Conference of the International Speech Communication Association (Interspeech), pp. 2451–2455, August 2017. [PDF]
  26. Yusuke Wada, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   An Adaptive Karaoke System that Plays Accompaniment Parts of Music Audio Signals Synchronously with Users' Singing Voices.   Sound and Music Computing Conference (SMC), pp. 110–116, July 2017. [PDF]
  27. Yuta Ojima, Tomoyasu Nakano, Satoru Fukayama, Jun Kato, Masataka Goto, Katsutoshi Itoyama, Kazuyoshi Yoshii.   A Singing Instrument for Real-Time Vocal-Part Arrangement of Music Audio Signals.   Sound and Music Computing Conference (SMC), pp. 443–449, July 2017. [PDF]
  28. Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara.   Bayesian Multichannel Nonnegative Matrix Factorization for Audio Source Separation and Localization.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 551–555, March 2017. [PDF]
  29. Yoshiaki Bando, Hiroki Suhara, Motoyasu Tanaka, Tetsushi Kamegawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Fumitoshi Matsuno, Hiroshi G. Okuno.   Sound-Based Online Localization for an In-Pipe Snake Robot.   IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 207–213, October 2016. [PDF]
  30. Kouhei Sekiguchi, Yoshiaki Bando, Keisuke Nakamura, Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Online Simultaneous Localization and Mapping of Multiple Sound Sources and Asynchronous Microphone Arrays.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1973–1979, October 2016. [PDF]
  31. Koichi Kitamura, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Student's t Multichannel Nonnegative Matrix Factorization for Blind Source Separation.   IEEE International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 1–5, September 2016. [PDF]
  32. Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama.   Rhythm Transcription of Polyphonic MIDI Performances Based on a Merged-Output HMM for Multiple Voices.   Sound and Music Computing Conference (SMC), pp. 338–343, September 2016. [PDF]
  33. Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Variational Bayesian Multi-Channel Robust NMF for Human-Voice Enhancement with a Deformable and Partially-Occluded Microphone Array.   European Signal Processing Conference (EUSIPCO), pp. 1018–1022, August 2016. [PDF]
  34. Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Rhythm Transcription of MIDI Performances Based on Hierarchical Bayesian Modelling of Repetition and Modification of Musical Note Patterns.   European Signal Processing Conference (EUSIPCO), pp. 1946–1950, August 2016. [PDF]
  35. Kousuke Itakura, Yoshiaki Bando, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   A Unified Bayesian Model of Time-Frequency Clustering and Low-Rank Approximation for Multi-Channel Source Separation.   European Signal Processing Conference (EUSIPCO), pp. 2280–2284, August 2016. [PDF]
  36. Yuta Ojima, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis.   International Society for Music Information Retrieval Conference (ISMIR), pp. 309–315, August 2016. [PDF]
  37. Ryo Nishikimi, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Musical Note Estimation for F0 Trajectories of Singing Voices Based on a Bayesian Semi-Beat-Synchronous HMM.   International Society for Music Information Retrieval Conference (ISMIR), pp. 461–467, August 2016. [PDF]
  38. Tomoyasu Nakano, Daichi Mochihashi, Kazuyoshi Yoshii, Masataka Goto.   Musical Typicality: How Many Similar Songs Exist?   International Society for Music Information Retrieval Conference (ISMIR), pp. 695–701, August 2016. [PDF]
  39. Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto.   Student's t Nonnegative Matrix Factorization and Positive Semidefinite Tensor Factorization for Single-Channel Audio Source Separation.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 51–55, March 2016. [PDF]
  40. Eita Nakamura, Masatoshi Hamanaka, Keiji Hirata, Kazuyoshi Yoshii.   Tree-Structured Probabilistic Model of Monophonic Written Music Based on the Generative Theory of Tonal Music.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 276–280, March 2016. [PDF]
  41. Masataka Goto, Kazuyoshi Yoshii, Tomoyasu Nakano.   Songle Widget: Making Animation and Physical Devices Synchronized with Music Videos on the Web.   IEEE International Symposium on Multimedia (ISM), pp. 85–88, December 2015. [PDF]
  42. Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto.   Musical Similarity and Commonness Estimation Based on Probabilistic Generative Models.   IEEE International Symposium on Multimedia (ISM), pp. 197–204, December 2015. [PDF]
  43. Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto.   Infinite Superimposed Discrete All-Pole Modeling for Source-Filter Decomposition of Wavelet Spectrograms.   International Society for Music Information Retrieval Conference (ISMIR), pp. 86–92, October 2015. [PDF] [Poster]
  44. Yoshiaki Bando, Katsutoshi Itoyama, Masashi Konyo, Satoshi Tadokoro, Kazuhiro Nakadai, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Human-Voice Enhancement Based on Online RPCA for a Hose-Shaped Rescue Robot with a Microphone Array.   IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 1–6, October 2015.   Most Innovative Paper Award & People's Choice Demo Award.   [PDF]
  45. Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Unified Inter- and Intra-Recording Duration Model for Multiple Music Audio Alignment.   IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp. 1–5, October 2015. [PDF]
  46. Karim Youssef, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context.   IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 407–412, October 2015. [PDF]
  47. Kouhei Sekiguchi, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Optimizing the Layout of Multiple Mobile Robots for Cooperative Sound Source Separation.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5548–5554, September 2015. [PDF]
  48. Misato Ohkita, Yoshiaki Bando, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Audio-Visual Beat Tracking Based on a State-Space Model for a Music Robot Dancing with Humans.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5555–5560, September 2015. [PDF]
  49. Yoshiaki Bando, Kazuhiro Nakadai, Katsutoshi Itoyama, Kazuyoshi Yoshii, Masashi Konyo, Hiroshi G. Okuno, Satoshi Tadokoro.   Microphone-Accelerometer Based 3D Posture Estimation for a Hose-Shaped Rescue Robot.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5580–5586, September 2015. [PDF]
  50. Kousuke Itakura, Izaya Nishimuta, Yoshiaki Bando, Katsutoshi Itoyama, Kazuyoshi Yoshii.   Bayesian Integration of Sound Source Separation and Speech Recognition: A New Approach to Simultaneous Speech Recognition.   Annual Conference of the International Speech Communication Association (Interspeech), pp. 736–740, September 2015. [PDF]
  51. Ayaka Dobashi, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii.   A Music Performance Assistance System Based on Vocal, Harmonic, and Percussive Source Separation and Content Visualization for Music Audio Signals.   Sound and Music Computing Conference (SMC), pp. 99–104, July 2015. [PDF]
  52. Tsubasa Fukuda, Yukara Ikemiya, Katsutoshi Itoyama, Kazuyoshi Yoshii.   A Score-Informed Piano Tutoring System with Mistake Detection and Score Simplification.   Sound and Music Computing Conference (SMC), pp. 105–110, July 2015. [PDF]
  53. Tatsunori Hirai, Yukara Ikemiya, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima.   Automatic Singing Voice to Music Video Generation via Mashup of Singing Video Clips.   Sound and Music Computing Conference (SMC), pp. 153–159, July 2015. [PDF]
  54. Satoshi Maruo, Kazuyoshi Yoshii, Katsutoshi Itoyama, Matthias Mauch, Masataka Goto.   A Feedback Framework for Improved Chord Recognition Based on NMF-Based Approximate Note Transcription.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 196–200, May 2015. [PDF]
  55. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   Singing Voice Analysis and Editing Based on Mutually Dependent F0 Estimation and Source Separation.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 574–578, May 2015. [PDF]
  56. Yoshiaki Bando, Takuma Otsuka, Katsutoshi Itoyama, Kazuyoshi Yoshii, Yoko Sasaki, Satoshi Kagami, Hiroshi G. Okuno.   Challenges in Deploying a Microphone Array to Localize and Separate Sound Sources in Real Auditory Scenes.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 723–727, May 2015. [PDF]
  57. Yoshiaki Bando, Takuma Otsuka, Ikkyu Aihara, Hiromitsu Awano, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Recognition of In-Field Frog Chorusing Using Bayesian Nonparametric Microphone Array Processing.   AAAI Workshop on Computational Sustainability, pp. 2–6, January 2015. [PDF]
  58. Izaya Nishimuta, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno.   Development of a Robot Quizmaster with Auditory Functions for Speech-Based Multiparty Interaction.   IEEE/SICE International Symposium on System Integration (SII), pp. 328–333, December 2014. [PDF]
  59. Izaya Nishimuta, Naoki Hirayama, Kazuyoshi Yoshii, Katsutoshi Itoyama, Hiroshi G. Okuno.   A Robot Quizmaster that can Localize, Separate, and Recognize Simultaneous Utterances for a Fastest-Voice-First Quiz Game.   IEEE-RAS International Conference on Humanoid Robots (Humanoids), pp. 967–972, November 2014. [PDF]
  60. Yoshiaki Bando, Katsutoshi Itoyama, Satoshi Tadokoro, Masashi Konyo, Kazuhiro Nakadai, Kazuyoshi Yoshii.   A Sound-Based Online Method for Estimating the Time-Varying Posture of a Hose-Shaped Robot.   IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pp. 1–6, October 2014.   Best Student Paper Award.   [PDF]
  61. Taro Masuda, Kazuyoshi Yoshii, Masataka Goto, Shigeo Morishima.   Spotting a Query Phrase from Polyphonic Music Audio Signals Based on Semi-Supervised Nonnegative Matrix Factorization.   International Society for Music Information Retrieval Conference (ISMIR), pp. 227–232, October 2014.   [PDF] [Video] [Video (Local)]
  62. Akira Maezawa, Katsutoshi Itoyama, Kazuyoshi Yoshii, Hiroshi G. Okuno.   Bayesian Audio Alignment Based on a Unified Generative Model of Music Composition and Performance.   International Society for Music Information Retrieval Conference (ISMIR), pp. 233–238, October 2014.   [PDF] [Video] [Video (Local)]
  63. Shoto Sasaki, Kazuyoshi Yoshii, Tomoyasu Nakano, Masataka Goto, Shigeo Morishima.   LyricsRadar: A Lyrics Retrieval System Based on Latent Topics of Lyrics.   International Society for Music Information Retrieval Conference (ISMIR), pp. 585–590, October 2014.   [PDF] [Poster]
  64. Kazuyoshi Yoshii, Hiromasa Fujihara, Tomoyasu Nakano, Masataka Goto.   Cultivating Vocal Activity Detection for Music Audio Signals in a Circulation-Type Crowdsourcing Ecosystem.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 624–628, May 2014. [PDF] [Poster]
  65. Tomoyasu Nakano, Kazuyoshi Yoshii, Masataka Goto.   Vocal Timbre Analysis Using Latent Dirichlet Allocation and Cross-Gender Vocal Timbre Similarity.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5239–5243, May 2014. [PDF]
  66. Tomohiko Nakamura, Hirokazu Kameoka, Kazuyoshi Yoshii, Masataka Goto.   Timbre Replacement of Harmonic and Drum Components for Music Audio Signals.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 7520–7524, May 2014. [PDF]
  67. Philippe Hamel, Matthew E. P. Davies, Kazuyoshi Yoshii, Masataka Goto.   Transfer Learning in MIR: Sharing Learned Latent Representations for Music Audio Classification and Similarity.   International Society for Music Information Retrieval Conference (ISMIR), pp. 9–14, November 2013.   [PDF]
  68. Kazuyoshi Yoshii, Ryota Tomioka, Daichi Mochihashi, Masataka Goto.   Beyond NMF: Time-Domain Audio Source Separation without Phase Reconstruction.   International Society for Music Information Retrieval Conference (ISMIR), pp. 369–374, November 2013.   Best Oral Presentation Award.   [PDF] [Code]
  69. Satoru Fukayama, Kazuyoshi Yoshii, Masataka Goto.   ChordSequenceFactory: A Chord Arrangement System Modifying Factorized Chord Sequence Probabilities.   International Society for Music Information Retrieval Conference (ISMIR), pp. 457–462, November 2013.   [PDF]
  70. Matthew E. P. Davies, Philippe Hamel, Kazuyoshi Yoshii, Masataka Goto.   AutoMashUpper: An Automatic Multi-Song Mashup System.   International Society for Music Information Retrieval Conference (ISMIR), pp. 575–580, November 2013.   [PDF]
  71. Yoko Sasaki, Kazuyoshi Yoshii, Satoshi Kagami.   Nested iGMM Recognition and Multiple Hypothesis Tracking of Moving Sound Sources for Mobile Robot Audition.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), November 2013.   [PDF]
  72. Yoko Sasaki, Kazuyoshi Yoshii, Satoshi Kagami.   A Nested Infinite Gaussian Mixture Model for Identifying Known and Unknown Audio Events.   International Workshop on Image and Audio Analysis for Multimedia Interactive services (WIA2MIS), July 2013.   [PDF] [Poster]
  73. Kazuyoshi Yoshii, Ryota Tomioka, Daichi Mochihashi, Masataka Goto.   Infinite Positive Semidefinite Tensor Factorization for Source Separation of Mixture Signals.   International Conference on Machine Learning (ICML), pp. 576–584, June 2013.   [PDF] [Supplementary] [Spotlight] [Poster] [Video]
  74. Kazuyoshi Yoshii, Masataka Goto.   Infinite Kernel Linear Prediction for Joint Estimation of Spectral Envelope and Fundamental Frequency.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 463–467, May 2013.   [PDF] [Poster]
  75. Kazuyoshi Yoshii, Masataka Goto.   Infinite Composite Autoregressive Models for Music Signal Analysis.   International Society for Music Information Retrieval Conference (ISMIR), pp. 79–84, October 2012.   [PDF] [Slides] [Poster]
  76. Masataka Goto, Jun Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano.   PodCastle and Songle: Crowdsourcing-Based Web Services for Retrieval and Browsing of Speech and Music Content.   International Workshop on Crowdsourcing Web Search (CrowdSearch), pp. 1–6, April 2012.   [PDF] [Podcastle] [Songle]
  77. Kazuyoshi Yoshii, Masataka Goto.   Unsupervised Music Understanding Based on Nonparametric Bayesian Models.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 5353–5356, March 2012.   [PDF]
  78. Masataka Goto, Jun Ogata, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano.   PodCastle and Songle: Crowdsourcing-Based Web Services for Spoken Document Retrieval and Active Music Listening.   Information Theory and Applications Workshop (ITA), pp. 298–299, February 2012.   [PDF] [Podcastle] [Songle]
  79. Kazuyoshi Yoshii, Matthias Mauch, Masataka Goto.   A Unified Probabilistic Model of Note Combinations and Chord Progressions.   International Workshop on Music and Machine Learning (MML), December 2011.   [PDF] [Video]
  80. Kazuyoshi Yoshii, Masataka Goto.   A Vocabulary-Free Infinity-Gram Model for Nonparametric Bayesian Chord Progression Analysis.   International Society for Music Information Retrieval Conference (ISMIR), pp. 645–650, October 2011.   [PDF]
  81. Masataka Goto, Kazuyoshi Yoshii, Hiromasa Fujihara, Matthias Mauch, Tomoyasu Nakano.   Songle: A Web Service for Active Music Listening Improved by User Contributions.   International Society for Music Information Retrieval Conference (ISMIR), pp. 311–316, October 2011.   [PDF] [Web Service]
  82. Matthias Mauch, Hiromasa Fujihara, Kazuyoshi Yoshii, Masataka Goto.   Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music.   International Society for Music Information Retrieval Conference (ISMIR), pp. 233–238, October 2011.   [PDF]
  83. Kazuyoshi Yoshii, Masataka Goto.   Infinite Latent Harmonic Allocation: A Nonparametric Bayesian Approach to Multipitch Analysis.   International Society for Music Information Retrieval Conference (ISMIR), pp. 309–314, August 2010.   [PDF]
  84. Kazuyoshi Yoshii, Masataka Goto.   Continuous pLSI and Smoothing Techniques for Hybrid Music Recommendation.   International Society for Music Information Retrieval Conference (ISMIR), pp. 339–344, October 2009.   [PDF]
  85. Kazuyoshi Yoshii, Masataka Goto.   MusicCommentator: Generating Comments Synchronized with Musical Audio Signals by a Joint Probabilistic Model of Acoustic and Textual Features.   International Conference on Entertainment Computing (ICEC), pp. 85–97, September 2009.   [PDF] [Demo]
  86. Kazuyoshi Yoshii, Masataka Goto.   MusicThumbnailer: Visualizing Musical Pieces in Thumbnail Images Based on Acoustic Features.   International Conference on Music Information Retrieval (ISMIR), pp. 211–216, September 2008.   [PDF] [Demo]
  87. Kouhei Sumi, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   Automatic Chord Recognition Based on Probabilistic Integration of Chord Transition and Bass Pitch Estimation.   International Conference on Music Information Retrieval (ISMIR), pp. 39–44, September 2008.   [PDF]
  88. Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino.   A Robot Singer with Music Recognition Based on Real-Time Beat Tracking.   International Conference on Music Information Retrieval (ISMIR), pp. 199–204, September 2008.   [PDF]
  89. Takehiro Abe, Katsutoshi Itoyama, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   Analysis-and-Manipulation Approach to Pitch and Duration of Musical Instrument Sounds without Distorting Timbral Characteristics.   International Conference on Digital Audio Effects (DAFX), pp. 249–256, September 2008.   [PDF] [Slides] [Demo]
  90. Kazumasa Murata, Kazuhiro Nakadai, Kazuyoshi Yoshii, Ryu Takeda, Toyotaka Torii, Hiroshi G. Okuno, Yuji Hasegawa, Hiroshi Tsujino.   A Robot Uses Its Own Microphone to Synchronize Its Steps to Musical Beats while Scatting and Singing.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2459–2464, September 2008.   Award for Entertainment Robots and Systems (NTF Award) Nomination Finalist (4/649).   [PDF]
  91. Takeshi Mizumoto, Ryu Takeda, Kazuyoshi Yoshii, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   A Robot Listens to Music and Counts Its Beats Aloud by Separating Music from Counting Voice.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1538–1543, September 2008.   Award for Entertainment Robots and Systems (NTF Award) Nomination Finalist (4/649).   [PDF]
  92. Kazuyoshi Yoshii, Kazuhiro Nakadai, Toyotaka Torii, Yuji Hasegawa, Hiroshi Tsujino, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   A Biped Robot that Keeps Steps in Time with Musical Beats while Listening to Music with Its Own Ears.   IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1743–1750, October 2007.   [PDF]
  93. Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   Improving Efficiency and Scalability of Model-Based Music Recommender System Based on Incremental Training.   International Conference on Music Information Retrieval (ISMIR), pp. 89–94, September 2007.   [PDF]
  94. Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   Hybrid Collaborative and Content-Based Music Recommendation Using Probabilistic Model with Latent User Preferences.   International Conference on Music Information Retrieval (ISMIR), pp. 296–301, October 2006.   [PDF]
  95. Kazuyoshi Yoshii, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno.   An Error Correction Framework Based on Drum Pattern Periodicity for Improving Drum Sound Detection.   IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. V, pp. 237–240, May 2006.   IEEE Signal Processing Society Japan Chapter Student Paper Award.   [PDF]
  96. Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno.   INTER:D: A Drum Sound Equalizer for Controlling Volume and Timbre of Drums.   European Workshop on the Integration of Knowledge, Semantic and Digital Media Technologies (EWIMT), pp. 205–212, November 2005.   [PDF]
  97. Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno.   Automatic Drum Sound Description for Real-World Music Using Template Adaptation and Matching Methods.   International Conference on Music Information Retrieval (ISMIR), pp. 184–191, October 2004.   [PDF]
  98. Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno.   Drum Sound Identification for Polyphonic Music Using Template Adaptation and Matching Methods.   ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing (SAPA), October 2004.   [PDF]

International Competitions

  1. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   MIREX 2016: Audio Melody Extraction.   Annual Music Information Retrieval Evaluation eXchange (MIREX), August 2016.   [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset] [ORCHSET15 Dataset]
  2. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   MIREX 2015: Singing Voice Separation.   Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2015.   [PDF] [Results]
  3. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   MIREX 2015: Audio Melody Extraction.   Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2015.   [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset] [ORCHSET15 Dataset]
  4. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   MIREX 2014: Singing Voice Separation.   Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2014.   Winner of the Singing Voice Separation Track.   [PDF] [Results]
  5. Yukara Ikemiya, Kazuyoshi Yoshii, Katsutoshi Itoyama.   MIREX 2014: Audio Melody Extraction.   Annual Music Information Retrieval Evaluation eXchange (MIREX), October 2014.   [PDF] [ADC04 Dataset] [MIREX05 Dataset] [INDIAN08 Dataset] [MIREX09 0dB Dataset] [MIREX09 -5dB Dataset] [MIREX09 +5dB Dataset]
  6. Kazuyoshi Yoshii, Masataka Goto, Hiroshi G. Okuno.   MIREX 2005: Audio Drum Detection.   Annual Music Information Retrieval Evaluation eXchange (MIREX), September 2005.   Winner of the Audio Drum Detection Track (MIREX 2005 Best-in-Class Award).   [PDF] [Results]

Author: Kazuyoshi Yoshii (AIST)
Mail: yoshii(at)kuis.kyoto-u.ac.jp
