affiliated with LTI/CMU.

This is Watanabe’s Audio and Voice (WAV) Lab at the Language Technologies Institute of Carnegie Mellon University. Our research interests include automatic speech recognition, speech enhancement, spoken language understanding, and machine learning for speech and language processing.

Our Lab Party at the Porch before Interspeech, 03.06.2022


Jun 15, 2022 Our lab has 23 papers accepted at Interspeech 2022. A detailed list will be available on the publication page.
Mar 1, 2022 Our lab has 18 papers accepted at ICASSP 2022. A detailed list is already available on the publication page.
Sep 13, 2021 Our lab has 9 papers accepted at ASRU 2021. A detailed list is already available on the publication page.
Jun 7, 2021 Shinji, with Keisuke, Yusuke, and Naoyuki, delivered a tutorial on “Distant Conversational Speech Recognition And Analysis: Recent Advances, And Trends Towards End-To-End Optimization” at ICASSP 2021. Detailed slides can be found here.
Jun 3, 2021 Our lab has 20 papers accepted at Interspeech 2021. A detailed list will be available soon on the publication page.

selected publications

  1. SD CSL
    A review of speaker diarization: Recent advances with deep learning
    Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J Han, Shinji Watanabe, and Shrikanth Narayanan
    Computer Speech & Language 2022
  2. ASR&SD&SLU&ER Interspeech
    SUPERB: Speech processing Universal PERformance Benchmark
    Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi Lee
    In Proceedings of Interspeech 2021
  3. TTS ICASSP
    ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit
    Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, and Xu Tan
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
  4. ST ACL
    ESPnet-ST: All-in-One Speech Translation Toolkit
    Hirofumi Inaguma, Shun Kiyono, Kevin Duh, Shigeki Karita, Nelson Yalta, Tomoki Hayashi, and Shinji Watanabe
    In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2020
  5. SD Interspeech
    End-to-end speaker diarization for an unknown number of speakers with encoder-decoder based attractors
    Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, and Kenji Nagamatsu
    In Proceedings of Interspeech 2020
  6. SD ASRU
    End-to-end neural speaker diarization with self-attention
    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
    In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
  7. SD Interspeech
    End-to-End Neural Speaker Diarization with Permutation-Free Objectives
    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe
    In Proceedings of Interspeech 2019
  8. ASR Interspeech
    Improving Transformer Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration
    Shigeki Karita, Nelson Yalta, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, and Tomohiro Nakatani
    In Proceedings of Interspeech 2019
  9. ASR Interspeech
    ESPnet: End-to-End Speech Processing Toolkit
    Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, and Tsubasa Ochiai
    In Proceedings of Interspeech 2018
  10. SE&ASR Interspeech
    The Fifth ’CHiME’ Speech Separation and Recognition Challenge: Dataset, Task and Baselines
    Jon Barker, Shinji Watanabe, Emmanuel Vincent, and Jan Trmal
    In Proceedings of Interspeech 2018
  11. SD Interspeech
    Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge
    Gregory Sell, David Snyder, Alan McCree, Daniel Garcia-Romero, Jesús Villalba, Matthew Maciejewski, Vimal Manohar, Najim Dehak, Daniel Povey, Shinji Watanabe, and others
    In Proceedings of Interspeech 2018