Affiliated with LTI/CMU.

This is Watanabe’s Audio and Voice (WAV) Lab at the Language Technologies Institute of Carnegie Mellon University. Our research interests include automatic speech recognition, speech enhancement, spoken language understanding, and machine learning for speech and language processing.

Our Lab Party at the Porch before Interspeech, 03.06.2022


Jun 15, 2022 Our lab has 23 papers accepted at Interspeech 2022. A detailed list will be available on the publications page.
Mar 1, 2022 Our lab has 18 papers accepted at ICASSP 2022. A detailed list is already available on the publications page.
Sep 13, 2021 Our lab has 9 papers accepted at ASRU 2021. A detailed list is already available on the publications page.
Jun 7, 2021 Shinji, with Keisuke, Yusuke, and Naoyuki, delivered a tutorial on “Distant Conversational Speech Recognition and Analysis: Recent Advances, and Trends Towards End-to-End Optimization” at ICASSP 2021. The slides can be found here.
Jun 3, 2021 Our lab has 20 papers accepted at Interspeech 2021. A detailed list will be available soon on the publications page.

selected publications

  1. ASR&SD&SLU&ER Interspeech
    SUPERB: Speech processing Universal PERformance Benchmark
    Shu-wen Yang, Po-Han Chi, Yung-Sung Chuang, Cheng-I Lai, Kushal Lakhotia, Yist Y. Lin, Andy T. Liu, Jiatong Shi, Xuankai Chang, Guan-Ting Lin, Tzu-Hsien Huang, Wei-Cheng Tseng, Ko-tik Lee, Da-Rong Liu, Zili Huang, Shuyan Dong, Shang-Wen Li, Shinji Watanabe, Abdelrahman Mohamed, and Hung-yi Lee
    In Proceedings of Interspeech 2021
  2. TTS ICASSP
    ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit
    Tomoki Hayashi, Ryuichi Yamamoto, Katsuki Inoue, Takenori Yoshimura, Shinji Watanabe, Tomoki Toda, Kazuya Takeda, Yu Zhang, and Xu Tan
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
  3. SD ASRU
    End-to-end neural speaker diarization with self-attention
    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe
    In Proceedings of IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
  4. SD Interspeech
    End-to-End Neural Speaker Diarization with Permutation-Free Objectives
    Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe
    In Proceedings of Interspeech 2019
  5. ASR Interspeech
    Improving Transformer Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration
    Shigeki Karita, Nelson Yalta, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, and Tomohiro Nakatani
    In Proceedings of Interspeech 2019
  6. ASR Interspeech
    ESPnet: End-to-End Speech Processing Toolkit
    Shinji Watanabe, Takaaki Hori, Shigeki Karita, Tomoki Hayashi, Jiro Nishitoba, Yuya Unno, Nelson Enrique Yalta Soplin, Jahn Heymann, Matthew Wiesner, Nanxin Chen, Adithya Renduchintala, and Tsubasa Ochiai
    In Proceedings of Interspeech 2018
  7. SE&ASR Interspeech
    The Fifth ’CHiME’ Speech Separation and Recognition Challenge: Dataset, Task and Baselines
    Jon Barker, Shinji Watanabe, Emmanuel Vincent, and Jan Trmal
    In Proceedings of Interspeech 2018
  8. SD Interspeech
    Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge
    Gregory Sell, David Snyder, Alan McCree, Daniel Garcia-Romero, Jesús Villalba, Matthew Maciejewski, Vimal Manohar, Najim Dehak, Daniel Povey, Shinji Watanabe, and others
    In Proceedings of Interspeech 2018