WAVLab | 2019 Papers

ASR ASRU

Transformer ASR with contextual block processing

Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ASR ASRU

MIMO-Speech: End-to-end multi-channel multi-speaker speech recognition

Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ST ASRU

Multilingual end-to-end speech translation

Hirofumi Inaguma, Kevin Duh, Tatsuya Kawahara, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ASR+SD ASRU

Simultaneous speech recognition and speaker diarization for monaural dialogue recordings with target-speaker acoustic models

Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ASR ASRU

Espresso: A fast end-to-end neural speech recognition toolkit

Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, and Sanjeev Khudanpur

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ASR ASRU

A comparative study on Transformer vs RNN in speech applications

Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, and others

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
SD ASRU

End-to-end neural speaker diarization with self-attention

Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, and Shinji Watanabe

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
ASR ASRU

Multi-stream end-to-end speech recognition

Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Shinji Watanabe, Takaaki Hori, and Hynek Hermansky

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2019
SS WASPAA

Analysis of robustness of deep single-channel speech separation using corpora constructed from multiple domains

Matthew Maciejewski, Gregory Sell, Yusuke Fujita, Leibny Paola Garcia-Perera, Shinji Watanabe, and Sanjeev Khudanpur

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2019
ASR WASPAA

Generalized weighted-prediction-error dereverberation with varying source priors for reverberant speech recognition

Toru Taniguchi, Aswin Shanmugam Subramanian, Xiaofei Wang, Dung Tran, Yuya Fujita, and Shinji Watanabe

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2019
ASR WASPAA

Speech enhancement using end-to-end speech recognition objectives

Aswin Shanmugam Subramanian, Xiaofei Wang, Murali Karthick Baskar, Shinji Watanabe, Toru Taniguchi, Dung Tran, and Yuya Fujita

In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) 2019
ASR Interspeech

End-to-End Multilingual Multi-Speaker Speech Recognition

In Proceedings of Interspeech 2019
ASR Interspeech

Pretraining by Backtranslation for End-to-end ASR in Low-Resource Settings

In Proceedings of Interspeech 2019
TTS Interspeech

Pre-trained Text Embeddings for Enhanced Text-to-Speech Synthesis

In Proceedings of Interspeech 2019
SD Interspeech

End-to-End Neural Speaker Diarization with Permutation-Free Objectives

Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, and Shinji Watanabe

In Proceedings of Interspeech 2019
ASR Interspeech

Analysis of Multilingual Sequence-to-Sequence speech recognition systems

Murali Karthick Baskar, Martin Karafiat, Shinji Watanabe, Takaaki Hori, Matthew Wiesner, and Jan Černocký

In Proceedings of Interspeech 2019
ASR Interspeech

End-to-end SpeakerBeam for single channel target speech recognition

Marc Delcroix, Shinji Watanabe, Tsubasa Ochiai, Keisuke Kinoshita, Shigeki Karita, Atsunori Ogawa, and Tomohiro Nakatani

In Proceedings of Interspeech 2019
ASR Interspeech

Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text

Murali Karthick Baskar, Shinji Watanabe, Ramón Astudillo, Takaaki Hori, Lukas Burget, and Jan Černocký

In Proceedings of Interspeech 2019
ASR Interspeech

Study of the performance of automatic speech recognition systems in speakers with Parkinson’s Disease

Laureano Moro Velazquez, Jaejin Cho, Shinji Watanabe, Mark Hasegawa-Johnson, Odette Scharenborg, Kim Heejin, and Najim Dehak

In Proceedings of Interspeech 2019
ASR Interspeech

Vectorized Beam Search for CTC-Attention-based Speech Recognition

Hiroshi Seki, Takaaki Hori, Shinji Watanabe, Niko Moritz, and Jonathan Le Roux

In Proceedings of Interspeech 2019
ASR Interspeech

Speaker recognition benchmark using the CHiME-5 corpus

Daniel Garcia-Romero, David Snyder, Shinji Watanabe, Gregory Sell, Alan McCree, Dan Povey, and Sanjeev Khudanpur

In Proceedings of Interspeech 2019
ASR Interspeech

Improving Transformer Based End-to-End Speech Recognition with Connectionist Temporal Classification and Language Model Integration

Shigeki Karita, Nelson Yalta, Shinji Watanabe, Marc Delcroix, Atsunori Ogawa, and Tomohiro Nakatani

In Proceedings of Interspeech 2019
ASR Interspeech

Interference Speaker Loss for Target-Speaker Speech Recognition

Naoyuki Kanda, Shota Horiguchi, Ryoichi Takashima, Yusuke Fujita, Kenji Nagamatsu, and Shinji Watanabe

In Proceedings of Interspeech 2019
ASR EUSIPCO

CNN-based multichannel end-to-end speech recognition for everyday home environments

Nelson Yalta, Shinji Watanabe, Takaaki Hori, Kazuhiro Nakadai, and Tetsuya Ogata

In 27th European Signal Processing Conference (EUSIPCO) 2019
OCR ICDAR

Using ASR methods for OCR

Ashish Arora, Chun Chieh Chang, Babak Rekabdar, Bagher BabaAli, Daniel Povey, David Etter, Desh Raj, Hossein Hadian, Jan Trmal, Paola Garcia, and others

In International Conference on Document Analysis and Recognition (ICDAR) 2019
Music IJCNN

Weakly-supervised deep recurrent neural networks for basic dance step generation

Nelson Yalta, Shinji Watanabe, Kazuhiro Nakadai, and Tetsuya Ogata

In International Joint Conference on Neural Networks (IJCNN) 2019
ASR NAACL

Massively Multilingual Adversarial Speech Recognition

Oliver Adams, Matthew Wiesner, Shinji Watanabe, and David Yarowsky

In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2019
ASR ICASSP

Promising accurate prefix boosting for sequence-to-sequence ASR

Murali Karthick Baskar, Lukáš Burget, Shinji Watanabe, Martin Karafiát, Takaaki Hori, and Jan Honza Černocký

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Transfer learning of language-independent end-to-end ASR with language model fusion

Hirofumi Inaguma, Jaejin Cho, Murali Karthick Baskar, Tatsuya Kawahara, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Improving end-to-end speech recognition with pronunciation-assisted sub-word modeling

Hainan Xu, Shuoyang Ding, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Language model integration based on memory control for sequence to sequence speech recognition

Jaejin Cho, Shinji Watanabe, Takaaki Hori, Murali Karthick Baskar, Hirofumi Inaguma, Jesus Villalba, and Najim Dehak

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Stream attention-based multi-array end-to-end speech recognition

Xiaofei Wang, Ruizhi Li, Sri Harish Mallidi, Takaaki Hori, Shinji Watanabe, and Hynek Hermansky

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Acoustic modeling for overlapping speech recognition: JHU CHiME-5 challenge system

Vimal Manohar, Szu-Jui Chen, Zhiqi Wang, Yusuke Fujita, Shinji Watanabe, and Sanjeev Khudanpur

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Cycle-consistency training for end-to-end speech recognition

Takaaki Hori, Ramon Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, and Jonathan Le Roux

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
AED ICASSP

Joint acoustic and class inference for weakly supervised sound event detection

Sandeep Kothinti, Keisuke Imoto, Debmalya Chakrabarty, Gregory Sell, Shinji Watanabe, and Mounya Elhilali

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
SE ICASSP

The phasebook: Building complex masks via discrete representations for source separation

Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy Sarroff, and John R Hershey

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

End-to-end monaural multi-speaker ASR system without pretraining

Xuankai Chang, Yanmin Qian, Kai Yu, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Semi-supervised end-to-end speech recognition using text-to-speech and autoencoders

Shigeki Karita, Shinji Watanabe, Tomoharu Iwata, Marc Delcroix, Atsunori Ogawa, and Tomohiro Nakatani

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
ASR ICASSP

Acoustic modeling for distant multi-talker speech recognition with single- and multi-channel branches

Naoyuki Kanda, Yusuke Fujita, Shota Horiguchi, Rintaro Ikeshita, Kenji Nagamatsu, and Shinji Watanabe

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019