no code implementations • 28 Apr 2024 • Lingxia Zhang, Xiaodie Lin, Peidong Wang, Kaiyan Yang, Xiao Zeng, Zhaohui Wei, Zizhu Wang
Optimization is one of the keystones of modern science and engineering.
1 code implementation • 20 Jan 2024 • Yiqun Zhang, Fanheng Kong, Peidong Wang, Shuang Sun, Lingshuai Wang, Shi Feng, Daling Wang, Yifei Zhang, Kaisong Song
Stickers, while widely recognized for enhancing empathetic communication in online interactions, remain underexplored in current empathetic dialogue research, notably due to the challenge of a lack of comprehensive datasets.
no code implementations • 23 Oct 2023 • Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li, Yashesh Gaur
The growing need for instant spoken language transcription and translation is driven by increased global communication and cross-lingual interactions.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 6 Oct 2023 • Junkun Chen, Jian Xue, Peidong Wang, Jing Pan, Jinyu Li
Simultaneous Speech-to-Text translation serves a critical role in real-time crosslingual communication.
1 code implementation • 14 Sep 2023 • Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka
End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges such as speaker diarization (SD) without accurate word time stamps and handling of overlapping speech in a streaming fashion.
no code implementations • 7 Jul 2023 • Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Jinyu Li, Yashesh Gaur
In real-world applications, users often require both translations and transcriptions of speech to enhance their comprehension, particularly in streaming scenarios where incremental generation is necessary.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 1 Mar 2023 • Eric Sun, Jinyu Li, Yuxuan Hu, Yimeng Zhu, Long Zhou, Jian Xue, Peidong Wang, Linquan Liu, Shujie Liu, Edward Lin, Yifan Gong
We propose gated language experts and curriculum training to enhance multilingual transformer transducer models without requiring language identification (LID) input from users during inference.
no code implementations • 10 Nov 2022 • Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang
In this paper, we investigate SSL for streaming multi-talker speech recognition, which generates transcriptions of overlapping speakers in a streaming fashion.
no code implementations • 5 Nov 2022 • Peidong Wang, Eric Sun, Jian Xue, Yu Wu, Long Zhou, Yashesh Gaur, Shujie Liu, Jinyu Li
In this paper, we propose LAMASSU, a streaming language-agnostic multilingual speech recognition and translation model using neural transducers.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +4
no code implementations • 4 Nov 2022 • Jian Xue, Peidong Wang, Jinyu Li, Eric Sun
In this paper, we introduce our work of building a Streaming Multilingual Speech Model (SM2), which can transcribe or translate multiple spoken languages into texts of the target language.
no code implementations • 27 Apr 2022 • Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei
Recently, self-supervised learning (SSL) has demonstrated strong performance in speaker recognition, even if the pre-training objective is designed for speech recognition.
1 code implementation • 11 Apr 2022 • Jian Xue, Peidong Wang, Jinyu Li, Matt Post, Yashesh Gaur
Neural transducers have been widely used in automatic speech recognition (ASR).
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3
no code implementations • 1 Mar 2022 • Yufeng Yang, Peidong Wang, DeLiang Wang
The proposed model builds on the wide residual bi-directional long short-term memory network (WRBN) with utterance-wise dropout and iterative speaker adaptation, but employs a Conformer encoder instead of the recurrent network.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 29 Oct 2021 • Glenn Liu, Peidong Wang, Matthew Beveridge, Young-Oh Kwon, Iddo Drori
Atlantic Multidecadal Variability (AMV) describes variations of North Atlantic sea surface temperature with a typical cycle of between 60 and 70 years.
no code implementations • 28 Oct 2021 • Yixuan Zhang, Zhuo Chen, Jian Wu, Takuya Yoshioka, Peidong Wang, Zhong Meng, Jinyu Li
In this paper, we propose to apply recurrent selective attention network (RSAN) to CSS, which generates a variable number of output channels based on active speaker counting.
1 code implementation • ICCV 2021 • Yuwei Cheng, Jiannan Zhu, Mengxin Jiang, Jie Fu, Changsong Pang, Peidong Wang, Kris Sankaran, Olawale Onabola, Yimin Liu, Dianbo Liu, Yoshua Bengio
To promote the practical application for autonomous floating wastes cleaning, we present FloW, the first dataset for floating waste detection in inland water areas.
no code implementations • 9 Nov 2020 • Peidong Wang, DeLiang Wang
On-device end-to-end speech recognition poses a high requirement on model efficiency.
no code implementations • 27 Oct 2020 • Peidong Wang, Tara N. Sainath, Ron J. Weiss
We propose a multitask training method for attention-based end-to-end speech recognition models.
no code implementations • 20 Oct 2020 • Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong
We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework leveraging speaker profiles and estimated speech for speaker separation.
2 code implementations • 4 Oct 2020 • Zhong-Qiu Wang, Peidong Wang, DeLiang Wang
Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
no code implementations • 11 Mar 2019 • Peidong Wang, Ke Tan, DeLiang Wang
In this study, we analyze the distortion problem, compare different acoustic models, and investigate a distortion-independent training scheme for monaural speech recognition.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
no code implementations • 14 Dec 2016 • Peidong Wang, DeLiang Wang
This paper proposed a class of novel Deep Recurrent Neural Networks which can incorporate language-level information into acoustic models.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1
no code implementations • 14 Dec 2016 • Peidong Wang, Zhongqiu Wang, DeLiang Wang
This paper presented our work on applying Recurrent Deep Stacking Networks (RDSNs) to Robust Automatic Speech Recognition (ASR) tasks.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +1