Wangyou Zhang

Shanghai Jiao Tong University

H-index: 14

Asia-China

About Wangyou Zhang

Wangyou Zhang is a researcher at Shanghai Jiao Tong University with an h-index of 14 overall and 14 since 2020. He specializes in signal processing, speech separation, speech enhancement, and robust speech recognition.

His recent articles include:

Improving Design of Input Condition Invariant Speech Enhancement

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

End-to-End Multi-speaker ASR with Independent Vector Analysis

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation

Exploring Time-Frequency Domain Target Speaker Extraction For Causal and Non-Causal Processing

Joint prediction and denoising for large-scale multilingual self-supervised learning

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, And Extraction

Wangyou Zhang Information

University	Shanghai Jiao Tong University
Position	Ph.D. candidate, Department of Computer Science and Engineering
Citations (all)	1490
Citations (since 2020)	1490
Cited by	307
h-index (all)	14
h-index (since 2020)	14
i10-index (all)	15
i10-index (since 2020)	15

Wangyou Zhang Skills & Research Interests

Signal Processing

Speech Separation

Speech Enhancement

Robust Speech Recognition

Top articles of Wangyou Zhang

Title	Venue	Author(s)	Publication Date
Improving Design of Input Condition Invariant Speech Enhancement	IEEE ICASSP 2024	Wangyou Zhang*, Jee-weon Jung*, Shinji Watanabe, Yanmin Qian	2024/4/14
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition	arXiv preprint arXiv:2401.18045	Yihan Wu, Soumi Maiti, Yifan Peng, Wangyou Zhang, Chenda Li, ...	2024/1/31
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models	arXiv preprint arXiv:2401.17230	Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, ...	2024/1/30
End-to-End Multi-speaker ASR with Independent Vector Analysis		Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian	2023/1/9
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation		Yoshiki Masuyama, Xuankai Chang, Wangyou Zhang, Samuele Cornell, Zhong-Qiu Wang, ...	2023/10/22
Exploring Time-Frequency Domain Target Speaker Extraction For Causal and Non-Causal Processing		Wangyou Zhang, Lei Yang, Yanmin Qian	2023/12/16
Joint prediction and denoising for large-scale multilingual self-supervised learning	arXiv preprint arXiv:2309.15317	William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, ...	2023/9/26
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, And Extraction		Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, ...	2023/12/16
Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition	arXiv preprint arXiv:2305.16286	Wangyou Zhang, Yanmin Qian	2023/5/25
Toward Universal Speech Enhancement For Diverse Input Conditions		Wangyou Zhang, Kohei Saijo, Zhong-Qiu Wang, Shinji Watanabe, Yanmin Qian	2023/12/16
A Heterogeneous Graph to Abstract Syntax Tree Framework for Text-to-SQL	IEEE Transactions on Pattern Analysis and Machine Intelligence	Ruisheng Cao, Lu Chen, Jieyu Li, Hanchong Zhang, Hongshen Xu, ...	2023/7/26
Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data		Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, ...	2023/12/16
Two-Stage Single-Channel Speech Enhancement with Multi-Frame Filtering	Applied Sciences	Shaoxiong Lin, Wangyou Zhang, Yanmin Qian	2023/4/14
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing	Journal of Open Source Software	Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, ...	2023/11/20
Exploring Effective Data Utilization for Low-Resource Speech Recognition		Zhikai Zhou, Wei Wang, Wangyou Zhang, Yanmin Qian	2022/5/23
Text-Informed Knowledge Distillation for Robust Speech Enhancement and Recognition		Wei Wang, Wangyou Zhang, Shaoxiong Lin, Yanmin Qian	2022/12/11
Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPnet-SE Submission to the L3DAS22 Challenge		Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, ...	2022/5/23
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding	arXiv preprint arXiv:2207.09514	Yen-Ju Lu, Xuankai Chang, Chenda Li, Wangyou Zhang, Samuele Cornell, ...	2022/7/19
Separating Long-Form Speech with Group-Wise Permutation Invariant Training	arXiv preprint arXiv:2110.14142	Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, ...	2021/10/27
End-to-End Dereverberation, Beamforming, and Speech Recognition in a Cocktail Party	IEEE/ACM Transactions on Audio, Speech, and Language Processing	Wangyou Zhang, Xuankai Chang, Christoph Boeddeker, Tomohiro Nakatani, Shinji Watanabe, ...	2022/9/27