Zhiyong WU (吴志勇)

Tsinghua University

H-index: 27

About Zhiyong WU (吴志勇)

Zhiyong WU (吴志勇) is a distinguished researcher at Tsinghua University with an h-index of 27 overall and 25 since 2020. His research focuses on speech synthesis and deep learning.

His recent articles reflect a diverse array of research interests and contributions to the field:

Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction

Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts

Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models

SCNet: Sparse Compression Network for Music Source Separation

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model

Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style Information

Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations

Zhiyong WU (吴志勇) Information

University: Tsinghua University

Position: Associate Professor

Citations (all): 3100

Citations (since 2020): 2427

Cited By: 1255

h-index (all): 27

h-index (since 2020): 25

i10-index (all): 90

i10-index (since 2020): 73

Email:

University Profile Page: Tsinghua University

Google Scholar: View Google Scholar Profile

Zhiyong WU (吴志勇) Skills & Research Interests

Speech synthesis

Deep learning

Top articles of Zhiyong WU (吴志勇)

Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction
Journal: arXiv preprint arXiv:2401.17796
Authors: Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, et al.
Publication Date: 2024/1/31

Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Authors: Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, et al.
Publication Date: 2024/4/14

Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models
Authors: Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, et al.
Publication Date: 2024/4/14

SCNet: Sparse Compression Network for Music Source Separation
Journal: arXiv preprint arXiv:2401.13276
Authors: Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, et al.
Publication Date: 2024/1/24

Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
Journal: arXiv preprint arXiv:2404.01862
Authors: Xu He, Qiaochu Huang, Zhensong Zhang, Zhiwei Lin, Zhiyong Wu, et al.
Publication Date: 2024/4/2

Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style Information
Authors: Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, et al.
Publication Date: 2024/4/14

Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation
Journal: arXiv preprint arXiv:2401.07532
Authors: Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, et al.
Publication Date: 2024/1/15

Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Authors: Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, et al.
Publication Date: 2024/3/25

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation
Authors: Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, et al.
Publication Date: 2024/4/14

SimCalib: Graph Neural Network Calibration based on Similarity between Nodes
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Authors: Boshi Tang, Zhiyong Wu, Xixin Wu, Qiaochu Huang, Jun Chen, et al.
Publication Date: 2024/3/24

Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness
Journal: arXiv preprint arXiv:2401.03476
Authors: Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, et al.
Publication Date: 2024/1/7

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-based Approach for One-shot Singing Voice Conversion
Authors: Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng
Publication Date: 2024/4/14

SECap: Speech Emotion Captioning with Large Language Model
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Authors: Yaoxun Xu, Hangting Chen, Jianwei Yu, Qiaochu Huang, Zhiyong Wu, et al.
Publication Date: 2024/3/24

StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis
Authors: Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, et al.
Publication Date: 2024/4/14

Inter-Subnet: Speech Enhancement with Subband Interaction
Authors: Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, et al.
Publication Date: 2023/6/4

WavSyncSwap: End-To-End Portrait-Customized Audio-Driven Talking Face Generation
Authors: Weihong Bao, Liyang Chen, Chaoyong Zhou, Sicheng Yang, Zhiyong Wu
Publication Date: 2023/6/4

SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias
Authors: Sipan Li, Songxiang Liu, Luwen Zhang, Xiang Li, Yanyao Bian, et al.
Publication Date: 2023/7/10

Text-Only Domain Adaptation for End-to-End Speech Recognition through Down-Sampling Acoustic Representation
Authors: Jiaxu Zhu, Weinan Tong, Yaoxun Xu, Changhe Song, Zhiyong Wu, et al.
Publication Date: 2023/8/22

Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model
Authors: Xiang Li, Songxiang Liu, Max WY Lam, Zhiyong Wu, Chao Weng, et al.
Publication Date: 2023

AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation
Journal: arXiv preprint arXiv:2310.07236
Authors: Liyang Chen, Weihong Bao, Shun Lei, Boshi Tang, Zhiyong Wu, et al.
Publication Date: 2023/10/11

Co-Authors

Hung-yi Lee, National Taiwan University (H-index: 47)

Jia Jia (贾珈), Tsinghua University (H-index: 38)

Xixin Wu, University of Cambridge (H-index: 19)

Yishuang Ning, Tsinghua University (H-index: 9)

Yixuan Zhou (周逸轩), Tsinghua University (H-index: 5)

Xi Ma, Tsinghua University (H-index: 5)
