Kai Yu(俞凯)

Shanghai Jiao Tong University

H-index: 49

Asia-China

About Kai Yu(俞凯)

Kai Yu(俞凯) is a researcher at Shanghai Jiao Tong University with an h-index of 49 overall and 39 since 2020. He specializes in dialogue systems, speech recognition, speech synthesis, natural language processing, and machine learning.

His recent articles include:

ChemDFM: Dialogue Foundation Model for Chemistry

VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback

Label-Aware Auxiliary Learning for Dialogue State Tracking

Scieval: A multi-level large language model evaluation benchmark for scientific research

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Kai Yu(俞凯) Information

University

Position

___

Citations (all): 8314
Citations (since 2020): 5084
Cited by: 4910
h-index (all): 49
h-index (since 2020): 39
i10-index (all): 167
i10-index (since 2020): 127

Kai Yu(俞凯) Skills & Research Interests

dialogue system

speech recognition

speech synthesis

natural language processing

machine learning

Top articles of Kai Yu(俞凯)

ChemDFM: Dialogue Foundation Model for Chemistry
Journal: arXiv preprint arXiv:2401.14818
Author(s): Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, ...
Publication Date: 2024/1/26

VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching
Author(s): Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu
Publication Date: 2024/4/14

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech
Journal: arXiv preprint arXiv:2401.14321
Author(s): Chenpeng Du, Yiwei Guo, Hankun Wang, Yifan Yang, Zhikang Niu, ...
Publication Date: 2024/1/25

Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding
Author(s): Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, ...
Publication Date: 2024/3/4

Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback
Journal: arXiv preprint arXiv:2403.18349
Author(s): Hongshen Xu, Zichen Zhu, Da Ma, Situo Zhang, Shuai Fan, ...
Publication Date: 2024/3/27

Label-Aware Auxiliary Learning for Dialogue State Tracking
Author(s): Yuncong Liu, Lu Chen, Kai Yu
Publication Date: 2024/4/14

Scieval: A multi-level large language model evaluation benchmark for scientific research
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Author(s): Liangtai Sun, Yang Han, Zihan Zhao, Da Ma, Zhennan Shen, ...
Publication Date: 2024/3/24

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS
Author(s): Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, ...
Publication Date: 2024/4/14

Large Language Models Are Semi-Parametric Reinforcement Learning Agents
Journal: Advances in Neural Information Processing Systems
Author(s): Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, ...
Publication Date: 2024/2/13

Acoustic bpe for speech generation with discrete tokens
Author(s): Feiyu Shen, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu
Publication Date: 2024/4/14

UniCATS: A unified context-aware text-to-speech framework with contextual vq-diffusion and vocoding
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Author(s): Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, ...
Publication Date: 2024/3/24

Multi: Multimodal Understanding Leaderboard with Text and Images
Journal: arXiv preprint arXiv:2402.03173
Author(s): Zichen Zhu, Yang Xu, Lu Chen, Jingkai Yang, Yichuan Ma, ...
Publication Date: 2024/2/5

SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
Author(s): Junjie Li, Yiwei Guo, Xie Chen, Kai Yu
Publication Date: 2024/4/14

StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations
Author(s): Sen Liu, Yiwei Guo, Xie Chen, Kai Yu
Publication Date: 2024/4/14

DIR: A Large-Scale Dialogue Rewrite Dataset for Cross-Domain Conversational Text-to-SQL
Journal: Applied Sciences
Author(s): Jieyu Li, Zhi Chen, Lu Chen, Zichen Zhu, Hanqi Li, ...
Publication Date: 2023/2/9

Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Journal: arXiv preprint arXiv:2306.01533
Author(s): Zeyu Xie, Xuenan Xu, Mengyue Wu, Kai Yu
Publication Date: 2023/6/2

Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge
Author(s): Tao Liu, Zhengyang Chen, Yanmin Qian, Kai Yu
Publication Date: 2023/6/4

Speaker Adaptive Text-to-Speech with Timbre-Normalized Vector-Quantized Feature
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Author(s): Chenpeng Du, Yiwei Guo, Xie Chen, Kai Yu
Publication Date: 2023/8/24

Iterative Noisy-Target Approach: Speech Enhancement Without Clean Speech
Author(s): Yifan Zhang, Wenbin Jiang, Qing Zhuo, Kai Yu
Publication Date: 2023/12/8

On the Structural Generalization in Text-to-SQL
Journal: arXiv preprint arXiv:2301.04790
Author(s): Jieyu Li, Lu Chen, Ruisheng Cao, Su Zhu, Hongshen Xu, ...
Publication Date: 2023/1/12

Co-Authors

Philip Woodland
University of Cambridge
H-index: 74

Mark Gales
University of Cambridge
H-index: 67

Milica Gasic
Heinrich-Heine-Universität Düsseldorf
H-index: 40

Yanmin Qian
Shanghai Jiao Tong University
H-index: 40

Nanxin Chen
Johns Hopkins University
H-index: 26

Shuai Wang
Shanghai Jiao Tong University
H-index: 22