Kai Yu(俞凯)
Shanghai Jiao Tong University
H-index: 49
Asia-China
Top articles of Kai Yu(俞凯)
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
ChemDFM: Dialogue Foundation Model for Chemistry | arXiv preprint arXiv:2401.14818 | Zihan Zhao Da Ma Lu Chen Liangtai Sun Zihao Li | 2024/1/26 |
VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching | | Yiwei Guo Chenpeng Du Ziyang Ma Xie Chen Kai Yu | 2024/4/14 |
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech | arXiv preprint arXiv:2401.14321 | Chenpeng Du Yiwei Guo Hankun Wang Yifan Yang Zhikang Niu | 2024/1/25 |
Hierarchical Multimodal Pre-training for Visually Rich Webpage Understanding | | Hongshen Xu Lu Chen Zihan Zhao Da Ma Ruisheng Cao | 2024/3/4 |
Rejection Improves Reliability: Training LLMs to Refuse Unknown Questions Using RL from Knowledge Feedback | arXiv preprint arXiv:2403.18349 | Hongshen Xu Zichen Zhu Da Ma Situo Zhang Shuai Fan | 2024/3/27 |
Label-Aware Auxiliary Learning for Dialogue State Tracking | | Yuncong Liu Lu Chen Kai Yu | 2024/4/14 |
SciEval: A multi-level large language model evaluation benchmark for scientific research | Proceedings of the AAAI Conference on Artificial Intelligence | Liangtai Sun Yang Han Zihan Zhao Da Ma Zhennan Shen | 2024/3/24 |
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS | | Yifan Yang Feiyu Shen Chenpeng Du Ziyang Ma Kai Yu | 2024/4/14 |
Large Language Models Are Semi-Parametric Reinforcement Learning Agents | Advances in Neural Information Processing Systems | Danyang Zhang Lu Chen Situo Zhang Hongshen Xu Zihan Zhao | 2024/2/13 |
Acoustic BPE for speech generation with discrete tokens | | Feiyu Shen Yiwei Guo Chenpeng Du Xie Chen Kai Yu | 2024/4/14 |
UniCATS: A unified context-aware text-to-speech framework with contextual VQ-diffusion and vocoding | Proceedings of the AAAI Conference on Artificial Intelligence | Chenpeng Du Yiwei Guo Feiyu Shen Zhijun Liu Zheng Liang | 2024/3/24 |
MULTI: Multimodal Understanding Leaderboard with Text and Images | arXiv preprint arXiv:2402.03173 | Zichen Zhu Yang Xu Lu Chen Jingkai Yang Yichuan Ma | 2024/2/5 |
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention | | Junjie Li Yiwei Guo Xie Chen Kai Yu | 2024/4/14 |
StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations | | Sen Liu Yiwei Guo Xie Chen Kai Yu | 2024/4/14 |
DIR: A Large-Scale Dialogue Rewrite Dataset for Cross-Domain Conversational Text-to-SQL | Applied Sciences | Jieyu Li Zhi Chen Lu Chen Zichen Zhu Hanqi Li | 2023/2/9 |
Enhance Temporal Relations in Audio Captioning with Sound Event Detection | arXiv preprint arXiv:2306.01533 | Zeyu Xie Xuenan Xu Mengyue Wu Kai Yu | 2023/6/2 |
Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge | | Tao Liu Zhengyang Chen Yanmin Qian Kai Yu | 2023/6/4 |
Speaker Adaptive Text-to-Speech with Timbre-Normalized Vector-Quantized Feature | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Chenpeng Du Yiwei Guo Xie Chen Kai Yu | 2023/8/24 |
Iterative Noisy-Target Approach: Speech Enhancement Without Clean Speech | | Yifan Zhang Wenbin Jiang Qing Zhuo Kai Yu | 2023/12/8 |
On the Structural Generalization in Text-to-SQL | arXiv preprint arXiv:2301.04790 | Jieyu Li Lu Chen Ruisheng Cao Su Zhu Hongshen Xu | 2023/1/12 |