
Zhou Zhao

Zhejiang University

H-index: 55

Asia-China

About Zhou Zhao

Zhou Zhao is a researcher at Zhejiang University with an h-index of 55 overall and 52 since 2020. He specializes in machine learning, data mining, and multimedia computing.

His recent articles include:

Audiogpt: Understanding and generating speech, music, sound, and talking head

Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities

Real3d-portrait: One-shot realistic 3d talking portrait synthesis

Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt

Language Model is a Branch Predictor for Simultaneous Machine Translation

Achieving Cross Modal Generalization with Multimodal Unified Representation

Zhou Zhao Information

University	Zhejiang University
Citations(all)	12319
Citations(since 2020)	11455
Cited By	3296
hIndex(all)	55
hIndex(since 2020)	52
i10Index(all)	176
i10Index(since 2020)	169

Zhou Zhao Skills & Research Interests

Machine Learning

Data Mining

Multimedia Computing

Top articles of Zhou Zhao

Title	Venue	Author(s)	Publication Date
Audiogpt: Understanding and generating speech, music, sound, and talking head	Proceedings of the AAAI Conference on Artificial Intelligence	Rongjie Huang Mingze Li Dongchao Yang Jiatong Shi Xuankai Chang ...	2024/3/24
Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks	Advances in Neural Information Processing Systems	Haoyi Duan Yan Xia Mingze Zhou Li Tang Jieming Zhu ...	2024/2/13
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations	Proceedings of the AAAI Conference on Artificial Intelligence	Yufeng Huang Jiji Tang Zhuo Chen Rongsheng Zhang Xinfeng Zhang ...	2024/3/24
MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities	arXiv preprint arXiv:2404.13322	Kunxi Li Tianyu Zhan Shengyu Zhang Kun Kuang Jiwei Li ...	2024/4/20
Real3d-portrait: One-shot realistic 3d talking portrait synthesis	arXiv preprint arXiv:2401.08503	Zhenhui Ye Tianyun Zhong Yi Ren Jiaqi Yang Weichuang Li ...	2024/1/16
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt	arXiv preprint arXiv:2403.11780	Yongqi Wang Ruofan Hu Rongjie Huang Zhiqing Hong Ruiqi Li ...	2024/3/18
Language Model is a Branch Predictor for Simultaneous Machine Translation		Aoxiong Yin Tianyun Zhong Haoyuan Li Siliang Tang Zhou Zhao	2024/4/14
Achieving Cross Modal Generalization with Multimodal Unified Representation	Advances in Neural Information Processing Systems	Yan Xia Hai Huang Jieming Zhu Zhou Zhao	2024/2/13
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models	arXiv preprint arXiv:2402.12208	Shengpeng Ji Minghui Fang Ziyue Jiang Rongjie Huang Jialong Zuo ...	2024/2/19
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis	Proceedings of the AAAI Conference on Artificial Intelligence	Yu Zhang Rongjie Huang Ruiqi Li JinZheng He Yan Xia ...	2024/3/24
Textrolspeech: A text style control speech corpus with codec language text-to-speech models		Shengpeng Ji Jialong Zuo Minghui Fang Ziyue Jiang Feiyang Chen ...	2024/4/14
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models	arXiv preprint arXiv:2404.14755	Bo Lin Yingjing Xu Xuanwen Bao Zhou Zhao Zuyong Zhang ...	2024/4/23
Multimodal Pretraining, Adaptation, and Generation for Recommendation: A Survey	arXiv preprint arXiv:2404.00621	Qijiong Liu Jieming Zhu Yanting Yang Quanyu Dai Zhaocheng Du ...	2024/3/31
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension	arXiv preprint arXiv:2402.07729	Qian Yang Jin Xu Wenrui Liu Yunfei Chu Ziyue Jiang ...	2024/2/12
Mega-tts: Zero-shot text-to-speech at scale with intrinsic inductive bias	arXiv preprint arXiv:2306.03509	Ziyue Jiang Yi Ren Zhenhui Ye Jinglin Liu Chen Zhang ...	2023/6/6
Unsupervised domain adaptation for video object grounding with cascaded debiasing learning		Mengze Li Haoyu Zhang Juncheng Li Zhou Zhao Wenqiao Zhang ...	2023/10/26
Geneface++: Generalized and stable real-time audio-driven 3d talking face generation	arXiv preprint arXiv:2305.00787	Zhenhui Ye Jinzheng He Ziyue Jiang Rongjie Huang Jiawei Huang ...	2023/5/1
Prosody-tts: Improving prosody with masked autoencoder and conditional diffusion model for expressive text-to-speech		Rongjie Huang Chunlei Zhang Yi Ren Zhou Zhao Dong Yu	2023/7
Semantic-conditioned dual adaptation for cross-domain query-based visual segmentation		Ye Wang Tao Jin Wang Lin Xize Cheng Linjun Li ...	2023/7
Date: Domain adaptive product seeker for e-commerce		Haoyuan Li Hao Jiang Tao Jin Mengyan Li Yan Chen ...	2023