Zhou Zhao
Zhejiang University
H-index: 55
Asia-China
Top articles of Zhou Zhao
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
AudioGPT: Understanding and generating speech, music, sound, and talking head | Proceedings of the AAAI Conference on Artificial Intelligence | Rongjie Huang Mingze Li Dongchao Yang Jiatong Shi Xuankai Chang | 2024/3/24 |
Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks | Advances in Neural Information Processing Systems | Haoyi Duan Yan Xia Mingze Zhou Li Tang Jieming Zhu | 2024/2/13 |
Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations | Proceedings of the AAAI Conference on Artificial Intelligence | Yufeng Huang Jiji Tang Zhuo Chen Rongsheng Zhang Xinfeng Zhang | 2024/3/24 |
MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities | arXiv preprint arXiv:2404.13322 | Kunxi Li Tianyu Zhan Shengyu Zhang Kun Kuang Jiwei Li | 2024/4/20 |
Real3D-Portrait: One-shot realistic 3D talking portrait synthesis | arXiv preprint arXiv:2401.08503 | Zhenhui Ye Tianyun Zhong Yi Ren Jiaqi Yang Weichuang Li | 2024/1/16 |
Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt | arXiv preprint arXiv:2403.11780 | Yongqi Wang Ruofan Hu Rongjie Huang Zhiqing Hong Ruiqi Li | 2024/3/18 |
Language Model is a Branch Predictor for Simultaneous Machine Translation | | Aoxiong Yin Tianyun Zhong Haoyuan Li Siliang Tang Zhou Zhao | 2024/4/14 |
Achieving Cross Modal Generalization with Multimodal Unified Representation | Advances in Neural Information Processing Systems | Yan Xia Hai Huang Jieming Zhu Zhou Zhao | 2024/2/13 |
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models | arXiv preprint arXiv:2402.12208 | Shengpeng Ji Minghui Fang Ziyue Jiang Rongjie Huang Jialung Zuo | 2024/2/19 |
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis | Proceedings of the AAAI Conference on Artificial Intelligence | Yu Zhang Rongjie Huang Ruiqi Li Jinzheng He Yan Xia | 2024/3/24 |
TextrolSpeech: A text style control speech corpus with codec language text-to-speech models | | Shengpeng Ji Jialong Zuo Minghui Fang Ziyue Jiang Feiyang Chen | 2024/4/14 |
SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models | arXiv preprint arXiv:2404.14755 | Bo Lin Yingjing Xu Xuanwen Bao Zhou Zhao Zuyong Zhang | 2024/4/23 |
Multimodal Pretraining, Adaptation, and Generation for Recommendation: A Survey | arXiv preprint arXiv:2404.00621 | Qijiong Liu Jieming Zhu Yanting Yang Quanyu Dai Zhaocheng Du | 2024/3/31 |
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | arXiv preprint arXiv:2402.07729 | Qian Yang Jin Xu Wenrui Liu Yunfei Chu Ziyue Jiang | 2024/2/12 |
Mega-TTS: Zero-shot text-to-speech at scale with intrinsic inductive bias | arXiv preprint arXiv:2306.03509 | Ziyue Jiang Yi Ren Zhenhui Ye Jinglin Liu Chen Zhang | 2023/6/6 |
Unsupervised domain adaptation for video object grounding with cascaded debiasing learning | | Mengze Li Haoyu Zhang Juncheng Li Zhou Zhao Wenqiao Zhang | 2023/10/26 |
GeneFace++: Generalized and stable real-time audio-driven 3D talking face generation | arXiv preprint arXiv:2305.00787 | Zhenhui Ye Jinzheng He Ziyue Jiang Rongjie Huang Jiawei Huang | 2023/5/1 |
Prosody-TTS: Improving prosody with masked autoencoder and conditional diffusion model for expressive text-to-speech | | Rongjie Huang Chunlei Zhang Yi Ren Zhou Zhao Dong Yu | 2023/7 |
Semantic-conditioned dual adaptation for cross-domain query-based visual segmentation | | Ye Wang Tao Jin Wang Lin Xize Cheng Linjun Li | 2023/7 |
DATE: Domain adaptive product seeker for e-commerce | | Haoyuan Li Hao Jiang Tao Jin Mengyan Li Yan Chen | 2023 |