Wentao Zhu
University of California, Irvine
H-index: 27
North America-United States
Top articles of Wentao Zhu
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
A multimodal benchmark and improved architecture for zero shot learning | | Keval Doshi Amanmeet Garg Burak Uzkent Xiaolong Wang Mohamed Omar | 2024 |
Social Motion Prediction with Cognitive Hierarchies | Advances in Neural Information Processing Systems | Wentao Zhu Jason Qin Yuke Lou Hang Ye Xiaoxuan Ma | 2024/2/13 |
Real-time Holistic Robot Pose Estimation with Unknown States | arXiv preprint arXiv:2402.05655 | Shikun Ban Juling Fan Wentao Zhu Xiaoxuan Ma Yu Qiao | 2024/2/8 |
Efficient selective audio masked multimodal bottleneck transformer for audio-video classification | arXiv preprint arXiv:2401.04154 | Wentao Zhu | 2024/1/8 |
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification | arXiv preprint arXiv:2401.04023 | Wentao Zhu | 2024/1/8 |
Token Propagation Controller for Efficient Vision Transformer | arXiv preprint arXiv:2401.01470 | Wentao Zhu | 2024/1/3 |
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification | ICASSP | Wentao Zhu Mohamed Omar | 2023/3/19 |
MotionBERT: A unified perspective on learning human motion representations | | Wentao Zhu Xiaoxuan Ma Zhaoyang Liu Libin Liu Wayne Wu | 2023 |
Deformable Audio Transformer for Audio Event Detection | arXiv preprint arXiv:2312.16228 | Wentao Zhu | 2023/12/24 |
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection | | Wentao Zhu Yufang Huang Xiufeng Xie Wenxian Liu Jincan Deng | 2023 |
ChimpACT: A longitudinal dataset for understanding chimpanzee behaviors | Advances in Neural Information Processing Systems | Xiaoxuan Ma Stephan Kaufhold Jiajun Su Wentao Zhu Jack Terwilliger | 2023/12/15 |
Dynamic inference with grounding based vision and language models | | Burak Uzkent Amanmeet Garg Wentao Zhu Keval Doshi Jingru Yi | 2023 |
Human motion generation: A survey | | Wentao Zhu Xiaoxuan Ma Dongwoo Ro Hai Ci Jinlu Zhang | 2023/11/8 |
HNSSL: Hard negative-based self-supervised learning | | Wentao Zhu Jingya Liu Yufang Huang | 2023 |
Selective Structured State-Spaces for Long-Form Video Understanding | CVPR | Jue Wang Wentao Zhu Pichao Wang Xiang Yu Linda Liu | 2023/3/25 |
GFPose: Learning 3D human pose prior with gradient fields | | Hai Ci Mingdong Wu Wentao Zhu Xiaoxuan Ma Hao Dong | 2023 |
Multiscale Multimodal Transformer for Multimodal Action Recognition | ICLR Submission | Wentao Zhu Keval Doshi Jingru Yi Xiaohang Sun Zhu Liu | 2022 |
Towards comprehensive monocular depth estimation: Multiple heads are better than one | IEEE Transactions on Multimedia | Shuwei Shao Ran Li Zhongcai Pei Zhong Liu Weihai Chen | 2022/11/25 |
Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue | Medical Image Analysis | Shuwei Shao Zhongcai Pei Weihai Chen Wentao Zhu Xingming Wu | 2022/4/1 |
CelebV-HQ: A large-scale video facial attributes dataset | | Hao Zhu Wayne Wu Wentao Zhu Liming Jiang Siwei Tang | 2022 |