
Wentao Zhu

University of California, Irvine

H-index: 27

North America-United States

About Wentao Zhu

Wentao Zhu is a researcher at the University of California, Irvine, with an h-index of 27 overall and 26 since 2020. He specializes in deep learning, AI and machine learning, neural networks, computer vision, and speech and language.

His recent articles include:

A multimodal benchmark and improved architecture for zero shot learning

Social Motion Prediction with Cognitive Hierarchies

Real-time Holistic Robot Pose Estimation with Unknown States

Efficient selective audio masked multimodal bottleneck transformer for audio-video classification

Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification

Token Propagation Controller for Efficient Vision Transformer

Multiscale Audio Spectrogram Transformer for Efficient Audio Classification

MotionBERT: A unified perspective on learning human motion representations

Wentao Zhu Information

University	University of California, Irvine
Position	Researcher Kuaishou Tech; NVIDIA; CAS
Citations(all)	4532
Citations(since 2020)	4000
Cited By	1714
hIndex(all)	27
hIndex(since 2020)	26
i10Index(all)	43
i10Index(since 2020)	42
University Profile Page	University of California, Irvine

Wentao Zhu Skills & Research Interests

Deep Learning

AI and Machine Learning

Neural Networks

Computer Vision

Speech and Language

Top articles of Wentao Zhu

Title	Journal	Author(s)	Publication Date
A multimodal benchmark and improved architecture for zero shot learning		Keval Doshi Amanmeet Garg Burak Uzkent Xiaolong Wang Mohamed Omar	2024
Social Motion Prediction with Cognitive Hierarchies	Advances in Neural Information Processing Systems	Wentao Zhu Jason Qin Yuke Lou Hang Ye Xiaoxuan Ma ...	2024/2/13
Real-time Holistic Robot Pose Estimation with Unknown States	arXiv preprint arXiv:2402.05655	Shikun Ban Juling Fan Wentao Zhu Xiaoxuan Ma Yu Qiao ...	2024/2/8
Efficient selective audio masked multimodal bottleneck transformer for audio-video classification	arXiv preprint arXiv:2401.04154	Wentao Zhu	2024/1/8
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification	arXiv preprint arXiv:2401.04023	Wentao Zhu	2024/1/8
Token Propagation Controller for Efficient Vision Transformer	arXiv preprint arXiv:2401.01470	Wentao Zhu	2024/1/3
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification	ICASSP	Wentao Zhu Mohamed Omar	2023/3/19
MotionBERT: A unified perspective on learning human motion representations		Wentao Zhu Xiaoxuan Ma Zhaoyang Liu Libin Liu Wayne Wu ...	2023
Deformable Audio Transformer for Audio Event Detection	arXiv preprint arXiv:2312.16228	Wentao Zhu	2023/12/24
AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection		Wentao Zhu Yufang Huang Xiufeng Xie Wenxian Liu Jincan Deng ...	2023
ChimpACT: A longitudinal dataset for understanding chimpanzee behaviors	Advances in Neural Information Processing Systems	Xiaoxuan Ma Stephan Kaufhold Jiajun Su Wentao Zhu Jack Terwilliger ...	2023/12/15
Dynamic inference with grounding based vision and language models		Burak Uzkent Amanmeet Garg Wentao Zhu Keval Doshi Jingru Yi ...	2023
Human motion generation: A survey		Wentao Zhu Xiaoxuan Ma Dongwoo Ro Hai Ci Jinlu Zhang ...	2023/11/8
HNSSL: Hard negative-based self-supervised learning		Wentao Zhu Jingya Liu Yufang Huang	2023
Selective Structured State-Spaces for Long-Form Video Understanding	CVPR	Jue Wang Wentao Zhu Pichao Wang Xiang Yu Linda Liu ...	2023/3/25
GFPose: Learning 3D human pose prior with gradient fields		Hai Ci Mingdong Wu Wentao Zhu Xiaoxuan Ma Hao Dong ...	2023
Multiscale Multimodal Transformer for Multimodal Action Recognition	ICLR Submission	Wentao Zhu Keval Doshi Jingru Yi Xiaohang Sun Zhu Liu ...	2022
Towards comprehensive monocular depth estimation: Multiple heads are better than one	IEEE Transactions on Multimedia	Shuwei Shao Ran Li Zhongcai Pei Zhong Liu Weihai Chen ...	2022/11/25
Self-supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue	Medical image analysis	Shuwei Shao Zhongcai Pei Weihai Chen Wentao Zhu Xingming Wu ...	2022/4/1
CelebV-HQ: A large-scale video facial attributes dataset		Hao Zhu Wayne Wu Wentao Zhu Liming Jiang Siwei Tang ...	2022