Limin Wang
Nanjing University
H-index: 57
Asia-China
Top articles of Limin Wang
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Data-efficient Event Camera Pre-training via Disentangled Masked Modeling | arXiv preprint arXiv:2403.00416 | Zhenpeng Huang Chao Li Hao Chen Yongjian Deng Yifeng Geng | 2024/3/1 |
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | arXiv preprint arXiv:2403.09626 | Guo Chen Yifei Huang Jilan Xu Baoqi Pei Zhe Chen | 2024/3/14 |
VBench: Comprehensive Benchmark Suite for Video Generative Models | | Ziqi Huang Yinan He Jiashuo Yu Fan Zhang Chenyang Si | 2024 |
End-to-end dense video grounding via parallel regression | Computer Vision and Image Understanding | Fengyuan Shi Weilin Huang Limin Wang | 2024/5/1 |
Dual DETRs for Multi-Label Temporal Action Detection | | Yuhan Zhu Guozhen Zhang Jing Tan Gangshan Wu Limin Wang | 2024/6 |
Dual graph networks for pose estimation in crowded scenes | International Journal of Computer Vision | Jun Tu Gangshan Wu Limin Wang | 2024/3 |
VideoMamba: State Space Model for Efficient Video Understanding | arXiv preprint arXiv:2403.06977 | Kunchang Li Xinhao Li Yi Wang Yinan He Yali Wang | 2024/3/11 |
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | | Kunchang Li Yali Wang Yinan He Yizhuo Li Yi Wang | 2024/6 |
Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion | IEEE Transactions on Pattern Analysis and Machine Intelligence | Haisong Liu Tao Lu Yihui Xu Jia Liu Limin Wang | 2023/11/7 |
EgoExoLearn: A Dataset for Bridging Asynchronous Ego-and Exo-centric View of Procedural Activities in Real World | | Yifei Huang Guo Chen Jilan Xu Mingfang Zhang Lijin Yang | 2024/6 |
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities | arXiv preprint arXiv:2401.15071 | Chaochao Lu Chen Qian Guodong Zheng Hongxing Fan Hongzhi Gao | 2024/1/26 |
Spatiotemporal predictive pre-training for robotic motor control | arXiv preprint arXiv:2403.05304 | Jiange Yang Bei Liu Jianlong Fu Bocheng Pan Gangshan Wu | 2024/3/8 |
Asymmetric Masked Distillation for Pre-Training Small Foundation Models | | Zhiyu Zhao Bingkun Huang Sen Xing Gangshan Wu Yu Qiao | 2024/6 |
BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models | arXiv preprint arXiv:2312.02813 | Fengyuan Shi Jiaxi Gu Hang Xu Songcen Xu Wei Zhang | 2023/12/5 |
Multiple Object Tracking as ID Prediction | arXiv preprint arXiv:2403.16848 | Ruopeng Gao Yijun Zhang Limin Wang | 2024/3/25 |
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding | IEEE Transactions on Pattern Analysis and Machine Intelligence | Fengyuan Shi Ruopeng Gao Weilin Huang Limin Wang | 2024/1/8 |
StableDrag: Stable Dragging for Point-based Image Editing | arXiv preprint arXiv:2403.04437 | Yutao Cui Xiaotong Zhao Guozhen Zhang Shengming Cao Kai Ma | 2024/3/7 |
InternVid: A large-scale video-text dataset for multimodal understanding and generation | | Yi Wang Yinan He Yizhuo Li Kunchang Li Jiashuo Yu | 2024 |
Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering | | Tao Lu Mulin Yu Linning Xu Yuanbo Xiangli Limin Wang | 2024/6 |
InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding | arXiv preprint arXiv:2403.15377 | Yi Wang Kunchang Li Xinhao Li Jiashuo Yu Yinan He | 2024/3/22 |