Chen Sun
Brown University
H-index: 40
North America-United States
Top articles of Chen Sun
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? | | Qi Zhao* Shijie Wang* Ce Zhang Changcheng Fu Nakul Agarwal | 2024 |
End-to-End Spatio-Temporal Action Localisation with Video Transformers | arXiv preprint arXiv:2304.12160 | Alexey Gritsenko Xuehan Xiong Josip Djolonga Mostafa Dehghani Chen Sun | 2023/4/24 |
Pre-trained Vision-Language Models Learn Discoverable Visual Concepts | arXiv preprint arXiv:2404.12652 | Yuan Zang Tian Yun Hao Tan Trung Bui Chen Sun | 2024/4/19 |
Self-Correcting Self-Consuming Loops for Generative Model Training | arXiv preprint arXiv:2402.07087 | Nate Gillman Michael Freeman Daksh Aggarwal Chia-Hong Hsu Calvin Luo | 2024/2/11 |
Pixel Aligned Language Models | arXiv preprint arXiv:2312.09237 | Jiarui Xu Xingyi Zhou Shen Yan Xiuye Gu Anurag Arnab | 2023/12/14 |
Comparing Trajectory and Vision Modalities for Verb Representation | arXiv preprint arXiv:2303.12737 | Dylan Ebert Chen Sun Ellie Pavlick | 2023/3/8 |
Emergence of Abstract State Representations in Embodied Sequence Modeling | arXiv preprint arXiv:2311.02171 | Tian Yun Zilai Zeng Kunal Handa Ashish V Thapliyal Bo Pang | 2023/11/3 |
Steerable equivariant representation learning | arXiv preprint arXiv:2302.11349 | Sangnie Bhardwaj Willie McClinton Tongzhou Wang Guillaume Lajoie Chen Sun | 2023/2/22 |
Does Visual Pretraining Help End-to-End Reasoning? | Advances in Neural Information Processing Systems | Chen Sun Calvin Luo Xingyi Zhou Anurag Arnab Cordelia Schmid | 2024/2/13 |
Spacewalk-18: A Benchmark for Multimodal and Long-form Procedural Video Understanding in Novel Domains | arXiv preprint arXiv:2311.18773 | Rohan Myer Krishnan Zitian Tang Zhiqiu Yu Chen Sun | 2023/11/30 |
How can objects help action recognition? | | Xingyi Zhou Anurag Arnab Chen Sun Cordelia Schmid | 2023 |
Goal-Conditioned Predictive Coding for Offline Reinforcement Learning | | Zilai Zeng Ce Zhang Shijie Wang Chen Sun | 2023/7/7 |
Vamos: Versatile Action Models for Video Understanding | arXiv preprint arXiv:2311.13627 | Shijie Wang Qi Zhao Minh Quan Do Nakul Agarwal Kwonjoon Lee | 2023/11/22 |
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory | | Ziniu Hu Ahmet Iscen Chen Sun Zirui Wang Kai-Wei Chang | 2023 |
Dense Video Object Captioning from Disjoint Supervision | arXiv preprint arXiv:2306.11729 | Xingyi Zhou Anurag Arnab Chen Sun Cordelia Schmid | 2023/6/20 |
Towards A Unified Neural Architecture for Visual Recognition and Reasoning | arXiv preprint arXiv:2311.06386 | Calvin Luo Boqing Gong Ting Chen Chen Sun | 2023/11/10 |
AVIS: Autonomous Visual Information Seeking with Large Language Models | | Ziniu Hu Ahmet Iscen Chen Sun Kai-Wei Chang Yizhou Sun | 2023/11/2 |
Analyzing Modular Approaches for Visual Question Decomposition | arXiv preprint arXiv:2311.06411 | Apoorv Khandelwal Ellie Pavlick Chen Sun | 2023/11/10 |
Masking Modalities for Cross-modal Video Retrieval | | Valentin Gabeur Arsha Nagrani Chen Sun Karteek Alahari Cordelia Schmid | 2022 |
TL; DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency | | Medhini Narasimhan Arsha Nagrani Chen Sun Michael Rubinstein Trevor Darrell | 2022 |