Wanrong Zhu
University of California, Santa Barbara
H-index: 13
North America-United States
Top articles of Wanrong Zhu
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs | arXiv preprint arXiv:2404.16375 | An Yan Zhengyuan Yang Junda Wu Wanrong Zhu Jianwei Yang | 2024/4/25 |
Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models | arXiv preprint arXiv:2404.15271 | Wanrong Zhu Jennifer Healey Ruiyi Zhang William Yang Wang Tong Sun | 2024/4/23 |
Multimodal Procedural Planning via Dual Text-Image Prompting | arXiv preprint arXiv:2305.01795 | Yujie Lu Pan Lu Zhiyu Chen Wanrong Zhu Xin Eric Wang | 2023/5/2 |
VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use | NeurIPS 2023 - Datasets and Benchmarks Track | Yonatan Bitton Hritik Bansal Jack Hessel Rulin Shao Wanrong Zhu | 2023/8/12 |
Multimodal C4: An Open, Billion-Scale Corpus of Images Interleaved with Text | Advances in Neural Information Processing Systems | Wanrong Zhu Jack Hessel Anas Awadalla Samir Yitzhak Gadre Jesse Dodge | 2024/2/13 |
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models | arXiv preprint arXiv:2308.01390 | Anas Awadalla Irena Gao Josh Gardner Jack Hessel Yusuf Hanafy | 2023/8/2 |
Large Language Models are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning | NeurIPS 2023 | Xinyi Wang Wanrong Zhu William Yang Wang | 2023/1/27 |
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View | AAAI 2024 | Raphael Schumann Wanrong Zhu Weixi Feng Tsu-Jui Fu Stefan Riezler | 2023/7/12 |
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models | | Weixi Feng Wanrong Zhu Tsu-Jui Fu Varun Jampani Arjun Akula | 2023/5/24 |
Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation | arXiv preprint arXiv:2305.11317 | Wanrong Zhu Xinyi Wang Yujie Lu Tsu-Jui Fu Xin Eric Wang | 2023/5/18 |
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation | arXiv preprint arXiv:2311.07562 | An Yan Zhengyuan Yang Wanrong Zhu Kevin Lin Linjie Li | 2023/11/13 |
End-to-end Dense Video Captioning as Sequence Generation | | Wanrong Zhu Bo Pang Ashish Thapliyal William Yang Wang Radu Soricut | 2022/4 |
Imagination-Augmented Natural Language Understanding | arXiv preprint arXiv:2204.08535 | Yujie Lu Wanrong Zhu Xin Eric Wang Miguel Eckstein William Yang Wang | 2022/4/18 |
CLIP Also Understands Text: Prompting CLIP for Phrase Understanding | arXiv preprint arXiv:2210.05836 | An Yan Jiacheng Li Wanrong Zhu Yujie Lu William Yang Wang | 2022/10/11 |
Visualize Before You Write: Imagination-Guided Open-Ended Text Generation | arXiv preprint arXiv:2210.03765 | Wanrong Zhu An Yan Yujie Lu Wenda Xu Xin Eric Wang | 2022/10/7 |
ImaginE: An Imagination-based Automatic Evaluation Metric for Natural Language Generation | arXiv preprint arXiv:2106.05970 | Wanrong Zhu Xin Eric Wang An Yan Miguel Eckstein William Yang Wang | 2021/6/10 |
Diagnosing Vision-and-Language Navigation: What Really Matters | arXiv preprint arXiv:2103.16561 | Wanrong Zhu Yuankai Qi Pradyumna Narayana Kazoo Sone Sugato Basu | 2021/3/30 |
Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation | | Wanrong Zhu Xin Eric Wang Tsu-Jui Fu An Yan Pradyumna Narayana | 2020/7 |
Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations | | Wanrong Zhu Xin Eric Wang Pradyumna Narayana Kazoo Sone Sugato Basu | 2020/11 |