Wanrong Zhu

University of California, Santa Barbara

H-index: 13

North America-United States

About Wanrong Zhu

Wanrong Zhu is a researcher at the University of California, Santa Barbara, specializing in Natural Language Processing and Vision and Language, with an h-index of 13 overall and 13 since 2020.

Recent articles:

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models

Multimodal procedural planning via dual text-image prompting

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

Multimodal C4: An Open, Billion-Scale Corpus of Images Interleaved with Text

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

Large Language Models are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning

VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View

Wanrong Zhu Information

Citations (all): 577

Citations (since 2020): 572

Cited By: 73

h-index (all): 13

h-index (since 2020): 13

i10-index (all): 14

i10-index (since 2020): 14

Wanrong Zhu Skills & Research Interests

Natural Language Processing

Vision and Language

Top articles of Wanrong Zhu

Title: List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
Journal: arXiv preprint arXiv:2404.16375
Author(s): An Yan, Zhengyuan Yang, Junda Wu, Wanrong Zhu, Jianwei Yang, ...
Publication Date: 2024/4/25

Title: Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models
Journal: arXiv preprint arXiv:2404.15271
Author(s): Wanrong Zhu, Jennifer Healey, Ruiyi Zhang, William Yang Wang, Tong Sun
Publication Date: 2024/4/23

Title: Multimodal procedural planning via dual text-image prompting
Journal: arXiv preprint arXiv:2305.01795
Author(s): Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, ...
Publication Date: 2023/5/2

Title: VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use
Journal: NeurIPS 2023 - Dataset and Benchmark Track
Author(s): Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, ...
Publication Date: 2023/8/12

Title: Multimodal C4: An Open, Billion-Scale Corpus of Images Interleaved with Text
Journal: Advances in Neural Information Processing Systems
Author(s): Wanrong Zhu, Jack Hessel, Anas Awadalla, Samir Yitzhak Gadre, Jesse Dodge, ...
Publication Date: 2024/2/13

Title: OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Journal: arXiv preprint arXiv:2308.01390
Author(s): Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, ...
Publication Date: 2023/8/2

Title: Large Language Models are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Learning
Journal: NeurIPS 2023
Author(s): Xinyi Wang, Wanrong Zhu, William Yang Wang
Publication Date: 2023/1/27

Title: VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
Journal: AAAI 2024
Author(s): Raphael Schumann, Wanrong Zhu, Weixi Feng, Tsu-Jui Fu, Stefan Riezler, ...
Publication Date: 2023/7/12

Title: LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
Author(s): Weixi Feng, Wanrong Zhu, Tsu-Jui Fu, Varun Jampani, Arjun Akula, ...
Publication Date: 2023/5/24

Title: Collaborative Generative AI: Integrating GPT-k for Efficient Editing in Text-to-Image Generation
Journal: arXiv preprint arXiv:2305.11317
Author(s): Wanrong Zhu, Xinyi Wang, Yujie Lu, Tsu-Jui Fu, Xin Eric Wang, ...
Publication Date: 2023/5/18

Title: GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation
Journal: arXiv preprint arXiv:2311.07562
Author(s): An Yan, Zhengyuan Yang, Wanrong Zhu, Kevin Lin, Linjie Li, ...
Publication Date: 2023/11/13

Title: End-to-end Dense Video Captioning as Sequence Generation
Author(s): Wanrong Zhu, Bo Pang, Ashish Thapliyal, William Yang Wang, Radu Soricut
Publication Date: 2022/4

Title: Imagination-Augmented Natural Language Understanding
Journal: arXiv preprint arXiv:2204.08535
Author(s): Yujie Lu, Wanrong Zhu, Xin Eric Wang, Miguel Eckstein, William Yang Wang
Publication Date: 2022/4/18

Title: CLIP also understands text: Prompting CLIP for phrase understanding
Journal: arXiv preprint arXiv:2210.05836
Author(s): An Yan, Jiacheng Li, Wanrong Zhu, Yujie Lu, William Yang Wang, ...
Publication Date: 2022/10/11

Title: Visualize Before You Write: Imagination-Guided Open-Ended Text Generation
Journal: arXiv preprint arXiv:2210.03765
Author(s): Wanrong Zhu, An Yan, Yujie Lu, Wenda Xu, Xin Eric Wang, ...
Publication Date: 2022/10/7

Title: ImaginE: An Imagination-based Automatic Evaluation Metric for Natural Language Generation
Journal: arXiv preprint arXiv:2106.05970
Author(s): Wanrong Zhu, Xin Eric Wang, An Yan, Miguel Eckstein, William Yang Wang
Publication Date: 2021/6/10

Title: Diagnosing Vision-and-Language Navigation: What Really Matters
Journal: arXiv preprint arXiv:2103.16561
Author(s): Wanrong Zhu, Yuankai Qi, Pradyumna Narayana, Kazoo Sone, Sugato Basu, ...
Publication Date: 2021/3/30

Title: Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation
Author(s): Wanrong Zhu, Xin Eric Wang, Tsu-Jui Fu, An Yan, Pradyumna Narayana, ...
Publication Date: 2020/7

Title: Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations
Author(s): Wanrong Zhu, Xin Eric Wang, Pradyumna Narayana, Kazoo Sone, Sugato Basu, ...
Publication Date: 2020/11

Co-Authors

Eric Xing, Carnegie Mellon University (H-index: 114)

Yejin Choi, University of Washington (H-index: 95)

William Yang Wang, University of California, Santa Barbara (H-index: 59)

Miguel Eckstein, University of California, Santa Barbara (H-index: 54)

Zhiting Hu, Carnegie Mellon University (H-index: 43)

Xin Eric Wang, University of California, Santa Cruz (H-index: 31)
