Peihao Chen

About Peihao Chen

Peihao Chen is a Ph.D. candidate at South China University of Technology who specializes in Embodied AI and Multi-Modal Video Understanding. He has an h-index of 10, both overall and since 2020.

His recent articles include:

3D-VLA: A 3D Vision-Language-Action Generative World Model

Vesper: A compact and effective pretrained model for speech emotion recognition

FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

3D-LLM: Injecting the 3D world into large language models

A Simple Knowledge Distillation Framework for Open-world Object Detection

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

Peihao Chen Information

University

South China University of Technology

Position

Ph.D. candidate

Citations (all)

1062

Citations (since 2020)

1062

Cited By

163

h-index (all)

10

h-index (since 2020)

10

i10-index (all)

10

i10-index (since 2020)

10

Peihao Chen Skills & Research Interests

Embodied AI

Multi-Modal Video Understanding

Top articles of Peihao Chen

3D-VLA: A 3D Vision-Language-Action Generative World Model

arXiv preprint arXiv:2403.09631

2024/3/14

Vesper: A compact and effective pretrained model for speech emotion recognition

IEEE Transactions on Affective Computing

2024/2/26

Peihao Chen

H-Index: 7

Xiangmin Xu

H-Index: 22

FGPrompt: Fine-grained Goal Prompting for Image-goal Navigation

Advances in Neural Information Processing Systems

2024/2/13

MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

arXiv preprint arXiv:2401.08577

2024/1/16

3D-LLM: Injecting the 3D world into large language models

Advances in Neural Information Processing Systems

2023/12/15

A Simple Knowledge Distillation Framework for Open-world Object Detection

arXiv preprint arXiv:2312.08653

2023/12/14

Ying Wei

H-Index: 16

Peihao Chen

H-Index: 7

DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning

arXiv preprint arXiv:2312.05783

2023/12/10

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

arXiv preprint arXiv:2311.03354

2023/11/6

A²Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models

arXiv preprint arXiv:2308.07997

2023/8/15

Detecting the open-world objects with the help of the Brain

arXiv preprint arXiv:2303.11623

2023/3/21

Ying Wei

H-Index: 16

Peihao Chen

H-Index: 7

Learning vision-and-language navigation from YouTube videos

2023

Masked motion encoding for self-supervised video representation learning

2023

Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation

2022/10/14

Learning Active Camera for Multi-Object Navigation

2022/10/14

RSPNet: Relative Speed Perception for Unsupervised Video Representation Learning

Proceedings of the AAAI Conference on Artificial Intelligence

2021/5/18

Generating visually aligned sound from videos

IEEE Transactions on Image Processing

2020/7/28

Location-aware graph convolutional networks for video question answering

Proceedings of the AAAI Conference on Artificial Intelligence

2020/4/3

Dense regression network for video grounding

2020

Foley music: Learning to generate music from videos

2020
