Po-Yao (Bernie) Huang

Carnegie Mellon University

H-index: 23

North America-United States

About Po-Yao (Bernie) Huang

Po-Yao (Bernie) Huang is a researcher at Carnegie Mellon University with an h-index of 23 overall and 23 since 2020. He specializes in multimodal machine learning, multi-modal learning, and natural language processing.

His recent articles include:

Adversarially Masked Video Consistency for Unsupervised Domain Adaptation

MoDE: CLIP Data Experts via Clustering

Av-superb: A multi-task evaluation benchmark for audio-visual representation models

VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Diffusion Models as Masked Autoencoders

Generating Hashtags for Short-form Videos with Guided Signals

Cit: Curation in training for effective vision-language data

Po-Yao (Bernie) Huang Information

University: Carnegie Mellon University

Position: ___

Citations (all): 3746

Citations (since 2020): 3642

Cited by: 438

h-index (all): 23

h-index (since 2020): 23

i10-index (all): 39

i10-index (since 2020): 37

Po-Yao (Bernie) Huang Skills & Research Interests

Multimodal machine learning

Multi-modal learning

natural language processing

Top articles of Po-Yao (Bernie) Huang

Title: Adversarially Masked Video Consistency for Unsupervised Domain Adaptation
Journal: arXiv preprint arXiv:2403.16242
Author(s): Xiaoyu Zhu, Junwei Liang, Po-Yao Huang, Alex Hauptmann
Publication Date: 2024/3/24

Title: MoDE: CLIP Data Experts via Clustering
Journal: arXiv preprint arXiv:2404.16030
Author(s): Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer, ...
Publication Date: 2024/4/24

Title: Av-superb: A multi-task evaluation benchmark for audio-visual representation models
Author(s): Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, ...
Publication Date: 2024/4/14

Title: VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
Journal: arXiv preprint arXiv:2403.16973
Author(s): Puyuan Peng, Po-Yao Huang, Daniel Li, Abdelrahman Mohamed, David Harwath
Publication Date: 2024/3/25

Title: SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Journal: arXiv preprint arXiv:2308.11596
Author(s): Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, ...
Publication Date: 2023/8/22

Title: Diffusion Models as Masked Autoencoders
Journal: ICCV
Author(s): Chen Wei, Karttikeya Mangalam, Po-Yao Huang, Yanghao Li, Haoqi Fan, ...
Publication Date: 2023

Title: Generating Hashtags for Short-form Videos with Guided Signals
Author(s): Tiezheng Yu, Hanchao Yu, Davis Liang, Yuning Mao, Shaoliang Nie, ...
Publication Date: 2023/7

Title: Cit: Curation in training for effective vision-language data
Author(s): Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes, ...
Publication Date: 2023

Title: STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition
Journal: CVPR 2023
Author(s): Xiaoyu Zhu, Po-Yao Huang, Junwei Liang, Celso M de Melo, Alexander Hauptmann
Publication Date: 2023/3/31

Title: Data processing system for classifying keyed data representing inhaler device operation
Publication Date: 2023/6/13

Title: Flap: Fast language-audio pre-training
Author(s): Ching-Feng Yeh, Po-Yao Huang, Vasu Sharma, Shang-Wen Li, Gargi Gosh
Publication Date: 2023/12/16

Title: Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
Journal: ICML
Author(s): Chaitanya Ryali, Yuan-Ting Hu, Daniel Bolya, Chen Wei, Haoqi Fan, ...
Publication Date: 2023

Title: Demystifying clip data
Journal: arXiv preprint arXiv:2309.16671
Author(s): Hu Xu, Saining Xie, Xiaoqing Ellen Tan, Po-Yao Huang, Russell Howes, ...
Publication Date: 2023/9/28

Title: Dinov2: Learning robust visual features without supervision
Journal: arXiv preprint arXiv:2304.07193
Author(s): Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, ...
Publication Date: 2023/4/14

Title: Video pivoting unsupervised multi-modal machine translation
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Author(s): Mingjie Li, Po-Yao Huang, Xiaojun Chang, Junjie Hu, Yi Yang, ...
Publication Date: 2022/6/9

Title: On adversarial robustness of large-scale audio visual learning
Author(s): Juncheng B Li, Shuhui Qu, Xinjian Li, Po-Yao Bernie Huang, Florian Metze
Publication Date: 2022/5/23

Title: AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Journal: arXiv preprint arXiv:2203.13448
Author(s): Juncheng B Li, Shuhui Qu, Po-Yao Huang, Florian Metze
Publication Date: 2022/3/25

Title: MAViL: Masked Audio-Video Learners
Journal: Advances in Neural Information Processing Systems
Author(s): Po-Yao Huang, Vasu Sharma, Hu Xu, Chaitanya Ryali, Yanghao Li, ...
Publication Date: 2024/2/13

Title: Cm3: A causal masked multimodal model of the internet
Journal: arXiv preprint arXiv:2201.07520
Author(s): Armen Aghajanyan, Bernie Huang, Candace Ross, Vladimir Karpukhin, Hu Xu, ...
Publication Date: 2022/1/19

Title: Masked autoencoders that listen
Journal: Advances in Neural Information Processing Systems
Author(s): Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, ...
Publication Date: 2022/12/6

Co-Authors

Luke Zettlemoyer
University of Washington
H-index: 100

Alex Hauptmann
Carnegie Mellon University
H-index: 93

Graham Neubig
Carnegie Mellon University
H-index: 82

Xiaojun Chang
Monash University
H-index: 58

Florian Metze
Carnegie Mellon University
H-index: 53

Junjie Hu
Carnegie Mellon University
H-index: 23
