Xuankai Chang

Carnegie Mellon University

H-index: 23

North America-United States

About Xuankai Chang

Xuankai Chang is a researcher at Carnegie Mellon University with an h-index of 23 (22 since 2020). His work focuses on automatic speech recognition and acoustic models.

His recent articles include:

Hypothesis stitcher for speech recognition of long-form audio

Improving audio captioning models with fine-grained audio features, text embedding supervision, and llm mix-up augmentation

TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer

Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study

A Large-Scale Evaluation of Speech Foundation Models

VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks

Xuankai Chang Information

University: Carnegie Mellon University

Position: Student

Citations (all): 2558

Citations (since 2020): 2506

Cited by: 379

h-index (all): 23

h-index (since 2020): 22

i10-index (all): 39

i10-index (since 2020): 38

Xuankai Chang Skills & Research Interests

Automatic Speech Recognition

Acoustic Models

Top articles of Xuankai Chang

Title: Hypothesis stitcher for speech recognition of long-form audio
Publication Date: 2024/3/19

Title: Improving audio captioning models with fine-grained audio features, text embedding supervision, and llm mix-up augmentation
Journal: IEEE ICASSP 2024
Author(s): Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François Germain, ...
Publication Date: 2024/4/14

Title: TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Journal: arXiv preprint arXiv:2402.16021
Author(s): Minsu Kim, Jee-weon Jung, Hyeongseop Rha, Soumi Maiti, Siddhant Arora, ...
Publication Date: 2024/2/25

Title: Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing
Author(s): Brian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe
Publication Date: 2024/4/14

Title: OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
Journal: arXiv preprint arXiv:2401.16658
Author(s): Yifan Peng, Jinchuan Tian, William Chen, Siddhant Arora, Brian Yan, ...
Publication Date: 2024/1/30

Title: Exploring speech recognition, translation, and understanding with discrete speech units: A comparative study
Author(s): Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, ...
Publication Date: 2024/4/14

Title: A Large-Scale Evaluation of Speech Foundation Models
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Author(s): Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T Liu, Cheng-I Lai, ...
Publication Date: 2024/4/16

Title: VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks
Journal: IEEE ICASSP 2024
Author(s): Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, ...
Publication Date: 2024/4/14

Title: Audiogpt: Understanding and generating speech, music, sound, and talking head
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Author(s): Rongjie Huang, Mingze Li, Dongchao Yang, Jiatong Shi, Xuankai Chang, ...
Publication Date: 2024/3/24

Title: Hubertopic: Enhancing Semantic Representation of Hubert Through Self-Supervision Utilizing Topic Model
Author(s): Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe
Publication Date: 2024/4/14

Title: A study on the integration of pre-trained ssl, asr, lm and slu models for spoken language understanding
Author(s): Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, ...
Publication Date: 2023/1/9

Title: Exploration of efficient end-to-end asr using discretized input from self-supervised learning
Journal: arXiv preprint arXiv:2305.18108
Author(s): Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe
Publication Date: 2023/5/29

Title: Tokensplit: Using discrete speech representations for direct, refined, and transcript-conditioned speech separation and recognition
Journal: arXiv preprint arXiv:2308.10415
Author(s): Hakan Erdogan, Scott Wisdom, Xuankai Chang, Zalán Borsos, Marco Tagliasacchi, ...
Publication Date: 2023/8/21

Title: Joint Prediction and Denoising for Large-Scale Multilingual Self-Supervised Learning
Journal: arXiv preprint arXiv:2309.15317
Author(s): William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, ...
Publication Date: 2023/9/26

Title: End-to-end integration of speech recognition, dereverberation, beamforming, and self-supervised learning representation
Author(s): Yoshiki Masuyama, Xuankai Chang, Samuele Cornell, Shinji Watanabe, Nobutaka Ono
Publication Date: 2023/1/9

Title: A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning
Journal: arXiv preprint arXiv:2305.13331
Author(s): Jiyang Tang, William Chen, Xuankai Chang, Shinji Watanabe, Brian MacWhinney
Publication Date: 2023/5/19

Title: Reproducing whisper-style training using an open-source toolkit and publicly available data
Author(s): Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, ...
Publication Date: 2023/12/16

Title: The chime-7 dasr challenge: Distant meeting transcription with multiple devices in diverse scenarios
Journal: arXiv preprint arXiv:2306.13734
Author(s): Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, ...
Publication Date: 2023/6/23

Title: SUPERB @ SLT 2022: Challenge on generalization and efficiency of self-supervised speech representation learning
Author(s): Tzu-hsun Feng, Annie Dong, Ching-Feng Yeh, Shu-wen Yang, Tzu-Quan Lin, ...
Publication Date: 2023/1/9

Title: ML-SUPERB: Multilingual speech universal performance benchmark
Author(s): Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, ...
Publication Date: 2023

Co-Authors

Shinji Watanabe (Carnegie Mellon University), H-index: 74

Hung-yi Lee (National Taiwan University), H-index: 47

Yanmin Qian (Shanghai Jiao Tong University), H-index: 40

Tomoki Hayashi (Nagoya University), H-index: 31

Aswin Shanmugam Subramanian (Johns Hopkins University), H-index: 17

Jiatong Shi (史嘉彤) (Johns Hopkins University), H-index: 15