Florian Metze

Carnegie Mellon University

H-index: 53

North America-United States

About Florian Metze

Florian Metze is a researcher at Carnegie Mellon University specializing in speech recognition and video understanding. He has an h-index of 53 overall and a recent h-index of 37 (since 2020).

His recent articles include:

Modeling early phonetic acquisition from child-centered audio data

Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation

Legonn: Building modular encoder-decoder models

Dissecting Efficient Architectures for Wake-Word Detection

Audio-journey: Efficient visual+ llm-aided audio encodec diffusion

Xinjian Li Carnegie Mellon University

CTC alignments improve autoregressive translation

Error-aware Quantization through Noise Tempering

Florian Metze Information

University

Position

; FACEBOOK

Citations (all): 11008
Citations (since 2020): 6087
Cited by: 6848
h-index (all): 53
h-index (since 2020): 37
i10-index (all): 193
i10-index (since 2020): 115

Florian Metze Skills & Research Interests

speech recognition

video understanding

Top articles of Florian Metze

Title: Modeling early phonetic acquisition from child-centered audio data
Journal: Cognition
Author(s): Marvin Lavechin, Maureen de Seyssel, Marianne Métais, Florian Metze, Abdelrahman Mohamed, ...
Publication Date: 2024/4/1

Title: Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation
Author(s): Jackson Michaels, Juncheng B Li, Laura Yao, Lijun Yu, Zach Wood-Doughty, ...
Publication Date: 2024/4/14

Title: Legonn: Building modular encoder-decoder models
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Author(s): Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe, ...
Publication Date: 2023/7/17

Title: Dissecting Efficient Architectures for Wake-Word Detection
Author(s): Cody Berger, Juncheng B Li, Yiyuan Li, Aaron Berger, Dmitri Berger, ...
Publication Date: 2023/7/16

Title: Audio-journey: Efficient visual+ llm-aided audio encodec diffusion
Author(s): Juncheng B Li, Jackson Sam Michaels, Laura Yao, Lijun Yu, Zach Wood-Doughty, ...
Publication Date: 2023/7/16

Title: Xinjian Li Carnegie Mellon University
Author(s): Shinji Watanabe, Alan W Black, David R Mortensen, Florian Metze, Patrick Littell
Publication Date: 2023/6

Title: CTC alignments improve autoregressive translation
Journal: arXiv preprint arXiv:2210.05200
Author(s): Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, ...
Publication Date: 2022/10/11

Title: Error-aware Quantization through Noise Tempering
Journal: arXiv preprint arXiv:2212.05603
Author(s): Zheng Wang, Juncheng B Li, Shuhui Qu, Florian Metze, Emma Strubell
Publication Date: 2022/12/11

Title: Self-supervised object detection from audio-visual correspondence
Author(s): Triantafyllos Afouras, Yuki M Asano, Francois Fagan, Andrea Vedaldi, Florian Metze
Publication Date: 2022

Title: Robustness of Neural Architectures for Audio Event Detection
Journal: arXiv preprint arXiv:2205.03268
Author(s): Juncheng B Li, Zheng Wang, Shuhui Qu, Florian Metze
Publication Date: 2022/5/6

Title: Asr2k: Speech recognition for around 2000 languages without audio
Journal: Interspeech 2022
Author(s): Xinjian Li, Florian Metze, David R Mortensen, Alan W Black, Shinji Watanabe
Publication Date: 2022/9/6

Title: Masked autoencoders that listen
Journal: Advances in Neural Information Processing Systems
Author(s): Po-Yao Huang, Hu Xu, Juncheng Li, Alexei Baevski, Michael Auli, ...
Publication Date: 2022/12/6

Title: Zero-shot learning for grapheme to phoneme conversion with language ensemble
Author(s): Xinjian Li, Florian Metze, David R Mortensen, Shinji Watanabe, Alan W Black
Publication Date: 2022/5

Title: Phone inventories and recognition for every language
Author(s): Xinjian Li, Florian Metze, David R Mortensen, Alan W Black, Shinji Watanabe
Publication Date: 2022

Title: Normalized contrastive learning for text-video retrieval
Journal: arXiv preprint arXiv:2212.11790
Author(s): Yookoon Park, Mahmoud Azab, Bo Xiong, Seungwhan Moon, Florian Metze, ...
Publication Date: 2022/11/30

Title: AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification
Journal: arXiv preprint arXiv:2203.13448
Author(s): Juncheng B Li, Shuhui Qu, Po-Yao Huang, Florian Metze
Publication Date: 2022/3/25

Title: On advances in text generation from images beyond captioning: A case study in self-rationalization
Journal: arXiv preprint arXiv:2205.11686
Author(s): Shruti Palaskar, Akshita Bhagia, Yonatan Bisk, Florian Metze, Alan W Black, ...
Publication Date: 2022/5/24

Title: Token-level sequence labeling for spoken language understanding using compositional end-to-end models
Journal: arXiv preprint arXiv:2210.15734
Author(s): Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W Black, ...
Publication Date: 2022/10/27

Title: Statistical learning models of early phonetic acquisition struggle with child-centered audio data
Author(s): Marvin Lavechin, Maureen De Seyssel, Marianne Métais, Florian Metze, Abdelrahman Mohamed, ...
Publication Date: 2022/3/8

Title: End-to-end speech summarization using restricted self-attention
Author(s): Roshan Sharma, Shruti Palaskar, Alan W Black, Florian Metze
Publication Date: 2022/5/23

Co-Authors

Alexander Waibel, Carnegie Mellon University (H-index: 95)
Alex Hauptmann, Carnegie Mellon University (H-index: 93)
Alan W Black, Carnegie Mellon University (H-index: 78)
Tanja Schultz, Universität Bremen (H-index: 67)
Sebastian Stüker, Karlsruher Institut für Technologie (H-index: 32)
Po-Yao (Bernie) Huang, Carnegie Mellon University (H-index: 23)
