Florian Metze
Carnegie Mellon University
H-index: 53
North America-United States
Top articles of Florian Metze
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Modeling early phonetic acquisition from child-centered audio data | Cognition | Marvin Lavechin Maureen de Seyssel Marianne Métais Florian Metze Abdelrahman Mohamed | 2024/4/1 |
Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation | | Jackson Michaels Juncheng B Li Laura Yao Lijun Yu Zach Wood-Doughty | 2024/4/14 |
LegoNN: Building modular encoder-decoder models | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Siddharth Dalmia Dmytro Okhonko Mike Lewis Sergey Edunov Shinji Watanabe | 2023/7/17 |
Dissecting Efficient Architectures for Wake-Word Detection | | Cody Berger Juncheng B Li Yiyuan Li Aaron Berger Dmitri Berger | 2023/7/16 |
Audio-journey: Efficient visual+ llm-aided audio encodec diffusion | | Juncheng B Li Jackson Sam Michaels Laura Yao Lijun Yu Zach Wood-Doughty | 2023/7/16 |
Xinjian Li Carnegie Mellon University | | Shinji Watanabe Alan W Black David R Mortensen Florian Metze Patrick Littell | 2023/6 |
CTC alignments improve autoregressive translation | arXiv preprint arXiv:2210.05200 | Brian Yan Siddharth Dalmia Yosuke Higuchi Graham Neubig Florian Metze | 2022/10/11 |
Error-aware Quantization through Noise Tempering | arXiv preprint arXiv:2212.05603 | Zheng Wang Juncheng B Li Shuhui Qu Florian Metze Emma Strubell | 2022/12/11 |
Self-supervised object detection from audio-visual correspondence | | Triantafyllos Afouras Yuki M Asano Francois Fagan Andrea Vedaldi Florian Metze | 2022 |
Robustness of Neural Architectures for Audio Event Detection | arXiv preprint arXiv:2205.03268 | Juncheng B Li Zheng Wang Shuhui Qu Florian Metze | 2022/5/6 |
ASR2K: Speech recognition for around 2000 languages without audio | Interspeech 2022 | Xinjian Li Florian Metze David R Mortensen Alan W Black Shinji Watanabe | 2022/9/6 |
Masked autoencoders that listen | Advances in Neural Information Processing Systems | Po-Yao Huang Hu Xu Juncheng Li Alexei Baevski Michael Auli | 2022/12/6 |
Zero-shot learning for grapheme to phoneme conversion with language ensemble | | Xinjian Li Florian Metze David R Mortensen Shinji Watanabe Alan W Black | 2022/5 |
Phone inventories and recognition for every language | | Xinjian Li Florian Metze David R Mortensen Alan W Black Shinji Watanabe | 2022 |
Normalized contrastive learning for text-video retrieval | arXiv preprint arXiv:2212.11790 | Yookoon Park Mahmoud Azab Bo Xiong Seungwhan Moon Florian Metze | 2022/11/30 |
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification | arXiv preprint arXiv:2203.13448 | Juncheng B Li Shuhui Qu Po-Yao Huang Florian Metze | 2022/3/25 |
On advances in text generation from images beyond captioning: A case study in self-rationalization | arXiv preprint arXiv:2205.11686 | Shruti Palaskar Akshita Bhagia Yonatan Bisk Florian Metze Alan W Black | 2022/5/24 |
Token-level sequence labeling for spoken language understanding using compositional end-to-end models | arXiv preprint arXiv:2210.15734 | Siddhant Arora Siddharth Dalmia Brian Yan Florian Metze Alan W Black | 2022/10/27 |
Statistical learning models of early phonetic acquisition struggle with child-centered audio data | | Marvin Lavechin Maureen De Seyssel Marianne Métais Florian Metze Abdelrahman Mohamed | 2022/3/8 |
End-to-end speech summarization using restricted self-attention | | Roshan Sharma Shruti Palaskar Alan W Black Florian Metze | 2022/5/23 |