Hirofumi Inaguma
Kyoto University
H-index: 18
Asia-Japan
Top articles of Hirofumi Inaguma
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Efficient monotonic multihead attention | arXiv preprint arXiv:2312.04515 | Xutai Ma Anna Sun Siqi Ouyang Hirofumi Inaguma Paden Tomasello | 2023/12/7 |
Sequence-to-sequence speech recognition with latency threshold | | | 2023/5/18 |
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction | arXiv preprint arXiv:2310.02720 | Jiatong Shi Hirofumi Inaguma Xutai Ma Ilia Kulikov Anna Sun | 2023/10/4 |
Hybrid transducer and attention based encoder-decoder modeling for speech-to-text tasks | arXiv preprint arXiv:2305.03101 | Yun Tang Anna Y Sun Hirofumi Inaguma Xinyue Chen Ning Dong | 2023/5/4 |
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation | arXiv preprint arXiv:2308.11596 | Loïc Barrault Yu-An Chung Mariano Cora Meglioli David Dale Ning Dong | 2023/8/22 |
ESPnet-ST-v2: Multipurpose spoken language translation toolkit | arXiv preprint arXiv:2304.04596 | Brian Yan Jiatong Shi Yun Tang Hirofumi Inaguma Yifan Peng | 2023/4/10 |
Enhancing Speech-To-Speech Translation with Multiple TTS Targets | | Jiatong Shi Yun Tang Ann Lee Hirofumi Inaguma Changhan Wang | 2023/6/4 |
Findings of the IWSLT 2023 evaluation campaign | | Milind Agarwal Sweta Agarwal Antonios Anastasopoulos Luisa Bentivogli Ondřej Bojar | 2023 |
Named Entity Detection and Injection for Direct Speech Translation | | Marco Gaido Yun Tang Ilia Kulikov Rongqing Huang Hongyu Gong | 2023/6/4 |
Seamless: Multilingual Expressive and Streaming Speech Translation | arXiv preprint arXiv:2312.05187 | Loïc Barrault Yu-An Chung Mariano Coria Meglioli David Dale Ning Dong | 2023/12/8 |
Exploration on HuBERT with multiple resolutions | arXiv preprint arXiv:2306.01084 | Jiatong Shi Yun Tang Hirofumi Inaguma Hongyu Gong Juan Pino | 2023/6/1 |
Speech-to-speech translation for a real-world unwritten language | arXiv preprint arXiv:2211.06474 | Peng-Jen Chen Kevin Tran Yilin Yang Jingfei Du Justine Kao | 2022/11/11 |
Simple and effective unsupervised speech translation | arXiv preprint arXiv:2210.10191 | Changhan Wang Hirofumi Inaguma Peng-Jen Chen Ilia Kulikov Yun Tang | 2022/10/18 |
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM | arXiv preprint arXiv:2209.04062 | Hayato Futami Hirofumi Inaguma Sei Ueno Masato Mimura Shinsuke Sakai | 2022/9/8 |
Distilling the Knowledge of BERT for CTC-based ASR | arXiv preprint arXiv:2209.02030 | Hayato Futami Hirofumi Inaguma Masato Mimura Shinsuke Sakai Tatsuya Kawahara | 2022/9/5 |
UnitY: Two-pass direct speech-to-speech translation with discrete units | arXiv preprint arXiv:2212.08055 | Hirofumi Inaguma Sravya Popuri Ilia Kulikov Peng-Jen Chen Changhan Wang | 2022/12/15 |
Non-autoregressive end-to-end speech translation with parallel autoregressive rescoring | arXiv preprint arXiv:2109.04411 | Hirofumi Inaguma Yosuke Higuchi Kevin Duh Tatsuya Kawahara Shinji Watanabe | 2021/9/9 |
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans | | Shinji Watanabe Florian Boyer Xuankai Chang Pengcheng Guo Tomoki Hayashi | 2021/6/5 |
A comparative study on non-autoregressive modelings for speech-to-text generation | | Yosuke Higuchi Nanxin Chen Yuya Fujita Hirofumi Inaguma Tatsuya Komatsu | 2021/12/13 |
VAD-free streaming hybrid CTC/attention ASR for unsegmented recording | arXiv preprint arXiv:2107.07509 | Hirofumi Inaguma Tatsuya Kawahara | 2021/7/15 |