Hirofumi Inaguma

Kyoto University

H-index: 18

Asia-Japan

About Hirofumi Inaguma

Hirofumi Inaguma is a researcher at Kyoto University who specializes in speech recognition and speech translation. He has an h-index of 18 overall and 17 since 2020.

His recent articles include:

Efficient monotonic multihead attention

Sequence-to-sequence speech recognition with latency threshold

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Hybrid transducer and attention based encoder-decoder modeling for speech-to-text tasks

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

ESPnet-ST-v2: Multipurpose spoken language translation toolkit

Enhancing Speech-To-Speech Translation with Multiple TTS Targets

Findings of the IWSLT 2023 evaluation campaign

Hirofumi Inaguma Information

University	Kyoto University
Position	Ph.D. student at
Citations(all)	1933
Citations(since 2020)	1904
Cited By	435
hIndex(all)	18
hIndex(since 2020)	17
i10Index(all)	25
i10Index(since 2020)	25

Hirofumi Inaguma Skills & Research Interests

Speech recognition

Speech translation

Top articles of Hirofumi Inaguma

Title	Journal	Author(s)	Publication Date
Efficient monotonic multihead attention	arXiv preprint arXiv:2312.04515	Xutai Ma Anna Sun Siqi Ouyang Hirofumi Inaguma Paden Tomasello	2023/12/7
Sequence-to-sequence speech recognition with latency threshold			2023/5/18
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction	arXiv preprint arXiv:2310.02720	Jiatong Shi Hirofumi Inaguma Xutai Ma Ilia Kulikov Anna Sun	2023/10/4
Hybrid transducer and attention based encoder-decoder modeling for speech-to-text tasks	arXiv preprint arXiv:2305.03101	Yun Tang Anna Y Sun Hirofumi Inaguma Xinyue Chen Ning Dong ...	2023/5/4
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation	arXiv preprint arXiv:2308.11596	Loïc Barrault Yu-An Chung Mariano Coria Meglioli David Dale Ning Dong ...	2023/8/22
ESPnet-ST-v2: Multipurpose spoken language translation toolkit	arXiv preprint arXiv:2304.04596	Brian Yan Jiatong Shi Yun Tang Hirofumi Inaguma Yifan Peng ...	2023/4/10
Enhancing Speech-To-Speech Translation with Multiple TTS Targets		Jiatong Shi Yun Tang Ann Lee Hirofumi Inaguma Changhan Wang ...	2023/6/4
Findings of the IWSLT 2023 evaluation campaign		Milind Agarwal Sweta Agarwal Antonios Anastasopoulos Luisa Bentivogli Ondřej Bojar ...	2023
Named Entity Detection and Injection for Direct Speech Translation		Marco Gaido Yun Tang Ilia Kulikov Rongqing Huang Hongyu Gong ...	2023/6/4
Seamless: Multilingual Expressive and Streaming Speech Translation	arXiv preprint arXiv:2312.05187	Loïc Barrault Yu-An Chung Mariano Coria Meglioli David Dale Ning Dong ...	2023/12/8
Exploration on HuBERT with multiple resolutions	arXiv preprint arXiv:2306.01084	Jiatong Shi Yun Tang Hirofumi Inaguma Hongyu Gong Juan Pino ...	2023/6/1
Speech-to-speech translation for a real-world unwritten language	arXiv preprint arXiv:2211.06474	Peng-Jen Chen Kevin Tran Yilin Yang Jingfei Du Justine Kao ...	2022/11/11
Simple and effective unsupervised speech translation	arXiv preprint arXiv:2210.10191	Changhan Wang Hirofumi Inaguma Peng-Jen Chen Ilia Kulikov Yun Tang ...	2022/10/18
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM	arXiv preprint arXiv:2209.04062	Hayato Futami Hirofumi Inaguma Sei Ueno Masato Mimura Shinsuke Sakai ...	2022/9/8
Distilling the Knowledge of BERT for CTC-based ASR	arXiv preprint arXiv:2209.02030	Hayato Futami Hirofumi Inaguma Masato Mimura Shinsuke Sakai Tatsuya Kawahara	2022/9/5
UnitY: Two-pass direct speech-to-speech translation with discrete units	arXiv preprint arXiv:2212.08055	Hirofumi Inaguma Sravya Popuri Ilia Kulikov Peng-Jen Chen Changhan Wang ...	2022/12/15
Non-autoregressive end-to-end speech translation with parallel autoregressive rescoring	arXiv preprint arXiv:2109.04411	Hirofumi Inaguma Yosuke Higuchi Kevin Duh Tatsuya Kawahara Shinji Watanabe	2021/9/9
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans		Shinji Watanabe Florian Boyer Xuankai Chang Pengcheng Guo Tomoki Hayashi ...	2021/6/5
A comparative study on non-autoregressive modelings for speech-to-text generation		Yosuke Higuchi Nanxin Chen Yuya Fujita Hirofumi Inaguma Tatsuya Komatsu ...	2021/12/13
VAD-free streaming hybrid CTC/attention ASR for unsegmented recording	arXiv preprint arXiv:2107.07509	Hirofumi Inaguma Tatsuya Kawahara	2021/7/15