Hung-yi Lee
National Taiwan University
H-index: 47
Asia-Taiwan
Top articles of Hung-yi Lee
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data | arXiv preprint arXiv:2402.06959 | Hsuan-Fu Wang Yi-Jen Shih Heng-Jui Chang Layne Berry Puyuan Peng | 2024/2/10 |
Multimodal Transformer Distillation for Audio-Visual Synchronization | | Xuanjun Chen Haibin Wu Chung-Che Wang Hung-yi Lee Jyh-Shing Roger Jang | 2024/4/14 |
Paralinguistics-enhanced large language modeling of spoken dialogue | | Guan-Ting Lin Prashanth Gurunath Shivakumar Ankur Gandhe Chao-Han Huck Yang Yile Gu | 2024/4/14 |
Examining forgetting in continual pre-training of aligned large language models | arXiv preprint arXiv:2401.03129 | Chen-An Li Hung-Yi Lee | 2024/1/6 |
Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model | arXiv preprint arXiv:2402.05819 | Hung-Chieh Fang Nai-Xuan Ye Yi-Jen Shih Puyuan Peng Hsuan-Fu Wang | 2024/2/8 |
Towards audio language modeling-an overview | | Haibin Wu Xuanjun Chen Yi-Cheng Lin Kai-wei Chang Ho-Lam Chung | 2024/2/20 |
Scalable Ensemble-Based Detection Method Against Adversarial Attacks For Speaker Verification | | Haibin Wu Heng-Cheng Kuo Yu Tsao Hung-yi Lee | 2024/4/14 |
PEFT for Speech: Unveiling Optimal Placement, Merging Strategies, and Ensemble Techniques | arXiv preprint arXiv:2401.02122 | Tzu-Han Lin How-Shing Wang Hao-Yung Weng Kuang-Chen Peng Zih-Ching Chen | 2024/1/4 |
Merging Facts, Crafting Fallacies: Evaluating the Contradictory Nature of Aggregated Factual Claims in Long-Form Generations | arXiv preprint arXiv:2402.05629 | Cheng-Han Chiang Hung-yi Lee | 2024/2/8 |
Codec-SUPERB: An In-Depth Analysis of Sound Codec Models | arXiv preprint arXiv:2402.13071 | Haibin Wu Ho-Lam Chung Yi-Cheng Lin Yuan-Kuei Wu Xuanjun Chen | 2024/2/20 |
Zero resource code-switched speech benchmark using speech utterance pairs for multiple spoken languages | | Kuan-Po Huang Chih-Kai Yang Yu-Kuan Fu Ewan Dunbar Hung-yi Lee | 2024/4/14 |
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR | arXiv preprint arXiv:2402.03988 | Liang-Hsuan Tseng En-Pei Hu Cheng-Han Chiang Yuan Tseng Hung-yi Lee | 2024/2/6 |
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations | arXiv preprint arXiv:2402.12786 | Guan-Ting Lin Cheng-Han Chiang Hung-yi Lee | 2024/2/20 |
AV-SUPERB: A multi-task evaluation benchmark for audio-visual representation models | | Yuan Tseng Layne Berry Yi-Ting Chen I-Hsiang Chiu Hsuan-Hao Lin | 2024/4/14 |
A Large-Scale Evaluation of Speech Foundation Models | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Shu-wen Yang Heng-Jui Chang Zili Huang Andy T Liu Cheng-I Lai | 2024/4/16 |
SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering | arXiv preprint arXiv:2401.13463 | Chyi-Jiunn Lin Guan-Ting Lin Yung-Sung Chuang Wei-Lun Wu Shang-Wen Li | 2024/1/24 |
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition | arXiv preprint arXiv:2402.13018 | Haibin Wu Huang-Cheng Chou Kai-Wei Chang Lucas Goncalves Jiawei Du | 2024/2/20 |
Dynamic-SUPERB: Towards a dynamic, collaborative, and comprehensive instruction-tuning benchmark for speech | | Chien-yu Huang Ke-Han Lu Shih-Heng Wang Chi-Yuan Hsiao Chun-Yi Kuan | 2024/4/14 |
Towards ASR robust spoken language understanding through in-context learning with word confusion networks | | Kevin Everson Yile Gu Huck Yang Prashanth Gurunath Shivakumar Guan-Ting Lin | 2024/4/14 |
Over-Reasoning and Redundant Calculation of Large Language Models | arXiv preprint arXiv:2401.11467 | Cheng-Han Chiang Hung-yi Lee | 2024/1/21 |