Haokun Liu
New York University
H-index: 11
United States
Top articles of Haokun Liu
Learning to Route Among Specialized Experts for Zero-Shot Generalization
arXiv preprint arXiv:2402.05859
2024/2/8
Haokun Liu, Colin Raffel
Git-Theta: A Git extension for collaborative development of machine learning models
2023/7/3
Haokun Liu, Colin Raffel
Soft merging of experts with adaptive routing
arXiv preprint arXiv:2306.03745
2023/6/6
Haokun Liu, Colin Raffel
Models with conditional computation learn suboptimal solutions
2022/12/6
Haokun Liu, Colin Raffel
Few-shot parameter-efficient fine-tuning is better and cheaper than in-context learning
Advances in Neural Information Processing Systems
2022/12/6
Fine-tuned transformers show clusters of similar representations across layers
arXiv preprint arXiv:2109.08406
2021/9/17
Jason Phang, Haokun Liu
Comparing test sets with item response theory
arXiv preprint arXiv:2106.00840
2021/6/1
Learning which features matter: RoBERTa acquires a preference for linguistic generalizations (eventually)
arXiv preprint arXiv:2010.05358
2020/10/11
Counterfactually-augmented SNLI training data does not yield better generalization than unaugmented data
arXiv preprint arXiv:2010.04762
2020/10/9
William Huang, Haokun Liu
Precise task formalization matters in Winograd schema evaluations
arXiv preprint arXiv:2010.04043
2020/10/8
Haokun Liu, William Huang
BLiMP: The benchmark of linguistic minimal pairs for English
Transactions of the Association for Computational Linguistics
2020/7/1
English intermediate-task training improves zero-shot cross-lingual transfer too
arXiv preprint arXiv:2005.13013
2020/5/26
Intermediate-task transfer learning with pretrained models for natural language understanding: When and why does it work?
arXiv preprint arXiv:2005.00628
2020/5/1
jiant: A software toolkit for research on general-purpose text understanding models
arXiv preprint arXiv:2003.02249
2020/3/4