Suchin Gururangan
University of Washington
H-index: 13
North America-United States
Top articles of Suchin Gururangan
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | arXiv preprint arXiv:2404.07972 | Tianbao Xie Danyang Zhang Jixuan Chen Xiaochuan Li Siheng Zhao | 2024/4/11 |
Language models scale reliably with over-training and on downstream tasks | arXiv preprint arXiv:2403.08540 | Samir Yitzhak Gadre Georgios Smyrnis Vaishaal Shankar Suchin Gururangan Mitchell Wortsman | 2024/3/13 |
LESS: Selecting influential data for targeted instruction tuning | arXiv preprint arXiv:2402.04333 | Mengzhou Xia Sadhika Malladi Suchin Gururangan Sanjeev Arora Danqi Chen | 2024/2/6 |
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models | arXiv preprint arXiv:2401.10440 | Terra Blevins Tomasz Limisiewicz Suchin Gururangan Margaret Li Hila Gonen | 2024/1/19 |
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters | arXiv preprint arXiv:2401.06408 | Li Lucy Suchin Gururangan Luca Soldaini Emma Strubell David Bamman | 2024/1/12 |
Time is Encoded in the Weights of Finetuned Language Models | arXiv preprint arXiv:2312.13401 | Kai Nylund Suchin Gururangan Noah A Smith | 2023/12/20 |
SILO language models: Isolating legal risk in a nonparametric datastore | arXiv preprint arXiv:2308.04430 | Sewon Min Suchin Gururangan Eric Wallace Hannaneh Hajishirzi Noah A Smith | 2023/8/8 |
Information Flow Control in Machine Learning through Modular Model Architecture | arXiv preprint arXiv:2306.03235 | Trishita Tiwari Suchin Gururangan Chuan Guo Weizhe Hua Sanjay Kariyappa | 2023/6/5 |
Scaling expert language models with unsupervised domain discovery | arXiv preprint arXiv:2303.14177 | Suchin Gururangan Margaret Li Mike Lewis Weijia Shi Tim Althoff | 2023/3/24 |
Whose language counts as high quality? measuring language ideologies in text data selection | arXiv preprint arXiv:2201.10474 | Suchin Gururangan Dallas Card Sarah K Dreier Emily K Gade Leroy Z Wang | 2022/1/25 |
Nearest neighbor zero-shot inference | | Weijia Shi Julian Michael Suchin Gururangan Luke Zettlemoyer | 2022/12 |
lo-fi: distributed fine-tuning without communication | arXiv preprint arXiv:2210.11948 | Mitchell Wortsman Suchin Gururangan Shen Li Ali Farhadi Ludwig Schmidt | 2022/10/19 |
M2D2: A massively multi-domain language modeling dataset | arXiv preprint arXiv:2210.07370 | Machel Reid Victor Zhong Suchin Gururangan Luke Zettlemoyer | 2022/10/13 |
Branch-train-merge: Embarrassingly parallel training of expert language models | arXiv preprint arXiv:2208.03306 | Margaret Li Suchin Gururangan Tim Dettmers Mike Lewis Tim Althoff | 2022/8/5 |
Editing models with task arithmetic | arXiv preprint arXiv:2212.04089 | Gabriel Ilharco Marco Tulio Ribeiro Mitchell Wortsman Suchin Gururangan Ludwig Schmidt | 2022/12/8 |
DEMix layers: Disentangling domains for modular language modeling | arXiv preprint arXiv:2108.05036 | Suchin Gururangan Mike Lewis Ari Holtzman Noah A Smith Luke Zettlemoyer | 2021/8/11 |
All that's 'human' is not gold: Evaluating human evaluation of generated text | arXiv preprint arXiv:2107.00061 | Elizabeth Clark Tal August Sofia Serrano Nikita Haduong Suchin Gururangan | 2021/6/30 |
Time waits for no one! analysis and challenges of temporal misalignment | arXiv preprint arXiv:2111.07408 | Kelvin Luu Daniel Khashabi Suchin Gururangan Karishma Mandyam Noah A Smith | 2021/11/14 |
Expected Validation Performance and Estimation of a Random Variable's Maximum | | Jesse Dodge Suchin Gururangan Dallas Card Roy Schwartz Noah A. Smith | 2021/10/1 |
Detoxifying language models risks marginalizing minority voices | | Albert Xu Eshaan Pathak Eric Wallace Suchin Gururangan Maarten Sap | 2021/6/6 |