Mitchell Wortsman
University of Washington
H-index: 19
United States
Top articles of Mitchell Wortsman
Title | Venue | Author(s) | Publication Date |
---|---|---|---|
Language models scale reliably with over-training and on downstream tasks | arXiv preprint arXiv:2403.08540 | Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman | 2024/3/13 |
DataComp: In search of the next generation of multimodal datasets | Advances in Neural Information Processing Systems | Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis | 2024/2/13 |
OLMo: Accelerating the science of language models | arXiv preprint arXiv:2402.00838 | Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney | 2024/2/1 |
Robust and reliable large-scale transfer learning | | Mitchell Wortsman | 2024 |
Stable and low-precision training for large-scale vision-language models | Advances in Neural Information Processing Systems | Mitchell Wortsman, Tim Dettmers, Luke Zettlemoyer, Ari Morcos, Ali Farhadi | 2023/12/15 |
Reproducible scaling laws for contrastive language-image learning | | Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco | 2023 |
Small-scale proxies for large-scale transformer training instabilities | arXiv preprint arXiv:2309.14322 | Mitchell Wortsman, Peter J Liu, Lechao Xiao, Katie Everett, Alex Alemi | 2023/9/25 |
Replacing softmax with ReLU in vision transformers | arXiv preprint arXiv:2309.08586 | Mitchell Wortsman, Jaehoon Lee, Justin Gilmer, Simon Kornblith | 2023/9/15 |
OpenFlamingo: An open-source framework for training large autoregressive vision-language models | arXiv preprint arXiv:2308.01390 | Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy | 2023/8/2 |
The role of pre-training data in transfer learning | arXiv preprint arXiv:2302.13602 | Rahim Entezari, Mitchell Wortsman, Olga Saukh, M Moein Shariatnia, Hanie Sedghi | 2023/2/27 |
CoWs on pasture: Baselines and benchmarks for language-driven zero-shot object navigation | arXiv preprint arXiv:2203.10421 | Samir Yitzhak Gadre, Mitchell Wortsman, Gabriel Ilharco, Ludwig Schmidt, Shuran Song | 2022/3/20 |
Robust fine-tuning of zero-shot models | | Mitchell Wortsman*, Gabriel Ilharco*, Jong Wook Kim, Mike Li, Simon Kornblith | 2022 |
Exploring the landscape of distributional robustness for question answering models | arXiv preprint arXiv:2210.12517 | Anas Awadalla, Mitchell Wortsman, Gabriel Ilharco, Sewon Min, Ian Magnusson | 2022/10/22 |
lo-fi: distributed fine-tuning without communication | arXiv preprint arXiv:2210.11948 | Mitchell Wortsman, Suchin Gururangan, Shen Li, Ali Farhadi, Ludwig Schmidt | 2022/10/19 |
Editing models with task arithmetic | arXiv preprint arXiv:2212.04089 | Gabriel Ilharco, Marco Tulio Ribeiro, Mitchell Wortsman, Suchin Gururangan, Ludwig Schmidt | 2022/12/8 |
How well do contrastively trained models transfer? | | M Moein Shariatnia, Rahim Entezari, Mitchell Wortsman, Olga Saukh, Ludwig Schmidt | 2022/7/23 |
LAION-5B: An open large-scale dataset for training next generation image-text models | Advances in Neural Information Processing Systems | Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman | 2022/12/6 |
Data determines distributional robustness in contrastive language image pre-training (CLIP) | International Conference on Machine Learning (ICML) | Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar | 2022/5/3 |
Patching open-vocabulary models by interpolating weights | Advances in Neural Information Processing Systems | Gabriel Ilharco, Mitchell Wortsman, Samir Yitzhak Gadre, Shuran Song, Hannaneh Hajishirzi | 2022/12/6 |
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | | Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes | 2022/6/28 |
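The last row's title states the method in full: a model soup uniformly averages the weights of several models fine-tuned from the same initialization, so inference cost stays that of a single model. Below is a minimal sketch of that averaging step, assuming PyTorch-style state dicts; the function name and variables are illustrative, not taken from the paper's released code.

```python
import torch


def uniform_soup(state_dicts):
    """Parameter-wise average of checkpoints sharing one architecture."""
    soup = {}
    for key in state_dicts[0]:
        # Stack the same parameter from every fine-tuned model, then average.
        soup[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return soup


# Usage sketch: average several fine-tuned checkpoints and load the result
# into one model, so inference cost is unchanged relative to a single model.
# soup = uniform_soup([torch.load(path) for path in checkpoint_paths])
# model.load_state_dict(soup)
```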