Yair Carmon
Tel Aviv University
H-index: 22
Top articles of Yair Carmon
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Language models scale reliably with over-training and on downstream tasks | arXiv preprint arXiv:2403.08540 | Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman | 2024/3/13 |
The Price of Adaptivity in Stochastic Convex Optimization | arXiv preprint arXiv:2402.10898 | Yair Carmon, Oliver Hinder | 2024/2/16 |
DataComp: In search of the next generation of multimodal datasets | Advances in Neural Information Processing Systems | Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis | 2024/2/13 |
A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions | | Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford | 2024 |
Accelerated Parameter-Free Stochastic Optimization | arXiv preprint arXiv:2404.00666 | Itai Kreisler, Maor Ivgi, Oliver Hinder, Yair Carmon | 2024/3/31 |
Gradient descent monotonically decreases the sharpness of gradient flow solutions in scalar networks and beyond | | Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, Yair Carmon | 2023/7/3 |
DoG is SGD's best friend: A parameter-free dynamic step size schedule | | Maor Ivgi, Oliver Hinder, Yair Carmon | 2023/7/3 |
Lower bounds for non-convex stochastic optimization | Mathematical Programming | Yossi Arjevani, Yair Carmon, John C Duchi, Dylan J Foster, Nathan Srebro | 2023/5 |
ReSQueing parallel and private stochastic convex optimization | | Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu | 2023/11/6 |
Malign overfitting: Interpolation can provably preclude invariance | arXiv preprint arXiv:2211.15724 | Yoav Wald, Gal Yona, Uri Shalit, Yair Carmon | 2022/11/28 |
ReCAPP: Crafting a more efficient catalyst for convex optimization | | Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford | 2022/6/28 |
Making SGD parameter-free | | Yair Carmon, Oliver Hinder | 2022/6/28 |
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time | | Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes | 2022/6/28 |
Optimal and adaptive Monteiro-Svaiter acceleration | Advances in Neural Information Processing Systems | Yair Carmon, Danielle Hausler, Arun Jambulapati, Yujia Jin, Aaron Sidford | 2022/12/6 |
Scaling laws under the microscope: Predicting transformer performance from small scale experiments | arXiv preprint arXiv:2202.06387 | Maor Ivgi, Yair Carmon, Jonathan Berant | 2022/2/13 |
Distributionally robust optimization via ball oracle acceleration | Advances in Neural Information Processing Systems | Yair Carmon, Danielle Hausler | 2022/12/6 |
Stochastic bias-reduced gradient methods | | Hilal Asi, Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford | 2021/6/17 |
Never go full batch (in stochastic convex optimization) | Advances in Neural Information Processing Systems | Idan Amir, Yair Carmon, Tomer Koren, Roi Livni | 2021/12/6 |
Thinking inside the ball: Near-optimal minimization of the maximal loss | | Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford | 2021/7/21 |
Accuracy on the line: on the strong correlation between out-of-distribution and in-distribution generalization | | John P Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh | 2021/7/1 |