Sanjeev Arora

Princeton University

H-index: 75

North America-United States

About Sanjeev Arora

Sanjeev Arora is a professor of computer science at Princeton University, with an h-index of 75 overall and 52 since 2020. He specializes in theoretical machine learning and theoretical computer science.

His recent articles include:

Fine-tuning language models with just forward passes

Why (and When) does Local SGD Generalize Better than SGD?

A theory for emergence of complex skills in language models

Trainable transformer in transformer

Task-specific skill localization in fine-tuned language models

A kernel-based view of language model fine-tuning

Understanding the generalization benefit of normalization layers: Sharpness reduction

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Sanjeev Arora Information

University	Princeton University
Position	Professor of Computer Science
Citations (all)	35153
Citations (since 2020)	16578
Cited By	24971
h-index (all)	75
h-index (since 2020)	52
i10-index (all)	143
i10-index (since 2020)	113

Sanjeev Arora Skills & Research Interests

theoretical machine learning

theoretical computer science

Top articles of Sanjeev Arora

Title	Journal	Author(s)	Publication Date
Fine-tuning language models with just forward passes	Advances in Neural Information Processing Systems	Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D Lee, ...	2024/2/13
Why (and When) does Local SGD Generalize Better than SGD?	arXiv preprint arXiv:2303.01215	Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora	2023/3/2
A theory for emergence of complex skills in language models	arXiv preprint arXiv:2307.15936	Sanjeev Arora, Anirudh Goyal	2023/7/29
Trainable transformer in transformer	arXiv preprint arXiv:2307.01189	Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora	2023/7/3
Task-specific skill localization in fine-tuned language models		Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora	2023/7/3
A kernel-based view of language model fine-tuning		Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora	2023/7/3
Understanding the generalization benefit of normalization layers: Sharpness reduction	Advances in Neural Information Processing Systems	Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora	2022/12/6
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound	Advances in Neural Information Processing Systems	Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora	2022/12/6
On the SDEs and scaling rules for adaptive gradient algorithms	Advances in Neural Information Processing Systems	Sadhika Malladi, Kaifeng Lyu, Abhishek Panigrahi, Sanjeev Arora	2022/12/6
Understanding Influence Functions and Datamodels via Harmonic Analysis		Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora	2022/9/29
Understanding contrastive learning requires incorporating inductive biases		Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, ...	2022/6/28
Understanding gradient descent on the edge of stability in deep learning		Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi	2022/6/28
What Happens after SGD Reaches Zero Loss?--A Mathematical Framework	arXiv preprint arXiv:2110.06914	Zhiyuan Li, Tianhao Wang, Sanjeev Arora	2021/10/13
Opening the Black Box of Deep Learning: Some Lessons and Take-aways		Sanjeev Arora	2021/5/31
Evaluating gradient inversion attacks and defenses in federated learning	Advances in Neural Information Processing Systems	Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora	2021/12/6
Rip van Winkle's Razor: A Simple Estimate of Overfit to Test Data	arXiv preprint arXiv:2102.13189	Sanjeev Arora, Yi Zhang	2021/2/25
Gradient descent on two-layer nets: Margin maximization and simplicity bias	Advances in Neural Information Processing Systems	Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora	2021/12/6
On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)	Advances in Neural Information Processing Systems	Zhiyuan Li, Sadhika Malladi, Sanjeev Arora	2021/12/6
Technical perspective: Why don't today's deep nets overfit to their training data?	Communications of the ACM	Sanjeev Arora	2021/2/22
On Predicting Generalization using GANs	arXiv preprint arXiv:2111.14212	Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora	2021/11/28