Sanjeev Arora

Princeton University

H-index: 75

North America-United States

About Sanjeev Arora

Sanjeev Arora is a distinguished researcher at Princeton University, with an h-index of 75 overall and 52 since 2020. He specializes in theoretical machine learning and theoretical computer science.

His recent articles reflect a diverse array of research interests and contributions to the field:

Fine-tuning language models with just forward passes

Why (and When) does Local SGD Generalize Better than SGD?

A theory for emergence of complex skills in language models

Trainable transformer in transformer

Task-specific skill localization in fine-tuned language models

A kernel-based view of language model fine-tuning

Understanding the generalization benefit of normalization layers: Sharpness reduction

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

Sanjeev Arora Information

University: Princeton University
Position: Professor of Computer Science
Citations (all): 35153
Citations (since 2020): 16578
Cited By: 24971
h-index (all): 75
h-index (since 2020): 52
i10-index (all): 143
i10-index (since 2020): 113

Sanjeev Arora Skills & Research Interests

theoretical machine learning

theoretical computer science

Top articles of Sanjeev Arora

Title: Fine-tuning language models with just forward passes
Journal: Advances in Neural Information Processing Systems
Author(s): Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D Lee, ...
Publication Date: 2024/2/13

Title: Why (and When) does Local SGD Generalize Better than SGD?
Journal: arXiv preprint arXiv:2303.01215
Author(s): Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
Publication Date: 2023/3/2

Title: A theory for emergence of complex skills in language models
Journal: arXiv preprint arXiv:2307.15936
Author(s): Sanjeev Arora, Anirudh Goyal
Publication Date: 2023/7/29

Title: Trainable transformer in transformer
Journal: arXiv preprint arXiv:2307.01189
Author(s): Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
Publication Date: 2023/7/3

Title: Task-specific skill localization in fine-tuned language models
Author(s): Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora
Publication Date: 2023/7/3

Title: A kernel-based view of language model fine-tuning
Author(s): Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
Publication Date: 2023/7/3

Title: Understanding the generalization benefit of normalization layers: Sharpness reduction
Journal: Advances in Neural Information Processing Systems
Author(s): Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
Publication Date: 2022/12/6

Title: New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound
Journal: Advances in Neural Information Processing Systems
Author(s): Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
Publication Date: 2022/12/6

Title: On the SDEs and scaling rules for adaptive gradient algorithms
Journal: Advances in Neural Information Processing Systems
Author(s): Sadhika Malladi, Kaifeng Lyu, Abhishek Panigrahi, Sanjeev Arora
Publication Date: 2022/12/6

Title: Understanding Influence Functions and Datamodels via Harmonic Analysis
Author(s): Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora
Publication Date: 2022/9/29

Title: Understanding contrastive learning requires incorporating inductive biases
Author(s): Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, ...
Publication Date: 2022/6/28

Title: Understanding gradient descent on the edge of stability in deep learning
Author(s): Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi
Publication Date: 2022/6/28

Title: What Happens after SGD Reaches Zero Loss?--A Mathematical Framework
Journal: arXiv preprint arXiv:2110.06914
Author(s): Zhiyuan Li, Tianhao Wang, Sanjeev Arora
Publication Date: 2021/10/13

Title: Opening the Black Box of Deep Learning: Some Lessons and Take-aways
Author(s): Sanjeev Arora
Publication Date: 2021/5/31

Title: Evaluating gradient inversion attacks and defenses in federated learning
Journal: Advances in Neural Information Processing Systems
Author(s): Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora
Publication Date: 2021/12/6

Title: Rip van Winkle's Razor: A Simple Estimate of Overfit to Test Data
Journal: arXiv preprint arXiv:2102.13189
Author(s): Sanjeev Arora, Yi Zhang
Publication Date: 2021/2/25

Title: Gradient descent on two-layer nets: Margin maximization and simplicity bias
Journal: Advances in Neural Information Processing Systems
Author(s): Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora
Publication Date: 2021/12/6

Title: On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs)
Journal: Advances in Neural Information Processing Systems
Author(s): Zhiyuan Li, Sadhika Malladi, Sanjeev Arora
Publication Date: 2021/12/6

Title: Technical perspective: Why don't today's deep nets overfit to their training data?
Journal: Communications of the ACM
Author(s): Sanjeev Arora
Publication Date: 2021/2/22

Title: On Predicting Generalization using GANs
Journal: arXiv preprint arXiv:2111.14212
Author(s): Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora
Publication Date: 2021/11/28

Co-Authors

Madhu Sudan, Harvard University (H-index: 69)

Elad Hazan, Princeton University (H-index: 63)

Tengyu MA, Stanford University (H-index: 54)

Yuanzhi Li, Carnegie Mellon University (H-index: 45)

Simon Shaolei Du, University of Washington (H-index: 45)

Rong Ge, Duke University (H-index: 44)
