Beidi Chen

Stanford University

H-index: 16

United States

About Beidi Chen

Beidi Chen is a distinguished researcher at Stanford University specializing in Machine Learning, with an h-index of 16 overall and a recent h-index of 16 (since 2020).

Her recent articles reflect a diverse array of research interests and contributions to the field:

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Laughing Hyena Distillery: Extracting Compact Recurrences from Convolutions

Prompt-prompted Mixture of Experts for Efficient LLM Generation

Learn To be Efficient: Build Structured Sparsity in Large Language Models

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

Beidi Chen Information

University: Stanford University
Position: ___
Citations (all): 801
Citations (since 2020): 756
Cited By: 145
h-index (all): 16
h-index (since 2020): 16
i10-index (all): 20
i10-index (since 2020): 20
Email:
University Profile Page: Stanford University
Google Scholar: View Google Scholar Profile
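For context on the metrics above: a researcher's h-index is the largest h such that h of their papers each have at least h citations, and the i10-index counts papers with at least 10 citations. Below is a minimal sketch of both computations in Python; the citation counts are hypothetical and purely for illustration.

def h_index(citations):
    # Largest h such that at least h papers have >= h citations each.
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def i10_index(citations):
    # Number of papers with at least 10 citations.
    return sum(1 for cites in citations if cites >= 10)

# Hypothetical per-paper citation counts, for illustration only.
papers = [120, 95, 60, 44, 30, 22, 18, 15, 12, 11, 10, 10, 9, 7, 3]
print(h_index(papers))    # -> 10
print(i10_index(papers))  # -> 12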

Beidi Chen Skills & Research Interests

Machine Learning

Top articles of Beidi Chen

Title: Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Journal: arXiv preprint arXiv:2402.12374
Author(s): Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, ...
Publication Date: 2024/2/19

Title: TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
Journal: arXiv preprint arXiv:2404.11912
Author(s): Hanshi Sun, Zhuoming Chen, Xinyu Yang, Yuandong Tian, Beidi Chen
Publication Date: 2024/4/18

Title: Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Journal: arXiv preprint arXiv:2402.09398
Author(s): Harry Dong, Xinyu Yang, Zhenyu Zhang, Zhangyang Wang, Yuejie Chi, ...
Publication Date: 2024/2/14

Title: Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
Journal: arXiv preprint arXiv:2404.08801
Author(s): Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, ...
Publication Date: 2024/4/12

Title: Laughing Hyena Distillery: Extracting Compact Recurrences from Convolutions
Journal: Advances in Neural Information Processing Systems
Author(s): Stefano Massaroli, Michael Poli, Dan Fu, Hermann Kumbong, Rom Parnichkun, ...
Publication Date: 2024/2/13

Title: Prompt-prompted Mixture of Experts for Efficient LLM Generation
Journal: arXiv preprint arXiv:2404.01365
Author(s): Harry Dong, Beidi Chen, Yuejie Chi
Publication Date: 2024/4/1

Title: Learn To be Efficient: Build Structured Sparsity in Large Language Models
Journal: arXiv preprint arXiv:2402.06126
Author(s): Haizhong Zheng, Xiaoyan Bai, Beidi Chen, Fan Lai, Atul Prakash
Publication Date: 2024/2/9

Title: Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding
Journal: arXiv preprint arXiv:2403.04797
Author(s): Zhenyu Zhang, Runjin Chen, Shiwei Liu, Zhewei Yao, Olatunji Ruwase, ...
Publication Date: 2024/3/5

Title: KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Journal: arXiv preprint arXiv:2402.02750
Author(s): Zirui Liu, Jiayi Yuan, Hongye Jin, Shaochen Zhong, Zhaozhuo Xu, ...
Publication Date: 2024/2/5

Title: LLM Inference Unveiled: Survey and Roofline Model Insights
Journal: arXiv preprint arXiv:2402.16363
Author(s): Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Chenhao Xue, ...
Publication Date: 2024/2/26

Title: Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Journal: International Conference on Machine Learning
Author(s): Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du
Publication Date: 2023/5/25

Title: CocktailSGD: Fine-tuning Foundation Models over 500Mbps Networks
Author(s): Jue Wang, Yucheng Lu, Binhang Yuan, Beidi Chen, Percy Liang, ...
Publication Date: 2023/7/3

Title: JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
Journal: International Conference on Learning Representations (ICLR)
Author(s): Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du
Publication Date: 2023/10/1

Title: Compress, then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Journal: arXiv preprint arXiv:2305.11186
Author(s): Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, ...
Publication Date: 2023/5/17

Title: Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Author(s): Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, ...
Publication Date: 2023/7/3

Title: Efficient Streaming Language Models with Attention Sinks
Journal: arXiv preprint arXiv:2309.17453
Author(s): Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
Publication Date: 2023/9/29

Title: Sample-efficient Surrogate Model for Frequency Response of Linear PDEs using Self-Attentive Complex Polynomials
Journal: arXiv preprint arXiv:2301.02747
Author(s): Andrew Cohen, Weiping Dou, Jiang Zhu, Slawomir Koziel, Peter Renner, ...
Publication Date: 2023/1/6

Title: H₂O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Journal: International Conference on Machine Learning
Author(s): Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, ...
Publication Date: 2023/6/24

Title: On the Similarity between Attention and SVM on the Token Separation and Selection Behavior
Author(s): Beidi Chen, Wentao Guo, Zhihang Li, Zhao Song, Tianyi Zhou
Publication Date: 2023/9/22

Title: InRank: Incremental Low-rank Learning
Journal: arXiv preprint arXiv:2306.11250
Author(s): Jiawei Zhao, Yifei Zhang, Beidi Chen, Florian Schäfer, Anima Anandkumar
Publication Date: 2023/6/20

Co-Authors

David Culler, University of California, Berkeley (H-index: 132)

Randy Katz, University of California, Berkeley (H-index: 128)

Christopher Ré, Stanford University (H-index: 87)

Anima Anandkumar, California Institute of Technology (H-index: 74)

Farinaz Koushanfar, University of California, San Diego (H-index: 69)

Rebecca Steorts, Duke University (H-index: 16)
