Beidi Chen

Stanford University

H-index: 16

North America-United States

About Beidi Chen

Beidi Chen is a distinguished researcher at Stanford University who specializes in Machine Learning, with an h-index of 16 overall and 16 since 2020.

Her recent articles reflect a diverse array of research interests and contributions to the field:

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Prompt-prompted Mixture of Experts for Efficient LLM Generation

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

LLM Inference Unveiled: Survey and Roofline Model Insights

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

Laughing hyena distillery: Extracting compact recurrences from convolutions

Beidi Chen Information

University: Stanford University
Position: ___
Citations (all): 801
Citations (since 2020): 756
Cited By: 145
h-index (all): 16
h-index (since 2020): 16
i10-index (all): 20
i10-index (since 2020): 20


Beidi Chen Skills & Research Interests

Machine Learning

Top articles of Beidi Chen

TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

arXiv preprint arXiv:2404.11912

2024/4/18

Xinyu Yang (H-Index: 2)
Beidi Chen (H-Index: 7)
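For readers unfamiliar with the technique named in the title, the sketch below illustrates plain speculative decoding in a toy form; it is not TriForce's hierarchical method. The `draft_model`, `target_model`, and three-token vocabulary are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import random

# Toy speculative decoding sketch (illustrative; not TriForce's
# hierarchical scheme). A cheap draft model proposes k tokens, and the
# expensive target model verifies them with an accept/reject rule that
# preserves the target distribution exactly, hence "lossless".

VOCAB = ["a", "b", "c"]

def draft_model(ctx):
    # Hypothetical cheap proposal distribution (uniform, for illustration).
    return {t: 1 / len(VOCAB) for t in VOCAB}

def target_model(ctx):
    # Hypothetical expensive distribution the output must match.
    w = {t: 1 + ctx.count(t) for t in VOCAB}
    z = sum(w.values())
    return {t: v / z for t, v in w.items()}

def sample(dist):
    return random.choices(list(dist), weights=dist.values())[0]

def speculative_step(ctx, k=4):
    """Draft k tokens, then verify each against the target model."""
    drafted = []
    for _ in range(k):
        drafted.append(sample(draft_model(ctx + drafted)))
    out = []
    for tok in drafted:
        p, q = target_model(ctx + out), draft_model(ctx + out)
        if random.random() < min(1.0, p[tok] / q[tok]):
            out.append(tok)          # accepted: keep the drafted token
        else:
            # rejected: resample from the residual distribution and stop
            residual = {t: max(p[t] - q[t], 0.0) for t in VOCAB}
            z = sum(residual.values())
            out.append(sample({t: v / z for t, v in residual.items()}))
            break
    return out

print(speculative_step(["a", "b"]))
```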

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

arXiv preprint arXiv:2404.08801

2024/4/12

Prompt-prompted Mixture of Experts for Efficient LLM Generation

arXiv preprint arXiv:2404.01365

2024/4/1

Beidi Chen (H-Index: 7)
Yuejie Chi (H-Index: 25)

Found in the Middle: How Language Models Use Long Contexts Better via Plug-and-Play Positional Encoding

arXiv preprint arXiv:2403.04797

2024/3/5

LLM Inference Unveiled: Survey and Roofline Model Insights

arXiv preprint arXiv:2402.16363

2024/2/26

Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

arXiv preprint arXiv:2402.12374

2024/2/19

Zhihao Jia (H-Index: 16)
Beidi Chen (H-Index: 7)

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

arXiv preprint arXiv:2402.09398

2024/2/14

Laughing hyena distillery: Extracting compact recurrences from convolutions

Advances in Neural Information Processing Systems

2024/2/13

Learn To be Efficient: Build Structured Sparsity in Large Language Models

arXiv preprint arXiv:2402.06126

2024/2/9

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

arXiv preprint arXiv:2402.02750

2024/2/5
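As a rough illustration of the quantization family KIVI belongs to, here is a minimal asymmetric low-bit quantizer in NumPy; it is not KIVI's exact per-channel/per-token scheme, and the tensor shapes are made up for the example.

```python
import numpy as np

# Minimal asymmetric quantization sketch: map floats to integer levels
# using a scale and a zero-point, so dequantize(quantize(x)) ~= x.
# (Illustrative only; KIVI's actual grouping strategy differs.)

def quantize_asym(x, bits=2):
    levels = 2 ** bits - 1                     # 2-bit -> integers in {0..3}
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels or 1.0          # guard against a flat tensor
    q = np.clip(np.round((x - lo) / scale), 0, levels).astype(np.uint8)
    return q, scale, lo                        # lo acts as the zero-point

def dequantize(q, scale, zero_point):
    return q.astype(np.float32) * scale + zero_point

kv = np.random.randn(4, 8).astype(np.float32)  # stand-in for a KV cache tile
q, scale, zp = quantize_asym(kv, bits=2)
print("max abs error:", np.abs(dequantize(q, scale, zp) - kv).max())
```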

HexGen: Generative inference of foundation model over heterogeneous decentralized environment

arXiv preprint arXiv:2311.11514

2023/11/20

JoMA: Demystifying multilayer transformers via joint dynamics of MLP and attention

(ICLR) International Conference on Learning Representations

2023/10/1

Efficient streaming language models with attention sinks

arXiv preprint arXiv:2309.17453

2023/9/29
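The core cache policy behind attention sinks is simple enough to sketch: retain the first few "sink" tokens plus a sliding window of recent tokens, and evict the middle. The sizes below are illustrative defaults, not the paper's tuned values.

```python
# Sketch of the StreamingLLM-style KV cache policy: keep a handful of
# initial "sink" tokens plus a window of recent tokens; evict the rest.

def streaming_cache_indices(seq_len, n_sinks=4, window=8):
    """Return the token positions retained in the KV cache."""
    if seq_len <= n_sinks + window:
        return list(range(seq_len))            # everything still fits
    sinks = list(range(n_sinks))               # initial attention sinks
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent

print(streaming_cache_indices(20))  # [0, 1, 2, 3, 12, 13, ..., 19]
```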

On the Similarity between Attention and SVM on the Token Separation and Selection Behavior

2023/9/22

Towards Structured Sparsity in Transformers for Efficient Inference

2023/7/16

Beidi Chen (H-Index: 7)
Yuejie Chi (H-Index: 25)

Fast Algorithms for a New Relaxation of Optimal Transport

2023/7/12

CocktailSGD: Fine-tuning foundation models over 500Mbps networks

2023/7/3

Deja Vu: Contextual sparsity for efficient LLMs at inference time

2023/7/3
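To make the phrase "contextual sparsity" concrete, the toy below computes an MLP block using only the hidden neurons flagged as active for the current input. The oracle predictor and random weights are stand-ins; the paper trains cheap learned predictors instead.

```python
import numpy as np

# Toy contextual-sparsity sketch: compute only the MLP neurons predicted
# to be active for this input. (Illustrative; not Deja Vu's predictors.)

rng = np.random.default_rng(0)
d, h = 16, 64
W1 = rng.standard_normal((d, h))
W2 = rng.standard_normal((h, d))

def mlp_sparse(x, active):
    """Evaluate the MLP restricted to the `active` hidden neurons."""
    hidden = np.maximum(x @ W1[:, active], 0.0)   # ReLU on selected columns
    return hidden @ W2[active, :]

x = rng.standard_normal(d)
# Stand-in "predictor": an oracle that looks at the true activations.
active = np.where(np.maximum(x @ W1, 0.0) > 0)[0]
print(len(active), "of", h, "neurons computed;", mlp_sparse(x, active).shape)
```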

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

International Conference on Machine Learning

2023/6/24
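The eviction idea behind a heavy-hitter oracle can be sketched in a few lines: score each cached token by its accumulated attention mass, then keep the top scorers plus the most recent tokens. This is an illustrative simplification, not the paper's exact algorithm, and `attn_history` is a made-up input.

```python
import numpy as np

# Heavy-hitter-style KV cache eviction sketch (illustrative, not H2O's
# exact method): keep tokens that accumulated the most attention, plus
# a small set of the most recent tokens.

def evict_kv(attn_history, budget, n_recent=4):
    """attn_history: (steps, seq_len) attention weights over cached tokens.
    Returns sorted indices of tokens kept under the cache budget."""
    seq_len = attn_history.shape[1]
    scores = attn_history.sum(axis=0)           # accumulated attention mass
    recent = set(range(seq_len - n_recent, seq_len))
    k = max(budget - n_recent, 0)
    heavy = [i for i in np.argsort(-scores) if i not in recent][:k]
    return sorted(set(heavy) | recent)

attn = np.random.rand(10, 16)                   # made-up attention history
print(evict_kv(attn, budget=8))
```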

InRank: Incremental low-rank learning

arXiv preprint arXiv:2306.11250

2023/6/20
