
Emma Strubell

Carnegie Mellon University

H-index: 19

North America-United States

About Emma Strubell

Emma Strubell is a researcher at Carnegie Mellon University specializing in Natural Language Processing, Machine Learning, and Green AI, with an h-index of 19 overall and 18 since 2020.

Recent articles include:

Making scalable meta learning practical

OLMo: Accelerating the science of language models

Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters

To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing

Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution

Regularizing Self-training for Unsupervised Domain Adaptation via Structural Constraints

Understanding the effect of model compression on social bias in large language models

Emma Strubell Information

University	Carnegie Mellon University
Position	Assistant Professor
Citations(all)	5504
Citations(since 2020)	5260
Cited By	1629
hIndex(all)	19
hIndex(since 2020)	18
i10Index(all)	27
i10Index(since 2020)	23

Emma Strubell Skills & Research Interests

Natural Language Processing

Machine Learning

Green AI

Top articles of Emma Strubell

Title	Journal	Author(s)	Publication Date
Making scalable meta learning practical	Advances in neural information processing systems	Sang Choe Sanket Vaibhav Mehta Hwijeen Ahn Willie Neiswanger Pengtao Xie ...	2024/2/13
OLMo: Accelerating the science of language models	arXiv preprint arXiv:2402.00838	Dirk Groeneveld Iz Beltagy Pete Walsh Akshita Bhagia Rodney Kinney ...	2024/2/1
Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research	arXiv preprint arXiv:2402.00159	Luca Soldaini Rodney Kinney Akshita Bhagia Dustin Schwenk David Atkinson ...	2024/1/31
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters	arXiv preprint arXiv:2401.06408	Li Lucy Suchin Gururangan Luca Soldaini Emma Strubell David Bamman ...	2024/1/12
To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing	arXiv preprint arXiv:2310.07715	Sireesh Gururaja Amanda Bertsch Clara Na David Gray Widder Emma Strubell	2023/10/11
Annotating Mentions Alone Enables Efficient Domain Adaptation for Coreference Resolution		Nupoor Gandhi Anjalie Field Emma Strubell	2023/7
Regularizing Self-training for Unsupervised Domain Adaptation via Structural Constraints	arXiv preprint arXiv:2305.00131	Rajshekhar Das Jonathan Francis Sanket Vaibhav Mehta Jean Oh Emma Strubell ...	2023/4/29
Understanding the effect of model compression on social bias in large language models	arXiv preprint arXiv:2312.05662	Gustavo Gonçalves Emma Strubell	2023/12/9
The Framework Tax: Disparities Between Inference Efficiency in NLP Research and Deployment	Empirical Methods in Natural Language Processing (EMNLP)	Jared Fernandez Jacob Kahn Clara Na Yonatan Bisk Emma Strubell	2023/2/13
Efficiency pentathlon: A standardized arena for efficiency evaluation	arXiv preprint arXiv:2307.09701	Hao Peng Qingqing Cao Jesse Dodge Matthew E Peters Jared Fernandez ...	2023/7/19
Queer people are people first: Deconstructing sexual identity stereotypes in large language models	arXiv preprint arXiv:2307.00101	Harnoor Dhingra Preetiha Jayashanker Sayali Moghe Emma Strubell	2023/6/30
Energy and Carbon Considerations of Fine-Tuning BERT	arXiv preprint arXiv:2311.10267	Xiaorong Wang Clara Na Emma Strubell Sorelle Friedler Sasha Luccioni	2023/11/17
Dissecting Efficient Architectures for Wake-Word Detection		Cody Berger Juncheng B Li Yiyuan Li Aaron Berger Dmitri Berger ...	2023/7/16
Efficient and equitable natural language processing in the age of deep learning (Dagstuhl Seminar 22232)		Jesse Dodge Iryna Gurevych Roy Schwartz Emma Strubell Betty van Aken	2023
Surveying (dis)parities and concerns of compute hungry NLP research		Ji-Ung Lee Haritz Puerto Betty van Aken Yuki Arase Jessica Zosa Forde ...	2023/6/29
Power Hungry Processing: Watts Driving the Cost of AI Deployment?	arXiv e-prints	Alexandra Sasha Luccioni Yacine Jernite Emma Strubell	2023/11
Efficient methods for natural language processing: A survey	TACL	Marcos Treviso Tianchu Ji Ji-Ung Lee Betty van Aken Qingqing Cao ...	2023/4
An empirical investigation of the role of pre-training in lifelong learning	Journal of Machine Learning Research	Sanket Vaibhav Mehta Darshan Patil Sarath Chandar Emma Strubell	2023
Large Language Model Distillation Doesn't Need a Teacher	arXiv preprint arXiv:2305.14864	Ananya Harsh Jha Dirk Groeneveld Emma Strubell Iz Beltagy	2023/5/24
Efficiency Pentathlon: A Standardized Benchmark for Efficiency Evaluation		Hao Peng Qingqing Cao Jesse Dodge Matthew E Peters Jared Fernandez ...	2023/10/13