# Luke Zettlemoyer
University of Washington · H-index: 100 · United States

## Top articles of Luke Zettlemoyer
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
Lima: Less is more for alignment | arXiv preprint arXiv:2305.11206 | Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun | 2023/5/18 |
Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research | arXiv preprint arXiv:2402.00159 | Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson | 2024/1/31 |
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments | arXiv preprint arXiv:2404.07972 | Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao | 2024/4/11 |
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling | arXiv preprint arXiv:2403.10691 | Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer | 2024/3/15 |
Megabyte: Predicting million-byte sequences with multiscale transformers | arXiv preprint arXiv:2305.07185 | Lili Yu, Dániel Simig, Colin Flaherty, Armen Aghajanyan, Luke Zettlemoyer | 2023/5/12 |
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens | arXiv preprint arXiv:2401.17377 | Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi | 2024/1/30 |
Reliable, adaptable, and attributable language models with retrieval | arXiv preprint arXiv:2403.03187 | Akari Asai, Zexuan Zhong, Danqi Chen, Pang Wei Koh, Luke Zettlemoyer | 2024/3/5 |
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models | arXiv preprint arXiv:2401.10440 | Terra Blevins, Tomasz Limisiewicz, Suchin Gururangan, Margaret Li, Hila Gonen | 2024/1/19 |
Toolformer: Language models can teach themselves to use tools | Advances in Neural Information Processing Systems | Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli | 2024/2/13 |
Comparing hallucination detection metrics for multilingual generation | arXiv preprint arXiv:2402.10496 | Haoqiang Kang, Terra Blevins, Luke Zettlemoyer | 2024/2/16 |
Do Membership Inference Attacks Work on Large Language Models? | arXiv preprint arXiv:2402.07841 | Michael Duan, Anshuman Suri, Niloofar Mireshghallah, Sewon Min, Weijia Shi | 2024/2/12 |
MoDE: CLIP Data Experts via Clustering | arXiv preprint arXiv:2404.16030 | Jiawei Ma, Po-Yao Huang, Saining Xie, Shang-Wen Li, Luke Zettlemoyer | 2024/4/24 |
Qlora: Efficient finetuning of quantized llms | Advances in Neural Information Processing Systems | Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer | 2024/2/13 |
Olmo: Accelerating the science of language models | arXiv preprint arXiv:2402.00838 | Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney | 2024/2/1 |
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | arXiv preprint arXiv:2404.08801 | Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu | 2024/4/12 |
Legonn: Building modular encoder-decoder models | IEEE/ACM Transactions on Audio, Speech, and Language Processing | Siddharth Dalmia, Dmytro Okhonko, Mike Lewis, Sergey Edunov, Shinji Watanabe | 2023/7/17 |
Retrieval-based Language Models Using a Multi-domain Datastore | | Rulin Shao, Sewon Min, Luke Zettlemoyer, Pang Wei Koh | 2023/12/7 |
The belebele benchmark: a parallel reading comprehension dataset in 122 language variants | arXiv preprint arXiv:2308.16884 | Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla | 2023/8/31 |
Eliciting Attributions from LLMs with Minimal Supervision | | Ramakanth Pasunuru, Koustuv Sinha, Armen Aghajanyan, Lili Yu, Tianlu Wang | 2023/10/13 |
Cit: Curation in training for effective vision-language data | | Hu Xu, Saining Xie, Po-Yao Huang, Licheng Yu, Russell Howes | 2023 |