
Yonatan Belinkov

Technion - Israel Institute of Technology

H-index: 43

Asia-Israel

About Yonatan Belinkov

Yonatan Belinkov is a researcher at Technion - Israel Institute of Technology who specializes in natural language processing, model interpretability, and artificial intelligence. He has an h-index of 43 overall and 40 since 2020.

His recent articles include:

BetaAlign: a deep learning approach for multiple sequence alignment

A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry

Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking

Accelerating the Global Aggregation of Local Explanations

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

Concept-Best-Matching: Evaluating Compositionality in Emergent Communication

Unified concept editing in diffusion models

Yonatan Belinkov Information

University	Technion - Israel Institute of Technology
Citations (all)	8836
Citations (since 2020)	8248
Cited by	2585
h-index (all)	43
h-index (since 2020)	40
i10-index (all)	76
i10-index (since 2020)	74

Yonatan Belinkov Skills & Research Interests

Natural Language Processing

Model Interpretability

Artificial Intelligence

Top articles of Yonatan Belinkov

Title	Venue	Author(s)	Publication Date
BetaAlign: a deep learning approach for multiple sequence alignment	bioRxiv	Edo Dotan Elya Wygoda Noa Ecker Michael Alburquerque Oren Avram ...	2024/3/27
A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry	arXiv preprint arXiv:2402.17371	Michael Toker Oren Mishali Ophir Münz-Manor Benny Kimelfeld Yonatan Belinkov	2024/2/27
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms	arXiv preprint arXiv:2403.17806	Michael Hanna Sandro Pezzelle Yonatan Belinkov	2024/3/26
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking	arXiv preprint arXiv:2402.14811	Nikhil Prakash Tamar Rott Shaham Tal Haklay Yonatan Belinkov David Bau	2024/2/22
Accelerating the Global Aggregation of Local Explanations	Proceedings of the AAAI Conference on Artificial Intelligence	Alon Mor Yonatan Belinkov Benny Kimelfeld	2024/3/24
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space	arXiv preprint arXiv:2402.12865	Shahar Katz Yonatan Belinkov Mor Geva Lior Wolf	2024/2/20
Concept-Best-Matching: Evaluating Compositionality in Emergent Communication	arXiv preprint arXiv:2403.14705	Boaz Carmeli Yonatan Belinkov Ron Meir	2024/3/17
Unified concept editing in diffusion models		Rohit Gandikota Hadas Orgad Yonatan Belinkov Joanna Materzyńska David Bau	2024
Effect of tokenization on transformers for biological sequences	Bioinformatics	Edo Dotan Gal Jaschek Tal Pupko Yonatan Belinkov	2024/4/12
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models	arXiv preprint arXiv:2403.19647	Samuel Marks Can Rager Eric J Michaud Yonatan Belinkov David Bau ...	2024/3/28
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines	arXiv preprint arXiv:2403.05846	Michael Toker Hadas Orgad Mor Ventura Dana Arad Yonatan Belinkov	2024/3/9
Understanding arithmetic reasoning in language models using causal mediation analysis	arXiv preprint arXiv:2305.15054	Alessandro Stolfo Yonatan Belinkov Mrinmaya Sachan	2023/5/24
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP		Yonatan Belinkov Sophie Hao Jaap Jumelet Najoung Kim Arya D McCarthy ...	2023/12
Linearity of relation decoding in transformer language models	arXiv preprint arXiv:2308.09124	Evan Hernandez Arnab Sen Sharma Tal Haklay Kevin Meng Martin Wattenberg ...	2023/8/17
Shielded representations: Protecting sensitive attributes through iterative gradient-based projection	arXiv preprint arXiv:2305.10204	Shadi Iskander Kira Radinsky Yonatan Belinkov	2023/5/17
A mechanistic interpretation of arithmetic reasoning in language models using causal mediation analysis		Alessandro Stolfo Yonatan Belinkov Mrinmaya Sachan	2023/12
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias	arXiv preprint arXiv:2308.00225	Itay Itzhak Gabriel Stanovsky Nir Rosenfeld Yonatan Belinkov	2023/8/1
ContraSim--A Similarity Measure Based on Contrastive Learning	arXiv preprint arXiv:2303.16992	Adir Rahamim Yonatan Belinkov	2023/3/29
Generating benchmarks for factuality evaluation of language models	arXiv preprint arXiv:2307.06908	Dor Muhlgay Ori Ram Inbal Magar Yoav Levine Nir Ratner ...	2023/7/13
OM2Seq: Learning retrieval embeddings for optical genome mapping	bioRxiv	Yevgeni Nogin Danielle Sapir Tahir Detinis Zur Nir Weinberger Yonatan Belinkov ...	2023/11/21