Yonatan Belinkov
Technion - Israel Institute of Technology
H-index: 43
Asia-Israel
Top articles of Yonatan Belinkov
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
BetaAlign: a deep learning approach for multiple sequence alignment | bioRxiv | Edo Dotan Elya Wygoda Noa Ecker Michael Alburquerque Oren Avram | 2024/3/27 |
A Dataset for Metaphor Detection in Early Medieval Hebrew Poetry | arXiv preprint arXiv:2402.17371 | Michael Toker Oren Mishali Ophir Münz-Manor Benny Kimelfeld Yonatan Belinkov | 2024/2/27 |
Have Faith in Faithfulness: Going Beyond Circuit Overlap When Finding Model Mechanisms | arXiv preprint arXiv:2403.17806 | Michael Hanna Sandro Pezzelle Yonatan Belinkov | 2024/3/26 |
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking | arXiv preprint arXiv:2402.14811 | Nikhil Prakash Tamar Rott Shaham Tal Haklay Yonatan Belinkov David Bau | 2024/2/22 |
Accelerating the Global Aggregation of Local Explanations | Proceedings of the AAAI Conference on Artificial Intelligence | Alon Mor Yonatan Belinkov Benny Kimelfeld | 2024/3/24 |
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space | arXiv preprint arXiv:2402.12865 | Shahar Katz Yonatan Belinkov Mor Geva Lior Wolf | 2024/2/20 |
Concept-Best-Matching: Evaluating Compositionality in Emergent Communication | arXiv preprint arXiv:2403.14705 | Boaz Carmeli Yonatan Belinkov Ron Meir | 2024/3/17 |
Unified concept editing in diffusion models | | Rohit Gandikota Hadas Orgad Yonatan Belinkov Joanna Materzyńska David Bau | 2024 |
Effect of tokenization on transformers for biological sequences | Bioinformatics | Edo Dotan Gal Jaschek Tal Pupko Yonatan Belinkov | 2024/4/12 |
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models | arXiv preprint arXiv:2403.19647 | Samuel Marks Can Rager Eric J Michaud Yonatan Belinkov David Bau | 2024/3/28 |
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines | arXiv preprint arXiv:2403.05846 | Michael Toker Hadas Orgad Mor Ventura Dana Arad Yonatan Belinkov | 2024/3/9 |
Understanding arithmetic reasoning in language models using causal mediation analysis | arXiv preprint arXiv:2305.15054 | Alessandro Stolfo Yonatan Belinkov Mrinmaya Sachan | 2023/5/24 |
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP | | Yonatan Belinkov Sophie Hao Jaap Jumelet Najoung Kim Arya D McCarthy | 2023/12 |
Linearity of relation decoding in transformer language models | arXiv preprint arXiv:2308.09124 | Evan Hernandez Arnab Sen Sharma Tal Haklay Kevin Meng Martin Wattenberg | 2023/8/17 |
Shielded representations: Protecting sensitive attributes through iterative gradient-based projection | arXiv preprint arXiv:2305.10204 | Shadi Iskander Kira Radinsky Yonatan Belinkov | 2023/5/17 |
A mechanistic interpretation of arithmetic reasoning in language models using causal mediation analysis | | Alessandro Stolfo Yonatan Belinkov Mrinmaya Sachan | 2023/12 |
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias | arXiv preprint arXiv:2308.00225 | Itay Itzhak Gabriel Stanovsky Nir Rosenfeld Yonatan Belinkov | 2023/8/1 |
ContraSim--A Similarity Measure Based on Contrastive Learning | arXiv preprint arXiv:2303.16992 | Adir Rahamim Yonatan Belinkov | 2023/3/29 |
Generating benchmarks for factuality evaluation of language models | arXiv preprint arXiv:2307.06908 | Dor Muhlgay Ori Ram Inbal Magar Yoav Levine Nir Ratner | 2023/7/13 |
OM2Seq: Learning retrieval embeddings for optical genome mapping | bioRxiv | Yevgeni Nogin Danielle Sapir Tahir Detinis Zur Nir Weinberger Yonatan Belinkov | 2023/11/21 |