Hannah Rose Kirk
University of Oxford
H-index: 10
Europe-United Kingdom
Top articles of Hannah Rose Kirk
Introducing v0.5 of the AI Safety Benchmark from MLCommons
arXiv preprint arXiv:2404.12241
2024/4/18
Cody Coleman
H-Index: 11
Surgan Jandial
H-Index: 3
Foutse Khomh
H-Index: 33
Hannah Rose Kirk
H-Index: 1
Michael Kuchnik
H-Index: 2
Chris Lengerich
H-Index: 3
Bo Li
H-Index: 27
Yifan Mai
H-Index: 5
Priyanka Mary Mammen
H-Index: 3
Shafee Mohammed
H-Index: 1
Alicia Parrish
H-Index: 2
Eleonora Presani
H-Index: 15
Paul Röttger
H-Index: 1
Elizabeth Anne Watkins
H-Index: 2
Poonam Yadav
H-Index: 6
Yi Zeng
H-Index: 5
Wenhui Zhang
H-Index: 10
Jiacheng Zhu
H-Index: 3
Percy Liang
H-Index: 55
Joaquin Vanschoren
H-Index: 27
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
arXiv preprint arXiv:2402.16786
2024/2/26
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
arXiv preprint arXiv:2403.12075
2024/2/14
DataPerf: Benchmarks for data-centric AI development
Advances in Neural Information Processing Systems
2024/2/13
Colby Banbury
H-Index: 3
Xiaozhe Yao
H-Index: 1
Alicia Parrish
H-Index: 2
Hannah Rose Kirk
H-Index: 1
Charvi Rastogi
H-Index: 2
David Kanter
H-Index: 14
Bilge Acun
H-Index: 8
Max Bartolo
H-Index: 3
Emmett Goodman
H-Index: 14
Oana Inel
H-Index: 9
Tzu-Sheng Kuo
H-Index: 4
Joaquin Vanschoren
H-Index: 27
Serena Yeung
H-Index: 19
Ce Zhang
H-Index: 3
Cody Coleman
H-Index: 11
Andrew Ng
H-Index: 105
Vijay Janapa Reddi
H-Index: 29
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution
Advances in Neural Information Processing Systems
2024/2/13
Aleksandar Shtedritski
H-Index: 0
Hannah Rose Kirk
H-Index: 1
The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models
arXiv preprint arXiv:2404.16019
2024/4/24
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
Andrew Bean
H-Index: 2
Max Bartolo
H-Index: 3
He He
H-Index: 4
The benefits, risks and bounds of personalizing the alignment of large language models to individuals
2024/4/23
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models
arXiv preprint arXiv:2311.08370
2023/11/14
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
The past, present and better future of feedback learning in large language models for subjective human preferences and values
arXiv preprint arXiv:2310.07629
2023/10/11
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models
arXiv preprint arXiv:2310.02457
2023/10/3
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
Casteist but not racist? Quantifying disparities in large language model bias between India and the West
arXiv preprint arXiv:2309.08573
2023/9/15
Hannah Rose Kirk
H-Index: 1
XSTest: A test suite for identifying exaggerated safety behaviours in large language models
arXiv preprint arXiv:2308.01263
2023/8/2
Paul Röttger
H-Index: 1
Hannah Rose Kirk
H-Index: 1
Giuseppe Attanasio
H-Index: 2
Federico Bianchi
H-Index: 4
Dirk Hovy
H-Index: 26
DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures
arXiv preprint arXiv:2307.16811
2023/7/31
Auditing large language models: a three-layered approach
AI and Ethics
2023/5/30
Balancing the picture: Debiasing vision-language datasets with synthetic contrast sets
arXiv preprint arXiv:2305.15407
2023/5/24
Assessing language model deployment with risk cards
arXiv preprint arXiv:2303.18190
2023/3/31
Hannah Rose Kirk
H-Index: 1
Vidhisha Balachandran
H-Index: 2
Sachin Kumar
H-Index: 9
Yulia Tsvetkov
H-Index: 22
SemEval-2023 task 10: explainable detection of online sexism
2023/3/7
Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning
arXiv preprint arXiv:2209.10193
2022/9/21
Hannah Rose Kirk
H-Index: 1
Tracking abuse on Twitter against football players in the 2021–22 Premier League Season
Available at SSRN 4403913
2022/8/2
Hannah Rose Kirk
H-Index: 1
Paul Röttger
H-Index: 1
Proceedings of the First Workshop on Dynamic Adversarial Data Collection
2022/7