Hannah Rose Kirk

University of Oxford

H-index: 10

Europe-United Kingdom

About Hannah Rose Kirk

Hannah Rose Kirk is a distinguished researcher at the University of Oxford, with an h-index of 10 both overall and since 2020. She specializes in large language models, NLP, ethics in AI, alignment, and AI safety.

Her recent articles reflect a diverse array of research interests and contributions to the field:

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation

DataPerf: Benchmarks for data-centric AI development

Visogender: A dataset for benchmarking gender bias in image-text pronoun resolution

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

The benefits, risks and bounds of personalizing the alignment of large language models to individuals

SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

Hannah Rose Kirk Information

University: University of Oxford
Position: ___
Citations (all): 497
Citations (since 2020): 497
Cited by: 1
h-index (all): 10
h-index (since 2020): 10
i10-index (all): 11
i10-index (since 2020): 11

Email · University Profile Page · Google Scholar

Hannah Rose Kirk Skills & Research Interests

Large language models

NLP

Ethics in AI

Alignment

AI Safety

Top articles of Hannah Rose Kirk

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models

arXiv preprint arXiv:2402.16786

2024/2/26

Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation

arXiv preprint arXiv:2403.12075

2024/2/14

Visogender: A dataset for benchmarking gender bias in image-text pronoun resolution

Advances in Neural Information Processing Systems

2024/2/13

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

arXiv preprint arXiv:2404.16019

2024/4/24

The benefits, risks and bounds of personalizing the alignment of large language models to individuals

2024/4/23

SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

arXiv preprint arXiv:2311.08370

2023/11/14

The past, present and better future of feedback learning in large language models for subjective human preferences and values

arXiv preprint arXiv:2310.07629

2023/10/11

The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models

arXiv preprint arXiv:2310.02457

2023/10/3

Casteist but not racist? Quantifying disparities in large language model bias between India and the West

arXiv preprint arXiv:2309.08573

2023/9/15

XSTest: A test suite for identifying exaggerated safety behaviours in large language models

arXiv preprint arXiv:2308.01263

2023/8/2

DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures

arXiv preprint arXiv:2307.16811

2023/7/31

Auditing large language models: a three-layered approach

AI and Ethics

2023/5/30

Balancing the picture: Debiasing vision-language datasets with synthetic contrast sets

arXiv preprint arXiv:2305.15407

2023/5/24

Assessing language model deployment with risk cards

arXiv preprint arXiv:2303.18190

2023/3/31

SemEval-2023 Task 10: Explainable detection of online sexism

2023/3/7

Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning

arXiv preprint arXiv:2209.10193

2022/9/21

Tracking abuse on Twitter against football players in the 2021–22 Premier League Season

Available at SSRN 4403913

2022/8/2

Proceedings of the First Workshop on Dynamic Adversarial Data Collection

2022/7

Co-Authors

Aleksandar Shtedritski

Paul Röttger