Yunhao Tang

Columbia University in the City of New York

H-index: 15

North America-United States

About Yunhao Tang

Yunhao Tang is a researcher at Columbia University in the City of New York who specializes in reinforcement learning. He has an h-index of 15, both overall and since 2020.

His recent articles reflect a diverse array of research interests and contributions to the field:

Generalized Preference Optimization: A Unified Approach to Offline Alignment

Learning Uncertainty-Aware Temporally-Extended Actions

Off-policy Distributional Q(λ): Distributional RL without Importance Sampling

Human Alignment of Large Language Models through Online Preference Optimisation

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

A Distributional Analogue to the Successor Representation

Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model

The statistical benefits of quantile temporal-difference learning for value estimation

Yunhao Tang Information

University	Columbia University in the City of New York
Position	PhD student
Citations(all)	954
Citations(since 2020)	946
Cited By	222
hIndex(all)	15
hIndex(since 2020)	15
i10Index(all)	21
i10Index(since 2020)	21

Yunhao Tang Skills & Research Interests

Reinforcement Learning

Top articles of Yunhao Tang

Title	Journal	Author(s)	Publication Date
Generalized Preference Optimization: A Unified Approach to Offline Alignment	arXiv preprint arXiv:2402.05749	Yunhao Tang Zhaohan Daniel Guo Zeyu Zheng Daniele Calandriello Rémi Munos ...	2024/2/8
Learning Uncertainty-Aware Temporally-Extended Actions	Proceedings of the AAAI Conference on Artificial Intelligence	Joongkyu Lee Seung Joon Park Yunhao Tang Min-hwan Oh	2024/3/24
Off-policy Distributional Q(λ): Distributional RL without Importance Sampling	arXiv preprint arXiv:2402.05766	Yunhao Tang Mark Rowland Rémi Munos Bernardo Ávila Pires Will Dabney	2024/2/8
Human Alignment of Large Language Models through Online Preference Optimisation	arXiv preprint arXiv:2403.08635	Daniele Calandriello Daniel Guo Remi Munos Mark Rowland Yunhao Tang ...	2024/3/13
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	arXiv preprint arXiv:2403.05530	Machel Reid Nikolay Savinov Denis Teplyashin Dmitry Lepikhin Timothy Lillicrap ...	2024/3/8
A Distributional Analogue to the Successor Representation	arXiv preprint arXiv:2402.08530	Harley Wiltzer* Jesse Farebrother* Arthur Gretton Yunhao Tang André Barreto ...	2024/2/13
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model	arXiv preprint arXiv:2402.07598	Mark Rowland Li Kevin Wenliang Rémi Munos Clare Lyle Yunhao Tang ...	2024/2/12
The statistical benefits of quantile temporal-difference learning for value estimation		Mark Rowland Yunhao Tang Clare Lyle Rémi Munos Marc G Bellemare ...	2023/7/3
Fast rates for maximum entropy exploration		Daniil Tiapkin Denis Belomestny Daniele Calandriello Eric Moulines Remi Munos ...	2023/3/14
Towards a better understanding of representation dynamics under TD-learning		Yunhao Tang Rémi Munos	2023/7/3
The edge of orthogonality: a simple view of what makes BYOL tick		Pierre Harvey Richemond Allison Tam Yunhao Tang Florian Strub Bilal Piot ...	2023/7/3
DoMo-AC: doubly multi-step off-policy actor-critic algorithm		Yunhao Tang Tadashi Kozuno Mark Rowland Anna Harutyunyan Rémi Munos ...	2023/7/3
Gemini: a family of highly capable multimodal models	arXiv preprint arXiv:2312.11805	Gemini Team Rohan Anil Sebastian Borgeaud Yonghui Wu Jean-Baptiste Alayrac ...	2023/12/19
Understanding self-predictive learning for reinforcement learning	International Conference on Machine Learning (ICML23)	Yunhao Tang Zhaohan Daniel Guo Pierre Harvey Richemond Bernardo Ávila Pires Yash Chandak ...	2023/1/6
VA-learning as a more efficient alternative to Q-learning		Yunhao Tang Rémi Munos Mark Rowland Michal Valko	2023/7/3
Nash learning from human feedback	arXiv preprint arXiv:2312.00886	Rémi Munos Michal Valko Daniele Calandriello Mohammad Gheshlaghi Azar Mark Rowland ...	2023/12/1
An analysis of quantile temporal-difference learning	arXiv preprint arXiv:2301.04462	Mark Rowland Rémi Munos Mohammad Gheshlaghi Azar Yunhao Tang Georg Ostrovski ...	2023/1/11
Regularization and variance-weighted regression achieves minimax optimality in linear MDPs: theory and practice		Toshinori Kitamura Tadashi Kozuno Yunhao Tang Nino Vieillard Michal Valko ...	2023/7/3
Quantile credit assignment		Thomas Mesnard Wenqi Chen Alaa Saade Yunhao Tang Mark Rowland ...	2023/7/3
Representations and exploration for deep reinforcement learning using singular value decomposition		Yash Chandak Shantanu Thakoor Zhaohan Daniel Guo Yunhao Tang Remi Munos ...	2023/7/3