Jiaqi Yang at Tsinghua University

University	Tsinghua University
Position	Institute for Interdisciplinary Information Sciences
Citations(all)	263
Citations(since 2020)	263
Cited By	9
hIndex(all)	9
hIndex(since 2020)	9
i10Index(all)	9
i10Index(since 2020)	9

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased

arXiv preprint arXiv:2302.01605

2023/2/3

Chao Yu

H-Index: 21

Weilin Liu

H-Index: 11

Hao Tang

H-Index: 6

Jiaqi Yang

H-Index: 1

Yu Wang

H-Index: 14

Yi Wu

H-Index: 10

Phasic Self-Imitative Reduction for Sparse-Reward Goal-Conditioned Reinforcement Learning

2022/6/28

Yunfei Li

H-Index: 9

Tian Gao

H-Index: 3

Jiaqi Yang

H-Index: 1

Yi Wu

H-Index: 10

Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning

ICML 2022

2022/6/15

Wei Fu

H-Index: 7

Chao Yu

H-Index: 21

Jiaqi Yang

H-Index: 1

Yi Wu

H-Index: 10

Nearly Minimax Algorithms for Linear Bandits with Shared Representation

arXiv preprint arXiv:2203.15664

2022/3/29

Jiaqi Yang

H-Index: 1

Qi Lei

H-Index: 2

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Advances in Neural Information Processing Systems

2021/12/6

Kaixuan Huang

H-Index: 2

Qi Lei

H-Index: 2

Jiaqi Yang

H-Index: 1

Improved Variance-Aware Confidence Sets for Linear Bandits and Linear Mixture MDP

Advances in Neural Information Processing Systems

2021/12/6

Zihan Zhang

H-Index: 4

Jiaqi Yang

H-Index: 1

Going Beyond Linear RL: Sample Efficient Neural Function Approximation

Advances in Neural Information Processing Systems

2021/12/6

Kaixuan Huang

H-Index: 2

Qi Lei

H-Index: 2

Jiaqi Yang

H-Index: 1

Provable Model-Based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature

Advances in Neural Information Processing Systems

2021/12/6

Kefan Dong

H-Index: 5

Jiaqi Yang

H-Index: 1

Tengyu Ma

H-Index: 2

Linear Bandits with Limited Adaptivity and Learning Distributional Optimal Design

2021/6/15

Jiaqi Yang

H-Index: 1

Yuan Zhou

H-Index: 16

Fully Gap-Dependent Bounds for Multinomial Logit Bandit

2021/3/18

Jiaqi Yang

H-Index: 1

Impact of Representation Learning in Linear Bandits

2020/10/2

Jiaqi Yang

H-Index: 1

Wei Hu

H-Index: 1

Simon Shaolei Du

H-Index: 28

Co-Authors

Sham M Kakade

Jason D. Lee

Tengyu Ma

Simon Shaolei Du

Yuan Zhou

Yi Wu