Matthijs T. J. Spaan
Technische Universiteit Delft
H-index: 35
Europe-Netherlands
Top articles of Matthijs T. J. Spaan
Title | Journal | Author(s) | Publication Date |
---|---|---|---|
When Do Off-Policy and On-Policy Policy Gradient Methods Align? | arXiv preprint arXiv:2402.12034 | Davide Mambelli Stephan Bongers Onno Zoeter Matthijs TJ Spaan Frans A Oliehoek | 2024/2/19 |
E-MCTS: Deep Exploration in Model-Based Reinforcement Learning by Planning with Epistemic Uncertainty | Scalable Analysis of Probabilistic Models and Programs | Matthijs Spaan | 2024/2 |
Bayesian Ensembles for Exploration in Deep Q-Learning | | Pascal Van der Vaart Neil Yorke-Smith Matthijs TJ Spaan | 2024/5/6 |
Scalable safe policy improvement via Monte Carlo tree search | | Alberto Castellini Federico Bianchi Edoardo Zorzi Thiago D Simao Alessandro Farinelli | 2023/7/3 |
CEM: Constrained entropy maximization for task-agnostic safe exploration | Proceedings of the AAAI Conference on Artificial Intelligence | Qisong Yang Matthijs TJ Spaan | 2023/6/26 |
Diverse Projection Ensembles for Distributional Reinforcement Learning | arXiv preprint arXiv:2306.07124 | Moritz A Zanger Wendelin Böhmer Matthijs TJ Spaan | 2023/6/12 |
Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in Reinforcement Learning | | Miguel Suau Matthijs TJ Spaan Frans A Oliehoek | 2023/10/13 |
The Role of Diverse Replay for Generalisation in Reinforcement Learning | arXiv preprint arXiv:2306.05727 | Max Weltevrede Matthijs TJ Spaan Wendelin Böhmer | 2023/6/9 |
Reinforcement Learning by Guided Safe Exploration | ECAI 2023 | Qisong Yang Thiago D Simão Nils Jansen Simon H Tindemans Matthijs TJ Spaan | 2023/7/26 |
Safety-constrained reinforcement learning with a distributional safety critic | Machine Learning | Qisong Yang Thiago D Simão Simon H Tindemans Matthijs TJ Spaan | 2023/3 |
Bayesian Deep Q-Learning via Sequential Monte Carlo | | Pascal Van der Vaart Matthijs TJ Spaan Neil Yorke-Smith | 2023/7/20 |
Refined Risk Management in Safe Reinforcement Learning with a Distributional Safety Critic | | Qisong Yang Thiago D Simão Simon H Tindemans Matthijs TJ Spaan | 2022 |
Influence-augmented local simulators: A scalable solution for fast deep RL in large networked systems | ICML 2022 | M Suau J He MTJ Spaan FA Oliehoek | 2022 |
Distributed influence-augmented local simulators for parallel MARL in large networked systems | NeurIPS 2022 | Miguel Suau Jinke He Mustafa Mert Çelikok Matthijs TJ Spaan Frans A Oliehoek | 2022/7/1 |
Training and transferring safe policies in reinforcement learning | | Qisong Yang T Simão Nils Jansen S Tindemans M Spaan | 2022 |
Speeding up deep reinforcement learning through influence-augmented local simulators | | Miguel Suau Jinke He Matthijs TJ Spaan Frans A Oliehoek | 2022/5/9 |
E-MCTS: Deep Exploration in Model-Based Reinforcement Learning by Planning with Epistemic Uncertainty | arXiv preprint arXiv:2210.13455 | Yaniv Oren Matthijs TJ Spaan Wendelin Böhmer | 2022/10/21 |
Back to the Future: Solving Hidden Parameter MDPs with Hindsight | | Canmanie Ponnambalam Danial Kamran Thiago D Simão Frans A Oliehoek Matthijs TJ Spaan | 2022 |
Large-scale collaborative vehicle routing | Annals of Operations Research | Johan Los Frederik Schulte Margaretha Gansterer Richard F Hartl Matthijs TJ Spaan | 2022/4/8 |
A modern perspective on safe automated driving for different traffic dynamics using constrained reinforcement learning | | Danial Kamran TD Simão Q Yang CT Ponnambalam Johannes Fischer | 2022/10 |