2026

  1. RPRA: Predicting an LLM-Judge for Efficient but Performant Inference Dylan R. Ashley, Gaël Le Lan, Changsheng Zhao, Naina Dhingra, Zhipeng Cai, Ernie Chang, Mingchen Zhuge, Yangyang Shi, Vikas Chandra, and Jürgen Schmidhuber Under review and previously presented at the ICLR 2026 Workshop on Multi-Agent Learning and Its Opportunities in the Era of Generative AI and the ICLR 2026 Workshop on Agentic AI in the Wild: From Hallucinations to Reliable Autonomy [Abstract] [arXiv] [Poster] [BibTeX]
  2. Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization Yanning Dai*, Yuhui Wang*, Dylan R. Ashley, and Jürgen Schmidhuber Published in the Proceedings of the 14th International Conference on Learning Representations [Abstract] [PDF] [BibTeX]
  3. RACAS: Controlling Diverse Robots With a Single Agentic System Dylan R. Ashley*, Jan Przepióra*, Yimeng Chen*, Ali Abualsaud, Nurzhan Yesmagambet, Shinkyu Park, Eric Feron, and Jürgen Schmidhuber Under review [Abstract] [arXiv] [BibTeX]

2025

  1. Towards an Extremely Robust Baby Robot With Rich Interaction Ability for Advanced Machine Learning Algorithms Mohannad Alhakami*, Dylan R. Ashley*, Joel Dunham*, Yanning Dai, Francesco Faccio, Eric Feron, and Jürgen Schmidhuber Published in the Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (Oral Presentation) and previously presented as a late-breaking result at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems [Abstract] [arXiv] [Code] [Poster] [BibTeX]
  2. Agent-as-a-Judge: Evaluate Agents with Agents Mingchen Zhuge, Changsheng Zhao, Dylan R. Ashley, Wenyi Wang, Dmitrii Khizbullin, Yunyang Xiong, Zechun Liu, Ernie Chang, Raghuraman Krishnamoorthi, Yuandong Tian, Yangyang Shi, Vikas Chandra, and Jürgen Schmidhuber Published in the Proceedings of the 42nd International Conference on Machine Learning [Abstract] [arXiv] [BibTeX]
  3. Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning Yuhui Wang, Qingyuan Wu, Dylan R. Ashley, Francesco Faccio, Weida Li, Chao Huang, and Jürgen Schmidhuber Published in the Proceedings of the 42nd International Conference on Machine Learning and previously presented at the 17th European Workshop on Reinforcement Learning [Abstract] [arXiv] [Poster] [BibTeX]
  4. On the Distillation of Stories for Transferring Narrative Arcs in Collections of Independent Media Dylan R. Ashley*, Vincent Herrmann*, Zachary Friggstad, and Jürgen Schmidhuber Published in IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8) and previously presented at the NeurIPS 2023 Workshop on Machine Learning for Creativity and Design and at the NeurIPS 2022 Workshop on Information-Theoretic Principles in Cognitive Systems [Abstract] [PDF] [Code] [Poster] [BibTeX]
  5. Upside Down Reinforcement Learning with Policy Generators Jacopo Di Ventura, Dylan R. Ashley, Vincent Herrmann, Francesco Faccio, and Jürgen Schmidhuber Presented at the 6th Multidisciplinary Conference on Reinforcement Learning and Decision Making and the 18th European Workshop on Reinforcement Learning [Abstract] [arXiv] [Code] [BibTeX]
  6. Mindstorms in Natural Language-Based Societies of Mind Mingchen Zhuge*, Haozhe Liu*, Francesco Faccio*, Dylan R. Ashley*, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, and Jürgen Schmidhuber Published in Computational Visual Media (IF 17.3) as the journal version of the early 2023 paper [PDF] [BibTeX]
  7. On the Convergence and Stability of Upside-Down Reinforcement Learning, Goal-Conditioned Supervised Learning, and Online Decision Transformers Miroslav Štrupl, Oleg Szehr, Francesco Faccio, Dylan R. Ashley, Rupesh Kumar Srivastava, and Jürgen Schmidhuber Under review [Abstract] [arXiv] [Code] [BibTeX]

2024

  1. How to Correctly do Semantic Backpropagation on Language-based Agentic Systems Wenyi Wang*, Hisham A. Alyahya*, Dylan R. Ashley, Oleg Serikov, Dmitrii Khizbullin, Francesco Faccio, and Jürgen Schmidhuber Under review [Abstract] [arXiv] [Code] [BibTeX]
  2. Automatic Album Sequencing Vincent Herrmann*, Dylan R. Ashley*, and Jürgen Schmidhuber Presented as a late-breaking demo at the 2024 Conference of the International Society for Music Information Retrieval [Abstract] [arXiv] [Code] [Poster] [BibTeX]

2023

  1. Mindstorms in Natural Language-Based Societies of Mind Mingchen Zhuge*, Haozhe Liu*, Francesco Faccio*, Dylan R. Ashley*, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, and Jürgen Schmidhuber Presented at the NeurIPS 2023 Workshop on Robustness of Zero/Few-Shot Learning in Foundation Models (Best-Paper Award) [Abstract] [arXiv] [Poster] [Slides] [BibTeX]
  2. The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute Aleksandar Stanić*, Dylan R. Ashley, Oleg Serikov, Louis Kirsch, Francesco Faccio, Jürgen Schmidhuber, Thomas Hofmann, and Imanol Schlag* Preprint on arXiv [Abstract] [arXiv] [Code] [BibTeX]

2022

  1. Upside-Down Reinforcement Learning Can Diverge in Stochastic Environments With Episodic Resets Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Jürgen Schmidhuber, and Rupesh Kumar Srivastava Presented at the 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making and the 15th European Workshop on Reinforcement Learning [Abstract] [arXiv] [Code] [Poster] [BibTeX]
  2. Learning Relative Return Policies With Upside-Down Reinforcement Learning Dylan R. Ashley, Kai Arulkumaran, Jürgen Schmidhuber, and Rupesh Kumar Srivastava Presented at the 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making [Abstract] [arXiv] [Poster] [BibTeX]
  3. All You Need Is Supervised Learning: From Imitation Learning to Meta-RL With Upside Down RL Kai Arulkumaran, Dylan R. Ashley, Jürgen Schmidhuber, and Rupesh Kumar Srivastava Presented at the 5th Multidisciplinary Conference on Reinforcement Learning and Decision Making [Abstract] [arXiv] [Code] [Poster] [BibTeX]
  4. Reward-Weighted Regression Converges to a Global Optimum Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Rupesh Kumar Srivastava, and Jürgen Schmidhuber Published in the Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence [Abstract] [arXiv] [Code] [Poster] [BibTeX]

2021

  1. Automatic Embedding of Stories Into Collections of Independent Media Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Kory W. Mathewson, and Jürgen Schmidhuber Preprint on arXiv [Abstract] [arXiv] [Code] [BibTeX]
  2. Back to Square One: Superhuman Performance in Chutes and Ladders Through Deep Neural Networks and Tree Search Dylan R. Ashley*, Anssi Kanervisto*, and Brendan Bennett* Published in the Proceedings of the 2021 Conference of the ACH Special Interest Group on Harry Q. Bovik [Abstract] [arXiv] [PDF] [Code] [BibTeX]
  3. Does the Adam Optimizer Exacerbate Catastrophic Forgetting? Dylan R. Ashley, Sina Ghiassian, and Richard S. Sutton Preprint on arXiv [Abstract] [arXiv] [Code] [BibTeX]

2020

  1. Understanding Forgetting in Artificial Neural Networks Dylan R. Ashley Master’s thesis (University of Alberta) [Abstract] [PDF] [Code] [Slides] [BibTeX]
  2. Universal Successor Features for Transfer Reinforcement Learning Chen Ma, Dylan R. Ashley, Junfeng Wen, and Yoshua Bengio Preprint on arXiv [Abstract] [arXiv] [BibTeX]

2019

  1. Learning to Select Mates in Evolving Non-playable Characters Dylan R. Ashley*, Valliappa Chockalingam*, Braedy Kuzma*, and Vadim Bulitko Published in the Proceedings of the 2019 IEEE Conference on Games (Oral Presentation) [Abstract] [PDF] [Slides] [BibTeX]
  2. Learning to Select Mates in Artificial Life Dylan R. Ashley*, Valliappa Chockalingam*, Braedy Kuzma*, and Vadim Bulitko Published in the Proceedings of the Genetic and Evolutionary Computation Conference Companion [Abstract] [PDF] [Code] [Poster] [BibTeX]

2018

  1. Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return Craig Sherstan, Dylan R. Ashley*, Brendan Bennett*, Kenny Young, Adam White, Martha White, and Richard S. Sutton Published in the Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (Oral Presentation) [Abstract] [PDF] [SUP] [Code] [Poster] [Slides] [BibTeX]
  2. The Alberta Workloads for the SPEC CPU 2017 Benchmark Suite José Nelson Amaral, Edson Borin, Dylan R. Ashley, Caian Benedicto, Elliot Colp, Joao Henrique Stange Hoffmam, Marcus Karpoff, Erick Ochoa, Morgan Redshaw, and Raphael Ernani Rodrigues Published in the Proceedings of the 2018 IEEE International Symposium on Performance Analysis of Systems and Software [Abstract] [PDF] [Code] [BibTeX]
  3. Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, and Richard S. Sutton Preprint on arXiv [Abstract] [arXiv] [BibTeX]