Adaptive AI Systems in Air Hockey: Reinforcement Learning Implementation with Self-Play for Optimization
Keywords:
Reinforcement Learning, Self-Play, Air Hockey, PPO, Game AI
Abstract
This paper presents an adaptive AI system for educational Air Hockey that combines Proximal Policy Optimization (PPO) with self-play mechanisms to create intelligent agents capable of strategic adaptation while promoting climate change awareness. The proposed approach integrates three main contributions: a hybrid PPO-Self-Play architecture with behavioral correction systems to prevent suboptimal patterns, a 21-dimensional observation system that includes normalized positions, velocities, and trajectory prediction, and an adaptive self-play mechanism that trains agents against previous versions with varying difficulty levels. The system implements a multi-objective reward function and curriculum learning to guide agents toward competitive and efficient behaviors. The educational game "HOCKEY IS MELTING DOWN" uses polar ice melting as a metaphor to raise environmental awareness through interactive gameplay. Experimental results demonstrate substantial improvements over baseline methods, with the final model achieving an 81% win rate and significantly outperforming random agents, heuristic AI, and simple DQN implementations. Specialized evaluation metrics and usability testing with human participants confirm the system’s effectiveness as both a competitive gaming AI and an engaging educational tool for climate change awareness.
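The abstract does not say how earlier versions of the agent are stored or chosen as opponents. As a rough illustration only, self-play against previous versions can be sketched as a bounded pool of policy snapshots from which the next opponent is drawn. The names `Snapshot` and `OpponentPool`, the pool size, and the `latest_prob` parameter are hypothetical and do not come from the paper. Treating older snapshots as the easier opponents is likewise an assumption.

```python
import random
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Snapshot:
    """A frozen copy of the policy at a given training step (hypothetical)."""
    step: int
    params: Any  # whatever object holds the policy weights


@dataclass
class OpponentPool:
    """Bounded pool of past snapshots used as self-play opponents (assumed design)."""
    max_size: int = 10
    latest_prob: float = 0.5  # assumed chance of facing the newest snapshot
    snapshots: List[Snapshot] = field(default_factory=list)

    def add(self, step: int, params: Any) -> None:
        """Store a new snapshot and evict the oldest one if the pool is full."""
        self.snapshots.append(Snapshot(step, params))
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample(self, rng: random.Random) -> Snapshot:
        """Return the newest snapshot with probability latest_prob,
        otherwise a uniformly chosen older one."""
        if not self.snapshots:
            raise ValueError("opponent pool is empty")
        if len(self.snapshots) == 1 or rng.random() < self.latest_prob:
            return self.snapshots[-1]
        return rng.choice(self.snapshots[:-1])
```

In a training loop, `add` would be called every fixed number of PPO updates and `sample` at the start of each episode. Raising `latest_prob` over time is one possible way to increase difficulty gradually, though the paper may schedule it differently.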
License
Copyright (c) 2025 Flavio Andrés Arregoces Mercado, Cristian David Gonzáles Franco, Bella Valentina Mejía Gonzáles, Jorge Luis Sanchez Barrenche, Yovany Zhu Ye

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
The authors retain copyright and grant the journal the right to publish the work under a Creative Commons Attribution License, which allows third parties to use the published content as long as they credit the author(s) and the publication in OnBoard Knowledge Journal.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


