Adaptive AI Systems in Air Hockey: Reinforcement Learning Implementation with Self-Play for Optimization

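Authors

Flavio Andrés Arregoces Mercado, Cristian David Gonzáles Franco, Bella Valentina Mejía Gonzáles, Jorge Luis Sanchez Barreneche, Yovany Zhu Ye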

Keywords: Reinforcement Learning, Self-Play, Air Hockey, PPO, Game AI

Abstract

This paper presents an adaptive AI system for an educational Air Hockey game. The system combines Proximal Policy Optimization (PPO) with self-play to train agents capable of strategic adaptation, within a game designed to promote climate change awareness. The approach makes three main contributions: (1) a hybrid PPO and self-play architecture with behavioral correction systems that prevent suboptimal patterns; (2) a 21-dimensional observation system that includes normalized positions, velocities, and trajectory prediction; and (3) an adaptive self-play mechanism that trains agents against previous versions of themselves at varying difficulty levels. The system uses a multi-objective reward function and curriculum learning to guide agents toward competitive and efficient behaviors. The educational game "HOCKEY IS MELTING DOWN" uses polar ice melting as a metaphor to raise environmental awareness through interactive gameplay. In experiments, the final model achieves an 81% win rate and significantly outperforms random agents, heuristic AI, and simple DQN implementations. Specialized evaluation metrics and usability testing with human participants confirm the system’s effectiveness as both a competitive gaming AI and an engaging educational tool for climate change awareness.
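The abstract does not specify how opponents are selected during self-play, so the following is only a minimal sketch of one common way to train against previous versions at varying difficulty. The names `Snapshot` and `OpponentPool`, the `difficulty` parameter, and the linear weighting rule are all hypothetical and are not taken from the paper. The sketch also assumes that newer snapshots tend to be stronger than older ones.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Snapshot:
    """Hypothetical frozen copy of the policy, tagged with its training step."""
    step: int


@dataclass
class OpponentPool:
    """Hypothetical pool of past policy snapshots used as self-play opponents."""
    max_size: int = 10
    snapshots: List[Snapshot] = field(default_factory=list)

    def add(self, snapshot: Snapshot) -> None:
        # Append the new snapshot and evict the oldest one when the pool is full.
        self.snapshots.append(snapshot)
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample(self, difficulty: float, rng: random.Random) -> Snapshot:
        """Sample an opponent; `difficulty` in [0, 1] shifts weight to newer snapshots.

        Assumption: difficulty 0 favors the oldest snapshot, difficulty 1 the
        newest, and difficulty 0.5 samples uniformly.
        """
        if not self.snapshots:
            raise ValueError("opponent pool is empty")
        if not 0.0 <= difficulty <= 1.0:
            raise ValueError("difficulty must be in [0, 1]")
        n = len(self.snapshots)
        # Interpolate between a decreasing ramp (n, ..., 1) and an increasing
        # ramp (1, ..., n); every weight stays strictly positive.
        weights = [
            (1.0 - difficulty) * (n - i) + difficulty * (i + 1)
            for i in range(n)
        ]
        return rng.choices(self.snapshots, weights=weights, k=1)[0]


if __name__ == "__main__":
    pool = OpponentPool(max_size=3)
    for step in (1000, 2000, 3000, 4000):
        pool.add(Snapshot(step))
    opponent = pool.sample(difficulty=0.8, rng=random.Random(0))
    print("training against snapshot from step", opponent.step)
```

In a curriculum setting, `difficulty` could be raised gradually as the learning agent's win rate improves, although the schedule the authors actually used is not stated in the abstract.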

Author Biographies

Flavio Andrés Arregoces Mercado, Universidad del Norte

Systems Engineering student.

Cristian David Gonzáles Franco, Universidad del Norte

Systems Engineering student.

Bella Valentina Mejía Gonzáles, Universidad del Norte

Systems Engineering student.

Jorge Luis Sanchez Barreneche, Universidad del Norte

Systems Engineering student.

Yovany Zhu Ye, Universidad del Norte

Systems Engineering student.

Published

2025-11-21

How to Cite

Arregoces Mercado, F. A., Gonzáles Franco, C. D., Mejía Gonzáles, B. V., Sanchez Barreneche, J. L., & Zhu Ye, Y. (2025). Adaptive AI Systems in Air Hockey: Reinforcement Learning Implementation with Self-Play for Optimization. OnBoard Knowledge Journal, 1(02), 1–16. Retrieved from https://revistasescuelanaval.com/obk/article/view/117

Section

Articles