Aprendiendo a jugar a fútbol con algoritmos de aprendizaje por refuerzo

Clar Lopera, Álvaro

dc.contributor	Moyà Alcover, Gabriel
dc.contributor	Buades Rubio, José María
dc.contributor.author	Clar Lopera, Álvaro
dc.date	2020
dc.date.accessioned	2022-02-10T13:03:06Z
dc.date.available	2022-02-10T13:03:06Z
dc.date.issued	2020-11-11
dc.identifier.uri	http://hdl.handle.net/11201/157479
dc.description.abstract	[spa] En el campo de los videojuegos se han realizado numerosos trabajos que hacen uso de algoritmos de Aprendizaje por Refuerzo (RL). Uno de ellos es el trabajo Google Research Football, en el que se creó un entorno para que los agentes puedan ser entrenados en un simulador de fútbol 3D. A partir de este entorno, se ha desarrollado un sistema de aprendizaje con un algoritmo de RL, junto a sistemas para poder reproducir los entrenamientos realizados y para poder comprobar el rendimiento de los agentes entrenados en este entorno. Para demostrar que el algoritmo y sistemas desarrollados permiten a los agentes obtener un buen rendimiento, se realiza un estudio en el que se entrenan múltiples agentes en 4 escenarios diferentes, con el objetivo adicional de comprobar si el uso de un tipo de recompensa mejora los resultados obtenidos. Los resultados obtenidos verifican un buen rendimiento de los agentes en cada escenario, aunque con dificultades en algunos de ellos, y confirman una mejoría debida al uso de este tipo de recompensa adicional. Finalmente, se extraen conclusiones, en las que se establecen futuras líneas de investigación.	ca
dc.description.abstract	[eng] Many works in the field of video games have used Reinforcement Learning (RL) algorithms. One of them is Google Research Football, which provides an environment for training agents in a 3D football simulator. Building on this environment, a learning system based on an RL algorithm has been developed, together with systems for reproducing the training runs and for evaluating the performance of the agents trained in this environment. To demonstrate that the algorithm and systems developed allow agents to perform well, a study is carried out in which multiple agents are trained in 4 different scenarios, with the additional objective of checking whether the use of a certain type of reward improves the results. The results show good agent performance in each scenario, although with difficulties in some of them, and confirm an improvement due to the use of this additional type of reward. Finally, conclusions are drawn and future lines of research are established.	ca
dc.format	application/pdf
dc.language.iso	spa	ca
dc.publisher	Universitat de les Illes Balears
dc.rights	all rights reserved
dc.rights	info:eu-repo/semantics/openAccess
dc.subject	004 - Informàtica	ca
dc.subject	79 - Diversions. Espectacles. Cinema. Teatre. Dansa. Jocs. Esports	ca
dc.subject.other	learning	ca
dc.subject.other	reinforcement	ca
dc.subject.other	reward	ca
dc.subject.other	policy	ca
dc.subject.other	Q-learning	ca
dc.subject.other	actor	ca
dc.subject.other	critic	ca
dc.subject.other	agent	ca
dc.subject.other	advantage	ca
dc.subject.other	environment	ca
dc.subject.other	step	ca
dc.subject.other	episode	ca
dc.title	Aprendiendo a jugar a fútbol con algoritmos de aprendizaje por refuerzo	ca
dc.type	info:eu-repo/semantics/masterThesis	ca
dc.type	info:eu-repo/semantics/publishedVersion
dc.date.updated	2022-02-01T06:49:37Z