Dynamic choice of state abstraction in Q-learning

Tamassia, M., Zambetta, F., Raffe, W., Mueller, F. and Li, X. 2016, 'Dynamic choice of state abstraction in Q-learning', in Gal A. Kaminka, Maria Fox, Paolo Bouquet, Eyke Hüllermeier, Virginia Dignum, Frank Dignum and Frank van Harmelen (eds), Proceedings of the 22nd European Conference on Artificial Intelligence (ECAI 2016), Amsterdam, Netherlands, 29 August - 2 September 2016, pp. 46-54.


Document type: Conference Paper
Collection: Conference Papers

Title: Dynamic choice of state abstraction in Q-learning
Author(s): Tamassia, M.; Zambetta, F.; Raffe, W.; Mueller, F.; Li, X.
Year: 2016
Conference name: 22nd European Conference on Artificial Intelligence (ECAI 2016)
Series: Frontiers in Artificial Intelligence and Applications (FAIA), vol. 285
Conference location: Amsterdam, Netherlands
Conference dates: 29 August - 2 September 2016
Proceedings title: Proceedings of the 22nd European Conference on Artificial Intelligence (ECAI 2016)
Editor(s): Gal A. Kaminka; Maria Fox; Paolo Bouquet; Eyke Hüllermeier; Virginia Dignum; Frank Dignum; Frank van Harmelen
Publisher: IOS Press
Place of publication: Amsterdam, Netherlands
Start page: 46
End page: 54
Total pages: 9
Abstract: Q-learning associates the states and actions of a Markov Decision Process with expected future reward through online learning. In practice, however, when the state space is large and experience is still limited, the algorithm will not find a match between the current state and past experience unless some details describing states are ignored. On the other hand, reducing state information affects long-term performance, because decisions must be made on less informative inputs. We propose a variation of Q-learning that gradually enriches state descriptions once enough experience has accumulated. This is coupled with an ad hoc exploration strategy that aims to collect key information, allowing the algorithm to enrich state descriptions earlier. Experimental results obtained by applying our algorithm to the arcade game Pac-Man show that our approach significantly outperforms Q-learning during the learning process while not penalizing long-term performance.
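The idea of gradually enriching state descriptions can be illustrated with a minimal sketch. This is not the authors' algorithm from the paper: the corridor task, the pair-wise coarse abstraction, the visit-count promotion rule, and all constants are invented for illustration. Each state starts under a coarse description (cells grouped in pairs) and is promoted to its full, fine-grained description once it has accumulated enough visits, mirroring the abstract's "enrich after enough experience" principle.

```python
import random

# Toy task: a 10-cell corridor with reward +1 on reaching the last cell.
N = 10                # cells 0..9; cell 9 is the goal
ACTIONS = (-1, 1)     # step left / right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1
PROMOTE_AFTER = 20    # visits before a state switches to the fine description

q = {}                               # Q-table keyed by (abstract_state, action)
visits = {s: 0 for s in range(N)}    # per-state experience counter

def abstract(s):
    """Coarse key (pair index) until enough experience, then the exact cell."""
    return ("fine", s) if visits[s] >= PROMOTE_AFTER else ("coarse", s // 2)

def best_action(key):
    """Greedy action with random tie-breaking."""
    vals = {a: q.get((key, a), 0.0) for a in ACTIONS}
    top = max(vals.values())
    return random.choice([a for a, v in vals.items() if v == top])

random.seed(0)
for episode in range(500):
    s = 0
    for _ in range(50):
        visits[s] += 1
        key = abstract(s)
        a = random.choice(ACTIONS) if random.random() < EPS else best_action(key)
        s2 = min(max(s + a, 0), N - 1)
        r = 1.0 if s2 == N - 1 else 0.0
        # Zero bootstrap at the terminal cell; otherwise the usual Q-target,
        # evaluated under the successor's *current* abstraction.
        target = r if s2 == N - 1 else r + GAMMA * max(
            q.get((abstract(s2), b), 0.0) for b in ACTIONS)
        old = q.get((key, a), 0.0)
        q[(key, a)] = old + ALPHA * (target - old)
        s = s2
        if s == N - 1:
            break
```

Early on, many cells share one coarse Q-entry, so experience generalizes quickly; frequently visited cells are later promoted and learn their own fine-grained values, trading early learning speed for long-term precision.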
Subjects: Adaptive Agents and Intelligent Robotics; Virtual Reality and Related Simulation
DOI: 10.3233/978-1-61499-672-9-46
Copyright notice: © 2016 The Authors and IOS Press. Creative Commons Attribution Non-Commercial License
ISBN: 9781614996729