Markov Decision Process
A Markov Decision Process (MDP) is a discrete-time stochastic control process characterized by a set of states, a set of actions, and transition probability matrices that depend on the action chosen in a given state. MDPs are useful for studying a wide range of optimization problems whose solutions are found via dynamic programming and reinforcement learning. (Wikipedia: Markov decision process)
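To make the definition concrete, here is a minimal sketch of value iteration (a standard dynamic-programming solution method for MDPs) on a made-up two-state, two-action MDP. The transition probabilities, rewards, and discount factor below are illustrative assumptions, not taken from any of the references:

```python
import numpy as np

# P[a][s, s'] = probability of moving from state s to s' under action a.
# R[s, a] = immediate reward for taking action a in state s.
# All numbers here are arbitrary, for illustration only.
P = np.array([
    [[0.9, 0.1],    # action 0
     [0.1, 0.9]],
    [[0.2, 0.8],    # action 1
     [0.8, 0.2]],
])
R = np.array([
    [1.0, 0.0],     # rewards in state 0 for actions 0, 1
    [0.0, 2.0],     # rewards in state 1 for actions 0, 1
])
gamma = 0.9         # discount factor

def value_iteration(P, R, gamma, tol=1e-8):
    """Return the optimal value function and a greedy policy."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q[a, s] = R(s, a) + gamma * sum_{s'} P(s' | s, a) * V(s')
        Q = R.T + gamma * (P @ V)
        V_new = Q.max(axis=0)           # Bellman optimality backup
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

V, policy = value_iteration(P, R, gamma)
```

The returned `V` satisfies the Bellman optimality equation up to the tolerance, and `policy[s]` is an action achieving the maximum in state `s`.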
References
Bellman, R. E. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.
Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, 1994.
Site:
MDP Toolbox for Matlab - An excellent tutorial and Matlab toolbox for working with MDPs.