Puterman M.L. Markov Decision Processes: Discrete Stochastic Dynamic Programming

  • File format: djvu
  • Size: 8.55 MB
Publisher: John Wiley, 2005, 667 pp.
The past decade has seen a notable resurgence in both applied and theoretical research on Markov decision processes. Branching out from operations research roots of the 1950s, Markov decision process models have gained recognition in such diverse fields as ecology, economics, and communications engineering. These new applications have been accompanied by many theoretical advances. In response to the increased activity and the potential for further advances, I felt that there was a need for an up-to-date, unified, and rigorous treatment of theoretical, computational, and applied research on Markov decision process models. This book is my attempt to meet this need.
I have written this book with two primary objectives in mind: to provide a comprehensive reference for researchers, and to serve as a text in an advanced undergraduate or graduate level course in operations research, economics, or control engineering. Further, I hope it will serve as an accessible introduction to the subject for investigators in other disciplines. I expect that the material in this book will be of interest to management scientists, computer scientists, economists, applied mathematicians, control and communications engineers, statisticians, and mathematical ecologists. As a prerequisite, a reader should have some background in real analysis, linear algebra, probability, and linear programming; however, I have tried to keep the book self-contained by including relevant appendices. I hope that this book will inspire readers to delve deeper into this subject and to use these methods in research and application.
Markov decision processes, also referred to as stochastic dynamic programs or stochastic control problems, are models for sequential decision making when outcomes are uncertain. The Markov decision process model consists of decision epochs, states, actions, rewards, and transition probabilities. Choosing an action in a state generates a reward and determines the state at the next decision epoch through a transition probability function. Policies or strategies are prescriptions of which action to choose under any eventuality at every future decision epoch. Decision makers seek policies which are optimal in some sense.
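The model components described above (states, actions, rewards, transition probabilities, and a policy evaluated by backward induction over a finite horizon) can be sketched in a few lines. This is an illustrative example only: the two-state data in `r` and `P` are invented, not taken from the book.

```python
import numpy as np

# Hypothetical two-state, two-action MDP (data invented for illustration):
# r[s, a] is the reward for choosing action a in state s;
# P[a, s, s2] is the probability of moving from s to s2 under action a.
r = np.array([[5.0, 10.0],
              [-1.0, 1.0]])
P = np.array([
    [[0.5, 0.5],   # transitions under action 0
     [0.0, 1.0]],
    [[0.0, 1.0],   # transitions under action 1
     [0.4, 0.6]],
])

# A deterministic, stationary policy maps each state to an action.
policy = np.array([1, 0])

def finite_horizon_value(r, P, policy, horizon):
    """Expected total reward under a fixed policy, by backward induction."""
    n = r.shape[0]
    v = np.zeros(n)  # terminal values assumed zero
    for _ in range(horizon):
        v = np.array([r[s, policy[s]] + P[policy[s], s] @ v
                      for s in range(n)])
    return v
```

For instance, `finite_horizon_value(r, P, policy, 2)` evaluates this policy over two decision epochs; comparing such values across policies is exactly the optimization the book formalizes.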
This book covers several topics which have received little or no attention in other books on this subject. They include modified policy iteration, multichain models with average reward criterion, and sensitive optimality. Further I have tried to provide an in-depth discussion of algorithms and computational issues. The Bibliographic Remarks section of each chapter comments on relevant historical references in the extensive bibliography. I also have attempted to discuss recent research advances in areas such as countable-state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria. I include a table of symbols to help follow the extensive notation.
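As a taste of the algorithmic side the book emphasizes, here is a sketch of value iteration for the discounted criterion, one of the basic computational methods treated in the discounted-models chapter. The two-state data are invented for illustration; the stopping rule is the standard span-style bound guaranteeing an epsilon-optimal policy.

```python
import numpy as np

# Hypothetical discounted MDP (same illustrative shape as before):
# r[s, a] rewards, P[a, s, s2] transition probabilities, lam in [0, 1).
r = np.array([[5.0, 10.0],
              [-1.0, 1.0]])
P = np.array([
    [[0.5, 0.5], [0.0, 1.0]],   # action 0
    [[0.0, 1.0], [0.4, 0.6]],   # action 1
])
lam = 0.9  # discount factor

def value_iteration(r, P, lam, eps=1e-8):
    """Iterate v <- max_a [ r(.,a) + lam * P_a v ] to (near) convergence."""
    n_states = r.shape[0]
    v = np.zeros(n_states)
    while True:
        # q[s, a] = r[s, a] + lam * sum_s2 P[a, s, s2] * v[s2]
        q = r + lam * np.einsum('ast,t->sa', P, v)
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < eps * (1 - lam) / (2 * lam):
            return v_new, q.argmax(axis=1)  # values and a greedy policy
        v = v_new
```

Modified policy iteration, one of the topics singled out above, interleaves such value-update sweeps with partial policy evaluation steps.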
Model Formulation
Examples
Finite-Horizon Markov Decision Processes
Infinite Horizon Models: Foundations
Discounted Markov Decision Problems
The Expected Total-Reward Criterion
Average Reward and Related Criteria
The Average Reward Criterion: Multichain and Communicating Models
Sensitive Discount Optimality
Continuous-Time Models
A. Markov Chains
B. Semicontinuous Functions
C. Normed Linear Spaces
D. Linear Programming