2.II.29I
Consider a stochastic controllable dynamical system with action-space $A$ and countable state-space $S$. Thus $x, y \in S$, $a \in A$, and $p_{xy}(a)$ denotes the transition probability from $x$ to $y$ when taking action $a$. Suppose that a cost $c(x, a)$ is incurred each time that action $a$ is taken in state $x$, and that this cost is uniformly bounded. Write down the dynamic optimality equation for the problem of minimizing the expected long-run average cost.
State in terms of this equation a general result, which can be used to identify an optimal control and the minimal long-run average cost.
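[For reference, the equation and result being asked for are typically stated as follows in the notation above; this is a standard formulation, supplied here as an editorial aid rather than quoted from the original paper.]

```latex
% Average-cost optimality equation: find a constant \lambda and a
% bounded relative cost function \theta such that, for all x \in S,
\[
  \lambda + \theta(x)
  \;=\;
  \min_{a \in A} \Bigl\{\, c(x,a) + \sum_{y \in S} p_{xy}(a)\,\theta(y) \Bigr\}.
\]
% Verification result: if such a pair (\lambda, \theta) exists, then
% \lambda is the minimal expected long-run average cost, and any
% stationary policy that in each state x chooses an action attaining
% the minimum above is optimal.
```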
A particle moves randomly on the integers, taking steps of size 1. Suppose we can choose at each step a control parameter $u \in [p, 1-p]$, where $p \in (0, 1/2)$ is fixed, which has the effect that the particle moves in the positive direction with probability $u$ and in the negative direction with probability $1-u$. It is desired to maximize the long-run proportion of time spent by the particle at 0. Show that there is a solution to the optimality equation for this example in which the relative cost function takes the form $\theta(x) = \nu|x|$, for some constant $\nu$.
Determine an optimal control and show that the maximal long-run proportion of time spent at 0 is given by
\[
  \frac{1-2p}{2(1-p)}\,.
\]
You may assume that it is valid to use an unbounded function in the optimality equation in this example.
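[As a numerical sanity check, not part of the question: the value above can be verified by simulating the walk under the policy that always drifts towards 0, i.e. $u = p$ for $x > 0$ and $u = 1-p$ for $x < 0$, with either choice at 0. That this policy is optimal is the expected answer, assumed here rather than proved.]

```python
import random

def simulate(p, steps=500_000, seed=0):
    """Simulate the controlled walk under the drift-towards-0 policy
    (u = p for x > 0, u = 1 - p for x < 0, u arbitrary at 0) and
    return the empirical long-run proportion of time spent at 0."""
    rng = random.Random(seed)
    x, at_zero = 0, 0
    for _ in range(steps):
        if x == 0:
            at_zero += 1
            u = p        # at 0 any u in [p, 1-p] achieves the optimum
        elif x > 0:
            u = p        # minimise the probability of stepping away from 0
        else:
            u = 1 - p    # maximise the probability of stepping towards 0
        x += 1 if rng.random() < u else -1
    return at_zero / steps

p = 0.25
print(simulate(p))                    # empirical proportion of time at 0
print((1 - 2 * p) / (2 * (1 - p)))   # claimed maximal value
```

The empirical proportion should agree with $(1-2p)/(2(1-p))$ to within Monte Carlo error; for instance, with $p = 1/4$ the claimed value is $1/3$.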