A discrete-time controlled Markov process evolves according to
$$X_{t+1} = \lambda X_t + u_t + \varepsilon_t, \qquad t = 0, 1, \ldots,$$
where the $\varepsilon_t$ are independent zero-mean random variables with common variance $\sigma^2$, and $\lambda$ is a known constant.
Consider the problem of minimizing
$$F_{t,T}(x) = E\left[\sum_{j=t}^{T-1} \beta^{j-t} C(X_j, u_j) + \beta^{T-t} R(X_T)\right],$$
where $C(x,u) = \tfrac{1}{2}(u^2 + a x^2)$, $\beta \in (0,1)$ and $R(x) = \tfrac{1}{2} a_0 x^2 + b_0$. Show that the optimal control at time $j$ takes the form $u_j = k_{T-j} X_j$ for certain constants $k_i$. Show also that the minimized value for $F_{t,T}(x)$ is of the form
$$\tfrac{1}{2} a_{T-t} x^2 + b_{T-t}$$
for certain constants $a_j, b_j$. Explain how these constants are to be calculated. Prove that the equation
$$f(z) \equiv a + \frac{\lambda^2 \beta z}{1 + \beta z} = z$$
has a unique positive solution $z = a^*$, and that the sequence $(a_j)_{j \geqslant 0}$ converges monotonically to $a^*$.
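As a hedged sketch of how the constants might be calculated, one can apply the dynamic programming equation with $s = T - t$ steps to go. The notation $V_s$ for the minimized value is introduced here for illustration and is not part of the question, and the sketch assumes $a > 0$ and $a_0 \geqslant 0$ so that each one-step minimization is strictly convex.

```latex
% V_s(x) = (1/2) a_s x^2 + b_s is the minimized cost with s steps to go; V_0 = R.
V_s(x) = \min_u \left\{ \tfrac{1}{2}\left(u^2 + a x^2\right)
         + \beta \left[ \tfrac{1}{2} a_{s-1}\left( (\lambda x + u)^2 + \sigma^2 \right)
         + b_{s-1} \right] \right\}

% Setting the derivative in u to zero: u + \beta a_{s-1} (\lambda x + u) = 0, so
u = k_s x, \qquad k_s = -\frac{\beta a_{s-1} \lambda}{1 + \beta a_{s-1}}

% Substituting back and matching the coefficient of x^2 and the constant term:
a_s = a + \frac{\lambda^2 \beta a_{s-1}}{1 + \beta a_{s-1}} = f(a_{s-1}),
\qquad
b_s = \beta b_{s-1} + \tfrac{1}{2} \beta \sigma^2 a_{s-1}
```

Under these assumptions the $a_s$ are iterates of the map $f$ started from $a_0$, which is why the fixed-point equation $f(z) = z$ governs their limit.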
Prove that the sequence $(b_j)_{j \geqslant 0}$ converges, to the limit
$$b^* \equiv \frac{\beta \sigma^2 a^*}{2(1 - \beta)}.$$
Finally, prove that $k_j \to k^* \equiv -\beta a^* \lambda / (1 + \beta a^*)$.
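The limits claimed above can be checked numerically. The following is a minimal sketch, not a proof: it assumes the one-step recursions a_s = f(a_{s-1}), b_s = β(b_{s-1} + σ² a_{s-1}/2) and k_s = -β a_{s-1} λ/(1 + β a_{s-1}) that come out of the dynamic programming equation, and it assumes a > 0. The function names and the parameter values are hypothetical choices made for illustration.

```python
import math


def iterate_constants(a, lam, beta, sigma2, a0, b0, n_steps):
    """Return lists (a_s), (b_s) for s = 0..n_steps and (k_s) for s = 1..n_steps."""
    a_seq, b_seq, k_seq = [a0], [b0], []
    for _ in range(n_steps):
        a_prev, b_prev = a_seq[-1], b_seq[-1]
        denom = 1.0 + beta * a_prev
        # Optimal feedback gain with one more step to go.
        k_seq.append(-beta * a_prev * lam / denom)
        # a_s = f(a_{s-1}) with f(z) = a + lam^2 * beta * z / (1 + beta * z).
        a_seq.append(a + lam**2 * beta * a_prev / denom)
        # The noise contributes beta * sigma2 * a_{s-1} / 2 at each step.
        b_seq.append(beta * (b_prev + 0.5 * sigma2 * a_prev))
    return a_seq, b_seq, k_seq


def limits(a, lam, beta, sigma2):
    """Return (a*, b*, k*) from the closed-form expressions in the question."""
    # f(z) = z rearranges to beta*z^2 + (1 - beta*a - beta*lam^2)*z - a = 0.
    # For a > 0 the product of the roots is negative, so exactly one is positive.
    lin = 1.0 - beta * a - beta * lam**2
    a_star = (-lin + math.sqrt(lin**2 + 4.0 * beta * a)) / (2.0 * beta)
    b_star = beta * sigma2 * a_star / (2.0 * (1.0 - beta))
    k_star = -beta * a_star * lam / (1.0 + beta * a_star)
    return a_star, b_star, k_star


# Illustrative parameter values (assumptions, not taken from the question).
params = dict(a=1.0, lam=1.2, beta=0.9, sigma2=0.5)
a_seq, b_seq, k_seq = iterate_constants(a0=0.0, b0=0.0, n_steps=200, **params)
a_star, b_star, k_star = limits(**params)
print("a_200 - a* =", a_seq[-1] - a_star)
print("b_200 - b* =", b_seq[-1] - b_star)
print("k_200 - k* =", k_seq[-1] - k_star)
```

Starting from a_0 = 0, which lies below a*, the iterates a_s should increase towards a*. Starting above a* should give a decreasing sequence instead, in line with the monotone convergence asked for in the question.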