Paper 2, Section II, K

Optimization and Control
Part II, 2013

Suppose {xt}t0\left\{x_{t}\right\}_{t \geqslant 0} is a Markov chain. Consider the dynamic programming equation

Fs(x)=max{r(x),βE[Fs1(x1)x0=x]},s=1,2,,F_{s}(x)=\max \left\{r(x), \beta E\left[F_{s-1}\left(x_{1}\right) \mid x_{0}=x\right]\right\}, \quad s=1,2, \ldots,

with r(x)>0,β(0,1)r(x)>0, \beta \in(0,1), and F0(x)=0F_{0}(x)=0. Prove that:

(i) Fs(x)F_{s}(x) is nondecreasing in ss;

(ii) Fs(x)F(x)F_{s}(x) \leqslant F(x), where F(x)F(x) is the value function of an infinite-horizon problem that you should describe;

(iii) F(x)=limsFs(x)=F(x)F_{\infty}(x)=\lim _{s \rightarrow \infty} F_{s}(x)=F(x).

A coin lands heads with probability pp. A statistician wishes to choose between: H0:p=1/3H_{0}: p=1 / 3 and H1:p=2/3H_{1}: p=2 / 3, one of which is true. Prior probabilities of H1H_{1} and H0H_{0} in the ratio x:1x: 1 change after one toss of the coin to ratio 2x:12 x: 1 (if the toss was a head) or to ratio x:2x: 2 (if the toss was a tail). What problem is being addressed by the following dynamic programming equation?

F(x)=max{11+x,x1+x,β[(11+x23+x1+x13)F(x/2)+(11+x13+x1+x23)F(2x)]}F(x)=\max \left\{\frac{1}{1+x}, \frac{x}{1+x}, \beta\left[\left(\frac{1}{1+x} \frac{2}{3}+\frac{x}{1+x} \frac{1}{3}\right) F(x / 2)+\left(\frac{1}{1+x} \frac{1}{3}+\frac{x}{1+x} \frac{2}{3}\right) F(2 x)\right]\right\}

Prove that G(x)=(1+x)F(x)G(x)=(1+x) F(x) is a convex function of xx.

By sketching a graph of GG, describe the form of the optimal policy.