2.II.29I

A policy $\pi$ is to be chosen to maximize

$$F(\pi, x)=\mathbb{E}_{\pi}\left[\sum_{t \geqslant 0} \beta^{t} r\left(x_{t}, u_{t}\right) \mid x_{0}=x\right],$$

where $0<\beta \leqslant 1$. Assuming that $r \geqslant 0$, prove that $\pi$ is optimal if $F(\pi, x)$ satisfies the optimality equation.
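For reference, the optimality equation is presumably the standard dynamic programming equation for this problem. The question does not write it out, so the form below is an assumption; it is stated for a candidate value function $F(x)$, with the supremum taken over the actions available in state $x$:

$$F(x)=\sup_{u}\left\{r(x, u)+\beta\, \mathbb{E}\left[F\left(x_{1}\right) \mid x_{0}=x,\ u_{0}=u\right]\right\}.$$

Under this reading, the claim to be proved is that if $F(\pi, \cdot)$ satisfies this equation in place of $F$, then $F(\pi, x) \geqslant F(\pi', x)$ for every policy $\pi'$ and every $x$.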

An investor receives at time $t$ an income of $x_{t}$, of which he spends $u_{t}$, subject to $0 \leqslant u_{t} \leqslant x_{t}$. The reward is $r\left(x_{t}, u_{t}\right)=u_{t}$, and his income evolves as

$$x_{t+1}=x_{t}+\left(x_{t}-u_{t}\right) \varepsilon_{t},$$

where $\left(\varepsilon_{t}\right)_{t \geqslant 0}$ is a sequence of independent random variables with common mean $\theta>0$. If $0<\beta \leqslant 1 /(1+\theta)$, show that the optimal policy is to take $u_{t}=x_{t}$ for all $t$.
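As a numerical sanity check (not a proof), the sketch below compares one simple family of policies: save everything ($u_{t}=0$) for a fixed number of periods, then spend all income ($u_{t}=x_{t}$) forever. Because the $\varepsilon_{t}$ are independent with mean $\theta$, the expected income after `delay` periods of saving is $x(1+\theta)^{\text{delay}}$, and spending a constant income $y$ forever is worth $y /(1-\beta)$ when $\beta<1$. The function name and parameters are hypothetical, introduced only for illustration.

```python
def delayed_consumption_value(x, beta, theta, delay):
    """Expected discounted reward of a hypothetical 'save, then spend' policy.

    The policy sets u_t = 0 for t < delay and u_t = x_t afterwards.
    Assumes 0 < beta < 1 so that the geometric series converges.
    """
    if not 0 < beta < 1:
        raise ValueError("beta must lie strictly between 0 and 1")
    if delay < 0:
        raise ValueError("delay must be non-negative")
    # Expected income once saving stops: each saving period multiplies
    # the expected income by (1 + theta).
    expected_income = x * (1 + theta) ** delay
    # From time `delay` onward the income is constant, so the remaining
    # reward is a geometric series discounted back to time 0.
    return beta ** delay * expected_income / (1 - beta)


if __name__ == "__main__":
    # Example values chosen so that beta < 1 / (1 + theta).
    for delay in range(4):
        print(delay, delayed_consumption_value(1.0, 0.5, 0.25, delay))
```

The value equals $(\beta(1+\theta))^{\text{delay}}\, x /(1-\beta)$, so within this family delaying consumption cannot help when $\beta(1+\theta) \leqslant 1$. This is consistent with, but does not replace, the verification that $F(x)=x /(1-\beta)$ satisfies the optimality equation.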

What can you say about the problem if $\beta>1 /(1+\theta)$?