B3.14
Consider a scalar system with , where is a sequence of independent random variables, uniform on the interval , with . We wish to choose to minimize the expected value of
where is chosen knowing but not . Prove that the minimal expected cost can be written and derive a recurrence for calculating .
How does your answer change if is constrained to lie in the set
Consider a stopping problem for which there are two options in state :
(1) stop: paying a terminal cost ; no further costs are incurred;
(2) continue: choosing , paying , and moving to state
Consider the problem of minimizing total expected cost subject to the constraint that no more than continuation steps are allowed. Suppose . Show that an optimal policy stops if and only if either continuation steps have already been taken or .
[Hint: Use induction on to show that a one-step-look-ahead rule is optimal. You should not need to find the optimal for the continuation steps.]