Paper 2, Section II, J
Describe the elements of a generic stochastic dynamic programming equation for the problem of maximizing the expected sum of discounted rewards accrued at times $0, 1, \dots$. What is meant by the positive case? What is specially true in this case that is not true in general?
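For reference, one common way to write such an equation is sketched below. The notation is an assumption made for illustration and is not taken from the question: $x$ is the state, $u$ the control, $r(x, u)$ the one-step reward, $\beta$ the discount factor, and $F$ the infinite-horizon optimal value function. In this notation the positive case is usually taken to mean $r \ge 0$.

```latex
F(x) = \sup_{\pi} \mathbb{E}_{\pi}\Bigl[ \sum_{t=0}^{\infty} \beta^{t} r(x_t, u_t) \Bigm| x_0 = x \Bigr],
\qquad
F(x) = \sup_{u} \Bigl\{ r(x, u) + \beta \, \mathbb{E}\bigl[ F(x_1) \mid x_0 = x,\ u_0 = u \bigr] \Bigr\}.
```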
An investor owns a single asset which he may sell once, on any of the days $t = 0, 1, \dots$. On day $t$ he will be offered a price $X_t$. This value is unknown until day $t$, is independent of all other offers, and a priori it is uniformly distributed on $[0, 1]$. Offers remain open, so that on day $t$ he may sell the asset for the best of the offers made on days $0, \dots, t$. If he sells for $x$ on day $t$ then the reward is $x\beta^t$. Show from first principles that if $0 < \beta < 1$ then there exists $\bar{x}$ such that the expected reward is maximized by selling the first day the offer is at least $\bar{x}$.
For $\beta = 4/5$, find both $\bar{x}$ and the expected reward under the optimal policy.
Explain what is special about the case $\beta = 1$.
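A numerical sanity check for the threshold can be sketched as follows. It is a minimal sketch under stated assumptions, not a model solution: offers are taken to be uniform on [0, 1], selling at price x on day t earns x β^t, and the expected reward is measured on day 0 before the first offer is seen. Under those assumptions, indifference between selling at the threshold and waiting one more day gives x̄ = β E[max(x̄, X)] = β (1 + x̄²) / 2. The function names are hypothetical.

```python
import math


def threshold_closed_form(beta):
    """Smaller root of beta*x**2 - 2*x + beta = 0, i.e. of x = beta*(1 + x**2)/2.

    Assumes offers are uniform on [0, 1] and 0 < beta <= 1.
    """
    return (1.0 - math.sqrt(1.0 - beta * beta)) / beta


def threshold_fixed_point(beta, tol=1e-12, max_iter=10_000):
    """Iterate x <- beta*(1 + x**2)/2 from x = 0 until successive values agree."""
    x = 0.0
    for _ in range(max_iter):
        x_next = beta * (1.0 + x * x) / 2.0
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x  # may not have converged (convergence is slow as beta -> 1)


def expected_reward(beta):
    """E[max(X_0, xbar)]: value on day 0 before the first offer is seen (assumed convention)."""
    xbar = threshold_closed_form(beta)
    return (1.0 + xbar * xbar) / 2.0


if __name__ == "__main__":
    for beta in (0.5, 0.8, 0.99):
        print(beta, threshold_closed_form(beta), threshold_fixed_point(beta),
              expected_reward(beta))
```

The closed form and the iteration should agree for any discount factor strictly between 0 and 1. As the discount factor approaches 1 the computed threshold approaches 1, which is a useful hint when thinking about the undiscounted case.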