Paper 4, Section II, J
A factory has a tank of capacity in which it stores chemical waste. Each week the factory produces, independently of other weeks, an amount of waste that is equally likely to be 0, 1, or . If the amount of waste exceeds the remaining space in the tank then the excess must be specially handled at a cost of per . The tank may be emptied or not at the end of each week. Emptying costs , plus a variable cost of for each of its content. It is always emptied when it ends the week full.
It is desired to minimize the average cost per week. Write down equations from which one can determine when it is optimal to empty the tank.
Find the average cost per week of a policy , which empties the tank if and only if its content at the end of the week is 2 or .
Describe the policy improvement algorithm. Explain why, starting from , this algorithm will find an optimal policy in at most three iterations.
Prove that is optimal if and only if .
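The policy evaluation and policy improvement steps asked about above can be explored numerically. The following is a minimal sketch, not a model solution. The numerical values in the question did not survive in this copy, so every number in the code is an assumption made purely for illustration: a tank capacity of 3 units, weekly waste of 0, 1 or 2 units, and cost parameters named `C` (special handling per unit of excess), `D` (fixed emptying cost) and `alpha` (variable emptying cost per unit). The state is taken to be the content at the end of the week, before the emptying decision, and relative values are normalised by h(0) = 0.

```python
from fractions import Fraction

# All values below are hypothetical; the text above does not give them.
CAPACITY = 3                    # assumed tank capacity
WASTE = (0, 1, 2)               # assumed equally likely weekly amounts
P = Fraction(1, len(WASTE))

def step(level, w):
    """Return (overflow, end-of-week level) when waste w arrives at `level`."""
    total = level + w
    return max(total - CAPACITY, 0), min(total, CAPACITY)

def solve(A, b):
    """Gauss-Jordan elimination over exact fractions."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def evaluate(policy, C, D, alpha):
    """Solve lam + h(y) = cost(y) + E[h(next)] with h(0) = 0.

    `policy` is the set of end-of-week levels at which the tank is emptied.
    Returns (average cost per week, list of relative values h).
    """
    A, b = [], []
    for y in range(CAPACITY + 1):
        row = [Fraction(0)] * (CAPACITY + 1)   # columns: lam, h(1), ..., h(CAPACITY)
        row[0] = Fraction(1)
        if y > 0:
            row[y] += 1
        start = 0 if y in policy else y
        cost = Fraction(D) + Fraction(alpha) * y if y in policy else Fraction(0)
        for w in WASTE:
            overflow, nxt = step(start, w)
            cost += P * C * overflow
            if nxt > 0:
                row[nxt] -= P
        A.append(row)
        b.append(cost)
    sol = solve(A, b)
    return sol[0], [Fraction(0)] + sol[1:]

def q_value(level, empty, h, C, D, alpha):
    """Immediate cost plus expected relative value of the next level."""
    start = 0 if empty else level
    cost = Fraction(D) + Fraction(alpha) * level if empty else Fraction(0)
    for w in WASTE:
        overflow, nxt = step(start, w)
        cost += P * (C * overflow + h[nxt])
    return cost

def improve(policy, h, C, D, alpha):
    """One improvement step; a full tank is always emptied, ties keep the old action."""
    new = {CAPACITY}
    for y in range(CAPACITY):
        keep = q_value(y, False, h, C, D, alpha)
        empty = q_value(y, True, h, C, D, alpha)
        if empty < keep or (empty == keep and y in policy):
            new.add(y)
    return frozenset(new)

def policy_iteration(policy, C, D, alpha):
    """Alternate evaluation and improvement until the policy stops changing."""
    while True:
        lam, h = evaluate(policy, C, D, alpha)
        new_policy = improve(policy, h, C, D, alpha)
        if new_policy == policy:
            return policy, lam
        policy = new_policy

if __name__ == "__main__":
    start = frozenset({2, 3})                  # empty at content 2 or 3
    print(evaluate(start, C=6, D=3, alpha=1)[0])
    print(policy_iteration(start, C=3, D=3, alpha=1))
```

Under these assumed values, varying `C` while holding `D` and `alpha` fixed shows where the starting policy stops being returned unchanged, which can serve as a check on an analytical answer. Exact fractions are used rather than floating point so that ties in the improvement step are detected reliably.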