Paper 1, Section II, H

Consider the general linear model $Y=X \theta+\epsilon$ where $X$ is a known $n \times p$ matrix, $\theta$ is an unknown $p \times 1$ vector of parameters, and $\epsilon$ is an $n \times 1$ vector of independent $N\left(0, \sigma^{2}\right)$ random variables with unknown variance $\sigma^{2}$. Assume the $p \times p$ matrix $X^{T} X$ is invertible. Let

$$\begin{aligned} \hat{\theta} &=\left(X^{T} X\right)^{-1} X^{T} Y, \\ \hat{\epsilon} &=Y-X \hat{\theta}. \end{aligned}$$

What are the distributions of $\hat{\theta}$ and $\hat{\epsilon}$? Show that $\hat{\theta}$ and $\hat{\epsilon}$ are uncorrelated.
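A sketch of the standard argument for this part, writing $P=X\left(X^{T} X\right)^{-1} X^{T}$ for the projection onto the column space of $X$ (the symbols $P$ and $I_{n}$ are notation introduced here, not part of the question). Since $(I_{n}-P) X=0$ and $I_{n}-P$ is symmetric and idempotent,

$$\begin{aligned} \hat{\theta} &=\theta+\left(X^{T} X\right)^{-1} X^{T} \epsilon \sim N_{p}\left(\theta, \sigma^{2}\left(X^{T} X\right)^{-1}\right), \\ \hat{\epsilon} &=\left(I_{n}-P\right) Y=\left(I_{n}-P\right) \epsilon \sim N_{n}\left(0, \sigma^{2}\left(I_{n}-P\right)\right), \\ \operatorname{cov}(\hat{\theta}, \hat{\epsilon}) &=\left(X^{T} X\right)^{-1} X^{T}\left(\sigma^{2} I_{n}\right)\left(I_{n}-P\right)^{T}=\sigma^{2}\left(X^{T} X\right)^{-1}\left(X^{T}-X^{T} P\right)=0, \end{aligned}$$

where the last equality uses $X^{T} P=X^{T}$. The distribution of $\hat{\epsilon}$ is degenerate, because $I_{n}-P$ has rank $n-p$.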

Four apple trees stand in a $2 \times 2$ rectangular grid. The annual yield of the tree at coordinate $(i, j)$ conforms to the model

$$y_{i j}=\alpha_{i}+\beta x_{i j}+\epsilon_{i j}, \quad i, j \in\{1,2\},$$

where $x_{i j}$ is the amount of fertilizer applied to tree $(i, j)$, $\alpha_{1}$ and $\alpha_{2}$ may differ because of varying soil across rows, and the $\epsilon_{i j}$ are $N\left(0, \sigma^{2}\right)$ random variables that are independent of one another and from year to year. The following two possible experiments are to be compared:

$$\mathrm{I}:\left(x_{i j}\right)=\left(\begin{array}{cc} 0 & 1 \\ 2 & 3 \end{array}\right) \quad \text{and} \quad \mathrm{II}:\left(x_{i j}\right)=\left(\begin{array}{cc} 0 & 2 \\ 3 & 1 \end{array}\right).$$

Represent these as general linear models, with $\theta=\left(\alpha_{1}, \alpha_{2}, \beta\right)^{T}$. Compare the variances of the estimates of $\beta$ under I and II.
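The comparison can be checked numerically. The sketch below is illustrative only: it assumes the observations are ordered $(1,1),(1,2),(2,1),(2,2)$, uses NumPy, and the helper names `design` and `beta_variance_factor` are hypothetical. It evaluates the $(3,3)$ entry of $\left(X^{T} X\right)^{-1}$, which is $\operatorname{var}(\hat{\beta}) / \sigma^{2}$.

```python
import numpy as np

def design(x):
    """Design matrix with columns (alpha_1 indicator, alpha_2 indicator, x).

    Rows are ordered (1,1), (1,2), (2,1), (2,2); this ordering is an assumption.
    """
    x = np.asarray(x, dtype=float).ravel()   # row-major flattening of the 2x2 grid
    row1 = np.array([1.0, 1.0, 0.0, 0.0])    # trees in row 1 share alpha_1
    row2 = 1.0 - row1                        # trees in row 2 share alpha_2
    return np.column_stack([row1, row2, x])

def beta_variance_factor(x):
    """Return [(X^T X)^{-1}]_{33}, i.e. var(beta_hat) / sigma^2."""
    X = design(x)
    return np.linalg.inv(X.T @ X)[2, 2]

var_I = beta_variance_factor([[0, 1], [2, 3]])
var_II = beta_variance_factor([[0, 2], [3, 1]])
print(var_I, var_II)
```

Under these assumptions the factors are $1$ for I and $1/4$ for II, matching $\operatorname{var}(\hat{\beta})=\sigma^{2} / \sum_{i, j}\left(x_{i j}-\bar{x}_{i}\right)^{2}$, where $\bar{x}_{i}$ is the mean fertilizer level in row $i$: experiment II spreads the fertilizer levels more widely within each row.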

With II the following yields are observed:

$$\left(y_{i j}\right)=\left(\begin{array}{ll} 100 & 300 \\ 600 & 400 \end{array}\right)$$

Forecast the total yield that will be obtained next year if no fertilizer is used. What is the $95 \%$ predictive interval for this yield?
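One way to check the forecast numerically is sketched below. It assumes the row ordering $(1,1),(1,2),(2,1),(2,2)$, takes the total yield to be the sum of four new independent observations with $x_{i j}=0$, and uses the usual $t_{n-p}$ predictive interval; the variable names are illustrative. With $n-p=1$ the $t$ distribution is Cauchy, so its quantile has the closed form $\tan (\pi(q-1 / 2))$ and no statistics library is needed.

```python
import math
import numpy as np

# Experiment II, rows ordered (1,1), (1,2), (2,1), (2,2) (assumed ordering).
X = np.array([[1, 0, 0],
              [1, 0, 2],
              [0, 1, 3],
              [0, 1, 1]], dtype=float)
y = np.array([100, 300, 600, 400], dtype=float)
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
theta_hat = XtX_inv @ X.T @ y             # least squares estimate of (alpha_1, alpha_2, beta)
resid = y - X @ theta_hat
sigma2_hat = resid @ resid / (n - p)      # unbiased estimate of sigma^2

# Total yield with no fertilizer: two trees per row, so the mean is 2*alpha_1 + 2*alpha_2.
c = np.array([2.0, 2.0, 0.0])
forecast = c @ theta_hat

# Prediction error variance is sigma^2 * (c^T (X^T X)^{-1} c + 4);
# the 4 accounts for the four new independent error terms.
var_factor = c @ XtX_inv @ c + 4.0

# Upper 2.5% point of t with 1 degree of freedom (Cauchy quantile).
t_crit = math.tan(math.pi * (0.975 - 0.5))
half_width = t_crit * math.sqrt(sigma2_hat * var_factor)
print(forecast, half_width)
```

With these data the fit is exact ($\hat{\alpha}_{1}=100$, $\hat{\alpha}_{2}=300$, $\hat{\beta}=100$), so the residual sum of squares is zero up to rounding and the interval collapses to the point forecast of $800$.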