3.I.5J

Consider the linear model $Y=X \beta+\varepsilon$ . Here, $Y$ is an $n$ -dimensional vector of observations, $X$ is a known $n \times p$ matrix, $\beta$ is an unknown $p$ -dimensional parameter, and $\varepsilon \sim N_{n}\left(0, \sigma^{2} I\right)$ , with $\sigma^{2}$ unknown. Assume that $X$ has full rank and that $p \ll n$ . Suppose that we are interested in checking the assumption $\varepsilon \sim N_{n}\left(0, \sigma^{2} I\right)$ . Let $\hat{Y}=X \hat{\beta}$ , where $\hat{\beta}$ is the maximum likelihood estimate of $\beta$ . Write in terms of $X$ an expression for the projection matrix $P=\left(p_{i j}: 1 \leqslant i, j \leqslant n\right)$ which appears in the maximum likelihood equation $\hat{Y}=X \hat{\beta}=P Y$ .

Find the distribution of $\hat{\varepsilon}=Y-\hat{Y}$ , and show that, in general, the components of $\hat{\varepsilon}$ are not independent.

A standard procedure used to check our assumption on $\varepsilon$ is to check whether the studentized fitted residuals

\hat{\eta}_{i}=\frac{\hat{\varepsilon}_{i}}{\tilde{\sigma} \sqrt{1-p_{i i}}}, \quad i=1, \ldots, n

look like a random sample from an $N(0,1)$ distribution. Here,

\tilde{\sigma}^{2}=\frac{1}{n-p}\|Y-X \hat{\beta}\|^{2} .

Say, briefly, how you might do this in R.

This procedure appears to ignore the dependence between the components of $\hat{\varepsilon}$ noted above. What feature of the given set-up makes this reasonable?