Paper 3, Section I, J

Statistical Modelling
Part II, 2013

Consider the linear model $Y = X\beta + \epsilon$ where $Y = (Y_{1}, \ldots, Y_{n})^{\mathrm{T}}$, $\beta = (\beta_{1}, \ldots, \beta_{p})^{\mathrm{T}}$, and $\epsilon = (\epsilon_{1}, \ldots, \epsilon_{n})^{\mathrm{T}}$, with $\epsilon_{1}, \ldots, \epsilon_{n}$ independent $N(0, \sigma^{2})$ random variables. The $(n \times p)$ matrix $X$ is known and is of full rank $p < n$. Give expressions for the maximum likelihood estimators $\widehat{\beta}$ and $\widehat{\sigma}^{2}$ of $\beta$ and $\sigma^{2}$ respectively, and state their joint distribution. Show that $\widehat{\beta}$ is unbiased whereas $\widehat{\sigma}^{2}$ is biased.
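For orientation, the standard textbook results that this part refers to can be summarised as below. This is a sketch of the usual statements under the model assumptions above, not a full model answer; the derivations (for example via the projection onto the column space of $X$) are still required.

```latex
\begin{aligned}
\widehat{\beta} &= (X^{\mathrm{T}} X)^{-1} X^{\mathrm{T}} Y,
\qquad
\widehat{\sigma}^{2} = \frac{1}{n} \bigl\| Y - X \widehat{\beta} \bigr\|^{2}, \\
\widehat{\beta} &\sim N_{p}\bigl(\beta,\ \sigma^{2} (X^{\mathrm{T}} X)^{-1}\bigr),
\qquad
\frac{n \widehat{\sigma}^{2}}{\sigma^{2}} \sim \chi^{2}_{n-p},
\qquad
\widehat{\beta} \text{ and } \widehat{\sigma}^{2} \text{ independent}, \\
E(\widehat{\beta}) &= \beta,
\qquad
E(\widehat{\sigma}^{2}) = \frac{n-p}{n}\, \sigma^{2} < \sigma^{2}.
\end{aligned}
```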

Suppose that a new variable $Y^{*}$ is to be observed, satisfying the relationship

$$Y^{*} = x^{*\mathrm{T}} \beta + \epsilon^{*}$$

where $x^{*}$ $(p \times 1)$ is known, and $\epsilon^{*} \sim N(0, \sigma^{2})$ independently of $\epsilon$. We propose to predict $Y^{*}$ by $\widetilde{Y} = x^{*\mathrm{T}} \widehat{\beta}$. Identify the distribution of

$$\frac{Y^{*} - \widetilde{Y}}{\tau \tilde{\sigma}}$$

where

$$\begin{aligned} \tilde{\sigma}^{2} &= \frac{n}{n-p}\, \widehat{\sigma}^{2}, \\ \tau^{2} &= x^{*\mathrm{T}} \left(X^{\mathrm{T}} X\right)^{-1} x^{*} + 1. \end{aligned}$$
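The standard result is that this ratio follows a Student $t$ distribution with $n-p$ degrees of freedom, since $Y^{*} - \widetilde{Y} \sim N(0, \sigma^{2}\tau^{2})$ independently of $\tilde{\sigma}^{2}$. The sketch below is a rough Monte Carlo sanity check of that claim, not a proof. It assumes a toy design with an intercept and a single covariate ($p = 2$, $n = 10$); the function names, parameter values and seed are all illustrative choices, not part of the question.

```python
import math
import random


def standardised_error(x, y, x_star, y_star):
    """Return (Y* - Y~) / (tau * sigma~) for design rows (1, x_i), so p = 2."""
    n, p = len(x), 2
    # Entries of X^T X for the design matrix with rows (1, x_i).
    s0, s1, s2 = float(n), sum(x), sum(v * v for v in x)
    det = s0 * s2 - s1 * s1
    # Explicit 2x2 inverse of X^T X.
    i00, i01, i11 = s2 / det, -s1 / det, s0 / det
    # X^T Y, then beta-hat = (X^T X)^{-1} X^T Y.
    t0, t1 = sum(y), sum(a * b for a, b in zip(x, y))
    b0 = i00 * t0 + i01 * t1
    b1 = i01 * t0 + i11 * t1
    # sigma~^2 = RSS / (n - p), i.e. n / (n - p) times the MLE of sigma^2.
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    sigma_tilde2 = rss / (n - p)
    # tau^2 = x*^T (X^T X)^{-1} x* + 1 with x* = (1, x_star).
    tau2 = i00 + 2.0 * i01 * x_star + i11 * x_star ** 2 + 1.0
    y_tilde = b0 + b1 * x_star
    return (y_star - y_tilde) / math.sqrt(tau2 * sigma_tilde2)


def simulate(reps=20000, n=10, beta=(1.0, 2.0), sigma=1.5, x_star=0.7, seed=0):
    """Draw `reps` independent copies of the standardised prediction error."""
    rng = random.Random(seed)
    x = [i / (n - 1) for i in range(n)]  # fixed, known covariate values
    draws = []
    for _ in range(reps):
        y = [beta[0] + beta[1] * xi + rng.gauss(0.0, sigma) for xi in x]
        y_star = beta[0] + beta[1] * x_star + rng.gauss(0.0, sigma)
        draws.append(standardised_error(x, y, x_star, y_star))
    return draws


draws = simulate()
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
# A t distribution with nu = n - p = 8 degrees of freedom has mean 0
# and variance nu / (nu - 2) = 4/3.
print(f"sample mean {mean:.3f}, sample variance {var:.3f}, t_8 variance {8 / 6:.3f}")
```

The sample mean should be close to 0 and the sample variance close to $4/3$, noticeably larger than the value 1 that a standard normal would give. The extra spread comes from estimating $\sigma$ by $\tilde{\sigma}$.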