4.I.5I

Statistical Modelling
Part II, 2007

Consider the normal linear model $Y = X\beta + \varepsilon$ in vector notation, where

$$Y=\left(\begin{array}{c}Y_{1} \\ \vdots \\ Y_{n}\end{array}\right), \quad X=\left(\begin{array}{c}x_{1}^{\mathrm{T}} \\ \vdots \\ x_{n}^{\mathrm{T}}\end{array}\right), \quad \beta=\left(\begin{array}{c}\beta_{1} \\ \vdots \\ \beta_{p}\end{array}\right), \quad \varepsilon=\left(\begin{array}{c}\varepsilon_{1} \\ \vdots \\ \varepsilon_{n}\end{array}\right), \quad \varepsilon_{i} \sim \text{i.i.d. } N\left(0, \sigma^{2}\right),$$

where $x_{i}^{\mathrm{T}}=\left(x_{i1}, \ldots, x_{ip}\right)$ is known and $X$ is of full rank $(p<n)$. Give expressions for maximum likelihood estimators $\hat{\beta}$ and $\hat{\sigma}^{2}$ of $\beta$ and $\sigma^{2}$ respectively, and state their joint distribution.

Suppose that there is a new pair $\left(x^{*}, y^{*}\right)$, independent of $\left(x_{1}, y_{1}\right), \ldots,\left(x_{n}, y_{n}\right)$, satisfying the relationship

$$y^{*}=x^{*\mathrm{T}} \beta+\varepsilon^{*}, \quad \text{where} \quad \varepsilon^{*} \sim N\left(0, \sigma^{2}\right).$$

We suppose that $x^{*}$ is known, and estimate $y^{*}$ by $\tilde{y}=x^{*\mathrm{T}} \hat{\beta}$. State the distribution of

$$\frac{\tilde{y}-y^{*}}{\tilde{\sigma} \tau}, \quad \text{where} \quad \tilde{\sigma}^{2}=\frac{n}{n-p} \hat{\sigma}^{2} \quad \text{and} \quad \tau^{2}=x^{*\mathrm{T}}\left(X^{\mathrm{T}} X\right)^{-1} x^{*}+1.$$

Find the form of a $(1-\alpha)$-level prediction interval for $y^{*}$.
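
As a numerical illustration of the quantities defined above, the following is a minimal sketch rather than a model answer. It relies on the standard result that $(\tilde{y}-y^{*})/(\tilde{\sigma}\tau)$ follows a $t_{n-p}$ distribution, so that the interval takes the form $\tilde{y} \pm \tilde{\sigma}\,\tau\, t_{n-p}(\alpha/2)$, where $t_{n-p}(\alpha/2)$ denotes the upper $\alpha/2$ point. The function name `prediction_interval`, the argument `t_crit` (the caller supplies the $t_{n-p}$ quantile) and the small data set are all hypothetical choices made for this example.

```python
import numpy as np


def prediction_interval(X, y, x_star, t_crit):
    """Sketch of a prediction interval for y* in the normal linear model.

    t_crit is assumed to be the upper alpha/2 point of the t distribution
    with n - p degrees of freedom, supplied by the caller.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_star = np.asarray(x_star, dtype=float)
    n, p = X.shape

    # MLE of beta coincides with the least squares estimate.
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

    # MLE of sigma^2 divides the residual sum of squares by n.
    resid = y - X @ beta_hat
    sigma_hat2 = float(resid @ resid) / n

    # Rescaled variance estimate, as defined in the question.
    sigma_tilde2 = n / (n - p) * sigma_hat2

    # tau^2 = x*^T (X^T X)^{-1} x* + 1, computed via a linear solve.
    tau2 = float(x_star @ np.linalg.solve(X.T @ X, x_star)) + 1.0

    y_tilde = float(x_star @ beta_hat)
    half_width = t_crit * np.sqrt(sigma_tilde2 * tau2)
    return y_tilde, sigma_tilde2, tau2, (y_tilde - half_width, y_tilde + half_width)


# Illustrative data only: intercept plus one covariate, n = 4, p = 2.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [0, 1, 2, 4]
x_star = [1, 4]
t_crit = 4.303  # approximate upper 2.5% point of t with 2 degrees of freedom

y_tilde, sigma_tilde2, tau2, interval = prediction_interval(X, y, x_star, t_crit)
print(y_tilde, sigma_tilde2, tau2, interval)
```

Passing the quantile in as an argument keeps the sketch dependent on NumPy alone; if SciPy is available, `scipy.stats.t.ppf(1 - alpha / 2, n - p)` is one way to obtain it.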