1.I.5I

Statistical Modelling
Part II, 2006

Assume that observations $Y=\left(Y_{1}, \ldots, Y_{n}\right)^{T}$ satisfy the linear model

$$Y=X \beta+\epsilon$$

where $X$ is an $n \times p$ matrix of known constants of full rank $p<n$, where $\beta=\left(\beta_{1}, \ldots, \beta_{p}\right)^{T}$ is unknown and $\epsilon \sim N_{n}\left(0, \sigma^{2} I\right)$. Write down a $(1-\alpha)$-level confidence set for $\beta$.

Define Cook's distance for the observation $\left(x_{i}, Y_{i}\right)$, where $x_{i}^{T}$ is the $i$th row of $X$. Give its interpretation in terms of confidence sets for $\beta$.

In the above model with $n=50$ and $p=2$, you observe that one observation has Cook's distance 1.3. Would you be concerned about the influence of this observation?

[You may find some of the following facts useful:

(i) If $Z \sim \chi_{2}^{2}$, then $\mathbb{P}(Z \leqslant 0.21)=0.1$, $\mathbb{P}(Z \leqslant 1.39)=0.5$ and $\mathbb{P}(Z \leqslant 4.61)=0.9$.

(ii) If $Z \sim F_{2,48}$, then $\mathbb{P}(Z \leqslant 0.11)=0.1$, $\mathbb{P}(Z \leqslant 0.70)=0.5$ and $\mathbb{P}(Z \leqslant 2.42)=0.9$.

(iii) If $Z \sim F_{48,2}$, then $\mathbb{P}(Z \leqslant 0.41)=0.1$, $\mathbb{P}(Z \leqslant 1.42)=0.5$ and $\mathbb{P}(Z \leqslant 9.47)=0.9$. ]
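
As a reading aid, the objects the question asks for could be written down as in the sketch below. It uses the standard textbook forms and is not an official solution. The symbols $\hat\beta$, $\tilde\sigma^{2}$, $\hat\beta_{(-i)}$ and $F_{p,n-p}(\alpha)$ are notation assumed here; none of them appears in the question itself.

```latex
% Assumed notation (not given in the question):
%   \hat\beta        = (X^T X)^{-1} X^T Y, the least squares estimator
%   \tilde\sigma^2   = \|Y - X\hat\beta\|^2 / (n-p), the unbiased variance estimator
%   \hat\beta_{(-i)} = the least squares estimator with observation i deleted
%   F_{p,n-p}(\alpha) = the upper \alpha-point of the F_{p,n-p} distribution

% A (1-\alpha)-level confidence set for \beta:
\[
  \left\{ \beta \in \mathbb{R}^{p} :
    \frac{\| X(\beta - \hat\beta) \|^{2}}{p\,\tilde\sigma^{2}}
    \leqslant F_{p,n-p}(\alpha) \right\}
\]

% Cook's distance for the observation (x_i, Y_i):
\[
  D_{i} = \frac{\| X(\hat\beta - \hat\beta_{(-i)}) \|^{2}}{p\,\tilde\sigma^{2}}
\]
```

Under these definitions, $D_{i} = F_{p,n-p}(\alpha)$ means that deleting observation $i$ moves the estimate of $\beta$ to the boundary of the $(1-\alpha)$-level confidence set. With $n=50$ and $p=2$ the reference distribution is $F_{2,48}$, so fact (ii) is the relevant one: the value 1.3 lies between the 50% point (0.70) and the 90% point (2.42). A common rule of thumb, assumed here rather than stated in the question, treats a Cook's distance above the median of $F_{p,n-p}$ as grounds for concern, which would suggest this observation deserves a closer look.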