Paper 1, Section II, 13K

Statistical Modelling
Part II, 2014

Consider the normal linear model where the $n$-vector of responses $Y$ satisfies $Y = X\beta + \varepsilon$ with $\varepsilon \sim N_{n}(0, \sigma^{2} I)$. Here $X$ is an $n \times p$ matrix of predictors with full column rank where $n \geqslant p+3$, and $\beta \in \mathbb{R}^{p}$ is an unknown vector of regression coefficients. Let $X_{0}$ be the matrix formed from the first $p_{0} < p$ columns of $X$, and partition $\beta$ as $\beta = (\beta_{0}^{T}, \beta_{1}^{T})^{T}$ where $\beta_{0} \in \mathbb{R}^{p_{0}}$ and $\beta_{1} \in \mathbb{R}^{p-p_{0}}$. Denote the orthogonal projections onto the column spaces of $X$ and $X_{0}$ by $P$ and $P_{0}$ respectively.

It is desired to test the null hypothesis $H_{0}: \beta_{1} = 0$ against the alternative hypothesis $H_{1}: \beta_{1} \neq 0$. Recall that the $F$-test for testing $H_{0}$ against $H_{1}$ rejects $H_{0}$ for large values of

$$F = \frac{\left\|\left(P-P_{0}\right) Y\right\|^{2} /\left(p-p_{0}\right)}{\|(I-P) Y\|^{2} /(n-p)}.$$

Show that $(I-P)\left(P-P_{0}\right) = 0$, and hence prove that the numerator and denominator of $F$ are independent under either hypothesis.
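
The identity can be checked numerically before it is proved. The sketch below is a sanity check only, not a proof: it assumes NumPy is available, draws a random design matrix with illustrative sizes, and builds the projections from a reduced QR factorisation. The helper name `projection` and the chosen values of $n$, $p$ and $p_{0}$ are assumptions made for the illustration and are not part of the question.

```python
import numpy as np

def projection(A):
    """Orthogonal projection onto the column space of A (full column rank assumed)."""
    Q, _ = np.linalg.qr(A)  # reduced QR: columns of Q form an orthonormal basis
    return Q @ Q.T

rng = np.random.default_rng(0)
n, p, p0 = 12, 5, 2                # illustrative sizes satisfying n >= p + 3
X = rng.standard_normal((n, p))    # random design, full column rank with probability 1
P = projection(X)
P0 = projection(X[:, :p0])         # projection onto the span of the first p0 columns
I = np.eye(n)

# (I - P)(P - P0) should vanish up to floating-point rounding,
# because the column space of X0 is contained in that of X, so P P0 = P0.
product = (I - P) @ (P - P0)
max_abs_entry = np.abs(product).max()
print(max_abs_entry)
```

The check relies on the same fact the proof uses: $P P_{0} = P_{0}$ and $P^{2} = P$, so $(I-P)(P-P_{0}) = P - P_{0} - P + P_{0} = 0$.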

Show that

$$\mathbb{E}_{\beta, \sigma^{2}}(F) = \frac{(n-p)\left(\tau^{2}+1\right)}{n-p-2}$$

where $\tau^{2} = \dfrac{\left\|\left(P-P_{0}\right) X \beta\right\|^{2}}{\left(p-p_{0}\right) \sigma^{2}}$.

[In this question you may use the following facts without proof: $P-P_{0}$ is an orthogonal projection with rank $p-p_{0}$; any $n \times n$ orthogonal projection matrix $\Pi$ satisfies $\|\Pi \varepsilon\|^{2} \sim \sigma^{2} \chi_{\nu}^{2}$, where $\nu = \operatorname{rank}(\Pi)$; and if $Z \sim \chi_{\nu}^{2}$ then $\mathbb{E}\left(Z^{-1}\right) = (\nu-2)^{-1}$ when $\nu > 2$.]
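
The stated formula for the expectation of $F$ can also be compared with a Monte Carlo estimate. The following is a rough simulation sketch, assuming NumPy. The design matrix, the coefficient vector `beta`, the noise level `sigma` and the number of replications `reps` are all arbitrary illustrative choices, not values from the question. Here $n - p = 25$, so $F$ has finite variance and the sample mean is reasonably stable.

```python
import numpy as np

def projection(A):
    """Orthogonal projection onto the column space of A (full column rank assumed)."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

rng = np.random.default_rng(1)
n, p, p0 = 30, 5, 2                              # illustrative sizes
sigma = 1.5                                      # illustrative noise standard deviation
beta = np.array([1.0, -1.0, 0.5, 0.3, -0.4])     # illustrative coefficients, beta_1 != 0
X = rng.standard_normal((n, p))
P, P0 = projection(X), projection(X[:, :p0])
I = np.eye(n)

# Theoretical value (n - p)(tau^2 + 1) / (n - p - 2) from the question.
mean_vec = X @ beta
tau2 = np.sum(((P - P0) @ mean_vec) ** 2) / ((p - p0) * sigma**2)
theory = (n - p) * (tau2 + 1) / (n - p - 2)

# Simulate many response vectors, one per row, and compute F for each.
reps = 100_000
Y = mean_vec + sigma * rng.standard_normal((reps, n))
# P - P0 and I - P are symmetric, so each row of Y @ M is (M y)^T.
num = np.sum((Y @ (P - P0)) ** 2, axis=1) / (p - p0)
den = np.sum((Y @ (I - P)) ** 2, axis=1) / (n - p)
mc_mean = np.mean(num / den)
print(theory, mc_mean)
```

The two printed numbers should agree to within Monte Carlo error. Agreement here supports the formula but does not replace the derivation, which uses the independence of numerator and denominator together with $\mathbb{E}\left(Z^{-1}\right) = (\nu-2)^{-1}$ for $\nu = n-p$.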