Paper 4, Section II, J

Statistical Modelling
Part II, 2015

Consider the normal linear model where the $n$-vector of responses $Y$ satisfies $Y = X\beta + \varepsilon$ with $\varepsilon \sim N_{n}(0, \sigma^{2} I)$. Here $X$ is an $n \times p$ matrix of predictors with full column rank where $p \geqslant 3$ and $\beta \in \mathbb{R}^{p}$ is an unknown vector of regression coefficients. For $j \in \{1, \ldots, p\}$, denote the $j$th column of $X$ by $X_{j}$, and let $X_{-j}$ be $X$ with its $j$th column removed. Suppose $X_{1} = 1_{n}$ where $1_{n}$ is an $n$-vector of 1's. Denote the maximum likelihood estimate of $\beta$ by $\hat{\beta}$. Write down the formula for $\hat{\beta}_{j}$ involving $P_{-j}$, the orthogonal projection onto the column space of $X_{-j}$.
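As a hedged numerical sanity check (not part of the original question), one standard form of the requested formula is $\hat{\beta}_{j} = X_{j}^{T}(I - P_{-j})Y / \|(I - P_{-j})X_{j}\|^{2}$. The sketch below compares it with the ordinary least squares fit on simulated data. The sample size, true coefficients, seed, and the helper name `beta_hat_j` are illustrative assumptions, and columns are indexed from 0 in the code.

```python
import numpy as np

# Simulated design with an intercept column; all values are assumptions.
rng = np.random.default_rng(0)
n, p = 50, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
Y = X @ beta_true + rng.normal(size=n)

# Full least squares fit (the MLE of beta in the normal linear model).
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]


def beta_hat_j(X, Y, j):
    """Hypothetical helper: beta_hat_j via the residual (I - P_{-j}) X_j."""
    X_j = X[:, j]
    X_minus_j = np.delete(X, j, axis=1)
    # Regress X_j on the remaining columns; the residual is (I - P_{-j}) X_j.
    coef = np.linalg.lstsq(X_minus_j, X_j, rcond=None)[0]
    resid = X_j - X_minus_j @ coef
    # (I - P_{-j}) is symmetric and idempotent, so resid @ Y = X_j^T (I - P_{-j}) Y.
    return resid @ Y / (resid @ resid)


partial = np.array([beta_hat_j(X, Y, j) for j in range(p)])
print(np.allclose(partial, beta_hat))
```

The residual is computed by a second least squares fit rather than by forming $P_{-j}$ explicitly, which avoids building an $n \times n$ matrix.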

Consider $j, k \in \{2, \ldots, p\}$ with $j < k$. By thinking about the orthogonal projection of $X_{j}$ onto $X_{k}$, show that

$$\operatorname{var}\left(\hat{\beta}_{j}\right) \geqslant \frac{\sigma^{2}}{\left\|X_{j}\right\|^{2}}\left(1-\left(\frac{X_{k}^{T} X_{j}}{\left\|X_{k}\right\|\left\|X_{j}\right\|}\right)^{2}\right)^{-1}. \tag{$*$}$$

[You may use standard facts about orthogonal projections including the fact that if $V$ and $W$ are subspaces of $\mathbb{R}^{n}$ with $V$ a subspace of $W$ and $\Pi_{V}$ and $\Pi_{W}$ denote orthogonal projections onto $V$ and $W$ respectively, then for all $v \in \mathbb{R}^{n}$, $\left\|\Pi_{W} v\right\|^{2} \geqslant \left\|\Pi_{V} v\right\|^{2}$.]
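The inequality can be checked numerically before attempting the proof. The sketch below is an illustration under assumed simulated data, not part of the original question. It uses the standard fact that $\operatorname{var}(\hat{\beta}_{j}) = \sigma^{2}\left[(X^{T}X)^{-1}\right]_{jj}$ and compares this with the stated lower bound for every pair $j < k$ of non-intercept columns. The function name `lower_bound`, the value of $\sigma^{2}$, and the seed are assumptions; columns are indexed from 0, so column 0 is the intercept.

```python
import numpy as np

# Simulated design with an intercept column; all values are assumptions.
rng = np.random.default_rng(1)
n, p, sigma2 = 40, 5, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
XtX_inv = np.linalg.inv(X.T @ X)


def lower_bound(X, j, k, sigma2):
    """Hypothetical helper: right-hand side of the inequality for columns j, k."""
    X_j, X_k = X[:, j], X[:, k]
    cos = X_k @ X_j / (np.linalg.norm(X_k) * np.linalg.norm(X_j))
    return sigma2 / (X_j @ X_j) / (1.0 - cos**2)


checks = []
for j in range(1, p):
    for k in range(j + 1, p):
        exact = sigma2 * XtX_inv[j, j]  # var(beta_hat_j)
        bound = lower_bound(X, j, k, sigma2)
        # Small relative tolerance guards against floating-point rounding.
        checks.append(exact >= bound * (1.0 - 1e-12))

print(all(checks))
```

The bound tightens as the cosine between $X_{j}$ and $X_{k}$ approaches $\pm 1$, which is the sense in which near-collinear predictors inflate the variance of $\hat{\beta}_{j}$.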

By considering the fitted values $X\hat{\beta}$, explain why if, for any $j \geqslant 2$, a constant is added to each entry in the $j$th column of $X$, then $\hat{\beta}_{j}$ will remain unchanged. Let $\bar{X}_{j} = \sum_{i=1}^{n} X_{ij}/n$. Why is $(*)$ also true when all instances of $X_{j}$ and $X_{k}$ are replaced by $X_{j} - \bar{X}_{j} 1_{n}$ and $X_{k} - \bar{X}_{k} 1_{n}$ respectively?
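The invariance claim can be illustrated with a small simulation. This is a sketch under assumed data, not part of the original question: a constant `c` is added to one non-intercept column, and the two fits are compared. Because the column space of $X$ is unchanged, the fitted values should agree, with the intercept absorbing the shift. The helper `ols`, the constant `c`, and the seed are illustrative assumptions; columns are indexed from 0.

```python
import numpy as np

# Simulated data with an intercept column; all values are assumptions.
rng = np.random.default_rng(2)
n, p = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = rng.normal(size=n)


def ols(X, Y):
    """Hypothetical helper: least squares coefficients."""
    return np.linalg.lstsq(X, Y, rcond=None)[0]


j, c = 2, 5.0  # shift a non-intercept column by the constant c
X_shift = X.copy()
X_shift[:, j] += c

b = ols(X, Y)
b_shift = ols(X_shift, Y)

# Slopes are unchanged; only the intercept moves, by -c * b[j].
print(np.allclose(b[1:], b_shift[1:]))
print(np.isclose(b_shift[0], b[0] - c * b[j]))
# Fitted values agree because the column space is the same.
print(np.allclose(X @ b, X_shift @ b_shift))
```

Taking `c` equal to minus the column mean gives the centred columns $X_{j} - \bar{X}_{j} 1_{n}$ used in the second part of the question.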

The marks from mid-year statistics and mathematics tests and an end-of-year statistics exam are recorded for 100 secondary school students. The first few lines of the data are given below.

The following abbreviated output is obtained:

What are the hypothesis tests corresponding to the final column of the coefficients table? What is the hypothesis test corresponding to the final line of the output? Interpret the results when testing at the $5\%$ level.

How does the following sample correlation matrix for the data help to explain the relative sizes of some of the $p$-values?