Paper 2, Section II, H

Statistics
Part IB, 2020

Consider the general linear model $Y = X\beta^0 + \varepsilon$, where $X$ is a known $n \times p$ design matrix with $p \geqslant 2$, $\beta^0 \in \mathbb{R}^p$ is an unknown vector of parameters, and $\varepsilon \in \mathbb{R}^n$ is a vector of stochastic errors with $\mathbb{E}(\varepsilon_i) = 0$, $\operatorname{var}(\varepsilon_i) = \sigma^2 > 0$ and $\operatorname{cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $i, j = 1, \ldots, n$ with $i \neq j$. Suppose $X$ has full column rank.

(a) Write down the least squares estimate $\hat{\beta}$ of $\beta^0$ and show that it minimises the least squares objective $S(\beta) = \|Y - X\beta\|^2$ over $\beta \in \mathbb{R}^p$.
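For (a), a minimal sketch of the standard answer, via the normal equations: the estimate is

$$\hat{\beta} = (X^T X)^{-1} X^T Y,$$

which is well defined since $X$ has full column rank. For any $\beta \in \mathbb{R}^p$, expanding $Y - X\beta = (Y - X\hat{\beta}) + X(\hat{\beta} - \beta)$ and using $X^T(Y - X\hat{\beta}) = 0$ kills the cross term, so

$$S(\beta) = \|Y - X\hat{\beta}\|^2 + \|X(\hat{\beta} - \beta)\|^2 \geqslant S(\hat{\beta}).$$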

(b) Write down the variance-covariance matrix $\operatorname{cov}(\hat{\beta})$.
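For (b), a sketch under the stated error assumptions (so $\operatorname{cov}(\varepsilon) = \sigma^2 I$): writing $\hat{\beta} = \beta^0 + (X^T X)^{-1} X^T \varepsilon$,

$$\operatorname{cov}(\hat{\beta}) = (X^T X)^{-1} X^T (\sigma^2 I) X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}.$$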

(c) Let $\tilde{\beta} \in \mathbb{R}^p$ minimise $S(\beta)$ over $\beta \in \mathbb{R}^p$ subject to $\beta_p = 0$. Let $Z$ be the $n \times (p-1)$ submatrix of $X$ that excludes the final column. Write down $\operatorname{cov}(\tilde{\beta})$.
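For (c), a sketch: $Z$ inherits full column rank from $X$, so the constrained minimiser is $\tilde{\beta} = (\tilde{\gamma}^T, 0)^T$ with $\tilde{\gamma} = (Z^T Z)^{-1} Z^T Y$ the least squares estimate in the reduced model, and the same calculation as in (b) gives

$$\operatorname{cov}(\tilde{\beta}) = \begin{pmatrix} \sigma^2 (Z^T Z)^{-1} & 0 \\ 0 & 0 \end{pmatrix}.$$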

(d) Let $P$ and $P_0$ be the $n \times n$ orthogonal projections onto the column spaces of $X$ and $Z$ respectively. Show that for all $u \in \mathbb{R}^n$, $u^T P u \geqslant u^T P_0 u$.
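For (d), one standard route: the column space of $Z$ is contained in that of $X$, so $P P_0 = P_0 = P_0 P$, and hence $P - P_0$ is symmetric and idempotent, i.e. itself an orthogonal projection. Therefore

$$u^T P u - u^T P_0 u = u^T (P - P_0) u = \|(P - P_0) u\|^2 \geqslant 0.$$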

(e) Show that for all $x \in \mathbb{R}^p$,

$$\operatorname{var}(x^T \tilde{\beta}) \leqslant \operatorname{var}(x^T \hat{\beta}).$$

[Hint: Argue that $x = X^T u$ for some $u \in \mathbb{R}^n$.]
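Following the hint, a sketch: since $X$ has full column rank, $X^T$ maps $\mathbb{R}^n$ onto $\mathbb{R}^p$, so any $x \in \mathbb{R}^p$ can be written as $x = X^T u$. Then, using (b),

$$\operatorname{var}(x^T \hat{\beta}) = \sigma^2 x^T (X^T X)^{-1} x = \sigma^2 u^T P u,$$

while, since the last coordinate of $\tilde{\beta}$ is zero and the first $p-1$ coordinates of $x = X^T u$ are $Z^T u$,

$$\operatorname{var}(x^T \tilde{\beta}) = \sigma^2 u^T Z (Z^T Z)^{-1} Z^T u = \sigma^2 u^T P_0 u,$$

so part (d) gives the inequality.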