Paper 4, Section II, J

Statistical Modelling
Part II, 2021

Let $X$ be an $n \times p$ non-random design matrix and $Y$ be an $n$-vector of random responses. Suppose $Y \sim N\left(\mu, \sigma^{2} I\right)$, where $\mu$ is an unknown vector and $\sigma^{2}>0$ is known.

(a) Let $\lambda \geqslant 0$ be a constant. Consider the ridge regression problem

$$\hat{\beta}_{\lambda}=\arg \min _{\beta}\|Y-X \beta\|^{2}+\lambda\|\beta\|^{2} .$$

Let $\hat{\mu}_{\lambda}=X \hat{\beta}_{\lambda}$ be the fitted values. Show that $\hat{\mu}_{\lambda}=H_{\lambda} Y$, where

$$H_{\lambda}=X\left(X^{T} X+\lambda I\right)^{-1} X^{T} .$$
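As a numerical sanity check on the identity in (a), not a proof, the following sketch compares two routes to the fitted values: applying $H_{\lambda}$ to $Y$ directly, and solving the ridge problem as ordinary least squares on augmented data (rows $\sqrt{\lambda} I$ appended to $X$, zeros appended to $Y$). The function names, the simulated data and the value of $\lambda$ are illustrative assumptions, not part of the question.

```python
import numpy as np

def ridge_hat_matrix(X, lam):
    """Return H_lambda = X (X^T X + lam I)^{-1} X^T."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

def ridge_fit_by_augmentation(X, Y, lam):
    """Minimise ||Y - X b||^2 + lam ||b||^2 as least squares on augmented data."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    Y_aug = np.concatenate([Y, np.zeros(p)])
    beta, *_ = np.linalg.lstsq(X_aug, Y_aug, rcond=None)
    return beta

# Hypothetical example data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
Y = rng.normal(size=8)
lam = 2.5

mu_hat_direct = X @ ridge_fit_by_augmentation(X, Y, lam)
mu_hat_via_H = ridge_hat_matrix(X, lam) @ Y
max_abs_diff = float(np.max(np.abs(mu_hat_direct - mu_hat_via_H)))
```

If the identity holds, `max_abs_diff` should be at the level of floating-point rounding error.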

(b) Show that

$$\mathbb{E}\left(\left\|Y-\hat{\mu}_{\lambda}\right\|^{2}\right)=\left\|\left(I-H_{\lambda}\right) \mu\right\|^{2}+\left\{n-2 \operatorname{trace}\left(H_{\lambda}\right)+\operatorname{trace}\left(H_{\lambda}^{2}\right)\right\} \sigma^{2} .$$

(c) Let $Y^{*}=\mu+\epsilon^{*}$, where $\epsilon^{*} \sim N\left(0, \sigma^{2} I\right)$ is independent of $Y$. Show that $\left\|Y-\hat{\mu}_{\lambda}\right\|^{2}+2 \sigma^{2} \operatorname{trace}\left(H_{\lambda}\right)$ is an unbiased estimator of $\mathbb{E}\left(\left\|Y^{*}-\hat{\mu}_{\lambda}\right\|^{2}\right)$.
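The claims in (b) and (c) can be checked approximately by simulation. The sketch below is a Monte Carlo illustration under assumed values of $n$, $p$, $\lambda$, $\sigma$, $X$ and $\mu$ (all hypothetical); it compares the average training error with the formula in (b), and the average of the proposed estimator with the average prediction error on an independent copy $Y^{*}$. Agreement up to simulation noise is consistent with the results but does not prove them.

```python
import numpy as np

# Hypothetical problem instance (assumed for illustration only).
rng = np.random.default_rng(1)
n, p, lam, sigma = 6, 3, 1.5, 0.7
X = rng.normal(size=(n, p))
mu = rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
I = np.eye(n)

# Right-hand side of the formula in (b).
formula_b = (np.sum(((I - H) @ mu) ** 2)
             + (n - 2 * np.trace(H) + np.trace(H @ H)) * sigma**2)

# Each row is one independent draw of Y (or of Y*).
draws = 200_000
Y = mu + sigma * rng.normal(size=(draws, n))
Y_star = mu + sigma * rng.normal(size=(draws, n))

mu_hat = Y @ H.T  # row i holds H_lambda applied to the i-th draw of Y
train_err = np.sum((Y - mu_hat) ** 2, axis=1)
pred_err = np.sum((Y_star - mu_hat) ** 2, axis=1)

train_err_mean = float(np.mean(train_err))                     # compare with formula_b
estimator_mean = train_err_mean + 2 * sigma**2 * np.trace(H)   # estimator in (c)
pred_err_mean = float(np.mean(pred_err))                       # target in (c)
```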

(d) Describe the behaviour (monotonicity and limits) of $\mathbb{E}\left(\left\|Y^{*}-\hat{\mu}_{\lambda}\right\|^{2}\right)$ as a function of $\lambda$ when $p=n$ and $X=I$. What is the minimum value of $\mathbb{E}\left(\left\|Y^{*}-\hat{\mu}_{\lambda}\right\|^{2}\right)$?
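For exploring (d) numerically, one can note that $X=I$ gives $H_{\lambda}=I /(1+\lambda)$, and that combining (b) and (c) gives $\mathbb{E}\left(\left\|Y^{*}-\hat{\mu}_{\lambda}\right\|^{2}\right)=\left\|\left(I-H_{\lambda}\right) \mu\right\|^{2}+\left\{n+\operatorname{trace}\left(H_{\lambda}^{2}\right)\right\} \sigma^{2}$. The sketch below evaluates this expression on a grid of $\lambda$ values; the function name, the choice of $\mu$ and $\sigma$, and the grid are assumptions made purely for illustration.

```python
import numpy as np

def prediction_risk_identity_design(lam, mu, sigma):
    """Expected prediction error when X = I, so that H_lambda = I / (1 + lam)."""
    n = mu.size
    shrink = 1.0 / (1.0 + lam)                   # common diagonal entry of H_lambda
    bias_sq = (1.0 - shrink) ** 2 * np.sum(mu**2)  # ||(I - H_lambda) mu||^2
    trace_H_sq = n * shrink**2                   # trace(H_lambda^2)
    return bias_sq + (n + trace_H_sq) * sigma**2

# Hypothetical values of mu and sigma (assumed for illustration only).
mu = np.array([1.0, -2.0, 0.5, 3.0])
sigma = 1.5

lams = np.linspace(0.0, 50.0, 5001)
risk = prediction_risk_identity_design(lams, mu, sigma)

k = int(np.argmin(risk))
lam_best, risk_min = float(lams[k]), float(risk[k])
```

Plotting `risk` against `lams`, or inspecting `lam_best` and `risk_min` for different choices of `mu` and `sigma`, may help in conjecturing the monotonicity, limits and minimum that the question asks for.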