Paper 3, Section I, J

Statistical Modelling
Part II, 2020

Suppose we have data $(Y_1, x_1^T), \ldots, (Y_n, x_n^T)$, where the $Y_i$ are independent conditional on the design matrix $X$ whose rows are the $x_i^T$, $i = 1, \ldots, n$. Suppose that given $x_i$, the true probability density function of $Y_i$ is $f_{x_i}$, so that the data is generated from an element of a model $\mathcal{F} := \left\{ \left( f_{x_i}(\cdot\,; \theta) \right)_{i=1}^{n}, \ \theta \in \Theta \right\}$ for some $\Theta \subseteq \mathbb{R}^q$ and $q \in \mathbb{N}$.

(a) Define the log-likelihood function for $\mathcal{F}$, the maximum likelihood estimator of $\theta$, and Akaike's Information Criterion (AIC) for $\mathcal{F}$.
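[For reference, a sketch of the standard definitions, writing $\ell$ for the log-likelihood; sign and scaling conventions for the AIC follow the usual formulation and may differ slightly from the course's:
$$\ell(\theta) := \sum_{i=1}^{n} \log f_{x_i}(Y_i; \theta), \qquad \hat{\theta} \in \operatorname*{arg\,max}_{\theta \in \Theta} \ell(\theta), \qquad \mathrm{AIC} := -2\,\ell(\hat{\theta}) + 2q,$$
where $q = \dim \Theta$ counts the free parameters of the model.]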

From now on let $\mathcal{F}$ be the normal linear model, i.e. $Y := (Y_1, \ldots, Y_n)^T = X\beta + \varepsilon$, where $X \in \mathbb{R}^{n \times p}$ has full column rank and $\varepsilon \sim N_n(0, \sigma^2 I)$.

(b) Let $\hat{\sigma}^2$ denote the maximum likelihood estimator of $\sigma^2$. Show that the AIC of $\mathcal{F}$ is

$$n\left(1 + \log\left(2\pi\hat{\sigma}^2\right)\right) + 2(p+1).$$
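[One route to this, in sketch: maximising $\ell(\beta, \sigma^2) = -\tfrac{n}{2}\log(2\pi\sigma^2) - \tfrac{1}{2\sigma^2}\|Y - X\beta\|^2$ over $\beta$ gives the least-squares estimator $\hat{\beta}$, and then maximising over $\sigma^2$ gives $\hat{\sigma}^2 = n^{-1}\|Y - X\hat{\beta}\|^2$, so that
$$\ell(\hat{\beta}, \hat{\sigma}^2) = -\frac{n}{2}\log\left(2\pi\hat{\sigma}^2\right) - \frac{n}{2}.$$
With $p + 1$ free parameters ($\beta \in \mathbb{R}^p$ together with $\sigma^2$), the definition $\mathrm{AIC} = -2\,\ell(\hat{\theta}) + 2(p+1)$ yields the displayed expression.]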

(c) Let $\chi^2_{n-p}$ be a chi-squared distribution on $n - p$ degrees of freedom. Using any results from the course, show that the distribution of the AIC of $\mathcal{F}$ is

$$n \log\left(\chi^2_{n-p}\right) + n\left(\log\left(2\pi\sigma^2/n\right) + 1\right) + 2(p+1).$$

[Hint: $\hat{\sigma}^2 := n^{-1}\|Y - X\hat{\beta}\|^2 = n^{-1}\|(I - P)\varepsilon\|^2$, where $\hat{\beta}$ is the maximum likelihood estimator of $\beta$ and $P$ is the projection matrix onto the column space of $X$.]
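[A sketch using the hint: $I - P$ is an orthogonal projection of rank $n - p$, so by standard results on quadratic forms in Gaussian vectors, $\|(I - P)\varepsilon\|^2 / \sigma^2 \sim \chi^2_{n-p}$, and hence $\hat{\sigma}^2 \stackrel{d}{=} \sigma^2 \chi^2_{n-p} / n$. Substituting this into the AIC from part (b) and expanding the logarithm,
$$n\left(1 + \log\left(2\pi\sigma^2\chi^2_{n-p}/n\right)\right) + 2(p+1) = n\log\left(\chi^2_{n-p}\right) + n\left(\log\left(2\pi\sigma^2/n\right) + 1\right) + 2(p+1),$$
as required.]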