Paper 3, Section I, J

Suppose we have data $\left(Y_{1}, x_{1}^{T}\right), \ldots,\left(Y_{n}, x_{n}^{T}\right)$ , where the $Y_{i}$ are independent conditional on the design matrix $X$ whose rows are the $x_{i}^{T}, i=1, \ldots, n$ . Suppose that given $x_{i}$ , the true probability density function of $Y_{i}$ is $f_{x_{i}}$ , so that the data is generated from an element of a model $\mathcal{F}:=\left\{\left(f_{x_{i}}(\cdot ; \theta)\right)_{i=1}^{n}, \theta \in \Theta\right\}$ for some $\Theta \subseteq \mathbb{R}^{q}$ and $q \in \mathbb{N}$ .

(a) Define the log-likelihood function for $\mathcal{F}$ , the maximum likelihood estimator of $\theta$ and Akaike's Information Criterion (AIC) for $\mathcal{F}$ .

From now on let $\mathcal{F}$ be the normal linear model, i.e. $Y:=\left(Y_{1}, \ldots, Y_{n}\right)^{T}=X \beta+\varepsilon$ , where $X \in \mathbb{R}^{n \times p}$ has full column rank and $\varepsilon \sim N_{n}\left(0, \sigma^{2} I\right)$ .

(b) Let $\hat{\sigma}^{2}$ denote the maximum likelihood estimator of $\sigma^{2}$ . Show that the AIC of $\mathcal{F}$ is

n\left(1+\log \left(2 \pi \hat{\sigma}^{2}\right)\right)+2(p+1)

(c) Let $\chi_{n-p}^{2}$ be a chi-squared distribution on $n-p$ degrees of freedom. Using any results from the course, show that the distribution of the AIC of $\mathcal{F}$ is

n \log \left(\chi_{n-p}^{2}\right)+n\left(\log \left(2 \pi \sigma^{2} / n\right)+1\right)+2(p+1)

$\left[\right.$ Hint: $\hat{\sigma}^{2}:=n^{-1}\|Y-X \hat{\beta}\|^{2}=n^{-1}\|(I-P) \varepsilon\|^{2}$ , where $\hat{\beta}$ is the maximum likelihood estimator of $\beta$ and $P$ is the projection matrix onto the column space of $X$ .]