Paper 4, Section II, K

What does it mean to say that a $(1 \times p)$ random vector $\xi$ has a multivariate normal distribution?

Suppose $\xi=(X, Y)$ has the bivariate normal distribution with mean vector $\mu=\left(\mu_{X}, \mu_{Y}\right)$ , and dispersion matrix

\Sigma=\left(\begin{array}{cc} \sigma_{X X} & \sigma_{X Y} \\ \sigma_{X Y} & \sigma_{Y Y} \end{array}\right)

Show that, with $\beta:=\sigma_{X Y} / \sigma_{X X}, Y-\beta X$ is independent of $X$ , and thus that the conditional distribution of $Y$ given $X$ is normal with mean $\mu_{Y}+\beta\left(X-\mu_{X}\right)$ and variance $\sigma_{Y Y \cdot X}:=\sigma_{Y Y}-\sigma_{X Y}^{2} / \sigma_{X X}$ .

For $i=1, \ldots, n, \xi_{i}=\left(X_{i}, Y_{i}\right)$ are independent and identically distributed with the above distribution, where all elements of $\mu$ and $\Sigma$ are unknown. Let

S=\left(\begin{array}{cc} S_{X X} & S_{X Y} \\ S_{X Y} & S_{Y Y} \end{array}\right):=\sum_{i=1}^{n}\left(\xi_{i}-\bar{\xi}\right)^{\mathrm{T}}\left(\xi_{i}-\bar{\xi}\right)

where $\bar{\xi}:=n^{-1} \sum_{i=1}^{n} \xi_{i}$ .

The sample correlation coefficient is $r:=S_{X Y} / \sqrt{S_{X X} S_{Y Y}}$ . Show that the distribution of $r$ depends only on the population correlation coefficient $\rho:=\sigma_{X Y} / \sqrt{\sigma_{X X} \sigma_{Y Y}}$ .

Student's $t$ -statistic (on $n-2$ degrees of freedom) for testing the null hypothesis $H_{0}: \beta=0$ is

t:=\frac{\widehat{\beta}}{\sqrt{S_{Y Y \cdot X} /(n-2) S_{X X}}},

where $\widehat{\beta}:=S_{X Y} / S_{X X}$ and $S_{Y Y \cdot X}:=S_{Y Y}-S_{X Y}^{2} / S_{X X}$ . Its density when $H_{0}$ is true is

p(t)=C\left(1+\frac{t^{2}}{n-2}\right)^{-\frac{1}{2}(n-1)}

where $C$ is a constant that need not be specified.

Express $t$ in terms of $r$ , and hence derive the density of $r$ when $\rho=0$ .

How could you use the sample correlation $r$ to test the hypothesis $\rho=0$ ?