What does it mean to say that a $(1 \times p)$ random vector $\xi$ has a multivariate normal distribution?
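Suppose $\xi = (X, Y)$ has the bivariate normal distribution with mean vector $\mu = (\mu_X, \mu_Y)$ and dispersion matrix
$$\Sigma = \begin{pmatrix} \sigma_{XX} & \sigma_{XY} \\ \sigma_{XY} & \sigma_{YY} \end{pmatrix}.$$
Show that, with $\beta := \sigma_{XY}/\sigma_{XX}$, $Y - \beta X$ is independent of $X$, and thus that the conditional distribution of $Y$ given $X$ is normal with mean $\mu_Y + \beta(X - \mu_X)$ and variance $\sigma_{YY \cdot X} := \sigma_{YY} - \sigma_{XY}^2/\sigma_{XX}$.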
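The covariance algebra behind this part can be sanity-checked numerically. The sketch below is only an illustration: the entries of the dispersion matrix are hypothetical values chosen for the example, not taken from the question. It checks that $\mathrm{Cov}(Y - \beta X, X) = \sigma_{XY} - \beta\sigma_{XX}$ vanishes and that $\mathrm{Var}(Y - \beta X)$ agrees with $\sigma_{YY \cdot X}$. Zero covariance implies independence here only because $(X, Y - \beta X)$ is jointly normal.

```python
# Hypothetical dispersion matrix entries, chosen only for illustration.
s_xx, s_xy, s_yy = 2.0, 1.2, 3.0

beta = s_xy / s_xx

# Cov(Y - beta*X, X) = sigma_XY - beta * sigma_XX, which should be zero.
cov_resid_x = s_xy - beta * s_xx

# Var(Y - beta*X) = sigma_YY - 2*beta*sigma_XY + beta^2 * sigma_XX.
var_resid = s_yy - 2.0 * beta * s_xy + beta ** 2 * s_xx

# The conditional variance stated in the question.
s_yy_dot_x = s_yy - s_xy ** 2 / s_xx

print("Cov(Y - beta X, X):", cov_resid_x)
print("Var(Y - beta X):   ", var_resid)
print("sigma_{YY.X}:      ", s_yy_dot_x)
```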
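For $i = 1, \ldots, n$, $\xi_i = (X_i, Y_i)$ are independent and identically distributed with the above distribution, where all elements of $\mu$ and $\Sigma$ are unknown. Let
$$S = \begin{pmatrix} S_{XX} & S_{XY} \\ S_{XY} & S_{YY} \end{pmatrix} := \sum_{i=1}^{n} (\xi_i - \bar{\xi})^{\mathrm{T}} (\xi_i - \bar{\xi}),$$
where $\bar{\xi} := n^{-1} \sum_{i=1}^{n} \xi_i$.
The sample correlation coefficient is $r := S_{XY}/\sqrt{S_{XX} S_{YY}}$. Show that the distribution of $r$ depends only on the population correlation coefficient $\rho := \sigma_{XY}/\sqrt{\sigma_{XX} \sigma_{YY}}$.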
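One route to this result uses the fact that $r$ is unchanged when $X$ and $Y$ are separately shifted and rescaled by positive constants, so the data may be standardised to zero means and unit variances, leaving $\rho$ as the only parameter. The following sketch illustrates that invariance on simulated data. The function name `sample_corr`, the seed, the sample size, the value of `rho` and the affine constants are all assumptions made for the example.

```python
import math
import random


def sample_corr(xs, ys):
    """Return r = S_XY / sqrt(S_XX * S_YY) for paired samples."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / math.sqrt(s_xx * s_yy)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 50                  # illustrative sample size
rho = 0.6               # illustrative population correlation

# Standardised bivariate normal pairs with correlation rho:
# Y = rho * X + sqrt(1 - rho^2) * Z, with X and Z independent N(0, 1).
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for x in xs]

r_standard = sample_corr(xs, ys)

# Shift and rescale each coordinate (positive scale factors).
xs_shifted = [3.0 * x + 10.0 for x in xs]
ys_shifted = [0.5 * y - 4.0 for y in ys]
r_transformed = sample_corr(xs_shifted, ys_shifted)

print("r on standardised data:", r_standard)
print("r after affine change: ", r_transformed)
```

The two printed values should agree up to floating-point rounding.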
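Student's $t$-statistic (on $n - 2$ degrees of freedom) for testing the null hypothesis $H_0 : \beta = 0$ is
$$t := \frac{\hat{\beta}}{\sqrt{S_{YY \cdot X} / \big((n-2)\, S_{XX}\big)}},$$
where $\hat{\beta} := S_{XY}/S_{XX}$ and $S_{YY \cdot X} := S_{YY} - S_{XY}^2/S_{XX}$. Its density when $H_0$ is true is
$$p(t) = C \left(1 + \frac{t^2}{n-2}\right)^{-\frac{1}{2}(n-1)},$$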
where $C$ is a constant that need not be specified.
Express $t$ in terms of $r$, and hence derive the density of $r$ when $\rho = 0$.
How could you use the sample correlation $r$ to test the hypothesis $\rho = 0$?
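As a numerical cross-check on the algebra, substituting $\hat{\beta} = S_{XY}/S_{XX}$ and $S_{YY \cdot X} = S_{YY}(1 - r^2)$ into the definition of $t$ suggests $t = r\sqrt{n-2}/\sqrt{1 - r^2}$. The sketch below computes $t$ both ways on simulated data. The seed, the sample size and the choice of independent standard normal $X$ and $Y$ (so that $\rho = 0$) are assumptions made for the example.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 30                  # illustrative sample size

# Independent standard normal samples, so rho = 0 (H0 holds).
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rng.gauss(0.0, 1.0) for _ in range(n)]

x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# t computed from the regression quantities in the question.
beta_hat = s_xy / s_xx
s_yy_dot_x = s_yy - s_xy ** 2 / s_xx
t_regression = beta_hat / math.sqrt(s_yy_dot_x / ((n - 2) * s_xx))

# t computed from the sample correlation alone.
r = s_xy / math.sqrt(s_xx * s_yy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)

print("t from regression quantities:", t_regression)
print("t from r:                    ", t_from_r)
```

Since $t$ is an increasing function of $r$, a two-sided test based on $|r|$ being large is equivalent to comparing $|t|$ with a quantile of the $t$ distribution on $n - 2$ degrees of freedom.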