The body mass index (BMI) of your closest friend is a good predictor of your own BMI. A scientist applies polynomial regression to understand the relationship between these two variables among 200 students in a sixth form college. The R commands
> fit.1 <- lm(BMI ~ poly(friendBMI, 2, raw=T))
> fit.2 <- lm(BMI ~ poly(friendBMI, 3, raw=T))
fit the models $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$ and $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \varepsilon$, respectively, with $\varepsilon \sim N(0, \sigma^2)$ in each case.
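For concreteness, here is a minimal sketch on simulated data (the seed, sample size, and coefficient values below are illustrative assumptions, not the scientist's measurements) showing that raw=T uses the raw powers of friendBMI as regressors:

> set.seed(1)                                    # illustrative data only
> friendBMI <- rnorm(200, mean=23, sd=3)
> BMI <- 20 + 0.1*friendBMI + 0.01*friendBMI^2 + rnorm(200)
> fit.1 <- lm(BMI ~ poly(friendBMI, 2, raw=T))
> # with raw=T the design matrix columns are X and X^2, so the coefficients
> # agree with fitting the powers explicitly:
> coef(fit.1)
> coef(lm(BMI ~ friendBMI + I(friendBMI^2)))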
Setting the argument raw to FALSE:
> fit.3 <- lm(BMI ~ poly(friendBMI, 2, raw=F))
> fit.4 <- lm(BMI ~ poly(friendBMI, 3, raw=F))
fits the models $Y = \beta_0 + \beta_1 P_1(X) + \beta_2 P_2(X) + \varepsilon$ and $Y = \beta_0 + \beta_1 P_1(X) + \beta_2 P_2(X) + \beta_3 P_3(X) + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$. The function $P_i$ is a polynomial of degree $i$. Furthermore, the design matrix output by the function poly with raw=F satisfies:
> t(poly(friendBMI, 3, raw=F)) %*% poly(friendBMI, 3, raw=F)
              1             2             3
1  1.000000e+00  1.288032e-16  3.187554e-17
2  1.288032e-16  1.000000e+00 -6.201636e-17
3  3.187554e-17 -6.201636e-17  1.000000e+00
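The printed matrix is, up to rounding error, the 3x3 identity: with raw=F, poly returns columns that are orthonormal. A minimal sketch of the same check on simulated data (friendBMI below is a placeholder, not the original sample):

> set.seed(1)
> friendBMI <- rnorm(200, mean=23, sd=3)
> P <- poly(friendBMI, 3, raw=F)     # orthogonal polynomial basis (no intercept column)
> round(crossprod(P), 10)            # crossprod(P) = t(P) %*% P; essentially the identity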
How does the variance of $\hat\beta$ differ between the models fit.2 and fit.4? What about the variance of the fitted values $\hat Y = X\hat\beta$?
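One way to explore this numerically before answering is sketched below, again on simulated placeholder data (the conclusions should still be justified analytically):

> set.seed(1)
> friendBMI <- rnorm(200, mean=23, sd=3)
> BMI <- 20 + 0.1*friendBMI + 0.01*friendBMI^2 + rnorm(200)
> fit.2 <- lm(BMI ~ poly(friendBMI, 3, raw=T))
> fit.4 <- lm(BMI ~ poly(friendBMI, 3, raw=F))
> vcov(fit.2)                               # estimated covariance of beta-hat, raw basis
> vcov(fit.4)                               # estimated covariance of beta-hat, orthogonal basis
> max(abs(fitted(fit.2) - fitted(fit.4)))   # compare the fitted values of the two models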
Finally, consider the output of the commands
> anova(fit.1, fit.2)
> anova(fit.3, fit.4)
Define the test statistic computed by this function and specify its distribution. Which command yields a higher statistic?
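As a numerical cross-check, the sketch below (simulated placeholder data again) computes the standard F statistic for comparing two nested least-squares fits by hand and compares it with the value anova reports for lm objects:

> set.seed(1)
> friendBMI <- rnorm(200, mean=23, sd=3)
> BMI <- 20 + 0.1*friendBMI + 0.01*friendBMI^2 + rnorm(200)
> fit.1 <- lm(BMI ~ poly(friendBMI, 2, raw=T))
> fit.2 <- lm(BMI ~ poly(friendBMI, 3, raw=T))
> rss1 <- sum(resid(fit.1)^2); rss2 <- sum(resid(fit.2)^2)
> # F = ((RSS_small - RSS_big) / number of extra parameters) / (RSS_big / residual df of big model)
> Fstat <- ((rss1 - rss2) / 1) / (rss2 / fit.2$df.residual)
> Fstat
> anova(fit.1, fit.2)$F[2]    # should match the hand-computed value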