Assume that the $n$-dimensional observation vector $Y$ may be written as $Y = X\beta + \epsilon$, where $X$ is a given $n \times p$ matrix of rank $p$, $\beta$ is an unknown vector with $\beta^T = (\beta_1, \ldots, \beta_p)$, and
$$\epsilon \sim N_n(0, \sigma^2 I), \qquad (\ast)$$
where $\sigma^2$ is unknown. Find $\hat\beta$, the least-squares estimator of $\beta$, and describe (without proof) how you would test
$$H_0 : \beta_\nu = 0$$
for a given $\nu$.
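As a reminder of the standard results this part relies on, the following is a sketch rather than a model answer. The symbols $\mathrm{RSS}$, $\tilde\sigma^2$ and $T_\nu$ do not appear in the question and are introduced here only for illustration.

```latex
% Least-squares estimator: minimise (Y - X\beta)^T (Y - X\beta) over \beta.
% Since X has rank p, X^T X is invertible and the normal equations give
\hat\beta = (X^T X)^{-1} X^T Y .

% Under the normality assumption on \epsilon:
\hat\beta \sim N_p\bigl(\beta,\ \sigma^2 (X^T X)^{-1}\bigr), \qquad
\frac{\mathrm{RSS}}{\sigma^2} \sim \chi^2_{n-p}, \qquad
\mathrm{RSS} = (Y - X\hat\beta)^T (Y - X\hat\beta),
% with \hat\beta and RSS independent.

% Test of H_0 : \beta_\nu = 0, using \tilde\sigma^2 = RSS / (n - p):
T_\nu = \frac{\hat\beta_\nu}{\tilde\sigma \sqrt{\bigl[(X^T X)^{-1}\bigr]_{\nu\nu}}}
\sim t_{n-p} \quad \text{under } H_0 .
% Reject H_0 at level \alpha if |T_\nu| exceeds the upper \alpha/2 point of t_{n-p}.
```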
Indicate briefly two plots that you could use as a check of the assumption (∗).
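Two commonly used diagnostic plots are residuals against fitted values (a check on constant variance and on the linear form) and a normal Q-Q plot of the residuals (a check on normality). The sketch below shows one way to draw them in R. The data are simulated and the names `x1`, `x2`, `y` and `fit` are hypothetical, since the question does not supply a data set for this part.

```r
# Simulated data, for illustration only (not the data from the question).
set.seed(1)
n  <- 41
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n, sd = 0.4)

fit <- lm(y ~ x1 + x2)
res <- rstandard(fit)  # standardised residuals

op <- par(mfrow = c(1, 2))

# Plot 1: residuals against fitted values.
# A funnel shape or a curve would cast doubt on the assumption.
plot(fitted(fit), res,
     xlab = "Fitted values", ylab = "Standardised residuals")
abline(h = 0, lty = 2)

# Plot 2: normal Q-Q plot of the residuals.
# Points far from the reference line suggest non-normal errors.
qqnorm(res)
qqline(res)

par(op)
```

If the assumption holds, the first plot should show a patternless band around zero and the second should lie close to a straight line.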
Sulphur dioxide is one of the major air pollutants. A data-set presented by Sokal and Rohlf (1981) was collected on 41 US cities in 1969-71, corresponding to the following variables:
Y = sulphur dioxide content of air in micrograms per cubic metre
X1 = average annual temperature in degrees Fahrenheit
X2 = number of manufacturing enterprises employing 20 or more workers
X3 = population size (1970 census) in thousands
X4 = average annual wind speed in miles per hour
X5 = average annual precipitation in inches
X6 = average number of days with precipitation per year.
Interpret the R output that follows below, quoting any standard theorems that you need to use.
> next.lm <- lm(log(Y) ~ X1 + X2 + X3 + X4 + X5 + X6)
> summary(next.lm)

Call:
lm(formula = log(Y) ~ X1 + X2 + X3 + X4 + X5 + X6)

Residuals:
     Min       1Q   Median       3Q      Max
-0.79548 -0.25538 -0.01968  0.28328  0.98029

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.2532456  1.4483686   5.008 1.68e-05 ***
X1          -0.0599017  0.0190138  -3.150  0.00339 **
X2           0.0012639  0.0004820   2.622  0.01298 *
X3          -0.0007077  0.0004632  -1.528  0.13580
X4          -0.1697171  0.0555563  -3.055  0.00436 **
X5           0.0173723  0.0111036   1.565  0.12695
X6           0.0004347  0.0049591   0.088  0.93066

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.448 on 34 degrees of freedom
Multiple R-Squared: 0.6541
F-statistic: 10.72 on 6 and 34 degrees of freedom, p-value: 1.126e-06
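When interpreting the output it can help to see how the columns relate to one another. The sketch below recomputes the t values, the two-sided p-values and the F-statistic from the printed estimates, standard errors and R-squared. The numbers are transcribed from the summary, so small rounding differences are to be expected. The variable names (`est`, `se`, `t_val` and so on) are hypothetical.

```r
# Values transcribed from the printed summary (rounded as shown there).
est <- c(7.2532456, -0.0599017, 0.0012639, -0.0007077,
         -0.1697171, 0.0173723, 0.0004347)
se  <- c(1.4483686, 0.0190138, 0.0004820, 0.0004632,
         0.0555563, 0.0111036, 0.0049591)
names(est) <- names(se) <- c("(Intercept)", paste0("X", 1:6))

df_resid <- 34  # n - p = 41 - 7

# Each t value is estimate / standard error; under H0: beta_nu = 0
# it is compared with the t distribution on 34 degrees of freedom.
t_val <- est / se
p_val <- 2 * pt(-abs(t_val), df = df_resid)

# The overall F-statistic can be recovered from R-squared:
# F = (R^2 / q) / ((1 - R^2) / (n - p)), with q = 6 regressors.
r2 <- 0.6541
q  <- 6
f_stat <- (r2 / q) / ((1 - r2) / df_resid)
f_p    <- pf(f_stat, q, df_resid, lower.tail = FALSE)

print(round(cbind(t_val, p_val), 5))
print(round(f_stat, 2))
print(f_p)
```

On this reading, the intercept, X1, X2 and X4 have p-values below 0.05, while X3, X5 and X6 do not, given the other variables in the model. The F-statistic tests the hypothesis that all six slope coefficients are zero.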