A2.12

Computational Statistics and Statistical Modelling
Part II, 2001

(i) Suppose that Y1,,YnY_{1}, \ldots, Y_{n} are independent random variables, and that YiY_{i} has probability density function

f(yiθi,ϕ)=exp[(yiθib(θi))/ϕ+c(yi,ϕ)]f\left(y_{i} \mid \theta_{i}, \phi\right)=\exp \left[\left(y_{i} \theta_{i}-b\left(\theta_{i}\right)\right) / \phi+c\left(y_{i}, \phi\right)\right]

Assume that E(Yi)=μiE\left(Y_{i}\right)=\mu_{i}, and that g(μi)=βTxig\left(\mu_{i}\right)=\beta^{T} x_{i}, where g()g(\cdot) is a known 'link' function, x1,,xnx_{1}, \ldots, x_{n} are known covariates, and β\beta is an unknown vector. Show that

E(Yi)=b(θi),var(Yi)=ϕb(θi)=Vi, say, \mathbb{E}\left(Y_{i}\right)=b^{\prime}\left(\theta_{i}\right), \operatorname{var}\left(Y_{i}\right)=\phi b^{\prime \prime}\left(\theta_{i}\right)=V_{i} \text {, say, }

and hence

lβ=i=1n(yiμi)xig(μi)Vi, where l=l(β,ϕ) is the log-likelihood. \frac{\partial l}{\partial \beta}=\sum_{i=1}^{n} \frac{\left(y_{i}-\mu_{i}\right) x_{i}}{g^{\prime}\left(\mu_{i}\right) V_{i}}, \text { where } l=l(\beta, \phi) \text { is the log-likelihood. }

(ii) The table below shows the number of train miles (in millions) and the number of collisions involving British Rail passenger trains between 1970 and 1984 . Give a detailed interpretation of the RR output that is shown under this table:

 year  collisions  miles 11970328121971627631972426841973726951974628161975227171976226581977426491978126710197972651119803267121981526013198262311419831249\begin{array}{llll} & \text { year } & \text { collisions } & \text { miles } \\ 1 & 1970 & 3 & 281 \\ 2 & 1971 & 6 & 276 \\ 3 & 1972 & 4 & 268 \\ 4 & 1973 & 7 & 269 \\ 5 & 1974 & 6 & 281 \\ 6 & 1975 & 2 & 271 \\ 7 & 1976 & 2 & 265 \\ 8 & 1977 & 4 & 264 \\ 9 & 1978 & 1 & 267 \\ 10 & 1979 & 7 & 265 \\ 11 & 1980 & 3 & 267 \\ 12 & 1981 & 5 & 260 \\ 13 & 1982 & 6 & 231 \\ 14 & 1983 & 1 & 249\end{array}

Call:

glm(formula == collisions \sim year +log(+\log ( miles )), family == poisson)

Coefficients:

 Estimate  Std. Error  z value Pr(>z) (Intercept) 127.14453121.377961.0480.295 year 0.053980.051751.0430.297log (miles) 3.416544.186160.8160.414\begin{array}{lrrrr} & \text { Estimate } & \text { Std. Error } & \text { z value } & \operatorname{Pr}(>|z|) \\ \text { (Intercept) } & 127.14453 & 121.37796 & 1.048 & 0.295 \\ \text { year } & -0.05398 & 0.05175 & -1.043 & 0.297 \\ \log \text { (miles) } & -3.41654 & 4.18616 & -0.816 & 0.414\end{array}

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 15.93715.937 on 13 degrees of freedom

Residual deviance: 14.84314.843 on 11 degrees of freedom

Number of Fisher Scoring iterations: 4

Part II