Paper 4, Section II, K

In a study on infant respiratory disease, data are collected on a sample of 2074 infants. The information collected includes whether or not each infant developed a respiratory disease in the first year of their life; the gender of each infant; and details on how they were fed as one of three categories (breast-fed, bottle-fed and supplement). The data are tabulated in $\mathrm{R}$ as follows:

$\begin{array}{rrrrr} & \text { disease } & \text { nondisease } & \text { gender } & \text { food } \\ 1 & 77 & 381 & \text { Boy } & \text { Bottle-fed } \\ 2 & 19 & 128 & \text { Boy } & \text { Supplement } \\ 3 & 47 & 447 & \text { Boy } & \text { Breast-fed } \\ 4 & 48 & 336 & \text { Girl } & \text { Bottle-fed } \\ 5 & 16 & 111 & \text { Girl } & \text { Supplement } \\ 6 & 31 & 433 & \text { Girl } & \text { Breast-fed }\end{array}$

Write down the model being fit by the $R$ commands on the following page:

The following (slightly abbreviated) output from $R$ is obtained.

Briefly explain the justification for the standard errors presented in the output above.

Explain the relevance of the output of the following $R$ code to the data being studied, justifying your answer:

$>\exp (c(-0.6693-1.96 * 0.153,-0.6693+1.96 * 0.153))$

[1] $0.3793940 \quad 0.6911351$

[Hint: It may help to recall that if $Z \sim N(0,1)$ then $\mathbb{P}(Z \geqslant 1.96)=0.025 .]$

Let $D_{1}$ be the deviance of the model fitted by the following $\mathrm{R}$ command.

$>$ fit $1<-$ glm (disease/total gender + food + gender:food,

$+$ family = binomial, weights = total $)$

What is the numerical value of $D_{1}$ ? Which of the two models that have been fitted should you prefer, and why?