A1.13

(i) Suppose $Y_{i}, 1 \leqslant i \leqslant n$ , are independent binomial observations, with $Y_{i} \sim B i\left(t_{i}, \pi_{i}\right)$ , $1 \leqslant i \leqslant n$ , where $t_{1}, \ldots, t_{n}$ are known, and we wish to fit the model

\omega: \log \frac{\pi_{i}}{1-\pi_{i}}=\mu+\beta^{T} x_{i} \quad \text { for each } i

where $x_{1}, \ldots, x_{n}$ are given covariates, each of dimension $p$ . Let $\hat{\mu}, \hat{\beta}$ be the maximum likelihood estimators of $\mu, \beta$ . Derive equations for $\hat{\mu}, \hat{\beta}$ and state without proof the form of the approximate distribution of $\hat{\beta}$ .

(ii) In 1975 , data were collected on the 3-year survival status of patients suffering from a type of cancer, yielding the following table

\begin{tabular}{ccrr} & & \multicolumn{2}{c}{ survive? } \ age in years & malignant & yes & no \ under 50 & no & 77 & 10 \ under 50 & yes & 51 & 13 \ $50-69$ & no & 51 & 11 \ $50-69$ & yes & 38 & 20 \ $70+$ & no & 7 & 3 \ $70+$ & yes & 6 & 3 \end{tabular}

Here the second column represents whether the initial tumour was not malignant or was malignant.

Let $Y_{i j}$ be the number surviving, for age group $i$ and malignancy status $j$ , for $i=1,2,3$ and $j=1,2$ , and let $t_{i j}$ be the corresponding total number. Thus $Y_{11}=77$ , $t_{11}=87$ . Assume $Y_{i j} \sim B i\left(t_{i j}, \pi_{i j}\right), 1 \leqslant i \leqslant 3,1 \leqslant j \leqslant 2$ . The results from fitting the model

\log \left(\pi_{i j} /\left(1-\pi_{i j}\right)\right)=\mu+\alpha_{i}+\beta_{j}

with $\alpha_{1}=0, \beta_{1}=0$ give $\hat{\beta}_{2}=-0.7328(\mathrm{se}=0.2985)$ , and deviance $=0.4941$ . What do you conclude?

Why do we take $\alpha_{1}=0, \beta_{1}=0$ in the model?

What "residuals" should you compute, and to which distribution would you refer them?