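(i) Suppose $Y_i$, $1 \leqslant i \leqslant n$, are independent binomial observations, with $Y_i \sim \mathrm{Bi}(t_i, \pi_i)$, $1 \leqslant i \leqslant n$, where $t_1, \ldots, t_n$ are known, and we wish to fit the model
\[ \omega: \log\frac{\pi_i}{1-\pi_i} = \mu + \beta^T x_i \quad \text{for each } i, \]
where $x_1, \ldots, x_n$ are given covariates, each of dimension $p$. Let $\hat\mu, \hat\beta$ be the maximum likelihood estimators of $\mu, \beta$. Derive equations for $\hat\mu, \hat\beta$ and state without proof the form of the approximate distribution of $\hat\beta$.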
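As a sketch of the derivation requested in (i), under standard likelihood theory: the symbols $\ell$, $y_i$ (the observed value of $Y_i$), $\hat\pi_i$ and $I$ are introduced here for illustration and do not appear in the question. Up to an additive constant, the log-likelihood under $\omega$ is
\[ \ell(\mu, \beta) = \sum_{i=1}^n \left\{ y_i (\mu + \beta^T x_i) - t_i \log\left(1 + e^{\mu + \beta^T x_i}\right) \right\}. \]
Setting the partial derivatives with respect to $\mu$ and $\beta$ to zero gives the $p+1$ equations
\[ \sum_{i=1}^n (y_i - t_i \hat\pi_i) = 0, \qquad \sum_{i=1}^n x_i (y_i - t_i \hat\pi_i) = 0, \qquad \text{where } \log\frac{\hat\pi_i}{1-\hat\pi_i} = \hat\mu + \hat\beta^T x_i. \]
These have no closed-form solution in general and are solved iteratively. For large $t_i$, $\hat\beta$ is approximately $N_p(\beta, V)$, where $V$ is the block corresponding to $\beta$ in the inverse of the Fisher information matrix
\[ I(\mu, \beta) = \sum_{i=1}^n t_i \pi_i (1 - \pi_i) \begin{pmatrix} 1 \\ x_i \end{pmatrix} \begin{pmatrix} 1 & x_i^T \end{pmatrix}. \]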
(ii) In 1975, data were collected on the 3-year survival status of patients suffering from a type of cancer, yielding the following table.
\begin{tabular}{ccrr}
 & & \multicolumn{2}{c}{survive?} \\
age in years & malignant & yes & no \\
under 50 & no & 77 & 10 \\
under 50 & yes & 51 & 13 \\
50--69 & no & 51 & 11 \\
50--69 & yes & 38 & 20 \\
70+ & no & 7 & 3 \\
70+ & yes & 6 & 3
\end{tabular}
Here the second column indicates whether or not the initial tumour was malignant.
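Let $Y_{ij}$ be the number surviving, for age group $i$ and malignancy status $j$, for $i = 1, 2, 3$ and $j = 1, 2$, and let $t_{ij}$ be the corresponding total number. Thus $Y_{11} = 77$, $t_{11} = 87$. Assume $Y_{ij} \sim \mathrm{Bi}(t_{ij}, \pi_{ij})$, $1 \leqslant i \leqslant 3$, $1 \leqslant j \leqslant 2$. The results from fitting the model
\[ \log\frac{\pi_{ij}}{1 - \pi_{ij}} = \mu + \alpha_i + \beta_j \]
with $\alpha_1 = 0$, $\beta_1 = 0$ give $\hat\beta_2 = -0.7328$ (se $= 0.2985$), and deviance $= 0.4941$. What do you conclude?
Why do we take $\alpha_1 = 0$, $\beta_1 = 0$ in the model?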
What "residuals" should you compute, and to which distribution would you refer them?
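The quantities quoted in (ii) can be checked numerically. The following is a minimal sketch, assuming Newton-Raphson (Fisher scoring) on the logistic log-likelihood and using only NumPy; the function name `fit_binomial_logit` and all variable names are illustrative and not part of the question. It refits the model with the corner constraints $\alpha_1 = 0$, $\beta_1 = 0$, then computes the deviance and two common choices of residual (Pearson and deviance residuals), which under the model are usually referred to $N(0,1)$ approximately.

```python
import numpy as np

# Data from the table, rows ordered as in the source:
# (under 50, no), (under 50, yes), (50-69, no), (50-69, yes), (70+, no), (70+, yes)
survived = np.array([77, 51, 51, 38, 7, 6], dtype=float)
died = np.array([10, 13, 11, 20, 3, 3], dtype=float)
total = survived + died

# Design matrix columns: mu, alpha_2 (50-69), alpha_3 (70+), beta_2 (malignant).
# alpha_1 = beta_1 = 0 are absorbed into the intercept (corner constraints).
X = np.array([
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
], dtype=float)


def fit_binomial_logit(X, y, t, max_iter=50, tol=1e-10):
    """Illustrative Newton-Raphson fit of a binomial logistic regression."""
    theta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ theta)))
        score = X.T @ (y - t * pi)                       # likelihood equations
        info = X.T @ ((t * pi * (1 - pi))[:, None] * X)  # Fisher information
        step = np.linalg.solve(info, score)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    # Recompute fitted probabilities and information at the final estimate.
    pi = 1.0 / (1.0 + np.exp(-(X @ theta)))
    info = X.T @ ((t * pi * (1 - pi))[:, None] * X)
    return theta, np.linalg.inv(info), pi


theta, cov, pi_hat = fit_binomial_logit(X, survived, total)
se = np.sqrt(np.diag(cov))
fitted = total * pi_hat

# Per-cell deviance contributions (no cell has a zero count, so the logs are finite).
dev_terms = 2 * (survived * np.log(survived / fitted)
                 + died * np.log(died / (total - fitted)))
deviance = dev_terms.sum()
df = len(survived) - X.shape[1]  # 6 cells - 4 parameters = 2

# The chi-squared upper tail with 2 degrees of freedom is exactly exp(-x/2).
p_value = np.exp(-deviance / 2)

# Residuals, each approximately N(0, 1) if the model fits.
pearson = (survived - fitted) / np.sqrt(total * pi_hat * (1 - pi_hat))
dev_resid = np.sign(survived - fitted) * np.sqrt(dev_terms)

print("beta_2 estimate:", theta[3], "se:", se[3], "z:", theta[3] / se[3])
print("deviance:", deviance, "on", df, "df; p-value:", p_value)
print("Pearson residuals:", pearson)
print("deviance residuals:", dev_resid)
```

The closed-form tail probability is used only because the residual degrees of freedom happen to be 2; for other degrees of freedom a chi-squared routine from a statistics library would be needed.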