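Consider a generalised linear model with full column rank design matrix $X \in \mathbb{R}^{n \times p}$, output variables $Y = (Y_1, \ldots, Y_n) \in \mathbb{R}^n$, link function $g$, mean parameters $\mu = (\mu_1, \ldots, \mu_n)$ and known dispersion parameters $\sigma_i^2 = a_i \sigma^2$, $i = 1, \ldots, n$. Denote its variance function by $V$ and recall that $g(\mu_i) = x_i^T \beta$, $i = 1, \ldots, n$, where $\beta \in \mathbb{R}^p$ and $x_i^T$ is the $i$th row of $X$.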
(a) Define the score function and the Fisher information matrix in terms of the log-likelihood function, and define the update of the Fisher scoring algorithm.
(b) Let $W \in \mathbb{R}^{n \times n}$ be a diagonal matrix with positive entries. Note that $X^T W X$ is invertible. Show that
$$\operatorname*{argmin}_{b \in \mathbb{R}^p} \left\{ \sum_{i=1}^n W_{ii} (Y_i - x_i^T b)^2 \right\} = (X^T W X)^{-1} X^T W Y.$$
[Hint: you may use that $\operatorname*{argmin}_{b \in \mathbb{R}^p} \left\{ \| Y - X b \|^2 \right\} = (X^T X)^{-1} X^T Y$.]
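A numerical sanity check can make the reduction behind part (b) concrete. The sketch below is not a proof: it assumes small made-up data (`X`, `y`, `w` are hypothetical) and uses a hand-written solver (`solve`, `wls` are illustrative helper names) to confirm that the weighted formula agrees with ordinary least squares applied to the rescaled data $W^{1/2} X$ and $W^{1/2} Y$.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def wls(X, w, y):
    """Return (X^T W X)^{-1} X^T W y for W = diag(w)."""
    p = len(X[0])
    A = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X)) for k in range(p)]
         for j in range(p)]
    c = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, X, y)) for j in range(p)]
    return solve(A, c)

# Hypothetical toy data: intercept plus one covariate, positive weights.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 2.5, 5.0]
w = [1.0, 2.0, 0.5, 4.0]

# Weighted formula applied directly.
direct = wls(X, w, y)

# Ordinary least squares (unit weights) on rows scaled by sqrt(W_ii).
Xs = [[math.sqrt(wi) * v for v in xi] for wi, xi in zip(w, X)]
ys = [math.sqrt(wi) * yi for wi, yi in zip(w, y)]
rescaled = wls(Xs, [1.0] * len(ys), ys)

print(direct, rescaled)
```

The two printed vectors should coincide up to rounding, which mirrors the substitution of $W^{1/2} X$ for $X$ and $W^{1/2} Y$ for $Y$ in the hint.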
(c) Recall that the score function and the Fisher information matrix have entries
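$$U_j(\beta) = \sum_{i=1}^n \frac{(Y_i - \mu_i) X_{ij}}{a_i \sigma^2 V(\mu_i) g'(\mu_i)}, \quad j = 1, \ldots, p,$$
$$i_{jk}(\beta) = \sum_{i=1}^n \frac{X_{ij} X_{ik}}{a_i \sigma^2 V(\mu_i) \{g'(\mu_i)\}^2}, \quad j, k = 1, \ldots, p.$$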
Justify, performing the necessary calculations and using part (b), why the Fisher scoring algorithm is also known as the iterative reweighted least squares algorithm.
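To illustrate the equivalence asked for in part (c), the following sketch compares one Fisher scoring update with one weighted least squares fit at each iteration. It rests on assumptions not in the question: a Poisson model with log link (so $V(\mu) = \mu$, $g'(\mu) = 1/\mu$, $a_i = \sigma^2 = 1$), made-up data, and illustrative helper names. Under these assumptions the weights are $W_{ii} = 1 / (V(\mu_i) \{g'(\mu_i)\}^2) = \mu_i$ and the working response is $Z_i = x_i^T \beta + (Y_i - \mu_i) g'(\mu_i)$.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def weighted_normal_equations(X, w, v):
    """Return X^T W X and X^T W v for W = diag(w)."""
    p = len(X[0])
    A = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, X)) for k in range(p)]
         for j in range(p)]
    c = [sum(wi * xi[j] * vi for wi, xi, vi in zip(w, X, v)) for j in range(p)]
    return A, c

def working_quantities(X, y, beta):
    """Linear predictor, weights and scaled residuals (Poisson, log link)."""
    eta = [sum(xj * bj for xj, bj in zip(xi, beta)) for xi in X]
    mu = [math.exp(e) for e in eta]
    w = mu[:]                                      # W_ii = mu_i here
    r = [(yi - mi) / mi for yi, mi in zip(y, mu)]  # (Y_i - mu_i) g'(mu_i)
    return eta, w, r

def fisher_scoring_step(X, y, beta):
    """beta + i(beta)^{-1} U(beta), with i = X^T W X and U = X^T W r."""
    _, w, r = working_quantities(X, y, beta)
    info, score = weighted_normal_equations(X, w, r)
    delta = solve(info, score)
    return [b + d for b, d in zip(beta, delta)]

def irls_step(X, y, beta):
    """(X^T W X)^{-1} X^T W Z with working response Z = eta + r."""
    eta, w, r = working_quantities(X, y, beta)
    z = [e + ri for e, ri in zip(eta, r)]
    A, c = weighted_normal_equations(X, w, z)
    return solve(A, c)

# Hypothetical count data and starting value.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 4.0, 7.0]
beta = [0.0, 0.5]

max_gap = 0.0  # largest disagreement between the two updates
for _ in range(20):
    a = fisher_scoring_step(X, y, beta)
    b = irls_step(X, y, beta)
    max_gap = max(max_gap, max(abs(u - v) for u, v in zip(a, b)))
    beta = b

# Score at the final iterate; it should be close to zero at the MLE.
_, w_final, r_final = working_quantities(X, y, beta)
_, final_score = weighted_normal_equations(X, w_final, r_final)
print(max_gap, final_score)
```

The agreement of the two updates reflects the identity $\beta + (X^T W X)^{-1} X^T W r = (X^T W X)^{-1} X^T W (X\beta + r)$, so each Fisher scoring step is a weighted least squares fit of $Z$ on $X$ with weights recomputed from the current $\beta$.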