Logistic regression / Generalized linear model books

Applied Logistic Regression, by Hosmer, Lemeshow and Sturdivant
Hosmer, Lemeshow and (now also) Sturdivant have recently (2013) released the third edition of their very popular book on logistic regression. It is an extremely readable account of models for binary (and also categorical) outcome data. I think one of the reasons the book has become so popular is that it gives lots of very practical guidance on the model building process, something that is arguably lacking in more theoretical treatments of the topic.

One of the distinguishing features of the book compared to others is its excellent coverage of techniques for assessing goodness of fit (including the Hosmer and Lemeshow test) and the predictive power of the fitted model (for example using the area under the receiver operating characteristic curve).
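As a rough illustration of the two ideas just mentioned, here is a minimal Python sketch of the area under the ROC curve and of the Hosmer and Lemeshow statistic. It is not taken from the book: the function names `auc` and `hosmer_lemeshow` and the toy data are my own, and the grouping rule (equal-sized groups after sorting by predicted probability) is one common choice among several.

```python
def auc(y, p):
    """Area under the ROC curve via the Mann-Whitney formulation:
    the proportion of (event, non-event) pairs in which the event has
    the higher predicted probability, with ties counted as one half."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow chi-squared statistic.
    Sort subjects by predicted probability, split them into groups of
    roughly equal size, and compare observed with expected event counts.
    Assumes no group has a mean predicted probability of exactly 0 or 1."""
    pairs = sorted(zip(p, y))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        n_g = len(chunk)
        observed = sum(yi for _, yi in chunk)
        expected = sum(pi for pi, _ in chunk)
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return stat


# Toy data (hypothetical): observed outcomes and fitted probabilities.
y_obs = [0, 0, 1, 0, 1, 1, 0, 1]
p_hat = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
print("AUC:", auc(y_obs, p_hat))
print("Hosmer-Lemeshow statistic:", hosmer_lemeshow(y_obs, p_hat, groups=4))
```

In practice the statistic is usually compared against a chi-squared distribution with `groups - 2` degrees of freedom, and standard software will compute both quantities for you.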

Despite the title, the book also covers the modelling of ordered and unordered categorical outcomes (i.e. with more than two levels). The third edition has a number of important additions, including coverage of fitting models to data collected with a complex survey design and the analysis of correlated categorical outcomes (e.g. clustered or longitudinal data).

Overall, an excellent book. Probably the best book to get if you're learning about logistic regression for the first time.

Modelling Binary Data, by Collett
Like Hosmer, Lemeshow and Sturdivant's book, Collett's is another very practical introduction to models for binary outcomes. The logistic regression model is described in detail, before the book moves on to goodness of fit and gives lots of practical guidance on the process of model selection.

A strong feature of the book is a very comprehensive chapter on techniques for assessing the fit of a model, with the use of diagnostic plots and residuals. Some coverage is also given of random-effects models for correlated binary data. Models for ordinal outcomes are also covered briefly. Another very nice book on modelling binary outcomes.

Generalized Linear Models, by McCullagh and Nelder
This is, if you like, the 'original' book on GLMs, with John Nelder being one of the original developers (with Wedderburn) of the GLM modelling framework. It's a great book, starting with a historical perspective on the development of GLMs from linear models and the analysis of variance, before developing the GLM framework. It covers models for ordered and unordered categorical outcomes, log-linear models for contingency table data, and even models for survival data.

It is a much more formal presentation than the other books listed here, and so in that sense may be less attractive to applied researchers. Furthermore, since it is now quite old (1989), it necessarily doesn't include some of the newer developments which have been made in the following 25 years. Nevertheless, it arguably remains the definitive textbook on GLMs.

An Introduction to Generalized Linear Models, by Dobson
Dobson's book gives a somewhat more applied introduction to GLMs, and has the advantage compared to McCullagh and Nelder of being more recently published. I personally find Dobson's book very readable. The first part of the book introduces the GLM framework using quite a general setup, including giving derivations for the asymptotic results which are used for inference from a fitted GLM.

The following chapters then give the specific details for normally distributed continuous outcomes, binary, ordinal and unordered logistic regression models, and models based on the Poisson distribution for count outcomes and contingency tables. Like Collett and McCullagh and Nelder, Dobson also gives some coverage of models for time to event / survival outcomes. Dobson's book is a good option for those who are looking for a systematic treatment of the GLM framework, but who perhaps find McCullagh and Nelder's text too formal.
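To give a flavour of what fitting a GLM and using the asymptotic results for inference involves, here is a minimal Python sketch for the simplest case: logistic regression with an intercept and a single covariate, fitted by Newton-Raphson (which coincides with Fisher scoring for the logit link). It is not taken from any of the books above. The function names `score_and_information` and `fit_logistic` and the toy data are my own, and there is no step-halving or convergence check, so treat it as an illustration rather than something to use on real data.

```python
import math


def score_and_information(b0, b1, x, y):
    """Score vector and Fisher information for logit P(y=1) = b0 + b1*x."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for xi, yi in zip(x, y):
        mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # fitted probability
        w = mu * (1.0 - mu)                           # variance function
        g0 += yi - mu
        g1 += (yi - mu) * xi
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    return (g0, g1), (i00, i01, i11)


def fit_logistic(x, y, iterations=25):
    """Return ((b0, b1), (se0, se1)): estimates and Wald standard errors."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        (g0, g1), (i00, i01, i11) = score_and_information(b0, b1, x, y)
        det = i00 * i11 - i01 * i01
        # Newton step: beta <- beta + information^{-1} * score (2x2 inverse).
        b0 += (i11 * g0 - i01 * g1) / det
        b1 += (i00 * g1 - i01 * g0) / det
    # Asymptotic variances are the diagonal of the inverse information,
    # evaluated at the final estimates.
    _, (i00, i01, i11) = score_and_information(b0, b1, x, y)
    det = i00 * i11 - i01 * i01
    return (b0, b1), (math.sqrt(i11 / det), math.sqrt(i00 / det))


# Toy data (hypothetical): outcomes overlap in x, so the MLE is finite.
x_obs = [0, 1, 2, 3, 4, 5, 6, 7]
y_obs = [0, 0, 1, 0, 1, 0, 1, 1]
(b0_hat, b1_hat), (se0, se1) = fit_logistic(x_obs, y_obs)
print("intercept:", b0_hat, "SE:", se0)
print("slope:", b1_hat, "SE:", se1)
print("Wald z for slope:", b1_hat / se1)
```

The Wald z statistic in the last line is the estimate divided by its standard error, which is referred to a standard normal distribution in large samples.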
