Comparing predictive ability of two nested logistic regression models

A common situation in biostatistics, and far more broadly, is wanting to compare the predictive ability of two competing models. A key question is often whether adding a new marker or variable Y to an existing set X improves prediction. The most obvious way of testing this is to fit a regression model and test whether adding Y improves fit, by testing the null hypothesis that the coefficient of Y in the expanded model is zero. An alternative approach is to test whether adding the new variable improves some measure of predictive ability, such as the area under the ROC curve.
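As a rough illustration of the first approach, one standard way to test that the coefficient of Y is zero is a likelihood ratio test between the two nested models. The sketch below assumes the two maximised log-likelihoods are already available from whatever software fitted the models; the function name and the numeric log-likelihood values are hypothetical and used only for illustration.

```python
import math


def lr_test_one_df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one added coefficient (hypothetical helper).

    Under the null hypothesis that the added coefficient is zero, twice the
    gain in log-likelihood is approximately chi-squared on 1 degree of freedom.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # For 1 degree of freedom, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: model with X only, then model with X and Y.
stat, p_value = lr_test_one_df(loglik_reduced=-123.0, loglik_full=-120.0)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value here is evidence that Y improves fit, which is not the same thing as showing that it improves a measure of predictive ability such as the area under the ROC curve.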


Checking functional form in logistic regression using loess plots

When we include a continuous variable as a covariate in a regression model, it's important that we include it using the correct (or an approximately correct) functional form. For example, with a continuous outcome Y and continuous covariate X, it may be the case that the expected value of Y is a linear function of X and X^2, rather than of X alone. For linear regression there are a number of ways of assessing what the appropriate functional form is for a covariate. A simple but often effective approach is to look at a scatter plot of Y against X and visually assess the shape of the association.


Area under the ROC curve - assessing discrimination in logistic regression

In a previous post we looked at the popular Hosmer-Lemeshow test for logistic regression, which can be viewed as assessing whether the model is well calibrated. In this post we'll look at one approach to assessing the discrimination of a fitted logistic model, via the receiver operating characteristic (ROC) curve.
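To make the idea of discrimination concrete, here is a minimal sketch of the area under the ROC curve, computed as the proportion of (case, non-case) pairs in which the case has the higher predicted probability, with ties counted as half. The function name and the toy predicted probabilities are assumptions for illustration, not output from a fitted model.

```python
def auc(predicted, outcome):
    """Area under the ROC curve via pairwise comparison (hypothetical helper).

    predicted: predicted probabilities from a fitted model
    outcome:   observed binary outcomes (1 = case, 0 = non-case)
    """
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    controls = [p for p, y in zip(predicted, outcome) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for pc in cases:
        for pn in controls:
            if pc > pn:
                concordant += 1.0   # case ranked above non-case
            elif pc == pn:
                concordant += 0.5   # ties count as half
    return concordant / (len(cases) * len(controls))


# Toy data: four subjects, two of whom experienced the outcome.
predicted = [0.1, 0.4, 0.35, 0.8]
outcome = [0, 0, 1, 1]
print(auc(predicted, outcome))  # 0.75: three of the four pairs are concordant
```

A value of 0.5 corresponds to discrimination no better than chance, and 1 to perfect separation of cases from non-cases.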


Deviance goodness of fit test for Poisson regression

In this post we'll look at the deviance goodness of fit test for Poisson regression with individual count data. Many software packages provide this test, either in the output when fitting a Poisson regression model or on request after fitting such a model (e.g. Stata), which may lead researchers and analysts into relying on it. In this post we'll see that the test often does not perform as expected, and therefore, I argue, ought to be used with caution.
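For reference, the statistic behind the test can be sketched as follows. This is a minimal illustration that assumes an intercept-only Poisson model, whose fitted mean is simply the sample mean, so no model-fitting code is needed; the function name and the toy counts are hypothetical.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum( y*log(y/mu) - (y - mu) ).

    The y*log(y/mu) term is taken to be 0 when y == 0.
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


# Toy individual count data (hypothetical).
counts = [0, 1, 3, 2, 4, 2]

# Assumption: intercept-only model, so every fitted mean equals the sample mean.
fitted = [sum(counts) / len(counts)] * len(counts)

deviance = poisson_deviance(counts, fitted)
df = len(counts) - 1  # observations minus number of estimated parameters
print(f"deviance = {deviance:.3f} on {df} degrees of freedom")
```

The test then refers the deviance to a chi-squared distribution on the residual degrees of freedom (for example with `scipy.stats.chi2.sf(deviance, df)`, if SciPy is available). It is that chi-squared approximation whose reliability is in question.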
