Fixed versus random-effects meta-analysis – efficiency and confidence interval coverage

Meta-analysis is a critical tool for synthesizing existing evidence. It is commonly used in medical and clinical settings to evaluate the evidence regarding the effect of a treatment or exposure on an outcome of interest. The essential idea is that the estimates of the effect of interest from previous studies are pooled. A choice which has to be made when conducting a meta-analysis is between a fixed-effects and a random-effects analysis. In this post we’ll look at some of the consequences of this choice when in truth the studies are measuring different effects.
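To make the choice concrete, here is a minimal Python sketch of inverse-variance pooling under both approaches. It assumes the DerSimonian-Laird moment estimator for the between-study variance, which is one common option and not necessarily the one used in the full post. The function names and the study estimates are hypothetical, chosen only for illustration.

```python
import math


def fixed_effect(estimates, variances):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


def dersimonian_laird_tau2(estimates, variances):
    """Moment estimate of the between-study variance, truncated at zero."""
    weights = [1.0 / v for v in variances]
    k = len(estimates)
    pooled, _ = fixed_effect(estimates, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    return max(0.0, (q - (k - 1)) / c)


def random_effects(estimates, variances):
    """Pool after adding the between-study variance to each study's variance."""
    tau2 = dersimonian_laird_tau2(estimates, variances)
    pooled, se = fixed_effect(estimates, [v + tau2 for v in variances])
    return pooled, se, tau2


# Hypothetical study estimates (e.g. log odds ratios) and their variances
estimates = [0.10, 0.30, 0.60, 0.45]
variances = [0.01, 0.02, 0.04, 0.01]

print("fixed effect:  ", fixed_effect(estimates, variances))
print("random effects:", random_effects(estimates, variances))
```

Because the estimated between-study variance is never negative, the random-effects standard error in this sketch is never smaller than the fixed-effect one, which is the source of the efficiency versus coverage trade-off.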


Multiple imputation with interactions and non-linear terms

Multiple imputation has become an extremely popular approach to handling missing data, for a number of reasons. One is that once the imputed datasets have been generated, they can each be analysed using standard analysis methods, and the results pooled using Rubin’s rules. However, in addition to the missing at random assumption, for multiple imputation to give unbiased point estimates the model(s) used to impute missing data need to be (at least approximately) correctly specified. Because of this, care must be taken when choosing the imputation model.
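As a rough illustration of the pooling step, the sketch below applies Rubin's rules to a single scalar parameter: the pooled estimate is the average of the per-imputation estimates, and the total variance combines the average within-imputation variance with the between-imputation variance. The function name and the numbers are hypothetical, and the degrees-of-freedom calculation used for confidence intervals is left out.

```python
import math


def rubins_rules(estimates, variances):
    """Pool one parameter across m imputed datasets (requires m >= 2).

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    """
    m = len(estimates)
    pooled = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical results from analysing five imputed datasets
estimates = [1.92, 2.10, 2.05, 1.98, 2.01]
variances = [0.040, 0.043, 0.039, 0.041, 0.042]

pooled, se = rubins_rules(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, standard error = {se:.3f}")
```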

What constitutes a reasonable imputation model will obviously depend on the dataset and situation at hand. One situation which is commonly encountered, but where it is not obvious what one should do, is where the dataset, or the model(s) which will be fitted after imputation, contains interaction terms or non-linear terms such as squared terms.


Adjusting for covariate misclassification in logistic regression – predictive value weighting

When we fit regression models, we implicitly assume that the values in our dataset are accurate measurements of the variables of interest. In many settings, the measurements we actually have are imperfect. In the case of a categorical variable, for some of the records in our dataset the observed value may differ from the true value, due to misclassification. Misclassification arises for many different reasons. In epidemiology, the instruments used to measure conditions are often imperfect: sometimes observations which should be recorded as 1 are recorded as 0, and vice versa. In this post I’ll focus on the common situation where logistic regression is used to model an outcome Y, and one of the covariates is subject to misclassification.
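To give a feel for the quantities involved, the sketch below uses Bayes' theorem to turn an assumed sensitivity and specificity into predictive values for a binary covariate, that is, the probability that the true value is 1 given an observed 1, and 0 given an observed 0. This is only the simplest marginal calculation under hypothetical inputs. It is not the full method described in the post, and it ignores any dependence of the misclassification on the outcome Y or on other covariates.

```python
def observed_prevalence(true_prev, sensitivity, specificity):
    """Probability that the recorded value is 1."""
    return sensitivity * true_prev + (1 - specificity) * (1 - true_prev)


def predictive_values(true_prev, sensitivity, specificity):
    """Return (PPV, NPV) for a binary variable measured with error."""
    p_obs = observed_prevalence(true_prev, sensitivity, specificity)
    ppv = sensitivity * true_prev / p_obs
    npv = specificity * (1 - true_prev) / (1 - p_obs)
    return ppv, npv


# Hypothetical values: 20% true prevalence, 90% sensitivity, 80% specificity
ppv, npv = predictive_values(true_prev=0.2, sensitivity=0.9, specificity=0.8)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```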
