Inference – The Stats Geek

Does a Bernoulli/binomial model really assume everyone has the same probability p?

November 4, 2024 by Jonathan Bartlett

When you estimate a proportion and want to calculate a standard error for the estimate, you would normally do so by assuming that the number of ‘successes’ in the sample is a draw from a binomial distribution, which counts the number of successes in a series of $n$ independent Bernoulli 0/1 draws, where each draw has a probability $p$ of ‘success’. Does the model rely on, or assume, that the success probability is the same for each of these binary observations? In the third paragraph of this blog post Frank Harrell argues (or seems to argue) that it does. In this post I’ll delve into this a bit further, using the same numerical example Frank gives.

Suppose we have a random sample of $n$ individuals on whom we observe a binary outcome indicating presence or absence of disease. Suppose that in a sample of $n=100$, 40 have the disease, so our estimate of the proportion with disease in the population from which the sample was drawn (which I will denote $p$) is $\hat{p} = 40/100 = 0.4$.
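To make the numbers concrete, here is a minimal sketch of the usual binomial-based standard error for this example, $\sqrt{\hat{p}(1-\hat{p})/n}$, together with a Wald 95% confidence interval. The function name `binomial_se` is just an illustrative choice, and the Wald interval is only one of several intervals one could use here.

```python
import math

def binomial_se(successes, n):
    """Return the sample proportion and its binomial-based standard error."""
    p_hat = successes / n
    # Standard error under the binomial model: sqrt(p_hat * (1 - p_hat) / n)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se

# The example from the text: 40 with disease out of n = 100
p_hat, se = binomial_se(40, 100)

# Wald 95% interval, using the 0.975 quantile of the standard normal
z = 1.959964
lower = p_hat - z * se
upper = p_hat + z * se

print(f"estimate = {p_hat:.3f}, SE = {se:.4f}")
print(f"95% Wald CI = ({lower:.3f}, {upper:.3f})")
```

With 40 out of 100, the standard error comes out at roughly 0.049, so the interval runs from about 0.30 to about 0.50.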

Interpretation of frequentist confidence intervals and Bayesian credible intervals

September 12, 2024 / November 21, 2020 by Jonathan Bartlett

This post was prompted by a tweet by Frank Harrell yesterday asking:

#Statistics challenge of the day: suppose a randomized trial yielded a 0.95 confidence interval for the treatment odds ratio of [0.72, 0.91]. Can you provide an exact interpretation of THIS interval? #bbrcourse @vandy_biostat @EdgeforScholars
— Frank Harrell (@f2harrell) November 19, 2020

In this post I’ll say a little about trying to answer Frank’s question, and then a little about an alternative question which I posed in response, namely: how does the interpretation change if the interval is a Bayesian credible interval rather than a frequentist confidence interval?

P-values after multiple imputation using mitools in R

July 5, 2021 / November 5, 2020 by Jonathan Bartlett

I’ve been using Thomas Lumley’s excellent mitools package in R for applying Rubin’s rules for multiple imputation ever since I wrote the smcfcs package in R. Somebody recently asked me about how they could obtain p-values corresponding to the Rubin’s rules results calculated by the MIcombine function in mitools. In this short post I’ll give some R code to calculate these.