You've estimated a GLM or a related model (GLMM, GAM, etc.) for your latest paper and, like a good researcher, you want to visualise the model and show the uncertainty in it. In general this is done using confidence intervals, typically with 95% coverage. If you remember a little bit of theory from your stats classes, you may recall that such an interval can be produced by adding to and subtracting from the fitted values 2 times their standard error. Unfortunately, this only really works like this for a linear model.

If I had a dollar (even a Canadian one) for every time I've seen someone present graphs of estimated abundance of some species where the confidence interval includes negative abundances, I'd be rich! Why is plus/minus two standard errors wrong? Well, it's not! However, the main reason why people mess up computing confidence intervals for a GLM is that they do all the calculations on the response scale. This results in symmetric intervals on that scale and the very real possibility that the intervals will include values that are nonsensical, like negative abundances and concentrations, or probabilities outside the limits of 0 and 1. Here, following the rule of "if I'm asked more than once I should write a blog post about it!", I'm going to show a simple way to correctly compute a confidence interval for a GLM or a related model.

Think about a Poisson GLM fitted to some species abundance data. In this model there is an implied mean-variance relationship: as the mean count increases, so does the variance. In fact, in the Poisson GLM the mean and the variance are the same thing. The implication of this is that as the mean tends to zero, so must the variance. If we had an expected count of zero, the variance would also be zero, and our uncertainty about this value would also be zero. However, our model won't ever return expected (fitted) values that are exactly equal to zero; it might yield values that are very close to zero, but never exactly zero. In that case we do have some uncertainty about the fitted value, but the uncertainty on the lower end has to fit logically somewhere between the small estimated value and zero, and not exactly at zero, as we're not creating an interval with 100% coverage. We might also logically expect greater uncertainty above the fitted value: for the upper limit of the confidence interval we're saying that the true expected abundance is possibly somewhat larger than the fitted value, and, due to the mean-variance relationship, a larger fitted value is a larger mean, which implies a larger variance, and consequently a greater amount of uncertainty above the fitted value than below it.
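The mean-variance claim is easy to check by simulation. Here's a minimal sketch (mine, not part of the original post; NumPy, the seed, and the particular means are just illustrative choices) showing that for Poisson draws the sample mean and sample variance estimate the same quantity, and both shrink toward zero together:

```python
import numpy as np

rng = np.random.default_rng(0)

# For the Poisson distribution the mean and the variance are the same
# parameter, so sample estimates of both should agree at every mean,
# and both should shrink toward zero as the mean does.
for mu in (10.0, 1.0, 0.1):
    draws = rng.poisson(mu, size=200_000)
    print(f"mu={mu}: sample mean={draws.mean():.3f}, sample var={draws.var():.3f}")
```

At each mean the two sample estimates agree closely, which is exactly why an honest interval around a near-zero fitted count must be narrow below the fit and wider above it.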
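To make the recipe concrete, here is a minimal sketch of doing the calculation on the link scale and back-transforming. Everything in it is an illustrative assumption of mine, not code from the post: simulated abundance data, a plain-NumPy fit of the Poisson GLM by iteratively reweighted least squares, and a 1.96 multiplier for an approximate 95% interval.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical species abundance data: counts declining along a gradient x.
x = rng.uniform(0.0, 2.0, size=200)
y = rng.poisson(np.exp(1.5 - 1.2 * x))
X = np.column_stack([np.ones_like(x), x])      # intercept + slope

# Fit a Poisson GLM (log link) by iteratively reweighted least squares.
beta = np.zeros(2)
for _ in range(50):
    eta = X @ beta                  # linear predictor (link scale)
    mu = np.exp(eta)                # inverse link gives the mean
    z = eta + (y - mu) / mu         # working response
    XtW = X.T * mu                  # X^T W with Poisson weights W = mu
    beta = np.linalg.solve(XtW @ X, XtW @ z)

# Large-sample covariance of beta at the converged fit.
XtW = X.T * np.exp(X @ beta)
cov = np.linalg.inv(XtW @ X)

# Predict at new gradient values, working on the link scale.
xnew = np.linspace(0.0, 2.0, 50)
Xnew = np.column_stack([np.ones_like(xnew), xnew])
eta_new = Xnew @ beta
se = np.sqrt(np.einsum("ij,jk,ik->i", Xnew, cov, Xnew))

# Build the interval on the link scale, then back-transform with the
# inverse link: the band is asymmetric and strictly positive.
fitted = np.exp(eta_new)
lower = np.exp(eta_new - 1.96 * se)
upper = np.exp(eta_new + 1.96 * se)
```

Because `exp` is applied last, the lower limit can approach zero but never cross it, and the band is wider above the fitted value than below it, matching the mean-variance reasoning above. In practice you'd get the linear predictor and its standard error from your fitting software rather than rolling your own IRLS.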