Using big data to model future outcomes has become a popular topic across many fields. For this post, one application that seems particularly interesting is epidemiology, and specifically Legionnaires’ disease.

The primary paper I could find dates back to March 2011, but it draws some interesting conclusions. Essentially, the researchers attempt to model a Legionnaires’ disease outbreak using symptom-onset data from several specific outbreaks in order to estimate when the release of Legionella began and ended. They also develop a model that can estimate the final size of an ongoing outbreak along with the timing of the release.

The second model had some difficulties, including estimated release end dates that were earlier than the reported end dates. This suggests that in many outbreaks the release of Legionella bacteria might already have ended by the time the original source was reportedly cleaned and/or closed. That said, their other model did show that the release start date could be estimated accurately early in an outbreak, that the total number of cases could be reasonably determined after the release of the bacteria had ended, and that the release end date could be estimated satisfactorily in the later stages of an outbreak.
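To make the idea of projecting a final case count a little more concrete, here is a rough sketch of my own; it is not the paper's model. It assumes that exposure is spread uniformly over the release window and that the incubation period is lognormal. The median of 6 days and the spread of 0.5 are placeholder values I picked for illustration, not figures from the paper, and the function names are hypothetical. The logic is simple back-calculation: work out what fraction of eventual cases should already have shown symptoms by a given day, then scale up the cases seen so far.

```python
import math


def lognormal_cdf(x, median, sigma):
    """CDF of a lognormal incubation period (in days); zero for x <= 0."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - math.log(median)) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def fraction_onset_by(t, release_start, release_end,
                      median=6.0, sigma=0.5, steps=200):
    """Expected fraction of all eventual cases with symptom onset by day t.

    Assumes exposure times are uniform over [release_start, release_end]
    and averages the incubation CDF over that window (midpoint rule).
    median and sigma are illustrative assumptions, not fitted values.
    """
    width = release_end - release_start
    total = 0.0
    for i in range(steps):
        exposure_day = release_start + (i + 0.5) * width / steps
        total += lognormal_cdf(t - exposure_day, median, sigma)
    return total / steps


def estimate_final_size(cases_so_far, t, release_start, release_end, **kwargs):
    """Scale the observed case count by the fraction expected to have appeared."""
    frac = fraction_onset_by(t, release_start, release_end, **kwargs)
    if frac <= 0.0:
        raise ValueError("no cases are expected yet under these assumptions")
    return cases_so_far / frac
```

For example, estimate_final_size(40, t=10, release_start=0, release_end=5) would project a total from 40 cases observed by day 10 of a five-day release. The projection is only as good as the assumed release window and incubation distribution, which fits the paper's point that the total is easier to pin down once the release has ended.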

With all of this in mind, the paper concludes by suggesting that the model could be used during a Legionnaires’ disease outbreak to provide an early estimate of the total number of cases, information that could help inform public-health planning. In addition, toward the end of an outbreak, the model’s estimates of the release end date could help corroborate standard epidemiologic investigations aimed at identifying the source.

