Twitter Geo-Epidemiological Feasibility Study


Improving epidemiological models by adding spatial and semantic information derived from social media data




The project extracts Twitter posts that mention influenza, yellow fever, cholera, Ebola, and other diseases for 12 countries in Africa. From these posts we compute basic statistics, such as the number of tweets per capita per week per country and space-time variations in the number of tweets, and we assess the suitability of tweets for investigating, detecting and predicting disease outbreaks at fine spatial scales. We consider a large set of keywords associated with different diseases and symptoms. For the analysis, we categorized them loosely based on the International Statistical Classification of Diseases and Related Health Problems (ICD), grouping keywords that explicitly name diseases with keywords that mention the accompanying symptoms.
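To make the statistic "tweets per capita per week per country" concrete, the following is a minimal Python sketch and not the project's actual pipeline. The keyword groups, the population figures, the `(country, date, text)` tweet layout and the function names `match_groups` and `weekly_rate` are all illustrative assumptions; the real keyword set is much larger and is categorized loosely along the ICD.

```python
import re
from collections import Counter

# Hypothetical keyword groups: each disease name is grouped with
# terms for accompanying symptoms (illustrative only, not the real list).
KEYWORD_GROUPS = {
    "influenza": {"influenza", "flu", "cough"},
    "cholera": {"cholera", "diarrhea", "dehydration"},
}

# Made-up round population figures, keyed by country code (assumption).
POPULATION = {"KE": 50_000_000, "NG": 200_000_000}


def match_groups(text):
    """Return the sorted names of keyword groups with a term in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(g for g, terms in KEYWORD_GROUPS.items() if words & terms)


def weekly_rate(tweets, per=1_000_000):
    """Count disease-related tweets per country and ISO week,
    normalized to tweets per `per` inhabitants.

    `tweets` is an iterable of (country_code, datetime.date, text) tuples.
    """
    counts = Counter()
    for country, day, text in tweets:
        if match_groups(text):
            iso_year, iso_week, _ = day.isocalendar()
            counts[(country, iso_year, iso_week)] += 1
    return {key: n * per / POPULATION[key[0]] for key, n in counts.items()}
```

Matching on whole lowercase words is a deliberate simplification: it misses multi-word terms such as "yellow fever", hashtags, misspellings and other languages, all of which a real keyword filter would need to handle.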

With the COVID-19 pandemic, the project was extended to analyze geo-social network data for indicators of disease outbreaks and of connectivity between regions, with the aim of improving disease prediction models.


The project uses Exploratory Spatial Data Analysis (ESDA): geocoded Twitter data are extracted and processed to assess the feasibility of using them to identify, monitor and predict epidemiological waves in selected countries in Africa.


The results of this project will help improve epidemiological models by adding a spatial component and semantic information to them. The benefits are twofold: (1) the prediction accuracy of existing disease models can be increased, and (2) precise spatial information about disease incidence allows researchers to narrow down the areas in which disease outbreaks are imminent or ongoing.


Bernd Resch (project lead), Clemens Havas, Veronika Krieger, Andreas Petutschnig, Oliver Zichert
Project Partners