Esperanca, Alvaro
Miled, Zina Ben
Mahoui, Malika
2019-12-20
2019-12-20
2019
Esperanca, A., Miled, Z. B., & Mahoui, M. (2019). Social Media Sensing Framework for Population Health. 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), 0298–0304. https://doi.org/10.1109/CCWC.2019.8666534
https://hdl.handle.net/1805/21539
Conducting large population health studies is expensive. For instance, collecting field information about the efficacy of health campaigns or the impact of a disease may require the involvement of many health providers over an extended period of time, and may still fail to reach the target population. Because of these difficulties, health-related population statistics may be unavailable or lag by several years. Recently, social media networks have emerged as a source of sensory data for various aspects of social behavior. This source of information is used to drive marketing campaigns, conduct threat analysis, and profile groups of individuals, among numerous other applications. However, these applications are usually limited to specific case studies and do not provide a systematic approach to translating social media data into knowledge. In this paper, we propose a framework that can extract knowledge from social media networks in support of large-scale health studies. The framework consists of an automated workflow designed to collect data from social media platforms, filter the data based on geographical criteria, and extract information relevant to a target hypothesis. The framework is demonstrated on the mortality and incidence of three chronic diseases, namely asthma, cancer, and diabetes. Twitter data is extracted over the period 2010 to 2015 for each target geographical region and classified based on its relevance to each of these diseases.
Because of the large number of extracted records, simple random sampling is used to select the records for supervised training and testing of the framework's classifier. Although this approach limits the number of records used to train the classifiers, high classification accuracy is achieved for all three diseases. While the case studies in this paper focus on the three chronic diseases asthma, diabetes, and cancer, the utility of the proposed framework extends to other areas in the health sector. The proposed framework can help automate data-driven hypothesis validation for social media health-related studies. This paper describes the underlying methodology as well as the limitations associated with using social media data as a sensor for trends in population health.
en
Publisher Policy
social networking
public health
social media
Social Media Sensing Framework for Population Health
Conference proceedings
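
The abstract says that simple random sampling is used to select records for supervised training and testing, but it does not give the sample size, the split ratio, or any implementation details. The Python sketch below is only an illustration of that sampling step under assumed values: the function name `sample_for_labeling`, the sample size of 500, the 20% test fraction, the fixed seed, and the placeholder record strings are all hypothetical and are not taken from the paper.

```python
import random


def sample_for_labeling(records, sample_size, test_fraction=0.2, seed=0):
    """Draw a simple random sample and split it into train and test subsets.

    All parameter values here are illustrative assumptions, not the
    settings used in the paper.
    """
    if sample_size > len(records):
        raise ValueError("sample_size exceeds the number of records")

    # A seeded generator keeps the draw reproducible.
    rng = random.Random(seed)

    # Simple random sampling: every record has the same chance of being
    # chosen, and no record is chosen twice (sampling without replacement).
    sample = rng.sample(records, sample_size)

    # The sample is already in random order, so slicing gives a random split.
    n_test = round(sample_size * test_fraction)
    test_set = sample[:n_test]
    train_set = sample[n_test:]
    return train_set, test_set


# Placeholder records standing in for geographically filtered tweets.
tweets = [f"tweet-{i}" for i in range(10_000)]
train, test = sample_for_labeling(tweets, sample_size=500)
```

Only the sampled records would then need manual relevance labels, which is what keeps the labeling effort small relative to the full set of extracted records.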