Word Adjacency Graph Modeling: Separating Signal From Noise in Big Data

There is a need to develop methods to analyze Big Data to inform patient-centered interventions for better health outcomes. The purpose of this study was to develop and test a method to explore Big Data to describe salient health concerns of people with epilepsy. Specifically, we used Word Adjacency Graph modeling to explore a data set containing 1.9 billion anonymous text queries submitted to the ChaCha question and answer service to (a) detect clusters of epilepsy-related topics, and (b) visualize the range of epilepsy-related topics and their mutual proximity to uncover the breadth and depth of particular topics and groups of users. Applied to a large, complex data set, this method successfully identified clusters of epilepsy-related topics while allowing for separation of potentially non-relevant topics. The method can be used to identify patient-driven research questions from large social media data sets and results can inform the development of patient-centered interventions.
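The abstract does not spell out how the graph is built, so the following is only a minimal sketch of the general idea behind a word adjacency graph, not the authors' implementation. It assumes that an edge links two words whenever they appear side by side in a query, and that the edge weight is the number of times the pair occurs. The function names and the sample queries are hypothetical and invented for illustration; they are not taken from the ChaCha data set.

```python
from collections import Counter


def build_adjacency_graph(queries):
    """Count how often each ordered pair of words appears side by side.

    Assumption: words are split on whitespace after lowercasing.
    """
    edges = Counter()
    for query in queries:
        words = query.lower().split()
        for left, right in zip(words, words[1:]):
            edges[(left, right)] += 1
    return edges


def neighbors(edges, seed):
    """Return the words directly adjacent to the seed term, with counts."""
    found = Counter()
    for (left, right), count in edges.items():
        if left == seed:
            found[right] += count
        elif right == seed:
            found[left] += count
    return found


# Made-up queries, for illustration only.
sample_queries = [
    "can epilepsy cause memory loss",
    "does epilepsy medication cause weight gain",
    "what is epilepsy",
]

graph = build_adjacency_graph(sample_queries)
epilepsy_neighbors = neighbors(graph, "epilepsy")
```

In a sketch like this, words that share many strong edges with a seed term such as "epilepsy" would be candidates for a topic cluster, while weakly connected words could be set aside as likely noise.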

Keywords

epilepsy, Big Data, methods

Cite As

Miller, W. R., Groves, D., Knopf, A., Otte, J. L., & Silverman, R. D. (2017). Word adjacency graph modeling: Separating signal from noise in big data. Western Journal of Nursing Research, 39(1), 166-185. https://doi.org/10.1177/0193945916670363

Journal

Western Journal of Nursing Research

Rights

Publisher Policy

Type

Article

Permanent Link

https://hdl.handle.net/1805/18439

DOI

https://doi.org/10.1177/0193945916670363

Version

Author's manuscript

Collections

Open Access Policy Articles
IU School of Nursing Works
Wendy Miller
