Cracking the Code of Geo-Identifiers: Harnessing Data-Based Decision-Making for the Public Good
Date
Authors
Language
Embargo Lift Date
Department
Committee Members
Degree
Degree Year
Department
Grantor
Journal Title
Journal ISSN
Volume Title
Found At
Abstract
The accessibility of official statistics to non-expert users could be aided by employing natural language processing and deep learning models to dataset lexicons. Specifically, the semantic structure of FIPS codes would offer a relatively standardized data dictionary of column names and string variable structure to identify: two-digits for states, followed by three-digits for counties. The technical, methodological contribution of this paper is a bibliometric analysis of scientific publications based on FIPS code analysis indicated that between 27,954 and 1,970,000 publications attend to this geo- identifier. Within a single dataset reporting national representative and longitudinal survey data, 141 publications utilize FIPS data. The high incidence shows the research impact. Yet, the low proportion of only 2.0 percent of all publications utilizing this dataset also shows a gap even among expert users. A data use case drawn from public health data implies that cracking the code of geo-identifiers could advance access by helping everyday users formulate data inquiries within intuitive language.