Natural Language Processing of Stories

Rittichier, Kaley J.

Files

Rittichier_Thesis_Final.pdf (348.52 KB)

Date

2022-05

Authors

Rittichier, Kaley J.

Language

American English

Committee Chair

Mukhopadhyay, Snehasis

Committee Members

Durresi, Arjan
Mohler, George

Degree

M.S.

Degree Year

2022

Department

Computer & Information Science

Grantor

Purdue University

Abstract

In this thesis, I address the task of computationally processing stories, with a focus on multidisciplinary applications in Digital Humanities and Cultural Analytics. To that end, I collect, clean, investigate, and make predictions from two datasets. The first is a dataset of 2,302 open-source literary works, all collected from Project Gutenberg, categorized by the time period in which they are set. The time-period labels were derived by collecting and inspecting Library of Congress subject classifications, Wikipedia categories, and literary factsheets from SparkNotes. The second is a dataset of 6,991 open-source literary works categorized by the hierarchical location in which each work is set; these labels were constructed from Library of Congress subject classifications and SparkNotes factsheets. These datasets are the first of their kind and can advance our understanding of 1) how settings are presented in stories and 2) how settings affect our understanding of those stories.

Description

Indiana University-Purdue University Indianapolis (IUPUI)

Keywords

Natural Language Processing, Stories, Story Setting, Digital Humanities, Cultural Analytics

Rights

Attribution 4.0 International

Type

Thesis

Permanent Link

https://hdl.handle.net/1805/29175
http://dx.doi.org/10.7912/C2/2925

Collections

Computer & Information Science Department Theses and Dissertations
