A Framework for Text Processing and Supporting Access to Collections of Digitized Historical Newspapers
Date
2007
Language
American English
Abstract
Large quantities of historical newspapers are being digitized and OCRd. We describe a framework for processing the OCRd text to identify articles and extract metadata for them. We describe the article schema and provide examples of features that facilitate automatic indexing of them. For this processing, we employ lexical semantics, structural models, and community content. Furthermore, we describe visualization and summarization techniques that can be used to present the extracted events.
Cite As
Allen, R. B., Japzon, A., Achananuparp, P., & Lee, K. J. (2007). A framework for text processing and supporting access to collections of digitized historical newspapers. In Human Interface and the Management of Information. Interacting in Information Environments (pp. 235-244). Springer Berlin Heidelberg.
Type
Book chapter