Evaluating Web-Based Automatic Transcription for Alzheimer Speech Data: Transcript Comparison and Machine Learning Analysis

dc.contributor.author: Soroski, Thomas
dc.contributor.author: Vasco, Thiago da Cunha
dc.contributor.author: Newton-Mason, Sally
dc.contributor.author: Granby, Saffrin
dc.contributor.author: Lewis, Caitlin
dc.contributor.author: Harisinghani, Anuj
dc.contributor.author: Rizzo, Matteo
dc.contributor.author: Conati, Cristina
dc.contributor.author: Murray, Gabriel
dc.contributor.author: Carenini, Giuseppe
dc.contributor.author: Field, Thalia S.
dc.contributor.author: Jang, Hyeju
dc.contributor.department: Computer Science, Luddy School of Informatics, Computing, and Engineering
dc.date.accessioned: 2024-11-26T16:39:15Z
dc.date.available: 2024-11-26T16:39:15Z
dc.date.issued: 2022
dc.description.abstract: Background: Speech data for medical research can be collected noninvasively and in large volumes. Speech analysis has shown promise in diagnosing neurodegenerative disease. To effectively leverage speech data, transcription is important, as there is valuable information contained in lexical content. Manual transcription, while highly accurate, limits the potential scalability and cost savings associated with language-based screening. Objective: To better understand the use of automatic transcription for classification of neurodegenerative disease, namely, Alzheimer disease (AD), mild cognitive impairment (MCI), or subjective memory complaints (SMC) versus healthy controls, we compared automatically generated transcripts against transcripts that went through manual correction. Methods: We recruited individuals from a memory clinic (“patients”) with a diagnosis of mild-to-moderate AD (n=44, 30%), MCI (n=20, 13%), SMC (n=8, 5%), as well as healthy controls (n=77, 52%) living in the community. Participants were asked to describe a standardized picture, read a paragraph, and recall a pleasant life experience. We compared transcripts generated using Google speech-to-text software to manually verified transcripts by examining transcription confidence scores, transcription error rates, and machine learning classification accuracy. For the classification tasks, logistic regression, Gaussian naive Bayes, and random forests were used. Results: The transcription software showed higher confidence scores (P<.001) and lower error rates (P>.05) for speech from healthy controls compared with patients. Classification models using human-verified transcripts significantly (P<.001) outperformed automatically generated transcript models for both spontaneous speech tasks. This comparison showed no difference in the reading task. Manually adding pauses to transcripts had no impact on classification performance. However, manually correcting both spontaneous speech tasks led to significantly higher performances in the machine learning models. Conclusions: We found that automatically transcribed speech data could be used to distinguish patients with a diagnosis of AD, MCI, or SMC from controls. We recommend a human verification step to improve the performance of automatic transcripts, especially for spontaneous tasks. Moreover, human verification can focus on correcting errors and adding punctuation to transcripts. However, manual addition of pauses is not needed, which can simplify the human verification step to more efficiently process large volumes of speech data.
dc.eprint.version: Final published version
dc.identifier.citation: Soroski T, Vasco T da C, Newton-Mason S, et al. Evaluating Web-Based Automatic Transcription for Alzheimer Speech Data: Transcript Comparison and Machine Learning Analysis. JMIR Aging. 2022;5(3):e33460. doi:10.2196/33460
dc.identifier.uri: https://hdl.handle.net/1805/44739
dc.language.iso: en_US
dc.publisher: JMIR
dc.relation.isversionof: 10.2196/33460
dc.relation.journal: JMIR Aging
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.source: Publisher
dc.subject: Speech data
dc.subject: Medical research
dc.subject: Speech analysis
dc.subject: Neurodegenerative disease
dc.subject: Transcription
dc.title: Evaluating Web-Based Automatic Transcription for Alzheimer Speech Data: Transcript Comparison and Machine Learning Analysis
dc.type: Article
Files
Original bundle
Name: Soroski2022Evaluating-CCBY.pdf
Size: 741.53 KB
Format: Adobe Portable Document Format