IU Indianapolis ScholarWorks

Browsing by Subject "Automatic speech recognition"

Now showing 1 - 2 of 2
  • Eyes-free interaction with aural user interfaces
    (2015-04-11) Rohani Ghahari, Romisa; Bolchini, Davide
    Existing web applications force users to focus their visual attention on mobile devices while browsing content and services on the go (e.g., while walking or driving). To support mobile, eyes-free web browsing and minimize interaction with devices, designers can leverage the auditory channel. Although acoustic interfaces have proven effective at reducing visual attention, designing aural information architectures for the web remains challenging because of the web's non-linear structure. To address this problem, we introduce and evaluate techniques to remodel existing information architectures as "playlists" of web content, called aural flows. The use of aural flows in mobile web browsing is exemplified by ANFORA News, a semi-aural mobile site designed to facilitate browsing large collections of news stories. An exploratory study involving frequent news readers (n=20) investigated the usability of and navigation experience with ANFORA News in a mobile setting. The initial evidence suggests that aural flows are a promising paradigm for supporting eyes-free mobile navigation on the go. Interacting with aural flows, however, requires users to select interface buttons, tethering visual attention to the mobile device even when it is unsafe. To reduce visual interaction with the screen, we also explore the use of simulated voice commands to control aural flows. In a study, 20 participants browsed aural flows either through a visual interface or through a visual interface augmented with voice commands. The results suggest that using voice commands halves the time spent looking at the device while yielding walking speeds, system usability, and cognitive effort ratings similar to those obtained with buttons. To test the potential of aural flows in a more distracting context, a study (n=60) was conducted in a driving simulation lab. Each participant drove through three driving scenarios of increasing complexity: low, moderate, and high. Within each complexity level, participants were exposed to one of three aural application conditions: no device, voice-controlled aural flows (ANFORADrive), or an alternative solution on the market (Umano). The results suggest that, compared to the no-device condition, voice-controlled aural flows do not affect distraction, overall safety, cognitive effort, driving performance, or driving behavior. (A schematic sketch of the aural-flow model appears after this listing.)
  • Post-Processing Automatic Transcriptions with Machine Learning for Verbal Fluency Scoring
    (Elsevier, 2023) Bushnell, Justin; Unverzagt, Frederick; Wadley, Virginia G.; Kennedy, Richard; Del Gaizo, John; Clark, David Glenn; Neurology, School of Medicine
    Objective: To compare verbal fluency scores derived from manual transcriptions with those obtained using automatic speech recognition enhanced with machine learning classifiers.
    Methods: Using Amazon Web Services, we automatically transcribed verbal fluency recordings from 1,400 individuals who performed both animal and letter-F verbal fluency tasks. We manually adjusted the timings and contents of the automatic transcriptions to obtain "gold standard" transcriptions. To make automatic scoring possible, we trained machine learning classifiers to distinguish between valid and invalid utterances. We then calculated and compared verbal fluency scores from the manual and automatic transcriptions.
    Results: For both the animal and letter fluency tasks, we achieved good separation of valid versus invalid utterances. Verbal fluency scores calculated from the automatic transcriptions correlated highly with those calculated after manual correction.
    Conclusion: Many techniques for scoring verbal fluency word lists require accurate transcriptions with word timings. We show that machine learning methods can be applied to improve off-the-shelf automatic speech recognition for this purpose. These automatically derived scores may be satisfactory for some applications, but low correlations among some of the scores indicate the need for improvement in automatic speech recognition before a fully automatic approach can be implemented reliably. (A schematic sketch of this scoring step appears after this listing.)
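
The aural-flow idea in the first abstract can be made concrete with a minimal sketch: non-linear web content remodeled as a linear playlist whose position is steered by a few voice commands, so the listener never needs to look at the screen. The Python below is an illustrative assumption of the paradigm, not the ANFORA News or ANFORADrive implementation; the class and command names are hypothetical.

  from dataclasses import dataclass
  from typing import List

  @dataclass
  class Story:
      title: str
      body: str

  @dataclass
  class AuralFlow:
      # Hypothetical model of an aural flow: a linear "playlist" over
      # web content, with a cursor advanced by voice commands.
      stories: List[Story]
      index: int = 0

      def current(self) -> Story:
          return self.stories[self.index]

      def handle(self, command: str) -> str:
          # Map a recognized voice command onto a playlist action.
          if command == "next" and self.index < len(self.stories) - 1:
              self.index += 1
          elif command == "previous" and self.index > 0:
              self.index -= 1
          return self.current().title  # text that a TTS engine would read aloud

  flow = AuralFlow([Story("Local news", "..."), Story("Weather", "...")])
  print(flow.handle("next"))      # -> "Weather"
  print(flow.handle("previous"))  # -> "Local news"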
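
For the second abstract, a minimal sketch of the scoring step, assuming the ASR output has already been reduced to word tokens with timings: a validity classifier (here stubbed as a vocabulary lookup; the paper trains machine learning classifiers instead) separates valid from invalid utterances, and the fluency score counts distinct valid words. All names are illustrative, not the authors' code.

  from dataclasses import dataclass
  from typing import Callable, List

  @dataclass
  class Utterance:
      word: str     # token from the ASR transcript
      start: float  # onset in seconds
      end: float    # offset in seconds

  def fluency_score(utterances: List[Utterance],
                    is_valid: Callable[[Utterance], bool]) -> int:
      # Count distinct valid words, the conventional verbal fluency score;
      # repetitions and invalid utterances do not add to the total.
      seen = set()
      for u in utterances:
          w = u.word.lower()
          if is_valid(u) and w not in seen:
              seen.add(w)
      return len(seen)

  # Stand-in for the trained classifier: a simple category-membership check.
  ANIMALS = {"dog", "cat", "horse", "dolphin"}

  transcript = [Utterance("dog", 0.5, 0.9), Utterance("um", 1.2, 1.4),
                Utterance("cat", 2.0, 2.3), Utterance("dog", 3.1, 3.5)]

  print(fluency_score(transcript, lambda u: u.word.lower() in ANIMALS))  # -> 2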