Hyper-structure mining of frequent patterns in uncertain data streams
Date
Language
Embargo Lift Date
Committee Members
Degree
Degree Year
Department
Grantor
Journal Title
Journal ISSN
Volume Title
Found At
Abstract
Data uncertainty is inherent in many real-world applications such as sensor monitoring systems, location-based services, and medical diagnostic systems. Moreover, many real-world applications are now capable of producing continuous, unbounded data streams. During the recent years, new methods have been developed to find frequent patterns in uncertain databases; nevertheless, very limited work has been done in discovering frequent patterns in uncertain data streams. The current solutions for frequent pattern mining in uncertain streams take a FP-tree-based approach; however, recent studies have shown that FP-tree-based algorithms do not perform well in the presence of data uncertainty. In this paper, we propose two hyper-structure-based false-positive-oriented algorithms to efficiently mine frequent itemsets from streams of uncertain data. The first algorithm, UHS-Stream, is designed to find all frequent itemsets up to the current moment. The second algorithm, TFUHS-Stream, is designed to find frequent itemsets in an uncertain data stream in a time-fading manner. Experimental results show that the proposed hyper-structure-based algorithms outperform the existing tree-based algorithms in terms of accuracy, runtime, and memory usage.