Outliers are a critical factor affecting the accuracy of data-based predictions and other data-based processing; they must therefore be detected effectively and as early as possible to improve the credibility of the data. In recent years, a large number of outlier detection approaches have been proposed for static and precise data; however, this prior work did not consider the uncertainty and weight information of each item. Moreover, traditional outlier detection approaches take only the deviation degree of each data element as the standard for determining outliers; as a result, the detected outliers do not fit the definition of an outlier (i.e., rarely appearing and different from most of the other data). To address these problems, this paper proposes a minimal weighted infrequent itemset mining-based outlier detection approach for uncertain data streams, called MWIFIM–OD–UDS. It is designed to effectively detect implicit outliers, which are characterized by rarely occurring itemsets together with their uncertainty and weight, while accounting for the characteristics of the data stream. In particular, a matrix structure-based approach called MWIFIM–UDS is proposed to mine the minimal weighted infrequent itemsets (MWiFIs) from an uncertain data stream; the MWIFIM–OD–UDS method then builds on the mined MWiFIs and the designed deviation indexes. Experimental results show that the proposed MWIFIM–OD–UDS method outperforms the frequent itemset mining-based outlier detection methods FindFPOF and LFP in terms of runtime and detection accuracy.
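The general idea of flagging transactions that contain minimal weighted infrequent itemsets can be illustrated with a small sketch. This is not the MWIFIM–UDS matrix structure or the paper's deviation indexes, neither of which is specified in the abstract. It is a brute-force illustration under stated assumptions: each transaction maps an item to an existence probability, the weighted support of an itemset is its mean item weight times its expected support, and the outlier score is simply the summed probability that a transaction contains each mined itemset. All function names, the `max_len` bound, and the sample data are hypothetical.

```python
from itertools import combinations


def expected_weighted_support(itemset, transactions, weights):
    """Assumed definition: mean item weight times expected support."""
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    expected = 0.0
    for t in transactions:
        p = 1.0
        for item in itemset:
            p *= t.get(item, 0.0)  # an absent item has probability 0
        expected += p
    return mean_weight * expected


def minimal_infrequent_itemsets(transactions, weights, min_support, max_len=3):
    """Level-wise search for infrequent itemsets with no infrequent proper subset."""
    items = sorted({i for t in transactions for i in t})
    minimal = []
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            # A candidate containing an already-found itemset is not minimal.
            if any(set(m) <= set(cand) for m in minimal):
                continue
            if expected_weighted_support(cand, transactions, weights) < min_support:
                minimal.append(cand)
    return minimal


def outlier_score(transaction, itemsets):
    """Hypothetical score: summed probability of containing each mined itemset."""
    score = 0.0
    for m in itemsets:
        p = 1.0
        for item in m:
            p *= transaction.get(item, 0.0)
        score += p
    return score


# Hypothetical window of an uncertain stream: item -> existence probability.
transactions = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.8, "b": 0.9},
    {"a": 0.9, "b": 0.7},
    {"a": 0.6, "c": 0.9},
]
weights = {"a": 1.0, "b": 1.0, "c": 1.0}

mined = minimal_infrequent_itemsets(transactions, weights, min_support=1.0)
scores = [outlier_score(t, mined) for t in transactions]
```

In this toy window only the last transaction contains the rare item `c`, so it is the only one with a non-zero score. For a real stream, `transactions` would be restricted to the current sliding window and re-mined or updated incrementally as the window advances.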
               