The amount of multimedia data has grown rapidly because of improvements in data collection and storage technologies. Association rule mining (ARM) is a data mining technique widely used to extract useful information from data warehouses. In real-world big data applications, fast and effective data mining algorithms are increasingly valuable. In this paper, we propose DCE-Miner, a fast association rule mining algorithm with low memory requirements based on the MapReduce framework. In the precomputation phase, we split large datasets into equal-sized smaller ones using a data division method. In the frequent k-itemset mining phase, the mappers read the small datasets and distribute the data to reducers based on the closed-set characteristics associated with each partition. The reducers use bitmaps to accelerate computation and store the potentially frequent 2-itemsets to reduce future computation. Extensive experimental results show that on large-scale datasets with up to 40 million transactions, DCE-Miner achieves better performance and is more robust with respect to dataset size and support level than current algorithms.
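The bitmap idea mentioned for the reducers can be illustrated with a minimal single-process sketch. This is not DCE-Miner's implementation: the function names `build_bitmaps` and `frequent_2_itemsets`, the use of Python integers as bitmaps, and the treatment of `min_support` as an absolute transaction count are all assumptions made for illustration, and the MapReduce partitioning and closed-set distribution steps are omitted entirely.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains the item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def frequent_2_itemsets(transactions, min_support):
    """Return {(a, b): support} for item pairs found in at least min_support transactions.

    min_support is an absolute count here (an assumption of this sketch).
    """
    bitmaps = build_bitmaps(transactions)
    # Only items that are frequent on their own can form a frequent pair.
    frequent_items = sorted(
        item for item, bits in bitmaps.items()
        if bin(bits).count("1") >= min_support
    )
    result = {}
    for a, b in combinations(frequent_items, 2):
        # Bitwise AND keeps the transactions containing both items;
        # counting the set bits gives the support of the pair.
        support = bin(bitmaps[a] & bitmaps[b]).count("1")
        if support >= min_support:
            result[(a, b)] = support
    return result
```

Calling `frequent_2_itemsets([["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]], 2)` counts each pair with one AND and one bit count instead of rescanning the transactions, which is the kind of saving a bitmap representation is generally used for.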