With the boom in social media and health informatics, there is a pressing need for a powerful tool that supports comprehensive analysis of public and personal health information. In particular, such a tool should be able (1) to maximize the discovery of association rules among data items and (2) to handle rapidly growing data volumes. The FP-Growth algorithm is a prominent association rule learning method for exploring potential relations in a database, possibly with little or no prior knowledge. It has low time and space complexity, but it cannot handle negative association rules, which are necessary for comprehensive mining of health data. To enable comprehensive discovery of association rules, this study extends the FP-Growth algorithm to mine both positive and negative frequent patterns, yielding the PNFP-Growth framework. The extended approach also adopts a pruning strategy that filters out as many misleading patterns as possible by correlating the negative data items with the positive ones. Experiments were performed to evaluate PNFP-Growth on a public data set and on a database of real health examination records from thousands of people (collected within 5 years from the date of this publication). The results indicate that (1) PNFP-Growth discovers more patterns than the traditional FP-Growth while remaining highly efficient, and (2) the analysis of the health examination data is informative and agrees well with clinical practice, e.g., more than 30% of people suffering from hypertension also have high systolic pressure and liver problems.
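To make the notion of positive and negative frequent patterns concrete, the following is a minimal sketch, not the PNFP-Growth algorithm itself. It counts support by brute force rather than building an FP-tree, represents the absence of an item with a hypothetical `~item` token, and uses lift as an assumed stand-in for the correlation-based pruning mentioned above. The toy records and item names are invented for illustration only.

```python
from itertools import combinations


def extend_with_negatives(transactions, items):
    """Add a negated token ('~x') for every item absent from a transaction."""
    return [set(t) | {"~" + i for i in items if i not in t} for t in transactions]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (an FP-tree would avoid this enumeration)."""
    n = len(transactions)
    vocab = sorted(set().union(*transactions))
    result = {}
    for k in range(1, max_size + 1):
        for cand in combinations(vocab, k):
            # Skip contradictory candidates that contain both x and ~x.
            names = [c.lstrip("~") for c in cand]
            if len(set(names)) < len(names):
                continue
            support = sum(1 for t in transactions if set(cand) <= t) / n
            if support >= min_support:
                result[frozenset(cand)] = support
    return result


def lift(itemsets, a, b):
    """Lift > 1 suggests a and b co-occur more often than if independent."""
    return itemsets[frozenset([a, b])] / (
        itemsets[frozenset([a])] * itemsets[frozenset([b])]
    )


# Hypothetical toy records; not data from the study.
items = {"hypertension", "high_systolic", "liver_issue"}
records = [
    {"hypertension", "high_systolic"},
    {"hypertension", "high_systolic"},
    {"hypertension"},
    {"liver_issue"},
]
extended = extend_with_negatives(records, items)
patterns = frequent_itemsets(extended, min_support=0.5)

# Keep only pairs whose members are positively correlated (assumed criterion).
kept = {
    p: s
    for p, s in patterns.items()
    if len(p) == 2 and lift(patterns, *sorted(p)) > 1.0
}
```

In this sketch a pattern such as `{"hypertension", "~liver_issue"}` is a negative pattern: it records that one item tends to appear when another is absent, which plain FP-Growth over positive items alone would not report.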
               