"Weighted Outlier Detection of High-Dimensional Categorical Data Using Feature Grouping"

We propose a weighted outlier mining method called WATCH to identify outliers in high-dimensional categorical datasets. WATCH is composed of two distinct modules: 1) feature grouping, based on a measure of correlation among features, and 2) outlier mining, which assigns a score to each object within each feature group. At the heart of WATCH is the feature grouping module, which partitions the features into multiple groups so that different aspects of the feature patterns can be discovered in each group. The outlier mining module then uses these groups to detect outliers in the high-dimensional categorical dataset. Apart from the number of outliers, which is specified by the user, WATCH avoids the need to tune any user-given parameter. We implement and evaluate WATCH using synthetic and real-world datasets. Our experimental results show that WATCH is a promising and practical algorithm for detecting outliers in high-dimensional categorical datasets, because it achieves high performance in terms of precision, efficiency, and interpretability.
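
The two-module structure described above can be illustrated with a minimal sketch. This is not the WATCH algorithm itself: the abstract does not give its correlation measure, grouping rule, weights, or scoring function, so everything below is an assumption made for illustration. The sketch uses normalized mutual information as the correlation measure, a greedy grouping rule with a hypothetical `threshold` parameter (which WATCH, per the abstract, would not require), group size as the weight, and the rarity of an object's value combination within a group as its score. All function names are hypothetical.

```python
from collections import Counter
from math import log


def entropy(col):
    """Shannon entropy (natural log) of a categorical column."""
    n = len(col)
    return -sum((c / n) * log(c / n) for c in Counter(col).values())


def mutual_info(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two categorical columns."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def group_features(columns, threshold=0.5):
    """Greedy grouping (assumed rule, not from the paper): a feature joins
    the first group whose first feature it is sufficiently correlated with;
    otherwise it starts a new group."""
    groups = []
    for j, col in enumerate(columns):
        for g in groups:
            seed = columns[g[0]]
            denom = max(entropy(col), entropy(seed))
            nmi = mutual_info(col, seed) / denom if denom > 0 else 0.0
            if nmi >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    return groups


def outlier_scores(rows, groups):
    """Weighted score per object: for each group, the rarity of the
    object's value combination, weighted by the group's share of features
    (an assumed weighting scheme)."""
    n = len(rows)
    n_features = len(rows[0])
    scores = [0.0] * n
    for g in groups:
        keys = [tuple(r[j] for j in g) for r in rows]
        freq = Counter(keys)
        weight = len(g) / n_features
        for i, k in enumerate(keys):
            scores[i] += weight * (1.0 - freq[k] / n)
    return scores


def top_k_outliers(rows, k, threshold=0.5):
    """Return the indices of the k highest-scoring objects."""
    columns = [list(c) for c in zip(*rows)]
    groups = group_features(columns, threshold)
    scores = outlier_scores(rows, groups)
    return sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)[:k]
```

As in the abstract, the number of outliers `k` is supplied by the user. Calling `top_k_outliers(rows, k)` on a list of equal-length tuples of categorical values returns the indices of the objects whose value combinations are rarest within the discovered feature groups.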

Keywords: weighted outlier; outlier mining; dimensional categorical; high dimensional; feature grouping

Journal Title: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Year Published: 2020
