Robust subspace methods for outlier detection in genomic data circumvents the curse of dimensionality

Applications of machine learning to inference problems in biology are dominated by the supervised learning problems of regression and classification, and by the unsupervised learning problems of clustering and low-dimensional projection for visualization. A class of problems that has received far less attention is the detection of outliers in datasets, which can arise from gross experimental, reporting or labelling errors. Outliers can also be small parts of a dataset that are functionally distinct from the majority of a population. Outliers are often identified by modelling the probability density of normal data and comparing data likelihoods against a threshold. This classical approach suffers from the curse of dimensionality, a serious problem for omics data, which are often very high-dimensional. We develop an outlier detection method based on structured low-rank approximation. The objective function includes a regularizer based on neighbourhood information captured in the graph Laplacian. Results on publicly available genomic data show that our method robustly detects outliers, whereas a density-based method fails even at moderate dimensions. Moreover, we show that the low-dimensional projection recovered by our method gives better clustering and visualization performance than popular dimensionality reduction techniques.
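
To make the classical density-based baseline mentioned in the abstract concrete, here is a minimal sketch of it in Python. It is not the authors' code or their proposed method. The single Gaussian density model, the 5% quantile threshold, the small ridge term and the synthetic data are all assumptions chosen for illustration, as are the function names `gaussian_log_density` and `density_outliers`.

```python
import numpy as np

def gaussian_log_density(X, mean, cov):
    """Log-density of each row of X under a Gaussian N(mean, cov)."""
    d = X.shape[1]
    diff = X - mean
    # Solve a linear system rather than inverting cov explicitly.
    sol = np.linalg.solve(cov, diff.T).T
    maha = np.sum(diff * sol, axis=1)  # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def density_outliers(X, quantile=0.05, ridge=1e-6):
    """Flag samples whose log-density falls below a quantile threshold.

    Assumption: one Gaussian fitted to all samples stands in for the
    density of normal data. The ridge term keeps the covariance
    invertible when the dimension is close to the sample count.
    """
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
    logp = gaussian_log_density(X, mean, cov)
    threshold = np.quantile(logp, quantile)
    return logp < threshold, logp

# Synthetic illustration (hypothetical data, not from the paper):
# 200 inliers around the origin plus 5 shifted samples, in 5 dimensions.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(200, 5))
shifted = rng.normal(8.0, 1.0, size=(5, 5))
X = np.vstack([inliers, shifted])

flags, logp = density_outliers(X, quantile=0.05)
print("flagged sample indices:", np.flatnonzero(flags))
```

In this low-dimensional toy setting the five shifted samples receive the lowest log-densities and are flagged. As the number of features approaches or exceeds the number of samples, the sample covariance becomes singular or badly conditioned and the density estimate stops being informative, which is one way the curse of dimensionality described above shows up in practice.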

Keywords: genomic data; dimensionality; outlier detection; robust subspace; curse of dimensionality

Journal Title: Royal Society Open Science
Year Published: 2020
