Abstract Many scientific disciplines rely on dimensionality reduction techniques to handle multivariate data sets at lower computational cost. In particular, Principal Component Analysis (PCA) is a popular method for discovering the underlying low-dimensional manifolds in high-dimensional data sets. PCA-derived manifolds are formed by projecting the original data set onto a new basis spanned by the first few Principal Components (PCs). In many cases, it is crucial that the manifold maintains certain topological characteristics for its subsequent analysis and parameterization. Two examples of such desired characteristics are avoiding overlap and avoiding steep gradients of a dependent variable. In this paper, we present PCAfold, a Python software package that can be used to generate, improve and analyze low-dimensional manifolds. Our software incorporates data preprocessing, clustering and sampling techniques, uses PCA as a data reduction technique and utilizes a novel approach to assess the quality of the obtained low-dimensional manifolds.
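
The projection step described in the abstract can be illustrated with a minimal sketch. It uses plain NumPy rather than the PCAfold API, which is not shown here; the function name `pca_project` and the synthetic data set are assumptions made purely for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X (observations x variables) onto its first n_components PCs.

    Hypothetical helper for illustration only; this is not a PCAfold function.
    """
    # Center each variable so the covariance matrix describes variation about the mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the variables (columns).
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so that the directions of largest variance come first.
    order = np.argsort(eigvals)[::-1]
    basis = eigvecs[:, order[:n_components]]
    # Coordinates of each observation in the basis spanned by the leading PCs.
    return X_centered @ basis, basis


# Synthetic example (assumed data): 5 variables lying close to a 2D plane.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

Z, basis = pca_project(X, n_components=2)
```

Here `Z` holds the two-dimensional manifold coordinates of each observation and `basis` holds the corresponding PCs as columns. Preprocessing choices such as scaling are left out of this sketch, although the abstract notes that the package includes data preprocessing.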
               