Multivariate density estimation methods typically work well in low dimensions, but extending them to high-dimensional data analytics has proven challenging. For density estimation in high-dimensional big data domains, the non-parametric Bayesian sequential partitioning (BSP) algorithm provides an efficient way of partitioning the sample space based on Bayesian inference. In this paper, we present a detailed analysis of BSP and provide a computationally efficient copula-transformed data structure and algorithm for density estimation in high-dimensional data analytics. Using the copula-transformed data structure, we implement density estimation for the marginals with both BSP and kernel density estimation (KDE). The data structures and algorithm are designed for efficient implementation in the parallel processing paradigms of Open Multi-Processing (OpenMP) and the Message Passing Interface (MPI).
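To make the two ingredients named above concrete, the following minimal sketch illustrates a copula-style transform and a marginal KDE. It is not the paper's data structure or algorithm: the abstract does not specify the transform, so the sketch assumes an empirical, rank-based map to pseudo-observations in (0, 1), and a Gaussian kernel with a normal-reference bandwidth. The function names `copula_transform` and `marginal_kde` are hypothetical, and BSP itself is not shown.

```python
import math
import random
import statistics


def copula_transform(data):
    """Map each column to pseudo-observations in (0, 1) using ranks / (n + 1).

    Assumption: an empirical (rank-based) transform; the paper may use another.
    """
    n = len(data)
    d = len(data[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # Indices of the rows, ordered by their value in column j.
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order, start=1):
            out[i][j] = rank / (n + 1)
    return out


def marginal_kde(sample, x):
    """Gaussian KDE of a 1-D sample evaluated at x.

    Assumption: normal-reference bandwidth h = 1.06 * stdev * n^(-1/5).
    """
    n = len(sample)
    h = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = n * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample) / norm


# Synthetic two-dimensional data: one Gaussian and one exponential marginal.
random.seed(0)
data = [[random.gauss(0, 1), random.expovariate(1.0)] for _ in range(500)]

u = copula_transform(data)                  # dependence structure on (0, 1)^2
first_marginal = [row[0] for row in data]   # marginal estimated separately
density_at_zero = marginal_kde(first_marginal, 0.0)
```

Separating the marginals from the dependence structure in this way means each marginal can be estimated independently, which is one reason such a layout lends itself to parallel processing.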
               