The major challenges during data acquisition in an environmental wireless sensor network (EWSN) architecture are the presence of outliers and missing data. Both are ubiquitous in EWSNs because of sensor failures, external noise, power depletion, and communication failures. Robust tensor principal component analysis (RTPCA) decomposes a noisy data tensor into a low-rank tensor and a sparse tensor, which can be exploited for data recovery in EWSNs: the low-rank component represents the intrinsic data tensor and the sparse component represents the gross outlier tensor. In this paper, a novel probabilistic outlier modelling scheme based on the multivariate Chebyshev's inequality hypothesis is proposed, which maps the sample population and the associated magnitudes of outliers to the spatio-temporal correlations in the acquired data. The inherent spatio-temporal and multi-attribute correlations in the EWSN data are established using singular value methods. A tensor nuclear norm (TNN), which extracts more temporal and multi-attribute correlations from the sensory data through block circulant matricization, is used as the minimization function in RTPCA. In the forest surveillance scenario, RTPCA achieves a reconstruction accuracy of approximately 90% for a dataset with a high missing ratio of 0.95 and with outliers of the largest magnitudes considered. For oceanographic data, in which the variance is lower, the simulations show similar performance across different outlier contaminations, although RTPCA still performs better than other matrix and tensor data completion methods in terms of reconstruction accuracy.
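To make the role of block circulant matricization concrete, the following is a minimal NumPy sketch, not the authors' implementation. It builds the block circulant matrix of a third-order tensor and checks that its nuclear norm equals the sum of the nuclear norms of the frontal slices after a Fourier transform along the third mode, which is why TNN-based methods can work slice by slice in the Fourier domain. The function names `bcirc` and `tnn_via_fft`, the tensor sizes, and the absence of any normalisation factor are assumptions made for illustration; some definitions of the TNN additionally divide by the size of the third mode, and the abstract does not say which convention is used.

```python
import numpy as np


def bcirc(X):
    """Block circulant matricization of an (n1, n2, n3) tensor.

    Block (i, j) of the result is the frontal slice X[:, :, (i - j) mod n3].
    (Hypothetical helper name, used only for this illustration.)
    """
    n1, n2, n3 = X.shape
    B = np.zeros((n1 * n3, n2 * n3), dtype=X.dtype)
    for i in range(n3):
        for j in range(n3):
            B[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = X[:, :, (i - j) % n3]
    return B


def tnn_via_fft(X):
    """Sum of nuclear norms of the Fourier-domain frontal slices.

    Assumption: no 1/n3 normalisation, so the value matches the
    nuclear norm of bcirc(X) exactly.
    """
    Xf = np.fft.fft(X, axis=2)  # DFT along the third mode
    return sum(
        np.linalg.svd(Xf[:, :, k], compute_uv=False).sum()
        for k in range(X.shape[2])
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((4, 3, 5))  # toy tensor; sizes chosen arbitrarily
    direct = np.linalg.svd(bcirc(X), compute_uv=False).sum()
    fourier = tnn_via_fft(X)
    print(np.isclose(direct, fourier))
```

The Fourier route needs only n3 small singular value decompositions instead of one decomposition of the full (n1·n3) × (n2·n3) matrix, which is the usual reason for evaluating this norm in the Fourier domain.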
               