
Gramian matrix data collection-based random forest classification for predictive analytics with big data

Prediction is the process of analyzing current and past events to anticipate future ones. Predicting future conditions remains an important step in many applications for minimizing risk. Several techniques have been developed for predictive analysis with big data, but they have not achieved accurate prediction at low complexity when handling large volumes of data. To improve prediction accuracy at low complexity, a Gramian symmetric data collection-based random forest bivariate regression and classification (GSDC-RFBRC) technique is developed. First, a large volume of data is collected from the dataset. A Gramian symmetric matrix then stores the data in its rows and columns. Next, classification and regression are carried out using random decision forests to predict future outcomes. The regression step measures the relationship between a dependent variable (i.e., the outcome) and the independent variables (i.e., the data) through bivariate correlation. The random decision forest constructs a number of decision trees for classification based on this correlation. Finally, it combines the decision trees by applying a voting scheme, and the majority vote of the classification results is taken as the prediction to achieve high accuracy. Experimental evaluation covers prediction accuracy, prediction time, false-positive rate and space complexity with respect to the number of data items (i.e., files). The results confirm that the proposed GSDC-RFBRC technique improves prediction accuracy and reduces prediction time, false-positive rate and space complexity.
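
The pipeline described in the abstract (Gramian matrix, bivariate correlation, tree ensemble, majority vote) can be illustrated with a minimal Python sketch. This is not the authors' GSDC-RFBRC implementation, and several choices are assumptions made only for illustration: the Gramian matrix is taken in its standard linear-algebra sense as G = XᵀX, bivariate correlation is taken to be the Pearson coefficient, and each "tree" is reduced to a one-split stump trained on a bootstrap sample. All function names are hypothetical.

```python
import random
from collections import Counter
from statistics import mean


def gram_matrix(rows):
    """Assumed form G = X^T X over the feature columns; G is symmetric."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]


def pearson(xs, ys):
    """Pearson correlation between two sequences; 0.0 if either is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def fit_stump(rows, labels):
    """One-split stand-in for a decision tree (an assumption, not the paper's tree).

    Splits on the feature most correlated with the label, at that feature's mean.
    Returns (feature index, threshold, True if the correlation is non-negative).
    """
    columns = list(zip(*rows))
    corrs = [pearson(col, labels) for col in columns]
    best = max(range(len(columns)), key=lambda j: abs(corrs[j]))
    return best, mean(columns[best]), corrs[best] >= 0


def predict_stump(stump, row):
    """Predict class 1 when the row falls on the side the correlation points to."""
    feature, threshold, positive = stump
    return int((row[feature] > threshold) == positive)


def fit_forest(rows, labels, n_trees=5, seed=0):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two features, binary outcome.
rows = [[1.0, 3.0], [2.0, 4.0], [8.0, 7.0], [9.0, 6.0]]
labels = [0, 0, 1, 1]
print(gram_matrix(rows))
print(predict_forest(fit_forest(rows, labels), [10.0, 10.0]))
```

A production system would more plausibly rely on an established random forest library with full-depth trees; the stumps here only keep the voting logic visible.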

Keywords: prediction accuracy; collection based; data collection; prediction; big data; classification

Journal Title: Soft Computing
Year Published: 2019
