
An efficient predictive analytics system for high dimensional big data


Abstract: The rapid growth of high-dimensional big data makes it increasingly challenging for data scientists to extract valuable knowledge efficiently. Traditional data mining techniques are not suited to processing big data, and predictive analytics has grown in prominence alongside its emergence. This paper proposes an efficient predictive analytics system for high-dimensional big data, built by enhancing the scalable random forest (SRF) algorithm on the Apache Spark platform. SRF is enhanced by optimizing its hyperparameters, and prediction performance is improved by reducing the dimensionality of the data. The effectiveness of the proposed system is examined on five real-world datasets. Experimental results demonstrate that the proposed system achieves highly competitive performance compared with the RF algorithm implemented in Spark MLlib.

Keywords: big data; high dimensional; system; predictive analytics; efficient predictive; dimensional big

Journal Title: Journal of King Saud University - Computer and Information Sciences
Year Published: 2019
