Ensemble machine learning techniques using computer simulation data for wild blueberry yield prediction

Precision agriculture is challenging to achieve. Several studies have forecast agricultural yields using machine learning algorithms (MLA), but few have used ensemble machine learning algorithms (EMLA). In the current study, we used a dataset generated by a computer simulation program, and meteorological data obtained over 30 years ago from Maine, United States (USA). The primary goal of this research is to improve yield forecast accuracy using the best features, to help address hunger challenges. We designed stacking regression (SR) and cascading regression (CR) with a novel combination of MLA based on the wild blueberry dataset. We used features that indicated the best regulation for wild blueberry agroecosystems. Four feature selection techniques were applied: variance inflation factor (VIF), sequential forward feature selection (SFFS), sequential backward elimination feature selection (SBEFS), and extreme gradient boosting based on feature importance (XFI). We applied Bayesian optimization to popular MLA to obtain the best hyperparameters for accurate wild blueberry yield prediction. The SR used a two-layer structure: level-0 contained a light gradient boosting machine (LGBM), gradient boost regression (GBR), and extreme gradient boosting (XGBoost); level-1 produced the output prediction using Ridge regression. The CR topology uses the same MLA as SR, but arranged in series: each stage receives the newest prediction as an additional input, and the previous prediction is removed at each stage. We assessed the outcomes of many techniques, CR, and SR in terms of root mean square error (RMSE) and coefficient of determination (R²). In the results, the proposed SR showed the best performance, with 0.984 R² and 179.898 RMSE, compared with another study that reported 0.938 R² and 343.026 RMSE, on the seven features selected by XFI. The SR achieved its highest R², 0.985, both on all features and on the features selected by SBEFS. Our SR outperformed CR, many other techniques, and another study on wild blueberry yield prediction.

Keywords: blueberry yield; machine learning; wild blueberry; prediction; yield prediction

Journal Title: IEEE Access
Year Published: 2022
