Comparing Machine Learning Algorithms for Large-scale Crop Yield Prediction Using Agroecological Parameters and Pesticide Usage

Authors

  • Pavlo Lykhovyd Institute of Climate-Smart Agriculture of NAAS, Ukraine
  • Liudmyla Hranovska Institute of Climate-Smart Agriculture of NAAS, Ukraine
  • Oleksandr Averchev Kherson State Agrarian and Economic University, Ukraine
  • Oleksandr Zhuikov Kherson State Agrarian and Economic University, Ukraine
  • Gennadiy Karashchuk Kherson State Agrarian and Economic University, Ukraine
  • Dmytro Maksymov Institute of Climate-Smart Agriculture of NAAS, Ukraine

DOI:

https://doi.org/10.7546/CRABS.2026.01.17

Keywords:

data-driven agriculture, feature importance analysis, regression, ensemble modeling, sustainability

Abstract

Accurate crop yield prediction is essential to ensure global food security under changing climatic, environmental and agro-industrial conditions. This study presents a comparative analysis of machine learning algorithms for large-scale yield prediction using agroecological parameters and pesticide usage as key explanatory variables. A crop yield prediction dataset downloaded from Kaggle, comprising 28 242 records across multiple countries and crops, was preprocessed and modelled with 17 algorithms, including tree-based, regression-based, support vector, neural network, boosting, and ensemble approaches. Model performance was assessed using root mean square error (RMSE) and the coefficient of determination (R²). Modelling and visualization were performed in Python 3 using external libraries. Among global models, Extra Trees achieved the highest accuracy (R² = 0.991, RMSE = 8282.92 hg/ha), outperforming both gradient boosting and neural network approaches. Ensemble techniques, particularly stacking ensembles, provided comparable accuracy (R² = 0.990), confirming the robustness of tree-based methods. Feature importance analysis highlighted crop type (0.609) as the dominant predictor, while pesticides (0.110), average temperature (0.108), and rainfall (0.087) emerged as the most influential agro-environmental factors. Country-specific models achieved near-perfect predictive power (R² ≈ 1.0) for India, Brazil, Pakistan, Mexico, and Turkey. For Ukraine, the best-performing model was XGBoost (R² = 0.980), and yields averaged 43.4% below the global mean. Crop-level analysis identified potatoes, cassava, and sweet potatoes as the highest-yielding crops globally. These results demonstrate the superiority of tree-based and ensemble models for yield forecasting and emphasize the value of localized modelling strategies. The findings provide actionable insights for optimizing agricultural practices and guiding sustainable intensification policies.
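
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and the textbook definitions of RMSE and R²; the function names and the sample yields are hypothetical, chosen for illustration, and are not taken from the study's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (e.g. hg/ha)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Made-up yields in hg/ha, for illustration only (not from the study).
observed = [30000.0, 50000.0, 70000.0, 90000.0]
predicted = [32000.0, 48000.0, 71000.0, 89000.0]

print(f"RMSE = {rmse(observed, predicted):.2f} hg/ha")
print(f"R2   = {r2_score(observed, predicted):.3f}")
```

RMSE is reported in the units of the target, which is why the abstract gives it in hg/ha, while R² is unitless and so allows comparison across crops and countries with very different yield levels.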

Author Biographies

Pavlo Lykhovyd, Institute of Climate-Smart Agriculture of NAAS, Ukraine

Mailing Address:
Department of Irrigated Agriculture and Decarbonization of Agroecosystems
Institute of Climate-Smart Agriculture of NAAS
Vil. Khlibodarske, Odessa,
67667, Ukraine

E-mail: pavel.likhovid@gmail.com

Liudmyla Hranovska, Institute of Climate-Smart Agriculture of NAAS, Ukraine

Mailing Address:
Department of Irrigated Agriculture and Decarbonization of Agroecosystems
Institute of Climate-Smart Agriculture of NAAS
Vil. Khlibodarske, Odessa,
67667, Ukraine

E-mail: G_Ludmila15@ukr.net

Oleksandr Averchev, Kherson State Agrarian and Economic University, Ukraine

Mailing Address:
Department of Agriculture
Kherson State Agrarian and Economic University
23 Stritenska St, Kherson, Ukraine

E-mail: averchev1966@gmail.com

Oleksandr Zhuikov, Kherson State Agrarian and Economic University, Ukraine

Mailing Address:
Department of Agriculture
Kherson State Agrarian and Economic University
23 Stritenska St, Kherson, Ukraine

E-mail: docent6977@gmail.com

Gennadiy Karashchuk, Kherson State Agrarian and Economic University, Ukraine

Mailing Address:
Department of Manufacturing and Processing Agricultural Production Technologies Named after Academician V. G. Pelykh
Kherson State Agrarian and Economic University
23 Stritenska St, Kherson, Ukraine

E-mail: karashchuk_g@ksaeu.kherson.ua

Dmytro Maksymov, Institute of Climate-Smart Agriculture of NAAS, Ukraine

Mailing Address:
Department of Irrigated Agriculture and Decarbonization of Agroecosystems
Institute of Climate-Smart Agriculture of NAAS
Vil. Khlibodarske, Odessa, 67667, Ukraine

E-mail: maksimovsinger@gmail.com

Published

28-01-2026

How to Cite

[1]
P. Lykhovyd, L. Hranovska, O. Averchev, O. Zhuikov, G. Karashchuk, and D. Maksymov, “Comparing Machine Learning Algorithms for Large-scale Crop Yield Prediction Using Agroecological Parameters and Pesticide Usage”, C. R. Acad. Bulg. Sci., vol. 79, no. 1, pp. 136–144, Jan. 2026.

Section

Agricultural Sciences