The distribution and abundance of Phlebotomus papatasi, the primary vector of zoonotic cutaneous leishmaniasis in most arid and semi-arid countries, are a major public health challenge. This study compares several approaches to modelling the spatial distribution of the species in an endemic region of the disease in Golestan province, northeast Iran. The aim is to help decision makers target interventions. We developed a geo-database of Phlebotominae sand flies collected from different parts of the study region using sticky paper traps coated with castor oil. Ph. papatasi was present at 44 of the 142 sampling sites. We also gathered and prepared data on related environmental factors in a GIS framework, including topography, weather variables, distance to main rivers, and remotely sensed data such as normalized difference vegetation index and land surface temperature (LST). Three classifiers, (vanilla) logistic regression, random forest and support vector machine (SVM), were compared for predicting presence/absence of the vector. Predictive performance was compared on an independent dataset using the area under the ROC curve (AUC) and the Kappa statistic. All three models successfully predicted the presence/absence of the vector; however, the SVM classifier (Accuracy = 0.906, AUC = 0.974, Kappa = 0.876) outperformed the other classifiers in predictive accuracy. It was also the most sensitive (85%) and the most specific (93%) model. Sensitivity analysis of the most accurate model (i.e. SVM) revealed that slope, nighttime LST in October and mean temperature of the wettest quarter were among the most important predictors. The findings suggest that machine learning techniques, especially the SVM classifier, when coupled with GIS and remote sensing data, can be a useful and cost-effective way to identify habitat suitability for the species.
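For readers unfamiliar with the evaluation statistics named above, the following is a minimal sketch of how accuracy, sensitivity, specificity, Cohen's Kappa and AUC can be computed for a presence/absence classifier. The abstract does not say how the authors computed them, so this is an illustration only: the function names, the 0.5 threshold, and the toy labels and scores are hypothetical and are not the study's data or code.

```python
from typing import Dict, Sequence


def summary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity and Cohen's Kappa for 0/1 labels.

    Assumes both classes occur in y_true and that the predictions are not
    all identical to a single class (otherwise a denominator is zero).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # presence predicted as presence
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # absence predicted as absence
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # absence predicted as presence
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # presence predicted as absence
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (accuracy - p_chance) / (1.0 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random presence site scores higher
    than a random absence site (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = vector present, 0 = vector absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

metrics = summary_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

Accuracy, sensitivity, specificity and Kappa depend on the chosen threshold, whereas AUC is computed from the raw scores and summarises performance across all thresholds, which is one reason the two kinds of statistic are usually reported together.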
               