Simple Summary: Cancer prevalence estimates are used to guide policymaking, from prevention to screening programs. However, these data are only available for 28% of the U.S. population. We used deep learning to analyze satellite imagery in order to predict cancer prevalence at a high spatial resolution. This method explained up to 64.37% of the variation in cancer prevalence. It could potentially be used to map cancer prevalence in entire regions for which these estimates are currently unavailable.

Abstract: The worldwide growth of cancer incidence can be explained in part by changes in the prevalence and distribution of risk factors. There are geographical gaps in the estimates of cancer prevalence, which could be filled with innovative methods. We used deep learning (DL) features extracted from satellite images to predict cancer prevalence at the census tract level in seven cities in the United States. We trained the model using detailed cancer prevalence estimates from 2018 available in the CDC (Centers for Disease Control and Prevention) 500 Cities project. Data from 3500 census tracts covering 14,483,366 inhabitants were included. Features were extracted from 170,210 satellite images with deep learning. This method explained up to 64.37% (median = 43.53%) of the variation in cancer prevalence. Satellite features are highly correlated with individual socioeconomic and health measures that are linked to cancer prevalence (age, smoking and drinking status, and obesity). A higher similarity between two environments is associated with better generalization of the model (p = 1 × 10⁻⁶). This method can be used to accurately estimate cancer prevalence at a high spatial resolution, without using surveys and at a fraction of the cost.
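To make the "variation explained" figure concrete, the following minimal sketch computes a coefficient of determination (R²) between observed and predicted tract-level prevalence. It assumes the reported percentages are R² values, which the abstract does not state explicitly. The function name `r_squared` and the four data points are hypothetical and invented for illustration only; they are not taken from the study.

```python
def r_squared(observed, predicted):
    """Share of the variance in `observed` that `predicted` accounts for."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Total variation of the observed values around their mean.
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    # Variation left unexplained by the predictions.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0:
        raise ValueError("observed values have no variation")
    return 1.0 - ss_residual / ss_total


# Hypothetical prevalence values (%) for four census tracts; NOT study data.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(f"Variation explained: {r_squared(observed, predicted):.2%}")
```

Under this reading, a value of 0.6437 would correspond to the "64.37% of the variation" reported for the best-performing case.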
               