BACKGROUND AND OBJECTIVE
Diagnostic and prognostic prediction models often perform poorly when externally validated. We investigate how differences in the measurement of predictors across settings affect the discriminative power and transportability of a prediction model.

METHODS
Differences in predictor measurement between data sets can be described formally using a measurement error taxonomy. Using this taxonomy, we derive an expression relating variation in the measurement of a continuous predictor to the area under the receiver operating characteristic curve (AUC) of a logistic regression prediction model. This expression is used to demonstrate how variation in measurements across settings affects the out-of-sample discriminative ability of a prediction model. We illustrate these findings with a diagnostic prediction model, using example data from patients with suspected deep venous thrombosis.

RESULTS
When a predictor, such as D-dimer, is measured with more noise in one setting than in another, which we conceptualize as a difference in "classical" measurement error, the expected value of the AUC decreases. In contrast, constant "structural" measurement error does not affect the AUC of a logistic regression model, provided the magnitude of the error is the same for cases and noncases. As the differences in measurement methods between settings (and, in turn, the differences in measurement error structures) become more complex, it becomes increasingly difficult to predict how the AUC will differ between settings.

CONCLUSION
When a prediction model is applied in a setting different from the one in which it was developed, its discriminative ability can decrease, or even increase, if the magnitude or structure of the errors in predictor measurements differs between the two settings. This provides an important starting point for researchers seeking to understand how differences in measurement methods can affect the performance of a prediction model when externally validating it or implementing it in practice.
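To make the mechanism in the METHODS and RESULTS concrete, the following is a minimal sketch under a binormal model. The binormal assumption is made here for illustration only; the expression derived in the study may be more general.

% Assumed binormal model: predictor X is normal within outcome groups,
% X | D=0 ~ N(mu_0, sigma_0^2) and X | D=1 ~ N(mu_1, sigma_1^2).
\[
\mathrm{AUC}(X) = \Phi\!\left(\frac{\mu_1 - \mu_0}{\sqrt{\sigma_0^2 + \sigma_1^2}}\right).
\]
% Classical error: W = X + eps, eps ~ N(0, sigma_eps^2), independent of X and D.
% The error inflates both within-group variances, so the AUC shrinks:
\[
\mathrm{AUC}(W) = \Phi\!\left(\frac{\mu_1 - \mu_0}{\sqrt{\sigma_0^2 + \sigma_1^2 + 2\sigma_\varepsilon^2}}\right) \le \mathrm{AUC}(X).
\]
% Structural (constant) error: W = X + c shifts the case and noncase
% distributions by the same amount, so AUC(W) = AUC(X).

The constant shift leaves the AUC unchanged because the AUC is rank-based: adding the same constant to every measurement does not reorder cases relative to noncases, even though the model's calibration in the new setting would be affected.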
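A short simulation in the same spirit reproduces the qualitative pattern. This sketch is illustrative only: the prevalence, effect size, and error magnitudes are assumed values, not the DVT example data used in the study.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 20_000

def simulate(noise_sd=0.0, shift=0.0):
    # Single continuous predictor with assumed within-group distributions:
    # noncases ~ N(0, 1), cases ~ N(1, 1); prevalence 0.3 (all illustrative).
    y = rng.binomial(1, 0.3, size=n)
    x = rng.normal(loc=np.where(y == 1, 1.0, 0.0), scale=1.0)
    # Measured predictor: classical (random) error plus a structural shift.
    w = x + rng.normal(0.0, noise_sd, size=n) + shift
    return w.reshape(-1, 1), y

# "Development" setting: no additional measurement error.
X_dev, y_dev = simulate()
model = LogisticRegression().fit(X_dev, y_dev)

# "Validation" settings with different measurement error structures.
for label, kwargs in [("no extra error", {}),
                      ("classical error (sd = 1)", {"noise_sd": 1.0}),
                      ("structural shift (+0.5)", {"shift": 0.5})]:
    X_val, y_val = simulate(**kwargs)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(f"{label:>25}: AUC = {auc:.3f}")

Under these assumptions, the classical-error setting yields a clearly lower AUC than the development setting, while the constant structural shift leaves the AUC essentially unchanged, matching the RESULTS above.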