Measurement error in continuous, normally distributed data is well studied in the literature. Measurement error in a binary outcome variable, however, remains under-studied. Misclassification is measurement error in categorical data, in which the observed category differs from the underlying one. In this study, we show through a Monte Carlo simulation that ignoring misclassification leads to non-ignorable biases in parameter estimates. To account for misclassification, we introduce a model with false-positive and false-negative misclassification parameters. Such a model not only estimates the underlying association between the dependent and independent variables but also provides information on the extent of the misclassification. The model is estimated by maximum likelihood using a Fisher scoring algorithm. A simulation study is conducted to evaluate the model's performance, and a real data example is given to demonstrate its usefulness. An R package is developed to aid the application of the model.
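To make the role of the two misclassification parameters concrete, the following is a minimal sketch of one common way such a likelihood can be written. The abstract does not specify the link function or parameterization, so the logistic link and the names `observed_prob`, `neg_log_likelihood`, `fp`, and `fn` are assumptions for illustration only, not the authors' implementation or the API of their R package.

```python
import math


def observed_prob(eta, fp, fn):
    """Probability that the OBSERVED outcome is 1.

    eta: linear predictor x'beta for the underlying outcome
    fp:  false-positive rate, P(observed 1 | true 0)  (assumed name)
    fn:  false-negative rate, P(observed 0 | true 1)  (assumed name)
    """
    # Assumption: the underlying outcome follows a logistic model.
    p_true = 1.0 / (1.0 + math.exp(-eta))
    # An observed 1 is either a true 1 recorded correctly
    # or a true 0 that was flipped to 1.
    return (1.0 - fn) * p_true + fp * (1.0 - p_true)


def neg_log_likelihood(beta, fp, fn, xs, ys):
    """Negative Bernoulli log-likelihood of the observed 0/1 outcomes.

    xs: list of covariate rows (include a leading 1.0 for an intercept)
    ys: list of observed binary outcomes
    """
    total = 0.0
    for x, y in zip(xs, ys):
        eta = sum(b * v for b, v in zip(beta, x))
        p = observed_prob(eta, fp, fn)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total
```

Setting `fp = fn = 0` recovers the ordinary logistic regression likelihood in this sketch, which is the sense in which a model that ignores misclassification is a restricted special case. Minimizing `neg_log_likelihood` over `beta`, `fp`, and `fn` with any general-purpose optimizer would stand in for the Fisher scoring step mentioned in the abstract.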
               