Standard corrections for missing data rely on the strong and generally untestable assumption of missing at random. Heckman-type selection models relax this assumption, but have been criticized because they typically require a selection variable which predicts non-response but not the outcome of interest, and can impose bivariate normality. In this paper we illustrate an application using a copula methodology which does not rely on bivariate normality. We implement this approach in data on HIV testing at a demographic surveillance site in rural South Africa which are affected by non-response. Randomized incentives are the ideal selection variable, particularly when implemented ex ante to deal with potential missing data. However, elements of survey design may also provide a credible method of correcting for non-response bias ex post. For example, although not explicitly randomized, allocation of food gift vouchers during our survey was plausibly exogenous and substantially raised participation, as did effective survey interviewers. Based on models with receipt of a voucher and interviewer identity as selection variables, our results imply that 37% of women in the population under study are HIV positive, compared to imputation-based estimates of 28%. For men, confidence intervals are too wide to reject the absence of non-response bias. Consistent results obtained when comparing different selection variables and error structures strengthen these conclusions. Our application illustrates the feasibility of the selection model approach when combined with survey metadata.
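To make the non-response mechanism concrete, the following is a minimal simulation sketch, not the paper's copula estimator. It assumes bivariate normal errors purely for convenience (the very assumption the copula approach relaxes). The function name, the parameter values (`rho`, `voucher_effect`, the thresholds), and the 50% voucher allocation are all hypothetical and chosen for illustration only. It shows two things: prevalence among respondents is biased when the unobserved drivers of participation are correlated with HIV status, and a selection variable that raises participation without affecting the outcome shifts observed prevalence in a way a selection model can exploit.

```python
import math
import random


def simulate_nonresponse(n=100_000, rho=-0.5, voucher_effect=0.8, seed=0):
    """Simulate HIV-test participation with non-ignorable non-response.

    All parameter values are illustrative assumptions, not estimates
    from the paper.
    """
    rng = random.Random(seed)
    positives = 0
    # Per-arm counters: [respondents, positive respondents, people in arm]
    arms = {True: [0, 0, 0], False: [0, 0, 0]}

    for _ in range(n):
        # Correlated standard normal errors (assumed bivariate normal here).
        e = rng.gauss(0.0, 1.0)
        u = rho * e + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)

        # Selection variable: affects participation only, not HIV status.
        voucher = rng.random() < 0.5

        hiv_positive = e > 0.4
        responds = 0.2 + (voucher_effect if voucher else 0.0) + u > 0.0

        positives += hiv_positive
        arms[voucher][2] += 1
        if responds:
            arms[voucher][0] += 1
            arms[voucher][1] += hiv_positive

    respondents = arms[True][0] + arms[False][0]
    observed_pos = arms[True][1] + arms[False][1]
    return {
        "true_prevalence": positives / n,
        "observed_prevalence": observed_pos / respondents,
        "response_rate_voucher": arms[True][0] / arms[True][2],
        "response_rate_no_voucher": arms[False][0] / arms[False][2],
        "observed_prevalence_voucher": arms[True][1] / arms[True][0],
        "observed_prevalence_no_voucher": arms[False][1] / arms[False][0],
    }


if __name__ == "__main__":
    for name, value in simulate_nonresponse().items():
        print(f"{name}: {value:.3f}")
```

With a negative `rho`, HIV-positive individuals are less likely to participate, so the respondent-only prevalence understates the true value. The voucher arm has higher participation and an observed prevalence closer to the truth; that contrast is the kind of variation a selection model uses to correct the estimate.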
               