Disease registries, surveillance data, and other data sets with extremely large sample sizes are increasingly available and provide population-based information on disease incidence, survival probability, or other important public health characteristics. Such information can be leveraged in studies that collect detailed measurements but have smaller sample sizes. In contrast to recent proposals that formulate the additional information as constraints in optimization problems, we develop a general framework for constructing simple estimators that update the usual regression estimators with functionals of the data that incorporate the additional information. We consider general settings that include nuisance parameters in the auxiliary information, non-i.i.d. data such as those from case-control studies, and semiparametric models with infinite-dimensional parameters, which are common in survival analysis. Details of several important data and sampling settings are provided, with numerical examples. This article is protected by copyright. All rights reserved.
               