Ultrahigh dimensional data are collected in many scientific fields where the predictor dimension is often much higher than the sample size. To reduce the ultrahigh dimensionality effectively, many marginal screening approaches have been developed. However, existing screening methods may miss important predictors that are marginally independent of the response, or select unimportant ones because of their high correlations with the important predictors. Iterative screening procedures have been proposed to address this issue, but studying their theoretical properties is not straightforward. Penalized regression, meanwhile, is neither computationally efficient nor numerically stable when the predictors are ultrahigh dimensional. To overcome these drawbacks, Wang (2009) proposed a novel Forward Regression (FR) approach for linear models. However, nonlinear dependence between the predictors and the response is often present in ultrahigh dimensional problems. In this paper, we extend FR to develop a Forward Additive Regression (FAR) method for selecting significant predictors in ultrahigh dimensional nonparametric additive models. We establish the screening consistency of the FAR method and examine its finite-sample performance by Monte Carlo simulations. Our simulations indicate that, compared with marginal screening methods, the FAR is much more effective at identifying important predictors for additive models. When the predictors are highly correlated, the FAR even outperforms iterative marginal screening methods such as iterative nonparametric independence screening (INIS). We also apply the FAR method to a real data analysis in genetic studies.
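To fix ideas, forward selection for an additive model can be sketched as follows. This is a minimal illustration, not the authors' implementation: a low-degree polynomial basis stands in for the spline bases typically used in additive models, and the BIC stopping rule is an assumption; the function name `forward_additive_regression` is hypothetical.

```python
import numpy as np

def basis(x, degree=3):
    # Polynomial basis expansion of one predictor; a crude stand-in
    # for the spline bases used in nonparametric additive models.
    return np.column_stack([x**d for d in range(1, degree + 1)])

def forward_additive_regression(X, y, max_steps=None):
    """Greedy forward selection of additive components (illustrative sketch)."""
    n, p = X.shape
    max_steps = max_steps or p
    selected, remaining = [], list(range(p))
    best_bic = np.inf

    def fit_rss(cols):
        # Fit an additive model on the chosen predictors by least squares
        # on the stacked basis expansions; return RSS and parameter count.
        Z = np.column_stack([np.ones(n)] + [basis(X[:, j]) for j in cols])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        return resid @ resid, Z.shape[1]

    for _ in range(max_steps):
        if not remaining:
            break
        # Add the candidate whose inclusion most reduces the RSS.
        rss, j_best = min((fit_rss(selected + [j])[0], j) for j in remaining)
        k = fit_rss(selected + [j_best])[1]
        bic = n * np.log(rss / n) + np.log(n) * k
        if bic >= best_bic:
            break  # stop once the (assumed) BIC criterion no longer improves
        best_bic = bic
        selected.append(j_best)
        remaining.remove(j_best)
    return selected
```

Because each candidate is evaluated jointly with the already-selected predictors, a predictor that is marginally independent of the response can still enter once its partners are in the model, which is the advantage of forward procedures over marginal screening noted in the abstract.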