ABSTRACT In this paper, we consider the estimation problem for the semiparametric regression model with censored data in which the number of explanatory variables p in the linear part is much larger than the sample size n, often denoted as p ≫ n. The purpose of this paper is to study the effects of covariates on a response variable censored on the right by a random censoring variable with an unknown probability distribution. High variance and over-fitting are major concerns in such problems. Ordinary estimation methods cannot be applied directly to censored, high-dimensional data, and therefore a transformation is required. In this paper, a synthetic data transformation is used to handle the censoring. We then apply LASSO-type double-penalized least squares (DPLS) to achieve sparsity in the parametric component and use smoothing splines to estimate the nonparametric component. A Monte Carlo simulation study is performed to assess the performance of the estimators and to analyse the effects of different censoring levels. A real high-dimensional censored data example is used to illustrate the ideas discussed herein.
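To make the idea of a synthetic data transformation concrete, the following is a minimal sketch rather than the paper's own procedure. The abstract does not say which transformation is used; the sketch assumes one common choice, the Koul-Susarla-Van Ryzin form, in which each observed time z_i = min(y_i, c_i) with censoring indicator delta_i (1 if uncensored) is replaced by delta_i * z_i / (1 - G(z_i-)), with the unknown censoring distribution G estimated by Kaplan-Meier. The function name `synthetic_response` and the no-ties assumption are illustrative only.

```python
import numpy as np

def synthetic_response(z, delta):
    """Hypothetical helper: KSV-type synthetic responses for right-censored data.

    z     : observed times, z_i = min(y_i, c_i)
    delta : 1 if y_i was observed (uncensored), 0 if censored
    Assumes no tied observation times.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(z)

    order = np.argsort(z, kind="stable")
    d_sorted = delta[order]

    # Kaplan-Meier estimate of the censoring survival function P(C > t):
    # here the "event" is being censored, i.e. delta == 0.
    at_risk = n - np.arange(n)                    # n, n-1, ..., 1
    factors = 1.0 - (1 - d_sorted) / at_risk
    surv_after = np.cumprod(factors)              # estimate at z_(i)
    surv_before = np.concatenate(([1.0], surv_after[:-1]))  # left limit at z_(i)

    # Censored points map to 0; uncensored points are inflated to compensate.
    y_sorted = d_sorted * z[order] / surv_before

    y_star = np.empty(n)
    y_star[order] = y_sorted                      # restore the input order
    return y_star
```

Under the usual independence assumptions, the synthetic responses have the same conditional mean as the uncensored response, so they can be passed to a penalized least squares fit in place of the original response.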
               