
SP0016 Stepwise or not to stepwise? the do’s and dont’s of multivariable modelling



Introduction

Different types of regression analyses, including linear, logistic, and Cox regression, are commonly used in medical research. Usually, these analyses include more than one covariate as independent variables. This is particularly the case in observational studies: when investigating the possible association between an exposure and an outcome, there can be a large number of potential confounders. Examples are age, sex, body mass index, and lifestyle factors. How should we choose which variables to include in the model? Here I shall focus on two issues: attempting to include too many covariates in the analyses, and the use of stepwise selection of covariates. These are among the most frequently encountered issues in statistical review of manuscripts submitted to the Annals of the Rheumatic Diseases (Lydersen 2015).

Limit the number of covariates

With a limited number of observations, how many covariates can you include? Traditional rules of thumb state that the number of observations per variable ought to be of the order of 10. Some authors recommend 15, some 20, while others state that 5 is sufficient. See Lydersen 2015 and references therein.

Do not use stepwise selection

Stepwise selection of covariates basically means that only covariates that are statistically significant, typically with a p-value less than 0.05 or 0.10, are included in the model. A fundamental problem is the following: as is always the case in estimation, regression coefficients are estimated with some uncertainty. Hence, some are underestimated and some are overestimated, that is, too far from the null hypothesis. Including only covariates with small p-values makes overestimated coefficients more likely to be selected. This introduces bias away from the null hypothesis. Stepwise procedures used to be very popular, but today an increasing number of analysts criticise such methods. For example, Rothman et al. (2008, page 419) state: “There are several systematic, mechanical, and traditional algorithms for finding models (such as stepwise and best-subset regression) that lack logical and statistical justification and that perform poorly in theory, simulations and case studies … One serious problem is that the P-values and standard errors … will be downwardly biassed, usually to a large degree”.

Recommendation

Selection of covariates should be based on the research question at hand and on substantive knowledge, such as what is biologically plausible. Chapter 10, ‘Predictor selection’, in Vittinghoff et al. (2012) gives good guidance. Check that the number of covariates is small enough compared to the number of observations. Do not use stepwise selection.

References

[1] Lydersen S. Statistical review: frequently given comments. Ann Rheum Dis 2015;74(2):323–325.
[2] Rothman KJ, Greenland S, Lash TL. Modern epidemiology. 3rd ed. Philadelphia: Wolters Kluwer Health/Lippincott Williams & Wilkins; 2008.
[3] Vittinghoff E, Glidden DV, Shiboski SC, McCulloch CE. Regression methods in biostatistics: linear, logistic, survival, and repeated measures models. 2nd ed. New York: Springer; 2012.

Disclosure of Interest

None declared
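
Illustration (not part of the original abstract): the argument that selecting covariates on small p-values biases coefficients away from the null can be checked with a small simulation. The Python sketch below is a minimal example under stated assumptions: a single covariate with a true slope of 0.2, 50 observations per dataset, standard normal noise, and a normal approximation to the significance test. The function names and all parameter values are hypothetical choices made for illustration only.

```python
import random
import statistics
from math import sqrt


def fit_slope(x, y):
    """Return the ordinary least squares slope and its standard error."""
    n = len(x)
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = sqrt(rss / (n - 2) / sxx)
    return slope, std_err


def simulate(true_slope=0.2, n_obs=50, n_datasets=2000, alpha=0.05, seed=1):
    """Compare the mean estimated slope over all simulated datasets with
    the mean over only those datasets where the slope is 'significant'.

    All defaults are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    # Normal approximation to the two-sided critical value (an assumption;
    # an exact test would use the t distribution with n_obs - 2 df).
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    all_slopes, selected_slopes = [], []
    for _ in range(n_datasets):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        y = [true_slope * xi + rng.gauss(0, 1) for xi in x]
        slope, std_err = fit_slope(x, y)
        all_slopes.append(slope)
        if abs(slope / std_err) > z_crit:  # mimics "keep if p < alpha"
            selected_slopes.append(slope)
    fraction_selected = len(selected_slopes) / n_datasets
    return (statistics.fmean(all_slopes),
            statistics.fmean(selected_slopes),
            fraction_selected)


if __name__ == "__main__":
    mean_all, mean_selected, fraction = simulate()
    print(f"true slope:                     0.2")
    print(f"mean estimate, all datasets:    {mean_all:.3f}")
    print(f"mean estimate, significant only: {mean_selected:.3f}")
    print(f"fraction significant:           {fraction:.2f}")
```

Across all simulated datasets the estimates should average close to the true slope, whereas the average over the "significant" datasets alone should be noticeably larger. This is the mechanism described above, in its simplest one-covariate form; it does not reproduce a full stepwise procedure.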

Keywords: regression; stepwise; number; stepwise selection; use stepwise

Journal Title: Annals of the Rheumatic Diseases
Year Published: 2018
