Weighted logrank tests are the usual tool for detecting late effects in clinical trials. Weights determine the alternative hypotheses against which the tests are optimal. Choosing a specific weight is thus a crucial issue in practice. One common weight was introduced in 1982 by Harrington and Fleming. The corresponding test is implemented in standard statistical software packages. However, using this test in randomized controlled clinical trials raises two major and still unresolved difficulties. First, the weight depends on a parameter q that has to be set before collecting the data. Second, the necessary sample size depends on this q. This article addresses these difficulties. We provide the explicit form of the alternative hypothesis under which the Fleming–Harrington test for late effects is optimal in terms of Pitman’s asymptotic relative efficiency. Using simulations, we investigate various aspects of the Fleming–Harrington test for late effects, such as power properties and sensitivity to the value of q. We also investigate the relation between q and the necessary sample size for the Fleming–Harrington test. Based on these results, we propose q = 3 as a general choice for testing late effects. We illustrate our methodology on a data set arising from a prevention trial in the field of dementia.
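To make the role of q concrete, the following is a minimal Python sketch of a two-sample weighted logrank statistic. It assumes the late-effects weight takes the form w(t) = (1 - S(t-))^q, where S is the pooled Kaplan-Meier estimate just before time t; the abstract does not state the weight explicitly, so this form, the function name `fh_late_logrank`, and its arguments are illustrative assumptions rather than the authors' implementation. With q = 0 the statistic reduces to the ordinary logrank test; larger q shifts weight toward later event times.

```python
import math
import numpy as np

def fh_late_logrank(time, event, group, q=3.0):
    """Two-sample weighted logrank test with the assumed weight (1 - S(t-))**q.

    time  : follow-up times
    event : 1 if the event was observed, 0 if censored
    group : 1 for the treatment arm, 0 for the control arm
    Returns the standardized statistic z and a two-sided normal p-value.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)

    surv_left = 1.0  # pooled Kaplan-Meier estimate just before the current time
    u = 0.0          # weighted sum of observed minus expected events in group 1
    v = 0.0          # hypergeometric variance estimate of u

    for t in np.unique(time[event]):          # distinct event times, ascending
        at_risk = time >= t
        n = at_risk.sum()                     # at risk, both arms
        n1 = (at_risk & group).sum()          # at risk, group 1
        died = (time == t) & event
        d = died.sum()                        # events at t, both arms
        d1 = (died & group).sum()             # events at t, group 1

        w = (1.0 - surv_left) ** q            # assumed late-effects weight
        u += w * (d1 - d * n1 / n)
        if n > 1:
            v += w**2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

        surv_left *= 1.0 - d / n              # update pooled KM past time t

    z = u / math.sqrt(v) if v > 0 else 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))    # two-sided p-value under N(0, 1)
    return z, p
```

A call such as `fh_late_logrank(time, event, group, q=3)` applies the q = 3 choice proposed above; a negative z indicates fewer events in group 1 than expected under the null hypothesis.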
               