We thank Dr Vickers for his thoughtful commentary. Dr Vickers argues that if a benefit–harm metric includes absolutely every aspect of the decision, then there is no clinically relevant difference, and even for the smallest net benefit, you should decide to take the better treatment. If we take Dr Vickers’ holistic benefit–harm metric and the premise that we would want to detect a net benefit of any size, the sample size could become gigantic. He implies that a formal, quantitative decision analysis using such a holistic benefit–harm metric is necessary and sufficient for decision-making, and that recommendations are clearly dictated by this decision analysis. On this view, because this is decision-making and not inference, the decision analysis should be performed separately, after the trial has been performed, and should not inform the design of the trial.

All guideline panels consider benefits, harms, and burdens, but many subsequently also consider other aspects such as cost, acceptability, feasibility, and equity. It is probably impossible to include all aspects in a benefit–harm metric because some are qualitative by nature. Moreover, the benefit–harm balance should be judged first, before considering other aspects, so a holistic metric is not suited to inform decisions.

More fundamentally, there is an entire process from evidence generation to a decision. The Grading of Recommendations Assessment, Development and Evaluation (GRADE) system describes these steps and clearly distinguishes between evidence synthesis and making recommendations. Thus, informing decisions through evidence generation and synthesis, whether that synthesis includes a formal decision analysis or not, is not the same as making the decision. GRADE’s concept of the certainty of net benefit includes benefits, harms, and burdens. This concept explains why Dr Vickers’ example of choosing a retirement fund does not apply to guideline development: a small net benefit has different implications depending on whether its confidence interval is narrow or wide. With a narrow interval, the net benefit or harm is certainly near zero. With a wide interval, we have low certainty of net benefit or harm. The recommendation will likely differ, and it is not a direct conclusion from the evidence, but a decision that puts many aspects into context. A small net benefit will have to be weighed against cost and other factors.

GRADE’s concept of the certainty of net benefit also implies that we could calculate the sample size based on the width of the confidence interval of the benefit–harm balance, or analogously to equivalence studies. Accordingly, the sample size would not need to be gigantic just because the net benefit is expected to be close to zero. Trial design should anticipate the next steps in evidence synthesis. Just as trial design can be optimized to update the existing evidence in a meta-analysis, with the trial’s sample size minimized accordingly, trial design could also anticipate how to improve the certainty of the benefit–harm balance. The guideline and regulatory communities have formalized and conceptualized net benefit. This opens opportunities, as we have argued, to design clinical trials to generate more useful, interpretable evidence to inform decisions.
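To illustrate the point about sizing a trial by the width of the confidence interval, here is a minimal sketch of a precision-based sample size calculation. It is not the method of any specific trial or guideline. It assumes, purely for illustration, that the benefit–harm balance is summarized as a continuous per-patient score with a known common standard deviation, that two arms of equal size are compared, and that a normal approximation holds. The function name and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm_for_ci_half_width(sd, half_width, confidence=0.95):
    """Patients per arm so that the confidence interval for the
    between-arm difference in a net benefit score has the requested
    half-width.

    Illustrative assumptions: continuous score, known common standard
    deviation `sd`, two equal arms, normal approximation.
    """
    if sd <= 0 or half_width <= 0:
        raise ValueError("sd and half_width must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The difference of two arm means has variance 2 * sd**2 / n, so
    # half_width = z * sd * sqrt(2 / n), solved here for n.
    n = 2 * (z * sd / half_width) ** 2
    return math.ceil(n)


# Hypothetical inputs: SD of 1.0 score units, target half-width of 0.2.
print(n_per_arm_for_ci_half_width(sd=1.0, half_width=0.2))
```

Under these assumptions the expected size of the net benefit does not enter the formula at all. The required sample size depends only on the variability of the score and on how narrow the interval needs to be, which is why it stays bounded even when the net benefit is expected to be close to zero.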
               