Adaptive Design Series: Futility, a Big Reason We Are Here
August 1, 2012
Note: This article is one of a series about adaptive design that comes from a blog written by Dr. Karen Kesler from 2010 to 2011. That blog is no longer active, but it contained some great information, so we wanted to re-post it here.
When we started doing pruning studies, I was frustrated by the futility boundaries found in group sequential designs. They're generally just the mirror image of the efficacy bounds, which means you have to prove your compound is significantly worse than your control before you cross the boundary, for heaven's sake! I know nobody wants to talk about the possibility that their drug or device or biologic doesn't work, but as statisticians, it's our duty to remind them that reality doesn't play nice. Especially now, when money is tight, we need to put our resources toward therapies that work, which means killing the doses or therapies that don't as quickly as possible.
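To see why those mirror-image bounds are so frustrating, here is a minimal sketch in Python. The values 2.797 and 1.977 are the standard two-look O'Brien-Fleming boundaries for a two-sided alpha of 0.05; everything else (the simulation size, the alternative "z < 0" rule) is an illustrative assumption, not anything from the original post.

```python
import numpy as np

# Classic two-look O'Brien-Fleming efficacy boundaries (two-sided alpha = 0.05):
# the z-statistic must exceed about 2.80 at the interim look or 1.98 at the end.
interim_efficacy_z = 2.797

# A "mirror image" futility bound just flips the sign: the trial stops early
# for futility only if the interim z drops below -2.80, i.e., the compound
# looks significantly *worse* than control.
interim_futility_z = -2.797

rng = np.random.default_rng(0)
n_sims = 100_000

# Interim z-statistics for a compound with no effect at all (z ~ N(0, 1)).
z_interim = rng.standard_normal(n_sims)

print(f"Dead compound stopped by the mirror-image bound: "
      f"{np.mean(z_interim < interim_futility_z):.2%}")
print(f"Dead compound stopped by a simple 'z < 0' rule:  "
      f"{np.mean(z_interim < 0.0):.2%}")
```

The mirrored bound stops a compound with no effect at all well under 1% of the time, while even a naive "stop if the z-statistic is trending the wrong way" rule stops it about half the time.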
The implication I see is that it makes sense to spend a bit more effort in Phase II than is fashionable right now. It's easier to justify looking at a wider range of doses in Phase II if you can eliminate them early in a study using futility boundaries. (See "Why Pruning Designs are my favorite adaptive designs" for more detail on that.) Putting a little more power into a "go/no-go" study could save you a lot of headaches by preventing Phase III trials that have no hope of success.
This all raises the question of what we should use for futility, whether we're pruning doses or killing development programs. I believe the answer lies in simulations. Every company and every clinical area has its own tangled mess of expectations and risk-benefit limits. Some companies are willing to go to Phase III with less information than others. Some have more compounds in the pipeline and are therefore willing to kill one to put the resources into a more promising one. A one-size-fits-all framework for futility would be impossible. But by simulating what will happen in a few key scenarios (e.g., "compound doesn't work at all", "compound works as expected"), you can quantify the risks and benefits for the decision makers in an approachable manner. (Certainly more approachable than power/sample size calculations!) I know it's a hard argument to make. "Gosh, why don't we consider options that will kill this therapy early?" just doesn't go over well in the boardroom, but if you consider the bigger picture, futility can save you big.
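To show what that kind of simulation might look like, here is a minimal sketch in Python. It assumes a plain two-arm trial with normal outcomes (SD = 1), 100 subjects per arm, one interim look at the halfway point, and a hypothetical "stop if the interim z falls below 0" futility rule; the standardized effect of 0.4 for the "works as expected" scenario is likewise an illustrative assumption, not a number from any real program.

```python
import numpy as np

rng = np.random.default_rng(1)

def run_scenario(delta, n_per_arm=100, interim_frac=0.5,
                 futility_z=0.0, final_z=1.96, n_sims=50_000):
    """Simulate a two-arm trial with one interim futility look.

    Normal outcomes with SD = 1; `delta` is the true treatment effect.
    Stop for futility if the interim z falls below `futility_z`; declare
    success if the final z exceeds `final_z`. All parameters here are
    illustrative assumptions, not a recommended design.
    """
    n_int = int(n_per_arm * interim_frac)
    trt = rng.normal(delta, 1, (n_sims, n_per_arm))
    ctl = rng.normal(0.0,  1, (n_sims, n_per_arm))

    # Interim z-statistic uses only the first n_int subjects per arm.
    z_int = (trt[:, :n_int].mean(1) - ctl[:, :n_int].mean(1)) / np.sqrt(2 / n_int)
    # Final z-statistic uses everyone. A futility-only interim look can't
    # inflate type I error, so the usual 1.96 cutoff stays in place.
    z_fin = (trt.mean(1) - ctl.mean(1)) / np.sqrt(2 / n_per_arm)

    stopped = z_int < futility_z
    success = ~stopped & (z_fin > final_z)
    # Expected total sample size across both arms.
    avg_n = 2 * (n_int * stopped.mean() + n_per_arm * (1 - stopped.mean()))
    return stopped.mean(), success.mean(), avg_n

for label, delta in [("compound doesn't work at all", 0.0),
                     ("compound works as expected",   0.4)]:
    stop, win, n = run_scenario(delta)
    print(f"{label:32s} stop early: {stop:5.1%}  success: {win:5.1%}  avg N: {n:6.1f}")
```

Under "doesn't work at all", that rule kills about half the trials at the halfway point and cuts the expected sample size accordingly, while giving up only a percent or two of power in the "works as expected" scenario. Those are exactly the kinds of trade-offs the decision makers need laid out in front of them.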