RE: SG018 primary endpoints
17 Jan 2022 19:25
[1 of 2]
Hi Matterhorn/all, first up thanks for your contributions to the board over the last however many months, I personally really appreciate what you’ve shared. Saw this thread last night but it was far too late for me to think about replying then!
I have something of a stats & data science background but medical stats aren’t really my thing – having said that, I might be able to offer some assistance. I’ll go through things in order & try to tie things together.
1) Power. The power of a statistical test is an important factor in the design of an experiment / trial. The question behind it is “if the true effect size (of the treatment) is as big as I hypothesise, given the number of test subjects, what is the probability that (due to bad luck) I won’t detect any significant effect at all?” That probability is the ‘false negative’ rate, and the power is 1 minus it. In the case of time to recovery, SNG hypothesise that the hazard ratio (more on that in a sec) for SNG001 is at least 1.7. *If that last part is true*, then with a trial of 610 patients there is a <=10% chance of the p-value for that endpoint’s hazard ratio being >5% in SPRINTER (i.e. the trial not meeting that particular endpoint) just due to bad luck. Equivalently, the trial has at least 90% power for that endpoint. If the true hazard ratio is actually higher than 1.7 (i.e. SNG001 is more effective than that) then the false negative rate from SPRINTER would in practice have been lower (the trial would have actually had greater than 90% power), even though we couldn’t have known that in advance.
So it doesn’t mean that meeting the endpoint is 90% likely, it means that *if the effect size is as big as SNG hypothesise* then it is 90% likely. With regards to P3 respiratory trials in general, I think a figure of 70% has recently been stated by FinnCap, but I’d have it far higher given the futility analysis, Ashfield, and the new jobs & website today!
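To put rough numbers on point 1, here is a minimal sketch using Schoenfeld's normal approximation for a two-sided log-rank test with 1:1 randomisation. This is not SNG's actual power calculation, which isn't given in this thread. The function name and the event count of 150 are my own illustrative assumptions, picked only so that a hazard ratio of 1.7 lands near 90% power.

```python
from math import log, sqrt
from statistics import NormalDist


def approx_power(hazard_ratio, n_events, alpha=0.05):
    """Approximate power of a two-sided log-rank test (Schoenfeld formula).

    Assumes 1:1 randomisation. n_events is the number of observed events
    (recoveries), not the number of patients enrolled. The negligible
    chance of a significant result in the wrong direction is ignored.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 5%
    signal = sqrt(n_events) / 2 * abs(log(hazard_ratio))
    return norm.cdf(signal - z_alpha)


# Hypothetical event count, NOT taken from the SPRINTER protocol.
N_EVENTS = 150

for hr in (1.3, 1.7, 2.0):
    print(f"true HR {hr}: power ~ {approx_power(hr, N_EVENTS):.2f}")
```

Under these assumptions the power climbs as the true hazard ratio climbs, which is the point above: the 90% figure is conditional on the hypothesised effect size, it is not the probability that the trial succeeds.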
2) Think Josh has answered this, but in short, a p-value means “if the treatment has no effect, what is the probability that I would observe a result at least as strong as the one I actually saw for that endpoint, purely due to random good luck?” When that is small (typically <5%) we reject the notion that the treatment has no effect on that endpoint. (Ps Josh that coin toss probability starts with a 0. and then has ten zeros before another number!)
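As a toy illustration of a p-value (a hypothetical coin example of my own, not Josh's numbers), the sketch below computes the exact probability of seeing at least as many heads as observed if the coin is fair, i.e. if the "no effect" hypothesis is true. The function name and the 60-out-of-100 data are made up for illustration.

```python
from math import comb


def one_sided_p_value(heads, tosses):
    """P(at least `heads` heads in `tosses` tosses) for a fair coin."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses


# Hypothetical data: 60 heads out of 100 tosses.
p = one_sided_p_value(60, 100)
print(f"p-value = {p:.4f}")
if p < 0.05:
    print("reject 'the coin is fair' at the 5% level")
else:
    print("cannot reject 'the coin is fair' at the 5% level")
```

The same logic applies to a trial endpoint: the "fair coin" plays the role of "the treatment has no effect", and a small p-value says the observed data would be surprising if that were true.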
Worth adding here that the line “adjusting for the Hochberg procedure” means that SNG are correcting for the fact that they test more than one endpoint, so they are not going to be dredging the data to try to find a p-value < 5% somewhere. This is important for confidence / validity as even with no effect at all, if you perform 20 tests, you would expect on average one of them to return a p-value <= 5% just due to randomness.
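A short sketch of both halves of that point: first the chance of at least one "significant" result across 20 independent tests when nothing is really going on, then a plain implementation of the Hochberg step-up procedure. The p-values below are invented for illustration and the function names are mine. This is not SPRINTER's statistical analysis plan, just the textbook procedure.

```python
def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one p-value <= alpha across independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def hochberg_reject(p_values, alpha=0.05):
    """Hochberg step-up procedure; returns one reject flag per p-value."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    n_reject = 0
    # Walk from the largest p-value down. The first one to pass its
    # threshold alpha / (m - rank + 1) is rejected along with all smaller ones.
    for rank in range(m, 0, -1):
        if p_values[order[rank - 1]] <= alpha / (m - rank + 1):
            n_reject = rank
            break
    reject = [False] * m
    for i in order[:n_reject]:
        reject[i] = True
    return reject


print(f"{familywise_error(20):.2f}")   # 0.64
print(hochberg_reject([0.04, 0.03]))   # [True, True]
print(hochberg_reject([0.06, 0.03]))   # [False, False]
```

Note the last line: a p-value of 0.03 would pass a naive 5% test on its own, but with two endpoints and the other one at 0.06 it has to beat 2.5%, so neither endpoint is declared met. That is the protection against cherry-picking.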