JimSanchez - thank you, that was very helpful!
Understanding this correctly is crucial as it defines what success here means. A bit late of me to only put effort into trying to decipher this now, but better late than too late/never.
Matml74 - I assumed what the numbers could be and applied ‘trial and error’ to try and get to a z-score of 1.4. But, was failing. Haha. Yes, you’re correct the phase II data wasn’t shared unfortunately.
[2 of 2]
3) The hazard ratio is the modelled ratio of the hazards between treatment and placebo. Referring back to the first example, the ‘hazard’ in question is “are you going to recover today”. You may or may not. What a hazard ratio of 1.7 implies is that if you were on SNG001, you were 70% more likely to recover on any given day during the study than if you were in the placebo arm.
The net effect of this is that people on treatment recover sooner, and their average time to recovery is shorter. What a hazard ratio of 1.7 means in those terms is anyone’s guess – like Matml says, unless you’re Wilko / SSH etc. you’re unlikely to know. Depends on the underlying hazard in the placebo arm. Often *median* differences in time to recovery, time to discharge etc. are quoted to make things more intuitive/relatable, and we’ll probably see that in the results RNS (perhaps tomorrow morning?!).
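To make the "depends on the underlying hazard in the placebo arm" point concrete, here is a rough sketch. It assumes a constant daily hazard in each arm (an exponential model, which is a simplification and not something taken from the protocol), and the placebo hazards are made-up numbers purely for illustration. The function names are hypothetical too.

```python
import math

HAZARD_RATIO = 1.7  # hypothesised HR for time to recovery, as quoted in the protocol


def median_days(daily_hazard):
    """Median time to event under a constant hazard (exponential model)."""
    return math.log(2) / daily_hazard


def median_gap(placebo_hazard, hazard_ratio=HAZARD_RATIO):
    """Days by which the treated median is shorter than the placebo median."""
    return median_days(placebo_hazard) - median_days(placebo_hazard * hazard_ratio)


# Hypothetical placebo hazards: these are NOT trial data.
for placebo_hazard in (0.05, 0.10, 0.20):
    print(
        f"placebo hazard {placebo_hazard:.2f}/day: "
        f"placebo median {median_days(placebo_hazard):.1f} days, "
        f"treated median {median_days(placebo_hazard * HAZARD_RATIO):.1f} days, "
        f"gap {median_gap(placebo_hazard):.1f} days"
    )
```

Under this simplified model the treated median is always the placebo median divided by 1.7, but the gap in days shrinks as the placebo hazard rises, which is why the hazard ratio alone doesn't tell you how many days are saved.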
Worth adding that the analyst(s) will be building statistical models of each hazard, which also include the effects of things like age, sex, comorbidities etc. The hazard ratio puts all those to one side, controls for them if you will, and isolates the effect of the treatment vs. the placebo.
For anyone interested, this is called ‘survival analysis’ and this is an awesome link / series of papers if you’re so inclined:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2394262/
4) I think Mary has deducted a 1 from all of his figures so we should read them as hazard ratios of 1.8, 1.5, 1.2 etc. Happy to be corrected here.
Anyways hope the above is helpful and makes things more rather than less clear. If there’s someone on the board with a medical stats background they might be able to share more, but alas, this is as far as my knowledge goes! Have a small child so can’t say I’ll be able to reply again tonight but hope this has helped & pays back some of the great research that’s been shared with me on this board!
All the best for tomorrow everyone!
[1 of 2]
Hi Matterhorn/all, first up thanks for your contributions to the board over the last however many months, I personally really appreciate what you’ve shared. Saw this thread last night but it was far too late for me to think about replying then!
I have something of a stats & data science background but medical stats aren’t really my thing – having said that, I might be able to offer some assistance. I’ll go through things in order & try to tie things together.
1) Power. With regards to the power of a statistical test, this is an important factor in the design of an experiment / trial. What the ‘power’ gets at is the question “if the true effect size (of the treatment) is as big as I hypothesise, given the number of test subjects, what is the probability that (due to bad luck) I won’t detect any significant effect at all?” In other words, what is the ‘false negative’ rate? Power is one minus that rate. In the case of time to recovery, SNG hypothesise that the hazard ratio (more on that in a sec) for SNG001 with respect to time to recovery is at least 1.7. *If that last part is true*, then with a trial of 610 patients there is a <=10% chance of the p-value of the hazard ratio for that endpoint being >5% in SPRINTER (i.e. the trial not meeting that particular endpoint) just due to bad luck. Equivalently, the trial has at least 90% power for that endpoint. If the true hazard ratio is actually higher than 1.7 (i.e. SNG001 is more effective than that) then the false negative rate from SPRINTER would in practice have been lower (the trial would have actually had greater than 90% power), even though we couldn’t have known that in advance.
So it doesn’t mean that meeting the endpoint is 90% likely, it means that *if the effect size is as big as SNG hypothesise* then it is 90% likely. With regards to P3 respiratory trials in general, I think a figure of 70% has recently been stated by FinnCap, but I’d have it far higher given the futility analysis, Ashfield, and the new jobs & website today!
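As a rough illustration of how power moves with the true hazard ratio, here is a sketch using Schoenfeld's approximation for a log-rank test with 1:1 randomisation. This is one common textbook approximation and not necessarily what the SG018 statisticians used. It also uses a plain two-sided alpha of 0.05 and ignores the Hochberg adjustment, and the function names are made up for the example.

```python
import math
from statistics import NormalDist

_NORMAL = NormalDist()  # standard normal distribution


def events_needed(hazard_ratio, power=0.90, alpha=0.05):
    """Schoenfeld approximation: events required, 1:1 arms, two-sided alpha."""
    z_alpha = _NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _NORMAL.inv_cdf(power)
    return 4 * (z_alpha + z_power) ** 2 / math.log(hazard_ratio) ** 2


def approx_power(true_hazard_ratio, events, alpha=0.05):
    """Approximate power of the log-rank test for a given number of events."""
    z_alpha = _NORMAL.inv_cdf(1 - alpha / 2)
    signal = math.sqrt(events) / 2 * abs(math.log(true_hazard_ratio))
    return _NORMAL.cdf(signal - z_alpha)


# Size the (hypothetical) trial for HR = 1.7, then see what happens
# if the true hazard ratio turns out lower or higher than hypothesised.
events = events_needed(1.7)
for true_hr in (1.3, 1.7, 2.0):
    print(f"true HR {true_hr}: power ~ {approx_power(true_hr, events):.2f}")
```

With the number of events fixed, a true hazard ratio above 1.7 gives more than 90% power and one below 1.7 gives less, which is the conditional reading of "90% power" described above.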
2) Think Josh has answered this, but in short, a p-value means “if the treatment has no effect, what is the probability that I would observe a result at least this strong for that endpoint due to random good luck?” When that is small (typically <5%) we reject the notion that the treatment has no effect on that endpoint. (Ps Josh that coin toss probability starts with a 0. and then has ten zeros before another number!)
Worth adding here that the line “adjusting for the Hochberg procedure” means that SNG are not going to be dredging the data to try to find a p-value < 5% somewhere. This is important for confidence / validity as even with no effect, if you perform 20 tests, you would expect one of them to return a p-value <= 5% just due to randomness.
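For the curious, the Hochberg step-up procedure itself is short enough to sketch. The p-values in the usage lines are invented for illustration (they are not SPRINTER results), and the function name is hypothetical.

```python
def hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices ordered from the largest p-value down to the smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th largest p-value is compared with alpha / (rank + 1).
        if p_values[idx] <= alpha / (rank + 1):
            # Step-up: reject this hypothesis and every one with a smaller p-value.
            for j in order[rank:]:
                rejected[j] = True
            break
    return rejected


# Invented p-values for two primary endpoints.
print(hochberg([0.04, 0.03]))  # [True, True]: the larger p-value is <= 0.05
print(hochberg([0.20, 0.02]))  # [False, True]: 0.02 <= 0.05 / 2
print(hochberg([0.20, 0.03]))  # [False, False]: 0.03 > 0.05 / 2
```

With two endpoints this boils down to: both are declared significant if both p-values are at or below 0.05; otherwise a single endpoint has to clear 0.025 on its own.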
Matterhorn,
1.4 z score = 1.4 standard deviations from the mean but you would need to know both the mean and standard deviation of the study population to work it out. Whilst you can guess/assume different mean scores for placebo group, as we don’t know the variation in the sample I don’t think this can be worked out accurately. As far as I can tell SNG did not release the raw data (means/st dev’s) for the P2 so can’t use those numbers to try and work out a ballpark.
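A small sketch of why the standard deviation matters here. It assumes a simple two-sample z statistic on mean length of stay with equal arms and a common standard deviation, which may well not be the statistic used in the actual futility analysis. The per-arm number and the standard deviations are placeholders, not trial figures.

```python
import math


def mean_gap_for_z(z, sd, n_per_arm):
    """Difference in means that gives a two-sample z statistic of `z`
    (equal arm sizes, common standard deviation)."""
    standard_error = sd * math.sqrt(2 / n_per_arm)
    return z * standard_error


N_PER_ARM = 150  # placeholder: the interim sample size is an assumption
for assumed_sd in (3, 6, 9):  # placeholder standard deviations, in days
    gap = mean_gap_for_z(1.4, assumed_sd, N_PER_ARM)
    print(f"SD {assumed_sd} days -> mean difference of {gap:.2f} days gives z = 1.4")
```

The required gap in means scales directly with the assumed standard deviation, so without the spread of the data any back-calculated mean is a guess.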
Joshholdforgold - many thanks for correcting me. Much appreciated. I've realised where I'd got things mixed up.
On a slightly different note I'm trying to determine the mean length of hospital stay in the treatment vs placebo group at the halfway mark which would've produced a z statistic of at least 1.4. That was the go/no go threshold to continue with the trial. But, I just cannot make it work. Not sure whether you or anyone else tried it. Time to recovery would be a bit more difficult due to it being measured over 35 days.
Hi Matterhorn.
The bit on statistical significance isn't quite right.
In a nutshell, the p-value is the probability of seeing results at least this strong by fluke alone, i.e. if the treatment did nothing. The cut-off in this case is 0.05. The 95% level is asking the question: if the treatment truly did nothing and the trial was repeated 100 times, would it wrongly show a 'significant' result no more than about 5 times?
You do not need 95% of the sample to show improvement to be 95% confident that the data is not a fluke. You could for example be 95% confident that 20% of the sample would show improvement.
The smaller the % of the population showing improvement, the larger the sample needed for it to be statistically significant.
Take the flip of a coin:
n=number of flips h=heads t=tails.
n=2 t=0 h=2 Would not be statistically significant at the 95% level to say that heads and tails had different probabilities. The probability of this event is 0.25 (>p=0.05).
However,
n=1000 t=600 h=400 would be statistically significant. You would be able to conclude that tails was more likely than heads given a 95% significance level. (I haven't worked out the exact probability of 600 or more tails out of 1000 tosses, but trust me it will be very small. Much less than 0.05).
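The two coin-flip probabilities can be computed exactly with the binomial distribution, using only the standard library. The function name is made up for this sketch, and it is one-sided (it counts outcomes at least as lopsided in one direction only).

```python
from math import comb


def tail_probability(n, k):
    """P(at least k tails in n tosses of a fair coin), computed exactly."""
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2 ** n


print(tail_probability(2, 2))        # both tosses land the same named side
print(tail_probability(1000, 600))   # 600 or more tails out of 1000
```

The first value is the 0.25 quoted for n=2, nowhere near significant. The second comes out many orders of magnitude below 0.05.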
[2 of 2]
4) [Effect size] The concept of ‘Effect size’ tries to determine whether the observed difference between the treatment and placebo group is large enough to be considered important. This is what Mary Aurélien’s tweet from last week referred to when he referenced 0.2, 0.5 and 0.8.
Effect size is expressed as ‘Cohen’s d’ and is defined as ([mean of treatment group] – [mean of placebo group]) / [Standard Deviation].
0.2 = small effect, 0.5 = medium effect and 0.8 or greater is a large effect. So ideally we need to achieve 0.8 plus.
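A minimal sketch of the Cohen's d calculation, using the pooled standard deviation of the two groups (a common choice for the "Standard Deviation" in the formula, assumed here). The scores are invented purely to show the arithmetic.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, placebo):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_t, n_p = len(treatment), len(placebo)
    pooled_variance = (
        (n_t - 1) * variance(treatment) + (n_p - 1) * variance(placebo)
    ) / (n_t + n_p - 2)
    return (mean(treatment) - mean(placebo)) / math.sqrt(pooled_variance)


# Invented scores: NOT trial data.
treatment_scores = [2, 4, 6]
placebo_scores = [1, 3, 5]
print(cohens_d(treatment_scores, placebo_scores))  # 0.5, a 'medium' effect
```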
* Might try my hand at the futility analysis.
Hazard ratio
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC478551/#!po=55.7143
Power and statistical significance
https://meera.snre.umich.edu/power-analysis-statistical-significance-effect-size
https://towardsdatascience.com/the-relationship-between-significance-power-sample-size-effect-size-899fcf95a76d
[1 of 2]
Been upping my game on the statistical definitions, hence tried my hand at explaining in simple terms what success of the trial would mean according to the protocol’s definition. Anyone who is versed in statistics, please add to this and/or correct anything I may have gotten wrong.
SG018 protocol definition of trial success:
‘Success will be determined if at least one of the primary endpoints is declared statistically significant by the primary analysis. A sample size of 610 patients in total using a 1:1 randomisation ratio (305 patients per treatment arm) has been chosen to provide at least 90% power to detect a hazard ratio of 1.45 in time to hospital discharge and a hazard ratio of 1.7 in time to recovery and at least 95% power to declare statistical significance on at least one of the primary endpoints. This sample size has been calculated using a global 2-sided alpha level of 0.05 and adjusting for the Hochberg procedure to allow for multiple comparisons. '
My observations:
1) [Power] The trial was designed to recruit 610 patients (actual recruitment # was 623) to provide power of at least 90% for each primary endpoint and 95% power to meet at least one of the primary endpoints. Meaning there’s at least a 90% probability that a statistically significant difference will be found for each primary endpoint and a 95% probability of finding a statistically significant difference in at least one of the primary endpoints. The larger the trial size the better the power.
Generally accepted practice seems to indicate power of at least 80%.
2) [Statistical significance] To achieve statistical significance a primary endpoint needs to produce a p-value of 0.05 (5%) or lower, which will indicate that the difference observed between the treatment and placebo group is due to the treatment received and not to chance. This would translate into a favourable difference being observed in 95% of cases/pairs. (305 * 95% = 290).
For this trial the alpha is set at 0.05 i.e. requiring a p-value of 0.05 or lower. (p-value is defined as the probability that the results were due to chance and not based on treatment).
3) [Hazard ratio] The hazard ratio (HR) is the odds of a patient being discharged earlier or recovering earlier when receiving SNG001, however it does not indicate how much earlier discharge or recovery is.
For me translating the HR into odds is easier to understand where Odds = HR/(1+HR).
Time to discharge: HR of 1.45 translates to odds of 59% (chance of being discharged earlier).
Time to recover: HR of 1.7 translates to odds of 63% (chance of recovering earlier).
In SG016 HOSPITAL time to discharge had a HR of 1.72 (odds 63%) and time to recovery by day 28 a HR of 3.86 (odds 79%). Time to discharge was not statistically significant. [Adding phase II stats gave a bit more context around the HRs for SG018.]
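The HR/(1+HR) conversion used above is easy to reproduce. One hedge: strictly speaking the result is a probability rather than odds. Assuming proportional hazards, it is the chance that a randomly chosen treated patient has the event before a randomly chosen placebo patient. The function name is made up for this sketch.

```python
def chance_of_earlier_event(hazard_ratio):
    """HR / (1 + HR): probability the treated patient has the event first,
    assuming proportional hazards."""
    return hazard_ratio / (1 + hazard_ratio)


# Hazard ratios quoted in the post above.
quoted = [
    ("SG018 time to discharge (hypothesised)", 1.45),
    ("SG018 time to recovery (hypothesised)", 1.70),
    ("SG016 time to discharge", 1.72),
    ("SG016 time to recovery by day 28", 3.86),
]
for label, hr in quoted:
    print(f"{label}: HR {hr} -> {chance_of_earlier_event(hr):.0%}")
```

This reproduces the 59%, 63%, 63% and 79% figures quoted in the post.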