Sample Size & Power Calculations
Calculate for a Variety of Frequentist and Bayesian Designs
Adaptive Design
Design and Analyze a Wide Range of Adaptive Designs
Milestone Prediction
Predict Interim Analysis Timing or Study Length
Randomization Lists
Generate and Save Lists for your Trial Design
Group Sequential and Promising Zone Designs
Calculate Boundaries & Find Sample Size. Evaluate Interim Data & Re-estimate Sample Size
Sample Size for Bayesian Statistics
Probability of Success (Assurance), Credible Intervals, Bayes Factors and more
Early Stage and Complex Designs
Sample size & operating characteristics for Phase I, II & Seamless Designs (MAMS)
With rising costs and failure rates an increasingly important issue in drug development, adaptive trials offer one mechanism to alleviate these problems and to make clinical trials better reflect the statistical and practical needs of trial sponsors. These issues have also spurred increasing openness towards innovative clinical trial designs among regulatory agencies around the world.
Summer 2019 nQuery Release Notes | ver 8.4
The Summer 2019 release extends the range of tables on offer for adaptive designs, adding seven new tables to this area.
In this release the following areas are targeted for development:
A background to these areas along with a list of the sample size tables which are added in the Summer release is given in the adjacent sections.
In group sequential designs and other adaptive designs, access to the interim data gives the ability to answer the important question of how likely a trial is to succeed based on the information accrued so far. The two most commonly cited statistics to evaluate this are conditional power and predictive power.
Conditional power is the probability that the trial will reject the null hypothesis at a subsequent look given the current test statistic and the assumed parameter values, which are usually assumed to equal their interim estimates. Predictive power (a.k.a. Bayesian Predictive Power) is the conditional power averaged over the posterior distribution of the effect size. Both give an indication of how promising a study is based on the interim data and can be used as ad-hoc measures for futility testing or for defining “promising” results for unblinded sample size re-estimation.
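The two statistics described above can be sketched numerically. The following is a minimal illustration, assuming a single remaining (final) look with no group sequential boundaries, a one-sided test, a normally distributed test statistic and, for predictive power, a flat prior on the effect size. The function names and these simplifications are assumptions for illustration and are not taken from nQuery.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def conditional_power(z_interim, info_frac, drift, alpha=0.025):
    """Probability of rejecting at the final look given the interim z-statistic.

    info_frac is the fraction of total information observed at the interim.
    drift is the assumed expected value of the final z-statistic.
    """
    z_crit = _STD.inv_cdf(1 - alpha)
    b_interim = z_interim * sqrt(info_frac)          # Brownian motion value B(t)
    mean_final = b_interim + drift * (1 - info_frac)  # E[B(1) | B(t)]
    return 1 - _STD.cdf((z_crit - mean_final) / sqrt(1 - info_frac))


def predictive_power(z_interim, info_frac, alpha=0.025):
    """Conditional power averaged over the posterior of the drift (flat prior)."""
    z_crit = _STD.inv_cdf(1 - alpha)
    return _STD.cdf((z_interim - z_crit * sqrt(info_frac)) / sqrt(1 - info_frac))


# Example: interim z = 1.5 at half of the planned information.
z, t = 1.5, 0.5
cp_trend = conditional_power(z, t, drift=z / sqrt(t))  # drift set to interim estimate
pp = predictive_power(z, t)
print(f"conditional power (current trend): {cp_trend:.3f}")
print(f"predictive power (flat prior):     {pp:.3f}")
```

Because predictive power averages over the uncertainty in the interim estimate, it sits closer to 50% than conditional power evaluated at the interim estimate itself.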
Building on the initial nQuery Adapt release, 1 table will be added for conditional power and predictive power as follows:
Crossover trials use a repeated measures design, where each subject receives more than one treatment, with different treatments given in different time periods. The main benefit of a crossover trial is the removal of the between subject effect that impacts parallel trials. This can yield a more efficient use of resources as fewer subjects may be required in the crossover design than comparable designs.
In group sequential designs and other similar designs, access to the interim data provides the opportunity to adapt a study to reflect what has been learned so far. A group sequential design can use the interim effect size estimate not only to decide whether to stop a trial early but also to increase the sample size if the interim effect size is considered “promising”. This option gives the trialist the chance to initially power for a more optimistic effect size, thus reducing up-front costs, while still being confident of being able to detect a smaller but clinically relevant effect size by increasing the sample size if needed.
The most common way to define whether an interim effect size is promising is conditional power. Conditional power is the probability that the trial will reject the null hypothesis at a subsequent look given the current test statistic and the assumed parameter values, which are usually assumed to equal their interim estimates. For “promising” trials, where the conditional power falls above a lower bound (a typical value would be 50%), the sample size can be increased to make the conditional power equal the target study power.
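As a rough illustration of this rule, the sketch below searches for the smallest total sample size at which conditional power under the current trend reaches the target, subject to a cap. It assumes a single interim look, an unweighted final z-statistic, sample sizes expressed in generic information units (for a survival design these would be events) and a promising zone defined as conditional power between `cp_low` and the target. All names and defaults are hypothetical and not taken from nQuery.

```python
from math import sqrt
from statistics import NormalDist

_STD = NormalDist()


def cp_current_trend(z1, n1, n_total, alpha=0.025):
    """Conditional power at n_total, assuming the interim trend continues."""
    z_crit = _STD.inv_cdf(1 - alpha)
    n2 = n_total - n1                  # observations still to be collected
    delta_hat = z1 / sqrt(n1)          # interim standardized effect estimate
    threshold = (z_crit * sqrt(n_total) - z1 * sqrt(n1)) / sqrt(n2)
    return 1 - _STD.cdf(threshold - delta_hat * sqrt(n2))


def reestimate(z1, n1, n_planned, n_max, cp_low=0.5, target=0.9, alpha=0.025):
    """Return the re-estimated total sample size under a promising-zone rule."""
    cp = cp_current_trend(z1, n1, n_planned, alpha)
    if not (cp_low <= cp < target):
        return n_planned               # unfavourable or already favourable: no change
    for n in range(n_planned, n_max + 1):
        if cp_current_trend(z1, n1, n, alpha) >= target:
            return n                   # smallest size reaching the target
    return n_max                       # target unreachable within the cap


new_n = reestimate(z1=1.6, n1=100, n_planned=200, n_max=400)
print("re-estimated total sample size:", new_n)
```

A result that is not promising (for example `z1=0.5` with the same design) leaves the planned sample size unchanged.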
Building on the initial nQuery Adapt release, the following table will be added for unblinded sample size re-estimation:
This table allows nQuery Adapt users to extend their initial group sequential design for survival in two groups using the Log-Rank test (with or without unequal follow-up) by providing tools to conduct interim monitoring and a flexible sample size re-estimation at a specified interim look.
This table will be accessible by designing a study using either of the two group sequential designs for survival tables and using the “Interim Monitoring & Sample Size Re-estimation” option from the group sequential “Looks” table. This table will provide for two common approaches to unblinded sample size re-estimation: Chen-DeMets-Lan and Cui-Hung-Wang. There is also an option to ignore the sample size re-estimation and conduct interim monitoring for a standard group sequential design.
The Chen-DeMets-Lan method allows a sample size increase while using the standard group sequential unweighted Wald statistics without appreciable error inflation, assuming an interim result has sufficiently "promising" conditional power. The primary advantages of the Chen-DeMets-Lan method are that it uses the standard group sequential test statistics and that, after a sample size increase, each subject is weighted equally, as in the equivalent group sequential design. However, this method is restricted to the final interim analysis, and Type I error control is expected but not guaranteed, depending on the sample size re-estimation rules.
The Cui-Hung-Wang method uses a weighted test statistic, using pre-set weights based on the initial sample size and the incremental interim test statistics, which strictly controls the type I error. However, this statistic will differ from that of a standard group sequential design after a sample size increase and since subjects are weighted on the initial sample size, those subjects in the post-sample size increase cohort will be weighted less than those before.
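The difference between the weighted and unweighted statistics can be seen in a small two-stage sketch. It assumes one interim look, stage-wise (incremental) z-statistics and sample sizes in generic units (events, for a survival design). The function names are illustrative only.

```python
from math import sqrt


def chw_statistic(z1, z2_incremental, n1, n_planned):
    """Weighted statistic with weights fixed by the ORIGINAL design."""
    w1 = n1 / n_planned                  # pre-set weight for stage 1
    return sqrt(w1) * z1 + sqrt(1 - w1) * z2_incremental


def unweighted_statistic(z1, z2_incremental, n1, n_final):
    """Standard statistic, weighting stages by the sizes actually observed."""
    w1 = n1 / n_final
    return sqrt(w1) * z1 + sqrt(1 - w1) * z2_incremental


# Planned 200, interim at 100, sample size increased to 300 after the interim.
z1, z2 = 1.2, 2.0
chw = chw_statistic(z1, z2, n1=100, n_planned=200)
std = unweighted_statistic(z1, z2, n1=100, n_final=300)
print(f"weighted (Cui-Hung-Wang style): {chw:.4f}")
print(f"unweighted:                     {std:.4f}")
```

In this example the 200 second-stage subjects share the weight originally intended for 100, so each carries less weight than a first-stage subject. If the sample size is not changed, the two statistics coincide.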
There will be full control over the rules for the sample size re-estimation including sample size re-estimation look (for Cui-Hung-Wang), maximum sample size, whether to increase to the maximum sample size or the sample size to achieve the target conditional power and bounds for what a “promising” conditional power is, among others.
Sample size determination always involves a level of uncertainty over the assumptions made to find the appropriate sample size. Many of these assumed values are for nuisance parameters which are not directly related to the effect size. Thus it would be useful to have a better estimate for these values than one relying on external sources or on a costly separate pilot study, but without the additional regulatory and logistical costs of using unblinded interim data. Blinded sample size re-estimation provides improved estimates for these nuisance parameters without unblinding the study.
In the Summer 2019 release, five tables will be added for blinded sample size re-estimation using the internal pilot method. The internal pilot method assigns an initial cohort of subjects as the “pilot study” and then calculates an updated value for a nuisance parameter of interest. This updated nuisance parameter value is then used to increase the study sample size if required, with the final analysis conducted with standard fixed term analyses with the internal pilot data included.
The new additions to the Adapt Module expand the scope of the nQuery Adapt blinded sample size re-estimation tables to the cases where unequal sample sizes and continuity corrections are needed. The new tables will be as follows:
These tables will provide full flexibility over the size of the internal pilot study and whether sample size decreases are allowable in addition to increases, along with tools to derive the best blinded estimate from the internal pilot.
Blinded sample size re-estimation for the two sample t-test updates the sample size based on a blinded estimate of the common within-group standard deviation. Three methods are available to estimate the within-group standard deviation from the internal pilot data: pilot standard deviation, bias-adjusted pilot standard deviation, upper confidence limit for pilot standard deviation.
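A minimal sketch of the internal pilot idea for two means follows. It assumes equal allocation, a normal-approximation sample size formula (rather than the exact t-based calculation) and covers only the first two estimators: the pooled one-sample variance of the blinded data and a bias-adjusted version that subtracts n/(4(n-1)) times the squared assumed difference. That adjustment is one commonly cited form and is an assumption here; the source does not state the exact formulas nQuery uses.

```python
from math import ceil
from statistics import NormalDist, variance


def n_per_group(var, delta, alpha=0.05, power=0.9):
    """Normal-approximation sample size per group for a two-sided test."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * z_sum ** 2 * var / delta ** 2)


def blinded_ssr(pooled_values, delta, n_initial, adjust=False,
                allow_decrease=False, alpha=0.05, power=0.9):
    """Re-estimate n per group from pilot data pooled across (unknown) arms."""
    n_pilot = len(pooled_values)
    var_blinded = variance(pooled_values)      # one-sample variance, labels ignored
    if adjust:
        # Remove the part of the pooled variance due to the assumed difference.
        var_blinded -= n_pilot / (4 * (n_pilot - 1)) * delta ** 2
        var_blinded = max(var_blinded, 0.0)
    n_new = n_per_group(var_blinded, delta, alpha, power)
    return n_new if allow_decrease else max(n_new, n_initial)


# Design assumed SD = 1 and difference = 0.5; the pilot looks more variable.
n_initial = n_per_group(var=1.0, delta=0.5)
pilot = [i * 0.1 for i in range(-20, 21)]      # illustrative blinded pilot data
n_unadjusted = blinded_ssr(pilot, delta=0.5, n_initial=n_initial)
n_adjusted = blinded_ssr(pilot, delta=0.5, n_initial=n_initial, adjust=True)
print(n_initial, n_unadjusted, n_adjusted)
```

The `allow_decrease` flag mirrors the choice of whether the re-estimated sample size may fall below the initial one.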
Blinded sample size re-estimation for the two sample chi-squared test updates the sample size by combining a blinded estimate of the total proportion of successes with the initial proportion difference estimate. The user can enter either the proportion of successes or the number of successes for the equivalent analysis.
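The same idea for proportions can be sketched as follows, assuming equal allocation, no continuity correction and the standard normal-approximation formula for comparing two proportions. The group proportions are recovered by placing the assumed difference symmetrically around the blinded overall proportion. Function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.9):
    """Normal-approximation sample size per group, two-sided test."""
    z = NormalDist()
    p_bar = (p1 + p2) / 2
    numerator = (z.inv_cdf(1 - alpha / 2) * sqrt(2 * p_bar * (1 - p_bar))
                 + z.inv_cdf(power) * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((numerator / (p1 - p2)) ** 2)


def blinded_ssr_proportions(successes, n_pilot, diff, alpha=0.05, power=0.9):
    """Re-estimate n per group from the blinded overall success proportion."""
    p_bar = successes / n_pilot            # blinded estimate, arms not identified
    p1, p2 = p_bar + diff / 2, p_bar - diff / 2
    if not (0 < p2 and p1 < 1):
        raise ValueError("implied group proportions fall outside (0, 1)")
    return n_per_group_two_proportions(p1, p2, alpha, power)


# 50 successes among 100 blinded pilot subjects, assumed difference of 0.2.
print(blinded_ssr_proportions(successes=50, n_pilot=100, diff=0.2))
```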
To access the adaptive module you must have an nQuery Advanced Pro subscription. If you do, then nQuery should automatically prompt you to update.
You can manually update nQuery Advanced by clicking Help>Check for updates.
If your nQuery home screen is different, you are using an older version of nQuery.
Please contact your Account Manager.
Summer 2019 nQuery Release Notes | ver 8.4
The Summer 2019 release extends the tables for sample size calculation using Bayesian methods. There are 5 new tables in this release that extend the assurance table options for survival analysis and introduce a new concept known as the Posterior Error Method. This release summary will provide an overview of what areas have been targeted in this release along with the full list of tables being added.
In the Summer 2019 release, two main areas are targeted for development. These are:
A background to these areas along with a list of the sample size tables which are added in the Summer 2019 release is given in the following sections.
Assurance is the unconditional probability that the trial will yield a positive result (usually a significant p-value) and is the expectation for the power averaged over the prior distribution of the unknown parameter estimate. This provides a useful estimate of the likely utility of a trial and provides an alternative method to frequentist power for finding the appropriate sample size for a study. For this reason, assurance is often referred to as “Bayesian power” or the "true probability of success".
The following table has been added to this area:
This table calculates the Bayesian Assurance in a two independent group survival analysis and uses two individual inverse gamma priors to characterise the uncertainty in the control and treatment group hazard rates.
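To make the definition of assurance concrete, the sketch below averages power over a prior distribution. For simplicity it assumes a one-sided two-sample z-test with known standard deviation and a normal prior on the mean difference, rather than the survival setting with priors on hazard rates that the table covers. The names and the example values are assumptions for illustration.

```python
from math import sqrt
from random import Random
from statistics import NormalDist

_STD = NormalDist()


def power_two_sample_z(delta, sigma, n_per_group, alpha=0.025):
    """Frequentist power of a one-sided z-test at a fixed true difference."""
    se = sigma * sqrt(2 / n_per_group)
    return _STD.cdf(delta / se - _STD.inv_cdf(1 - alpha))


def assurance_closed_form(prior_mean, prior_sd, sigma, n_per_group, alpha=0.025):
    """Power averaged over a normal prior on the difference (closed form)."""
    se = sigma * sqrt(2 / n_per_group)
    z_crit = _STD.inv_cdf(1 - alpha)
    return _STD.cdf((prior_mean - z_crit * se) / sqrt(prior_sd ** 2 + se ** 2))


def assurance_monte_carlo(prior_mean, prior_sd, sigma, n_per_group,
                          alpha=0.025, draws=20000, seed=1):
    """Same quantity by simulation: draw the effect, then average the power."""
    rng = Random(seed)
    total = sum(
        power_two_sample_z(rng.gauss(prior_mean, prior_sd), sigma, n_per_group, alpha)
        for _ in range(draws)
    )
    return total / draws


power = power_two_sample_z(delta=0.5, sigma=1.0, n_per_group=85)
exact = assurance_closed_form(prior_mean=0.5, prior_sd=0.2, sigma=1.0, n_per_group=85)
simulated = assurance_monte_carlo(prior_mean=0.5, prior_sd=0.2, sigma=1.0, n_per_group=85)
print(f"power at prior mean: {power:.3f}")
print(f"assurance (closed form): {exact:.3f}, (simulated): {simulated:.3f}")
```

When power at the prior mean is high, assurance is lower than that power because the prior places weight on smaller effects.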
The Posterior Error Method uses Bayesian theory to provide an alternative to the frequentist type I and type II error rates used in null hypothesis testing. The method was proposed by Lee and Zelen in 2002, who argue that error rates conditioned on whether the null hypothesis is true or not do not reflect typical medical decision making, and that the inverse conditionals of these errors would be more appropriate for that purpose. Thus, they derive posterior error rates for a frequentist analysis, where the posterior error rate considers how likely it is that the null hypothesis is true or not given the result of the trial (i.e. statistically significant/not significant). For example, the traditional frequentist type I error rate is defined as the probability that the null hypothesis is rejected given that the null hypothesis is true. The posterior type I error rate, on the other hand, represents the probability that the null hypothesis is true given that it was rejected (i.e. the result was significant).
Since the Posterior Error Method still assumes a frequentist analysis, an analyst has the option to consider both the frequentist and posterior errors simultaneously, converting between them with the addition of a single parameter: the prior belief against the null hypothesis.
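The conversion between the two kinds of error can be sketched by applying Bayes' theorem directly to the definitions above. This is an illustration under the assumption that the prior belief against the null is a single probability `prior_alt`. The exact parameterization used by Lee and Zelen or by nQuery may differ, and the function names are hypothetical.

```python
def posterior_errors(alpha, power, prior_alt):
    """Frequentist (alpha, power) -> posterior type I and type II error rates."""
    beta = 1 - power
    prior_null = 1 - prior_alt
    # P(null true | rejected)
    post_type1 = alpha * prior_null / (alpha * prior_null + power * prior_alt)
    # P(alternative true | not rejected)
    post_type2 = beta * prior_alt / (beta * prior_alt + (1 - alpha) * prior_null)
    return post_type1, post_type2


def frequentist_errors(post_type1, post_type2, prior_alt):
    """Inverse conversion: posterior error rates -> (alpha, power)."""
    prior_null = 1 - prior_alt
    a = post_type1 * prior_alt / (prior_null * (1 - post_type1))
    b = post_type2 * prior_null / (prior_alt * (1 - post_type2))
    if a * b == 1:
        raise ValueError("posterior errors do not determine a unique design")
    alpha = a * (1 - b) / (1 - a * b)
    beta = b * (1 - a) / (1 - a * b)
    return alpha, 1 - beta


p1, p2 = posterior_errors(alpha=0.05, power=0.8, prior_alt=0.5)
print(f"posterior type I: {p1:.4f}, posterior type II: {p2:.4f}")
print(frequentist_errors(p1, p2, prior_alt=0.5))
```

With an even prior, a 5% significance level and 80% power, roughly 5.9% of significant results would come from a true null under these assumptions.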
The Summer 2019 release contains the following 4 tables in this area:
The Posterior Error Method can be applied to a wide range of pre-existing frequentist tables by simply calculating the posterior errors from the significance level and power, and vice versa. To facilitate this, the Posterior Error Rate Calculator has been created in nQuery and can be found in the Assistants menu.
In addition to this, three tables have been created which detail the specific application of this method to commonly used frequentist methods for the analysis of means using the Z-test.
To access the Bayesian module, you must have an nQuery Advanced Plus or Advanced Pro subscription. If you do, nQuery should automatically prompt you to update.
Summer 2019 nQuery Release Notes | ver 8.4
In the Summer 2019 release, we will be adding 29 new sample size tables to the nQuery Advanced Base module. This summary will provide an overview of which areas have been targeted in this release and the full list of tables being added.
In the Summer 2019 release, two main areas were targeted for development. These are:
Hierarchical models are used when data are nested within multiple levels. For example, students (level 1) are nested within a class (level 2), which is nested within a school (level 3), which is nested within a school district (level 4). Repeated measures can also be modelled in a similar fashion. Hierarchical models are also known as multi-level or mixed-effects models.
The main benefit of these models is the ability to formally account for the nesting structure and ensure the appropriate standard errors are given for the treatment effect by accounting for issues such as the self-similarity of units within a level (e.g. within-subject correlation structures for repeated measures) and the randomization level (e.g. subjects being equally randomized to each treatment within a cluster versus all subjects in a cluster getting the same treatment and randomizing treatment by cluster). Both cross-sectional (i.e. single measurement per subject) and longitudinal (i.e. multiple measurements per subject) designs are easily encompassed within hierarchical modelling methods.
For sample size determination, the use of a mixed model is usually assumed, and the effects of the level of randomization, self-similarity within units and within-subject correlation (for longitudinal studies) have to be accounted for to achieve the appropriate sample size.
For example, a common type of hierarchical trial design would be a 2-Level longitudinal hierarchical design in which subjects (level 2) are randomized to one of two treatments and then multiple observations (level 1) are made on each subject over time. In determining the sample sizes for such a design, the hierarchical data structure must be considered since both the first and second level units contribute to the total variation in the observed outcomes. In addition, level 1 data units (i.e. repeated measurements) from the same level 2 unit (i.e. subject) tend to be positively correlated.
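The effect of positive correlation within a higher-level unit on sample size can be illustrated with the familiar design effect, 1 + (m - 1)ρ, for m level 1 units per level 2 unit with intraclass correlation ρ. The sketch below applies it to a cluster-randomized comparison of two means using a normal-approximation formula. It is a generic textbook approximation offered as an assumption, not the specific formulas from Ahn, Heo, and Zhang (2015) that the tables implement.

```python
from math import ceil
from statistics import NormalDist


def design_effect(units_per_cluster, icc):
    """Variance inflation from correlated observations within a cluster."""
    return 1 + (units_per_cluster - 1) * icc


def clusters_per_arm(sd, delta, units_per_cluster, icc, alpha=0.05, power=0.9):
    """Clusters per arm for a cluster-randomized comparison of two means."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    n_independent = 2 * (z_sum * sd / delta) ** 2   # per arm, ignoring clustering
    n_inflated = n_independent * design_effect(units_per_cluster, icc)
    return ceil(n_inflated / units_per_cluster)


# 20 subjects per cluster; compare no correlation with a modest ICC of 0.05.
print(clusters_per_arm(sd=1.0, delta=0.5, units_per_cluster=20, icc=0.0))
print(clusters_per_arm(sd=1.0, delta=0.5, units_per_cluster=20, icc=0.05))
```

Even a small intraclass correlation nearly doubles the number of clusters required in this example.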
In the Summer 2019 release, 17 new Hierarchical Model tables will be added, which vary across two and three level designs and are derived from Ahn, Heo, and Zhang (2015).
The main areas fall under the following headings:
Mixed Models Tests for Means
The mixed models tests for means tables relate to (cross-sectional) designs in which subjects are assigned to one of two treatments (e.g. treatment and control) and the aim of the study is to compare the mean response of each group.
In the Summer 2019 release we are adding 6 tables in this area. These are as follows:
Mixed Models Test for Two Means in a 2-Level Hierarchical Design (Cluster Randomization)
Mixed Models Test for Two Means in a 2-Level Hierarchical Design (Subject Randomization)
Mixed Models Tests for Two Means in a 3-Level Hierarchical Design (Level 3 Randomization)
Mixed Models Tests for Two Means in a 3-Level Hierarchical Design (Level 2 Randomization)
Mixed Models Tests for Two Means in a 3-Level Hierarchical Design (Level 1 Randomization)
Tests for Two Means in a Multicenter Randomized Design
Mixed Models Tests for Slope Differences
The mixed models tests for slope differences relate to longitudinal studies in which subjects are randomly assigned to one of two treatments and then followed up for a pre-specified number of visits. Note that in these designs the subjects are considered the “second” level units and the repeated measures are the “first” level units. In this design the treatment effect is the difference in the slopes (trend) in the outcome over time for each treatment group. For the per-subject slope, both fixed and random slope effect models are available. For the former, subjects within the same treatment group are assumed to have a common slope and for the latter, each subject is allowed to have their own trend in outcome over time assuming the slopes come from a per-treatment normal distribution.
In the Summer 2019 release we are adding 6 new tables in this area. These are as follows:
Mixed Models Tests for Proportions
The mixed models tests for proportions tables relate to (cross-sectional) designs in which subjects are assigned to one of two treatments (e.g. treatment and control) and the aim of the study is to compare the binomial proportion responses in each group.
In the Summer 2019 release we are adding 5 new tables in this area. These are as follows:
This release will bring 12 new Interval Estimation tables, adding to the 68 interval estimation tables already on offer. Interval estimation relates to calculating an interval for the “probable” range of an unknown parameter. In sample size determination, the target for the sample size is the desired width (precision) of the calculated interval. Recently, there has been growing interest in inference and sample size methods which target precision rather than relying on the inherent dichotomous decision making associated with hypothesis testing and power.
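Targeting precision rather than power can be illustrated with the simplest case: a confidence interval for a single mean with known standard deviation, where the sample size is chosen so that the full interval width does not exceed a target. This is a generic sketch under those assumptions and is not one of the new tables.

```python
from math import ceil
from statistics import NormalDist


def n_for_ci_width(sd, width, confidence=0.95):
    """Smallest n so that a z-based CI for a mean has at most the given width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((2 * z * sd / width) ** 2)


# SD of 10: a 95% interval of total width 4, then of total width 2.
print(n_for_ci_width(sd=10, width=4))
print(n_for_ci_width(sd=10, width=2))
```

Halving the target width roughly quadruples the required sample size, since the width shrinks with the square root of n.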
The most common type of statistical interval is the confidence interval, but other types of interval are available when the target of inference is not the parameter value in repeated sampling. These are a focus of this release, with prediction and tolerance interval methods being included. Note that Bayesian intervals such as the posterior credible interval have been covered in previous updates. The types of interval estimation covered in the Summer 2019 release are in the following areas:
Confidence Intervals
Confidence Intervals are the most widely used statistical interval and represent an interval within which a population parameter is likely to lie with a given level of probability. The new confidence interval tables in the Summer 2019 release relate to a diverse set of statistics common in areas such as engineering, quality control, enzyme kinetics and phycology. One of these is also a confidence interval for the limit of a reference interval. Reference intervals provide a reference for the expected values from a given population. For example, in health-related fields, a reference interval is the range of values that is deemed “normal” for a physiologic measurement in healthy persons.
In the Summer 2019 release we are adding 6 new Confidence Interval tables. These are as follows:
Prediction Intervals
Prediction Intervals are less widely used than confidence intervals but are vital in many regulatory and industrial fields. A prediction interval is an interval within which a future sample/subject/parameter (from the same sampling distribution) will lie with a given probability. A common use case is giving an interval that will contain a single future observation with a given probability.
The new additions relate to intervals which predict an unknown quantity from a future sample based on a previously sampled normal distribution. The methods target the sample size based on the ratio of the calculated prediction interval to the limiting interval, where the limiting interval is the prediction interval based on an infinite sample size. Two different versions of this metric (expected ratio, upper prediction limit for ratio) are provided.
In the Summer 2019 release we are adding 3 new Prediction Interval tables. These are as follows:
Tolerance Intervals
A tolerance interval is an interval determined from the observed values of a random sample for the purpose of drawing an inference about the proportion of a distribution contained within that interval. For the tables included in this release, one may interpret these as an interval which contains at least a given proportion of the sampled population with a given probability. Data from the normal, exponential and Gamma distributions are covered in this release.
In the Summer 2019 release we are adding 3 new Tolerance Interval tables. These are as follows:
If you have nQuery Advanced installed, nQuery should automatically prompt you to update.