Choosing the optimal number of samples when testing your medical device product boils down to a common trade-off: cost vs. benefit. Staying within your testing budget is important, but ensuring that your data is robust enough to withstand scrutiny from the regulatory body is paramount and avoids costly delays from re-testing. As a third-party contract testing lab, DDL will not make the final determination of sample size for our clients. However, we have observed common tendencies in the sample sizes our customers use when submitting device applications. We will outline these tendencies and other observations in the hope that they will be useful for those who are researching how best to structure their testing regimen.
Before discussing the common sampling trends DDL has observed, it is relevant to address pre-design verification (pre-DV) activities. Feasibility and characterization testing are often conducted during pre-DV studies using lower sample quantities than design verification. When determining the sample size during pre-DV, it is important to consider multiple samples per unique design input (e.g. mold cavity), as this may help mitigate false positives and issues down the road. In addition, the risk of the device’s manufacturing process should be considered when determining the sample size for these early-phase studies.
During design verification studies, the first decision to make when determining sample size is whether the test data you require will be attribute or variable. Attribute data, also called binomial data, are qualitative results. Pass/fail or go/no-go are common types of attribute data, for example whether a measured dimension falls within the tolerances on the drawing. Variable data are reported as numeric values. For example, the seal strength of a heat-sealed pouch or the tensile strength of a poly film is typically reported as a variable result.
The next factor in determining sample size is evaluating your risk tolerance. An internal regulatory or quality department or a consultant will provide good guidance on risk tolerance. In addition, ISO 14971, which describes how to apply risk management to medical devices, is called out by a significant number of ISO standards and may ultimately be helpful in determining the appropriate sample size. As one can expect, a high-risk product requires more test samples in order to achieve an acceptable confidence level. Statistically speaking, a higher-risk product means that you need to assign a more stringent (that is, lower) acceptable quality level (AQL), p0, to your experimental design. The AQL represents the maximum allowable proportion of defective items in a lot. For example, if a maximum of 5% of your parts can be defective, your p0 value would be 0.05. From there, using the cumulative geometric distribution function, you can determine the minimum sample size for an attribute test at your chosen confidence level. Similar principles apply to variable testing when it comes to risk, but instead of the sample size increasing with a higher-risk product, the required Cpk or K-value will change. Ultimately, the change in these values holds a higher-risk device to a more stringent acceptance criterion, so that it is still adequate for use against the appropriate risk assessment.
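As a worked illustration of the zero-failure attribute calculation described above, the relationship can be written out as follows. This is a sketch under stated assumptions: the symbol C for the desired confidence level and n for the sample size are introduced here for illustration, and the calculation assumes independent samples that each fail with probability at most p0.

```latex
% Probability that all n samples pass when the true defect rate is p_0
% must be no larger than 1 - C (C is the assumed confidence level):
\[
  (1 - p_0)^{n} \le 1 - C
  \quad\Longrightarrow\quad
  n \ge \frac{\ln(1 - C)}{\ln(1 - p_0)}
\]
% Example with C = 0.95 and p_0 = 0.05:
\[
  n \ge \frac{\ln(0.05)}{\ln(0.95)} \approx 58.4
  \quad\Longrightarrow\quad
  n = 59
\]
```

Because n must be a whole number of samples, the result is always rounded up.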
The most common sample sizes DDL sees for attribute tests are 29 and 59. For example, to obtain 95% confidence that your product’s passing rate is at least 95%, commonly summarized as “95/95”, 59 samples must be tested and all must pass. If your product has lower risk and you are able to accept a lower passing rate of 90%, only 29 passing samples are needed to obtain 95% confidence, or “95/90”. These numbers assume that there will be no failures in any of the samples. That, unfortunately, is not always the case. In the event of an isolated failure, a different equation, the negative binomial distribution, must be used. To maintain the same confidence levels as stated above with one failure, the sample size is 46 for a p0 value of 0.10 and 93 for a p0 value of 0.05, and it increases with additional failures.
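The sample sizes quoted above can be reproduced with a short Python sketch. It is illustrative only and is not DDL's method: the function names are hypothetical, samples are assumed to be independent, and the one-failure case is computed from the cumulative binomial probability, which is an equivalent way of stating the negative binomial condition.

```python
from math import ceil, comb, log


def zero_failure_sample_size(confidence: float, p0: float) -> int:
    """Smallest n with (1 - p0)**n <= 1 - confidence (every sample must pass)."""
    return ceil(log(1 - confidence) / log(1 - p0))


def sample_size_with_failures(confidence: float, p0: float, failures: int) -> int:
    """Smallest n such that seeing `failures` or fewer failures in n samples
    has probability <= 1 - confidence when the true defect rate is p0."""
    alpha = 1 - confidence
    n = max(1, failures)
    while True:
        # Cumulative binomial probability P(X <= failures) for X ~ Binomial(n, p0)
        tail = sum(
            comb(n, k) * p0**k * (1 - p0) ** (n - k)
            for k in range(failures + 1)
        )
        if tail <= alpha:
            return n
        n += 1


if __name__ == "__main__":
    print(zero_failure_sample_size(0.95, 0.05))      # 95/95, no failures: 59
    print(zero_failure_sample_size(0.95, 0.10))      # 95/90, no failures: 29
    print(sample_size_with_failures(0.95, 0.10, 1))  # 95/90, one failure: 46
    print(sample_size_with_failures(0.95, 0.05, 1))  # 95/95, one failure: 93
```

Passing a larger `failures` value shows how quickly the required sample size grows with each additional allowed failure.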
At the end of the day, determining sample size for an attribute test is a straightforward task once the statistical requirements are known, but its importance cannot be overstated. Conversely, with variable testing, there are many product-dependent factors to consider before arriving at a sample size. Determining your sample size using either statistical method will not only help ensure that regulatory requirements are met, but also provide evidence that the quality of the product is high, which supports patient safety.