We need to estimate the time, t, taken by a manufacturing process such that, with 99% confidence, a +/-2% accuracy interval about the estimate of t, t*, contains the true value of t. The question is whether a sample of size n=10 is large enough for this to hold.
We require that t is in the interval [t*(1-a),t*(1+b)] with 99% confidence, where
(Eqn 1) a + b = 0.04 (so that the interval spans 4% of t* in total but need not be exactly symmetric)
We can simplify the problem by moving to the log scale, so that log(t) is required to fall in the interval about log(t*)
[log(t*) + log(1-a), log(t*) + log(1+b)]
with 99% confidence.
For this interval to be symmetric about log(t*) on the log scale, with half-width c, we need
-log(1-a) = log(1+b) = c
giving the identity
(Eqn 2) (1-a)(1+b) = 1
Substituting a = 0.04-b from (Eqn 1) into (Eqn 2) we have that
(b+0.96)(1+b) = 1
b^2 + 1.96b - 0.04 = 0
Solving using the quadratic formula to obtain b and then using (Eqn 1) to obtain a we have
a = 0.0198, b = 0.0202 and hence c = log(1+b) = 0.02
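The quadratic above can be checked numerically. The following is a minimal sketch in Python using only the standard library; the variable names (total_width, p, q) are illustrative and not part of the original derivation.

```python
import math

# Total relative width of the interval on the original scale, a + b (Eqn 1).
total_width = 0.04

# Substituting a = total_width - b into (1 - a)(1 + b) = 1 (Eqn 2) gives
#   b^2 + (2 - total_width) * b - total_width = 0
p = 2.0 - total_width   # coefficient of b, here 1.96
q = -total_width        # constant term, here -0.04

# Positive root from the quadratic formula (the negative root is not a valid width).
b = (-p + math.sqrt(p * p - 4.0 * q)) / 2.0
a = total_width - b
c = math.log(1.0 + b)   # half-width of the interval on the log scale

print(round(a, 4), round(b, 4), round(c, 4))
```

To four decimal places this reproduces the values quoted above, and -log(1-a) agrees with log(1+b), as (Eqn 2) requires.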
The values of a and b are not quite equal: for the interval to be symmetric (+/-c) on the logarithmic scale, the 4%-wide interval on the original scale needs to be slightly asymmetric. Note that 2% is so small a percentage that it translates very closely to 0.02 on the log scale.
Hence the interval about the sample estimate log(t*) that corresponds to +/-2% error in the estimate t* is given by
(Eqn 3) log(t*) +/- 0.02
Now, applying the Central Limit Theorem, we assume that the sampling distribution of log(t*) can be approximated as Normal(log(t), sigma^2/n), where sigma^2 is the underlying variance of measurements of log(t) and n is the sample size.
A 99% confidence interval for log(t) from a sample size n will thus be of the form
(Eqn 4) log(t*) +/- 2.58sigma/sqrt(n)
where 2.58 is the 99.5th percentile of the standard Normal distribution. By taking 0.5% off each end of the distribution, we ensure we have a two-sided 99% confidence interval.
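As a quick check of the 2.58 multiplier, here is a small sketch using Python's standard-library statistics.NormalDist; the names confidence and tail are illustrative.

```python
from statistics import NormalDist

confidence = 0.99
tail = (1.0 - confidence) / 2.0       # 0.5% in each tail for a two-sided interval

# Quantile of the standard Normal at the 99.5th percentile.
z = NormalDist().inv_cdf(1.0 - tail)

print(round(z, 4))   # rounds to 2.58 at two decimal places
```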
In practice, we of course need to plug an estimate of sigma, sigma*, into (Eqn 4), since the true value of sigma is unknown.
Since the 99% interval in (Eqn 4) is required to be no larger than the +/-2% accuracy interval in (Eqn 3), we require that
0.02 >= 2.58sigma/sqrt(n)
that is
n >= (2.58/0.02)^2 * sigma^2
n >= 16641 * sigma^2
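Supposing that we have a sample estimate for sigma of sigma* = 0.1 then we would require
n >= 166.41
that is, at least 167 observations. Note that sigma is the standard deviation of log(t), so it is unitless and is approximately the standard deviation of t relative to t itself. A value of 0.1 would correspond to, say, a standard deviation of 6 seconds = 0.1 mins for a process taking about 1 minute.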
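The sample-size inequality can be wrapped in a couple of small helper functions. This is only a sketch under the Normal approximation used above; the function names required_sample_size and max_sigma are hypothetical, and the defaults (half-width 0.02, multiplier 2.58) are taken from the worked example.

```python
import math

def required_sample_size(sigma, half_width=0.02, z=2.58):
    """Smallest integer n with half_width >= z * sigma / sqrt(n)."""
    return math.ceil((z * sigma / half_width) ** 2)

def max_sigma(n, half_width=0.02, z=2.58):
    """Largest sigma (on the log scale) for which a sample of size n suffices."""
    return half_width * math.sqrt(n) / z

n_needed = required_sample_size(0.1)   # sigma* = 0.1 from the worked example
sigma_limit = max_sigma(10)            # what n = 10 could tolerate

print(n_needed, round(sigma_limit, 4))
```

Under these assumptions, n = 10 would only be enough if sigma were no larger than about 0.0245, roughly a quarter of the value assumed in the example.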
Unless sigma is much smaller than 0.1 (that is, the process is fast and the time it takes is also measured accurately), a sample of size n=10 is not enough to be 99% confident that the interval [t*(1-a),t*(1+b)], whose total width is 4% of t*, contains the true time t. Allowing only 2% inaccuracy on either side of the estimate t* means that a sample of only n=10 gives less than 99% confidence that t lies in the interval about t*. In fact, the associated confidence in this case would be only
2*Phi(0.02*sqrt(10)/0.1) - 1 = 2*Phi(0.632) - 1 = 47.3%
where Phi() is the cdf of the standard Normal distribution. Equivalently, there is a 2*(1-Phi(0.632)) = 52.7% chance that the interval misses t. This is a very low level of confidence when compared to 99%.
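The confidence achieved by a given sample size can be computed directly. The sketch below again uses statistics.NormalDist from the Python standard library; the function name coverage is hypothetical. It returns the probability that the +/-0.02 interval on the log scale contains log(t), which is 2*Phi(z) - 1, and also reports the complementary two-tail probability 2*(1 - Phi(z)).

```python
import math
from statistics import NormalDist

def coverage(n, sigma, half_width=0.02):
    """Probability that log(t*) +/- half_width contains log(t),
    assuming log(t*) ~ Normal(log(t), sigma^2 / n)."""
    z = half_width * math.sqrt(n) / sigma
    return 2.0 * NormalDist().cdf(z) - 1.0

cov = coverage(10, 0.1)   # n = 10, sigma* = 0.1 as in the example
miss = 1.0 - cov          # two-tail probability, 2 * (1 - Phi(z))

print(round(cov, 3), round(miss, 3))
```

With n = 10 and sigma* = 0.1 the interval contains t a little under half the time, whereas coverage(167, 0.1) reaches the required 99%.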