Saturday 25 March 2017

The operations manager from Q4 wants to make sure the right sample size was used to collect the observations and calculate the standard time. The...

We need to estimate the time, t, taken by a manufacturing process such that, with 99% confidence, a +/-2% accuracy interval about the estimate t* contains the true value of t. The question is whether a sample size of n=10 is large enough for this to hold.

We require that t is in the interval [t*(1-a),t*(1+b)] with 99% confidence, where


(Eqn 1) a + b = 0.04 (so that the interval covers 4% around t* but need not be exactly symmetric)


We can simplify the problem by moving to the log scale, so that log(t) is required to fall in the interval about log(t*)


[log(t*) + log(1-a), log(t*) + log(1+b)]


with 99% confidence.


For this interval to be symmetric about log(t*) on the log scale, with half-width c, we need


-log(1-a) = log(1+b) = c


giving the identity


(Eqn 2) (1-a)(1+b) = 1


Substituting a = 0.04 - b from (Eqn 1) into (Eqn 2) we have that


(b+0.96)(1+b) = 1


b^2 + 1.96b - 0.04 = 0


Solving using the quadratic formula to obtain b and then using (Eqn 1) to obtain a we have


a = 0.0198, b = 0.0202 and hence c = log(1+b) = 0.02
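The values above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `solve_interval` is illustrative, and it assumes that log denotes the natural logarithm (which is what makes 2% map to roughly 0.02).

```python
import math

def solve_interval(width=0.04):
    """Solve (1 - a)(1 + b) = 1 with a + b = width; return a, b and c = log(1 + b)."""
    # Substituting a = width - b gives b^2 + (2 - width) * b - width = 0.
    p = 2.0 - width
    b = (-p + math.sqrt(p * p + 4.0 * width)) / 2.0  # take the positive root
    a = width - b
    c = math.log1p(b)  # natural log of (1 + b)
    return a, b, c

a, b, c = solve_interval()
print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}")
# prints: a = 0.0198, b = 0.0202, c = 0.0200
```

The negative root of the quadratic is discarded because b must be a small positive fraction.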


The values of a and b are not quite equal, since for a +/-2% interval to be additive on the logarithmic scale, the 4%-wide interval on the original scale needs to be slightly asymmetric. Note that 2% is so small a percentage that it translates very closely to 0.02 on the log scale.


Hence an interval about log(t*), the sample estimate of log(t), that allows for +/-2% error in t* is given by


(Eqn 3) log(t*) +/- 0.02


Now, applying the Central Limit Theorem, we assume that the sampling distribution of log(t*) can be approximated as Normal(log(t), sigma^2/n), where sigma^2 is the underlying variance of measurements of log(t) and n is the sample size.


A 99% confidence interval for log(t) from a sample size n will thus be of the form


(Eqn 4) log(t*) +/- 2.58sigma/sqrt(n)


where 2.58 is the 99.5th percentile of the standard Normal distribution. By taking 0.5% off each end of the distribution, we ensure we have a two-sided 99% confidence interval.
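As an illustration, the multiplier can be recovered from the standard Normal quantile function. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are illustrative.

```python
from statistics import NormalDist

confidence = 0.99
# Two-sided interval: leave (1 - confidence) / 2 in each tail,
# so the multiplier is the 99.5th percentile of the standard Normal.
z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
print(f"z = {z:.4f}")  # about 2.5758, which rounds to the 2.58 used in the text
```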


In practice, we of course need to plug an estimate of sigma, sigma*, into (Eqn 4), since the true value of sigma is unknown.


Since the 99% interval in (Eqn 4) is required to be no larger than the +/-2% accuracy interval in (Eqn 3), we require that


0.02 >= 2.58sigma/sqrt(n)


that is


n >= (2.58/0.02)^2sigma^2


n >= 16641sigma^2


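Supposing that we have a sample estimate for sigma of sigma* = 0.1 (taken here from 6 seconds = 0.1 mins; strictly, sigma is the standard deviation of log(t) and so is unitless, with 0.1 corresponding to roughly 10% relative variation in t) then we would require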


n >= 166.41
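Since n must be a whole number, the bound rounds up to 167 observations. The calculation can be sketched as below; the helper `required_sample_size` and its parameter names are hypothetical, and the code relies on the same Normal approximation as the derivation.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, half_width=0.02, confidence=0.99, z=None):
    """Smallest integer n with z * sigma / sqrt(n) <= half_width."""
    if z is None:
        # Two-sided multiplier from the standard Normal quantile function.
        z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return math.ceil((z * sigma / half_width) ** 2)

n_rounded_z = required_sample_size(0.1, z=2.58)  # z rounded as in the text
n_exact_z = required_sample_size(0.1)            # z of about 2.5758
print(n_rounded_z, n_exact_z)  # prints: 167 166
```

Using the unrounded multiplier lowers the requirement by one observation, which does not change the conclusion about n=10.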


Unless the process is fast and its duration can be measured accurately (so that sigma is small), a sample of size n=10 is not enough to be 99% confident that the interval [t*(1-a),t*(1+b)], whose total width is 4% of t*, contains the true time t. Allowing only 2% inaccuracy on either side of the estimate t* means that a sample of only n=10 gives less than 99% confidence that t lies in the interval about t*. In fact, the associated confidence in this case would be only


2*Phi(0.02*sqrt(10)/0.1) - 1 = 2*Phi(0.632) - 1 = 47.3%


where Phi() is the cdf of the standard Normal distribution. This is a very low level of confidence when compared to 99%.
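The confidence attained by a given sample size can be checked with a short sketch. Under the same Normal approximation, the chance that the +/-0.02 interval on the log scale covers log(t) is 2*Phi(z) - 1 with z = 0.02*sqrt(n)/sigma, while 2*(1 - Phi(z)) is the complementary chance that it misses. The helper name `achieved_confidence` is hypothetical.

```python
import math
from statistics import NormalDist

def achieved_confidence(n, sigma, half_width=0.02):
    """Probability that log(t*) +/- half_width covers log(t) under the Normal approximation."""
    z = half_width * math.sqrt(n) / sigma
    return 2.0 * NormalDist().cdf(z) - 1.0

cover_10 = achieved_confidence(10, 0.1)
miss_10 = 1.0 - cover_10  # equals 2 * (1 - Phi(z))
print(f"n=10: covers with probability {cover_10:.3f}, misses with probability {miss_10:.3f}")
print(f"n=167: covers with probability {achieved_confidence(167, 0.1):.3f}")
```

With sigma* = 0.1, n=10 gives a coverage of roughly 47% (so a miss probability of roughly 53%), whereas n=167 just clears 99%.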
