Saturday, 25 March 2017

The operations manager from Q4 wants to make sure the right sample size was used to collect the observations and calculate the standard time. The...

We want to estimate the time, t, for a manufacturing process such that, with 99% confidence, a +/-2% accuracy interval about the estimate of t, t*, contains the true value of t. The question is whether a sample size of n=10 is large enough for this to hold.

We require that t is in the interval [t*(1-a),t*(1+b)] with 99% confidence, where


(Eqn 1) a +b = 0.04 (so that the interval covers 4% around t* but need not be exactly symmetric)


We can simplify the problem by moving to the log scale, so that log(t) is required to fall in the interval about log(t*)


[log(t*) + log(1-a), log(t*) + log(1+b)]


with 99% confidence.


For this interval to be symmetric about log(t*) on the log scale, with half-width c, we need


-log(1-a) = log(1+b) = c


giving the identity


(Eqn 2) (1-a)(1+b) = 1


Substituting a = 0.04-b  from (Eqn 1) into (Eqn 2) we have that


(b+0.96)(1+b) = 1


b^2 + 1.96b - 0.04 = 0


Solving using the quadratic formula to obtain b and then using (Eqn 1) to obtain a we have


a = 0.0198, b = 0.0202 and hence c = log(1+b) = 0.02
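
As a quick numerical check of the values above, the quadratic can be solved directly. This is only a sketch: the variable names (total_width, p, q) are illustrative and not part of the original working.

```python
import math

# Total relative width of the interval, a + b, from (Eqn 1).
total_width = 0.04

# (b + 1 - total_width)(1 + b) = 1 expands to b^2 + p*b + q = 0 with:
p = 2.0 - total_width   # 1.96
q = -total_width        # -0.04

# Positive root from the quadratic formula.
b = (-p + math.sqrt(p * p - 4.0 * q)) / 2.0
a = total_width - b      # from (Eqn 1)
c = math.log(1.0 + b)    # half-width on the log scale (natural log)

print(round(a, 4), round(b, 4), round(c, 4))
```

The product (1-a)(1+b) comes out as 1 up to floating-point error, which confirms that (Eqn 2) holds for these values.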


The values of a and b are not quite equal, since for a +/-2% interval to be additive on the logarithmic scale, the 4%-wide interval on the original scale needs to be slightly asymmetric. Note that 2% is so small a percentage that it translates very closely to 0.02 on the log scale.


Hence the interval about log(t*), the sample estimate of log(t), that allows for +/-2% error in t* is given by


(Eqn 3) log(t*) +/- 0.02


Now, applying the Central Limit Theorem, we assume that the sampling distribution of log(t*) can be approximated as Normal(log(t), sigma^2/n), where sigma^2 is the underlying variance of measurements of log(t) and n is the sample size.


A 99% confidence interval for log(t) from a sample size n will thus be of the form


(Eqn 4) log(t*) +/- 2.58sigma/sqrt(n)


where 2.58 is the 99.5th percentile of the standard Normal distribution. By taking 0.5% off each end of the distribution, we ensure we have a two-sided 99% confidence interval.
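
The multiplier 2.58 can be reproduced with Python's standard library. This is a small sketch, assuming Python 3.8 or later for statistics.NormalDist; the variable names are illustrative.

```python
from statistics import NormalDist

confidence = 0.99
tail = (1.0 - confidence) / 2.0        # 0.5% in each tail

# 99.5th percentile of the standard Normal distribution.
z = NormalDist().inv_cdf(1.0 - tail)

print(round(z, 2))
```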


In practice, we of course need to plug an estimate of sigma, sigma*, into (Eqn 4), since the true value of sigma is unknown.


Since the 99% interval in (Eqn 4) is required to be no larger than the +/-2% accuracy interval in (Eqn 3), we require that


0.02 >= 2.58sigma/sqrt(n)


that is


n >= (2.58/0.02)^2sigma^2


n >= 16641sigma^2


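Supposing that we have a sample estimate for sigma of sigma* = 0.1 on the log scale (roughly a 10% relative spread, e.g. a standard deviation of 6 seconds = 0.1 mins for a process taking about 1 minute) then we would require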


n >= 166.41, i.e. at least n = 167 observations
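
The sample-size requirement can be wrapped in a small helper to try other values of sigma. This is a sketch under the same Normal approximation; the function name required_sample_size and its defaults are hypothetical, chosen to match the numbers used above.

```python
import math

def required_sample_size(sigma, half_width=0.02, z=2.58):
    """Smallest whole n satisfying z * sigma / sqrt(n) <= half_width."""
    return math.ceil((z * sigma / half_width) ** 2)

# sigma = 0.1 as in the text: (2.58 * 0.1 / 0.02)^2 = 166.41, rounded up.
n_required = required_sample_size(0.1)
print(n_required)
```

Because n scales with sigma^2, halving sigma cuts the required sample size to roughly a quarter.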


Unless the variability of the measured times is small (for instance, a fast process that is also timed accurately), a sample of size n=10 is not enough to be 99% confident that the interval [t*(1-a),t*(1+b)], whose total width is 4% of t*, contains the true time t. Allowing only 2% inaccuracy on either side of the estimate t* means that a sample of only n=10 gives less than 99% confidence that t lies in the interval about t*. In fact, the associated confidence in this case would be only


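2*Phi(0.02*sqrt(10)/0.1) - 1 = 2*Phi(0.632) - 1 = 47.3%


where Phi() is the cdf of the standard Normal distribution. The two-tailed quantity 2*(1-Phi(0.632)) = 52.7% is the complementary probability that the interval misses t. This is a very low level of confidence when compared to 99%.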
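
The figures for n=10 can be checked numerically. The sketch below computes z = 0.02*sqrt(n)/sigma, then both the two-sided tail probability 2*(1-Phi(z)) and its complement 2*Phi(z)-1, which is the probability that log(t*) falls within +/-0.02 of log(t). The function name coverage is hypothetical, and sigma = 0.1 is taken from the example above.

```python
import math
from statistics import NormalDist

def coverage(n, sigma, half_width=0.02):
    """Probability that a Normal(0, sigma^2/n) error lies within +/- half_width."""
    z = half_width * math.sqrt(n) / sigma
    return 2.0 * NormalDist().cdf(z) - 1.0

cov_10 = coverage(10, 0.1)    # chance the +/-0.02 interval contains log(t)
tail_10 = 1.0 - cov_10        # equals 2 * (1 - Phi(z)), the chance it misses

print(round(cov_10, 3), round(tail_10, 3))
```

With n = 167 the same function returns just over 0.99, consistent with the sample-size bound derived earlier.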
