1) We will use the words "error" and "uncertainty" interchangeably.

2) Error analysis is the study and evaluation of uncertainty in measurement. No experiment, however carefully made, can be completely free of uncertainties. In science, "error" refers to this inevitable uncertainty; errors are not "mistakes", "blunders" or "human errors". The best you can hope for is to ensure that errors are as small as reasonably possible (or as small as needed), and to provide a good estimate of their magnitude.

3) Some uncertainties are associated with the quality of the instruments we use (for instance, the smallest reading we can discriminate), some with the definition of the quantity being measured (for instance, a person's height), and some with the technique we decide to use (which may be temperature dependent, humidity dependent, history dependent, etc.).

4) Error and discrepancy are completely different concepts.

Error refers only to the uncertainty in our result, regardless of what the expected value (if there is one) is supposed to be.

Discrepancy quantifies how much our result differs from some expected value. For instance, if you determine in an experiment a value gmeas = (9.5 +/- 0.1) m/s^2, you can calculate the discrepancy with respect to the accepted value (let's say gacc = (9.79 +/- 0.01) m/s^2) and quantify it as a percent discrepancy:

D = [(gacc - gmeas)/gacc] * 100 = [(9.79 - 9.5)/9.79] * 100 = 3%

We would say our result is "within 3%" of the accepted value (which might be great or not depending on the context).

Notice that the uncertainties (0.1, 0.01 m/s^2) play no role in discrepancy calculations. The discrepancy may be significant or not (see below).
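The percent discrepancy can be checked with a few lines of Python. This is only a minimal sketch: the function name `percent_discrepancy` is illustrative, and taking the absolute value (so that the sign of the difference is dropped) is an assumption not stated in the formula above.

```python
def percent_discrepancy(measured, accepted):
    """Percent discrepancy of a measured value relative to an accepted one.

    Assumption: the absolute value is taken, so the result is never negative.
    The uncertainties are deliberately not used.
    """
    return abs(accepted - measured) / abs(accepted) * 100.0


# Values from the example: gmeas = 9.5 m/s^2, gacc = 9.79 m/s^2
d = percent_discrepancy(9.5, 9.79)
print(f"D = {d:.0f}%")  # prints "D = 3%"
```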

5) Other terms that we need to define are agreement and disagreement.

We say that two results (for instance two determinations of g using different methods, or one determination of g and its accepted value) are in agreement if there is significant overlap between the two ranges of values (from best estimate - uncertainty to best estimate + uncertainty in each case).

In the example above, the range for gmeas is 9.4 to 9.6 m/s^2, the range for gacc is 9.78 to 9.80 m/s^2, so the results are in disagreement (no overlap). In other words the discrepancy is significant (difference larger than the errors), and we say that the results are "significantly different".

Note that this is nothing to be "ashamed of"; it is a statement of fact. For instance, if some other "sloppy" experiment gave gmeas = (9 +/- 1) m/s^2, there would now be agreement because the ranges do overlap (nothing to feel very proud of, though...), and we would say that the results are "not significantly different".

At the other extreme, a "very careful" measurement might produce a result like gmeas = (9.75 +/- 0.01) m/s^2, which is in disagreement with the accepted value (significantly different)!

Note also that you cannot decide if your results agree or disagree unless you can calculate your uncertainties.
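The three g examples above can be checked with a short range-overlap test. This is a rough sketch under a simplifying assumption: it reports agreement whenever the two ranges overlap at all, whereas judging whether an overlap is "significant" is left to the experimenter. The function name `ranges_overlap` is hypothetical.

```python
def ranges_overlap(best1, unc1, best2, unc2):
    """True if [best1 - unc1, best1 + unc1] and [best2 - unc2, best2 + unc2] overlap."""
    low1, high1 = best1 - unc1, best1 + unc1
    low2, high2 = best2 - unc2, best2 + unc2
    return low1 <= high2 and low2 <= high1


g_acc = (9.79, 0.01)  # accepted value and its uncertainty, m/s^2

print(ranges_overlap(9.5, 0.1, *g_acc))    # False: disagreement
print(ranges_overlap(9.0, 1.0, *g_acc))    # True: agreement (sloppy experiment)
print(ranges_overlap(9.75, 0.01, *g_acc))  # False: disagreement (careful experiment)
```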

In most labs the crucial point (especially for your lab report grade!) is to decide whether there is agreement or disagreement.

 

6) Random and Systematic Errors

Uncertainties are classified into two groups:

6a) Random errors: can be treated statistically; that is, they can be revealed by repeating the measurements, which also provides a well-defined procedure to reduce them if necessary (by sufficiently increasing the number of measurements). They affect the precision (see 7) of a measurement.

6b) Systematic errors: cannot be treated statistically; that is, they cannot be revealed by repeating the measurements. For this reason they are hard to evaluate and even to detect. One has to learn to anticipate the possible sources of systematic error, and to make sure these errors are much smaller than the required precision (usually 10 times smaller is enough), so that they become negligible. They do not affect the precision (see 7) of a measurement; they may only affect its accuracy.

Example: Imagine we are timing the revolution of a turntable (33 1/3 rpm). One source of error will be our reaction time in starting and stopping the watch. If our reaction time were constant, these delays would cancel each other out (that is, we would start the watch, let's say, 0.3 s late, and we would also stop it 0.3 s late). In practice, however, our reaction time will vary, usually in a way that sometimes overestimates and sometimes underestimates the times. In this case the sign of the effect is random. If we repeat the measurements many times and analyze statistically the spread in the results, we can get a very reliable estimate of this kind of error.

On the other hand, if the stopwatch is running slow, then all our times will be underestimates and no amount of repetition will reveal the source of error. This error is then called systematic because the sign of the error is always the same, and it cannot be discovered by statistical analysis.
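The turntable example can be illustrated with a small simulation. This is only a sketch under invented assumptions: the reaction-time jitter (0.05 s) and the slow-watch factor (0.98) are made-up numbers, and the errors are drawn from a normal distribution for convenience.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_PERIOD = 1.8   # s, one revolution at 33 1/3 rpm (60 s / 33.33...)
JITTER = 0.05       # s, assumed spread of the random reaction-time error
SLOW_FACTOR = 0.98  # assumed: the slow watch records 98% of the elapsed time


def simulate(n, scale=1.0):
    """Return n simulated timings with random reaction-time jitter."""
    return [scale * TRUE_PERIOD + random.gauss(0.0, JITTER) for _ in range(n)]


good_watch = simulate(1000)
slow_watch = simulate(1000, scale=SLOW_FACTOR)

for name, data in [("good watch", good_watch), ("slow watch", slow_watch)]:
    mean = statistics.mean(data)
    spread = statistics.stdev(data)
    print(f"{name}: mean = {mean:.3f} s, spread = {spread:.3f} s")
```

Both data sets show about the same spread, so the statistics alone give no hint that the second watch is slow; only the shifted mean, compared against an independent reference, would reveal it.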

In general, sources of random errors are small problems of judgment by the observer (for instance interpolating in reading a scale), small disturbances of the apparatus (for instance vibrations), problems of definition (for instance "the height of a person"), etc. Obvious sources of systematic errors are instruments' miscalibrations, misalignments, improper zeroing, bending of a measuring tape or meter-stick, etc.

Sometimes this distinction between random and systematic errors is not clear-cut. For instance, the parallax effect is associated with not looking perfectly perpendicularly at a scale when reading a value, but it may be random if "inconsistent" (head sometimes on one side and sometimes on the other side of the normal) or systematic if consistent.

In the Introductory Labs it is very hard to track down the sources of systematic errors and to quantify them (for instance by recalibrating an instrument against a better one). In some cases your lab instructor might specify them for a particular instrument as "1% relative uncertainty" or "5% relative uncertainty", depending on the condition and/or age of the instrument. Even when this is not possible, your lab instructors will expect to see in your lab reports an honest discussion of random and systematic error sources and their magnitudes when analyzing disagreements or discrepancies.

 

 

7) Two more concepts must be defined: precision and accuracy.

Precision refers (inversely) to the amount of dispersion of the experimental results with respect to the average: a high-precision measurement is characterized by repeated measurements producing very similar results (small spread). For instance, if you measure your height 6 times and obtain 1.80, 1.81, 1.79, 1.80, 1.79, 1.80 m, we would say that the experiment has high precision (the spread is less than 1% of the average value). The precision will affect the number of significant figures in your result.

In a "target shooting" analogy all your shots hit about the same spot (you can also say that you are "consistent"). However, this does not mean that you are hitting close to the bulls-eye. You might be consistently hitting the wrong spot (going back to the height measurements, you might be using a measuring tape that bends).

Accuracy refers to how close your result is to the "true value" of the quantity being measured. Strictly speaking, high accuracy only means that your average value is very close to the "true value", although most authors would require also low dispersion to qualify a measurement as "accurate".

In a "target shooting" analogy most of your shots hit close to the bulls-eye. You would not call a target shooter "accurate" if he/she is hitting all over the place and is centered OK only "on average".

Most Nobel Prize scientists are associated with one or more highly accurate experiments.

 

8) Estimating reading uncertainties:

A reasonable estimate of the magnitude of the uncertainties when reading a scale (like a ruler or a thermometer) is one half of the minimum division. Sometimes this is the uncertainty presumed when not stated explicitly:

L = 36 mm is taken as 35.5 mm < L < 36.5 mm

L = 36.0 mm is taken as 35.95 mm < L < 36.05 mm

 

9) Significant figures:

Experimental uncertainties should almost always be rounded to one significant figure.

g = (9.82 +/- 0.02) m/s^2 is correct

g = (9.82 +/- 0.0235) m/s^2 is incorrect

g = (9.8235 +/- 0.02) m/s^2 is incorrect (round off all trailing digits beyond the uncertainty figure)

The last significant figure of the stated best estimate should be in the same decimal position as the uncertainty. You may keep extra digits in intermediate calculations but never in the final stated value.
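These rounding rules can be sketched in Python with the standard `decimal` module. The function name `round_result` and the choice of half-up rounding are assumptions made for illustration; the rule of one significant figure for the uncertainty is applied without exceptions here.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_result(best, uncertainty):
    """Round the uncertainty to one significant figure and the best estimate to match."""
    unc = Decimal(str(uncertainty))
    if unc <= 0:
        raise ValueError("uncertainty must be positive")
    # Two passes: the second handles cases like 0.096 -> 0.10 -> 0.1,
    # where rounding moves the leading digit one place to the left.
    for _ in range(2):
        quantum = Decimal(f"1e{unc.adjusted()}")  # place of the leading digit
        unc = unc.quantize(quantum, rounding=ROUND_HALF_UP)
    value = Decimal(str(best)).quantize(quantum, rounding=ROUND_HALF_UP)
    return f"{value:f} +/- {unc:f}"


print(round_result(9.8235, 0.0235))  # prints "9.82 +/- 0.02"
print(round_result(9.8235, 0.096))   # prints "9.8 +/- 0.1"
```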


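10) You should state your result for a variable x that has been measured N times (with results x1, x2, ..., xN) as:

x = (BEST ESTIMATE) +/- (UNCERTAINTY) => $x = \bar{x} \pm \sigma_{\bar{x}}$

In general,

BEST ESTIMATE (= mean): $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$ ("AVERAGE" in EXCEL)

UNCERTAINTY: $\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$ (STANDARD DEVIATION OF THE MEAN)

where: $\sigma_x = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}$ (STANDARD DEVIATION OF THE SAMPLE, "STDEV" in EXCEL)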

Note: The STANDARD DEVIATION OF THE MEAN is also called in many texts "STANDARD ERROR".
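As a rough illustration, the same quantities can be computed with Python's standard `statistics` module instead of EXCEL. The data are the six height readings from item 7, taken here to be in metres (an assumption); `statistics.stdev` uses the N - 1 sample formula.

```python
import math
import statistics

heights = [1.80, 1.81, 1.79, 1.80, 1.79, 1.80]  # m (unit assumed), from item 7

n = len(heights)
best = statistics.mean(heights)    # best estimate, like AVERAGE in EXCEL
sigma = statistics.stdev(heights)  # sample standard deviation (N - 1), like STDEV
sdom = sigma / math.sqrt(n)        # standard deviation of the mean (standard error)

print(f"mean = {best:.4f} m, sigma = {sigma:.4f} m, SDOM = {sdom:.4f} m")
# prints "mean = 1.7983 m, sigma = 0.0075 m, SDOM = 0.0031 m"
```

Rounded according to item 9, this result would be stated as (1.798 +/- 0.003) m.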

 


11) PROPAGATION OF ERRORS:

If a quantity X is calculated as a function of several measured quantities (let's say X = X(u, v, w)), then the errors on u, v, w (written $\delta u$, $\delta v$, $\delta w$) cause an uncertainty $\delta X$ on X according to the following rules:

 

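11a) Sum/difference rule: X = u + v - w

$\delta X = \sqrt{(\delta u)^2 + (\delta v)^2 + (\delta w)^2}$ for independent random errors

Note that $\delta X \le \delta u + \delta v + \delta w$ always.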
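A minimal sketch of the sum/difference rule, assuming the usual form in which independent random uncertainties are added in quadrature (square root of the sum of squares). The numbers are hypothetical and the function name `sum_uncertainty` is illustrative.

```python
import math


def sum_uncertainty(*uncertainties):
    """Uncertainty of a sum or difference: absolute uncertainties added in quadrature."""
    return math.sqrt(sum(du ** 2 for du in uncertainties))


# Hypothetical values for X = u + v - w: du = 0.3, dv = 0.4, dw = 1.2
dX = sum_uncertainty(0.3, 0.4, 1.2)
print(f"dX = {dX:.1f}")                         # prints "dX = 1.3"
print(f"upper bound = {0.3 + 0.4 + 1.2:.1f}")   # prints "upper bound = 1.9"
```

The quadrature result is dominated by the largest contribution and never exceeds the plain sum of the uncertainties.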

 

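11b) Product/quotient rule: X = u * v / w

$\frac{\delta X}{|X|} = \sqrt{\left(\frac{\delta u}{u}\right)^2 + \left(\frac{\delta v}{v}\right)^2 + \left(\frac{\delta w}{w}\right)^2}$ for independent random errors

Note that $\frac{\delta X}{|X|} \le \frac{\delta u}{|u|} + \frac{\delta v}{|v|} + \frac{\delta w}{|w|}$ always.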
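A minimal sketch of the product/quotient rule, assuming the usual form in which the relative (fractional) uncertainties of independent random errors are added in quadrature. All values and the function name `product_relative_uncertainty` are hypothetical.

```python
import math


def product_relative_uncertainty(*pairs):
    """Relative uncertainty of a product or quotient.

    Each pair is (value, absolute uncertainty); the relative
    uncertainties are added in quadrature.
    """
    return math.sqrt(sum((du / u) ** 2 for u, du in pairs))


# Hypothetical values for X = u * v / w
u, du = 2.0, 0.06  # 3% relative uncertainty
v, dv = 5.0, 0.20  # 4% relative uncertainty
w, dw = 4.0, 0.48  # 12% relative uncertainty

X = u * v / w
rel = product_relative_uncertainty((u, du), (v, dv), (w, dw))
print(f"X = {X}, relative uncertainty = {100 * rel:.0f}%")
```

With these numbers the combined relative uncertainty is about 13%, smaller than the plain sum of 19%.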

 

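11c) If X is an arbitrary function of u, v, w:

$\delta X = \sqrt{\left(\frac{\partial X}{\partial u}\,\delta u\right)^2 + \left(\frac{\partial X}{\partial v}\,\delta v\right)^2 + \left(\frac{\partial X}{\partial w}\,\delta w\right)^2}$ for independent random errors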
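The general rule can be sketched numerically, assuming its usual form: each partial derivative is multiplied by the corresponding uncertainty and the terms are added in quadrature. In this sketch the partial derivatives are approximated by central finite differences (a swapped-in numerical technique, not part of the rule itself), and the pendulum example with its values is hypothetical.

```python
import math


def propagate(f, values, uncertainties, rel_step=1e-6):
    """Estimate the uncertainty of f(*values) for independent random errors.

    Partial derivatives are approximated by central finite differences.
    """
    total = 0.0
    for i, (x, dx) in enumerate(zip(values, uncertainties)):
        h = rel_step * (abs(x) if x != 0 else 1.0)
        up = list(values)
        up[i] = x + h
        down = list(values)
        down[i] = x - h
        partial = (f(*up) - f(*down)) / (2 * h)
        total += (partial * dx) ** 2
    return math.sqrt(total)


# Hypothetical example: g from a simple pendulum, g = 4 pi^2 L / T^2
def g_pendulum(L, T):
    return 4 * math.pi ** 2 * L / T ** 2


L, dL = 1.000, 0.005  # m
T, dT = 2.007, 0.010  # s
g = g_pendulum(L, T)
dg = propagate(g_pendulum, [L, T], [dL, dT])
print(f"g = {g:.2f} +/- {dg:.2f} m/s^2")
```

Because T enters squared, its relative uncertainty of about 0.5% contributes about 1% to g, twice as much as the 0.5% from L.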

 

11d) As a particularly important case, if $X = u^n$, then $\frac{\delta X}{|X|} = |n| \frac{\delta u}{|u|}$. This means that the relative error of u is multiplied by n when calculating X, so you must be extra careful when measuring variables that appear in X raised to high powers (like squares, cubes, etc.); conversely, you can afford to be less careful when low powers (like square roots, cube roots, etc.) are involved.

 

12) The Normal Distribution

If a continuous variable x is measured many times, experience shows that the distribution of measured values will approach the following Normal (or Gaussian, or "bell-shaped") distribution:

$G(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{(x - X)^2}{2\sigma^2}\right)$

where:

X is the center of the distribution (= best estimate = mean after many measurements),

$\sigma$ is the width of the distribution (= standard deviation after many measurements),

and G(x)dx is the probability that any one measured value will fall in the range between x and x+dx.

 

The probability that a particular measurement will fall within X +/- $\sigma$ is 68% (about 2/3 of the time). This is usually referred to by saying that "one-sigma is the interval for 68% confidence limit".

The probability that a particular measurement will fall within X +/- 2$\sigma$ is 95%. This means that "many-sigma" events are very unlikely.
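These percentages can be reproduced from the error function in Python's standard `math` module. This is a small sketch; the function name `prob_within` is illustrative.

```python
import math


def prob_within(t):
    """Probability that a normally distributed value lies within t sigma of the mean."""
    return math.erf(t / math.sqrt(2))


for t in (1, 2, 3):
    print(f"within {t} sigma: {100 * prob_within(t):.1f}%")
# within 1 sigma: 68.3%
# within 2 sigma: 95.4%
# within 3 sigma: 99.7%
```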

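This idea may help you to decide when to reject one value that is "suspiciously too far away from the rest". For instance, the so-called "Chauvenet's Criterion" rejects one "suspect" value xsus if its distance to the mean value, in units of sigma, is so large that the value is "too improbable to believe". With tsus = |xsus - xmean|/sigma, the normal distribution gives the probability that a single measurement falls more than tsus sigma away from the mean. Multiplying this probability by the number of data values N gives the expected number of such measurements; in the usual form of the criterion, xsus is rejected if this expected number is less than 1/2 (that is, if the probability is smaller than 1/(2N)).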
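A rough sketch of how such a test might be coded. It assumes the common textbook form of the criterion, in which the suspect value is rejected when N times the probability of lying that far from the mean is below a threshold of 1/2; the threshold is exposed as a parameter because other conventions exist. The data set and the function name `chauvenet_reject` are hypothetical.

```python
import math
import statistics


def chauvenet_reject(data, suspect, threshold=0.5):
    """Apply Chauvenet's criterion to one suspect value.

    Returns (expected_count, reject), where expected_count is
    N * P(outside t_sus sigma) and reject is True when it is below
    the threshold (0.5 is an assumed, commonly used value).
    """
    n = len(data)
    mean = statistics.mean(data)
    sigma = statistics.stdev(data)  # sample standard deviation, suspect included
    t_sus = abs(suspect - mean) / sigma
    p_outside = 1.0 - math.erf(t_sus / math.sqrt(2))
    expected = n * p_outside
    return expected, expected < threshold


# Hypothetical timings in seconds, with one suspiciously small value
times = [3.8, 3.5, 3.9, 3.9, 3.4, 1.8]
expected, reject = chauvenet_reject(times, 1.8)
print(f"expected count = {expected:.2f}, reject = {reject}")
```

If a value is rejected, the mean and standard deviation should be recomputed from the remaining data.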