The Issue of Best Practices in Science: How to Express Measurement Uncertainties

Hervé This, vo Kientza

1 UMR Ingénierie Procédés Aliments, AgroParisTech, Inra, Université Paris-Saclay, 91300 Massy, France
2 Groupe de gastronomie moléculaire, Inra-AgroParisTech International Centre for Molecular Gastronomy, F-75005 Paris, France

Correspondence: [email protected]

This document is a personal translation of the following article, published in French in the N3AF journal: La question des bonnes pratiques en sciences de la nature : comment exprimer des incertitudes de mesure. Notes Académiques de l'Académie d'agriculture de France (N3AF) 2016, 4, 1-8.

Abstract: The scientific community could usefully adopt best practices for the various steps of the scientific method. Here a numerical example is used to discuss such rules for the composition of uncertainties.

Keywords: uncertainties, best practices, GUM, composition, Monte Carlo

Introduction

In the Lettre de l'Académie d'agriculture de France (This, 2015), the issue of best practices for scientific research was discussed. Indeed, for activities where an obligation of results is not pertinent, such as medicine, there is instead an obligation of means, and best practices have to be used (HAS, 2016). For the natural sciences, an obligation of results seems difficult to impose, so that best practices should be defined. However, the rules given by scientific institutions, such as academies or research institutes, deal primarily with the methodology of analysis, with the quality, traceability and validity of research (Inra, 2016a), with ethics (Inra, 2016b), or with publishing ethics (Wiley, 2016). The question of calculation is seldom discussed. Yet all steps of scientific research could be concerned, i.e. the identification of phenomena, the quantitative characterization of the phenomena of interest, the grouping of these data into synthetic laws (equations), the induction of theories or models, including the proposal of mechanisms, the search for refutable theoretical predictions, and the experimental tests of these predictions (This, 2011).

Here we restrict the discussion to the issue of expressing the uncertainties of results, of measurements and calculations, in order to stress the procedures currently accepted by the scientific community. We do not intend to paraphrase the official documents (Eurachem/CITAC Guide CG4, 2012; River et Balère, 2003); rather, one goal is to show, with one simple example, the value of creating a « Guide of best scientific practices » by a scientific institution, which could be Inra, CNRS, the French Academy of Agriculture, or the French Academy of Sciences.

Composing uncertainties

In particular, the quantitative characterization of phenomena, which is the second step of scientific research (Bacon, 1620), leads to measurement results that common sense and best practices should validate (ACS, 1980). The repetition of experiments and measurements ("validation") is almost universally accepted as an obligation, but the production of results (generally different from one repetition to the next) has to be followed by their comparison, and this is why one has to calculate the dispersion of values (This, 2013). For this calculation, various methods are possible (Saporta, 2006), depending on the experimental conditions.

A first method consists of repeating the whole experiment (which can be complex, with a series of many elementary steps) and calculating an estimator $s_{n-1}$ of the standard deviation of the value $x$ given by the experiment (for an assay, for example):

$$ s_{n-1} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2} \qquad (1) $$

Here $n$ is the number of repetitions of the experiment, $x_i$ is the value found at the i-th repetition, and $\bar{x}$ is the mean of the $n$ values $x_i$, $i = 1..n$ (Harris, 2007).
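As an illustration, the estimator of the standard deviation described above can be sketched in a few lines of Python. The function name `sample_std` and the five assay values are hypothetical and serve only to show the calculation; the standard library function `statistics.stdev` computes the same quantity and is used here as a cross-check.

```python
import math
import statistics


def sample_std(values):
    """Estimator s_{n-1} of the standard deviation of repeated results."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two repetitions are needed")
    mean = sum(values) / n
    # Sum of squared deviations from the mean, divided by n - 1
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


# Hypothetical results of five repetitions of the same assay (arbitrary units)
repetitions = [10.1, 9.8, 10.3, 10.0, 9.9]

s = sample_std(repetitions)
print(f"mean = {statistics.mean(repetitions):.3f}, s_n-1 = {s:.3f}")

# Cross-check against the standard library implementation
print(math.isclose(s, statistics.stdev(repetitions)))
```

The division by n - 1 rather than n is what distinguishes this estimator from the standard deviation of a fully known population.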
However, this method cannot always be used, in particular when the experiment destroys the samples that are quantitatively characterized. In order to determine the uncertainty of the parameter of interest x, one then has to « propagate uncertainties » (or « compose uncertainties »), starting from a determination of the elementary uncertainties for each step of the experiment.

Let us consider, for example, the very elementary case of the preparation of a solution by dissolution of a mass m of a solute (obtained by weighing) in a mass M of solvent (also determined by weighing), for which the desired parameter is the mass concentration c = m/M. (It should be noted that weighing solutes and solvents in order to determine mass concentrations is much better than measuring volumes, because the precision of balances is generally higher than that of micropipettes; evidence for this is that pipettes are checked using balances.) As the masses m and M are uncertain (the uncertainty can be estimated by the standard deviation of successive mass measurements, or by the precision of the balance when the dispersion of the mass measurements is lower than this precision), the mass concentration c is also uncertain, and the issue is to express the uncertainty Δc of the mass concentration c as a function of the uncertainties Δm and ΔM of m and M, respectively.

Another example is common in chemical analysis: when the area A of a signal (for example a chromatographic or spectroscopic signal) is determined, a calibration curve of equation A = a·c + b makes it possible to find the concentration c using the equation c = (A − b)/a. Because the areas are known with some uncertainty, the concentration c has to be expressed along with an uncertainty, which has to be determined from the uncertainties in the calibration curve (expressed by the coefficients a and b) and in the area A.
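The calibration example can be made concrete with a minimal Python sketch. It assumes that the calibration line A = a·c + b is obtained by an ordinary least-squares fit, which the text does not specify; the function names and the calibration data are hypothetical. Only the inversion of the line is shown, not the uncertainty of the result.

```python
def fit_line(concentrations, areas):
    """Ordinary least-squares fit of A = a * c + b; returns (a, b).

    Assumption: a plain least-squares fit stands in for the calibration step.
    """
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_area = sum(areas) / n
    s_cc = sum((c - mean_c) ** 2 for c in concentrations)
    s_ca = sum((c - mean_c) * (area - mean_area)
               for c, area in zip(concentrations, areas))
    a = s_ca / s_cc
    b = mean_area - a * mean_c
    return a, b


def concentration_from_area(area, a, b):
    """Invert the calibration line: c = (A - b) / a."""
    return (area - b) / a


# Hypothetical calibration standards (concentration, signal area)
standards_c = [0.0, 1.0, 2.0, 3.0]
standards_area = [0.5, 2.5, 4.5, 6.5]

a, b = fit_line(standards_c, standards_area)
c_unknown = concentration_from_area(3.5, a, b)
print(f"a = {a:.3f}, b = {b:.3f}, c = {c_unknown:.3f}")
```

With real data the points do not fall exactly on a line, so a and b carry their own uncertainties, which is precisely why the concentration c cannot be reported as a bare number.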
In all cases, the parameter of interest (mass concentration, concentration of a solution analyzed by spectroscopy...) is the value of a function of several variables, and the question of propagating uncertainties is the following: given the uncertainties of the variables, how can we determine the uncertainty of the value of the function?

The answer from the scientific community and two faulty practices

There are many ways of characterizing variability, and each is legitimate if it is rational and explicit. In particular, for the question of propagating uncertainties, a collective answer was put forward by the Bureau international des poids et mesures (BIPM): the Joint Committee for Guides in Metrology (JCGM) of the BIPM published a document entitled Évaluation des données de mesure — Guide pour l’expression de l’incertitude de mesure (GUM) (JCGM, 2015), which describes in very general terms the best practices for determining measurement uncertainties and composing them. It is not difficult to follow such conventionally decided rules, but the goal here is to show, on a simple numerical example, how they work and which mistakes have to be avoided, because university teaching shows that many students understand better when examples are given before general cases are shown (this idea, however, remains to be assessed quantitatively). The limits of application of the rules will also be given.

Let us consider, for example, the determination of the uncertainty of a mass concentration c from the uncertainties of the masses m and M of the solute and the solvent, respectively.
These uncertainties can be given by an experimental standard deviation or by the precision of the measurement tool; in any case, it is formally exact that the mass concentration c is a continuous and differentiable function of the two variables m and M, so that:

$$ dc = \frac{\partial c}{\partial m}\,dm + \frac{\partial c}{\partial M}\,dM \qquad (2) $$

And here, with c = m/M:

$$ dc = \frac{1}{M}\,dm - \frac{m}{M^2}\,dM \qquad (3) $$

In such analytical expressions, differentials are infinitely small quantities, contrary to uncertainties, which are finite (Piskounov, 1980). Documents published before the GUM proposed to move from differentials to uncertainties using the expression:

$$ \Delta c = \left|\frac{\partial c}{\partial m}\right|\Delta m + \left|\frac{\partial c}{\partial M}\right|\Delta M \qquad (4) $$

This expression corresponds to a « Manhattan distance » (Verley, 1997), and the absolute values take into account the fact that errors are random, positive or negative. However, the JCGM opted instead for the expression using the Euclidean distance:

$$ \Delta c = \sqrt{\left(\frac{\partial c}{\partial m}\right)^2 \Delta m^2 + \left(\frac{\partial c}{\partial M}\right)^2 \Delta M^2} \qquad (5) $$

These two practices are close but use different distances. Moreover, the choice of the JCGM can be used only if the model of the measurement process does not show any significant nonlinearity: the dispersions have to be small for each of the variables involved in the measurement process, of the same order of magnitude, and with symmetrical distributions. When these hypotheses are not satisfied, the JCGM proposes to implement a Monte Carlo method, as described in a supplement to the first document (JCGM, 2008; Robert et Casella, 2010). Here the principle is not to propagate the uncertainties through the model, but to use a probability density function for the input parameters. This function being known or assumed for each input value, the Markov formula can be used (Monte Carlo methods with Markov chains consist of generating a vector $x_i$ solely from the values of the vector $x_{i-1}$; they are processes with no memory).

Conversely, one should avoid two practices that are sometimes observed.
First, one should avoid the following practice:
- measuring a few values for each variable (for example, $m_1$, $m_2$, $m_3$, and $M_1$, $M_2$, $M_3$);
- then calculating the values of the function for pairs of values, such as $c_1 = m_1/M_1$, $c_2 = m_2/M_2$, $c_3 = m_3/M_3$;
- and determining a standard deviation from these values.

Indeed, concerning this wrong practice, one can observe immediately that there is no particular reason to associate a particular $m_i$ with the corresponding $M_i$ rather than with another $M_j$ ($j \neq i$). This practice corresponds to the calculation of a value s'(c):

$$ s'(c) = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(\frac{m_i}{M_i} - \bar{c}\right)^2} \qquad (6) $$

where $\bar{c}$ is the mean of the $n$ ratios $m_i/M_i$.

One should also avoid the following practice:
- measuring several values for each variable (for example, $m_1$, $m_2$, $m_3$, and $M_1$, $M_2$, $M_3$);
- then determining all the crossed values $m_i/M_j$;
- and calculating the standard deviation from these crossed values.

A numerical comparison for education

Let us first note that French students at the « bachelor » level generally learn to compose uncertainties only for sums, differences, products and ratios: for sums and differences, they learn that the uncertainty is the sum of the uncertainties of the variables; for products and ratios, they learn to use the “logarithm rule”.
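As a rough numerical illustration for the ratio c = m/M, the Python sketch below compares the sum of absolute contributions (expression 4), the Euclidean combination chosen by the JCGM (expression 5), a plain Monte Carlo estimate, and the first practice to be avoided. All numerical values are hypothetical, the function names are introduced only for this example, and the Monte Carlo part assumes independent Gaussian inputs drawn directly, without Markov chains.

```python
import math
import random
import statistics


def linear_sum(m, M, dm, dM):
    """Expression (4) for c = m / M: sum of the absolute contributions."""
    return dm / M + m * dM / M ** 2


def quadrature(m, M, dm, dM):
    """Expression (5) for c = m / M: Euclidean combination."""
    return math.sqrt((dm / M) ** 2 + (m * dM / M ** 2) ** 2)


def monte_carlo(m, M, dm, dM, draws=100_000, seed=0):
    """Standard deviation of c over random draws of m and M.

    Assumption: independent Gaussian inputs with standard deviations dm, dM.
    """
    rng = random.Random(seed)
    values = [rng.gauss(m, dm) / rng.gauss(M, dM) for _ in range(draws)]
    return statistics.stdev(values)


# Hypothetical case: 1 g of solute in 100 g of solvent
m, M = 1.0, 100.0
dm, dM = 0.001, 0.01

c = m / M
d_lin = linear_sum(m, M, dm, dM)
d_quad = quadrature(m, M, dm, dM)
d_mc = monte_carlo(m, M, dm, dM)

# "Logarithm rule" for a ratio: the relative uncertainties add up,
# which is expression (4) divided by c
rel_log_rule = dm / m + dM / M

print(f"c = {c:.5f}")
print(f"linear sum (4):  {d_lin:.3e}  (relative: {d_lin / c:.2e})")
print(f"logarithm rule:  relative {rel_log_rule:.2e}")
print(f"quadrature (5):  {d_quad:.3e}")
print(f"Monte Carlo:     {d_mc:.3e}")

# Practice to avoid: the standard deviation of paired ratios depends on
# an arbitrary pairing of the measured values
ms = [0.999, 1.000, 1.001]
Ms = [99.99, 100.00, 100.01]
paired = statistics.stdev([mi / Mi for mi, Mi in zip(ms, Ms)])
paired_reversed = statistics.stdev([mi / Mi for mi, Mi in zip(ms, reversed(Ms))])
print(f"paired: {paired:.3e}, same data paired differently: {paired_reversed:.3e}")
```

In this linear, small-dispersion case the Monte Carlo estimate should stay close to the Euclidean expression, whereas the paired calculation gives two different answers for the same measurements, depending only on how the values happen to be associated.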
