E. Santovetti
Università degli Studi di Roma Tor Vergata

Statistical data analysis, lecture I

Useful books:
G. Cowan, "Statistical Data Analysis", Clarendon Press, Oxford, 1998
G. D'Agostini, "Bayesian reasoning in data analysis - A critical introduction", World Scientific Publishing, 2003

Data analysis in particle physics

The aim of experimental particle physics is to find or build environments able to test theoretical models, e.g. the Standard Model (SM). In particle physics we study the result of an interaction and measure several quantities for each produced particle (charge, momentum, energy, …).

[Figure: e+ e- interaction]

The tasks of data analysis are:
1) measure (estimate) the parameters;
2) quantify the uncertainty of the parameter estimates;
3) test the extent to which the predictions of a theory agree with the data.

There are several sources of uncertainty:
1) the theory is not deterministic (quantum mechanics);
2) random measurement fluctuations, present even without quantum effects;
3) errors due to faulty instruments or procedures.

We can quantify the uncertainty using probability.

Definition

In probability theory, the probability P of an event A, denoted P(A), is usually defined so that P satisfies the Kolmogorov axioms (Andrey Kolmogorov, 1933), where S is the sample space:
1) The probability of an event is a non-negative real number: $P(A) \ge 0 \quad \forall A \subseteq S$
2) The total (maximal) probability is one: $P(S) = 1$
3) If two events are disjoint, the probability of their union is the sum of the two probabilities: $A \cap B = \emptyset \Rightarrow P(A \cup B) = P(A) + P(B)$

From these axioms we can derive further properties, where $\bar{A}$ denotes the complement of A in S:
$P(\bar{A}) = 1 - P(A)$
$P(A \cup \bar{A}) = 1$
$P(\emptyset) = 0$
$A \subset B \Rightarrow P(A) \le P(B)$
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Conditional probability, independence

An important concept to introduce is the conditional probability, the probability of A given B (with $P(B) \ne 0$):
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
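The axioms, the derived properties and the definition of conditional probability can be checked by direct counting on a small sample space. The following is only a minimal sketch: the fair six-sided die, the events A, B, C and the helper names `prob` and `cond_prob` are assumptions made for illustration, not part of the lecture.

```python
from fractions import Fraction

# Assumed toy sample space: the six equally likely outcomes of a fair die.
S = frozenset(range(1, 7))

def prob(event):
    """P(event) for equally likely outcomes: favourable cases / all cases."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), defined only for P(B) != 0."""
    if prob(b) == 0:
        raise ValueError("P(B) must be non-zero")
    return prob(a & b) / prob(b)

A = frozenset({1, 2, 3})   # illustrative event: n <= 3
B = frozenset({1, 3, 5})   # illustrative event: n odd
C = frozenset({6})         # illustrative event disjoint from A

# Kolmogorov axioms on this sample space
assert prob(A) >= 0
assert prob(S) == 1
assert prob(A | C) == prob(A) + prob(C)        # A and C are disjoint

# Derived properties
assert prob(S - A) == 1 - prob(A)              # complement
assert prob(frozenset()) == 0                  # empty set
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(cond_prob(A, B))  # (2/6) / (3/6) = 2/3
```

Exact fractions are used instead of floating-point numbers so that the equalities hold without rounding tolerance.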
In effect, it is meaningless to define an absolute probability: a probability depends on the information we have about the event itself and on the surrounding conditions. In this way we establish a connection between A and B. In physics, connections (relations) are important.

As an example, consider rolling a die:
$$P(n<3 \mid n\ \text{even}) = \frac{P((n<3) \cap (n\ \text{even}))}{P(n\ \text{even})} = \frac{1/6}{3/6} = \frac{1}{3}$$

If two events are independent (uncorrelated):
$$P(A \cap B) = P(A) \cdot P(B)$$
and in that case:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) \cdot P(B)}{P(B)} = P(A)$$
As expected, if A and B are independent, the probability of A given B does not depend on B.

Interpretation of probability

I. Frequency interpretation: the probability of an event is its relative frequency of occurrence. Let A be a particular outcome of a repeatable experiment and n the number of trials:
$$P(A) = \lim_{n \to \infty} \frac{\text{number of times the outcome is } A}{n}$$
Examples: quantum mechanical effects, particle scattering, radioactive decays, ... The limit is not to be understood in the usual mathematical sense.

II. Subjective probability: A and B are hypotheses, i.e. statements that are true or false. We can define the probability as the price a person thinks fair to pay (not the maximum price) for a bet that pays 1 if the event happens and 0 if it does not (de Finetti, Savage). If the possible gain is T and you think it fair to bet S that the event will happen, then
$$P(A) = \frac{S}{T}$$

Both interpretations are consistent with the Kolmogorov axioms. In particle physics the frequency interpretation is often the most useful, but subjective probability can provide a more natural treatment of non-repeatable phenomena, e.g. systematic uncertainties.

ISO definition of probability

In contrast to this frequency-based point of view of probability, an equally valid viewpoint is that probability is a measure of the degree of belief that an event will occur. For example, suppose one has a chance of winning a small sum of money D and one is a rational bettor.
One's degree of belief in event A occurring is p = 0.5 if one is indifferent to these two betting choices:
1) receiving D if event A occurs but nothing if it does not occur;
2) receiving D if event A does not occur but nothing if it does occur.
For a generic p (0 ≤ p ≤ 1), the two choices to which the rational bettor is indifferent are:
1) receiving (1-p)D if event A occurs but nothing if it does not occur;
2) receiving pD if event A does not occur but nothing if it does occur.

Cases in which the probability can easily be evaluated from objective parameters are few and artificial. In real life we face complex problems on which we have to decide, taking responsibility for our decisions.

Bayes theorem

From the definition of conditional probability we can write:
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B \mid A) = \frac{P(B \cap A)}{P(A)}$$
and, since
$$P(A \cap B) = P(B \cap A),$$
we can conclude:
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$$

The probability of the event A, given B, is the probability of the event B, given A, multiplied by the probability of A and divided by the probability of B. The probability of B in the denominator can be seen as a normalization factor.

Thomas Bayes (1702-1761), "An essay towards solving a problem in the doctrine of chances", Philos. Trans. R. Soc. 53 (1763) 370; reprinted in Biometrika 45 (1958) 293.

The law of total probability

[Figure: sample space S divided into disjoint subsets H_i, with a subset E overlapping them]

Consider a subset E of the total sample space S. Assume that S is divided into disjoint subsets $H_i$ (hypotheses), such that
$$S = \bigcup_i H_i$$
We can write E as
$$E = E \cap S = E \cap \left(\bigcup_i H_i\right) = \bigcup_i (E \cap H_i)$$
and its probability is
$$P(E) = P\left(E \cap \left(\bigcup_i H_i\right)\right) = \sum_i P(E \cap H_i) = \sum_i P(E \mid H_i)\, P(H_i)$$
This is the law of total probability. The Bayes theorem becomes
$$P(H_i \mid E) = \frac{P(E \mid H_i)\, P(H_i)}{\sum_j P(E \mid H_j)\, P(H_j)}, \qquad P(H_i \mid E) \propto P(E \mid H_i)\, P(H_i)$$
Again, the denominator is a normalization factor.

Bayes theorem: interpretation keys

The Bayes theorem can also be written as
$$\frac{P(H_i \mid E)}{P(H_i)} = \frac{P(E \mid H_i)}{P(E)}$$
The fact that E is true modifies P(H) by the same factor by which P(E) is modified if H is true (a soccer team has double the probability of winning the match if it is ahead at half time, then ...).

The Bayes theorem can be used to test a theory (H) given new experimental evidence (E):
$$P(H_i \mid E) \propto P(E \mid H_i)\, P(H_i)$$
The probability that the theory is true, after the new evidence, is proportional to the 'a priori' probability of the theory and to the likelihood, i.e. the probability of the event E given the theory H.

The probability of a hypothesis given two events, $P(H \mid E_1 \cap E_2)$, can be evaluated in two ways:
1) applying the theorem directly to the event $E = E_1 \cap E_2$;
2) applying the theorem first to the event $E_1$ and then applying the theorem again, with the event $E_2$, to the resulting probability.
It is remarkable that the results are the same and independent of the order:
$$P(H_k \mid E_1 \cap E_2 \cap \dots \cap E_n) \propto \prod_{i=1}^{n} P(E_i \mid H_k) \cdot P_0(H_k) \propto P(E_1 \cap \dots \cap E_n \mid H_k) \cdot P_0(H_k)$$
(the product form holds when the events $E_i$ are independent given $H_k$).

Bayes theorem: interpretation keys (2)

The Bayes theorem can also be written as
$$\frac{P(H_1 \mid E)}{P(H_2 \mid E)} = \frac{P(E \mid H_1)}{P(E \mid H_2)} \cdot \frac{P(H_1)}{P(H_2)}$$
This version of the Bayes theorem is very useful if we want to compare two different hypotheses or theories (it is often meaningless to define an absolute probability for a theory). The ratio of the probabilities is modified by the ratio of the likelihoods, $P(E \mid H_1)/P(E \mid H_2)$, called the Bayes factor.

In summary:
Final probability ∝ likelihood · initial probability

We can use the theorem to solve problems of "inverse probability", e.g. the problem of the probability of causes.
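The proportionality "final probability ∝ likelihood · initial probability", the order independence of successive updates and the role of the Bayes factor can be illustrated numerically. This is a minimal sketch under stated assumptions: the two hypotheses, all the numbers and the helper `bayes_update` are hypothetical, and the two pieces of evidence are assumed to be independent given each hypothesis.

```python
from fractions import Fraction

def bayes_update(prior, likelihood):
    """Posterior P(H|E) from the prior P(H) and the likelihood P(E|H)."""
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    norm = sum(unnormalized.values())  # P(E), by the law of total probability
    return {h: u / norm for h, u in unnormalized.items()}

# Hypothetical numbers, chosen only for illustration.
prior = {"H1": Fraction(1, 2), "H2": Fraction(1, 2)}
like_E1 = {"H1": Fraction(3, 4), "H2": Fraction(1, 4)}  # P(E1|H)
like_E2 = {"H1": Fraction(2, 3), "H2": Fraction(1, 3)}  # P(E2|H)

# Assumption: E1 and E2 are independent given each hypothesis,
# so P(E1 and E2 | H) = P(E1|H) * P(E2|H).
like_joint = {h: like_E1[h] * like_E2[h] for h in prior}

post_12 = bayes_update(bayes_update(prior, like_E1), like_E2)  # E1, then E2
post_21 = bayes_update(bayes_update(prior, like_E2), like_E1)  # E2, then E1
post_joint = bayes_update(prior, like_joint)                   # both at once

# Ratio form: posterior odds = Bayes factor * prior odds
bayes_factor = like_joint["H1"] / like_joint["H2"]
prior_odds = prior["H1"] / prior["H2"]
posterior_odds = post_joint["H1"] / post_joint["H2"]

print(post_12 == post_21 == post_joint)             # order independence
print(posterior_odds == bayes_factor * prior_odds)  # odds form
```

With these numbers the posterior probability of H1 is 6/7 whichever route is taken, and the posterior odds equal the Bayes factor because the prior odds are 1.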
If several causes can generate the same experimental effect, the probability that the effect was produced by a certain cause is proportional to the probability of the cause multiplied by the probability that this cause produces the observed effect:
$$P(C_i \mid E) \propto P(E \mid C_i) \cdot P_0(C_i)$$

Bayes theorem: example 1

Consider a school with 60% male students and 40% female students. Female students wear pants or skirts in equal numbers, while male students wear only pants. An observer sees, from far away, a student wearing pants. What is the probability that this student is a girl?

This problem can easily be solved using the Bayes theorem with:
event A: the observed student is a girl;
event B: the observed student wears pants.

We have to evaluate P(A|B), with:
P(A): probability that a student is female, without any condition: 40% = 2/5
P(A'): probability that a student is male, without any condition: 60% = 3/5
P(B|A): probability that a student wears pants, given that the student is female: 1/2
P(B|A'): probability that a student wears pants, given that the student is male: 1
P(B): probability that a student (any) wears pants. Since 80 students out of 100 wear pants (60 male + 20 female), P(B) = 80/100 = 4/5

$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} = \frac{1/2 \cdot 2/5}{4/5} = \frac{1}{4}$$

Bayes theorem: example 2

Suppose the probability (for anyone) of having AIDS is:
P(AIDS) = 0.001
P(no AIDS) = 0.999
Suppose now you take the AIDS test and the result is positive (+) but ….