A New Family of Bounded Divergence Measures and Application to Signal Detection
Shivakumar Jolad¹, Ahmed Roman², Mahesh C. Shastry³, Mihir Gadgil⁴ and Ayanendranath Basu⁵
¹Department of Physics, Indian Institute of Technology Gandhinagar, Ahmedabad, Gujarat, India
²Department of Mathematics, Virginia Tech, Blacksburg, VA, U.S.A.
³Department of Physics, Indian Institute of Science Education and Research Bhopal, Bhopal, Madhya Pradesh, India
⁴Biomedical Engineering Department, Oregon Health & Science University, Portland, OR, U.S.A.
⁵Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, West Bengal-700108, India

Keywords: Divergence Measures, Bhattacharyya Distance, Error Probability, F-divergence, Pattern Recognition, Signal Detection, Signal Classification.

Abstract: We introduce a new one-parameter family of divergence measures, called bounded Bhattacharyya distance (BBD) measures, for quantifying the dissimilarity between probability distributions. These measures are bounded, symmetric and positive semi-definite, and do not require absolute continuity. In the asymptotic limit, the BBD measure approaches the squared Hellinger distance. A generalized BBD measure for multiple distributions is also introduced. We prove an extension of a theorem of Bradt and Karlin for BBD, relating Bayes error probability and divergence ranking. We show that BBD belongs to the class of generalized Csiszar f-divergences and derive some of its properties, such as its curvature and relation to Fisher Information. For distributions with vector-valued parameters, the curvature matrix is related to the Fisher-Rao metric. We derive certain inequalities between BBD and well-known measures such as the Hellinger and Jensen-Shannon divergences. We also derive bounds on the Bayesian error probability. We give an application of these measures to the problem of signal detection, where we compare two monochromatic signals buried in white noise and differing in frequency and amplitude.

1 INTRODUCTION

Divergence measures for the distance between two probability distributions are a statistical approach to comparing data and have been extensively studied in the last six decades (Kullback and Leibler, 1951; Ali and Silvey, 1966; Kapur, 1984; Kullback, 1968; Kumar et al., 1986). These measures are widely used in varied fields such as pattern recognition (Basseville, 1989; Ben-Bassat, 1978; Choi and Lee, 2003), speech recognition (Qiao and Minematsu, 2010; Lee, 1991), signal detection (Kailath, 1967; Kadota and Shepp, 1967; Poor, 1994), Bayesian model validation (Tumer and Ghosh, 1996) and quantum information theory (Nielsen and Chuang, 2000; Lamberti et al., 2008). Distance measures try to achieve two main objectives (which are not mutually exclusive): to assess (1) how "close" two distributions are compared to others, and (2) how "easy" it is to distinguish between one pair compared to another (Ali and Silvey, 1966).

There is a plethora of distance measures available to assess the convergence (or divergence) of probability distributions. Many of these measures are not metrics in the strict mathematical sense, as they may not satisfy either the symmetry of arguments or the triangle inequality. In applications, the choice of measure depends on the interpretation of the metric in terms of the problem considered, its analytical properties and its ease of computation (Gibbs and Su, 2002).

One of the most well-known and widely used divergence measures, the Kullback-Leibler divergence (KLD) (Kullback and Leibler, 1951; Kullback, 1968), can create problems in specific applications. Specifically, it is unbounded above and requires that the distributions be absolutely continuous with respect to each other. Various other information-theoretic measures have been introduced keeping in view ease of computation and utility in problems of signal selection and pattern recognition. Of these measures, the Bhattacharyya distance (Bhattacharyya, 1946; Kailath, 1967; Nielsen and Boltz, 2011) and the Chernoff distance (Chernoff, 1952; Basseville, 1989; Nielsen and Boltz, 2011) have been widely used in signal processing. However, these measures are again unbounded from above. Many bounded divergence measures, such as the Variational and Hellinger distances (Basseville, 1989; DasGupta, 2011) and the Jensen-Shannon metric (Burbea and Rao, 1982; Rao, 1982b; Lin, 1991), have been studied extensively. The utility of these measures varies depending on properties such as the tightness of bounds on error probabilities, information-theoretic interpretations, and the ability to generalize to multiple probability distributions.

Here we introduce a new one-parameter (α) family of bounded measures based on the Bhattacharyya coefficient, called bounded Bhattacharyya distance (BBD) measures. These measures are symmetric, positive semi-definite and bounded between 0 and 1. In the asymptotic limit (α → ±∞) they approach the squared Hellinger divergence (Hellinger, 1909; Kakutani, 1948). Following Rao (Rao, 1982b) and Lin (Lin, 1991), a generalized BBD is introduced to capture the divergence (or convergence) between multiple distributions. We show that BBD measures belong to the generalized class of f-divergences and inherit useful properties such as curvature and its relation to Fisher Information.

Bayesian inference is useful in problems where a decision has to be made on classifying an observation into one of a possible array of states whose prior probabilities are known (Hellman and Raviv, 1970; Varshney and Varshney, 2008). Divergence measures are useful in estimating the error in such classification (Ben-Bassat, 1978; Kailath, 1967; Varshney, 2011). We prove an extension of the Bradt-Karlin theorem for BBD, which establishes the existence of prior probabilities relating Bayes error probabilities with ranking based on a divergence measure. Bounds on the error probability Pe can be calculated through BBD measures using certain inequalities between the Bhattacharyya coefficient and Pe. We derive two inequalities for a special case of BBD (α = 2) with the Hellinger and Jensen-Shannon divergences. Our bounded measure with α = 2 has been used by Sunmola (Sunmola, 2013) to calculate the distance between Dirichlet distributions in the context of Markov decision processes. We illustrate the applicability of BBD measures by focusing on the signal detection problem that comes up in areas such as gravitational wave detection (Finn, 1992). Here we consider discriminating two monochromatic signals, differing in frequency or amplitude, and corrupted with additive white noise. We compare the Fisher Information (FI) of the BBD measures with that of KLD and the Hellinger distance for these random processes, and highlight the regions where the FI is insensitive to large parameter deviations. We also characterize the performance of BBD for different signal-to-noise ratios, providing thresholds for signal separation.
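For reference, the Bhattacharyya coefficient on which the BBD family is built, the associated Bhattacharyya distance, and the squared Hellinger distance that arises as the asymptotic limit above can be written in one common normalization (recalled here only as a reminder, with p and q denoting densities of P and Q with respect to a common measure λ; some authors omit the factor 1/2 in the Hellinger distance, and the definitions adopted in this paper are the ones given in Section 2):

\[
\rho(P,Q) = \int \sqrt{p\,q}\, d\lambda, \qquad
B(P,Q) = -\ln \rho(P,Q), \qquad
H^2(P,Q) = \frac{1}{2}\int \left(\sqrt{p}-\sqrt{q}\right)^2 d\lambda = 1 - \rho(P,Q).
\]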
Our paper is organized as follows: Section 1 is the current introduction. In Section 2, we recall the well-known Kullback-Leibler and Bhattacharyya divergence measures, and then introduce our bounded Bhattacharyya distance measures. We discuss some special cases of BBD, in particular the Hellinger distance, and also introduce the generalized BBD for multiple distributions. In Section 3, we show the positive semi-definiteness of the BBD measure, establish the applicability of the Bradt-Karlin theorem, and prove that BBD belongs to the generalized f-divergence class. We also derive the relation between curvature and Fisher Information, discuss the curvature metric, and prove some inequalities with other measures such as the Hellinger and Jensen-Shannon divergences for a special case of BBD. In Section 4, we discuss the application to the signal detection problem. Here we first briefly describe the basic formulation of the problem, and then move on to computing distances between random processes and comparing the BBD measure with Fisher Information and KLD. In the Appendix we provide expressions for the BBD measure, with α = 2, for some commonly used distributions. We conclude the paper with a summary and outlook.

2 DIVERGENCE MEASURES

In the following subsections we consider a measurable space Ω with σ-algebra B and the set of all probability measures M on (Ω, B). Let P and Q denote probability measures on (Ω, B), with p and q denoting their densities with respect to a common measure λ. We recall the definition of absolute continuity (Royden, 1986):

Absolute Continuity: A measure P on the Borel subsets of the real line is absolutely continuous with respect to a measure Q if P(A) = 0 for every Borel subset A ∈ B for which Q(A) = 0; this is denoted by P << Q.

2.1 Kullback-Leibler Divergence

The Kullback-Leibler divergence (KLD) (or relative entropy) (Kullback and Leibler, 1951; Kullback, 1968) between two distributions P, Q with densities p and q is given by:

I(P,Q) ≡ ∫ p log(p/q) dλ.    (1)
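As a concrete numerical illustration of Eq. (1), the sketch below approximates I(P,Q) by discretizing two densities on a uniform grid; the grid, the Gaussian test densities and the NumPy-based implementation are illustrative choices, not part of the paper.

    # Minimal numerical sketch of Eq. (1): I(P,Q) = integral of p*log(p/q) dλ,
    # approximated by a Riemann sum for two illustrative Gaussian densities.
    import numpy as np

    def kld(p, q, dx):
        """Approximate I(P,Q) on a uniform grid; terms with p = 0 contribute 0."""
        mask = p > 0
        return np.sum(p[mask] * np.log(p[mask] / q[mask])) * dx

    def gaussian(x, mu, sigma):
        return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

    x = np.linspace(-10.0, 10.0, 20001)
    dx = x[1] - x[0]
    p = gaussian(x, 0.0, 1.0)   # N(0, 1)
    q = gaussian(x, 1.0, 1.0)   # N(1, 1)

    # Closed form for these Gaussians: (mu_p - mu_q)^2 / (2*sigma^2) = 0.5.
    print("I(P,Q) ~", kld(p, q, dx))
    # KLD is not symmetric in general; compare with the arguments reversed.
    print("I(Q,P) ~", kld(q, p, dx))

For these equal-variance Gaussians the two directions happen to coincide; in general I(P,Q) ≠ I(Q,P), which motivates the symmetrized version given next.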
The symmetrized version is given by J(P,Q) ≡ (I(P,Q) + I(Q,P))/2 (Kailath, 1967),