Philip Coppens tribute
The index of dispersion as a metric of quanta – unravelling the Fano factor
ISSN 2052-5206 Wilfred K. Fullagar,* Mahsa Paziresh, Shane J. Latham, Glenn R. Myers and Andrew M. Kingston
Applied Mathematics, RSPE, Oliphant Building 60, Mills Road, Australian National University, Canberra, ACT 2601, Received 17 February 2017 Australia. *Correspondence e-mail: [email protected] Accepted 19 June 2017 In statistics, the index of dispersion (or variance-to-mean ratio) is unity ( 2/hxi = 1) for a Poisson-distributed process with variance 2 for a variable x Edited by Y.-S. Chen, Center for Advanced that manifests as unit increments. Where x is a measure of some phenomenon, Radiation Sources, USA the index takes on a value proportional to the quanta that constitute the phenomenon. That outcome might thus be anticipated to apply for an Keywords: Monte Carlo methods; high-quality data refinement; spectrum modelling; poly- enormously wide variety of applied measurements of quantum phenomena. chromatic methods; quantum energy However, in a photon-energy proportional radiation detector, a set of M determination; detector development. witnessed Poisson-distributed measurements {W1, W2, ... WM} scaled so that the ideal expectation value of the quantum is unity, is generally observed to give 2/hWi < 1 because of detector losses as broadly indicated by Fano [Phys. Rev. (1947), 72, 26]. In other cases where there is spectral dispersion, 2/hWi >1. Here these situations are examined analytically, in Monte Carlo simulations, and experimentally. The efforts reveal a powerful metric of quanta broadly associated with such measurements, where the extension has been made to polychromatic and lossy situations. In doing so, the index of dispersion’s variously established yet curiously overlooked role as a metric of underlying quanta is indicated. The work’s X-ray aspects have very diverse utility and have begun to find applications in radiography and tomography, where the ability to extract spectral information from conventional intensity detectors enables a superior level of material and source characterization.
1. Introduction 1.1. Foundations Extraction of the maximum possible information from datasets has been critical to the practical establishment of many fields, among them X-ray charge-density analyses (Coppens, 1997; Bolotovsky et al., 1995), and monochromatic and polychromatic photocrystallographies (Coppens et al., 2009; Fullagar et al., 2000; Helliwell et al., 1989; Kim et al., 2002). The capability for hyperspectral X-ray imaging via shot noise in energy-proportional X-ray imaging detectors does not seem widely used within X-ray radiography and tomography communities. The corresponding aspect of shot noise has been indicated elsewhere (Uhlig et al., 2011), which motivated its further elaboration in the present study. Quantum shot noise exists in many situations, where clas- sical statistics may be applied. When doing so, the following two conditions may be indicated. (i) A witnessed measurement is proportional to both the number of quanta reaching the detector, N, and the detected content of the quanta, E. W ¼ NETðÞ E ; hWi¼hNiETðÞ E : ð1Þ
(ii) The number of quanta shows shot noise (Poisson) # 2017 International Union of Crystallography variation about its expectation value hNi, so that
Acta Cryst. (2017). B73, 675–695 https://doi.org/10.1107/S2052520617009222 675 Philip Coppens tribute
pffiffiffiffiffiffiffiffi ¼ hNiETðEÞ: ð2Þ Fano (1947) (see Appendix H). Their combined influence may be explored via equation (4) using models of S(E)andT(E). The measurement W may be the calibrated output of an The ratio of expectation values recalls a Rayleigh quotient analog to digital converter, but it could equally be a mass, a (Anton & Rorres, 1987), which would give an eigenvalue concentration, a deflection, a current or any other witnessable corresponding to ET(E); while 2=hWi is the eigenfunction of magnitude. Depending on literary background and context, the repeated self-convolution process corresponding to the the product NE might be formally defined as an intensity; to central limit theorem (see Appendix G). The calibration of avoid potential conflicts of understanding we avoid that T(E) for an energy-proportional detector is thus tantamount notation. The transfer function T(E) describes how E is to making a crude spectrometer from it. It enables hyper- transmitted through the detector system to yield the witnessed spectral imaging from the corresponding detector, even in signal. Observe that a quantum efficiency Q(E) governs the cases where individual quanta are not discernible. number of quanta N captured by the detector [and so effec- tively resides within N in equations (1) and (2)], but that Q(E) 1.3. Scope and organization is functionally distinct from T(E). The product ET(E) is the This document seeks to provide a useful collection of detector’s response to what it captures. ÀÁpffiffiffiffi Rearrangement of equation (2) gives E ¼ = N ½ 1=TEðÞ. information concerning the index of dispersion as a metric of This has been used to estimate the energy of hot electrons in quanta. A mathematical review is avoided for accessibility matter (Uhlig et al., 2011), which is usually a preliminary step reasons, recognizing that any claims of novelty would be of X-ray detection (Wilson, 1912; Fullagar et al., 2008). impossible to defend in that arena. Readers may be able to The role of 2/hWi as a metric of E follows from equations apply the interpretation offered here, that includes poly- (1) and (2), chromatic and lossy situations, to data of diverse natures. A cursory introduction to historical and other aspects is made 2 next, before presenting X-ray observations from which ¼ ETðÞ E ; ð3Þ hWi applied developments have been growing from this work, and closing with some corollaries. Appendices are referred to so is independent of Qp(Effiffiffiffiffiffiffiffi), but still a function of T(E). throughout. Their structure is as linear as reasonably possible, h i ¼ h i Alternatively, W = N (the signal-to-noise ratio for offering a deeper look at the many necessary considerations Poisson signals) follows by cancellation of ET(E) from equa- under suitable subheadings. They provide essential insights to tions (1) and (2). pertinent physics and mathematics, additional applied exam- ples whose examination allows placement of general bounds 1.2. The polychromatic case on applicability, and entry points to further literature. Equation (3) only really applies if E is single-valued (i.e. monodisperse). Otherwise, the spectrum may be conceptually divided into b evenly spaced energy bins of average energy 2. Applicability E1, E2 ..., Eb. The filling of neighbouring spectral bins is 2.1. Historical and mathematical assumed to be uncorrelated, so numerator and denominator contributions of the different energy bins are expanded The ability of the equations above to access the magnitude separately to give of quanta applies in broad fields. For example, in the early 1800s, ideas of the quantization of matter were re-invigorated 2 n E2T2ðÞþE n E2T2ðÞþþE n E2T2ðÞE by the likes of Dalton; molecular weights m became known ¼ 1 1 1 2 2 2 b b b hWi n1E1TEðÞþ1 n2E2TEðÞþþ2 nbEbTEðÞb through the work of Avogadro via Dumas’ vapour density hSEðÞET2ðÞiE measurements (soon also associated with the names Canniz- ¼ : ð4Þ hSEðÞTEðÞi zaro, Viktor Meyer and Hoffman) which then also led to atomic weight determinations (Glasstone, 1940; Pauling, Here the second equality gives the result as a ratio of expec- 1956). These use the fact that T=P ¼ m=R in accordance with tation values. These involveP the spectral density function the ideal gas law, for a gas of physical density (which supplies SEðÞ¼EiniðÞE =N, with ini ¼ N in the conceptual limit of the necessary mass dimension on the left side), at temperature infinite b and N. That suggests use of integral notation, T and pressure P with gas constant R. A connection to although here the quest for properties of quanta motivates the equation (3) is seen through the kinetic theory of gases, briefly summation notation of statistical mechanics and thermo- motivated as follows. From the average velocity v of mole- dynamics (Baker & Cousins, 1984; Schro¨dinger, 1989). cules, P / mv n (molecular momentum times numerical Equation (4) reduces to equation (3) in the monochromatic density); while T / mv2 (molecular kinetic energy) may be case, but S(E)andT(E) otherwise become essential features viewed as m-normalized dispersion of the molecular 2 within the expectation brackets. Spectral dispersion manifests momentum. The ratio T=P / mv=ðÞmv analogous to equation through S(E), which by itself always acts to increase the value (3). The recovery of m is by multiplying by ð¼ m nÞ, the of 2=hWi relative to the expectation energy hSðEÞi (see measured physical density of the gas. Appendix E). Meanwhile partial registry manifests through All the following have in common mathematics relating to T(E) and suppresses 2=hWi via arguments introduced by 2=hWi for the purposes of extracting quantum information,
676 Wilfred K. Fullagar et al. Unravelling the Fano factor Acta Cryst. (2017). B73, 675–695 Philip Coppens tribute and are grounded in fluctuation analyses in statistical value for data points, which is the key to their removal by mechanics (Barnes & Silverman, 1934; Schro¨dinger, 1952): subtraction, if needs are pressing and experiments cannot determination of the electron charge by shot noise measure- eliminate the cause. On average, a pattern properly accounted ments (Hull & Williams, 1925); diffusion and random walk for always brings expectation values closer to measured data models (Atkins, 1998); demonstration of the molecular (Appendix I). This reduces 2/hwi, whose minimum reliably constitution of matter and Avogadro’s number determination obtained value is therefore presumed best, provided that via Brownian motion (Glasstone, 1940); optical galvanometer underlying patterns do not involve an excessive number of and cantilever noise in atomic force microscopy to measure parameters (the optimum number of parameters given any Boltzmann’s constant (Ising, 1926; von Smoluchowski, 1912; prior information is provided by Bayesian analysis; Sivia & White, 1963). Raindrop measurements by corresponding Carlile, 1992). Examining all the dimensions of a dataset often arguments are presented in Appendix I. The closely related suggests patterns, while examination of known dimensions Fano factor has been examined in neuroscience contexts to within existing data and improved experimental methodology evaluate whether neural spiking follows a Poisson distribution stimulates the quest. (Typical experimental pattern searches (Eden & Kramer, 2010); the underlying arguments are may be seen in the determination of raindrop size Appendix I.) general. There are strong reasons to indicate a correspon- The shape of a Poisson distribution very rapidly approximates dence principle linking many related phenomena (Hughes & Gaussian even for quite modest expectation numbers of Ninham, 2016). quanta, but with the key additional property hxi¼ 2 The Poisson distribution is broadly accepted in many (Prigogine, 1978). The two distributions are often experi- contexts, but it can be difficult to pin specific physical causes mentally indistinguishable, especially where differences are for it. One derivation takes the binomial distribution in the involved. Where individual quanta are observed, Poisson limit of N (number of tries) going to infinity while the product error-normalized residuals whose histogram matches a =0, Np (with p being the probability of ‘success’ per try) remains = 1 Gaussian distribution indicate the exhaustion of a constant; while the binomial distribution can itself be seen as a dataset’s capacity to support further patterns (Uhlig et al., limit of other distributions (Dudewicz & Mishra, 1988). 2013). (This is because the additional distribution caused by a Relevance in combinatorial problems is implied, as when pattern would convolve with the observed residuals to adding random phases (Rayleigh, 1919) or when enumerating broaden their error-normalized histogram beyond the =0, microstates in statistical mechanics. Boltzmann’s combina- = 1 Gaussian expected for the hxi-subtracted Poisson torial approach to statistical mechanics can be avoided distribution alone.) Where quanta are not directly observed, (Darwin & Fowler, 1922; Schro¨dinger, 1952), whereupon the the underlying Poisson distribution may be open to question. Poisson distribution may be derived in functional form by However, using the ability of shot noise to numerically over- ‘top–down’ approaches (Morse, 1969; Kittel & Kroemer, whelm other noise sources, together with spectral examina- 1980), or from quantum mechanics (Henri-Rousseau & Blaise, tions, a picture of detector response can be built up. Poisson 2011). These allow the Poisson distribution’s perception as a shot noise is then unambiguously quantifiable in proportional distribution in its own right and not just as an approximation detectors that lack the sensitivity for single quantum to the binomial distribution (Noble, 1991); as an entropy- measurements. related function (Jaynes, 2003); and independently of any dimension chosen for its expression (e.g. the time domain is a popular choice; Lindgren, 1976; Davis, 1996). Mathematical 2.3. Key restrictions and the case for thermal detection consistency anticipates several approaches to the Poisson Practical measurement of a mean and standard deviation distribution’s derivation, just as the Gaussian distribution also via equations (1) and (2) implies a set of M individual enjoys (Jaynes, 2003). That empowers its use in both bottom– measurements, under experiment conditions that anticipate an up and top–down arguments, allowing robustness when facing expectation value. Such measurements might be done questions like ‘but what about quantum theory?’ (Jaynes, sequentially (in time domain) or in parallel (using identical 2003; Cabello et al., 2016). Inevitably then, the Poisson detectors in spatial or angular domains). All three domains are distribution’s deep roots lie beyond the scope of this manu- common in movies and images. Why then, is 2/hIi not used to script. Rather, the present contribution draws attention to the estimate colour information from black and white photo- connection it implies between 2=hWi and quanta, a rela- graphs? The X-ray experiments below show that at X-ray tionship that is overlooked by many works that might have photon energies (far exceeding the detector’s bandgap) the attached tremendous significance to it. X-ray colour can be determined in this way. However, for energies not much above the detector bandgap, ETðÞ’ E constant (Appendix H). This is because a quantum 2.2. Requirements with only fractionally more energy than the bandgap cannot For a property of quanta to be revealed by the index of excite more than one electron-hole pair over that energy dispersion, any patterns in data must be rigorously accounted barrier. Other consequences of the photon capture (mostly for – dispersion otherwise arises from variance in expectation heat) are usually not recorded. When the detector response values too, just as it does from shot noise [equation (4) ET(E) loses dependence on E, equations (1)–(4) no longer contains this, in S(E)]. Such patterns alter the expectation offer the required metric of E. The result is the same where
Acta Cryst. (2017). B73, 675–695 Wilfred K. Fullagar et al. Unravelling the Fano factor 677 Philip Coppens tribute thresholding (discrimination or windowing) is applied in the the parent quantum according to thermalization temperature front-end electronics of X-ray detectors. A different form of and the absorber’s heat capacity (Moseley et al., 1984; Mo¨ss- ill-conditioning occurs when T(E) for a detector is exponential bauer, 1991; Enss, 2005; Ullom & Bennett, 2015). A critical or extremely large, for example in detectors that, like Geiger consequence of this low-temperature microcalorimetry counters (Strong, 1938), use avalanche gain mechanisms approach for X-ray science is that the typically 10 5 (Fullagar et al., 2011b; Tutt et al., 2012), and methods of Darwin–Bragg losses of conventional diffraction-based thermal neutron detection that depend on unleashing rela- analyses (Als-Nielsen & McMorrow, 2001; Darwin, 1914) are tively enormous nuclear energies. There also, equations (1)– avoided (Fullagar, Uhlig et al., 2010; Fullagar et al., 2017), (4) can fail as a reliable metric of E. opening scattering-law technique development (Uhlig, 2011; Recognition of this situation in photon-measuring direct- Uhlig et al., 2013, 2015; Fullagar et al., 2017) around tempo- detection CCDs (Fullagar et al., 2008) motivated many rally remarkable in-house polychromatic hard X-ray sources investments in ultracryogenic microcalorimetry (Enss, 2005; (Fullagar et al., 2007; Ta Phuoc et al., 2012). Mazin et al., 2013; Fullagar et al., 2007). By suitable measures, not only is each quantum fully measured, but it is split into a 2.4. Radiation application fields temperature-dependent number of low-energy quasiparticles, Equations (1)–(4) apply in microcalorimeter systems and in fluctuations of which bound the accuracy of measurements of many other real-world imaging systems where the original
Figure 1 Pseudomonochromatic XRF data were generated in experiments using two different detectors (a) and (b). For the XRF images, expectation intensities (an example in c) were calculated from averages of 256 clearfield and 96 darkfield frames. (d) shows a selection of corresponding measured spectra. In (c) an expectation isointensity strip is highlighted, whose contributing pixels show measurable intensity variance. Plotting the variance against the expectation intensity in (e) yields an approximately straight line for each XRF sample, with a small y-axis offset due to inherent detector noise. In (f)the fitted line slopes are plotted against average photon energy observed in the spectrometer. A straight line fit through the points in panel f yields the zero- order estimate for the scintillator detector’s response.
678 Wilfred K. Fullagar et al. Unravelling the Fano factor Acta Cryst. (2017). B73, 675–695 Philip Coppens tribute quantum energy greatly exceeds bandgaps. These include al., 2001), which brilliant monochromatic sources cannot do X-ray radiography/tomography (Paziresh et al., 2016), Laue given ordered samples (Kim et al., 2002; Moffat, 2003; Helli- crystallography (Helliwell et al., 1989), polychromatic X-ray well et al., 1989; Fullagar 2017). The energy of quanta in the phase contrast imaging (Wilkins et al., 1996), cryogenic reflections is key to determining momentum transfer and microcalorimetry approaches to astronomical (Mazin et al., thereby structure (van Hove, 1954; Higginbotham et al., 2014; 2013) and THz (Becker et al., 2014) imaging, and other McDonald et al., 2015), as well as molecular dynamics via disciplines (e.g. fast electron measurements; Uhlig et al., 2011), spectroscopy (van Hove, 1954; Fullagar, 1999; Ament et al., and may be useful to estimate X-ray fluorescence (XRF) 2011). In combination with earlier work (Uhlig et al., 2011), contributions to diffuse scatter in recent XFEL reports (Ayyer results presented here indicate how to deduce that energy et al., 2016). In such situations we indicate the potential utility using pixel-to-pixel intensity variance in single shot data. to establish corresponding quantum energies and spectral X-ray radiography and tomography often uses a broadband parameters, when appropriate spectral models are applied to source and scintillator-coupled optical detector. This breed of equation (4). Equation (4) is an entry point to the thermo- detector was used in experiments described below. Parametric dynamics of colour, with Figs. 1 and 2 being demonstrations models relate S(E) to source spectra and (for example) density of it. and atomic number Z of intervening materials (Alvarez & Laue crystallography leads to monochromatic (Bragg- Macovski, 1976). Owing to the great redundancy of projec- diffracted) reflections on detector planes. This polychromatic tions traversing any given voxel, equation (4) offers clear technique is especially valuable for its ability to witness large scope for improved spectral understanding of statistical volumes of reciprocal space in single radiation shots (Sˇrajer et mechanics approaches to model-based refinement of
Figure 2 (a) An observed Amptek spectrum after spectral stripping (closed circles), the agreement obtainable in our models (line), and a typical Monte Carlo simulated spectrum (open circles; using different model parameters). (b) Monte Carlo tests of left- versus right-hand sides of equation (4) for several simulated spectra and applying T(E) = 1, anticipating direct correlation of the axes, as observed. (c) Physically measured data for three distinct spectra, observing three distinct slopes upon adjusting the intensity by tube current. In (d)(cf. b), physical measurements of 2=hIi are plotted against the correspondingly simulated right-hand side of equation (4). Nine measurements combining three filter arrangements and three tube voltage settings are evaluated, together with the data from (c). Random scatter about a straight line is seen, supporting the assumption T(E) = constant. (Physically different detectors were used in Fig. 1 versus Fig. 2, explaining why their final panels show different slopes.)
Acta Cryst. (2017). B73, 675–695 Wilfred K. Fullagar et al. Unravelling the Fano factor 679 Philip Coppens tribute polychromatic tomographic data (De Man et al., 2001; Elbakri molecules due to / N0 scaling; there the necessary average & Fessler, 2003; Nuyts et al., 2013). Equation (4) thus opens comes by collectively weighing the molecules in the respective spectral artefact correction and materials identification volume, a process for which the digitization error can often be (Paziresh et al., 2016) using statistical mechanics and a wide made insignificant in the macroscopic world. range of machine learning approaches to find variational minima. 3. X-rays 2.5. Scaling laws The X-ray spectrum witnessed by a detector [S(E) in equation An easily overlooked aspect of Poisson (shot) noise is that (4)] can usually be accurately accounted for using parametric the value of 2/hWi obtained from measurements is indepen- models, as is done below. Thereafter a combination of near- dent of the expectation number of photons hNi applied in the monochromatic and polychromatic X-ray examinations, individual intensity measurements, even when hNi2ð0; 1 . together with physical understanding of a detector system, This is seen in the cancellation of N that led to equation (3) offer incremental confidence in the nature and causes of the and in Monte Carlo simulations (Appendix E). Monte Carlo detector response function T(E), as well as changes of S(E) approaches shown there also confirm the analytical anticipa- arising from samples. Such measurements are particularly tion [equation (15)] that variance among sets of { 2/hWi} valuable given that specific details of radiation detectors and 2 1 containing M determinations of it scales as ðÞ 2=hWi / M their inner workings may be withheld from customers for (here using delta notation to avoid a confusion of sigmas). So commercial reasons. The extraction of spectral functions from in the absence of detector noise or instrumental digitization, inner products is known as unfolding. Many robotic inference the energy resolution of a monochromatic source obtainable procedures (Jaynes, 2003) for spectral unfolding in radiation by this averaging process is contexts have been implemented (Sanna & O’Brien, 1971;