Parallel Maximum-Likelihood Inversion for Estimating Wavenumber-Ordered Spectra in Emission Spectroscopy

Hoda El-Sayed, Marc Salit, John Travis, Judith Devaney, William George
[email protected] [email protected] [email protected] [email protected]
National Institute of Standards and Technology, Gaithersburg, Maryland, USA

Abstract

We introduce a parallelization of the maximum-likelihood cosine transform. This transform consists of a computationally intensive iterative fitting process, but is readily decomposed for parallel processing. The parallel implementation is not only scalable, but has also brought the execution time of this previously intractable problem to feasible levels using contemporary and cost-efficient high-performance computers, including an SGI Origin 2000, an SGI Onyx, and a cluster of Intel-based PCs.

Key words: parallel processing, emission spectroscopy, cosine transform, maximum-likelihood inversion, performance evaluation, DParLib, MPI.

1 Introduction

Michelson interferometers used in optical spectroscopy produce interferograms that can be considered a superposition of cosine functions. The optical spectrum is estimated from this interferogram using some transform method, typically the Fourier transform, a least-squares procedure which presumes normally-distributed noise. A more general maximum-likelihood cosine transform has been demonstrated to have several advantages over the Fourier transform when estimating the spectrum for typically sparse emission spectra. In particular, a maximum-likelihood inversion derived for a Poisson-distributed signal has been demonstrated to eliminate the often troublesome noise redistribution associated with the Fourier transform of such signals [2]. The Fourier transform distributes all noise in the signal as white (the spectral character of the noise), while the maximum-likelihood cosine transform distributes the signal-carried noise to the signal, preventing the masking of small spectral features by the signal-carried noise from the large spectral features. The spectral estimates obtained using maximum-likelihood inversion have another potentially useful property: a line shape burdened with a less distorting transform function than the sinc function of the Fourier transform.

In this paper we present a parallel implementation of the maximum-likelihood inversion method. It will be shown that the parallel implementation is not only scalable, but has also brought the execution time of this problem to feasible levels using contemporary and cost-efficient high-performance computers, including an SGI Origin 2000, an SGI Onyx, and PC clusters. By parallelizing this application, we were able to reduce the running time of the program to seconds rather than hours.

The rest of this paper is organized as follows: Section 2 describes the maximum-likelihood inversion and the sequential code implementation, Section 3 describes the computational methodology, outlining the sequential and parallel algorithms as well as their run-time cost, Section 4 shows the experimental results obtained on the different machines used, and Section 5 contains discussion and future work.

2 Maximum-Likelihood Inversion

The method of maximum likelihood (ML) was first described in a 1922 paper by Fisher [4]. It is a method for estimating parameters for a model given a set of observed data. The essential feature of ML is to find the set of values for the model parameters for which the probability of observing the given data is highest. In this application, we model the data as the superposition of cosine functions, and the parameters to be estimated by ML are the coefficients of those cosine functions.

The ML inversion method has been proposed for use in processing interferograms by Bialkowski [2]. The expectation-maximization (EM) algorithm used to compute the maximum likelihood is described by Shepp, Vardi, and Kaufman [11, 12] and also by Lange and Carson [10]. The reader is referred to those papers for more details on the development of the EM algorithm. The basic algorithm consists of iterating the following two equations:

    c_j^(k) = SUM_{i=1..N_freq} f_i^(k) B_ij                          (1)

    f_i^(k+1) = f_i^(k) SUM_{j=1..N_pts} B_ij d_j / c_j^(k)           (2)

where c_j^(k) is the estimated interferogram (the expectation of the interferogram for iteration k), d_j is the measured interferogram, the f_i are estimates of optical power, the B_ij are the basis-set functions, and f_i^(k) and f_i^(k+1) are the parameter estimates after the k and k+1 iterations. The B_ij are normalized on the sampling interval, so that SUM_j B_ij = 1. Initial parameter values, f_i^(0), are typically set to unity in the computations here. Convergence occurs when the sums to the right of f_i^(k) in equation (2) approach 1.

3 Computational Methodology

The algorithm was first implemented sequentially to get an initial baseline of its properties with respect to speed and to look at the results in terms of the physics. Figure 1 gives an outline of the serial algorithm. There are N_pts points in the interferogram, which is denoted d_j, where the j index the points. There are N_freq values of frequency, which are denoted nu_i. N_pts and N_freq are constants. The maximum-likelihood inversion method uses two input files: one contains the frequencies of interest and the other contains the interferogram points.

    N_pts   (program parameter)
    N_freq  (program parameter)
    read interferogram points d_j
    read frequencies nu_i
    compute normalized basis functions B_ij
    f_i = 1 for all frequencies i
    For each iteration
    BEGIN
        For all points j
        BEGIN
            c_j = SUM_i f_i * B_ij              // Eq. (1)
        END
        For all frequencies i
        BEGIN
            f_i = f_i * SUM_j B_ij * d_j / c_j  // Eq. (2)
        END
    END
    print f
    print c

    Figure 1. Pseudo-code for the sequential maximum-likelihood inversion algorithm.

In each iteration, we perform a computation based on all N_pts data points for each of the N_freq frequencies of interest, leading to a sequential time complexity of O(N_iter * N_freq * N_pts), where N_iter is the number of iterations. Following the sequential implementation, we created a parallel algorithm. Since the number of points in the interferogram is much larger than the number of frequencies of interest, we decided to parallelize across the interferogram. This turns the problem into a data-parallel SPMD computation.

Two libraries supporting the data-parallel programming model, using the Message Passing Interface (MPI), have been developed at the National Institute of Standards and Technology. One was written for Fortran 90 programs, known as "F90-DParLib" [3, 7], and the other is written for C programs, known as "C-DParLib" [5]. Both of these libraries use MPI for all inter-process communication. The use of this industry-standard library ensures the portability of these libraries with respect to message-passing operations.

Since the serial version of the maximum-likelihood algorithm was implemented in C, C-DParLib was used to support the parallel implementation. The data-parallel programming paradigm supported by C-DParLib allows programmers to specify operations on arrays rather than operations on individual array elements, producing a simple SPMD (single program, multiple data) style of program. The most important functions provided by C-DParLib allow the programmer to specify the general distribution scheme for dividing the data arrays among the processors, such as block and block-cyclic data distribution along one or more array axes, without requiring the specification of the machine size. All specific distribution parameters, such as how many array elements are assigned to each processor, are determined at runtime by the library. This frees the programmer to concentrate on the algorithm and not the details of array distribution. For many parallel algorithms, this data-distribution service alone greatly simplifies the programming task.

In addition to handling all of the details of data distribution, other commonly needed data-parallel operations are also supported by C-DParLib, such as array shifting, elemental operations, and array reductions. For operations that require communications, the library caches information for later reuse so that some of the overhead cost can be amortized over multiple operations. For operations not directly supported by the library, the programmer has access to all of the array distribution details needed to correctly access the distributed arrays and may use any of the MPI routines directly as needed.

We used this library to simplify the parallelization of the maximum-likelihood algorithm across interferogram points. Each processor receives a copy of all of the frequencies and one section, of size N_pts/P, of the interferogram, where P is the number of processors. In the update step of equation (2), a sum over all interferogram points is formed for each frequency. These sums are performed locally, and then a global reduction is performed. Thus the communication cost stays constant as the size of the interferogram grows, but grows as the number of frequencies is increased. Figure 2 gives an outline of the parallel implementation.

    N_pts   (program parameter)
    N_freq  (program parameter)
    P       (number of processors)
    Proc 0: read interferogram points d_j
    Proc 0: read frequencies nu_i

    // Communications
    Distribute array sections of d (N_pts / P points per processor)
    Copy arrays of frequencies nu to every processor

    // Local operations on each proc.
    f_i = 1 for all frequencies i
    compute local basis functions B_ij

    // Communications
    global reduction to normalize the basis functions B_ij

    For each iteration
    BEGIN
        // Local operations
        For all local points j
        BEGIN
            c_j = SUM_i f_i * B_ij              // Eq. (1)
        END
        For all frequencies i (Local and Communications)
        BEGIN
            // Local operation
            s_i = SUM over local j of B_ij * d_j / c_j
            // Communication
            f_i = f_i * global_sum(s_i)         // Eq. (2)
        END
    END

    // Local prints to Processor 0
    print f
    print c

    Figure 2. Pseudo-code for the parallel maximum-likelihood inversion algorithm.
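The communication structure of the parallel update can be exercised without MPI: each processor forms the partial sum of the equation (2) correction over its own block of points, and the partial sums are combined by a global sum, which the real code would obtain from an MPI reduction such as MPI_Allreduce with MPI_SUM. The sketch below is a serial emulation under those assumptions (the block bounds mimic a block distribution; C-DParLib's actual interface is not shown), checking that the reduced value matches the sequential sum.

```c
/* Partial sum of the Eq. (2) correction for one frequency over the
 * block of points [lo, hi): the work one processor does locally.
 * B_row is that frequency's row of the basis, d the measured
 * interferogram, c the current estimated interferogram. */
double local_correction(const double *B_row, const double *d,
                        const double *c, int lo, int hi)
{
    double s = 0.0;
    for (int j = lo; j < hi; j++)
        s += B_row[j] * d[j] / c[j];
    return s;
}

/* Emulate a block distribution of n points over p processors followed
 * by the global sum reduction (MPI_Allreduce in the real code).  The
 * block bounds cover [0, n) exactly, even when p does not divide n. */
double reduced_correction(const double *B_row, const double *d,
                          const double *c, int n, int p)
{
    double global = 0.0;
    for (int rank = 0; rank < p; rank++) {
        int lo = (int)((long long)rank * n / p);
        int hi = (int)((long long)(rank + 1) * n / p);
        global += local_correction(B_row, d, c, lo, hi);
    }
    return global;
}
```

Because addition is being regrouped, the reduced result can differ from the sequential sum by floating-point rounding only, which is why the reduction is safe to use inside the multiplicative EM update.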

Time complexity depicts how the time requirement of the parallel algorithm depends on the time to perform the computation as well as the time to perform communications. Communication time is considered pure overhead and could become a major problem for scalability if communication requirements grow rapidly with either an increase in the problem size or an increase in the number of processors. Using the previous definitions, the computational and communication costs are estimated respectively as:



    T_comp = t_comp * N_iter * N_freq * N_pts / P                    (3)

    T_comm = t_comm * N_iter * N_freq * log P                        (4)

where t_comp is the time for computing one floating-point result and t_comm is the time for one communication (sending or receiving one floating-point data element). This communication estimate does not take into account the startup cost for sending and receiving each message. Although for very small problems this startup cost can be significant, for production-size problems it should be insignificant when compared to the total communication cost. Hence, the total estimated time complexity, T, of this algorithm, defined by T = T_comp + T_comm, is

    T = N_iter * N_freq * ( t_comp * N_pts / P + t_comm * log P )    (5)

If the number of frequencies is much less than the number of points in the interferogram, and assuming the communication term t_comm * log P is small compared to t_comp * N_pts / P, the parallel time complexity should reduce to the computation cost only, O(t_comp * N_iter * N_freq * N_pts / P).

At some point, as we decrease the size of the problem on each processor, either by decreasing N_pts or by increasing P, the communication cost relative to the computation will increase and the program will no longer scale. As we show in the next section, for the problem sizes we anticipate and the number of processors we have available, this has not yet become a problem.

4 Experimental Results

The parallel maximum-likelihood algorithm was executed on three different machines: an Intel-based PC cluster, an SGI Onyx, and an SGI Origin 2000. The PC cluster consisted of 16 333-MHz Intel Pentium II processors running Linux (kernel version 2.0.35). Communication between the PC nodes was over 100 Mb/s Ethernet. Each node had 256 MB of RAM and a 4 GB local disk. The SGI Origin 2000 consisted of 24 250-MHz MIPS CPUs, with 24 GB of memory and 96 GB of disk space in total. The SGI Onyx consisted of 12 194-MHz MIPS R10000 CPUs, 3 GB of memory, and 18 GB of scratch disk space. Both SGI systems were running IRIX 6.5.

The parallel code was executed on different numbers of processors on each of the three machines. Following are graphs of run times and speedup, where each plotted point is the median of eleven different timing runs. Figures 3 and 4 show the median time and speedup for the PC cluster. Figures 5 and 6 show the median time and speedup for the SGI Origin 2000. The SGI Onyx median time and speedup are shown in Figures 7 and 8.

Figure 3. Median Timing for varying numbers of processors on an Intel-based PC cluster.

For all tests, the interferogram consisted of 4096 points and computations were performed for 66 frequencies. In each case the speedup is nearly linear. The speedup was computed by dividing the execution time measured on one processor by the execution time measured on P processors. The SGIs exhibited superlinear speedup relative to the one- and two-processor performance. We attribute this to the more effective use of the SGIs' cache memories once the per-processor problem size has been reduced sufficiently.

Figure 4. Speedup for varying numbers of processors on an Intel-based PC cluster.

5 Discussion and Future Work

We have implemented the maximum-likelihood algorithm on parallel processors and have shown how it can benefit from parallel processing. In addition, we have shown that the parallel implementation is not only scalable and portable, but has also brought the execution time of the maximum-likelihood computations to feasible levels using contemporary and cost-efficient high-performance computers, including machines such as the SGI Origin 2000, the SGI Onyx, and PC clusters. The implementation was studied on interferograms with sizes up to 4096 points.

Figure 5. Median Timing for varying numbers of processors on an SGI Origin 2000.

Figure 7. Median Timing for varying numbers of processors on an SGI Onyx.
Figure 6. Speedup for varying numbers of processors on an SGI Origin 2000.

Figure 8. Speedup for varying numbers of processors on an SGI Onyx.

Future work will study the properties of the problem on additional test cases with the goal of improving the overall performance, with a focus on increasing the rate of convergence. Initially we intend to pursue two possible approaches. First, as an alternative to the expectation-maximization algorithm for calculating the maximum likelihood as described here, we will investigate the use of an Iteratively Reweighted Least Squares (IRLS) algorithm [8]. The maximum-likelihood technique described here is effectively non-parametric in that it estimates parameters for millions of peaks. In this application, it is known that only around 10,000 peaks are possible. So, in the second approach, we propose to take advantage of knowledge of the chemistry and physics of the sample being measured to derive a parametric Poisson model.

Since beginning this work we have become aware of several similar projects that have also parallelized the EM algorithm, such as Gyulai et al. [6], Jones and Mitra [9], and Bastiaens and Lemahieu [1], with the latter two developing parallel algorithms for the imaging of positron emission tomography (PET) data. Although this algorithm has been shown to always converge [12], in most of the reports on implementations the speed of convergence has been cited as an issue.

6 Disclaimer

Certain commercial equipment and software may be identified in order to adequately specify or describe the subject matter of this work. In no case does such identification imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the equipment or software is necessarily the best available for the purpose.

References

[1] K. Bastiaens and I. Lemahieu. Parallel implementation of the maximum likelihood-expectation maximization (ML-EM) reconstruction algorithm for positron emission tomography (PET) images in a visual language. In Proc. Hybrid Image and Signal Processing IV, pages 176-183. SPIE, Bellingham, WA, April 1994.
[2] S. E. Bialkowski. Overcoming the multiplex disadvantage by using maximum-likelihood inversion. Applied Spectroscopy, 52:591-598, 1998.
[3] J. Devaney and J. Hagedorn. Internal paper. May 7, 1996.
[4] R. A. Fisher. On the mathematical foundations of theoretical statistics. In Contributions to Mathematical Statistics. J. Wiley & Sons, New York, 1950.
[5] W. George. Dynamic load balancing for data-parallel MPI programs. In Message Passing Interface Developer's and User's Conference, pages 95-100, March 1999. URL: http://www.itl.nist.gov/div895/savg/papers/LoadBalmpidc99.ps.
[6] C. Gyulai, S. Bialkowski, G. S. Stiles, and L. S. Powers. A comparison of three multi-platform message-passing interfaces on an expectation maximization algorithm. In Proc. 1993 World Transputing Conference, IOS Press, Amsterdam, Sept. 1993.
[7] J. Hagedorn and A. Heckert. DParLib user's guide. Draft of NIST software library documentation, March 18, 1997.
[8] R. M. Heiberger and R. Becker. Design of an S function for robust regression using iteratively reweighted least squares. Journal of Computational and Graphical Statistics, 1(3), 1992.
[9] H. M. Jones and G. Mitra. A parallel implementation of the expectation-maximization (EM) algorithm in positron emission tomography (PET). In Parallel Optimization Colloquium (POC96), pages 25-27, Versailles, France, March 1996.
[10] K. Lange and R. Carson. EM reconstruction algorithms for emission and transmission tomography. J. Comput. Assist. Tomog., 8:306-316, 1984.
[11] L. A. Shepp and Y. Vardi. Maximum likelihood reconstruction for emission tomography. IEEE Trans. Med. Imaging, MI-1:113-121, 1982.
[12] Y. Vardi, L. A. Shepp, and L. Kaufman. A statistical model for positron emission tomography. J. Amer. Statist. Assoc., 80(389):8-20, 1985.