1946 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016 A Fast Hyperplane-Based Minimum-Volume Enclosing Algorithm for Blind Hyperspectral Unmixing Chia-Hsiang Lin, Chong-Yung Chi, Senior Member, IEEE, Yu-Hsiang Wang, and Tsung-Han Chan

Abstract—Hyperspectral unmixing (HU) is a crucial signal well as their corresponding fractions (or abundances) present processing procedure to identify the underlying materials (or end- in a scene of interest from observed hyperspectral data, having members) and their corresponding proportions (or abundances) various applications such as planetary exploration, land map- from an observed hyperspectral scene. A well-known blind HU criterion, advocated by Craig during the early 1990s, considers ping and classification, environmental monitoring, and mineral the vertices of the minimum-volume enclosing simplex of the data identification and quantification [5]–[7]. The observed pixels in cloud as good endmember estimates, and it has been empirically the hyperspectral imaging data cube are often spectral mixtures and theoretically found effective even in the scenario of no pure of multiple substances, the so-called mixed pixel phenomenon pixels. However, such kinds of algorithms may suffer from heavy simplex volume computations in numerical optimization, etc. In [8], owing to the limited spatial resolution of the hyperspectral this paper, without involving any simplex volume computations, sensor (usually equipped on board the satellite or aircraft) uti- by exploiting a convex fact that a simplest simplex of lized for recording the electromagnetic scattering patterns of the vertices can be defined by associated hyperplanes, we propose underlying materials in the observed hyperspectal scene over a fast blind HU algorithm, for which each of the hyperplanes about several hundreds of narrowly spaced (typically, 5–10 nm) associated with the Craig’s simplex of vertices is constructed from affinely independent data pixels, together with an wavelengths that contiguously range from visible to near-in- endmember identifiability analysis for its performance support. frared bands. Occasionally, the mixed pixel phenomenon can Without resorting to numerical optimization, the devised algo- also be present in hyperspectral scenes where the underlying rithm searches for the active data pixels via simple materials are intimately mixed with each other [9] even though linear algebraic computations, accounting for its computational efficiency. Monte Carlo simulations and real data experiments are the spatial resolution is high enough. Hyperspectral unmixing provided to demonstrate its superior efficacy over some bench- (HU) [8], [10], an essential procedure of extracting individual mark Craig-criterion-based algorithms in both computational spectral signatures of the underlying materials in the captured efficiency and estimation accuracy. scene from these measured spectral mixtures, is therefore of Index Terms—Hyperspectral unmixing, Craig’s criterion, paramount importance in the HRS context. convex geometry, minimum-volume enclosing simplex, hyper- Blind HU, or unsupervised HU, involves two core stages, . namely endmember extraction and abundance estimation, without (or with very limited) prior knowledge about the end- members’ nature or the mixing mechanism. Some endmember I. INTRODUCTION extraction algorithms (EEAs), such as alternating projected YPERSPECTRAL remote sensing (HRS) [2]–[4], also subgradients (APS) [11], joint Bayesian approach (JBA) [12], H known as imaging spectroscopy, is a crucial technology and iterated constrained endmembers (ICE) [13] (also the to the identification of material substances (or endmembers) as sparsity promoting ICE (SPICE) [14]), can simultaneously determine the associated abundance fractions while extracting Manuscript received April 01, 2015; revised October 02, 2015; accepted De- the endmember signatures. Nevertheless, some EEAs perform cember 01, 2015. Date of publication December 17, 2015; date of current ver- endmember estimation, followed by abundance estimation sion February 26, 2016. The associate editor coordinating the review of this using such as the fully constrained least squares (FCLS) [15] to manuscript and approving it for publication was Prof. Andre de Almeida. This work is supported partly by the National Science Council, R.O.C., under Grant complete the entire HU processing. NSC 102-2221-E-007-035-MY2, and partly by Ministry of Science and Tech- The pure-pixel assumption has been exploited in devising nology, R.O.C., under Grant MOST 104-2221-E-007-069-MY3. Part of this fast blind HU algorithms to search for the purest pixels over work was presented at IEEE ICASSP 2015 [1]. (Corresponding author: Chia-Hsiang Lin.) the data set as the endmember estimates, and such searching C.-H. Lin, C.-Y. Chi, and Y.-H. Wang are with the Institute of procedure can always be carried out through simple linear al- Communications Engineering and the Department of Electrical En- gineering, National Tsing Hua University, Hsinchu, Taiwan 30013, gebraic formulations; see, e.g., pixel purity index (PPI) [16] R.O.C. (e-mail: [email protected]; [email protected]; and vertex component analysis (VCA) [17]. An important blind [email protected]). HU criterion, called Winter’s criterion [18], also based on the T.-H. Chan is with MediaTek Inc., National Science Park, Hsinchu, Taiwan 30013, R.O.C. (e-mail: [email protected]). pure-pixel assumption, is to identify the vertices of the max- Color versions of one or more of the figures in this paper are available online imum-volume simplex inscribed in the observed data cloud as at http://ieeexplore.ieee.org. endmember estimates. HU algorithms in this category include Digital Object Identifier 10.1109/TSP.2015.2508778

1053-587X © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1947

N-finder (N-FINDR) [18], simplex growing algorithm (SGA) by this observation, we naturally raise a question: Can Craig’s [19] (also the real-time implemented SGA [20]), and worst case minimum-volume simplex be identified by simply searching for alternating volume maximization (WAVMAX) [21], to name a a specific set of pixels in the data set regardless of the existence few. However, the pure-pixel assumption could be seriously in- of pure pixels? The answer is affirmative and will be given in fringed in practical scenarios especially when the pixels are in- this paper. timately mixed, for instance, the hyperspectral imaging data for Based on the convex geometry fact that a simplest simplex retinal analysis in the ophthalmology [9]. In these scenarios, HU of vertices can be characterized by the associated hy- algorithms in this category could completely fail; actually, it is perplanes, this paper proposes an efficient and effective unsu- proven that perfect endmember identifiability is impossible for pervised Craig-criterion-based HU algorithm, together with an Winter-criterion-based algorithms if the pure-pixel assumption endmember identifiability analysis. Each hyperplane, parame- is violated [21]. terized by a vector and an inner product constant [26], Without relying on the existence of pure pixels, another can then be estimated from affinely independent pixels promising blind HU approach, advocated by Craig in early in the data set via simple linear algebraic formulations. The re- 1990’s [22], exploits the simplex structure of hyperspectral sulting hyperplane-based CSI (HyperCSI) algorithm, based on data, and believes that the vertices of the minimum-volume the above pixel search scheme, can withstand the no pure-pixel data-enclosing simplex can yield good endmember estimates, scenario, and can yield deterministic, non-negative, and, most and algorithms developed accordingly include such as min- importantly, accurate endmember estimates. After endmember imum-volume transform (MVT) [22], minimum-volume con- estimation, a closed-form expression in terms of the identified strained nonnegative matrix factorization (MVC-NMF) [23], hyperplanes’ parameters is derived for abundance estimation. and minimum-volume-based elimination strategy (MINVEST) Then some Monte Carlo numerical simulations and real hyper- [24]. Moreover, some linearization-based methods have also spectral data experiments are presented to demonstrate the su- been reported to practically identify Craig’s minimum-volume perior efficacy of the proposed HyperCSI algorithm over some simplex, e.g., the iterative linear approximation in min- benchmark Craig-criterion-based HU algorithms in both esti- imum-volume simplex analysis (MVSA) [25] (also its fast mation accuracy and computational efficiency. implementation using the interior- method [26], termed The remaining part of this paper is organized as follows. as ipMVSA [27]), and the alternating linear programming in In Section II, we briefly review some essential convex geom- minimum-volume enclosing simplex (MVES) [28]. Empirical etry concepts, followed by the signal model and re- evidences do well support that this minimum-volume approach duction. Section III focuses on the HyperCSI algorithm devel- is resistant to lack of pure pixels, and can recover ground truth opment, and in Section IV, some simulation results are pre- endmembers quite accurately even when the observed pixels sented for its performance comparison with some benchmark are heavily mixed. Very recently, the validity of this empirical Craig-criterion-based HU algorithms. In Section V, we further belief has been theoretically justified; specifically, we show evaluate the effectiveness of the proposed HyperCSI algorithm that, as long as a key measure concerning the pixels’ mixing with AVIRIS [32] data experiments. Finally, we draw some con- level is above a certain (small) threshold, Craig’s simplex clusions in Section VI. can perfectly identify the true endmembers in the noiseless The following notations will be used in the ensuing presenta- scenario [29]. However, without the guidance of the pure-pixel tion. is the set of real numbers ( -vectors, assumption, this more sophisticated criterion would generally matrices). is the set of non-negative real lead to more computationally expensive HU algorithms. To the numbers ( -vectors, matrices). is best of our knowledge, the ipMVSA algorithm [27] and the the set of positive real numbers ( -vectors, matrices). simplex identification via split augmented Lagrangian (SISAL) denotes the Moore-Penrose pseudo-inverse of a matrix . algorithm [30] are the two state-of-the-art Craig-criterion-based and are all-one and all-zero -vectors, respectively. algorithms in terms of computational efficiency. Nevertheless, denotes the unit vector of proper dimension with the th entry in view of not only the NP-hardness of the Craig-simplex-iden- equal to unity. is the identity matrix. and stand tification (CSI) problem [31] but also heavy simplex volume for the componentwise inequality and strictly componentwise computations, all the above mentioned HU algorithms are yet inequality, respectively. denotes the Euclidean norm (i.e., to be much more computationally efficient. Moreover, their 2-norm). The distance of a vector to a set is denoted by performances may not be very reliable owing to the sensitivity [26]. denotes the cardinality to regularization parameter tuning, non-deterministic (i.e., of the set . The determinant of matrix is represented by non-reproducible) endmember estimates caused by random ini- . stands for the set of integers ,forany tializations, and, most seriously, lack of rigorous identifiability positive integer . analysis. In this work, we break the deadlock on the trade-off between asimple fast algorithmic scheme and the estimation accuracy in II. CONVEX GEOMETRY AND SIGNAL MODEL the no pure-pixel case. We have observed that when the pure- In this section, a brief review on some essential convex geom- pixel assumption holds true, the effectiveness of a simple fast etry will be given for ease of later use. Then the signal model HU algorithmic scheme could be attributed to that the desired for representing the hyperspectral imaging data together with solutions (i.e., pure pixels) already exist in the data set. Inspired dimension reduction preprocessing will be presented. 1948 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

with the pixel index , each pixel in the observed data set can then be represented as a linear mixture of the endmembers’ spectral signatures1

(1)

where is the spectral signature ma- trix, is the abundance vector, and is the total number of pixels. The following standard as- sumptions pertaining to the model in (1), which also charac- terize the simplex structure inherent in the hyperspectral data, are used in our HU algorithm development later [2]–[8], [10], Fig. 1. A graphical illustration in for some convex geometry concepts, [28]: where the segment connecting and is the convex hull of , (Non-negativity) and . the straight line passing and is the affine hull of , the shaded tri- angle is the convex hull of , and the plane passing the three points (Full-additivity) . is the affine hull of .Asanaffinehullin is called and is full column a hyperplane if its affine dimension is 2, is a hyperplane, while rank. is not. Moreover, like most benchmark HU algorithms (see, e.g., [23], [25], [28], [30]), the number of endmembers is assumed to A. Convex Geometry Preliminary be known a priori, which can be determined beforehand by applying model-order selection methods, such as hyperspec- The convex hull of a given set of vectors tral signal subspace identificationbyminimumerror(HySiMe) is defined as [26] [36], and Neyman-Pearson detection theory-based virtual di- mensionality (VD) [37]. We aim to blindly estimate the unknown endmembers (i.e., ), as well as their abundances (i.e., ), where (cf. Fig. 1). A convex hull from the observed spectral mixtures (i.e., ). Due is called an -dimensional simplex to the huge dimensionality of hyperspectral data, directly an- with vertices if is affinely alyzing the data may not be very computationally efficient. In- independent, or, equivalently, if stead, an efficient data preprocessing technique, called affine set is linearly independent, and it is called a simplest simplex in fitting (ASF) procedure [38], can be applied to compactly rep- when [33]. For example, a triangle is a resent each measured pixel in a dimension-reduced 2-dimensional simplest simplex in , and a tetrahedron is a (DR) as follows: 3-dimensional simplest simplex in (cf. Fig. 1). For a given set of vectors ,itsaffine hull (2) is defined as [26] where

(3) where (cf. Fig. 1). This affine hull can be (4) parameterized by a 2-tuple using the following alternative representation [26]: (5) in which denotes the th principal eigenvector (with unit norm) of the square matrix ,and where (the rank of )istheaffine dimension of is the mean-removed . Moreover, an affine hull data matrix. Actually, like other dimension reduction algorithms is called a hyperplane if its affine dimension [39], ASF also performs noise suppression in the meantime. It (cf. Fig. 1). has been shown that the above ASF best represents the measured data in an -dimensional space in the sense of least- B. Signal Model and Dimension Reduction squares fitting error [38], while such fitting error vanishes in the Consider a scenario where a hyperspectral sensor measures 1 solar electromagnetic radiation over spectral bands from Note that there is a research line considering non-linear mixtures for mod- eling the effect of multiple reflections of solar radiation [34]. Moreover,the unknown materials (endmembers) in a scene of interest. Based endmember spectral signatures may be spatially varying, hence leading to the on the linear mixing model (LMM) [2]–[8], [10], [28], where full-additivity in being violated [10]. However, studying these effects is the measured solar radiations are assumed to reflect from the ex- out of the scope of this paper, and the representative LMM is sufficient for our analysis and algorithm development; interested readers can refer to the maga- plored scene through one single bounce, and the endmembers’ zine papers [34] and [35], about the non-linear effect and the endmember vari- spectral signature vectors areassumedtobeinvariant ability effect, respectively. LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1949 noiseless scenario [38]. Note that the data mean in the DR space is the origin (by(2)and(3)). Because of in typical HU applications, the HyperCSI algorithm will be developed in the DR space wherein the DR endmembers are estimated. Then, by (5), the endmember estimates in the original space can be restored as

(6) where ’s are the endmember estimates in the DR space. Fig. 2. An illustration of hyperplanes and DR data in for the case of ,where is a purest pixel in (a purest pixel can be considered as the pixel closest to ) but not necessarily very close to hyperplane III. HYPERPLANE-BASED CRAIG-SIMPLEX-IDENTIFICATION , leading to nontrivial orientation difference between and . ALGORITHM However, the active pixels and identified by (17) will be very close to (especially, for large ), and hence the orientations of and will First of all, due to (2) and – , the true endmembers’ be almost the same. On the other hand, one can see that the pixels identified by convex hull itself is a data-enclosing sim- (21) are (that are very close to each other), and thus the associated plex, i.e., normal vector estimate is obviously far away from the true .

(7) volume simplex can be uniquely determined by tightly en- According to Craig’s criterion, the true endmembers’ convex closed -dimensional hyperplanes, where each hyper- hull is estimated by minimizing the volume of the data-en- plane can be reconstructed from any affinely indepen- closing simplex [22], namely, by solving the following volume dent points of the hyperplane, we hence endeavor to search for minimization problem (called the CSI problem interchangeably affinely independent pixels (referred to as active pixels hereafter): in ) that are as close to the associated hyperplane as possible. We begin with purest pixels that define disjoint proper re- gions, each centered at a different purest pixel. Then for each hyperplane of the minimum-volume simplex, the desired (8) active pixels, that are as close to the hyperplane as possible, are respectively sifted from subsets of , each from the data where denotes the volume of the simplex set associated with one different proper region (cf. Fig. 2). Then . Under some mild conditions on the obtained pixels are used to construct one estimated data purity level [29], the optimal solution of problem (8) can hyperplane. Finally, the desired minimum-volume simplex can perfectly yield the true endmembers .2 be determined from the obtained hyperplane estimates. Besides in the HU context, the NP-hard CSI problem in (8) [31] has been studied in some earlier works in mathematical A. Hyperplane Representation for Craig’s Simplex geology [40] and computational geometry [41]. However, their The idea of solving the CSI problem in (8), without involving intractable computational complexity almost disable them from any simplex volume computations, is based on the hyperplane practical applications for larger problem size [41], mainly owing representation of a simplest simplex as stated in the following to calculation of the complicated nonconvex objective function proposition: [28] Proposition 1: If is affinely inde- pendent, i.e., is a simplest simplex, then can be reconstructed from the associated hy- perplanes ,that tightly enclose ,where in (8). Instead, the HyperCSI algorithm to be presented can judi- . ciously bypass simplex volume calculations, and meanwhile the Proof: It suffices to show that can be deter- identified simplex can be shown to be exactly the “minimum- mined by . It is known that hyperplane can volume” (data-enclosing) simplex in the asymptotic sense be parameterized by a normal vector and an inner . product constant as follows: First of all, let us succinctly present the actual idea on which the HyperCSI algorithm is based. As the Craig’s minimum- (9)

2In [29], is used As for all ,wehave to measure the data purity level of ,where from (9) that for all ,i.e., and ; the geometric interpretations of can be found in [29]. Simply speaking, one can show that , and the (10) most heavily mixed scenario (i.e., ) will lead to the lower bound [29]. On the contrary, the pure-pixel assumption is equivalentto the condition of (the upper bound) [29], in comparison with which a mild where are defined as condition of only is sufficient to guarantee the perfect endmember identifiability of problem (8) [29]. (11) 1950 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

(12) be the outward-pointing normal vector of hyperplane ,i.e., As is a simplest simplex in must be of full rank and hence invertible [26]. Hence, we have from (10) that (16)

(13) Considering Fact 1 and the requirement that the set must contain distinct elements (otherwise, is not affinely implying that can be reconstructed. The proof is therefore independent), we identify the desired affinely independent set completed. by: As it can be inferred from that the set of DR end- members is affinely independent, one can apply (17) Proposition 1 to decouple the CSI problem (8) into sub- problems of hyperplane estimation, namely, estimation of where are disjoint sets defined as parameter vectors in (9). Then (13) can be utilized to obtain the desired endmember estimates. Next, let us present (18) how to estimate the normal vector and the inner product constant fromthedataset , respectively. in which is the B. Normal Vector Estimation open Euclidean norm ball with center and radius . Note that The normal vector of hyperplane can be obtained by the choice of the radius is to guarantee that projecting the vector (for any ) onto the subspace are non-overlapping regions, thereby guaranteeing that that is orthogonal to the hyperplane [42], i.e., contains distinct points. Moreover, each hyperball must contain at least one pixel (as it contains either or ; cf. (18)), i.e., , and hence problem (17) must be a feasible problem (i.e., a problem with non-empty feasible set (14) [26]). where ,and If the points extracted by (17) are affinely independent, is the matrix with its then the estimated normal vector associated with can be de- th and th columns removed. Besides (14) for obtaining the termined as (cf. Proposition 2) normal vector of , we also need another closed-form ex- (19) pression of in terms of distinct points as given in the following proposition. Fortunately, the obtained by (17) can be proved (in Theorem Proposition 2: Given any affinely independent set 1 below) to be always affinely independent with one more as- can be alternatively obtained by sumption:3 (except for a positive scale factor) The abundance vectors (defined below (1)) are independent and identically distributed (15) (i.i.d.) following Dirichlet distribution with parameter whose probability density where is definedin(14). function (p.d.f.) is given by [44]: The proof of Proposition 2 can be shown from the fact that is the data mean in the DR space (by (2) and (3)), and is omitted here due to space limitation. (20) Based on Proposition 2, we estimate the normal vector by finding affinely independent data points where ,and denotes the gamma function. that are as close to as possible. To this end, an observation Theorem 1: Assume – hold true. Let from (7) is needed and given in the following fact: beasolutionto(17)with defined in (18), for all Fact 1: Observing that (i) (i.e., all the points lie on the same side of ; cf. (7)), and 3The rationale of adopting Dirichlet distribution in is not only that it is that (ii) , the point a well known distribution that captures both the non-negativity and full-addi- tivity of [44], but because it has been used to characterize the distribution of closest to is exactly the one with maximum of ,provided in the HU context [45], [46]. However, the statistical assumption is that points outward from the true endmembers’ simplex (cf. only for analysis purpose without being involved in our geometry-oriented algo- Fig. 2 and (14)). rithm development. So even if abundance vectors are neither i.i.d. nor Dirichlet distributed, the HyperCSI algorithm can still work well (cf. Subsection IV-D). Suppose that we are given “purest” pixels ,which Furthermore, we would like to emphasize that, in our analysis (Theorems 1 and basically maximize the simplex volume inscribed in ,andthey 2), we actually only use the following two properties of Dirichlet distribution: can be obtained using the reliable and reproducible successive (i) its domain is , and (ii) it is a continuous multivariate distribution with strictly positive density on its domain [47] (cf. projection algorithm (SPA) [10], ([43], Algorithm 4). So can Appendixes A and B). Hence, any distribution with these two properties can be be viewed as the pixel in “closest” to (cf. Fig. 2). Let used as an alternative in . LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1951

and . Then, the set is affinely independent with TABLE I probability 1 (w.p.1). PSEUDO-CODE FOR HYPERCSI ALGORITHM The proof of Theorem 1 is given in Appendix A. Note that the orientation difference between and the true may not be small (cf. Fig. 2). Hence, itself may not be a good estimate for either. On the contrary, it can be shown that the orientation difference between and tends to be small for large , and actually such difference vanishes as goes to infinity (cf. Theorem 2 as well as Remark 1 in Subsection III-E). On the other hand, if the pixels with maximum inner products in are jointly sifted from the whole data cloud ,i.e., (21) where and , rather than respectively from different regions ,as which can be further shown to have a closed-form solution: given in (17), the identified pixels in may stay quite close, (26) easily leading to large deviation in normal vector estimation as illustrated in Fig. 2 where are the identified where is the th component of and pixels using (21). This is also a rationale of finding using is the th component of . (17) for better normal vector estimation. Note that is just the minimum value for to yield non-nega- tive endmember estimates. Thus, we can generally set C. Inner Product Constant Estimation for some . Moreover, the value of is For Craig’s simplex (the minimum-volume data-enclosing empirically found to be a good choice for the scenarios where simplex), all the data in should lie on the same side of signal-to-noise ratio (SNR) is greater than 20 dB; typically, the (otherwise, it is not data-enclosing), and should be as tightly value of SNR in hyperspectral data is much higher than 20 dB close to the data cloud as possible (otherwise, it is not min- [32]. Let us emphasize that the larger the value of (or the imum-volume); the only possibility is when the hyperplane smaller the value of ), the farther the estimated hyperplanes must be externally tangent to the data cloud. In other words, from the origin , or the closer the estimated endmembers’ will incorporate the pixel that has maximum inner product with simplex to the boundary of the nonnegative , and hence it can be determined as ,where is orthant . On the other hand, we empirically observed that obtained by solving typical endmembers in the U.S. geological survey (USGS) li- brary [48] are close to the boundary of .Thatistosay,a (22) reasonable choice of should be relatively large (i.e., However, it has been reported that when the observed data relatively close to 1), accounting for the reason why the preset pixels are noise-corrupted, the random noise may expand the value of can always yield good performance. The re- data cloud, thereby inflating the volume of the Craig’s data-en- sulting endmember estimation processing of the HyperCSI al- closing simplex [21], [33]. As a result, the estimated hyper- gorithm is summarized in Steps 1 to 6 in Table I. planes are pushed away from the origin (i.e., the data mean in D. Abundance Estimation the DR space) due to noise effect, and hence the estimated inner product constant in (22) would be larger than that of the ground Though the abundance estimation is often done by solving truth. To mitigate this effect, the estimated hyperplanes need to FCLS problems [15], which can be equivalently formulated in be properly shifted closer to the origin, so instead, the DR space into (cf. ([49], Lemma 3.1)) , are the desired hyperplane estimates for some . Therefore, the corresponding DR endmember estimates are ob- tained by (cf. (13)) (27) it has been reported that some geometric quantities, acquired (23) during the endmember extraction stage, can be used to signif- icantly accelerate the abundance estimation procedure [50]. where and are given by (11) and (12) with and With similar computational efficiency improvements taken into replaced by and , respectively. Moreover, it account, we aim at expressing the abundance in terms is necessary to choose such that the associated endmember of readily available quantities (e.g., normal vectors and inner estimates in the original space are non-negative (cf. ), i.e., product constants) obtained when estimating the endmembers, in this subsection. The results are summarized in the following (24) proposition: By (23) and (24), the hyperplanes should be shifted closer to the Proposition 3: Assume – hold true. Then origin with at least, where has the following closed-form expression:

(25) (28) 1952 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

Proposition 3 can be derived from some simple geometrical ob- (in Step 5 in Table I) is exactly the Craig’s minimum-volume servations (cf. items (i) and (ii) in Fact 1) and the following well simplex (i.e., solution of (8)) and the true endmembers’ simplex known formula in the Algebraic Topology context w.p.1. Two noteworthy remarks about the philosophies and intu- (29) itions behind the proof of this theorem are given as follows: Remark 1: With the abundance distribution stated in , and its proof is omitted here due to space limitation; note that the pixels in can be shown to be arbitrarily close to the formula (29) has been recently derived again using different as the pixel number , and they are affinely independent approach in the HU context ([50], (12)). w.p.1 (cf. Theorem 1). Therefore, can be uniquely obtained Based on (28), the abundance vector can be estimated as by (19), and its orientation approaches to that of w.p.1. Remark 2: Remark 1 together with (7) implies that is (30) upper bounded by w.p.1 (assuming without loss of generality that ), and this upper bound can be shown to be achievable w.p.1 as .Thus,as ,wehavethat where is to enforce the non-negativity w.p.1. of abundance fractions (cf. Step 7 in Table I). One can It can be further inferred, from the above two remarks, that show that when , the abundance is exactly the true w.p.1 (cf. (23)) as in the absence estimates obtained using (30) is exactly the solution to the of noise. Although the identifiability analysis in Theorem 2 is FCLS problem in (27), while using (30) yields much lower conducted for the noiseless case and , we empirically computational cost than solving FCLS problems. Nevertheless, found that the HyperCSI algorithm can yield good endmember one should be aware of a potential limitation of using (30). estimates for a moderate and finite SNR, to be demonstrated Specifically, if is too far away from the endmembers’ by simulation results and real data experiments later. simplex (i.e., is much larger than Next, we analyze the computational complexity of the Hy- for some ), the zeroing operation in (30) could cause perCSI algorithm. The computation time of HyperCSI is pri- nontrivial deviation in abundance estimation. This can happen marily dominated by the computations of the feasible sets if is an outlier or the SNR is very low. However, as the (in Step 3), the active pixels in (inStep4),andthe SNR is reasonably high (like in AVIRIS data [21], [32]), most abundances (in Step 7), and they are respectively analyzed pixels in the hyperspectral data are expected to lie inside or in the following: very close to the endmembers’ simplex (cf. Step 3: Computing the feasible sets – )—especially when the endmembers are extracted , is equivalent to computing the based on Craig’s criterion. Hence, with the endmembers esti- sets ; cf. (18). Since is an mated by the Craig-criterion-based HyperCSI algorithm, simply open Euclidean norm ball, the computation of each set using (30) to enforce the abundance non-negativity is not only can be done by examining inequalities computationally efficient, but also still capable of yielding good . However, examining each abundance estimation as will be demonstrated later with the inequality requires (i) calculating one 2-norm (in ), simulation results (Table III in Subsection IV-C and Table IV in which yields , and (ii) checking whether this 2-norm Subsection IV-D). is smaller than , which yields . Hence, Step 3 yields Unlike most of the existing abundance estimation algorithms, . where all the abundance maps must be jointly estimated (e.g., Step 4: To determine , we have to identify the pixel FCLS [15]), the proposed HyperCSI algorithm offers an option from the set , whose complexity of solely obtaining the abundance map of a specific material of amounts to computing inner products in interest (say the th material) (each yielding ), and performing the point-wise max- (31) imum operation among the values of these inner products (cf. (17)), and hence the complexity of identifying is to save computational cost, or obtaining all the abundance maps easily verified as .More- by parallel processing (cf. (30)). Moreover, when over, gathering requires the com- calculating using (30), the denominator is a con- plexity stant for all pixel indices and hence only needs to be calculated once regardless of (which is usually large). ; the inequality is due to that s are disjoint. Re- peating the above for , Step 4 yields E. Identifiability and Complexity of HyperCSI . In this subsection, let us present the identifiability and com- Step 7: Estimation of the abundances requires to com- plexity analyses of the proposed HyperCSI algorithm. Particu- pute the fraction in (30) times. Each fraction involves larly, the asymptotic identifiability of the HyperCSI algorithm 2 inner products (in ), 2 scalar subtractions, and 1 can be guaranteed as stated in the following theorem with the scalar division, and thus costs . So, this step yields proof given in Appendix B: . Theorem 2: Under – , the noiseless assumption and Therefore, the overall computational complexity of HyperCSI , the simplex identified by HyperCSI algorithm with is . LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1953

Surprisingly, the complexity order of the proposed HyperCSI algorithm is the same as (rather than much higher than) that of some pure-pixel-based EEAs; see, e.g., [17], [21], [43], [51]. Moreover, to the best of our knowledge, the MVES algorithm [28] for which the CSI problem in (8) is handled by solving linear programing (LP) subproblems in an alternative fashion, and the LPs are solved using efficient primal-dual in- terior-point method [26], is the existing Craig-criterion-based algorithm with lowest complexity order ,where is the number of iterations [28]. Hence, the introduced hyper- plane identification approach (without simplex volume compu- tations) indeed yields much smaller complexity than most ex- isting Craig-criterion-based algorithms. Let us conclude this section with a summary of some re- markable features of the proposed HyperCSI algorithm (given in Table I) as follows: Fig. 3. The endmember identifiability of the HyperCSI algorithm with finite a) Without involving any simplex volume computations, the data length . Craig’s minimum-volume simplex is reconstructed from hyperplane estimates, i.e., the estimates , which can be obtained in parallel (cf. Step 4 in Table I) is used as the performance measure of endmember esti- by searching for active pixels from .The mation, where reconstructed simplex in the DR space is actually is the set of all permutations of the intersection of halfspaces . Similarly, the performance measure of abundance . estimation is the rms angle error defined as [28] b) By noting that if, and only if, , the po- tential requirement of pixels lying on, or close to, the associated hyperplanes is not too difficult to be met (33) in practice because the abundance maps in hyperspectral imaging often show large sparseness. This will be further discussed in more detail in experiments with AVIRIS data where and are the true abundance map of th endmember in Section V. (cf. (31)) and its estimate, respectively. All the HU algorithms c) All the processing steps (including SPA in Step 2 of under test are implemented using Mathworks Matlab R2013a Table I; cf. Algorithm 4 in [43]) can be carried out either running on a desktop computer equipped with Core-i7-4790K by simple linear algebraic formulations or by closed-form CPU with 4.00 GHz speed and 16 GB random access memory, expressions, and so its high computational efficiency can and all the performance results in terms of , and compu- be anticipated. tational time are averaged over 100 independent realizations. Next, we show some simulation results for the endmember identifiability for moderately finite data length (cf. Theorem 2), IV. COMPUTER SIMULATIONS the choice of the parameter , and the performance evaluation of This section demonstrates the efficacy of the proposed Hy- the proposed HyperCSI algorithm, in the following subsections, perCSI algorithm by Monte Carlo simulations. In the simula- respectively. tion, endmember signatures with spectral bands ran- domly selected from USGS library [48] are used to generate A. Endmember Identifiability of HyperCSI for Finite Data noise-free synthetic hyperspectral data according to linear mixing model in (1), where the abundance vectors are i.i.d. gen- In Theorem 2, the perfect endmember identifiability of the erated following the Dirichlet distribution with (cf. proposed HyperCSI algorithm (with inStep5inTableI) (20)) as it can automatically enforce and [28], [33]. under the noise-free scenario is proved in the asymptotic sense Then we add i.i.d. zero-mean Gaussian noise with variance (i.e., the data length ). In this subsection, we would to the noise-free synthetic data for different values of SNR like to show some simulation results (cf. Fig. 3) to illustrate the asymptotic identifiability of the HyperCSI algorithm and its defined as ,andthoseneg- good endmember estimation accuracy even with a moderately ative entries in the generated noisy data vectors are artificially finite number of pixels . set to zero, so as to maintain the non-negativity nature of hyper- Fig. 3 shows some simulation results of versus for spectral imaging data. . From this figure, one can observe that for a The root-mean-square (rms) spectral angle error between given decreases as increases, and the HyperCSI algo- thetrueendmembers and their estimates rithm indeed achieves perfect identifiability (i.e., ,cf. defined as [17], [28] (32)) as . On the other hand, the HyperCSI algorithm needs to identify essential pixels for the construc- (32) tion of the Craig’s simplex, which indicates that the HyperCSI algorithm would need more pixels to achieve good performance 1954 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

Fig. 4. The average r.m.s. spectral angle error versus different values of .

for larger . Intriguingly, the results shown in Fig. 3 are con- TABLE II sistent with the above inferences, where a larger corresponds SIMULATION SETTINGS FOR THE ALGORITHMS UNDER TEST to a slightly slower convergence rate of . However, these results also allude to a high possibility that the HyperCSI al- gorithm can yield accurate endmember estimates with a typical data length (i.e., several ten thousands) for high SNR in HRS applications.

B. Choice of the Parameter The simulation results for versus obtained by the proposed HyperCSI algorithm, for (dB), and are shown in Fig. 4. From this figure, one can observe that for a fixed , the best choice of (i.e., the one that yields the smallest ) decreases as SNR decreases. The reason for this is that the larger the noise power, the more the data cloud is expanded, weight for robustness against noise, and hence has also been and hence the more the desired hyperplanes should be shifted tuned w.r.t. different SNRs. The implementation details and towards the data center (implying a larger or a smaller ). parameter settings for all the algorithms under test are listed in Moreover, one can also observe from Fig. 4 that for each Table II. scenario of , the best choice of basically belongs The purity index for each synthetic pixel [28], [29], to the interval [0.8, 1], a relatively large value in the interval [33] has been defined as (due to (0, 1], as discussed in Subsection III-C. It is also interesting and ); a larger index means higher pixel purity of to note that for a given SNR, the best choice of tends to . Each synthetic data set in the simulation approach the value of 0.9 as increases. For instance, for is generated with a given purity level denoted as , following dB, the best choices of for are the same data generation procedure as in [28], [29], [33], where , respectively. is a measure of mixing level of a data set. Specifically, a pool The above observations also suggest that , the only of sufficiently large number of synthetic data is first generated, parameter in the proposed HyperCSI algorithm, is a good and then from the pool, pixels with the purity index not choice. Next, we will evaluate the performance of the proposed greater than are randomly picked to form the desired data set HyperCSI algorithm with the parameter preset to 0.9 for all with a purity level of . the simulated scenarios and real data tests, though it may not In the above data generation, six endmembers (i.e., Jarsoite, be the best choice for some scenarios. Pyrope, Dumortierite, Buddingtonite, Muscovite, and Goethite) with 224 spectral bands randomly selected from the USGS li- C. Performance Evaluation of HyperCSI Algorithm brary [48] are used to generate 10000 synthetic hyperspectral We evaluate the performance of the proposed HyperCSI data (i.e., )with algorithm, along with a performance comparison with five and (dB). The sim- state-of-the-art Craig-criterion-based HU algorithms, including ulation results for , and computational time are dis- MVC-NMF [23], MVSA [25], MVES [28], SISAL [30], and played in Table III, where bold-face numbers correspond to the ipMVSA [27]. As the operations of MVC-NMF, MVSA, best performance (i.e., the smallest ,and )ofallthe SISAL, and ipMVSA are data-dependent, their respective HU algorithms under test for a specific . regularization parameters have been well selected in the simu- Some general observations from Table III are as follows. For lation, so as to yield their best performances. In particular, the fixedpuritylevel , all the algorithms under test perform better regularization parameter involved in SISAL is the regression for larger SNR. As expected, the proposed HyperCSI algorithm LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1955

TABLE III PERFORMANCE COMPARISON, IN TERMS OF (DEGREES), (DEGREES), AND AVERAGE RUNNING TIME (SECONDS), OF VARIOUS HU ALGORITHMS FOR DIFFERENT DATA PURITY LEVELS AND SNRS,WHERE ABUNDANCES ARE I.I.D. AND DIRICHLET DISTRIBUTED

rightly performs better for higher data purity level , but this The simulation results, in terms of ,andcom- performance behavior does not apply to the other five algo- putational time , are shown in Table IV, where bold-face rithms, perhaps because the non-convexity of the complicated numbers correspond to the best performance among the al- simplex volume makes their performance behaviors more in- gorithms under test for a particular data set and a specific tractable w.r.t. different data purities. (dB). As expected, for both data Among the five existing benchmark Craig-criterion-based sets, all the algorithms perform better for larger SNR. HU algorithms, MVC-NMF yields more accurate endmember One can see from Table IV that for both data sets, Hy- estimates than the other algorithms at the highest computational perCSI yields more accurate endmember estimates than the cost, while ipMVSA is the most computationally efficient one other algorithms, except for the case of (dB). but with lower performance as a trade-off. Nevertheless, the As for abundance estimation, HyperCSI performs best for proposed HyperCSI algorithm outperforms all the other five SYN1, while MVC-NMF performs best for SYN2. Moreover, algorithms when the data are heavily mixed (i.e., ) among the five existing benchmark Craig-criterion-based HU or moderately mixed (i.e., ). As for high data purity algorithms, ipMVSA and SISAL are the most computationally , the HyperCSI algorithm also performs best except for efficient ones. However, in both data sets, the computational the case of . On the other hand, the efficiency of the proposed HyperCSI algorithm is at least computational efficiency of the proposed HyperCSI algorithm more than one order of magnitude faster than the other five is about 1 to 4 orders of magnitude faster than the other five HU algorithms. These simulation results have demonstrated the algorithms under test. Note that the computational efficiency of superior efficacy of the proposed HyperCSI algorithm over the the HyperCSI algorithm can be further improved by an order other algorithms under test in both estimation accuracy and of if parallel processing can be implemented in Step computational efficiency. 4 (hyperplane estimation) and Step 7 (abundance estimation) in Table I. Moreover, ipMVSA is around 4 times faster than V. E XPERIMENTS WITH AVIRIS DATA MVSA, but performs slightly worse than MVSA, perhaps In this section, the proposed HyperCSI algorithm along with because ipMVSA [27] does not adopt the hinge-type soft two benchmark HU algorithms, i.e., the MVC-NMF algorithm constraint (for noise resistance) as used in MVSA [25]. [23] developed based on Craig’s criterion, and the VCA algo- rithm [17] (in conjunction with the FCLS algorithm [15] for D. Performance Evaluation of HyperCSI Algorithm With the abundance estimation) developed based on the pure-pixel Non-i.i.d., Non-Dirichlet and Sparse Abundances assumption, are used to process the hyperspectral imaging data collectedbytheAirborneVisible/Infrared Imaging Spectrom- In practice, the abundance vectors may not be i.i.d. and eter (AVIRIS) [32] taken over the Cuprite mining site, Nevada, seldom follow the Dirichlet distribution, and, moreover, the in 1997. We consider this mining site, not only because it has abundance maps often show large sparseness [52]. In view of been extensively used for remote sensing experiments [54], this, as considered in [52], [53], two sets of sparse and spatially but also because the available classification ground truth in correlated abundance maps displayed in Fig. 5 were used to [55], [56] (though which may have coregistration issue as it generate two synthetic hyperspectral images, denoted as SYN1 was obtained earlier than 1997, this ground truth has been and SYN2 .Thenall widely accepted in the HU context) allows us to easily verify the algorithms listed in Table II are tested again with these the experimental results. The AVIRIS sensor is an imaging two synthetic data sets for which the abundance vectors are spectrometer with 224 channels (or spectral bands) that cover obviously neither i.i.d. nor Dirichlet distributed. wavelengths ranging from 0.4 to 2.5 mwithanapproximately 1956 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

TABLE IV PERFORMANCE COMPARISON, IN TERMS OF (DEGREES), (DEGREES) AND RUNNING TIME (SECONDS), OF VARIOUS HU ALGORITHMS USING SYNTHETIC DATA SYN1 AND SYN2 FOR DIFFERENT SNRS,WHERE ABUNDANCES ARE NON-I.I.D., NON-DIRICHLET AND SPARSE (SEE FIG.5)

Fig. 6. The subimage of the AVIRIS hyperspectral imaging data cube for the 50th band, where the locations of the ten outliers identified by the RASF algo- rithm are marked with yellow color.

Farrand-Chang (NWHFC) eigenvalue-thresholding-based algo- rithm with false-alarm probability . The obtained estimate is and used in the ensuing experiments for all the three HU algorithms under test. The estimated abundance maps are visually compared with those reported in [17], [23], [28] as well as the ground truth reportedin[55],[56],soastodeterminewhatmineralsthey are associated with. The nine abundance maps obtained by the Fig. 5. Two sets of sparse and spatially correlated abundance maps, where each proposed HyperCSI algorithm are shown in Fig. 7, and they subblock in subfigure (b) contains 10 10 pixels. (a) Ground truth abundance maps of SYN1, (b) Ground truth abundance maps of SYN2. are identified as mineral maps of Muscovite, Alunite, Desert Varnish, Hematite, Montmorillonite, Kaolinite #1, Kaolinite #2, Buddingtonite, Chalcedony, respectively, as listed in Table V. 10-nm spectral resolution. The bands with low SNR as well The minerals identified by MVC-NMF and VCA are also listed as those corrupted by water-vapor absorption (including bands in Table V, where MVC-NMF also identifies nine distinct min- 1–4, 107–114, 152–170, and 215–224) are removed from the erals, while only eight distinct minerals are retrieved by VCA, original 224-band imaging data cube, and hence a total of perhaps due to lack of pure pixels in the selected subscene or bands is considered in our experiments. Furthermore, randomness involved in VCA. Owing to space limitation, their the selected subscene of interest includes 150 vertical lines with mineral maps are not shown here. 150 pixels per line, and its 50th band is shown in Fig. 6, where The mineral spectra extracted by the three algorithms under the 10 pixels marked with yellow color are removed from the test, along with their counterparts in the USGS library [48], are data set as they are outlier pixels identified by the robust affine shown in Fig. 8, where one can observe that the spectra ex- set fitting (RASF) algorithm [57]. tracted by the proposed HyperCSI algorithm hold a better re- The number of the minerals (i.e., endmembers) present in semblance to the library spectra. For instance, the spectrum of the selected subscene is estimated using a virtual dimension- Alunite extracted by HyperCSI shows much clearer absorption ality (VD) approach [37], i.e., the noise-whitened Harsanyi- feature than MVC-NMF and VCA, in the bands approximately LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1957

Fig. 8. (a) The endmember signatures taken from the USGS library, and sig- natures of the endmember estimates obtained by (b) HyperCSI, (c) MVC-NMF and (d) VCA.

Fig. 7. The abundance maps of minerals estimated by HyperCSI algorithm. that the potential requirement of sufficient number (i.e., , in this experiment) of pixels lying close to the TABLE V THE COMPUTATIONAL TIMES (SECONDS) AND SPECTRAL ANGLE hyperplanes associated with the actual endmembers’ simplex, DISTANCE (DEGREES)BETWEEN LIBRARY SPECTRA AND ENDMEMBERS has been met for the considered hyperspectral scene. However, ESTIMATED BY HYPERCSI, MVC-NMF, AND VCA. THE BOLD FACE we are not too surprised with this observation, since the number NUMBERS CORRESPOND TO THE SMALLEST VALUES OF OR AMONG THE THREE ALGORITHMS UNDER TEST of minerals present in one pixel is often small (typically, within five [10]), i.e., the abundance vector often shows sparseness [52] (cf. Fig. 7), indicating that a non-trivial por- tion of pixels are more likely to lie close to the boundary of the endmembers’ simplex (note that if, and only if, ). Moreover, as the pure pixels may not be present in the selected subscene, as expected the two Craig-criterion-based HU algorithms (i.e., HyperCSI and MVC-NMF) outperform VCA in terms of endmember estimation accuracy. On the other hand, in terms of the computation time as given in Table V, in spite of parallel processing not yet applied, the HyperCSI algo- rithm is around 2.5 times faster than VCA (note that VCA itself only costs 0.31 seconds (out of the 5.40 seconds), and the re- maining computation time is the cost of the FCLS) and almost from 2.3 to 2.5 m. To quantitatively compare the endmember four orders of magnitude faster than MVC-NMF. estimation accuracy among the three algorithms under test, the spectral angle distance between each endmember estimate and VI. CONCLUSION its corresponding library spectrum serves as the performance Based on the hyperplane representation for a simplest sim- measure and is defined as plex, we have presented an effective and computationally effi- cient Craig-criterion-based HU algorithm, called HyperCSI al- gorithm, given in Table I. The proposed HyperCSI algorithm has the following remarkable characteristics: The values of associated with the endmember estimates for • It never requires the presence of pure pixels in the data. all the three algorithms under test are also shown in Table V, • It is reproducible without involving random initialization. where the number in the parentheses is the value of associ- • It only involves simple linear algebraic computations, ated with Kaolinite #1 repeatedly classified by VCA. One can so suitable for parallel implementation. Its computa- see from Table V that the average of of the proposed Hy- tional complexity (without using parallel implementation) perCSI algorithm is the smallest. The good performance of Hy- is , which is also the complexity of some perCSI in endmember estimation in the experiment also implies state-of-the-art pure-pixel-based HU algorithms. 1958 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

• It estimates Craig’s minimum-volume simplex by finding where .As is only pixels (regardless of the data length ) affinely independent (by ), the matrix is of full column from the data set for the construction of the associated hy- rank [26], implying that (by (39)). Then, perplanes, without involving any simplex volume compu- by the facts of and , one can readily tations, thereby accounting for its high computational effi- come up with , or, equivalently, ciency in endmember estimation. (by (37)), implying that is true [26]. • The estimated endmembers are guaranteed non-negative, Thus we have proved that implies , and hence and the identified simplex was proven to be both Craig’s simplex and true endmembers’ simplex w.p.1. as (40) for the noiseless case. • The abundance estimation is readily fulfilled by a closed- As Dirichlet distribution is a continuous multivariate distribu- form expression, and thus is computationally efficient. tion [47] for a random vector to satisfy – with Some simulation results were presented to demonstrate the an -dimensional domain, any given affine hull analytic results on the asymptotic endmember identifiability of with affine dimension must satisfy [44] the proposed HyperCSI algorithm, and its superior efficacy over (41) some state-of-the-art Craig-criterion-based HU algorithms in both solution accuracy and computational efficiency. Finally, Moreover, as are i.i.d. random vectors and the the proposed HyperCSI algorithm was tested using AVIRIS hy- affine hull must have affine dimen- perspectral data to show its applicability. sion ,wehavefrom(41)that

APPENDIX (42) A. Proof of Theorem 1 Then we have the following inferences: For a fixed , one can see from (18) that ,implyingthatthe pixels , identified by solving (17) must be distinct. Hence, it suffices to show that is affinely independent w.p.1 for any that satisfies

(34)

Then, as ,wehavefrom and i.e., . Therefore, the proof is completed. (34) that there exist i.i.d Dirichlet distributed random vectors such that (cf. (2)) B. Proof of Theorem 2 It can be seen from (20) that the p.d.f. of Dirichlet distribution (35) satisfies For ease of the ensuing presentation, let denote the prob- ability function and define the following events: (43) The set is affinely dependent. The set is affinely dependent. Moreover, by the facts of and . Then, to prove that is affinely independent w.p.1, it suffices to prove . Next, let us show that implies . Assume is true. Then where denotes the interior of a set , the linear mapping for some . Without loss of (i.e., ) of the abundance domain full fills the generality, let us assume .Then, interior of the true endmembers’ simplex , (36) namely for some , satisfying (44) (37) Then, from (43)–(44) and , it can be inferred that By substituting (35) into (36), we have (38) where . For notational simplicity, let for any given vector which, together with the fact that the affine mapping (cf. (2)) . Then, from the facts of (by (37)) preserves the geometric structure of [38] (note and , (38) can be rewritten as that ), further implies

(39) (45) LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1959 where throughout the ensuing where the last inequality is due to proof. It can be inferred from (45) that there is always a pixel . Hence, by (53), to prove (52), it that can be arbitrarily close to the extreme point suffices to show that, for all , of the simplex ,i.e.,forall ,

(46) (54)

Let denote the set of all minimum-volume en- However, the SPA before post-processing (cf. Algorithm 4 in closing of (i.e., Craig’s simplex containing the set [43]) is exactly the same as the TRIP algorithm (cf. Algorithm ). Then, one can infer from the convexity of a simplex that (cf. 2 in [51]), and it has been proven in ([51], Lemma 3) that (46) [29, eq. (32)]) straightforwardly yields (54) for ; note that the condition “(46) with ” is equivalent to the pure-pixel assumption re- (47) quired in ([51], Lemma 3). One can also show that (46) yields (54) for any , and the proof basically follows the same Moreover, by the fact that any simplex must induction procedure as in the proof of ([51], Lemma 3) and is also be a closed set and the fact that the closure of omitted here for conciseness. Then, recalling that (54) is a suf- is exactly ,it ficient condition for (52) to hold, we have proven (52). can be seen that ifandonlyif By the fact that is a continuous function (cf. (14)) and by [58], and hence (16) and (18), we see that

(48) (55) Thus, it can be inferred from (45), (47) and (48) that Moreover, we have from (43), (52), (55) and that (49) the pixel identified by (17) can be arbitrarily close to . Furthermore, by Theorem 1, we have that the vectors As itself is a simplex, are not only arbitrarily close to ,butalso , which together with affinely independent w.p.1, which together with Proposition 2 (49) yields implies that the estimated (cf. (19)) is arbitrarily close to the true w.p.1, provided that the outward-pointing normal (50) vectors and have the same norm without loss of gen- erality. Then, from (43), (22), and the premises of In other words, we have proved that the Craig’s min- and , it can be inferred that the estimated hyperplane imum-volume simplex is always the true endmembers’ simplex is arbitrarily close to the true . To complete the proof of Theorem 2, it (cf. (9)); precisely, we have suffices to show that the true endmembers’ simplex is always identical to the simplex identified by the HyperCSI algorithm, i.e., for all ,

(51) (56) where are the estimated DR endmembers using Hy- Consequently, by comparing the formulas of (cf. (13)) and perCSI algorithm. (cf. (23)), we have, from and (56), that is always To this end, let us first show that, for all , arbitrarily close to , i.e., (51) is true for all , and hence the proof of Theorem 2 is completed. (52) where are the purest pixels identified by SPA (cf. Step 2 in Table I). However, directly proving (52) is difficult due ACKNOWLEDGMENT to the post-processing involved in SPA (cf. Algorithm 4 in [43]). The authors would like to express our sincere gratitude In view of this, let be those pixels identified to knowledgeable reviewers, providing many insightful per- by SPA before post-processing. Because the post-processing is spective comments that significantly improve the quality of nothing but to obtain the purest pixel by iteratively pushing this paper. The authors would also like to thank Dr. Jose M. each away from the hyperplane [43], we Bioucas-Dias, Dr. Marian-Daniel Iordache, and Dr. Jie Chen have the following simplex volume inequalities for providing sparse and spatially correlated abundance distri- bution maps on one hand, and offering valuable advice for our (53) simulation studies on the other hand. 1960 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 8, APRIL 15, 2016

REFERENCES [23] L. Miao and H. Qi, “Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization,” [1] C.-H. Lin, C.-Y. Chi, Y.-H. Wang, and T.-H. Chan, “A fast hyperplane- IEEE Trans. Geosci. Remote Sens., vol. 45, no. 3, pp. 765–777, 2007. based MVES algorithm for hyperspectral unmixing,” in Proc. IEEE [24] E. M. Hendrix, I. Garc´ıa, J. Plaza, G. Martin, and A. Plaza, “A new ICASSP, Brisbane, Australia, Apr. 19–24, 2015, pp. 1384–1388. minimum-volume enclosing algorithm for endmember identification [2]N.KeshavaandJ.F.Mustard,“Spectralunmixing,”IEEE Signal and abundance estimation in hyperspectral data,” IEEE Trans. Geosci. Process. Mag., vol. 19, no. 1, pp. 44–57, Jan. 2002. Remote Sens., vol. 50, no. 7, pp. 2744–2757, 2012. [3] J. M. Bioucas-Dias, A. Plaza, G. Camps-Valls, P. Scheunders, N. [25] J. Li and J. M. Bioucas-Dias, “Minimum volume simplex analysis: A Nasrabadi, and J. Chanussot, “Hyperspectral remote sensing data fast algorithm to unmix hyperspectral data,” in Proc. IEEE IGARSS, analysis and future challenges,” IEEE Geosci. Remote Sens. Mag., Boston, MA, USA, Aug. 8–12, 2008, vol. 4, pp. 2369–2371. vol. 1, no. 2, pp. 6–36, June 2013. [26] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, [4] W.-K.Ma,J.M.Bioucas-Dias,J.Chanussot,andP.Gader,Eds.,IEEE U.K.: Cambridge Univ. Press, 2004. Signal Process. Mag., Special Issue on Signal and Image Processing [27]J.Li,A.Agathos,D.Zaharie,J.M.Bioucas-Dias,A.Plaza,andX. in Hyperspectral Remote Sensing, vol. 31, no. 1, Jan. 2014. Li, “Minimum volume simplex analysis: A fast algorithm for linear [5] D. Landgrebe, “Hyperspectral image data analysis,” IEEE Signal hyperspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 53, Process. Mag., vol. 19, no. 1, pp. 17–28, Jan. 2002. no. 9, pp. 5067–5082, Apr. 2015. [6] G. Shaw and D. Manolakis, “Signal processing for hyperspectral image [28] T.-H. Chan, C.-Y. Chi, Y.-M. Huang, and W.-K. Ma, “A convex anal- exploitation,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 12–16, ysis-based minimum-volume enclosing simplex algorithm for hyper- Jan. 2002. spectral unmixing,” IEEE Trans. Signal Process., vol. 57, no. 11, pp. [7]D.Stein,S.Beaven,L.Hoff,E.Winter, A. Schaum, and A. Stocker, 4418–4432, 2009. “Anomaly detection from hyperspectral imagery,” IEEE Signal [29] C.-H.Lin,W.-K.Ma,W.-C.Li,C.-Y.Chi,andA.Ambikapathi,“Iden- Process. Mag., vol. 19, no. 1, pp. 58–69, Jan. 2002. tifiability of the simplex volume minimization criterion for blind hy- [8] J. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, perspectral unmixing: The no pure-pixel case,” IEEE Trans. Geosci. and J. Chanussot, “Hyperspectral unmixing overview: Geometrical, Remote Sens., vol. 53, no. 10, pp. 5530–5546, 2015. statistical, and sparse regression-based approaches,” IEEE J. Sel. [30] J. M. Bioucas-Dias, “A variable splitting augmented Lagrangian ap- Topics Appl. Earth Observ., vol. 5, no. 2, pp. 354–379, 2012. proach to linear spectral unmixing,” in Proc. IEEE Whispers, Grenoble, [9] W. R. Johnson, M. Humayun, G. Bearman, D. W. Wilson, and W. Fink, France, Aug. 26–28, 2009, pp. 1–4. “Snapshot hyperspectral imaging in ophthalmology,” J. Biomed. Opt., [31] A. Packer, “NP-hardness of largest contained and smallest containing vol. 12, no. 1, pp. 0140361–0140367, Feb. 2007. simplices for V- and H-,” Discrete Computat. Geometr., vol. [10] W.-K. Ma, J. M. Bioucas-Dias, T.-H. Chan, N. Gillis, P. Gader, A. J. 28, no. 3, pp. 349–377, 2002. Plaza, A. Ambikapathi, and C.-Y. Chi, “A signal processing perspec- [32] AVIRIS Free Standard Data Products [Online]. Available: http://aviris. tive on hyperspectral unmixing,” IEEE Signal Process. Mag., vol. 31, jpl.nasa.gov/html/aviris.freedata.html no. 1, pp. 67–81, 2014. [33] A. Ambikapathi, T.-H. Chan, W.-K. Ma, and C.-Y. Chi, “Chance-con- [11] A. Zymnis, S.-J. Kim, J. Skaf, M. Parente, and S. Boyd, “Hyperspectral strained robust minimum-volume enclosing simplex algorithm for hy- image unmixing via alternating projected subgradients,” presented at perspectral unmixing,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. the 41st Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, 11, pp. 4194–4209, 2011. USA, Nov. 4–7, 2007. [34] N. Dobigeon, J.-Y. Tourneret, C. Richard, J. Bermudez, S. Mclaughlin, [12] N. Dobigeon, S. Moussaoui, M. Coulon, J.-Y. Tourneret, and A. Hero, and A. O. Hero, “Nonlinear unmixing of hyperspectral images: Models “Joint Bayesian endmember extraction and linear unmixing for hyper- and algorithms,” IEEE Signal Process. Mag., vol. 31, no. 1, pp. 82–94, spectral imagery,” IEEE Trans. Signal Process., vol. 57, no. 11, pp. 2014. 4355–4368, Nov. 2009. [35] A. Zare and K. Ho, “Endmember variability in hyperspectral analysis: [13]M.Berman,H.Kiiveri,R.Lagerstrom,A.Ernst,R.Dunne,andJ.F. Addressing spectral variability during spectral unmixing,” IEEE Signal Huntington, “ICE: A statistical approach to identifying endmembers in Process. Mag., vol. 31, no. 1, pp. 95–104, 2014. hyperspectral images,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. [36] J. M. Bioucas-Dias and J. M. P. Nascimento, “Hyperspectral subspace 10, pp. 2085–2095, Oct. 2004. identification,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 8, pp. [14] A. Zare and P. Gader, “Sparsity promoting iterated constrained end- 2435–2445, 2008. member detection in hyperspectral imagery,” IEEE Geosci. Remote [37] C.-I. Chang and Q. Du, “Estimation of number of spectrally distinct Sens. Lett., vol. 4, no. 3, pp. 446–450, 2007. signal sources in hyperspectral imagery,” IEEE Trans. Geosci. Remote [15] D. Heinz and C.-I. Chang, “Fully constrained least squares linear Sens., vol. 42, no. 3, pp. 608–619, Mar. 2004. mixture analysis for material quantification in hyperspectral imagery,” [38] T.-H. Chan, W.-K. Ma, C.-Y. Chi, and Y. Wang, “A convex analysis IEEE Trans. Geosci. Remote Sens., vol. 39, no. 3, pp. 529–545, Mar. framework for blind separation of non-negative sources,” IEEE Trans. 2001. Signal Process., vol. 56, no. 10, pp. 5120–5134, Oct. 2008. [16] J.W.Boardman,F.A.Kruse,andR.O.Green,“Mappingtargetsigna- [39] I. K. Fodor, “A survey of dimension reduction techniques,” Lawrence tures via partial unmixing of AVIRIS data,” in Proc. Summ. JPL Air- Livermore Nat. Lab., Tech. Rep. UCRL-ID-148494, 2002. borne Earth Sci. Workshop, Pasadena, CA, USA, Dec. 9–14, 1995, vol. [40] W. E. Full, R. Ehrlich, and J. E. Klovan, “EXTENDED 1, pp. 23–26. QMODEL—Objective definition of external endmembers in the [17] J. M. P. Nascimento and J. M. Bioucas-Dias, “Vertex component analysis of mixtures,” Math. Geol., vol. 13, no. 4, pp. 331–344, 1981. analysis: A fast algorithm to unmix hyperspectral data,” IEEE Trans. [41] Y. Zhou and S. Suri, “Algorithms for a minimum volume enclosing Geosci. Remote Sens., vol. 43, no. 4, pp. 898–910, Apr. 2005. simplex in three ,” SIAM J. Comput., vol. 31, no. 5, pp. [18] M. E. Winter, “N-FINDR: An algorithm for fast autonomous spectral 1339–1357, 2002. end-member determination in hyperspectral data,” in Proc. SPIE Conf. [42] S. Friedberg, A. Insel, and L. Spence, Linear Algebra, 4th ed. Upper Imag. Spectrometry, Pasadena, CA, USA, Oct. 1999, pp. 266–275. Saddle River, NJ, USA: Prentice-Hall, 2003. [19] C.-I. Chang, C.-C. Wu, W.-M. Liu, and Y.-C. Quyang, “A new [43]S.Arora,R.Ge,Y.Halpern,D.Mimno,A.Moitra,D.Sontag,Y.Wu, growing method for simplex-based endmember extraction algorithm,” and M. Zhu, “A practical algorithm for topic modeling with provable IEEE Trans. Geosci. Remote Sens., vol. 44, no. 10, pp. 2804–2819, guarantees,” 2012, arXiv preprint arXiv:1212.4777. 2006. [44] B. A. Frigyik, A. Kapila, and M. R. Gupta, “Introduction to [20] C.-I.Chang,C.-C.Wu,C.S.Lo,andM.-L.Chang,“Real-timesimplex the Dirichlet distribution and related processes,” Electr. Eng. growing algorithms for hyperspectral endmember extraction,” IEEE Dept., Univ. of Washington, Seattle, WA, USA, Tech. Rep., 2010 Trans. Geosci. Remote Sens., vol. 48, no. 4, pp. 1834–1850, 2010. [Online]. Available: http://www.semanticsearchart.com/down- [21] T.-H. Chan, W.-K. Ma, A. Ambikapathi, and C.-Y. Chi, “A sim- loads/UWEETR-2010-0006.pdf plex volume maximization framework for hyperspectral endmember [45] J. M. Nascimento and J. M. Bioucas-Dias, “Hyperspectral unmixing extraction,” IEEE Trans. Geosci. Remote Sens., vol. 49, no. 11, pp. based on mixtures of Dirichlet components,” IEEE Trans. Geosci. Re- 4177–4193, 2011. mote Sens., vol. 50, no. 3, pp. 863–878, 2012. [22] M. D. Craig, “Minimum-volume transforms for remotely sensed data,” [46] J. M. P. Nascimento and J. M. Bioucas-Dias, “Hyperspectral unmixing IEEE Trans. Geosci. Remote Sens., vol. 32, no. 3, pp. 542–552, May algorithm via dependent component analysis,” in Proc. IEEE IGARSS, 1994. Barcelona, Spain, Jul. 23–28, 2007, pp. 4033–4036. LIN et al.: A FAST HYPERPLANE-BASED MINIMUM-VOLUME ENCLOSING SIMPLEX ALGORITHM FOR BLIND HU 1961

[47] N. L. Johnson, S. Kotz, and N. Balakrishnan, Continuous Multivariate Engineering (ICE) since 1999 (also the Chairman of ICE during 2002–2005), Distributions, Models and Applications, 1st ed. New York, NY,USA: National Tsing Hua University, Hsinchu, Taiwan. He has published more Wiley, 2002. than 210 technical papers, including about 80 journal papers (mostly in IEEE [48] R. Clark, G. Swayze, R. Wise, E. Livo, T. Hoefen, R. Kokaly, TRANSACTIONS ON SIGNAL PROCESSING),4bookchaptersandmorethan130 and S. Sutley, USGS Digital Spectral Library Splib06a: U.S. Geo- peer-reviewed conference papers, as well as a graduate-level textbook, Blind logical Survey, Digital Data Series 231, 2007 [Online]. Available: Equalization and System Identification, Springer-Verlag, 2006. His current http://speclab.cr.usgs.gov/spectral.lib06 research interests include signal processing for wireless communications, [49] R. Heylen, B. Dzevdet, and P. Scheunders, “Fully constrained least convex analysis and optimization for blind source separation, biomedicaland squares spectral unmixing by simplex projection,” IEEE Trans. Geosci. hyperspectral image analysis. Remote Sens., vol. 49, no. 11, pp. 4112–4122, Nov. 2011. Dr. Chi has been a Technical Program Committee member for many IEEE [50] P. Honeine and C. Richard, “Geometric unmixing of large hyperspec- sponsored and co-sponsored workshops, symposiums and conferences on tral images: A barycentric coordinate approach,” IEEE Trans. Geosci. signal processing and wireless communications, including Co-organizer and Remote Sens., vol. 50, no. 6, pp. 2185–2195, Jun. 2012. General Co-chairman of 2001 IEEE Workshop on Signal Processing Advances [51] A. Ambikapathi, T.-H. Chan, C.-Y. Chi, and K. Keizer, “Two effective in Wireless Communications (SPAWC), and Co-Chair of Signal Processing for and computationally efficient pure-pixel based algorithms for hyper- Communications (SPC) Symposium, ChinaCOM 2008 & Lead Co-Chair of spectral endmember extraction,” in Proc. IEEE ICASSP, Prague, Czech SPC Symposium, ChinaCOM 2009. He was an Associate Editor (AE) of IEEE Republic, May 22–27, 2011, pp. 1369–1372. TRANSACTIONS ON SIGNAL PROCESSING (2001–2004, 2006, and 2012–2015), [52] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, “Total variation IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II (2006–2007), IEEE spatial regularization for sparse hyperspectral unmixing,” IEEE Trans. TRANSACTIONS ON CIRCUITS AND SYSTEMS I (2008–2009), AE of IEEE Geosci. Remote Sens., vol. 50, no. 11, pp. 4484–4502, 2012. SIGNAL PROCESSING LETTERS (2006–2010), and a member of the Editorial [53] J. Chen, C. Richard, and P. Honeine, “Nonlinear estimation of material Board of Elsevier Signal Processing (2005–2008), and an editor (2003–2005) abundances in hyperspectral images with -norm spatial regulariza- as well as a Guest Editor (2006) of the EURASIP Journal on Applied Signal tion,” IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2654–2665, Processing. He was a member of Signal Processing Theory and Methods 2014. Technical Committee (SPTM-TC) (2005–2010), IEEE Signal Processing [54] R. N. Clark, G. A. Swayze, K. E. Livo, R. F. Kokaly, S. Sutley, J. B. Society. Currently, he is a member of Signal Processing for Communications Dalton, R. R. McDougal, and C. A. Gent, “Imaging spectroscopy: Earth and Networking Technical Committee (SPCOM-TC) and a member of Sensor and planetary remote sensing with the USGS tetracorder and expert Array and Multichannel Technical Committee (SAM-TC), IEEE Signal systems,” J. Geophys. Res., vol. 108, no. 12, pp. 5–44, Dec. 2003. Processing Society. [55] G. Swayze, R. Clark, S. Sutley, and A. Gallagher, “Ground-truthing AVIRIS mineral mapping at Cuprite, Nevada,” in Proc. Summ. 4th Annu. JPL Airborne Geosci. Workshop, 1992, vol. 2, pp. 47–49. [56] G. Swayze, “The hydrothermal and structural history of the Cuprite Yu-Hsiang Wang received his B.S. degree from the Mining District, Southwestern Nevada: An integrated geological and Department of Communications Engineering, Yuan geophysical approach,” Ph.D. dissertation, Univ. of Colorado, Boulder, Ze University, Taiwan, in 2011, and M.S. degree from CO,USA,1997. Institute of Communications Engineering, National [57] T.-H. Chan, A. Ambikapathi, W.-K. Ma, and C.-Y. Chi, “Robust affine Tsing Hua University, Taiwan, in 2015. set fitting and fast simplex volume max-min for hyperspectral end- He is currently a graduate student with the Institute member extraction,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. of Communications Engineering, National Tsing Hua 7, pp. 3982–3997, 2013. University, Taiwan. His research interests are convex [58] T. M. Apostol, Mathematical Analysis, 2nd ed. Reading, MA, USA: optimization and hyperspectral unmixing. Addison-Wesley, 1974.

Chia-Hsiang Lin received the B.S. degree in Elec- trical Engineering from the National Tsing Hua Uni- versity (NTHU), Hsinchu, Taiwan, in 2010. He is pursuing the Ph.D.degreeinCommunica- Tsung-Han Chan received his B.S. degree from tions Engineering at NTHU. He is currently a vis- the Department of Electrical Engineering, Yuan Ze iting Doctoral Graduate Research Assistant of Vir- University, Taiwan, in 2004 and his Ph.D. degree ginia Polytechnic Institute and State University, Ar- from the Institute of Communications Engineering, lington, VA, USA. His research interests are in net- National Tsing Hua University, Taiwan, in 2009. work science, game theory, convex geometry and op- He is currently working as a Senior Engineer with timization, and blind source separation. MediaTek Inc., Hsinchu, Taiwan. He was a co-recip- ient of a WHISPERS 2011 Best Paper Award. He was also recognized as an Outstanding Reviewer for CVPR 2014, and a BestRevieweroftheIEEETGRS 2014. His research interests are in image processing Chong-Yung Chi (S’83–M’83–SM’89) received the and numerical optimization, with a recent emphasis on , ma- B.S. degree in 1975 and M.S. degree in 1977, from chine learning and hyperspectral remote sensing. Tatung Institute of Technology and National Taiwan University, all in Electrical Engineering. Then he received the Ph.D. degree in Electrical Engineering from the University of Southern California, Los Angeles, California, in 1983. From 1983 to 1988, he was with the Jet Propulsion Laboratory, Pasadena, California. He has been a Pro- fessor with the Department of Electrical Engineering since 1989 and the Institute of Communications