Quick viewing(Text Mode)

An Outlier Detection Method Based on Mahalanobis Distance for Source Localization

An Outlier Detection Method Based on Mahalanobis Distance for Source Localization

sensors

Article An Detection Method Based on Mahalanobis for Source Localization

Qingli Yan 1,2,*, Jianfeng Chen 1 and Lieven De Strycker 2

1 School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China; [email protected] 2 KU Leuven, ESAT-DRAMCO, Ghent Technology Campus, 9000 Ghent, Belgium; [email protected] * Correspondence: [email protected]; Tel.: +86-136-5919-3864

 Received: 20 June 2018; Accepted: 5 July 2018; Published: 7 July 2018 

Abstract: This paper addresses the problem of localization accuracy degradation caused by of the angle of arrival (AOA). The problem of outlier detection of the AOA is converted into the detection of the estimated source position sets, which are obtained by the proposed division and greedy replacement method. The Mahalanobis distance based on robust and estimation method is then introduced to identify the outliers from the position sets. Finally, the weighted least squares method based on the reliable probabilities and is proposed for source localization. The simulation and experimental results show that the proposed method outperforms representative methods when unreliable AOAs are present.

Keywords: angle of arrival; source localization; outlier detection; Mahalanobis distance; unreliable nodes

1. Introduction The source localization techniques based on the angle of arrival (AOA) estimate the target position using a set of estimated bearings. Various methods have been proposed to solve the localization problem [1–3], among which the closed-form method of the pseudolinear estimator (PLE) was proposed under the assumption that AOA errors are small [4]. Although the PLE method is easy to implement and is efficient to compute, it is sensitive to outliers (AOAs with large errors), and such outliers (also referred to as unreliable AOAs) may exist in many practical applications of distributed node networks, which typically consist of a large number of small, low-cost sensor nodes. When nodes are deployed in harsh and unattended environments, animal attack or other forms of interference may occur. Moreover, low-cost nodes have limited amounts of power, computational, and memory capacity, and these limitations may also cause outliers. Other factors, such as node failures, data loss, and non-line of sight (NLOS) propagation [5,6], can lead to unreliable measurements. As a result, the estimated AOAs at each node will deviate significantly from the true values. Such outliers have been found to be detrimental to the PLE [7,8]. Thus, it is important to identify these erroneous data to improve the localization performance or perform a repair of the data. To reduce the error induced by outliers in node networks, several hybrid localization methods have been proposed by combining the AOA with the time difference of arrival (TDOA) and received signal strength (RSS) to identify and mitigate the NLOS error [9]. The expectation maximization (EM) method is introduced to identify unreliable AOAs caused by NLOS [10]. The intersection points (IPs)-based method [11] calculates the source position by taking the centroid of the set of intersections obtained by pairs of bearing lines; however, this method cannot significantly improve the localization performance, even eliminating the IPs obtained by two bearing lines close to parallel. The proposed unreliable AOA detection method in [7] can improve the localization accuracy; however, many

Sensors 2018, 18, 2186; doi:10.3390/s18072186 www.mdpi.com/journal/sensors Sensors 2018, 18, 2186 2 of 17 threshold parameters need to be set. The steered-response power phase transform (SRP-PHAT) [12] source localization approaches have demonstrated robustness when operating in reverberant and noisy environments. Regardless, these methods require a considerably higher amount of information to be transmitted to the central processing node and cannot be applicable to large-range localization scenarios (e.g., hundreds or thousands of meters). In this work, every node is equipped with a microphone array to estimate the AOA and then transmits the estimated AOA to the central node. This method does not require time synchronization in different nodes. Note that AOA estimation methods under an environment with complex environmental noise are outside the scope of this paper; interested readers are referred to [13–15]. These robust AOA estimation methods are proposed under the assumption that only small portions of snapshots are contaminated; they can perform well for continuous source signals or impulsive interference noise, which only have influence on limited snapshots. However, with other causes that can last a period of time, such as sensor failures and non-line of sight (NLOS) propagation, all snapshots for one node are unreliable; thus, outliers that may deteriorate the localization performance are still present even when these methods are applied to estimate AOAs under complex noise. Therefore, the outlier detection for the AOA is still necessary to improve the localization accuracy. Here, we propose a robust localization method when outliers are present. A large number of positions can be obtained by different node combinations. The maximum number of estimated positions is N × (N − 1)/2 for an N-node network. However, the estimated positions are sensitive to the bearing lines and their differences. Deleting the outliers from all intersections alone cannot significantly improve the localization performance [11]. To increase the estimated position reliability and improve the detection accuracy, we propose the division and greedy replacement (DIG) method to obtain different estimated position sets by changing one node at one time. The robust estimation method of the mean and covariance matrix for estimated position sets is then addressed to provide the information for outlier detection. The Mahalanobis distance (MD) [16] is finally proposed to identify the outliers from the estimation position sets. Finally, the weighted least squares (WLS) method based on detected reliable probabilities and distances is used to estimate the source position. The proposed method is easy to implement and can be easily extended to a three-dimensional (3D) source localization method. The main contributions of this paper can be summarized as follows:

• The division and greedy replacement (DIG) method is developed to estimate the target positions. • The Mahalanobis distance based on robust estimation of mean and covariance matrix is proposed to detect the outliers from estimated source positions. • An improved WLS localization method based on reliable probabilities and distances is introduced. • Outdoor experiments are conducted to verify the proposed method.

The remainder of this paper is organized as follows. Section2 describes the AOA localization method and addresses the existing problem. The unreliable node detection method is proposed in Section3. Simulations and experimental results are presented in Sections4 and5, respectively. Finally, our work is summarized in Section6.

2. AOA-Based Localization Method and Problem Statement

2.1. The Pseudolinear Estimator (PLE) We consider N nodes equipped with a microphone array for each one, with known positions T sk = [xk, yk] (k = 1, 2, ... , N), are deployed in an area of interest to estimate the location of a single source p = [x, y]T as shown in Figure1. Under the Gaussian background noise assumption, the estimated angle θˆk of k-th node can be given by the following:

θˆk = θk + ηk, (1) Sensors 2018, 18, x FOR PEER REVIEW 3 of 17

θθηˆ =+ kkk, (1) Sensors 2018, 18, 2186 3 of 17 where y − y where θ = k k arctan( ) , (2) x −yx− yk θk = arctan( k ), (2) x − xk η σ 2 and k is the zero mean Gaussian noise with variance 2i . and ηk is the zero mean Gaussian noise with variance σi .

s3 θˆ 3

p

θˆ 2 θˆ 1 s2 s1

FigureFigure 1. Illustration 1. Illustration of angle of angle of ofarrival arrival (AOA) (AOA) localizat localizationion with with three three nodes nodes deployed deployed in in the the test test area. area.

TheThe set set of ofmeasurements measurements from from NN nodesnodes can can be be written written as as follows: follows: θθηˆ =+, (3) θˆ = θ + η, (3) ˆˆ= θθˆ θˆ Τ = θθ θΤ = ηη ηΤ where θ [12 ...N ] , θ [12T ...N ] , and η [12T ...N ] . Thus, the pseudolinearT where θˆ = [ θˆ θˆ ... θˆ ] , θ = [ θ θ ... θ ] , and η = [ η η ... η ] . Thus, estimator (PLE), 1also 2known asN the orthogonal1 vect2 ors (OV)N estimator, can be1 used2 to estimateN the the pseudolinear estimator (PLE), also known as the orthogonal vectors (OV) estimator, can be used to source position and is given by the following [4]: estimate the source position and is given by the following [4]: Ap = B + e , (4) Ap = B + e, (4) where the estimated source position is as follows:

where the estimated source position is as follows:Τ -1 Τ p(AA)ABˆ = , (5) −1 ˆp = (A=TA) θθˆATB,ˆ =−θθˆ ˆ (5) where the k-th row of matrix A and B is A(k ,:) [sinkk cos ] , B(kx ,:)kkkk sin y cos , k = 1, 2, …, N, and ˆ ˆ where the k-th row of matrix A and B is A(k, :) = [ sin θˆk cos θˆk ], B(k, :) = xk sin θk − yk cos θk, k = 1, = ηη ηΤ 2, . . . , N, and e [rr1122 sin sin rNN sin ] , (6) T e = [ r1 sin η1 r2 sin η2 ... rN sin ηN ] , (6) where rk is the distance between the source and node sk . where rk is the distance between the source and node sk. 2.2. Problem Formulation 2.2. Problem Formulation The PLE is easy to implement, even for large-scale data. However, the PLE is sensitive to The PLE is easy to implement, even for large-scale data. However, the PLE is sensitive to unreliable unreliable measurements (i.e., outliers). In this section, we use theoretical analysis and simulation measurements (i.e., outliers). In this section, we use theoretical analysis and simulation results to results to illustrate this problem. illustrate this problem. If the measurement error is sufficiently small, then we have sinη ≈ η . Thus, the ≈ kk If the measurement error is sufficiently small, then we have sin ηk≈ ηηk. Thus, the approximation approximation of the residuals of Equation (6) can be expressed as erkkk. The estimated error of of the residuals of Equation (6) can be expressed as ek ≈ rkηk. The estimated error of the source position thecan source be expressed position ascan follows be expressed [8]: as follows [8]: Δ=pppˆ − = − ∆p ˆpT p-1Τ TT -1 =()AA AB−1 − () AA AA−1p . = ATA ATB − ATA ATAp .(7) (7) -1 −1 =−=()AAATTTA AA()T e(−e)

Sensors 2018, 18, x FOR PEER REVIEW 4 of 17 Sensors 2018, 18, 2186 4 of 17 The covariance matrix of Equation (7) can be obtained by the following:

Τ The covariance matrix of Equation (7)cov( canΔ=ppp be ) obtainedE ( ΔΔ by ) . the following: (8)

Thus, the mean-square error (MSE)cov is( ∆givenp) = byE( the∆p∆ following:pT). (8) =Δ[] Thus, the mean-square error (MSE) isMSE given bytr cov( the following:p ) . (9) Submitting Equations (6) and (7) into Equation (10), we have the following: MSE = tr[cov(∆p)]. (9) 1 MSE = Submitting Equations (6) and2 (7)θθ into− Equation (10), we have the following:  sin (ij ) ij, ∈ s , (10) 1 MSE = 2 2 22 ∑ sin{}σθθ(θi−()()θffj) sin−+− cos ff sin θθ cos i,j∈s eiii 11 12 21 ii 22 nis∈ h io , (10) · 2 ( f − f )2 + ( f − f )2 ∑ σei 11 sin θi 12 cos θi 21 sin θi 22 cos θi i∈s > = 2 θ = 2 θ where S is defined as all the combinations if {,ij } with ji. f11 sin i , f22 cos i and is∈ is∈ 2 Τ { } > = 2 = 2 whereff==S is definedcosθθ sin as alland the σ combinations= E()ee . if i, j with j i. f11 ∑ sin θi, f22 ∑ cos θi and 21 12  ii ei i∈s i∈s is∈ f = f = 2 = E(eeT) 21 12 ∑ cos θi sin θi and σei . We cani∈ ssee from Equation (10) that the MSE is affected by the relative geometry between the sourceWe and can the see fromnodes, Equation the number (10) thatof nodes, the MSE and is the affected AOA bymeasurement the relative errors. geometry To betweenillustrate thethe sourceeffect of and outliers, the nodes, we conducted the number several of nodes, simulations and the AOA to analyze measurement the characteristics errors. To illustrate of the localization the effect oferror outliers, for different we conducted source several positions. simulations The source to analyze is assumed the characteristics to be located of the at localizationthe gridded error points, for different source positions. The source is assumed2 to be located at the gridded points, ranging from ranging from −10 m to 10 m in a 20 × 20 m grid with a resolution of 0.5 m. Four nodes— sss123,,, and 2 −10 m to 10 m in a 20 × 20 m grid with a resolution of 0.5 m. Four nodes—s1, s2, s3, and s4—are s4 —are randomly deployed in the test area, as shown in Figure 1. The root-mean-square error randomly deployed in the test area, as shown in Figure1. The root-mean-square error (RMSE) of the (RMSE) of the PLE of 500 trials for every target position is used as the performance metrics. PLE of 500 trials for every target position is used as the performance metrics. σσσσ=== ° The RMSEs for different source positions are shown in Figure 2a when 1234=1 . It ◦is The RMSEs for different source positions are shown in Figure2a when σ1 = σ2 = σ3 = σ4= 1 . clear that the localization errors are relatively lower when the source is surrounded by the nodes It is clear that the localization errors are relatively lower when the source is surrounded by the nodes compared to the outside source. The conclusion follows the analysis based on the Cramer–Rao lower compared to the outside source. The conclusion follows the analysis based on the Cramer–Rao lower bound (CRLB) in [17]. bound (CRLB) in [17].

(a) (b)

FigureFigure 2. 2.Root-mean-square Root-mean-square error error (RMSE) (RMSE) fordifferent for differe sourcent source positions positions when four when nodes four are nodes deployed are × 2 σσσσ=== ° σ =° σσσ===° deployed in 2a 20 20 m test area: (a) 1234◦ =1 ; (b)◦ 1 10 , ,1234◦ . The black in a 20 × 20 m test area: (a )σ1 = σ2 = σ3 = σ4 = 1 ;(b) σ1 = 10 , , σ2 = σ3 = σ4 = 1 . The black dots denotedots denote the nodes, the nodes, and the and numbers the numbers on the on dotted the do linetted contours line contours are the are values the values of the of RMSEs. the RMSEs.

σ =° Assume that the unreliable node s1 is subject to a large noise with zero 1 10◦ and Assume that the unreliable node s1 is subject to a large noise with zero means σ1 = 10 and σσσ===°1 . The◦ resulting RMSEs are plotted in Figure 2b. When the source is close to the σ2 234= σ3 = σ4 = 1 . The resulting RMSEs are plotted in Figure2b. When the source is close to the unreliableunreliable node,node, thethe localizationlocalization accuracyaccuracy isis notnot significantlysignificantly deteriorated.deteriorated. However,However, thethe RMSEsRMSEs areare significantlysignificantly increasedincreased when when the the source source is is far far from from the the unreliable unreliable nodes. nodes. From From Equation Equation (10), (10), we we can can

Sensors 2018, 18, 2186 5 of 17

see thatSensors for 2018 the, 18 same, x FOR AOA PEER estimationREVIEW error σi, the MSE is mainly influenced by the distance r5k ofbetween 17 the source and the node sk. σ Toseedemonstrate that for the same the importance AOA estimation of detecting error unreliablei , the MSEnodes, is mainly the influenced RMSEs of theby the estimated distance positions rk obtainedbetween from the the source four and nodes the node with s onek . being unreliable are compared with the RMSEs when only three reliableTo demonstrate observations the areimportance used for of the detecting source un locatedreliable at nodes,p = [the3, 0 ]RMSEsT m. As of shownthe estimated in Table 1, the localizationpositions obtained errors from obtained the four using nodes only with three one reliablebeing unreliable nodes are are significantlycompared with lower the RMSEs than those = Τ obtainedwhen with only four three nodes, reliable one observations of whichis are unreliable. used for the Therefore, source located it is necessary at p [3,0] to detectm. As theshown unreliable in nodesTable and 1, then the removelocalization them errors to improveobtained theusing localization only three accuracy.reliable nodes are significantly lower than those obtained with four nodes, one of which is unreliable. Therefore, it is necessary to detect the unreliable nodes and then removeTable 1. themRMSE to for improve different the numbers localization of nodes. accuracy.

Table 1. RMSE for different numbers of nodes. Unreliable Node s1 s2 s3 s4 RMSEUnreliable (m): N Node= 4 1.2201s1 0.3168s2 0.3146s3 s 0.80604 RMSERMSE (m): (m):N N= = 3 4 0.21551.2201 0.3168 0.1642 0.3146 0.1695 0.8060 0.1780 RMSE (m): N = 3 0.2155 0.1642 0.1695 0.1780

3. The3. DIG_MDThe DIG_MD Method Method We knowWe know that that at leastat least two two nonparallel nonparallel bearingbearing lines lines are are required required to toestimate estimate an IP, an and IP, andthe the maximummaximum number number of IPsof IPs for forN nodesN nodes is isN (NNN(1)2−− 1)/. 2Regardless. Regardless of the of theparallel parallel cases cases of two of twobearing bearing lines,lines, all IPs all areIPs expectedare expected to beto be close close to to each each other other andand surround surround the the true true source source position position when when they they are onlyare only subjected subjected tolow-level to low-level environment environment noise. noise. In contrast, the the bearings bearings corrupted corrupted by large by large noise noise s will leadwill lead the IPsthe toIPs beto be far far from from the the source source position.position. As As shown shown in inFigure Figure 3, the3, theIPs IPsobtained obtained from from1 s1 are obviouslyare obviously far fromfar from the the other other IPs. IPs. Therefore, Therefore, wewe can identify identify the the unreliable unreliable AOAs AOAs by detecting by detecting outliersoutliers from from the the estimated estimated target target positions. positions. However,However, there there are are too too many many intersections intersections to calculate to calculate for large-scalefor large-scale node node networks networks if only if only two two bearings bearin aregs used.are used. Moreover, Moreover, the the IPs IPs are alsoare also easily easily affected affected by the errors of either one and by the angular distance [11]. For example, it is easy to cause a by the errors of either one and by the angular distance [11]. For example, it is easy to cause a false false alarm if s6 is determined to be unreliable when p16 and p36 are detected as outliers. To solve alarm if s6 is determined to be unreliable when p16 and p36 are detected as outliers. To solve these these problems, the division and greedy replacement (DIG) method is proposed here to improve problems, the division and greedy replacement (DIG) method is proposed here to improve the stability the stability of the estimated positions. The two-dimensional (2D) outlier detection method is then of theused estimated to find positions.the unreliable The bearings. two-dimensional Finally, the (2D) WLS outlier based detection on detected method reliable is probabilities then used to and find the unreliabledistances bearings. from initial Finally, position the WLS to nodes based is onused detected to perform reliable the localization. probabilities The and procedure distances is from given initial positionas follows: to nodes is used to perform the localization. The procedure is given as follows:

p16

s s 3 2 p′ s4 p12

p15 p

p14

s5 s 1 p36

s 6

FigureFigure 3. Illustration 3. Illustration of the of intersectionthe intersecti pointson points (IPs) (IPs) distribution distribution when when one one bearing bearing is unreliable,is unreliable, where where the rectangle denotes the IP and pij is the intersection point obtained by si and sj. the rectangle denotes the IP and pij is the intersection point obtained by si and sj.

3.1. The Division and Greedy Replacement (DIG) Method 3.1. The Division and Greedy Replacement (DIG) Method In order to detect outliers from the AOAs, based on estimated source positions, a set of position Inestimations order to detectare needed, outliers which from should the AOAs, be calculat baseded onby estimateda fixed number source of positions, nodes only a setwith of one position estimationsindependent are needed, variable. which Thus, should every be estimated calculated positi by aon fixed corresponds number of to nodes the unique only with different one independent node. variable.In this Thus, paper, every we estimated propose to position divide all corresponds nodes into totwo the sets, unique and differentthe greedy node. replacement In this paper, is thenwe used propose to divideto obtain all nodes different into combinations two sets, and of the a fi greedyxed number replacement of nodes is with then one used difference. to obtain different combinations of a fixed number of nodes with one difference. Sensors 2018, 18, 2186 6 of 17

(1) Division: In this section, the two separated set are defined as the reference node set (Ωre f ) and the replacement node set (Ωrep), with sizes m and N − m, respectively. Here, to provide an easier explanation, we assume that the reference nodes are indexed from 1 to m. Thus, Ωre f and Ωrep can be denoted as Ωre f = {s1, s2,..., sm} (m ≥ 3) and Ωrep = {sm+1, sm+2,..., sN}, respectively. Algorithm 1 presents the selection method of reference nodes:

Algorithm 1. Selection method of reference nodes.

(1) Estimate the initial source position p0 by the PLE based on all measurements; (2) Calculate the distances from p0 to all nodes; (3) Select m nodes that have short distances and can form a convex polygon with the target inside.

The performance analysis in [17] shows that the nodes that are close to the target are dominant in the localization results and that the localization error for the target inside a convex polygon composed of multiple nodes is smaller than that of an outside one. So, we propose to use the nodes that can comprise a convex polygon with the target inside and have short distances to the source as reference nodes, as shown in step 3; thus, no fewer than three nodes should be selected as the reference nodes. As the true position is unknown, an initial obtained from all the measurements can be used to evaluate the distances stated as step 1 and step 2. As shown in Figure3, p0 is the initial position calculated by 0 0 six measurements; s1, s2, and s3 are closest to p , and p is inside the convex polygon formed by the three nodes. Thus, s1, s2, and s3 are selected as reference nodes (i.e., Ωre f = {s1, s2, s3} when m = 3). The detail of the division method can be summarized as follows. To identify the unreliable bearings by detecting outliers from a set of estimated positions, we obtain the positions by changing only one node at a time. As noted above, the localization error is sensitive to the bearing error of the nodes relatively far from the source. Thus, we design the greedy replacement method by using each node in Ωrep to replace one of those in Ωre f . The procedure is given by Algorithm 2. Every node in Ωre f is replaced by (N − m) nodes from Ωrep. Next, m sets, including (N − m) positions in each set, can be obtained, and every point is calculated by m nodes with (m − 1) same nodes from Ωre f . For this method, the position sets can be calculated with cost (−m2 + mN). In contrast, the cost is [N(N − 1)/2] if all IPs are estimated. In general, the DIG method is computationally simpler than the IP method.

Algorithm 2. Greedy Replacement Method. For k = 1:m For j = m + 1:N Ω = Ωre f ∪ {sj} − {sk} T −1 T pk,j = (A(Ω) A(Ω)) A(Ω) B(Ω) T Pk(j,:) = pk,j end end

In this paper, X ∪ Y and X − Y denote the union and difference of sets X and Y, respectively; A(Ω) represents the matrix A in Equation (4) calculated based on the nodes from set Ω; and Pk(j, :) is the j-th row of Pk. Each element pk,j of Pk is the estimated position using sj, j ∈ {m + 1, . . . , N} to replace sk, k ∈ {1, . . . , m}.

3.2. Outlier Detection Method for Estimated Target Position Sets T All the position elements in Pk = [pk,m+1, pk,m+2,..., pk,n] , k = 1, 2, ... , m should be close to each other under the assumption that all the nodes are reliable. The outlier positions should be obtained from the unreliable nodes. For the 2D source localization problem, the elements in Pk are identically distributed 2D random vectors with mean µk and a positive-definite covariance matrix Σk. To identify Sensors 2018, 18, 2186 7 of 17

the unreliable nodes in set Ωrep, the square of the Mahalanobis distance (MD) [18–20], which can be formulated as in Equation (11), is proposed to detect outliers from the position matrix in Pk as follows:

2 T −1 dk,j(µk, Σk) = (pk,j − µk) Σk (pk,j − µk) (11)

In the field of data statistics, MD is typically used to characterize how far a particular datum is from the center. A point with a distance greater than a predetermined threshold is assumed to be an outlier. The outlier detection problem in this work is a 2D data detection problem. Therefore, the robust estimated method of Σk and µk is important for robust outlier detection. The outlier detection method for one position set Pk is given by Algorithm 3. To better explain Algorithm 3, let us recall the Gnanadesikan Kettenring (GK) estimator first [18], which provides a reasonable relationship between variance and covariance. Assume that V is the covariance matrix of L-dimensional random vector x and σ(·) represents the ; thus, we have 2 σ(cTx) = cTVc (12) for all c ∈ RL. The GK estimator can be formulated as the following:

1   cov(x, y) = σ(x + y)2 − σ(x − y)2 , (13) 4 where x and y are a pair of random vectors.

Algorithm 3. Outlier detection method from Pk.

−1 Step 1. Let D = diag(σ(Pk(:, 1)), σ(Pk(:, 2))), and Mk = PkD . Step 2. Compute the correlation matrix Ψ, Ψ11 = Ψ22 = 1, and 1 2 2 Ψ12 = Ψ21 = 4 [σ(Mk(:, 1) + Mk(:, 2)) − σ(Mk(:, 1) − Mk(:, 2)) ]. Step 3. Compute the matrix E whose columns are the eigenvectors of Ψ, and Ψ = EΛET, where Λ = diag(λ1, λ2), and λi are the eigenvalues. T  −1  T 0 Step 4. Let G = DE and Zk(j,:) = G pk,j , and v = [µ(Zk(:, 1)), µ(Zk(:, 2))] , and define Σk ← c1Σk , 0 0 T 0 2 2 and µk ← c2µk , where Σk = GΓG , µk = Gv and Γ = diag(σ(Zk(:, 1)) , σ(Zk(:, 2)) ), 2 2 2 Step 5. Calculate the square of MD dk,j based on Equation (11) with a threshold of dk0 = χ2(α).

Step 6. If dsk,j > dk0, pk,j is an outlier. Thus, the unreliable probability for sj is 1/m; otherwise, it is 0.

2 In Algorithm 3, med(·) represents the median value, χp(α) is the α-quantile of the chi-squared distribution with p degrees of freedom, diag(·) is the diagonal matrix, and σ(·) and µ(·) denote the 0 0 univariate standard deviation and average value, respectively. c1 and c2 are a constant. Σk and µk are the estimations of Σk and µk. Steps 1–4 in Algorithm 3 provide a method to obtain the positive-definite and approximately equal-variant covariance matrix Σk for high-dimensional scatter datasets with much shorter computing times [19]. The first step in Algorithm 3 makes the position vector scale-equivariant for different dimensions. Then, the GK estimator is used to calculate the covariance matrix Ψ in step 2. However, Ψ is symmetric but not necessarily positive semidefinite, it cannot satisfy the requirement of positive definiteness of Σk [20]. Considering the fact that, the eigenvalues of a covariance matrix can be seen as the variances along the directions of respective eigenvectors, the eigenvalue decomposition is performed to find eigenvalues and eigenvectors in step 3. A modification is then made in step 4 by using the positive robust variances calculated by Equation (12) to replace the eigenvalues, which may be negative [21], to obtain the positive diagonal covariance matrix Γ. Then, Γ is used to estimate the 0 positive-definite covariance matrix Σk instead of Λ. It has been proven in [22] that there exist constant 0 0 c1 and c2, such that the true Σk and µk can be approximated by the estimations Σk and µk , that is Sensors 2018, 18, 2186 8 of 17

0 0 Σk ← c1Σk , and µk ← c2µk . For the classical fast minimum covariance determinant (FASTMCD) method [23], c1 is defined as follows:

med(d ,..., d ) = sk,m+1 sk,N c1 2 , (14) χ2(0.5)

0 and c2 = 1. Once µk and Σk are obtained in step 4, the MD for every position vector can be calculated according to Equation (11), which can be rewritten as follows:

2 −1 T 0−1 dk,j(µk, Σk) = c1 (pk,j − µk) Σk (pk,j − µk). (15)

Thus, the outliers are identified by comparing the squared MDs with the defined threshold 2 dk0 obtained in step 5. The choice of the threshold is based on the fact that, when the position µ 2 2 matrix Pk ∼ N( k, Σk), the squared MD dk,j is distributed as a χ random variance with 2 degrees of freedom [22]. To reduce the false-alarm probability, we set an unreliable probability to every node in step 6. If pk,j is detected as outliers, then the unreliable probability of sj is set to be qk,j = 1/m; otherwise, qk,j = 0. After m position matrices are evaluated, the unreliable probability for every node in Srep m can be obtained by qj = ∑ qk,j. Thus, the unreliable probabilities for the nodes from Srep have been k=1 determined. To identify the unreliable nodes in Sre f , the detection method is repeated with different reference nodes, which are selected from the set of Srep that have been identified as reliable.

3.3. WLS Based on Reliable Probability and Distance When the unreliable probabilities for all nodes are determined, the WLS method with reliable probability qi and distance rˆi from the initial position to node si, i = 1, ... , N is applied to perform localization and can be formulated as follows:

−1 ˆp = (ATWA) ATWB, (16)

where W = diag(w1/rˆi,..., wN/rˆN). (17) wi = 1 − qi. The procedure for the proposed localization method is given by Algorithm 4.

Algorithm 4. The procedure of the proposed method based on DIG and MD: DIG_MD.

(1) Perform the DIG method to determine Ωre f = {s1, s2,..., sm} (m ≥ 3) and Ωrep = {sm+1, sm+2,..., sN }, and then calculate m position matrices Pk, k = 1, . . . , m. (2) Identify all the outliers in the matrices Pk k = 1, . . . , m based on Algorithm 2. (3) Calculate the unreliable probabilities for nodes in Ωrep. (4) Estimate the source position based on Equation (16).

(5) Reselect the nodes with high low unreliable probability qi < 0.5 from Ωrep with the new initial position obtained from step 5. (6) Repeat steps 1–4 to estimate the source position as the final localization results.

4. Simulations In this section, we compare the performance of the proposed method, DIG_WD, with that of the PLE, the WLS-based distance method denoted as WLS (i.e., the reliable probabilities for all nodes are 1), and the EM-based method [10] through a series of computer simulations. 2 We assume that N nodes are placed uniformly in an L × L m test area with a resolution of ∆x and ∆y along the horizontal and vertical directions, respectively. Each node is equipped with a microphone Sensors 2018, 18, x FOR PEER REVIEW 9 of 17

4. Simulations In this section, we compare the performance of the proposed method, DIG_WD, with that of the PLE, the WLS-based distance method denoted as WLS (i.e., the reliable probabilities for all nodes are 1), and the EM-based method [10] through a series of computer simulations. 2 We assume that N nodes are placed uniformly in an L × L m test area with a resolution of  x and  y along the horizontal and vertical directions, respectively. Each node is equipped with a Sensors 2018, 18, 2186 9 of 17 microphone array to estimate the AOA of the target, and 1000 Monte Carlo simulations are conducted for every case based on the parameters L  250 , xy    50 , and   0.95 . Next, u randomlyarray to estimate selected the nodes AOA are of assumed the target, to andbe subject 1000 Monte to large Carlo noise simulations or interference are conducted, and their standard for every case based on the parameters L = 250, ∆x = ∆y = 50, and α = 0.95. Next, u randomly selected nodes deviation of the estimated bearing error is set to be  2 ; moreover, those of the remaining “reliable” are assumed to be subject to large noise or interference, and their standard deviation of the estimated nodes are set to be 1 , 21. The initial positions for WLS, EM, and DIG_MD are obtained by the bearing error is set to be σ2; moreover, those of the remaining “reliable” nodes are set to be σ , σ2 σ . PLE. 1 1 The initial positions for WLS, EM, and DIG_MD are obtained by the PLE. For comparison purposes, we also apply the detection method, Algorithm 3, to identify the For comparison purposes, we also apply the detection method, Algorithm 3, to identify the outliers from all IPs calculated by every two bearing lines. Instead of calculating the mean of IPs as outliers from all IPs calculated by every two bearing lines. Instead of calculating the mean of IPs as the source position, the WLS estimator based on reliable probabilities and distance is also used to the source position, the WLS estimator based on reliable probabilities and distance is also used to determine the position. When t (tn 1) the IPs can be obtained on the bearing line extending from determine the position. When t (t ≤ n − 1) the IPs can be obtained on the bearing line extending from si , i = 1, 2, …, n; the unreliable probabilities of si is qt if q IPs included in the t points are si, i = 1, 2, ... , n; the unreliable probabilities of si is q/t if q IPs included in the t points are identified ideasntified outliers as from outliers all IPs. from Next, all IPs. the Next, WLS the based WLS on based Equation on Equation (16) is used (16) to is findused the to sourcefind the position; source position;this method this is method defined is as defined the IP_WLS as the method. IP_WLS Furthermore, method. Furthermore, the center of the all center IPs after of excluding all IPs after all excludingdetected outliers all detected is defined outliers as CIP.is defined as CIP.

4.1.4.1. The The RMSEs RMSEs for for D Differentifferent σ11 and andσ 2 2 TheThe localization localization performances performances of of various various approaches approaches are are influenced influenced by by the the standard standard deviation deviation T ofof the estimated estimated bearings bearings error. error. For For the the source source located located at atp =p  [[73.3,62.3]73.3, 62.3 ], the, the RMSEs RMSEs of of different different = ◦ u = m = algorithmsalgorithms with with different different σ1 are are plotted plotted as as shown shown in in Figure Figure4a 4a when whenσ 2  2 1515, , u 66 and and m 4.4 .

(a) (b)

Figure 4. The RMSEs of the pseudolinear estimator (PLE), weighted least squares (WLS), center of Figure 4. The RMSEs of the pseudolinear estimator (PLE), weighted least squares (WLS), center of all all intersected points (CIP), expectation maximization (EM), IP_WLS, and division and greedy intersected points (CIP), expectation maximization (EM), IP_WLS, and division and greedy replacement– replacement–Mahalanobis distance (DIG_MD) methods with (a) different◦ 1 when  2 15 , Mahalanobis distance (DIG_MD) methods with (a) different σ1 when σ2 = 15 ,(b) different σ2 when (b) different◦  2 when 1 2 . ( u  6 , m  4 ). σ1 = 2 .(u = 6, m = 4 ).

It can be seen that the existence of unreliable bearings can severely deteriorate the localization It can be seen that the existence of unreliable bearings can severely deteriorate the localization performance of the PLE, especially when is small. When 1 0.5 ◦, the RMSE of the estimated performance of the PLE, especially when σ1 is small. When σ1 = 0.5 , the RMSE of the estimated errorserrors for for the the PLE PLE is is as as large large as as 4.83 m. Compared Compared with with WLS, WLS, the the CIP CIP method method shows shows lower lower RMSEs RMSEs only when  2 ,◦ and IP_WLS always outperforms WLS for all values of  , because it is easier to only when σ1 < 2 , and IP_WLS always outperforms WLS for all values of1 σ1, because it is easier to detect the outliers when the data are contaminated severely. This phenomenon also illustrates the importance of detecting outliers. From Figure4a, we can also observe that IP_WLS shows better performance than CIP, illustrating the superiority of WLS over simple CIP. The EM exhibits a somewhat similar performance to that of DIG_WD when σ1 is small; however, it shows a greater advantage as σ1 increases. Sensors 2018, 18, x FOR PEER REVIEW 10 of 17

detect the outliers when the data are contaminated severely. This phenomenon also illustrates the importance of detecting outliers. From Figure 4a, we can also observe that IP_WLS shows better performance than CIP, illustrating the superiority of WLS over simple CIP. The EM exhibits a

somewhat similar performance to that of DIG_WD when 1 is small; however, it shows a greater Sensorsadvantage2018, 18 as, 2186 increases. 10 of 17

The simulation is then conducted when 1 2 and  2 ranges from 10° to 20°, considering ◦ ◦ ◦ the factThe that simulation the background is then conducted noise usually when doesσ1 =not2 changeand σ2 greatlyranges during from 10 a shortto 20 period, considering for certain the factapplications. that the background The results noisein Fig usuallyure 4b doesindicate not that change the greatlyDIG_MD during can significantly a short period improve for certain the applications.localization performance The results compared in Figure 4withb indicate the conventional that the DIG_MD PLE and canWLS. significantly CIP can outperform improve WLS the localizationonly when the performance difference comparedbetween withand the  conventional is large, and PLE it always and WLS. has CIPhigher can RMSEs outperform than t WLShose 2 onlyof the when IP_WLS the differencemethod. EM between performsσ1 and slightlyσ2 is large,better and than it alwaysDIG_MD has when higher RMSEs is significantly than those larger of the IP_WLS method. EM performs slightly better than DIG_MD when σ2 is significantly larger than σ1. than . In contrast, the DIG_MD clearly outperforms EM when  2 is less than 16. In contrast, the DIG_MD clearly outperforms EM when σ2 is less than 16. From Figure 4a,b, we can see that both IP_WLS and DIG_MD can improve the localization From Figure4a,b, we can see that both IP_WLS and DIG_MD can improve the localization accuracy compared with the PLE and WLS. However, DIG_MD shows better performance than accuracy compared with the PLE and WLS. However, DIG_MD shows better performance than IP_WL. IP_WL. This is because IP_WLS is based on the outlier detection results of IP. These IPs are This is because IP_WLS is based on the outlier detection results of IP. These IPs are sensitive to the sensitive to the difference of two AOAs. When the source and two nodes are close to located at a difference of two AOAs. When the source and two nodes are close to located at a line, the IP will be line, the IP will be easily identified as outliers, and thus, false alarm probability will be increased. easily identified as outliers, and thus, false alarm probability will be increased. On the other hand, On the other hand, any of the two AOA errors will have an influence on the IP. When one IP is any of the two AOA errors will have an influence on the IP. When one IP is detected as an outlier, detected as an outlier, then two nodes will be allocated unreliable probabilities. As a result, the false then two nodes will be allocated unreliable probabilities. As a result, the false alarm also exists if only alarm also exists if only one of them is reliable, especially when the node is close to the source. All one of them is reliable, especially when the node is close to the source. All these problems can be these problems can be solved by the proposed DIG method. solved by the proposed DIG method.

4.2. The InfluenceInfluence of the NumberNumber ofof ReferenceReference NodesNodes To discuss discuss the the effect effect of ofthe the number number of reference of reference nodes nodes on the on localization the localization performance, performance, the RMSEs the RMSEs for different scale of reference nodes are plotted in Figure 5 when◦  2 and◦  15 . for different scale of reference nodes are plotted in Figure5 when σ1 = 2 and1 σ2 = 15 . 2

(a) (b)

Figure 5. (a) The RMSEs of DIG_MD with the number of reference nodes when different unreliable Figure 5. (a) The RMSEs of DIG_MD with the number of reference nodes when different unreliable nodes are present and N  36 ; (b) the RMSEs of DIG_MD with the number of reference nodes when nodes are present and N = 36;(b) the RMSEs of DIG_MD with the number of reference nodes when different nodes are used and u  6 . different nodes are used and u = 6.

We can see that more reference nodes should be used when the number of unreliable nodes We can see that more reference nodes should be used when the number of unreliable nodes increases, and the number of reference nodes should be no more than [n/6] ( x is the nearest integer increases, and the number of reference nodes should be no more than [n/6] ([x] is the nearest integer to x). Otherwise, the performance of DIG_WD will deteriorate seriously. As illustrated in Section 3, to x). Otherwise, the performance of DIG_WD will deteriorate seriously. As illustrated in Section3, m position sets with (N − m) elements in each set can be obtained using the DIG_MD method. When m position sets with (N − m) elements in each set can be obtained using the DIG_MD method. When the the number of reference nodes increases, the position sets also increase, whereas the number of number of reference nodes increases, the position sets also increase, whereas the number of estimated estimated locations decreases. Only if there are enough positions to be evaluated should more locations decreases. Only if there are enough positions to be evaluated should more position sets be used to increase the reliability of detection. To guarantee enough positions in each set to detect outliers, (N − m) should be significantly greater than m. From the simulation results, it can be seen that the reference node number is preferred to be within the range from three to [N/6]. Sensors 2018, 18, x FOR PEER REVIEW 11 of 17 Sensors 2018, 18, x FOR PEER REVIEW 11 of 17 position sets be used to increase the reliability of detection. To guarantee enough positions in each Sensors 2018, 18, 2186 11 of 17 setposition to detect sets outliers,be used to(N increase − m) should the reliability be significantly of detection. greater To than guarantee m. From enough the simulation positions results,in each itset can to bedetect seen outliers,that the reference(N − m) should node number be significantly is preferred greater to be than within m. theFrom range the fromsimulation three to results, [N/6]. 4.3. Theit can Localization be seen that Performance the reference for node Different number Numbers is preferred of Unreliable to be within Nodes the range from three to [N/6]. 4.3. The Localization Performance for Different Numbers of Unreliable Nodes Figure6 further shows the localization performance for different numbers of unreliable nodes. 4.3. TheFigure Localization 6 further P erformanceshows the forlocalization Different N performanceumbers of Unreliable for different Nodes numbers of unreliable nodes. It can be seen that the RMSEs of all the methods increase as the number of unreliable nodes increases. It canFigure be seen 6 thatfurther the showsRMSEs the of alllocalization the methods performance increase as for the different number numbers of unreliable of unreliable nodes increases. nodes. ComparedComparedIt can be with seen with the that the PLE, the PLE, RMSEs CIP CIP can ofcan all improve improve the methods the the localizationlocalization increase as theperformance performance number of when unrel when iableunreliable unreliable nodes nodes increases. nodes are are present;present;Compared however, however, with it the exhibits it PLE,exhibits CIP slightly slightly can improve higher higher the RMSERMSE localization than PLE PLE performance when when there there when is no is unreliable nooutlier. outlier. EM nodes EMhas theare has the highesthighestpresent; RMSE RMSE however, among among EM,it exhibits EM, WLS, WLS, slightly IP_WLS, IP_WLS, higher and and RMSE DIG_MD;DIG_MD; than however, however,PLE when it itperforms there performs is no better outlier. better than thanEM WLS has WLS when the when the numberthehighest number ofRMSE unreliable of amongunreliable EM, nodes nodes WLS, increases. increases.IP_WLS, Theand The DI IP_WLSIP_WLSG_MD; and andhowever, DIG_MD DIG_MD it performs methods methods better can can inhibit than inhibit WLS the effectwhen the effect of unreliableofthe unreliable number bearing of bearing unreliable measurements measurements nodes increases. for for all all cases. Thecases. IP_WLS The superiority superiority and DIG_MD of of the themethods DIG_MD DIG_MD can method inhibit method over the overeffectother other methodsmethodsof unreliable increases increas bearing ases the as measurementsthe number number of of unreliable unreliablefor all cases.nodes nodes The increases.superiority of the DIG_MD method over other methods increases as the number of unreliable nodes increases.

Figure 6. RMSE vs. the number of unreliable nodes when  2 and◦  15 and◦ m  4 . Figure 6. RMSE vs. the number of unreliable nodes when σ11 = 2 and 2σ2 = 15 and m = 4.

Figure 6. RMSE vs. the number of unreliable nodes when 1 2 and  2 15 and m  4 . To investigate the robustness of the proposed method, the hit percentage of DIG_MD (when the To investigate the robustness of the proposed method, the hit percentage of DIG_MD (when the errorsTo of investigate the evaluated the robustness methods areof the less proposed than WLS method, or the the PLE) hit percentage is shown in of Figure DIG_MD 7. The (when figure the errors of the evaluated methods are less than WLS or the PLE) is shown in Figure7. The figure shows showserrors ofthat the the evaluated CIP method methods has the are lowest less than hit percentages WLS or the compared PLE) is shown with both in Figure the PLE 7. Theand figureWLS. that theEMshows CIPhas that methodhigher the hitCIP has percentages method the lowest has than the hit lowestIP_WLS percentages hit compared percentages compared with compared the with PLE, both withwhile theboth the PLE latterthe andPLE can and WLS. improve WLS. EM has higherlocalizationEM hit has percentages higher accuracy hit percentages than with IP_WLS greater than compared probability IP_WLS withcompared than the EM PLE, with compared whilethe PLE, the with while latter WLS. the can Inlatter improve contrast, can improve localization the hit accuracypercentageslocalization with greater of accuracy DIG_MD probability with retain greater its than superiority probability EM compared compared than EM with with compared WLS.both theIn with PLE contrast, WLS.and WLS. In the contrast, hit percentages the hit of DIG_MDpercentages retain itsof DIG_MD superiority retain compared its superiority with bothcompared the PLE with and both WLS. the PLE and WLS.

(a) (b)

(a) (b)

Figure 7. (a) Hit percentages of the DIG_MD, EM, IP_WLS, and CIP methods compared with the PLE ◦ ◦ when σ1 = 2 and σ2 = 15 , and m = 4;(b) hit percentages of the DIG_MD, EM, IP_WLS, and CIP ◦ ◦ methods compared with WLS when σ1 = 2 and σ2 = 15 , and m = 4. Sensors 2018, 18, x FOR PEER REVIEW 12 of 17 Sensors 2018, 18, x FOR PEER REVIEW 12 of 17

FigureFigure 7. 7. ( a(a) )Hit Hit percentages percentages of of the the DIG_MD, DIG_MD, EM, EM, IP_WLS IP_WLS,, and and CIP CIP methods methods compared compared with with the the PLE PLE σ =° σ =° = when 1 2 and 2 15 , and m 4 ; (b) hit percentages of the DIG_MD, EM, IP_WLS, and CIP Sensors 2018when, 18 , 21861 2 and  2 15 , and m  4 ; (b) hit percentages of the DIG_MD, EM, IP_WLS, and CIP12 of 17 σ =° σ =° = methods compared with WLS when 1 2 and 2 15 , and m 4 . methods compared with WLS when 1 2 and  2 15 , and m  4 .

4.4.4.4.4.4. TheThe The LocalizationLocalization Localization PerformancePerformance Performance forfor for DifferentDifferent Different NumbersNumb Numbersers ofof of NodesNodes Nodes andand and forfor for DifferentDifferent Different SourceSource Source PositionsPositions Positions AsAsAs thethe the numbernumber number ofof of nodesnodes nodes usuallyusually usually hashas has aa a greatgreat great influenceinfl influenceuence onon on thethe the localizationlocalization localization performance,performance, performance, wewe we plotplot plot thethethe relationshiprelationship relationship betweenbetween between RMSERMSE RMSE andand and thethe the numbernumber number ofof of nodesnodes nodes inin in FigureFigure Figure8 .8. 8. For For For fairness, fairness, fairness, the the the number number number of of of unreliableunreliableunreliable nodesnodes nodes isis is NN N/6./6./6. The TheThe number numbernumber of of reference reference nodes nodes is is four.four four.. Figure Figure8 8 8shows shows shows that that that EM EM EM has has has a a highera higher higher RMSERMSERMSE thanthan than IP_WLSIP_WLS IP_WLS andand and DIG_MDDIG_MD DIG_MD methodsmethods methods whenwhen when thethe the numbernumber number ofof of nodesnodes nodes isis is 12.12. 12. However,However, However, thethe the IP_WLSIP_WLS IP_WLS methodmethodmethod shows shows shows worse worse worse performance performance performance than than than EM EM as EM the as as number the the number number of nodes of of increases.nodes nodes increases. increases. The proposed The The proposed proposed method, DIG_MD,method,method, DIG_MD, DIG_MD, always has always always the best has has localization the the best best locali localization accuracyzation foraccuracy accuracy the different for for the the cases.different different cases. cases.

6 PLE WLS 5 CIP

4 EM IP_WLS DIG_MD RMSE (m) 3

2

12 18 24 30 36 Number of nodes

σ =° Figure 8. RMSE of PLE, WLS, CIP, EM, IP_WLS and DIG_MD vs. the number of nodes when 1 2 ◦ , FigureFigure 8. 8.RMSE RMSE of of PLE, PLE, WLS, WLS, CIP, CIP, EM, EM, IP_WLS IP_WLS and and DIG_MD DIG_MD vs. vs. the the number number of of nodes nodes when whenσ 1=1 22 , , σ =° 2 15 ◦. σ2=2 1515 ..

To study the efficiency of the proposed method for different source positions, Figure 9b shows ToTo study study the the efficiency efficiency of of the the proposed proposed method method for for different different source source positions, positions, Figure Figure9b 9 showsb shows the localization performance for five different source positions when six unreliable measurements thethe localization localization performance performance for for five five different different source source positions positions when when six unreliablesix unreliable measurements measurements are are present. It is clear that the proposed method can improve the localization performance present.are present. It is clear It is that clear the proposed that the proposedmethod can method improve can the improve localization the performance localization significantly performance significantly for all source positions. forsignificantly all source positions.for all source positions.

(a(a) ) (b(b) ) Figure 9. (a) Five target positions in a 36-node distributed network; (b) RMSE for different target Figure 9. (a) Five target positions in a 36-node distributed network; (b) RMSE for different target Figurepositions 9. when(a) Five m target= 4 , u positions= 5 , σ =°2 in aand 36-node σ =°15 distributed. network; (b) RMSE for different target m  4 u  5 1 2 2 15 positions when , , 1 ◦ and 2 ◦ . positions when m = 4, u = 5, σ1 = 2 and σ2 = 15 .

Sensors 2018, 18, 2186 13 of 17

5. Outdoor Experiment Results and Analysis In this section, we describe the verification of our proposed method using a 30-node network 2 for acousticSensors source2018, 18, xlocalization. FOR PEER REVIEW All nodes were placed in an 11 × 11 m square field,13 as of shown 17 in Figure 10Sensors. Each 2018, 18 node, x FOR is PEER an REVIEW autonomous vehicle equipped with a four-element cross13 microphone of 17 array, as5. shownOutdoor in Experiment Figure 11 R.esults The and microphone Analysis array is arranged into two orthogonal pairs 20 cm 5. Outdoor Experiment Results and Analysis apart. Each pairIn this of section, microphones we describe estimates the verification an AOA of using our proposed the generalized method using cross a correlation30-node network with phase transformfor (GCC-PHAT) acousticIn this section, source we localization.[24 describe] method. the All verification The nodes final were AOAof placed our isproposed in then an 11 obtained method 11m2 square using by the a field, 30-node fusion as shownnetwork of two in AOAs 2 obtainedforFigure by acoustic the 10. two Each source pairs node localization. of is microphones. an autonomous All nodes Thevehicle were vertical placed equipped distancein an with 11 a× from 11 fourm - elementsquare ground field, cross to microphone as microphone shown in is also array, as shown in Figure 11. The microphone array is arranged into two orthogonal pairs 20 cm 20 cm.Figure During 10. the Each test, node all is nodes an autonomous transmitted vehicle the equi estimatedpped with angles a four-element to the base cross station microphone following a array,apart. as Each shown pair in of Figure microphones 11. The estimatesmicrophone an AOAarray is using arranged the generalized into two orthogonal cross correla pairstion 20 with cm predefinedapart.phase collision-avoidance Eachtransform pair of(GCC microphones-PHAT) communication [24 estimates] method. an The protocol. AOA final using AOA The the is localization thengeneralized obtained tests cross by werethe correlation fusion repeated of with two 40 times. The acousticphaseAOAs sourcetransformobtained was by (GCC-PHAT) the a car two engine pairs [24] of noise microphones.method. generated The finalThe by verticalAOA a loudspeaker is distance then obtained from orientated ground by the tofusion microphone upward. of two Without T loss of generality,AOAsis also 20obtained cm. we During placedby the the two thetest, pairs speaker all nodes of microphones. attransmitted the center The the of verticalestimated the test distance angles field from (i.e.,to the groundx base= [ 5.5m,station to microphone 5.5mfollowing] ). Theisa experimentalsopredefined 20 cm. Duringcollision is conducted the-avoidance test, allin nodes communication an outdoortransmitted environment, protocol.the estimated The localizationangles with noiseto the tests base always werestation present. repeated following However,40 the signal-noise-ratioatimes. predefined The acoustic collision-avoidance (SNR) source for eachwas a node communicationcar engine is different noise protocol. generated as the distancesThe by localization a loudspeaker from sourcetests orientated were to nodesrepeated upward. are 40 different. times.Without The loss acoustic of generality, source wewas placed a car enginethe speaker nois eat generated the center by of athe loudspeaker test field (i.e., orientated x  [5.5mm upward. ,5.5 ] ). The range is from 5 to 15 dB. During the experiment, unreliable AOAs may be introducedΤ by the WithoutThe lossexperiment of generality, is conducted we placed in anthe outdoor speaker environment, at the center ofwith the noisetest field always (i.e., present. x =[5.5 However,mm ,5.5 ] ). following: the signalThe experiment-noise-ratio is (conductedSNR) for each in an node outdoor is different environment, as the with distances noise always from source present. to However, nodes are • Multipaththedifferent. signal-noise-ratio signal:The range because is (SNR)from 5 thefor to 15 distanceeach dB. node During betweenis diffthe erentexperiment, the as microphonethe unreliabledistances from arrayAOAs source may and be theto introducednodes ground are is only different.by the following: The range is from 5 to 15 dB. During the experiment, unreliable AOAs may be introduced 20 cm, unreliable AOAs may be introduced by a multipath signal. by the following: • Interferences:• Multipath the signal: movements because the of distance people between and cars the microphone during the array experiment and the ground are also is only causes of • Multipath20 cm, unreliable signal: AOAsbecause may the be distance introduced between by a thmultipathe microphone signal. array and the ground is only unreliable measurements. • 20Interferences: cm, unreliable the AOAs movements may be ofintroduced people and by a cars multipath during signal. the experiment are also causes of • The• lowInterferences:unreliable SNR: because measurements. the ofmovements the possible of people nonstationary and cars during background, the experiment the SNR are of also the causes received of signal of each• unreliableT nodehe low may SNR:measurements. vary because in a large of the range, possible possibly nonstationary resulting background, in unreliable the SNR measurements. of the received • Tsignalhe low of eachSNR: node because may ofvary the in possible a large range, nonsta possiblytionary resultingbackground, in unreliable the SNR measurements. of the received signal of each node may vary in a large range, possibly resulting in unreliable measurements.

12 5 10 15 20 25 30 10

8 4 9 14 19 24 29

nodes 6 source

4 3 8 13 18 23 28

2 2 7 12 17 22 27

0 1 6 11 16 21 26

024681012 Figure 10. Node placementx (m)for the 35-node network. Figure 10. Node placement for the 35-node network. Figure 10. Node placement for the 35-node network.

Figure 11. The four-element cross-microphone array. Figure 11. The four-element cross-microphone array. Figure 11. The four-element cross-microphone array.

SensorsSensors 2018 2018,,, 18 18,,, 2186xx FORFOR PEERPEER REVIEWREVIEW 1414 ofof 1717

ToTo verifyverify thethe influenceinfluence ofof thethe numbernumber ofof referencereference nodesnodes onon thethe localizationlocalization accuracy,accuracy, wewe plottedplottedTo verifythethe RMSERMSEs thes influence ofof differentdifferent of the numbersnumbers number ofof of referencereferenc referencee nodes,nodes, nodes asas on shownshown the localization inin FigFigureure accuracy, 112.2. WeWe cancan we seesee plotted thatthat thethe RMSEsproposedproposed of different methodmethod numbersDIG_MDDIG_MD of clearlyclearly reference outperformsoutperform nodes, ass shown otherother comparedcompared in Figure 12 methodmethods. We cans when seewhen that thethe the numbernumber proposed ofof methodreferencereference DIG_MD nodesnodes isis clearly fewerfewer outperforms thanthan six.six. AsAs other thethe compared numbernumber exceedsexceeds methods six,six, when thethe the RMSEsRMSEs number ofof DIG_MD ofDIG_MD reference increaseincrease nodes isgradually.gradually. fewer than WhenWhen six. As moremore the numberthanthan ninenine exceeds referencereference six, the nodesnodes RMSEs areare ofused,used, DIG_MD thethe DIG_MDDIG_MD increase method gradually.method yieldsyields When similarsimilar more thanlocalizationlocalization nine reference performanceperformance nodes toto are thethe used, IP_WLS.IP_WLS. the DIG_MD ToTo showshow method thethe lolocalizationcalization yields similar resultsresults localization moremore clearly,clearly, performance wwee furtherfurther to thecomparecompared IP_WLS.d thethe To localizationlocalization show the localization resultsresults ofof results DIG_MDDIG_MD more withwith clearly, thethe we PLEPLE further forfor thethe compared 4400 experimentsexperiments the localization withwith resultsmm=44,, ofasas DIG_MD shownshown inin with FigureFigure the PLE 13.13. TheThe for the resultsresults 40 experiments showshow thatthat withwhilewhilem mostmost= 4, as ofof shown thethe largelarge in Figure errorerror 13 peakspeaks. The ofof results PLEPLE showwerewere thatsubstantiallysubstantially while most degradeddegraded, of the large, therethere error werewere peaks aa fewfew of PLE casescases were inin which substantiallywhich thethe DIG_MDDIG_MD degraded, methodmethod there performed wereperformed a few casesslightlyslightly in whichbetterbetter thanthan the DIG_MD thethe PLEPLE (e.g.,(e.g., method inin thethe performed 1111thth andand slightly3333rdrd runs).runs better). ToTo thaninvestigateinvestigate the PLE thethe (e.g., underlyingunderlying in the 11th reason,reason, and 33rdwewe plottedplotted runs). Tothethe investigate unreliableunreliable the sensorsensor underlying nodenode detectiondetection reason, we resultsresults plotted forfor the thethe unreliable twotwo cacases,ses, sensor asas node shownshown detection inin FigFigureure results 114a,b.4a,b. for FortheFor twocomparisoncomparison cases, as purposes,purposes, shown in thethe Figure estimatedestimated 14a,b. ForAOAAOA comparison valuesvalues andand purposes, thethe detectiondetection the estimated resultsresults forfor AOA thethe values10th10th andand and 32nd32nd the detectionexperimentalexperimental results runruns fors theforfor 10th whichwhich and 32nd thethe proposedexperimentalproposed methodmeth runsod for significantlysignificantly which the proposed improvesimproves method thethe significantly localizationlocalization improvesperformanceperformance the are localizationare plottedplotted inin performance FigFigureure 14c,14c,d,d are, respectivelyrespectively. plotted in Figure. 14c,d, respectively.

0.7

0.6 PLE WLS 0.5 CIP

EM 0.4 IP_WLS DIG_MD 0.3

0.2

0.1 345678910 Number of reference nodes

FigureFigure 12.12. RMSEsRMSEs for for different different numbersnumbers ofof referencereference nodes.nodes.

1.4

PLE 1.2 DIG_MD

1

0.8

0.6

0.4

0.2

0 1 6 11 16 21 26 31 36 experiment times

FigureFigure 13.13. TheThe localization localization error error forfor 3030 repetitionrepetition estimationestimation results.results.

Sensors 2018, 18, 2186 15 of 17 Sensors 2018, 18, x FOR PEER REVIEW 15 of 17

1 1 200 AOA error reliable probability

100 AOA error reliable probability 0.5 100 0.5

0 0 0 0 0102030 0 102030 nodes nodes (a) (b) 1 1

AOA error 50 reliable probability

50

0.5 0.5 AOA error reliable probability reliable probability AOA error (degree) AOA

0 0 0 0 0102030 0 102030 nodes nodes (c) (d)

FigureFigure 14. 14.Estimated Estimated AOA AOAerrors errors andand unreliableunreliable sensor detection detection results. results. (a ()a The) The 11th 11th experiment; experiment; (b) the 33rd experiment; (c) the 10th experiment; (d) the 32nd experiment. (b) the 33rd experiment; (c) the 10th experiment; (d) the 32nd experiment.

As nodes, s11, s 24 , are misjudged as unreliable for the 11th run, the localization error of As nodes, s11, s24, are misjudged as unreliable for the 11th run, the localization error of DIG_MD DIG_MD is only slightly better than that of the PLE, even though the unreliable nodes s6922,,ss can is only slightly better than that of the PLE, even though the unreliable nodes s , s , s can be detected; be detected; a similar situation can also be found in the 33rd experiment. In contrast,6 9 22 the unreliable a similar situation can also be found in the 33rd experiment. In contrast, the unreliable nodes in nodes in the 10th run can be detected correctly. For the 32nd run, the localization error can be the 10th run can be detected correctly. For the 32nd run, the localization error can be significantly significantly decreased while s18 and s28 are detected with a very low false-alarm probability. decreased while s and s are detected with a very low false-alarm probability. To verify the18 localization28 performance under different numbers of nodes in a node network, To verify the localization= performance under different numbers of nodes in a node network, we only use ss1  k , k 20,25,30 to perform localization when four reference nodes are used. we only use s1 ∼ sk, k = 20, 25, 30 to perform localization when four reference nodes are used. NoteNote that that when when different different numbers numbers ofof nodesnodes are used, the the source source location location is isno no longer longer located located at the at the center of all nodes. The simulation results shown in Figure 15 reveal that DIG_MD has the best center of all nodes. The simulation results shown in Figure 15 reveal that DIG_MD has the best localization performance for all the cases considered. localization performance for all the cases considered.

Sensors 2018, 18, 2186 16 of 17 Sensors 2018, 18, x FOR PEER REVIEW 16 of 17

FigFigureure 15. RMSEs for different numbers of nodes.

6. Conclusions 6. Conclusions The localization performance of conventional AOA-based method, the PLE, is prone to be The localization performance of conventional AOA-based method, the PLE, is prone to be deteriorated when unreliable measurements are present. In this paper, we propose an unreliable deteriorated when unreliable measurements are present. In this paper, we propose an unreliable node detection method based on the characteristics of the estimated positions of different node node detection method based on the characteristics of the estimated positions of different node combinations. In the proposed approach, the DIG method is used to acquire different position sets, combinations. In the proposed approach, the DIG method is used to acquire different position sets, and the MD based on robust location and covariance matrix estimator is used to identify the outliers and the MD based on robust location and covariance matrix estimator is used to identify the outliers from the estimated target position sets. The proposed method does not require any prior from the estimated target position sets. The proposed method does not require any prior information information about the target and is easy to implement. Both simulation and outdoor experiment about the target and is easy to implement. Both simulation and outdoor experiment results show that results show that DIG_MD is efficient and robust against the influence of unreliable measurements DIG_MD is efficient and robust against the influence of unreliable measurements and can significantly and can significantly improve the localization accuracy when the measurements are contaminated. improve the localization accuracy when the measurements are contaminated.

Author Contributions: Q.Y.Q.Y. proposed,proposed, programmed, programmed, and and tested tested the the proposed proposed method. method. Q.Y. Q.Y. and and J.C. J.C. designed designed the theexperiment experiment and and analyzed analyzed the experimental the experimental results, results Q.Y., performedQ.Y. performed the experiment, the experiment Q.Y. wrote, Q.Y. thewrote paper, the andpaper, J.C. and J.C. L.D.S. and improved L.D.S. improved the paper. the paper.

FundiFunding:ng: This work is supported by thethe NationalNational Natural Natural Science Science Foundation Foundation ofof China China (No.(No. 61501374) and NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Information (U1609204). NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Information (U1609204). Conflicts of Interest: The authors declare no conflict of interest. Conflicts of Interest: The authors declare no conflict of interest. References References 1. Han, G.; Xu, H.; Duong, T.Q.; Jiang, J.; Hara, T. Localization algorithms of wireless sensor networks: A survey. 1. Han, G.; Xu, H.; Duong, T.Q.; Jiang, J.; Hara, T. Localization algorithms of wireless sensor networks: A Telecommun. Syst. 2013, 52, 2419–2436. [CrossRef] survey. Telecommun. Syst. 2013, 52, 2419–2436. 2. Fresno, J.M.; Robles, G.; Martínez-Tarifa, J.M.; Stewart, B.G. Survey on the performance of source localization 2. Fresno, J.M.; Robles, G.; Martínez-Tarifa, J.M.; Stewart, B.G. Survey on the performance of source algorithms. Sensors 2017, 17, 2666. [CrossRef][PubMed] localization algorithms. Sensors 2017, 17, 2666. 3. Cobos, M.; Antonacci, F.; Alexandridis, A.; Mouchtaris, A.; Lee, B. A survey of sound source localization 3. Cobos, M.; Antonacci, F.; Alexandridis, A.; Mouchtaris, A.; Lee, B. A survey of sound source localization methods in wireless acoustic sensor networks. Wirel. Commun. Mob. Comput. 2017.[CrossRef] methods in wireless acoustic sensor networks. Wirel. Commun. Mob. Comput. 2017, 4. Koks, D. Passive Geolocation for Multiple Receivers with No Initial State Estimate; No. DSTO RR-0222; Defence doi:10.1155/2017/3956282. Science & Technology Organization: Edinburgh, SA, Australia, 2001. 4. Koks, D. Passive Geolocation for Multiple Receivers with No Initial State Estimate; No. DSTO RR-0222; Defence 5. Sayed, A.H.; Tarighat, A.; Khajehnouri, N. Network-Based wireless location: Challenges faced in developing Science & Technology Organization: Edinburgh, SA, Australia, 2001. techniques for accurate wireless location information. IEEE Signal Process. Mag. 2005, 22, 24–40. [CrossRef] 5. Sayed, A.H.; Tarighat, A.; Khajehnouri, N. Network-Based wireless location: Challenges faced in 6. Lee, W.C. Uncertainty in Wireless Sensor Networks; Workshop on AFRL: Wright-Patterson Air Force Base, OH, developing techniques for accurate wireless location information. IEEE Signal Process. Mag. 2005, 22, 24– USA, 2010. 40. 7. Yan, Q.; Chen, J.; Ottoy, G.; Cox, B.; De Strycker, L. An accurate AOA localization method based on unreliable 6. Lee, W.C. Uncertainty in Wireless Sensor Networks; Workshop on AFRL: Wright-Patterson Air Force Base, sensor detection. In Proceedings of the 2018 IEEE Sensors Applications Symposium (SAS), Seoul, Korea, OH, USA, 2010. 12–14 March 2018; pp. 1–6. 7. Yan, Q.; Chen, J.; Ottoy, G.; Cox, B.; De Strycker, L. An accurate AOA localization method based on unreliable sensor detection. In Proceedings of the 2018 IEEE Sensors Applications Symposium (SAS), Seoul, Korea, 12–14 March 2018; pp. 1–6.

Sensors 2018, 18, 2186 17 of 17

8. Yan, Q.; Chen, J.; Ottoy, G.; De Strycker, L. Robust AOA based acoustic source localization method with unreliable measurements. Signal Process. 2018, 152, 13–21. [CrossRef] 9. Yu, K.; Guo, Y.J. Statistical NLOS identification based on AOA, TOA, and signal strength. IEEE Trans. Veh. Technol. 2009, 58, 274–286. [CrossRef] 10. Giménez Febrer, P.J.; Silva Pereira, S.; López Valcarce, R. Distributed AOA-based source positioning in NLOS with sensor networks. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Brisbane, QLD, Australia, 19–24 April 2014; pp. 3197–3201. 11. Griffin, A.; Mouchtaris, A. Localizing multiple audio sources from DOA estimates in a wireless acoustic sensor network. In Proceedings of the 14th IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 20–23 October 2013; pp. 1–4. 12. Do, H.; Silverman, H. A fast microphone array srp-Phat source location implementation using Coarse-To-Fine Region Contraction (CFRC). In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 21–24 October 2007; pp. 295–298. 13. Lee, D.D.; Kashyap, R.L. Robust maximum likelihood bearing estimation in contaminated Gaussian noise. IEEE Trans. Signal Process. 1992, 40, 1983–1986. [CrossRef] 14. Yardimci, Y.; Cetin, A.E.; Cadzow, J.A. Robust direction-of-arrival estimation in non-Gaussian noise. IEEE Trans. Signal Process. 1998, 46, 1443–1451. [CrossRef] 15. Besson, O.; Yuri, A.; Ben, J. Direction-of-arrival estimation in a mixture of K-Distributed and Gaussian noise. Signal Process. 2016, 128, 512–520. [CrossRef] 16. De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D.L. The mahalanobis distance. Chemom. Intell. Lab. Syst. 2000, 50, 1–18. [CrossRef] 17. Bishop, A.N.; Fidan, B.; Anderson, B.D.; Pathirana, P.N.; Dogancay, K. Optimality analysis of sensor node-Target geometries in passive localization: Part 1-Bearing-Only localization. In Proceedings of the 2007 3rd International Conference on Intelligent Sensors Sensor Networks and Information Processing (ISSNIP), Melbourne, VIC, Australia, 3–6 December 2007; pp. 7–12. 18. Gnanadesikan, R.; Kettenring, J.R. Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 1972, 28, 81–124. [CrossRef] 19. Maronna, R.A.; Ruben, H.Z. Robust estimates of location and dispersion for high-dimensional datasets. Technometrics 2002, 44, 307–317. [CrossRef] 20. Rousseeuw, P.J.; Molenberghs, G. Transformation of nonpositive semidefinite correlation matrices. Commun. Stat. Theory Methods 1993, 22, 965–984. [CrossRef] 21. Marcus, M.; Minc, H. A Survey of Matrix Theory and Matrix Inequalities; Courier Corporation: Chelmsford, MA, USA, 1992. 22. Rousseeuw, P.J.; Hubert, M. Robust statistics for outlier detection. In Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery; John Wiley & Sons: Hoboken, NJ, USA, 2011; pp. 73–79. 23. Rousseeuw, P.J.; Driessen, K.V. A fast algorithm for the minimum covariance determinant estimator. Technometrics 1999, 41, 212–223. [CrossRef] 24. Knapp, C.; Carter, G. The generalized correlation method for estimation of time delay. IEEE Trans. Acoust. Speech Signal Process. 1976, 24, 320–327. [CrossRef]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).