Computation of scalar accuracy metrics LE, CE, and SE as both predictive and sample-based statistics

John Dolloff [C] and Jacqueline Carr [C] National Geospatial-Intelligence Agency (NGA) 7500 GEOINT Drive, Springfield, VA Government POC: Christopher O’Neill ([email protected]); PA case #: 16-277

ABSTRACT

Scalar accuracy metrics often used to describe geospatial data quality are Linear Error (LE), Circular Error (CE), and Spherical Error (SE), which correspond to vertical, 2d horizontal, and 3d radial errors, respectively, in a local tangent plane coordinate system (e.g. East-North-Up). They also correspond to specifiable levels of probability; for example, CE_50 is Circular Error probable, or CE at the 50% probability level. Typical probability levels of interest are 50, 90, and 95%. Scalar accuracy metrics are very important both for near-real time accuracy predictions corresponding to derived location information and for post-analysis verification and validation of system performance (location accuracy) requirements. The former normally correspond to predictive scalar accuracy metrics and the latter usually correspond to sample-based scalar accuracy metrics. Predictive metrics are generated from a priori error covariance matrices and mean-values, if non-zero, corresponding to the underlying (up to 3) error components. Sample-based metrics are generated from independent and identically distributed (i.i.d.) samples of error relative to “ground truth”. This paper details the proper but practical computation of both types of scalar accuracy metrics, including corresponding confidence intervals. Techniques with non-trivial computational approximation errors are neither included nor needed. Recommended sample-based metrics utilize order statistics, which require no assumptions regarding the underlying probability distributions of errors.

Keywords: Linear Error (LE), Circular Error (CE), Spherical Error (SE), probability, predictive statistics, sample statistics, order statistics

1.0 Introduction

Scalar accuracy metrics are commonly used to express the accuracy, or corresponding error, in an estimate of geospatial location in a local tangent plane coordinate system (e.g. East-North-Up) in a way that is easy to visualize, comprehend, and communicate. Common scalar accuracy metrics include Linear Error (LE) for describing vertical or elevation data accuracy and Circular Error (CE) for describing two-dimensional horizontal data accuracy. Accuracy of geospatial location in three dimensions is often described either by combining the LE and CE scalar metrics into a representative CE-LE cylinder or by the single metric Spherical Error (SE), which provides a radial representation of accuracy in three dimensions. Representing accuracy by a single number, or even a pair of numbers, has many advantages: it provides a simplified and intuitive representation, a clear means of comparison, and minimal bandwidth requirements. This simplification of sometimes complex information, however, is not without risk. There is neither clear guidance nor broad agreement on best practices for calculating these commonly desired metrics, which results in representations that may not be reliably compared, such as to a desired requirement. Also, common computations of predictive statistics typically rely on excessive approximations. Similarly, computations of sample-based statistics may rely on incorrect assumptions about the error distribution and subsequently reflect system or process performance incorrectly. This paper considers these challenges in scalar accuracy metric representation and provides guidance on fast, rigorous, and accurate methods to robustly calculate these representative values.

2.0 Defining Scalar Accuracy Metrics LE, CE, and SE

To clarify the discussion presented in this paper, it is important to begin with explicit descriptions of the popular scalar accuracy metrics LE, CE, and SE.

• Linear Error (LE) - LE corresponds to the length of a vertical line (segment) such that there is a 90% probability that the absolute value of vertical error resides along the line. If the line is doubled in length and centered at the position solution, there is a 90% probability the true position vertical location resides along the line. LE_XX corresponds to LE at the XX % probability level. See Figure 2.0-1.

• Circular Error (CE) - CE corresponds to the radius of a circle such that there is a 90% probability that the horizontal error resides within the circle, or equivalently, if the circle is centered at the position solution, there is a 90% probability the true position horizontal location resides within the circle. CE_XX corresponds to CE at the XX % probability level. See Figure 2.0-1.

• Spherical Error (SE) - SE corresponds to the radius of a 3D sphere such that there is a 90% probability that 3d error resides within the sphere, or equivalently, if the sphere is centered at the position solution, there is a 90% probability the true position location resides within the sphere. SE_XX corresponds to SE at the XX % probability level.

Figure 2.0-1: Graphic depiction of CE (left) and LE (right) for examples where CE = 4m and LE = 5m

For the above scalar accuracy metrics:

• It is assumed that the underlying x-y-z coordinate system is a local tangent plane system, i.e., x and y are horizontal components and z is the vertical component.

• CE-LE corresponds to the CE-LE error cylinder. There is a probability between 81 and 90 percent that 3d radial error resides within the cylinder. The former value corresponds to uncorrelated horizontal and vertical errors, the latter value to highly correlated horizontal and vertical errors. See Figure 2.0-2.
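The lower value quoted for the CE-LE cylinder can be checked with one line of arithmetic. The sketch below assumes that CE and LE are both taken at the 90% level and that uncorrelated horizontal and vertical errors are also independent (true for Gaussian errors); the symbol $\epsilon_h$ for horizontal radial error is introduced here only for illustration.

```latex
P(\text{error in cylinder})
  = P(\epsilon_h \le CE_{90}) \, P(|\epsilon_z| \le LE_{90})
  = 0.9 \times 0.9 = 0.81,
\qquad \epsilon_h = \sqrt{\epsilon_x^2 + \epsilon_y^2}.
```

At the other extreme, if the two events nearly coincide, the joint probability approaches that of either event alone, i.e. 0.9.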

Figure 2.0-2: Example of CE-LE cylinder compared to corresponding error ellipsoid based on full covariance matrix

References for the definitions of scalar accuracy metrics include [6-7]; an application of scalar accuracy metrics which also discusses error ellipsoids is [5].

Scalar accuracy metrics are an important means of conveying near-real time accuracy predictions for derived location information and of supporting post-analysis verification and validation of system performance (location accuracy) requirements. Accordingly, they are developed from both predictive statistics and sample statistics. Predictive statistics correspond to the mathematical modeling of assumed a priori error characteristics and are part of a statistical error model. Predictive scalar accuracy metrics are generated from a priori error covariance matrices and mean-values, if non-zero, corresponding to the underlying (up to 3) error components. Sample statistics correspond to the analysis of a collection of error samples. Sample-based scalar accuracy metrics are generated from independent and identically distributed (i.i.d.) samples of error relative to “ground truth”.

The following sections of this paper detail the proper but practical computation of both types of scalar accuracy metrics, including corresponding confidence intervals – techniques with non-trivial computational approximation errors are not included nor needed with the availability of today’s computer systems. Sample-based techniques include the recommended use of order statistics which require no assumptions regarding the underlying probability distributions of errors.

3.0 Scalar Accuracy Metrics: Computations based on Predictive Statistics

This section of the document presents the evaluation of the scalar accuracy metrics based on the predictive statistics: (1) $n \times 1$ mean-value $\overline{\epsilon X}$, and (2) $n \times n$ error covariance matrix $C_X$; where $n = 1$ for the computation of LE_XX, $n = 2$ for the computation of CE_XX, and $n = 3$ for the computation of SE_XX. The mean-value is typically zero for predictive statistics, although a non-zero mean-value is also accounted for in the following. A (multi-variate) Gaussian (normal) distribution is also assumed for the underlying random variables corresponding to location error ($\epsilon X = [\epsilon_x \ \epsilon_y \ \epsilon_z]^T$). This is necessary in order to assign probabilities; e.g., CE_90 corresponds to 90% probability. Note that radial error $\epsilon_r$ is defined as follows: $\epsilon_r = |\epsilon_z|$, $\epsilon_r = \sqrt{\epsilon_x^2 + \epsilon_y^2}$, and $\epsilon_r = \sqrt{\epsilon_x^2 + \epsilon_y^2 + \epsilon_z^2}$, for LE, CE, and SE, respectively. (Note that confidence intervals are not computed for predictive-based scalar accuracy metrics as the underlying mean-value and error covariance matrix are assumed a “given”.)

3.1 LE

LE_XX is defined as that line length $L$ such that:

$p = \frac{1}{(2\pi)^{1/2}\sigma_z} \int e^{-\frac{1}{2}\left((z-\bar{z})^2/\sigma_z^2\right)} \, dz$, (3.1-1)

integrated over the region $\sqrt{z^2} \le L$, and where probability $p = XX/100$, 1d error $\epsilon X = \epsilon_z$ is defined as $z$ for notational convenience, with mean-value $\overline{\epsilon X}$ defined as $\bar{z}$, and $1 \times 1$ error covariance matrix $C_X$ about the mean defined as $\sigma_z^2$. Note that if the mean-value is not zero, the length $L$ is still relative to the origin per the standard definition of LE_XX.

If we assume that the mean-value of error is zero, and change variables such that $z/(\sigma_z\sqrt{2}) \to z^*$, Equation 3.1-1 can be rewritten as:

$p = \frac{2\sigma_z\sqrt{2}}{(2\pi)^{1/2}\sigma_z} \int_0^{L^*} e^{-z^{*2}} \, dz^* = \frac{2}{\pi^{1/2}} \int_0^{L^*} e^{-z^{*2}} \, dz^* \equiv \mathrm{erf}(L^*)$, where $L^* = L/(\sigma_z\sqrt{2})$. (3.1-2)

Thus, since erf (Error Function) is a well-tabulated function and its inverse is available via MATLAB and other programming languages, we have by definition $\mathrm{erfinv}(p) = \mathrm{erfinv}(XX/100) = L^*$; thus, accounting for the change of variables:

$LE\_XX = \sigma_z\sqrt{2} \times \mathrm{erfinv}\left(\frac{XX}{100}\right)$, and specifically, (3.1-3)

$LE\_XX = L(p)\,\sigma_z$, where $p = XX/100$ and the multiplier $L(p)$ is listed in Table 3.1-1. (3.1-4)

Table 3.1-1: Linear Error (LE) multiplier L(p) versus probability level p

p      0.5     0.6827  0.90    0.95    0.9545  0.99    0.9973  0.999
L(p)   0.6745  1.0000  1.6449  1.9600  2.0000  2.5758  3.0000  3.2905

The table includes the standard probability levels of interest (the light blue entries in the original table). Thus, for example, L(p)=1.6449 is applicable for LE_90, consistent with the r=0 entry for p=0.9 in Table 3.2-1. The remaining entries (violet in the original table) are others of general interest. For example, p=0.9973 is the “three-sigma” level of probability. If the desired probability level is different from any of the above, simply evaluate Equation 3.1-3 using the desired value for XX. If the mean-value for error is not equal to zero, solve Equation 3.1-1 directly using iteration and numerical integration.
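As an illustration of Equation 3.1-3, the following minimal Python sketch evaluates the LE multiplier for a zero-mean error. It assumes the identity $\sqrt{2}\,\mathrm{erfinv}(p) = \Phi^{-1}((1+p)/2)$, where $\Phi^{-1}$ is the standard normal quantile function, because the Python standard library provides the latter but not erfinv. The function names `le_multiplier` and `le_zero_mean` are hypothetical and are not taken from the paper's MATLAB code.

```python
from statistics import NormalDist


def le_multiplier(p: float) -> float:
    """Multiplier L(p) = sqrt(2) * erfinv(p) for a zero-mean Gaussian error.

    Uses P(|z| <= L * sigma) = 2 * Phi(L) - 1 = p, so L = Phi^{-1}((1 + p) / 2).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return NormalDist().inv_cdf((1.0 + p) / 2.0)


def le_zero_mean(sigma_z: float, xx: float) -> float:
    """LE_XX = L(XX/100) * sigma_z (zero mean assumed)."""
    return le_multiplier(xx / 100.0) * sigma_z


if __name__ == "__main__":
    for p in (0.5, 0.6827, 0.90, 0.95, 0.9545, 0.99, 0.9973, 0.999):
        print(f"p={p:<7} L(p)={le_multiplier(p):.4f}")
```

Any probability level not tabulated can be passed directly, which corresponds to evaluating Equation 3.1-3 for the desired XX.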

Examples are as follows:

(1) Assume a desired probability level of 90%, a mean error of zero, and $C_X \equiv \sigma_z^2 = [9]$ meters-squared: $LE\_90 = L \times \sigma_z = 1.6449 \times 3 = 4.93$ meters.

(2) Assume a desired probability level of 90%, a mean predictive error equal to $\bar{X}^T \equiv \bar{z} = [-2]$, and $C_X \equiv \sigma_z^2 = [9]$ meters-squared. Thus, the integral Equation 3.1-1 is applicable: $LE\_90 = 5.976$ meters.

The solution for the first example was virtually instantaneous, while the solution for the second example took on the order of 0.02 seconds using non-optimized MATLAB code on a notebook computer. The calculation error was negligible for both.
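The iterative solution of Equation 3.1-1 for a non-zero mean can be sketched as follows. This is a minimal sketch under stated assumptions, not the MATLAB code referred to above: bisection is used as the iteration, and the numerical integration is replaced by the Gaussian cumulative distribution function, which evaluates the same integral in closed form. The names `prob_within` and `le_nonzero_mean` are hypothetical.

```python
from statistics import NormalDist


def prob_within(length: float, mean: float, sigma: float) -> float:
    """P(|z| <= length) for z ~ N(mean, sigma^2); the interval is centered at the origin."""
    dist = NormalDist(mean, sigma)
    return dist.cdf(length) - dist.cdf(-length)


def le_nonzero_mean(mean: float, sigma: float, p: float, tol: float = 1e-10) -> float:
    """Solve prob_within(L) = p for L by bisection (assumed iteration scheme)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    lo, hi = 0.0, abs(mean) + 10.0 * sigma  # bracket: the probability is ~1 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_within(mid, mean, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    le90 = le_nonzero_mean(mean=-2.0, sigma=3.0, p=0.90)
    print(f"LE_90 = {le90:.3f} m, check p = {prob_within(le90, -2.0, 3.0):.6f}")
```

Because the probability is monotonically increasing in the length, bisection converges without the need for a derivative or a good starting guess.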

3.2 CE

CE_XX is defined as that circular radius $R$ such that:

$p = \frac{1}{(2\pi)\det(C_X)^{1/2}} \iint e^{-\frac{1}{2}(X-\bar{X})^T C_X^{-1} (X-\bar{X})} \, dx \, dy$, (3.2-1)

integrated over the region $\sqrt{x^2 + y^2} \le R$, and where probability $p = XX/100$, 2d error $\epsilon X^T = [\epsilon_x \ \epsilon_y]$ is defined as $X^T = [x \ y]$ for notational convenience, with mean-value $\overline{\epsilon X}$ defined as $\bar{X}^T = [\bar{x} \ \bar{y}]$, and $2 \times 2$ error covariance matrix about the mean defined as $C_X$. Note that if the mean-value is not zero, the radius $R$ is still relative to the origin $[0 \ 0]^T$, per the standard definition of CE_XX.

Assuming a mean-value of zero and an additional change of variables $(x^*, y^*)$ relative to an eigenvector-aligned system and scaled by the square-roots of the two eigenvalues of the covariance matrix, Equation 3.2-1 can also be written as:

$p = \frac{1}{2\pi} \iint e^{-\frac{1}{2}(x^{*2} + y^{*2})} \, dx^* \, dy^*$ (3.2-2)

integrated over the region $\sqrt{x^{*2} + r^2 y^{*2}} \le R/\sigma_{max}$, where $r = \sigma_{min}/\sigma_{max}$, and $\sigma_{min}$ and $\sigma_{max}$ are the square-roots of the minimum eigenvalue and maximum eigenvalue, respectively, of the covariance matrix $C_X$.

Therefore, the value $R = R(p, r)$, such that the above integral equals the desired level of probability $p$, is related to CE_XX as follows:

$CE\_XX = R(p, r)\,\sigma_{max}$. (3.2-3)

The following table presents the pre-computed values of $R(p, r)$ for various probability levels. In particular, columns 2-6 correspond to p=0.5, 0.9, 0.95, 0.99, and 0.999, respectively, or alternatively, to XX=50, 90, 95, 99, and 99.9 %, respectively.

Table 3.2-1: Circular Error (CE) multiplier $R(p, r)$ versus probability level p and ratio r

r      p=0.5   p=0.9   p=0.95  p=0.99  p=0.999
0.00   0.6745  1.6449  1.9600  2.5758  3.2905
0.05   0.6763  1.6456  1.9606  2.5763  3.2910
0.10   0.6820  1.6479  1.9625  2.5778  3.2921
0.15   0.6916  1.6518  1.9658  2.5803  3.2940
0.20   0.7059  1.6573  1.9704  2.5838  3.2967
0.25   0.7254  1.6646  1.9765  2.5884  3.3003
0.30   0.7499  1.6738  1.9842  2.5942  3.3049
0.35   0.7779  1.6852  1.9937  2.6013  3.3104
0.40   0.8079  1.6992  2.0051  2.6099  3.3172
0.45   0.8389  1.7163  2.0190  2.6203  3.3252
0.50   0.8704  1.7371  2.0359  2.6326  3.3346
0.55   0.9021  1.7621  2.0564  2.6474  3.3459
0.60   0.9337  1.7915  2.0813  2.6653  3.3595
0.65   0.9651  1.8251  2.1111  2.6875  3.3759
0.70   0.9962  1.8625  2.1460  2.7151  3.3965
0.75   1.0271  1.9034  2.1858  2.7492  3.4227
0.80   1.0577  1.9472  2.2303  2.7907  3.4570
0.85   1.0880  1.9936  2.2791  2.8401  3.5018
0.90   1.1181  2.0424  2.3318  2.8974  3.5594
0.95   1.1479  2.0932  2.3881  2.9625  3.6310
1.00   1.1774  2.1460  2.4478  3.0349  3.7169
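The first and last rows of the table can be checked against closed-form results, which is a useful sanity test when transcribing it. The sketch below relies on two standard facts that are not stated in the text above: for $r = 0$ the problem degenerates to the one-dimensional case of Equation 3.1-3, and for $r = 1$ the radial error of a circular zero-mean Gaussian follows a Rayleigh distribution.

```latex
R(p, 0) = \sqrt{2}\,\mathrm{erfinv}(p), \qquad
R(0.5, 0) = 0.6745, \quad R(0.9, 0) = 1.6449;
\\[4pt]
p = 1 - e^{-R^2/2} \;\Rightarrow\; R(p, 1) = \sqrt{-2\ln(1-p)}, \qquad
R(0.5, 1) = 1.1774, \quad R(0.9, 1) = 2.1460, \quad R(0.95, 1) = 2.4478.
```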

CE_XX is computed as $CE\_XX = R^* \sigma_{max}$, where the normalized radius $R^*$ is computed by linear interpolation of $R(XX/100, r)$ within the corresponding column of Table 3.2-1. Computation time is virtually instantaneous and its accuracy relative to the true value of CE_XX is typically on the order of 0.01 % (experimental max of 0.1%). Therefore, if the true value of CE_90 = 2 meters, we expect the computed value to be CE_90 = 2 +/- 0.0002 meters.
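The table look-up described above can be sketched in a few lines of Python. The list `R_P90` is the p=0.9 column of Table 3.2-1; the function names `interp_multiplier` and `ce_from_table` are hypothetical, and the other columns would be handled the same way.

```python
# p = 0.9 column of Table 3.2-1 for r = 0.00, 0.05, ..., 1.00
R_P90 = [1.6449, 1.6456, 1.6479, 1.6518, 1.6573, 1.6646, 1.6738,
         1.6852, 1.6992, 1.7163, 1.7371, 1.7621, 1.7915, 1.8251,
         1.8625, 1.9034, 1.9472, 1.9936, 2.0424, 2.0932, 2.1460]
STEP = 0.05  # spacing of the tabulated ratios r


def interp_multiplier(column, r):
    """Linearly interpolate the tabulated multiplier R(p, r) at ratio r in [0, 1]."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("ratio r must lie in [0, 1]")
    k = min(int(r / STEP), len(column) - 2)  # index of the lower bracketing row
    frac = (r - k * STEP) / STEP             # fractional distance to the next row
    return (1.0 - frac) * column[k] + frac * column[k + 1]


def ce_from_table(sigma_max, sigma_min, column=R_P90):
    """CE_XX = R* x sigma_max, with R* interpolated at r = sigma_min / sigma_max."""
    return interp_multiplier(column, sigma_min / sigma_max) * sigma_max


if __name__ == "__main__":
    print(f"R*(0.9, 0.509) = {interp_multiplier(R_P90, 0.509):.4f}")
```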

If the mean-value for error is not equal to zero, one can solve Equation 3.2-1 directly using iteration and numerical integration. However, this solution may have convergence problems, so the Monte-Carlo method of subsection 3.2.1 is recommended instead. Its evaluation via MATLAB pseudo-code takes on the order of 0.08 seconds on a high-end notebook computer, and its accuracy relative to the true value of CE_XX is typically on the order of 0.05 % (experimental max of 0.6%).

3.2.1 Monte-Carlo Method for Computation of CE

The following approach to the computation of CE_XX is applicable to arbitrary mean-values and arbitrary probability levels, and is computationally accurate and reasonably fast:

(1) Compute 1E6 independent samples of the 2x1 horizontal error: $s_i = \bar{X} + C_X^{1/2} n_i$, $i = 1, \ldots, 1E6$, (3.2.1-1)
where $\bar{X}$ and $C_X$ are the 2x1 mean and the 2x2 error covariance about the mean relative to the original (non-eigenvector aligned) coordinate system, $n_i$ is a two-element vector with each element the realization of an independent Gaussian or normal $N(0,1)$ random variable, and where the superscript “1/2” indicates principal matrix square root. $\bar{X}$ and $n_i$ are 2x1 vectors, and $C_X^{1/2}$ is a 2x2 matrix. Also, $s_i$ is a Gaussian distributed random vector with mean $\bar{X}$ since it is a linear function of the mean-zero random vector $n_i$ added to $\bar{X}$.

(2) Order the magnitudes of the error samples $s_i$ from smallest to largest, and designate $RE_{XX}$ the XX_th percent largest, and $RE^*_{XX}$ the next largest magnitude. (Magnitudes equal $\sqrt{s(1)_i^2 + s(2)_i^2}$.)

(3) $CE\_XX = (RE_{XX} + RE^*_{XX})/2$.

Note that the symmetric $C_X^{1/2}$ is computed once prior to generating the independent samples, and the sample deviations $s_i - \bar{X} = C_X^{1/2} n_i$ are consistent with the error covariance matrix about the mean:

$E\{(s_i - \bar{X})(s_i - \bar{X})^T\} = E\{C_X^{1/2} n_i (C_X^{1/2} n_i)^T\} = C_X^{1/2} E\{n_i n_i^T\} C_X^{1/2} = C_X^{1/2} I_{2 \times 2} C_X^{1/2} = C_X$, where $E\{\,\}$ is the expected value operator.

Note that MATLAB pseudo-code for the above takes advantage of efficiencies such as: (1) evaluating $C_X^{1/2}$ only once using the MATLAB sqrtm function, (2) generating all 1E6 samples of $n_i$ simultaneously using the MATLAB function randn(2,1E6), and (3) ordering the magnitude samples using the MATLAB sort function. (Magnitudes-squared can also be used in step 2 of Equation/Algorithm 3.2.1-1 to avoid taking square-roots, with step 3 modified appropriately, reducing computation time by about 15%.)
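Steps (1)-(3) of Equation/Algorithm 3.2.1-1 can be sketched in standard-library Python as follows. This is an illustrative sketch, not the MATLAB pseudo-code referred to above. It assumes the closed-form principal square root of a symmetric positive-definite 2x2 matrix, $C^{1/2} = (C + \sqrt{\det C}\, I)/\sqrt{\mathrm{tr}\, C + 2\sqrt{\det C}}$, in place of sqrtm, and it defaults to 200,000 samples rather than 1E6 to keep a pure-Python run short. The names `sqrtm_2x2` and `ce_monte_carlo` are hypothetical.

```python
import math
import random


def sqrtm_2x2(c):
    """Principal square root of a symmetric positive-definite 2x2 matrix (closed form)."""
    (a, b), (_, d) = c
    s = math.sqrt(a * d - b * b)    # sqrt(det C)
    t = math.sqrt(a + d + 2.0 * s)  # sqrt(trace C + 2 sqrt(det C))
    return [[(a + s) / t, b / t], [b / t, (d + s) / t]]


def ce_monte_carlo(mean, cov, xx, n=200_000, seed=0):
    """Estimate CE_XX from n samples s_i = mean + C^(1/2) n_i (Algorithm 3.2.1-1)."""
    if not 0.0 < xx < 100.0:
        raise ValueError("xx must lie strictly between 0 and 100")
    rng = random.Random(seed)
    h = sqrtm_2x2(cov)  # computed once, before sampling
    mags = []
    for _ in range(n):
        n1, n2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        sx = mean[0] + h[0][0] * n1 + h[0][1] * n2
        sy = mean[1] + h[1][0] * n1 + h[1][1] * n2
        mags.append(math.hypot(sx, sy))  # radial magnitude relative to the origin
    mags.sort()
    k = min(max(round(xx / 100.0 * n), 1), n - 1)  # 1-based rank of RE_XX
    return 0.5 * (mags[k - 1] + mags[k])           # average with the next largest


if __name__ == "__main__":
    print(ce_monte_carlo([1.0, -3.0], [[4.0, 2.0], [2.0, 3.0]], 90))
```

The fixed seed makes the sketch reproducible; with fewer samples than 1E6, the statistical scatter of the estimate is correspondingly larger than the accuracy figures quoted in the text.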

The following example assumes a mean-value $\bar{X}^T = [10 \ 5]$ meters, and an error covariance matrix about the mean $C_X = \begin{bmatrix} 10^2 & 0.75 \times 10 \times 12 \\ \cdot & 12^2 \end{bmatrix}$ meters-squared, where the dot denotes the symmetric entry. Equation/Algorithm 3.2.1-1 was applied twice: once for CE_50 and once for CE_95.

Figure 3.2.1-1 CE_50 circle (red), CE_95 circle (black), and 10,000 of 1,000,000 random samples

The results are plotted in Figure 3.2.1-1, including the first 10,000 of the 1,000,000 independent samples used in the calculation of CE_50 for context. (The CE_50 circle in the figure was computed using all 1,000,000 independent samples. The CE_95 circle was computed similarly, but used a different set of 1,000,000 independent samples for convenience. Both circles are centered at zero by definition.)

Note that the sample results (blue points) are not centered about zero and are distributed non-symmetrically about it, as is correct, because the mean-value has different non-zero components in the x and y directions. Also note that the actual statistical significance is greater than that implied by the figure, which displays only 1/100th of the samples used in the calculation of CE_50.

3.2.2 CE Evaluation Examples

Examples are as follows:

(1) Assume a desired probability level of 90%, a mean error of zero, and $C_X = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$ meters-squared.
Eigenvalues equal 5.562 and 1.438 meters-squared (per MATLAB pseudo code eig(A), where $A \equiv C_X$).
$\sigma_{eig\_max} = 2.36$ meters, $r = 0.509$;
$R^* = 1.74$ (via linear interpolation: $\frac{0.041}{0.05} \times 1.7371 + \frac{0.009}{0.05} \times 1.7621 = 1.7416$);
$CE\_90 = R^* \sigma_{max} = 4.11$ meters.

(2) As above except with a mean-value $\bar{X}^T = [1 \ -3]$. Thus, since the mean-value is not zero, the Monte-Carlo Matrix Square Root method (Equation/Algorithm 3.2.1-1) is applicable: $CE\_90 = 5.69$ meters.
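The intermediate quantities of example (1) can be reproduced with the short sketch below. It assumes the closed-form eigenvalues of a symmetric 2x2 matrix in place of MATLAB's eig, and interpolates between the two bracketing p=0.9 entries of Table 3.2-1 (r=0.50 and r=0.55). The function name `sigma_max_min_2x2` is hypothetical.

```python
import math


def sigma_max_min_2x2(c):
    """Square roots of the eigenvalues of a symmetric 2x2 covariance matrix."""
    (a, b), (_, d) = c
    half_trace = 0.5 * (a + d)
    disc = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    lam_max, lam_min = half_trace + disc, half_trace - disc
    return math.sqrt(lam_max), math.sqrt(lam_min)


cov = [[4.0, 2.0], [2.0, 3.0]]           # meters-squared, example (1)
s_max, s_min = sigma_max_min_2x2(cov)
r = s_min / s_max                        # ratio used to enter Table 3.2-1

# Linear interpolation between R(0.9, 0.50) = 1.7371 and R(0.9, 0.55) = 1.7621
w = (r - 0.50) / 0.05
r_star = (1.0 - w) * 1.7371 + w * 1.7621
ce90 = r_star * s_max

print(f"eigenvalues = {s_max**2:.3f}, {s_min**2:.3f}; r = {r:.3f}")
print(f"R* = {r_star:.4f}; CE_90 = {ce90:.2f} m")
```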

3.3 SE

The definition and derivations/computations for SE are similar to those described above for CE, but extended from two dimensions to three dimensions.

In particular, SE_XX is defined as that spherical radius $R$ such that:

$p = \frac{1}{(2\pi)^{3/2}\det(C_X)^{1/2}} \iiint e^{-\frac{1}{2}(X-\bar{X})^T C_X^{-1} (X-\bar{X})} \, dx \, dy \, dz$, (3.3-1)

integrated over the region $\sqrt{x^2 + y^2 + z^2} \le R$, and where probability $p = XX/100$, 3d error $\epsilon X^T = [\epsilon_x \ \epsilon_y \ \epsilon_z]$ is defined as $X^T = [x \ y \ z]$ for notational convenience, with mean-value $\overline{\epsilon X}$ defined as $\bar{X}^T = [\bar{x} \ \bar{y} \ \bar{z}]$, and $3 \times 3$ error covariance matrix $C_X$ about the mean. Note that if the mean-value is not zero, the radius $R$ is still relative to the origin $[0 \ 0 \ 0]^T$, per the standard definition of SE_XX.

Corresponding practical evaluation techniques for SE are analogous to those for CE. In particular, assuming a mean-value equal to zero and eigenvalue square roots of $\sigma_{min}$, $\sigma_{mid}$, $\sigma_{max}$:

$SE\_XX = R(p, r_1, r_2)\,\sigma_{max}$, where $r_1 = \sigma_{mid}/\sigma_{max}$ and $r_2 = \sigma_{min}/\sigma_{max}$. (3.3-5)

The following are pre-computed tables of $R(p = 0.5, r_1, r_2)$, $R(p = 0.9, r_1, r_2)$, and $R(p = 0.95, r_1, r_2)$. All table entries are presented, although each table is symmetric. Tables are interpolated using bilinear interpolation.

Table 3.3-1: Spherical Error (SE) multiplier $R(p = 0.5, r_1, r_2)$ versus ratios r1 (columns) and r2 (rows)

r2\r1    0.00   0.05   0.10   0.15   0.20   0.25   0.30   0.35   0.40   0.45   0.50   0.55   0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   1.00
0.00   0.6745 0.6763 0.6820 0.6916 0.7059 0.7254 0.7499 0.7779 0.8079 0.8389 0.8704 0.9021 0.9337 0.9651 0.9962 1.0271 1.0577 1.0880 1.1181 1.1479 1.1774
0.05   0.6763 0.6782 0.6838 0.6934 0.7076 0.7271 0.7516 0.7795 0.8094 0.8404 0.8719 0.9035 0.9350 0.9664 0.9975 1.0283 1.0589 1.0891 1.1192 1.1489 1.1784
0.10   0.6820 0.6838 0.6894 0.6989 0.7130 0.7324 0.7567 0.7844 0.8141 0.8449 0.8762 0.9077 0.9390 0.9703 1.0013 1.0320 1.0625 1.0926 1.1225 1.1522 1.1817
0.15   0.6916 0.6934 0.6989 0.7084 0.7223 0.7414 0.7654 0.7927 0.8221 0.8526 0.8836 0.9147 0.9459 0.9768 1.0077 1.0381 1.0684 1.0984 1.1282 1.1578 1.1870
0.20   0.7059 0.7076 0.7130 0.7223 0.7359 0.7546 0.7781 0.8048 0.8336 0.8636 0.8941 0.9248 0.9556 0.9862 1.0167 1.0469 1.0769 1.1067 1.1362 1.1655 1.1947
0.25   0.7254 0.7271 0.7324 0.7414 0.7546 0.7727 0.7952 0.8211 0.8491 0.8783 0.9081 0.9382 0.9684 0.9986 1.0286 1.0584 1.0881 1.1174 1.1466 1.1756 1.2045
0.30   0.7499 0.7516 0.7567 0.7654 0.7781 0.7952 0.8167 0.8414 0.8684 0.8966 0.9256 0.9549 0.9844 1.0140 1.0434 1.0728 1.1019 1.1309 1.1597 1.1883 1.2168
0.35   0.7779 0.7795 0.7844 0.7927 0.8048 0.8211 0.8414 0.8651 0.8909 0.9181 0.9462 0.9748 1.0035 1.0324 1.0612 1.0899 1.1185 1.1470 1.1753 1.2035 1.2315
0.40   0.8079 0.8094 0.8141 0.8221 0.8336 0.8491 0.8684 0.8909 0.9157 0.9420 0.9692 0.9970 1.0251 1.0533 1.0814 1.1096 1.1376 1.1656 1.1934 1.2211 1.2488
0.45   0.8389 0.8404 0.8449 0.8526 0.8636 0.8783 0.8966 0.9181 0.9420 0.9675 0.9939 1.0210 1.0484 1.0760 1.1036 1.1313 1.1588 1.1863 1.2137 1.2409 1.2681
0.50   0.8704 0.8719 0.8762 0.8836 0.8941 0.9081 0.9256 0.9462 0.9692 0.9939 1.0197 1.0462 1.0730 1.1002 1.1273 1.1545 1.1816 1.2086 1.2356 1.2625 1.2893
0.55   0.9021 0.9035 0.9077 0.9147 0.9248 0.9382 0.9549 0.9748 0.9970 1.0210 1.0462 1.0722 1.0985 1.1251 1.1519 1.1788 1.2055 1.2322 1.2589 1.2854 1.3119
0.60   0.9337 0.9350 0.9390 0.9459 0.9556 0.9684 0.9844 1.0035 1.0251 1.0484 1.0730 1.0985 1.1245 1.1508 1.1772 1.2037 1.2302 1.2567 1.2830 1.3093 1.3355
0.65   0.9651 0.9664 0.9703 0.9768 0.9862 0.9986 1.0140 1.0324 1.0533 1.0760 1.1002 1.1251 1.1508 1.1767 1.2029 1.2291 1.2554 1.2817 1.3078 1.3339 1.3599
0.70   0.9962 0.9975 1.0013 1.0077 1.0167 1.0286 1.0434 1.0612 1.0814 1.1036 1.1273 1.1519 1.1772 1.2029 1.2288 1.2549 1.2810 1.3070 1.3330 1.3590 1.3848
0.75   1.0271 1.0283 1.0320 1.0381 1.0469 1.0584 1.0728 1.0899 1.1096 1.1313 1.1545 1.1788 1.2037 1.2291 1.2549 1.2807 1.3067 1.3325 1.3585 1.3843 1.4101
0.80   1.0577 1.0589 1.0625 1.0684 1.0769 1.0881 1.1019 1.1185 1.1376 1.1588 1.1816 1.2055 1.2302 1.2554 1.2810 1.3067 1.3324 1.3582 1.3840 1.4098 1.4355
0.85   1.0880 1.0891 1.0926 1.0984 1.1067 1.1174 1.1309 1.1470 1.1656 1.1863 1.2086 1.2322 1.2567 1.2817 1.3070 1.3325 1.3582 1.3840 1.4098 1.4356 1.4611
0.90   1.1181 1.1192 1.1225 1.1282 1.1362 1.1466 1.1597 1.1753 1.1934 1.2137 1.2356 1.2589 1.2830 1.3078 1.3330 1.3585 1.3840 1.4098 1.4355 1.4612 1.4869
0.95   1.1479 1.1489 1.1522 1.1578 1.1655 1.1756 1.1883 1.2035 1.2211 1.2409 1.2625 1.2854 1.3093 1.3339 1.3590 1.3843 1.4098 1.4356 1.4612 1.4869 1.5125
1.00   1.1774 1.1784 1.1817 1.1870 1.1947 1.2045 1.2168 1.2315 1.2488 1.2681 1.2893 1.3119 1.3355 1.3599 1.3848 1.4101 1.4355 1.4611 1.4869 1.5125 1.5382

Table 3.3-2: Spherical Error (SE) multiplier $R(p = 0.9, r_1, r_2)$ versus ratios r1 (columns) and r2 (rows)

r2\r1    0.00   0.05   0.10   0.15   0.20   0.25   0.30   0.35   0.40   0.45   0.50   0.55   0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   1.00
0.00   1.6449 1.6456 1.6479 1.6518 1.6573 1.6646 1.6738 1.6852 1.6992 1.7163 1.7371 1.7621 1.7915 1.8251 1.8625 1.9034 1.9472 1.9936 2.0424 2.0932 2.1460
0.05   1.6456 1.6464 1.6487 1.6525 1.6581 1.6654 1.6745 1.6860 1.6999 1.7170 1.7378 1.7628 1.7922 1.8258 1.8632 1.9040 1.9478 1.9942 2.0429 2.0938 2.1466
0.10   1.6479 1.6487 1.6509 1.6548 1.6604 1.6676 1.6769 1.6882 1.7021 1.7192 1.7400 1.7650 1.7944 1.8279 1.8652 1.9060 1.9497 1.9961 2.0448 2.0956 2.1483
0.15   1.6518 1.6525 1.6548 1.6587 1.6642 1.6714 1.6806 1.6920 1.7059 1.7229 1.7436 1.7686 1.7979 1.8314 1.8687 1.9094 1.9530 1.9993 2.0479 2.0987 2.1512
0.20   1.6573 1.6581 1.6604 1.6642 1.6697 1.6769 1.6861 1.6974 1.7113 1.7282 1.7489 1.7738 1.8030 1.8364 1.8735 1.9141 1.9576 2.0039 2.0523 2.1029 2.1555
0.25   1.6646 1.6654 1.6676 1.6714 1.6769 1.6841 1.6932 1.7045 1.7183 1.7352 1.7558 1.7806 1.8097 1.8429 1.8799 1.9204 1.9638 2.0098 2.0581 2.1086 2.1610
0.30   1.6738 1.6745 1.6769 1.6806 1.6861 1.6932 1.7023 1.7135 1.7273 1.7441 1.7646 1.7892 1.8182 1.8513 1.8881 1.9283 1.9715 2.0173 2.0654 2.1156 2.1678
0.35   1.6852 1.6860 1.6882 1.6920 1.6974 1.7045 1.7135 1.7247 1.7383 1.7550 1.7755 1.7999 1.8286 1.8614 1.8981 1.9380 1.9809 2.0265 2.0743 2.1243 2.1762
0.40   1.6992 1.6999 1.7021 1.7059 1.7113 1.7183 1.7273 1.7383 1.7519 1.7685 1.7887 1.8130 1.8414 1.8740 1.9102 1.9498 1.9923 2.0375 2.0850 2.1347 2.1862
0.45   1.7163 1.7170 1.7192 1.7229 1.7282 1.7352 1.7441 1.7550 1.7685 1.7849 1.8049 1.8289 1.8569 1.8890 1.9248 1.9639 2.0060 2.0506 2.0977 2.1469 2.1981
0.50   1.7371 1.7378 1.7400 1.7436 1.7489 1.7558 1.7646 1.7755 1.7887 1.8049 1.8245 1.8481 1.8757 1.9071 1.9422 1.9807 2.0221 2.0663 2.1127 2.1614 2.2120
0.55   1.7621 1.7628 1.7650 1.7686 1.7738 1.7806 1.7892 1.7999 1.8130 1.8289 1.8481 1.8710 1.8979 1.9287 1.9630 2.0007 2.0413 2.0847 2.1304 2.1783 2.2282
0.60   1.7915 1.7922 1.7944 1.7979 1.8030 1.8097 1.8182 1.8286 1.8414 1.8569 1.8757 1.8979 1.9240 1.9539 1.9873 2.0240 2.0637 2.1061 2.1510 2.1980 2.2472
0.65   1.8251 1.8258 1.8279 1.8314 1.8364 1.8429 1.8513 1.8614 1.8740 1.8890 1.9071 1.9287 1.9539 1.9827 2.0151 2.0507 2.0894 2.1308 2.1746 2.2207 2.2689
0.70   1.8625 1.8632 1.8652 1.8687 1.8735 1.8799 1.8881 1.8981 1.9102 1.9248 1.9422 1.9630 1.9873 2.0151 2.0464 2.0809 2.1185 2.1587 2.2015 2.2464 2.2936
0.75   1.9034 1.9040 1.9060 1.9094 1.9141 1.9204 1.9283 1.9380 1.9498 1.9639 1.9807 2.0007 2.0240 2.0507 2.0809 2.1143 2.1506 2.1898 2.2314 2.2753 2.3214
0.80   1.9472 1.9478 1.9497 1.9530 1.9576 1.9638 1.9715 1.9809 1.9923 2.0060 2.0221 2.0413 2.0637 2.0894 2.1185 2.1506 2.1858 2.2237 2.2642 2.3070 2.3520
0.85   1.9936 1.9942 1.9961 1.9993 2.0039 2.0098 2.0173 2.0265 2.0375 2.0506 2.0663 2.0847 2.1061 2.1308 2.1587 2.1898 2.2237 2.2605 2.2998 2.3415 2.3854
0.90   2.0424 2.0429 2.0448 2.0479 2.0523 2.0581 2.0654 2.0743 2.0850 2.0977 2.1127 2.1304 2.1510 2.1746 2.2015 2.2314 2.2642 2.2998 2.3380 2.3786 2.4213
0.95   2.0932 2.0938 2.0956 2.0987 2.1029 2.1086 2.1156 2.1243 2.1347 2.1469 2.1614 2.1783 2.1980 2.2207 2.2464 2.2753 2.3070 2.3415 2.3786 2.4180 2.4597
1.00   2.1460 2.1466 2.1483 2.1512 2.1555 2.1610 2.1678 2.1762 2.1862 2.1981 2.2120 2.2282 2.2472 2.2689 2.2936 2.3214 2.3520 2.3854 2.4213 2.4597 2.5003

Table 3.3-3: Spherical Error (SE) multiplier $R(p = 0.95, r_1, r_2)$ versus ratios r1 (columns) and r2 (rows)

r2\r1    0.00   0.05   0.10   0.15   0.20   0.25   0.30   0.35   0.40   0.45   0.50   0.55   0.60   0.65   0.70   0.75   0.80   0.85   0.90   0.95   1.00
0.00   1.9600 1.9606 1.9625 1.9658 1.9704 1.9765 1.9842 1.9937 2.0051 2.0190 2.0359 2.0564 2.0813 2.1111 2.1460 2.1858 2.2303 2.2791 2.3318 2.3881 2.4478
0.05   1.9606 1.9612 1.9632 1.9664 1.9711 1.9771 1.9848 1.9943 2.0058 2.0197 2.0365 2.0570 2.0819 2.1117 2.1466 2.1864 2.2309 2.2796 2.3324 2.3887 2.4482
0.10   1.9625 1.9632 1.9651 1.9683 1.9729 1.9791 1.9867 1.9962 2.0077 2.0215 2.0383 2.0589 2.0837 2.1135 2.1483 2.1881 2.2325 2.2813 2.3339 2.3902 2.4498
0.15   1.9658 1.9664 1.9683 1.9716 1.9762 1.9823 1.9899 1.9994 2.0108 2.0247 2.0415 2.0620 2.0868 2.1165 2.1513 2.1910 2.2354 2.2841 2.3367 2.3929 2.4524
0.20   1.9704 1.9711 1.9729 1.9762 1.9808 1.9868 1.9945 2.0039 2.0153 2.0292 2.0459 2.0664 2.0912 2.1208 2.1555 2.1952 2.2394 2.2880 2.3406 2.3967 2.4561
0.25   1.9765 1.9771 1.9791 1.9823 1.9868 1.9929 2.0005 2.0099 2.0213 2.0351 2.0518 2.0722 2.0969 2.1265 2.1611 2.2006 2.2448 2.2932 2.3457 2.4016 2.4609
0.30   1.9842 1.9848 1.9867 1.9899 1.9945 2.0005 2.0081 2.0175 2.0288 2.0425 2.0592 2.0795 2.1041 2.1336 2.1682 2.2075 2.2515 2.2998 2.3520 2.4078 2.4669
0.35   1.9937 1.9943 1.9962 1.9994 2.0039 2.0099 2.0175 2.0268 2.0381 2.0518 2.0683 2.0885 2.1131 2.1425 2.1767 2.2160 2.2598 2.3079 2.3598 2.4154 2.4743
0.40   2.0051 2.0058 2.0077 2.0108 2.0153 2.0213 2.0288 2.0381 2.0493 2.0630 2.0795 2.0995 2.1239 2.1531 2.1873 2.2262 2.2697 2.3175 2.3692 2.4246 2.4831
0.45   2.0190 2.0197 2.0215 2.0247 2.0292 2.0351 2.0425 2.0518 2.0630 2.0764 2.0929 2.1129 2.1371 2.1660 2.1999 2.2385 2.2816 2.3291 2.3804 2.4353 2.4935
0.50   2.0359 2.0365 2.0383 2.0415 2.0459 2.0518 2.0592 2.0683 2.0795 2.0929 2.1092 2.1290 2.1529 2.1816 2.2150 2.2532 2.2959 2.3429 2.3936 2.4481 2.5058
0.55   2.0564 2.0570 2.0589 2.0620 2.0664 2.0722 2.0795 2.0885 2.0995 2.1129 2.1290 2.1485 2.1722 2.2004 2.2333 2.2708 2.3129 2.3592 2.4093 2.4631 2.5202
0.60   2.0813 2.0819 2.0837 2.0868 2.0912 2.0969 2.1041 2.1131 2.1239 2.1371 2.1529 2.1722 2.1953 2.2229 2.2551 2.2919 2.3332 2.3786 2.4279 2.4809 2.5371
0.65   2.1111 2.1117 2.1135 2.1165 2.1208 2.1265 2.1336 2.1425 2.1531 2.1660 2.1816 2.2004 2.2229 2.2497 2.2810 2.3168 2.3570 2.4014 2.4497 2.5017 2.5570
0.70   2.1460 2.1466 2.1483 2.1513 2.1555 2.1611 2.1682 2.1767 2.1873 2.1999 2.2150 2.2333 2.2551 2.2810 2.3112 2.3460 2.3850 2.4281 2.4752 2.5259 2.5801
0.75   2.1858 2.1864 2.1881 2.1910 2.1952 2.2006 2.2075 2.2160 2.2262 2.2385 2.2532 2.2708 2.2919 2.3168 2.3460 2.3794 2.4170 2.4589 2.5046 2.5539 2.6067
0.80   2.2303 2.2309 2.2325 2.2354 2.2394 2.2448 2.2515 2.2598 2.2697 2.2816 2.2959 2.3129 2.3332 2.3570 2.3850 2.4170 2.4533 2.4937 2.5379 2.5858 2.6371
0.85   2.2791 2.2796 2.2813 2.2841 2.2880 2.2932 2.2998 2.3079 2.3175 2.3291 2.3429 2.3592 2.3786 2.4014 2.4281 2.4589 2.4937 2.5325 2.5751 2.6214 2.6713
0.90   2.3318 2.3323 2.3339 2.3367 2.3406 2.3457 2.3520 2.3598 2.3692 2.3804 2.3936 2.4093 2.4279 2.4497 2.4752 2.5046 2.5379 2.5751 2.6162 2.6609 2.7091
0.95   2.3881 2.3887 2.3902 2.3929 2.3967 2.4016 2.4078 2.4154 2.4246 2.4353 2.4481 2.4631 2.4809 2.5017 2.5259 2.5539 2.5858 2.6214 2.6609 2.7040 2.7506
1.00   2.4478 2.4482 2.4498 2.4524 2.4561 2.4609 2.4669 2.4743 2.4831 2.4935 2.5058 2.5202 2.5371 2.5570 2.5801 2.6067 2.6371 2.6713 2.7091 2.7506 2.7955

SE_XX is computed as $SE\_XX = R^* \sigma_{max}$, where the normalized radius $R^*$ is computed by bilinear interpolation of $R(XX/100, r_1, r_2)$ from the corresponding rows and columns of the appropriate table. Computation time is virtually instantaneous. Its accuracy relative to the true value of SE_XX is typically on the order of 0.02% (experimental max of 0.15%).

If the mean-value for error is not equal to zero, one can solve Equation 3.3-1 directly using iteration and numerical integration. However, this solution may have convergence problems, so the Monte-Carlo method of subsection 3.3.1 is recommended instead. Its evaluation via MATLAB pseudo-code takes on the order of 0.08 seconds on a high-end notebook computer, and its accuracy relative to the true value of SE_XX is typically on the order of 0.2 % (experimental max of 0.4%).

3.3.1 Monte-Carlo Method for Computation of SE

The algorithm/equation for computation of SE based on the Monte-Carlo method is basically the same as Equation/Algorithm 3.2.1-1 for CE, except that $\bar{X}$ and $C_X$ are the 3x1 mean-value and 3x3 error covariance matrix, respectively, and $n_i$ is a three-element vector with each element the realization of an independent $N(0,1)$ random variable.
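A three-dimensional version of the Monte-Carlo procedure can be sketched as follows. Note the swapped-in technique: instead of the principal matrix square root used in the text, this sketch uses a Cholesky factor $L$ with $L L^T = C_X$, which is simple to write without a linear algebra library and yields samples with the same mean and covariance. The default of 200,000 samples (rather than 1E6) and the names `cholesky` and `se_monte_carlo` are likewise assumptions made for illustration.

```python
import math
import random


def cholesky(c):
    """Lower-triangular L with L L^T = c, for a symmetric positive-definite matrix."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def se_monte_carlo(mean, cov, xx, n=200_000, seed=0):
    """Estimate SE_XX from n samples s_i = mean + L n_i, with radii taken from the origin."""
    if not 0.0 < xx < 100.0:
        raise ValueError("xx must lie strictly between 0 and 100")
    rng = random.Random(seed)
    low = cholesky(cov)  # factor computed once, before sampling
    mags = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(3)]
        s = [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
        mags.append(math.sqrt(s[0] ** 2 + s[1] ** 2 + s[2] ** 2))
    mags.sort()
    k = min(max(round(xx / 100.0 * n), 1), n - 1)  # 1-based rank of the XX-th percent sample
    return 0.5 * (mags[k - 1] + mags[k])


if __name__ == "__main__":
    cov = [[4.0, -5.4, 6.0], [-5.4, 9.0, -9.0], [6.0, -9.0, 25.0]]
    print(se_monte_carlo([1.0, 0.0, -1.0], cov, 90))
```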

3.3.2 SE Evaluation Examples

Examples are as follows:

(1) Assume a desired probability level of 90%, a mean error of zero, and $C_X = \begin{bmatrix} 4 & -5.4 & 6 \\ -5.4 & 9 & -9 \\ 6 & -9 & 25 \end{bmatrix}$ meters-squared.
Eigenvalues equal 31.2, 6.22, and 0.55 meters-squared.
$\sigma_{eig\_max} = 5.59$ meters, $r_1 = 0.446$, $r_2 = 0.132$;
$R^* = 1.72$ (via bilinear interpolation: $\frac{0.018}{0.05}\left(\frac{0.004}{0.05} \times 1.7021 + \frac{0.046}{0.05} \times 1.7192\right) + \frac{0.032}{0.05}\left(\frac{0.004}{0.05} \times 1.7059 + \frac{0.046}{0.05} \times 1.7229\right) = 1.7202$);
$SE\_90 = R^* \sigma_{eig\_max} = 9.61$ meters.

(2) As above except with a mean-value $\bar{X}^T = [1 \ 0 \ -1]$. Thus, since the mean-value is not zero, the Monte-Carlo Matrix Square Root method (subsection 3.3.1) is applicable: $SE\_90 = 9.76$ meters.
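The bilinear interpolation of example (1) can be reproduced with the sketch below, which uses the four entries of Table 3.3-2 that bracket $r_1 = 0.446$ and $r_2 = 0.132$. The function name `bilinear` and its argument order are hypothetical.

```python
def bilinear(r1, r2, r1_lo, r2_lo, step, f00, f01, f10, f11):
    """Bilinear interpolation on one table cell.

    f00 = R at (r2_lo, r1_lo),        f01 = R at (r2_lo, r1_lo + step),
    f10 = R at (r2_lo + step, r1_lo), f11 = R at (r2_lo + step, r1_lo + step).
    """
    u = (r1 - r1_lo) / step               # fractional position along r1 (columns)
    v = (r2 - r2_lo) / step               # fractional position along r2 (rows)
    row_lo = (1.0 - u) * f00 + u * f01    # interpolate within the lower row
    row_hi = (1.0 - u) * f10 + u * f11    # interpolate within the upper row
    return (1.0 - v) * row_lo + v * row_hi


# Entries of Table 3.3-2 (p = 0.9) at r2 = 0.10, 0.15 and r1 = 0.40, 0.45
r_star = bilinear(0.446, 0.132, 0.40, 0.10, 0.05,
                  1.7021, 1.7192, 1.7059, 1.7229)
se90 = r_star * 5.59                      # sigma_eig_max = 5.59 meters

print(f"R* = {r_star:.4f}; SE_90 = {se90:.2f} m")
```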

4.0 Scalar Accuracy Metrics: Computations based on Sample Statistics

Order statistics are recommended for the computation of sample-based scalar accuracy metrics. Order statistics are used to compute not only the scalar accuracy metric values, but also confidence intervals around those values, since confidence in the values is highly dependent on the number of samples used.

Subsection 4.1 describes the relevant portions of order statistics for this application. Subsection 4.2 then specifically addresses the computation of scalar accuracy metrics based on the material of subsection 4.1. It equates a radial error random variable $\epsilon_r$ to the general random variable $x$ of subsection 4.1. The radial error random variable corresponds to different combinations of geolocation error for each scalar accuracy metric: $\epsilon_r = |\epsilon_z|$ if LE, $\epsilon_r = \sqrt{\epsilon_x^2 + \epsilon_y^2}$ if CE, and $\epsilon_r = \sqrt{\epsilon_x^2 + \epsilon_y^2 + \epsilon_z^2}$ if SE.
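The mapping from an error sample to its radial error can be written as a small helper, sketched below. The function name `radial_error` and its `metric` argument are hypothetical and introduced only for illustration.

```python
import math


def radial_error(ex=0.0, ey=0.0, ez=0.0, metric="SE"):
    """Radial error for one error sample in a local tangent plane (x, y horizontal; z vertical)."""
    if metric == "LE":
        return abs(ez)                                # vertical error magnitude
    if metric == "CE":
        return math.hypot(ex, ey)                     # horizontal radial error
    if metric == "SE":
        return math.sqrt(ex * ex + ey * ey + ez * ez) # 3d radial error
    raise ValueError("metric must be 'LE', 'CE', or 'SE'")


if __name__ == "__main__":
    sample = (3.0, 4.0, -12.0)  # illustrative error components, in meters
    for m in ("LE", "CE", "SE"):
        print(m, radial_error(*sample, metric=m))
```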

Order statistics require no assumptions regarding the probability distribution of the corresponding random variable $x$, or the probability distributions of the underlying errors $\epsilon_x$, $\epsilon_y$, and $\epsilon_z$. That is, they apply whether or not the errors are Gaussian distributed, whether or not they have a non-zero mean-value, etc.

4.1 Order Statistics

The $p$ percentile of a random variable $x$ is defined as the smallest number $x_p$ such that $p = prob\{x \le x_p\}$. Thus, the probability distribution function (typically unknown) of the random variable $x$ evaluated at $x_p$ is equal to $p$ [8, pg. 69]. $x_p$ is a deterministic parameter with typically unknown value.

We are interested in estimating $x_p$ given a set of independent and identically distributed (i.i.d.) samples of the random variable $x$ using order statistics. This approach to estimating $x_p$ is sometimes termed the “percentile method”.

Assume that $n$ i.i.d. samples $x_i$ of the random variable $x$ are ordered (increasingly) as $y_i$ (see Figure 4.1-1 below). Samples can be signed, although for our application of interest (subsection 4.2), they are strictly positive.

samples:          5.1  6.2  1.5  3.3  . . .  7.2  2.2
ordered samples:  1.1  1.5  3.3  4.0  . . .  6.9  7.2

Figure 4.1-1: Samples versus corresponding ordered samples

The collection of $y_i$ are the “order statistics” of the random variable $x$ [8, pg. 254]. In general, they too are considered random variables, but when assigned specific values corresponding to the samples $x_i$, they are considered samples as well. The appropriate interpretation of $y_i$ should be clear from context.

At first glance, it seems obvious what the best estimate of $x_p$ should be given the collection of $y_i$. For example, if we had $n = 200$ i.i.d. samples and the percentile $x_{0.90}$ was of interest, we would expect the best estimate of $x_{0.90}$ to be $y_{180}$, give or take an ordered sample or two, since $p \cdot n = 0.90 \cdot 200 = 180$. However, we need to be more precise. Also, this does not address the considerable effects of small $n$ (all too typical), the computation of corresponding confidence intervals, or the underlying theory, all of which are described in the following subsections.

4.1.1 Fundamental Equations

We now present various equations [8, pg. 254] for the probability that $x_p$ is contained in various intervals of the form $(y_k < x_p < y_{k+r})$, where $y_k$ and $y_{k+r}$ are considered random variables. The equations are based on the binomial distribution and, if $n$ is large, on approximation of the binomial distribution by the Gaussian distribution [8, pg. 50]. The formulas are exact (other than the Gaussian approximation for large $n$), which is somewhat amazing when one considers that there is no a priori information regarding the random variable $x$ and its probability distribution function. The equations presented below also assume a continuous-type random variable $x$, i.e., a random variable with a continuous probability distribution function.

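If $n \lesssim 50$, we can use direct binomial evaluation to obtain:

$$prob\{y_k < x_p < y_{k+r}\} = \sum_{m=k}^{k+r-1} \binom{n}{m} p^m (1-p)^{n-m}; \qquad (4.1.1\text{-}1)$$

else, use the Gaussian approximation for large $n$ ($n \gtrsim 50$):

$$prob\{y_k < x_p < y_{k+r}\} \cong G\left(\frac{k+r-0.5-np}{\sqrt{np(1-p)}}\right) - G\left(\frac{k-0.5-np}{\sqrt{np(1-p)}}\right) \qquad (4.1.1\text{-}2)$$

$$= \frac{1}{2}\,\mathrm{erf}\left(\frac{1}{\sqrt{2}}\,\frac{k+r-0.5-np}{\sqrt{np(1-p)}}\right) - \frac{1}{2}\,\mathrm{erf}\left(\frac{1}{\sqrt{2}}\,\frac{k-0.5-np}{\sqrt{np(1-p)}}\right),$$

where the last expression is in terms of MATLAB pseudo-code, with erf the MATLAB erf function.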

Note that for Equation 4.1.1-1, in general, $k$ and $k + r$ are integers between the values 1 and $n$. However, the case $k = 0$ and $r = 1$ is defined as corresponding to $prob\{x_p < y_1\}$ and the case $k = n$ and $r = 1$ is defined as corresponding to $prob\{y_n < x_p\}$.

Note that $G(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$ is the standard normal cumulative distribution function.

(Caution: “$prob$” in the above equations corresponds to the probability that $x_p$, the $p$-percentile of the random variable $x$, is within the specified interval, and does not correspond to the value of “$p$” associated with the $p$-percentile itself.)
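As an illustration of Equations 4.1.1-1 and 4.1.1-2, the following Python sketch evaluates both forms of the interval probability. The function names are illustrative, and Python's `math.erf` stands in for the MATLAB erf function (both use the same definition).

```python
import math

def prob_binomial(n, p, k, r):
    """Equation 4.1.1-1: sum over m = k .. k+r-1 of C(n,m) p^m (1-p)^(n-m)."""
    return sum(math.comb(n, m) * p**m * (1.0 - p)**(n - m)
               for m in range(k, k + r))

def std_normal_cdf(x):
    """G(x), the standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_gaussian(n, p, k, r):
    """Equation 4.1.1-2: Gaussian approximation intended for large n."""
    s = math.sqrt(n * p * (1.0 - p))
    upper = (k + r - 0.5 - n * p) / s
    lower = (k - 0.5 - n * p) / s
    return std_normal_cdf(upper) - std_normal_cdf(lower)

# Small n: probability that the median lies in (y_2, y_4) for n = 5 samples.
small_n = prob_binomial(5, 0.5, 2, 2)

# Special case k = 0, r = 1: prob{x_p < y_1} = (1 - p)^n.
below_first = prob_binomial(5, 0.5, 0, 1)

# Large n: compare the exact and approximate forms for (y_42, y_58), n = 100.
exact_100 = prob_binomial(100, 0.5, 42, 16)
approx_100 = prob_gaussian(100, 0.5, 42, 16)
print(small_n, below_first, round(exact_100, 4), round(approx_100, 4))
```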

Also, as demonstrated in [8, pg. 46], among all single intervals of the form $(y_k, y_{k+1})$, the interval $(y_{k'}, y_{k'+1})$ has the maximum probability of containing the $p$ percentile $x_p$, where:

$k' = floor[(n+1)p]$, i.e., the integer part of $(n+1)p$. (4.1.1-3)

An exception occurs when $(n+1)p$ equals an integer, in which case the adjacent interval $(y_{k'-1}, y_{k'})$ also contains the maximum probability.

4.1.2 Best Estimate of Percentile and Two-Sided Confidence Intervals

The best (maximum likelihood) estimate of the percentile $x_p$ corresponds to the mid-point of the interval that has the maximum probability of containing $x_p$. It is designated $y_{k_{ml}}$, with corresponding ordered sample location designated $k_{ml}$. Based on Equation 4.1.1-3, it is equal to:

$y_{k_{ml}} \equiv 0.5(y_{k'} + y_{k'+1})$, $k_{ml} \equiv k' + 0.5$, if $(n+1)p$ does not equal an integer; else (4.1.2-1)
$y_{k_{ml}} \equiv y_{k'}$, $k_{ml} \equiv k'$.
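Equations 4.1.1-3 and 4.1.2-1 can be sketched in Python as follows. The function name, the tolerance used to decide whether $(n+1)p$ is an integer, and the choice to raise an error when the required ordered sample $y_{k'+1}$ does not exist are assumptions of this sketch, not part of the equations.

```python
import math

def best_percentile_estimate(ordered, p, tol=1e-9):
    """Return (k_ml, y_k_ml) for ordered samples y_1..y_n (stored 0-based)."""
    n = len(ordered)
    t = (n + 1) * p
    k_prime = math.floor(t + tol)              # Equation 4.1.1-3
    if abs(t - round(t)) < tol:
        # (n+1)p is an integer: the estimate is the ordered sample y_k'.
        if not 1 <= k_prime <= n:
            raise ValueError("k' falls outside 1..n")
        return float(k_prime), ordered[k_prime - 1]
    # Otherwise: mid-point of the interval (y_k', y_k'+1).
    if not (1 <= k_prime and k_prime + 1 <= n):
        raise ValueError("too few samples for this percentile")
    mid = 0.5 * (ordered[k_prime - 1] + ordered[k_prime])
    return k_prime + 0.5, mid

# Synthetic ordered samples y_i = i, so the returned value equals k_ml itself.
y30 = [float(i) for i in range(1, 31)]
y24 = [float(i) for i in range(1, 25)]
y9 = [float(i) for i in range(1, 10)]
est_30_p90 = best_percentile_estimate(y30, 0.90)
est_24_p50 = best_percentile_estimate(y24, 0.50)
est_9_p90 = best_percentile_estimate(y9, 0.90)
print(est_30_p90, est_24_p50, est_9_p90)
```

The index values agree with the $k_{ml}$ entries quoted for $n = 30$ ($p = 0.90$) and $n = 24$ ($p = 0.50$) in the worked examples of subsection 4.2.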

The following general algorithm computes the two-sided confidence interval ($conf = \gamma$) for the percentile $x_p$:

Find the integer $k_1$ with the smallest possible corresponding integer $k_2$ that satisfies: (4.1.2-3)

$prob\{y_{k_1} < x_p < y_{k_2}\} \ge \gamma$, where $0 < \gamma < 1$.

The ordered sample indices $k_1$ and $k_2$ define the two-sided confidence interval $(y_{k_1}, y_{k_2})$.
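One possible reading of the search in Equation 4.1.2-3 is sketched below in Python: find the shortest index interval within $(1, n)$ whose binomial probability (Equation 4.1.1-1) is at least $\gamma$. The tie-breaking rule used here (largest probability first, then smallest $k_1$) is an assumption of the sketch; where several intervals tie, published table entries may reflect a different choice.

```python
import math

def coverage(n, p, k1, k2):
    """prob{y_k1 < x_p < y_k2} per Equation 4.1.1-1 (with r = k2 - k1)."""
    return sum(math.comb(n, m) * p**m * (1.0 - p)**(n - m)
               for m in range(k1, k2))

def shortest_confidence_interval(n, p, gamma):
    """Return (k1, k2) with 1 <= k1 < k2 <= n, or None if no interval qualifies."""
    for width in range(1, n):                      # try the shortest widths first
        best = None
        for k1 in range(1, n - width + 1):
            k2 = k1 + width
            c = coverage(n, p, k1, k2)
            # Keep the qualifying interval with the largest probability;
            # on (near-)ties the smaller k1 found first is retained.
            if c >= gamma and (best is None or c > best[0] + 1e-12):
                best = (c, k1, k2)
        if best is not None:
            return best[1], best[2]
    return None

ci_5_50 = shortest_confidence_interval(5, 0.50, 0.50)
ci_5_90 = shortest_confidence_interval(5, 0.50, 0.90)
ci_5_95 = shortest_confidence_interval(5, 0.50, 0.95)
ci_30_50 = shortest_confidence_interval(30, 0.90, 0.50)
print(ci_5_50, ci_5_90, ci_5_95, ci_30_50)
```

A result of `None` corresponds to the situation in which no confidence interval is contained within the spanning interval $(1, n)$ for the requested confidence.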

For large $n$ [8, pg. 255]: (4.1.2-4)

$k_1 = round\_to\_nearest\_integer\left[np - z_{0.5+\gamma/2}\sqrt{np(1-p)}\right]$,

$k_2 = round\_to\_nearest\_integer\left[np + z_{0.5+\gamma/2}\sqrt{np(1-p)}\right]$,

where by definition $\mathrm{erf}(z_{0.5+\gamma/2}) = \gamma/2$, i.e., $G(z_{0.5+\gamma/2}) = 0.5 + \gamma/2$, and similarly in terms of MATLAB pseudo-code $\frac{1}{2}\,matlab\;\mathrm{erf}\left(z_{0.5+\gamma/2}/\sqrt{2}\right) = \gamma/2$. Therefore $z_{0.5+\gamma/2} = \sqrt{2}\;matlab\;\mathrm{erfinv}(\gamma)$ can be used in Equation 4.1.2-4 above. Note that the erf function corresponds to the standard normal (Gaussian) distribution.

Furthermore, for large $n$, the best estimate of $x_p$ is set to $y_{k_{ml}}$, where: $k_{ml} = k_1 + (k_2 - k_1)/2$. (4.1.2-5)
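The large-$n$ formulas (Equations 4.1.2-4 and 4.1.2-5) can be evaluated with the short Python sketch below. It obtains $z_{0.5+\gamma/2}$ from the standard library's inverse normal cumulative distribution function, which is equivalent to $\sqrt{2}\,\mathrm{erfinv}(\gamma)$; rounding half up is an assumption of the sketch, since the rounding rule for exact halves is not specified.

```python
import math
from statistics import NormalDist

def large_n_interval(n, p, gamma):
    """Return (k1, k2, k_ml) per Equations 4.1.2-4 and 4.1.2-5."""
    z = NormalDist().inv_cdf(0.5 + gamma / 2.0)    # z_{0.5 + gamma/2}
    half_width = z * math.sqrt(n * p * (1.0 - p))
    k1 = math.floor(n * p - half_width + 0.5)      # round to nearest integer
    k2 = math.floor(n * p + half_width + 0.5)
    k_ml = k1 + (k2 - k1) / 2.0
    return k1, k2, k_ml

res_100 = large_n_interval(100, 0.50, 0.90)
res_200 = large_n_interval(200, 0.90, 0.95)
res_75 = large_n_interval(75, 0.50, 0.50)
print(res_100, res_200, res_75)
```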

4.1.3 Corresponding Tables

The appropriate equations in subsection 4.1.2 were evaluated as a function of the number of samples $n$, with results presented in Table 4.1.3-1 through Table 4.1.3-3 for desired percentile values of $p = 0.50$, $p = 0.90$, and $p = 0.95$, respectively. Each table includes the ordered sample index $k_{ml}$ corresponding to the best (maximum likelihood) estimate of the percentile $x_p$, and three two-sided confidence intervals corresponding to confidence values $\gamma = 0.50$, $\gamma = 0.90$, and $\gamma = 0.95$. Each of these intervals is designated by corresponding ordered sample indices $k_1$ and $k_2$.

In general, the larger $n$, the better the estimate $y_{k_{ml}}$ and the smaller the relative size of the confidence intervals. Note that for confidence intervals, table entries are only populated for values of $n$ such that the confidence interval is contained within the spanning interval $(1, n)$. Also, for large $n$ with no corresponding table entries, evaluate Equation 4.1.2-5 for $k_{ml}$ and Equation 4.1.2-4 for $(k_1, k_2)$.

As a general example of how to interpret the tables: Table 4.1.3-2 corresponds to $x_{0.90}$ (percentile value 0.90). For $n = 35$, $k_{ml} = 32.5$, and for confidence $\gamma = 0.50$, $(k_1, k_2) = (31, 34)$. Thus, the best estimate of $x_{0.90}$ equals $y_{k_{ml}} = y_{32.5} = 0.5(y_{32} + y_{33})$, and there is a 50% confidence (probability) that the true value of $x_{0.90}$ resides within the interval $(y_{31}, y_{34})$.

Table 4.1.3-1: Order statistics $p = 0.50$ best estimate and two-sided confidence intervals

p = 0.50 (a dash marks an entry that is not populated)

           0.50 conf 0.90 conf 0.95 conf              0.50 conf 0.90 conf 0.95 conf
  n  k_ml    k1  k2    k1  k2    k1  k2      n  k_ml    k1  k2    k1  k2    k1  k2
  5     3     2   4     1   5     -   -     36  18.5    16  21    13  23    12  24
  6   3.5     2   4     1   6     1   6     37    19    17  22    14  24    13  25
  7     4     3   5     2   7     1   7     38  19.5    17  22    14  25    13  26
  8   4.5     3   6     2   7     1   7     39    20    18  23    15  26    14  27
  9     5     4   7     2   8     2   8     40  20.5    18  23    15  26    14  27
 10   5.5     4   7     2   8     2   9     41    21    19  24    16  27    15  28
 11     6     5   8     3   9     3  10     42  21.5    19  24    16  27    15  28
 12   6.5     5   8     3   9     3  10     43    22    20  25    17  28    16  29
 13     7     6   9     4  10     3  11     44  22.5    20  25    17  28    16  29
 14   7.5     6   9     4  11     3  11     45    23    21  26    17  29    16  30
 15     8     7  10     5  12     4  12     46  23.5    21  26    17  29    16  30
 16   8.5     7  10     5  12     4  12     47    24    22  27    18  30    17  31
 17     9     8  11     6  13     5  13     48  24.5    22  27    18  30    17  31
 18   9.5     8  11     6  13     5  14     49    25    23  28    19  31    18  32
 19    10     8  12     6  14     6  15     50    25    23  27    19  31    18  32
 20  10.5     8  12     6  14     6  15     51  25.5    23  28    20  31    19  32
 21    11     9  13     7  15     6  16     52    26    24  28    20  32    19  33
 22  11.5     9  13     7  15     6  16     53  26.5    24  29    21  32    19  34
 23    12    10  14     8  16     7  17     54    27    25  29    21  33    20  34
 24  12.5    10  14     8  17     7  17     55  27.5    25  30    21  34    20  35
 25    13    11  15     9  18     8  18     56    28    25  31    22  34    21  35
 26  13.5    11  15     9  18     8  19     57  28.5    26  31    22  35    21  36
 27    14    12  16    10  19     9  20     58    29    26  32    23  35    22  36
 28  14.5    12  16    10  19     9  20     59  29.5    27  32    23  36    22  37
 29    15    13  17    11  20    10  21     60    30    27  33    24  36    22  38
 30  15.5    13  17    11  20    10  21     75  37.5    35  40    30  45    29  46
 31    16    14  18    11  21    10  22    100    50    47  53    42  58    40  60
 32  16.5    14  18    11  21    10  22    125  62.5    59  66    53  72    52  73
 33    17    15  19    12  22    11  23    150    75    71  79    65  85    63  87
 34  17.5    15  19    12  22    11  23    175  87.5    83  92    77  98    75 100
 35    18    16  20    13  23    12  24    200   100    95 105    88 112    86 114

Table 4.1.3-2: Order statistics $p = 0.90$ best estimate and two-sided confidence intervals

p = 0.90 (a dash marks an entry that is not populated)

           0.50 conf 0.90 conf 0.95 conf              0.50 conf 0.90 conf 0.95 conf
  n  k_ml    k1  k2    k1  k2    k1  k2      n  k_ml    k1  k2    k1  k2    k1  k2
  5     -     -   -     -   -     -   -     36  33.5    32  35    30  36    29  36
  6     -     -   -     -   -     -   -     37  34.5    33  36    31  37    30  37
  7     7     4   7     -   -     -   -     38  35.5    34  37    32  38    30  38
  8     8     6   8     -   -     -   -     39    36    35  38    32  39    31  39
  9     9     7   9     -   -     -   -     40  36.5    35  38    33  40    32  40
 10   9.5     8  10     -   -     -   -     41  37.5    36  39    34  41    33  41
 11  10.5     9  11     -   -     -   -     42  38.5    37  40    35  42    34  42
 12  11.5    10  12     -   -     -   -     43  39.5    38  41    36  43    35  43
 13  12.5    11  13     -   -     -   -     44  40.5    39  42    37  44    36  44
 14  13.5    12  14     -   -     -   -     45  41.5    40  43    38  45    37  45
 15  14.5    13  15     -   -     -   -     46  42.5    41  44    39  46    38  46
 16  15.5    14  16     -   -     -   -     47  43.5    42  45    40  47    39  47
 17  16.5    15  17     -   -     -   -     48  44.5    43  46    40  48    39  48
 18  17.5    16  18     -   -     -   -     49    45    44  47    41  49    40  49
 19    18    17  19     -   -     -   -     50    45    44  46    42  48    41  49
 20  18.5    17  20     -   -     -   -     51  45.5    44  47    42  49    42  50
 21  19.5    18  21     -   -     -   -     52  46.5    45  48    43  50    43  51
 22  20.5    19  22    15  22     -   -     53  47.5    46  49    44  51    43  52
 23  21.5    20  23    17  23     -   -     54  48.5    47  50    45  52    44  53
 24  22.5    21  24    18  24     -   -     55  49.5    48  51    46  53    45  54
 25  23.5    22  25    19  25     -   -     56  50.5    49  52    47  54    46  55
 26  24.5    23  26    20  26     -   -     57  51.5    50  53    48  55    47  56
 27  25.5    24  27    21  27     -   -     58  52.5    51  54    48  56    48  57
 28  26.5    25  28    22  28     -   -     59  53.5    52  55    49  57    49  58
 29    27    26  29    23  29    21  29     60    54    52  56    50  58    49  59
 30  27.5    26  29    24  30    22  30     75  67.5    66  69    63  72    62  73
 31  28.5    27  30    25  31    24  31    100    90    88  92    85  95    84  96
 32  29.5    28  31    26  32    25  32    125 112.5   110 115   107 118   106 119
 33  30.5    29  32    27  33    26  33    150   135   133 137   129 141   128 142
 34  31.5    30  33    28  34    27  34    175 157.5   155 160   151 164   150 165
 35  32.5    31  34    29  35    28  35    200   180   177 183   173 187   172 188

Table 4.1.3-3: Order statistics $p = 0.95$ best estimate and two-sided confidence intervals

p = 0.95 (a dash marks an entry that is not populated)

           0.50 conf 0.90 conf 0.95 conf              0.50 conf 0.90 conf 0.95 conf
  n  k_ml    k1  k2    k1  k2    k1  k2      n  k_ml    k1  k2    k1  k2    k1  k2
  5     -     -   -     -   -     -   -     36  35.5    34  36     -   -     -   -
  6     -     -   -     -   -     -   -     37  36.5    35  37     -   -     -   -
  7     -     -   -     -   -     -   -     38  37.5    36  38     -   -     -   -
  8     -     -   -     -   -     -   -     39    38    37  39     -   -     -   -
  9     -     -   -     -   -     -   -     40  38.5    37  40     -   -     -   -
 10     -     -   -     -   -     -   -     41  39.5    38  41     -   -     -   -
 11     -     -   -     -   -     -   -     42  40.5    39  42     -   -     -   -
 12     -     -   -     -   -     -   -     43  41.5    40  43     -   -     -   -
 13     -     -   -     -   -     -   -     44  42.5    41  44     -   -     -   -
 14    14    11  14     -   -     -   -     45  43.5    42  45    37  45     -   -
 15    15    13  15     -   -     -   -     46  44.5    43  46    39  46     -   -
 16    16    14  16     -   -     -   -     47  45.5    44  47    41  47     -   -
 17    17    15  17     -   -     -   -     48  46.5    45  48    42  48     -   -
 18    18    16  18     -   -     -   -     49  47.5    46  49    43  49     -   -
 19    19    17  19     -   -     -   -     50  47.5    46  49    45  50    44  51
 20  19.5    18  20     -   -     -   -     51    48    47  49    46  51    45  52
 21  20.5    19  21     -   -     -   -     52    49    48  50    47  52    46  52
 22  21.5    20  22     -   -     -   -     53    50    49  51    48  53    47  53
 23  22.5    21  23     -   -     -   -     54    51    50  52    49  54    48  54
 24  23.5    22  24     -   -     -   -     55    52    51  53    50  55    49  55
 25  24.5    23  25     -   -     -   -     56    53    52  54    51  56    50  56
 26  25.5    24  26     -   -     -   -     57    54    53  55    51  57    51  57
 27  26.5    25  27     -   -     -   -     58    55    54  56    52  58    52  58
 28  27.5    26  28     -   -     -   -     59    56    55  57    53  59    53  59
 29  28.5    27  29     -   -     -   -     60    57    56  58    54  60    54  60
 30  29.5    28  30     -   -     -   -     75  71.5    70  73    68  74    68  75
 31  30.5    29  31     -   -     -   -    100    95    94  96    91  99    91  99
 32  31.5    30  32     -   -     -   -    125 118.5   117 120   115 123   114 124
 33  32.5    31  33     -   -     -   -    150 142.5   141 144   138 147   137 148
 34  33.5    32  34     -   -     -   -    175   166   164 168   162 171   161 172
 35  34.5    33  35     -   -     -   -    200   190   188 192   185 195   184 196

4.2 Scalar Accuracy Metric Computations

Order statistics, as described in Section 4.1, are a “natural” for the computation of the sample-based scalar accuracy metrics LE, CE, and SE. For a specified probability level XX (%), e.g., CE_50, XX/100 corresponds to the value of the $p$ percentile. Application of order statistics provides not only a best estimate of the desired scalar accuracy metric, but a confidence interval around it as well. The following presents the appropriate procedure: (4.2-1)

• Define 3d errors as a multi-variate random variable $\epsilon_X = [\epsilon_x \;\; \epsilon_y \;\; \epsilon_z]^T$ relative to “ground truth”, assumed to be in a local tangent plane coordinate system, e.g., East-North-Up. We then equate the appropriate radial error of interest to the general random variable $x$ of subsection 4.1. If we are interested in LE, define the random variable $x \equiv |\epsilon_z|$; if in CE, $x \equiv \sqrt{\epsilon_x^2 + \epsilon_y^2}$; if in SE, $x \equiv \sqrt{\epsilon_x^2 + \epsilon_y^2 + \epsilon_z^2}$.
• Given 3d samples $\epsilon_{X_i} = [\epsilon_{x_i} \;\; \epsilon_{y_i} \;\; \epsilon_{z_i}]^T$, $i = 1, .., n$, generate appropriate scalar samples $x_i$ per the above definitions, and then generate corresponding ordered samples $y_i$.
  o For example, if CE is of interest, $x_1 = \sqrt{\epsilon_{x_1}^2 + \epsilon_{y_1}^2}$, $x_2 = \sqrt{\epsilon_{x_2}^2 + \epsilon_{y_2}^2}$, ..., $x_n = \sqrt{\epsilon_{x_n}^2 + \epsilon_{y_n}^2}$; $y_i$, $i = 1, .., n$, are their ordered counterparts.
• Specify the probability level XX (XX = 50, 90, or 95%) for the desired scalar accuracy metric LE, CE, or SE, and compute the equivalent $p$ percentile as $p = XX/100$. Specify the desired level of confidence $\gamma$ ($\gamma$ = 0.50, 0.90, or 0.95) for the two-sided confidence interval.
• Utilize the appropriate table of subsection 4.1 based on the value of $p$ and the value of $\gamma$ (in conjunction with the ordered samples $y_i$, $i = 1, .., n$) as detailed in subsection 4.1 and with examples presented below. This yields the best estimate of the percentile $x_p$, i.e., the value of either LE_XX, CE_XX, or SE_XX, $p = XX/100$. This also yields the appropriate confidence interval at confidence level $\gamma$.
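The procedure above can be sketched in Python as follows. The function names and the tiny data set are hypothetical and chosen only so the arithmetic is easy to follow; the indices $k_{ml}$, $k_1$, and $k_2$ are assumed to be supplied by the caller from the appropriate table of subsection 4.1.3.

```python
import math

def radial_errors(samples, metric):
    """Scalar samples x_i from 3d error samples (ex, ey, ez)."""
    if metric == "LE":
        return [abs(ez) for _, _, ez in samples]
    if metric == "CE":
        return [math.hypot(ex, ey) for ex, ey, _ in samples]
    if metric == "SE":
        return [math.sqrt(ex * ex + ey * ey + ez * ez) for ex, ey, ez in samples]
    raise ValueError("metric must be 'LE', 'CE', or 'SE'")

def order_statistic(ordered, k):
    """y_k for a 1-based index k; a half-integer k averages its two neighbours."""
    lo = math.floor(k)
    if lo == k:
        return ordered[lo - 1]
    return 0.5 * (ordered[lo - 1] + ordered[lo])

def scalar_metric(samples, metric, k_ml, k1, k2):
    """Best estimate and two-sided confidence interval from table indices."""
    y = sorted(radial_errors(samples, metric))
    return order_statistic(y, k_ml), (order_statistic(y, k1), order_statistic(y, k2))

# Hypothetical East-North-Up error samples in meters (n = 5).
errors = [(3.0, 4.0, 1.0), (0.0, 1.0, -2.0), (6.0, 8.0, 0.0),
          (1.0, 0.0, 5.0), (0.0, 2.0, -3.0)]

# For n = 5, p = 0.50, gamma = 0.50, Table 4.1.3-1 lists k_ml = 3, (k1, k2) = (2, 4).
ce50, ce50_interval = scalar_metric(errors, "CE", 3, 2, 4)
le50, le50_interval = scalar_metric(errors, "LE", 3, 2, 4)
print(ce50, ce50_interval, le50, le50_interval)
```

Five samples are far too few for a useful assessment; the sketch only illustrates the order of operations.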

Examples: The following examples utilize simulated i.i.d. samples of geolocation error, based on specified (but unknown to the scalar accuracy metric computation algorithm) probability distributions for these errors. Therefore, the true answer for the scalar accuracy metric of interest is known as well for comparison. For CE and Gaussian distributed geolocation errors (example 2), the Monte-Carlo method for the predictive statistic CE (subsection 3.2.1) was used to compute the true answer. For LE and uniformly distributed geolocation errors (example 1), the method of subsection 3.2.1 was modified in a straightforward manner to compute LE instead of CE and to use simulated uniform errors instead of Gaussian errors, in order to compute the true answer. (Direct analysis of the specified uniform distribution could also have been used instead to compute the true LE.)

(1) LE_90 is of interest; a priori uniform (1,4) probability distribution for the random variable $x \equiv |\epsilon_z|$ (hence, a mean-value of 2.5 meters); $n = 30$ and $p = 0.90$; 30 independent samples $x_i$ generated via simulation consistent with the a priori probability distribution, subsequently ordered to yield $y_i$, $i = 1, .., 30$:

i   :    1     2   ..    22    23    24    25    26    27    28    29    30
x_i : 2.10  1.77   ..  2.46  2.49  1.14  2.02  3.81  2.81  2.48  3.65  1.88
y_i : 1.03  1.14   ..  3.04  3.15  3.20  3.48  3.52  3.65  3.80  3.81  3.86

Because the $p = 0.90$ percentile is of interest, we use Table 4.1.3-2, which states that $k_{ml} = 27.5$ with corresponding value $y_{k_{ml}} = y_{27.5} = 0.5(y_{27} + y_{28})$. Thus, the best estimate of $x_{0.90}$ is $y_{k_{ml}} = 0.5(3.65 + 3.80) = 3.73$; or equivalently, the best estimate of LE_90 equals 3.73 meters.

50% confidence interval for LE_90 is $(y_{k_1}, y_{k_2}) = (y_{26}, y_{29})$ or (3.52, 3.81);
90% confidence interval for LE_90 is $(y_{k_1}, y_{k_2}) = (y_{24}, y_{30})$ or (3.20, 3.86);
95% confidence interval for LE_90 is $(y_{k_1}, y_{k_2}) = (y_{22}, y_{30})$ or (3.04, 3.86).

It is known that $x_{0.90\,true} = LE\_90\_true = 3.70$ meters, since the probability distribution was specified for simulation of the samples.
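The numbers of example (1) can be cross-checked with a few lines of Python. The true value follows from direct analysis of the uniform distribution on $(a, b)$, whose $p$ percentile is $a + p(b - a)$; the dictionary below holds only the ordered samples that are listed in the example, and the variable names are illustrative.

```python
def uniform_percentile(a, b, p):
    """p percentile of a uniform distribution on (a, b)."""
    return a + p * (b - a)

# Ordered samples y_22 .. y_30 as listed in example (1), in meters.
y = {22: 3.04, 23: 3.15, 24: 3.20, 25: 3.48, 26: 3.52,
     27: 3.65, 28: 3.80, 29: 3.81, 30: 3.86}

le90_estimate = 0.5 * (y[27] + y[28])           # k_ml = 27.5
le90_true = uniform_percentile(1.0, 4.0, 0.90)  # direct analysis

# Confidence intervals from the example, keyed by confidence level gamma.
intervals = {0.50: (y[26], y[29]), 0.90: (y[24], y[30]), 0.95: (y[22], y[30])}
contains_truth = {g: lo < le90_true < hi for g, (lo, hi) in intervals.items()}

print(le90_estimate, le90_true, contains_truth)
```

In this particular realization all three intervals contain the true value; a different set of 30 samples could behave differently.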

(2) CE_50 is of interest; a priori Gaussian probability distribution of the 2d random variable $\epsilon_X = [\epsilon_x \;\; \epsilon_y]^T$ with mean-value $[1 \;\; 0]^T$ and covariance matrix $\begin{bmatrix} 4 & 3 \\ 3 & 5 \end{bmatrix}$; $n = 24$ and $p = 0.50$; 24 independent samples of $\epsilon_{X_i}$ generated via simulation consistent with the a priori probability distribution function; corresponding 24 independent samples of (horizontal error) $x_i = \sqrt{\epsilon_{x_i}^2 + \epsilon_{y_i}^2}$ generated and subsequently ordered to yield $y_i$, $i = 1, .., 24$:

i   : ..     7     8     9    10    11    12    13    14    15    16    17  ..    24
x_i : ..  1.68  1.77  2.99  1.86  1.21  3.80  0.87  6.38  4.57  3.62  1.49  ..  1.28
y_i : ..  1.49  1.68  1.77  1.86  1.87  2.04  2.23  2.25  2.76  2.77  2.99  ..  6.38

Because the $p = 0.50$ percentile is of interest, we use Table 4.1.3-1, which states that $k_{ml} = 12.5$ with corresponding value $y_{k_{ml}} = y_{12.5} = 0.5(y_{12} + y_{13})$. Thus, the best estimate of CE_50 equals $0.5(y_{12} + y_{13}) = 0.5(2.04 + 2.23) = 2.14$ meters.

Corresponding two-sided confidence intervals are:

$(y_{k_1}, y_{k_2}) = (y_{10}, y_{14})$ or (1.86, 2.25), for 50% confidence;
$(y_{k_1}, y_{k_2}) = (y_{8}, y_{17})$ or (1.68, 2.99), for 90% confidence;
$(y_{k_1}, y_{k_2}) = (y_{7}, y_{17})$ or (1.49, 2.99), for 95% confidence.

Thus, for example, there is a 90% probability that the true value of CE_50 is within the interval (1.68, 2.99) meters.

It is known that $x_{0.50\,true} = CE\_50\_true = 2.47$ meters, since the probability distribution was specified for simulation of the samples. Thus, the 50% two-sided confidence interval “fails” in this example, i.e., the true value is not within it. This is not unexpected, since we expect the interval to fail approximately 50% of the time.
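How often such an interval contains the true percentile can be explored with a small simulation. The Python sketch below uses the $n = 24$, $p = 0.50$, 50% confidence indices $(k_1, k_2) = (10, 14)$ from example (2), but draws samples from a unit-rate exponential distribution; that distribution, the trial count, and the seed are arbitrary assumptions made to illustrate that the result does not depend on the error distribution. Because the interval endpoints are restricted to ordered samples, the achieved coverage given by Equation 4.1.1-1 is at least $\gamma$ and can be somewhat larger.

```python
import math
import random

def exact_coverage(n, p, k1, k2):
    """prob{y_k1 < x_p < y_k2} per Equation 4.1.1-1."""
    return sum(math.comb(n, m) * p**m * (1.0 - p)**(n - m)
               for m in range(k1, k2))

def simulated_coverage(n, k1, k2, trials=20000, seed=1):
    """Fraction of simulated sample sets whose interval contains the true median."""
    rng = random.Random(seed)
    true_median = math.log(2.0)   # median of the unit-rate exponential distribution
    hits = 0
    for _ in range(trials):
        y = sorted(rng.expovariate(1.0) for _ in range(n))
        if y[k1 - 1] < true_median < y[k2 - 1]:
            hits += 1
    return hits / trials

exact = exact_coverage(24, 0.50, 10, 14)
simulated = simulated_coverage(24, 10, 14)
print(round(exact, 4), round(simulated, 4))
```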

Other references

In terms of order statistics, [4] is another good reference; applications of order statistics to scalar accuracy metrics are also discussed in [1-3].

5.0 Summary and Conclusions

This paper has presented methodologies for proper but practical computation of LE, CE, and SE based on both predictive and sample statistics, including corresponding confidence intervals, which address known risks in the application and use of scalar accuracy metrics. The calculations presented are fast, rigorous, and accurate methods to describe and evaluate data. These methods meet a broad community need and support the contemporary necessity to be practical and efficient. To effectively utilize scalar accuracy metrics to characterize accuracy and error of geolocation data, such practices must be broadly adopted by both geospatial data producers and users. This paper is intended to support the necessary research and documentation that will provide needed references and standardize the community approach to describing and evaluating the accuracy of geospatial data.

References

[1] Ager, T., “An Analysis of Metric Accuracy Definitions and Methods of Computation”, NGA white paper, 2004.
[2] Bresnahan, P.C. and Jamison, T.A., “A Monte Carlo Simulation of the Impact of Sample Size and Percentile Method Implementation on Imagery Geolocation Accuracy Assessments”, ASPRS Annual Conference, Tampa, FL, 7-11 May, 2007.
[3] Bresnahan, P., Brown, E., HenryVazquez, L., “WorldView-3 Absolute Geolocation Accuracy Evaluation”, JACIE Workshop, 5-7 May, 2015.
[4] Conover, W., Practical Nonparametric Statistics, 3rd Edition, John Wiley and Sons, Inc., New York, 1999.
[5] Dolloff, J., and Theiss, H., “The Specification and Validation of Predictive Accuracy Capabilities for Commercial Satellite Imagery”, Proceedings of ASPRS Annual Conference, 2014.
[6] Greenwell, C.R., and M.E. Schultz, “Principles of Error Theory and Cartographic Applications”, ACIC Technical Report #96, Feb. 1962.
[7] MIL-STD-6000001, Department of Defense Standard Practice; Mapping, Charting and Geodesy Accuracy, February 26, 1990.
[8] Papoulis, A., Probability, Random Variables, and Stochastic Processes, 3rd Edition, McGraw-Hill, 1991.
