Exploring alternative wavelet base selection techniques with application to high resolution radar classification

Donald E. Waagen, ATR Design Center, Raytheon Company, Tucson, AZ, U.S.A., [email protected]
Mary L. Cassabaum, ATR Design Center, Raytheon Company, Tucson, AZ, U.S.A., [email protected]

Clayton Scott, Electrical and Computer Engineering, Rice University, Houston, TX, U.S.A., [email protected]
Harry A. Schmitt, Cognitive Systems, Raytheon Company, Tucson, AZ, U.S.A., [email protected]

Abstract – Coifman, Wickerhauser, and Saito developed the concepts of the wavelet 'best-basis' algorithm and the local discriminant bases (LDB) algorithm for signal and image characterization and classification. LDB was originally based on the differentiation of class-specific time-frequency energy distributions, using an L1 norm for multi-class feature weighting. Recent extensions to the approach estimate the Kullback-Leibler distance between empirically estimated class-conditional probability densities. This paper offers a complementary approach for wavelet base/feature selection via the Kolmogorov-Smirnov test, together with algorithmic extensions for wavelet feature selection in a multi-class environment. Alternative feature score normalizations are investigated. Additionally, this research develops a dynamic re-weighting scheme for feature selection. The goal of the dynamic feature re-weighting process is similar in spirit to 'boosting', as it attempts to bias the base selection process toward features which offer more discrimination between currently 'costly' or 'tough' classes of interest. We investigate the efficacy of the algorithms in a multi-class discrimination setting using simulated high-resolution multi-polarimetric millimeter-wave real-beam radar signatures of ground vehicles.

Keywords: Automatic target recognition, wavelet packets, Kolmogorov-Smirnov test, feature selection, millimeter-wave radar.

1 Introduction

Let X \in R^n be a random vector (signal or image) with an associated class label Y \in Z. Given a set of training pairs {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)} from multiple target classes, the goal in classifier design is to determine a partitioning F : X -> Y which correctly maps input vectors to their associated labels and minimizes the classification error in the performance environment.

For signals and images that are localized in extent (i.e., non-periodic), decomposition of the signal onto an orthonormal library of time-frequency bases that are spatiotemporally localized (for instance, wavelets) has demonstrable benefits [1]. Besides spanning the original space, wavelet orthonormal bases tend to realign and concentrate signal energy, thereby allowing signal characterization and representation in a reduced feature space [2]. A novel approach to orthonormal base selection in a classification setting was developed by Coifman and Saito [3][4], which they named 'local discriminant bases' or LDB. An overview of the original LDB technique and recent modifications is given in the next section.

In this paper, we introduce an alternative fitness function (a standard statistical test) to measure the distributional differences between class-conditional probability densities, along with the use of alternative norms for determining base (feature) efficacy in a multi-class setting. Additionally, a dynamic approach to re-weighting fitness scores, conditioned on the efficacy of previously selected bases, is developed and demonstrated on a multi-class radar discrimination problem.

2 Local Discriminant Bases

Given a pre-selected library of orthonormal bases (wavelet packets, local cosines, etc.), the original LDB [4] consists of the following steps:

1: Given a training set of m classes, expand the training signals x_i^{(j)} of length n onto the library of redundant orthonormal bases w_{s,f,t} (indexed by scale, frequency, and time/location). Compute the time-frequency energy map of class j, denoted \Gamma_j(s,f,t) and specified by

    \Gamma_j(s,f,t) = \frac{\sum_{i=1}^{N_j} \langle w_{s,f,t},\, x_i^{(j)} \rangle^2}{\sum_{i=1}^{N_j} \| x_i^{(j)} \|^2}    (1)

where N_j is the number of training samples of class j.

2: Compute an overall discriminant score \Delta_{s,f} for each subspace, as follows:

    \Delta_{s,f} = \sum_{t=1}^{2^{-s} n} D(\Gamma_1(s,f,t), \ldots, \Gamma_m(s,f,t))    (2)

with f = 1, \ldots, 2^s and s = S, \ldots, 1, where S is the maximum scale (depth) of the library tree, and the composite score is a summation of the pairwise comparison scores,

    D(\{p^{(l)}\}_{l=1}^{m}) = \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} D(p^{(i)}, p^{(j)})    (3)

Here D(p^{(i)}, p^{(j)}) is a discriminant function specified on the energy maps. Typical definitions for D include relative entropy,

    D(p, q) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i}    (4)

or Fisher's measure of class separation. See [4] for details.

3: Determine the best basis (spanning the space) by comparing each \Delta_{s,k} with \Delta_{s+1,2k} + \Delta_{s+1,2k+1} (the scores of \Delta_{s,k}'s subspace tree descendants). Starting at the highest scale (the deepest tree nodes) and moving to the lowest (the tree root), select the subspace representatives (the best basis) with the maximum overall score.

4: Order the basis functions w_{s,f,t} of the selected 'best basis' by their classification efficacy. Select the best k <= n bases as features to provide to a classifier.

Unfortunately, the use of the 'energy map' in step 1 replaces the distributional characteristics of the training set with an estimate of the overall average power projected by the training data onto the bases. Energy map comparisons between classes therefore amount to comparisons of differences in average power (a comparison between means). Higher moments are unavailable for exploitation, and furthermore the mean can be susceptible to training outliers.

A preferred approach will exploit the distributional characteristics of the individual sample projections, i.e., \langle w_{s,f,t}, x_i^{(j)} \rangle / \| x_i^{(j)} \|. Indeed, recent work by Saito [5] has abandoned the energy map, and computes the relative efficacy of bases for classification via empirical average shifted histogram (ASH) [6] density estimates and computation of the Kullback-Leibler [7] divergence measure:

    D(p, q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx    (5)
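Before moving on, a minimal sketch may make steps 1-2 of the original LDB concrete. The code below computes the per-class energy maps of equation (1) at the deepest level of a wavelet packet tree and scores a subspace collection with the relative entropy of equations (3)-(4). It is our illustrative sketch, not code from the LDB papers: the PyWavelets library and all variable names are assumptions, and the bottom-up best-basis pruning of step 3 is omitted for brevity.

    import numpy as np
    import pywt  # PyWavelets, assumed available

    def energy_maps(signals, labels, wavelet="db4", maxlevel=3):
        """Per-class normalized energy maps (eq. 1), deepest tree level only."""
        maps = {}
        for c in np.unique(labels):
            class_sigs = [x for x, y in zip(signals, labels) if y == c]
            total = sum(np.sum(x ** 2) for x in class_sigs)  # denominator of eq. 1
            energy = 0.0
            for x in class_sigs:
                wp = pywt.WaveletPacket(data=x, wavelet=wavelet, maxlevel=maxlevel)
                coeffs = np.concatenate(
                    [node.data for node in wp.get_level(maxlevel, order="freq")])
                energy = energy + coeffs ** 2                # numerator of eq. 1
            maps[c] = energy / total
        return maps

    def relative_entropy(p, q, eps=1e-12):
        """Discriminant function of eq. (4); eps guards against log(0)."""
        p, q = p + eps, q + eps
        return float(np.sum(p * np.log(p / q)))

    def composite_score(maps):
        """Multi-class score of eq. (3): sum over all class pairs."""
        cs = sorted(maps)
        return sum(relative_entropy(maps[a], maps[b])
                   for i, a in enumerate(cs) for b in cs[i + 1:])

In the full algorithm, a score of this form is computed for every tree node (s, f), and the tree is then pruned bottom-up as described in step 3.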

3 Kolmogorov-Smirnov Best Bases

This section introduces the algorithmic changes to LDB investigated in our research. First, we define an alternative discriminant function D(p^{(i)}, p^{(j)}) for measuring class-pairwise base classification efficacy. Second, we identify the alternative multi-class normalizations examined in our effort. Finally, we introduce a scheme for base (feature) selection that re-evaluates the efficacy of wavelet bases at each iteration of the selection process.

Desirable characteristics for any discriminant function are to minimize the assumptions required for its use while maximizing the power (efficacy) of the test. For instance, correct use of Fisher's discriminant function requires the assumption of unimodal class densities. The two-sample Kolmogorov-Smirnov test provides a distribution (model) free approach to measuring the discrepancy between two samples.

3.1 Kolmogorov-Smirnov test statistic

The Kolmogorov-Smirnov test [8] is a nonparametric statistical test with a two-sample variant that attempts to quantify whether two samples {x_{11}, x_{12}, ..., x_{1n}} and {x_{21}, x_{22}, ..., x_{2m}} were produced from the same underlying probability density f(x). The empirical distribution functions (EDFs) of the two samples, \hat{F}_1(X) and \hat{F}_2(X), are the proportions of samples <= X, and are computed from their respective samples via

    \hat{F}_i(X) = \frac{1}{n_i} \sum_{j=1}^{n_i} I(X - x_{ij})    (6)

where I(x) = 1 if x >= 0, and I(x) = 0 otherwise.

The Kolmogorov-Smirnov (or K-S) test statistic D is simply the largest deviation between \hat{F}_1(X) and \hat{F}_2(X),

    D(\hat{F}_1, \hat{F}_2) = \max_X | \hat{F}_1(X) - \hat{F}_2(X) |    (7)

This measure of separation is robust to outliers and makes no assumptions concerning the form of the underlying density function.

Incorporating the K-S test as our discriminant function for estimating pairwise class separation, we replace step one of the original LDB with the following:

1: Given a training set of m classes, project the training signals x_i^{(j)} onto the library of redundant orthonormal bases:

    a_i^{(j)}(s,f,t) = \frac{\langle w_{s,f,t},\, x_i^{(j)} \rangle}{\| x_i^{(j)} \|}    (8)

1a: Compute the K-S test statistic D_{jk}(s,f,t) = D(\hat{F}_j(a(s,f,t)), \hat{F}_k(a(s,f,t))) across all base indexes (s,f,t). Form the m(m-1)/2 element vector of class-pair discriminant measures for each base:

    \vec{D}_{s,f,t} = [D_{12}(s,f,t), D_{13}(s,f,t), \ldots, D_{(m-1)(m)}(s,f,t)]    (9)

This approach requires more memory than the original approach, as a wavelet packet decomposition is stored for each signal. The energy map approach of LDB allowed cumulative summation of samples into a single (mean) representation, hence requiring only one wavelet packet decomposition per class.

Given the array of class-pair discriminant scores, we need to provide an overall score for the wavelet base. This requirement is discussed in the next section.
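As a concrete illustration (a minimal sketch of ours, not the authors' code), the statistic of equations (6)-(7) can be computed directly from the two sorted samples; the result can be cross-checked against scipy.stats.ks_2samp. The projection of equation (8) is just a normalized inner product.

    import numpy as np
    from scipy.stats import ks_2samp

    def ks_statistic(a, b):
        """Two-sample K-S statistic (eq. 7): largest gap between the two EDFs."""
        a, b = np.sort(a), np.sort(b)
        grid = np.concatenate([a, b])  # the supremum is attained at a sample point
        F1 = np.searchsorted(a, grid, side="right") / a.size  # EDF of eq. (6)
        F2 = np.searchsorted(b, grid, side="right") / b.size
        return float(np.max(np.abs(F1 - F2)))

    def project(x, w):
        """Normalized projection coefficient of eq. (8)."""
        return float(np.dot(w, x) / np.linalg.norm(x))

    # Sanity check against SciPy on two synthetic samples.
    rng = np.random.default_rng(0)
    a, b = rng.normal(0.0, 1.0, 200), rng.normal(0.5, 1.0, 150)
    assert np.isclose(ks_statistic(a, b), ks_2samp(a, b).statistic)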
3.2 Alternative normalizations of \vec{D}

At this stage we have an array of m(m-1)/2 class-pair discriminant statistics, and we have some liberty in determining a mapping from the pairwise scores to an overall score. A natural general formulation is the p-norm, given by

    \| \vec{D} \|_p = \left( \sum_{k=1}^{m(m-1)/2} | D_k |^p \right)^{1/p}    (10)

Popular values for p include 1, 2, and \infty (the sup norm), which is given as

    \| \vec{D} \|_\infty = \max_k D_k    (11)

This paper investigates the use of these p-norms and also a minimax (maxi-min) normalization given by \vec{D}_{min} = \min_k D_k. Although choosing the minimum might seem counterintuitive, it allows the user to select the bases with the 'best worst-case' efficacy, and therefore merits investigation. In the p-norm context, the approach used by LDB (equation (3)) is equivalent to using an L1 (p = 1) norm for \vec{D}.
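A short sketch of the candidate normalizations (equations (10)-(11) plus the minimax rule), assuming the pairwise score vector has already been formed per equation (9); the function name and interface are our own.

    import numpy as np

    def overall_score(d, norm="L1"):
        """Collapse the m(m-1)/2 pairwise K-S scores into one base utility."""
        d = np.asarray(d, dtype=float)
        if norm == "L1":    # eq. (10) with p = 1; the implicit LDB choice
            return np.sum(np.abs(d))
        if norm == "L2":    # eq. (10) with p = 2
            return np.sqrt(np.sum(d ** 2))
        if norm == "sup":   # eq. (11): the best-separated class pair dominates
            return np.max(d)
        if norm == "min":   # minimax: the 'best worst-case' class pair
            return np.min(d)
        raise ValueError(f"unknown norm: {norm}")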

3.3 Dynamic re-weighting of feature utility

In a two-class setting, one would simply like to select those bases with maximum discriminatory capability. In a multi-class environment, the approach is not necessarily as straightforward. For instance, if a particular selected wavelet base perfectly separates two of the m classes, would additional features that separate the same classes be desirable to keep? Should not the features to be selected be 'biased' toward the classes that have yet to be separated?

To address this issue, we wish to modify the feature selection process such that selections are 'biased' toward features that discriminate between classes which are not currently well separated by the previous selections. Our approach defines an m(m-1)/2 element vector \vec{\lambda}^K, whose elements at every iteration K of the feature selection process represent a relative pairwise 'cost of misclassification' or 'current discrimination benefit' between the associated classes. The new 'dynamically re-weighted' discriminant vector is defined as

    \vec{D}^K_{s,f,t} = [\lambda^K_{12} D_{12}(s,f,t), \ldots, \lambda^K_{(m-1)m} D_{(m-1)(m)}(s,f,t)]    (12)

A user-specified normalization, as discussed in the previous section, is applied to \vec{D}^K_{s,f,t}, thereby computing the overall utility of the respective base w_{s,f,t}. After the base is selected, the weights \vec{\lambda}^K are updated according to the following equation (inspired by the technique of boosting [9]):

    \lambda^{K+1}_{ij} = \lambda^{K}_{ij} \exp\left( -\left( \frac{D_{ij}}{\frac{2}{m(m-1)} \sum_{i<j} D_{ij}} - 1 \right) \right)    (13)

where the D_{ij} are the unweighted discriminant scores associated with the currently selected wavelet base w_{s,f,t}, and the denominator is their linear (L1-norm) average. Weights whose discriminant scores are greater than the selected base's average will be biased downward, while weights whose pairwise discrimination is less than the selected base's average will be increased. The resulting new weights, or 'costs of misclassification', are then re-normalized to sum to one. The effect of these adjustments is to increase the likelihood that future selected features will address class pairings that were not as well separated by the previously selected bases. This 'dynamic re-weighting' approach to wavelet feature selection replaces step four of the original LDB, and is summarized below (a code sketch of the full loop appears at the end of this section):

4: Initialize \vec{\lambda}^1 = [2/(m(m-1)), \ldots, 2/(m(m-1))]

    For k = 1 to {number of desired features}
        Selection process:
        a) Compute \vec{D}^k_{s,f,t} (eq. 12) for all bases w_{s,f,t}
        b) Given the specified norm, compute \| \vec{D}^k_{s,f,t} \|_p for all bases
        c) Select the base w_{s,f,t} with the largest \| \vec{D}^k_{s,f,t} \|_p
        Update process:
        d) Compute \vec{\lambda}^{k+1} using eq. (13)
        e) Normalize \lambda^{k+1}_{ij} = \lambda^{k+1}_{ij} / \sum_{ij} \lambda^{k+1}_{ij}
    End for

The update formulation (13) is similar in spirit and functional form to the re-sampling rule in 'boosting', as it attempts to bias the base selection process toward features that offer more discrimination between currently 'costly' or 'tough' classes of interest.

It is important to note that the concepts of alternative normalizations and/or dynamic re-weighting introduced in these sections are applicable to any multi-class base/feature selection approach, including LDB. In the following sections, we compare the original LDB with the K-S best basis. We also compare the original LDB with a 'modified' LDB, augmenting LDB with alternative p-norms and dynamic 'cost' re-weighting. These algorithms are evaluated using simulated data from a signal-processing domain of considerable interest to the authors, and the results are discussed in the following sections.
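The following sketch ties steps a)-e) together under stated assumptions: pairwise is a precomputed (num_bases x m(m-1)/2) array holding the unweighted K-S scores of equation (9) for every base, and norm_fn is one of the normalizations of section 3.2. It illustrates equations (12)-(13); it is not the authors' implementation.

    import numpy as np

    def select_bases(pairwise, num_features, norm_fn=np.sum):
        """Greedy base selection with boosting-style re-weighting (eqs. 12-13)."""
        num_bases, num_pairs = pairwise.shape
        lam = np.full(num_pairs, 1.0 / num_pairs)  # step 4: 2/(m(m-1)) each
        selected = []
        for _ in range(num_features):
            weighted = pairwise * lam                      # eq. (12)
            scores = np.array([norm_fn(row) for row in weighted])
            scores[selected] = -np.inf                     # never re-pick a base
            best = int(np.argmax(scores))                  # steps a)-c)
            selected.append(best)
            d = pairwise[best]
            lam = lam * np.exp(-(d / d.mean() - 1.0))      # eq. (13), step d)
            lam = lam / lam.sum()                          # step e): renormalize
        return selected

With norm_fn=np.min this reproduces the minimax variant, and freezing lam at its initial value recovers the unweighted selection of sections 3.1-3.2.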
4 Measuring algorithmic efficacy

Given the options discussed in the previous section, a study was performed to quantify the value of the various approaches (e.g., original LDB vs. K-S selected bases, multi-class normalizations, cost-based re-weighting of bases). To compare the algorithms, a data environment of current interest (multi-polarimetric real-beam radar signatures of ground vehicles, discussed in the following section) was selected for algorithmic analysis. The criterion used in this research for comparative analysis is the probability of correct classification of test data. This criterion requires a classifier to be trained on the wavelet bases/features selected via the algorithm under analysis. The classifier selected for our comparative analysis is a support vector machine (SVM), a general non-linear classifier which projects the features into an alternative space (via a kernel function) and finds the 'optimal' linear decision boundary in that space. See [10][11] for more details on support vector machines.

5 Application to Radar Classification

To evaluate the relative efficacy of the various strategies and criteria for wavelet base selection, a trade study was performed using simulated high-resolution radar (HRR) data. Previous publications of automatic target recognition algorithms in the HRR environment abound [12][13][14]. This section presents a brief description of the HRR problem domain and the specific problem and data used for algorithm analysis.

5.1 Millimeter wave radar signatures

For this study, simulated fully polarimetric millimeter wave (MMW) inverse synthetic aperture radar (ISAR) images were generated for five classes of ground vehicles. The range resolution of this data was on the order of six inches. The images were converted from 2D images to 1D real-beam range profiles by means of frequency domain processing. Range profiles are 1 x n complex representations of the processed signal returns vs. range (sensor-to-target range), where n is the number of range bins processed. Example magnitudes of range profiles for two vehicles are shown in Figure 1. It is the differences in these profiles, at all sensor-target orientations, that we seek to characterize via wavelet representations and exploit for classification purposes.

Figure 1. MMW range profiles of two target classes.

The training data consist of 360 dual-polarimetric (left-circular and right-circular) range profiles (one signal per degree of pose) for each vehicle of interest. Test data consist of insertions of the ISAR 'chips' into complex SAR images, with each composite image converted into a real-beam range profile representation. An example of the training and corresponding test range profiles at the same pose (pose is the relative orientation of sensor and target vehicle) is shown in Figure 2. Our test set consists of 715 images with targets placed at random locations and random poses (relative to the sensor). These images were converted to real-beam signatures. Detection of the target was pre-supposed, as this study was geared toward the relative analysis of algorithmic performance.

Figure 2. Corresponding training and test range profiles (two panels: training signature and test signature; power in dB vs. range bin).

In this research, the complex range profiles (both training and test) are converted to real-valued magnitude range profiles for wavelet processing and analysis. Wavelet analysis is performed in both polarizations, and the best features (as defined by the criterion under study) are selected as input to an SVM classifier for training and testing. Target pose information is not given in either the training or testing phases, as the features extracted are representative of differences across vehicle angular aspects.

Five vehicles of similar size were chosen to test the algorithms in a realistic multi-class environment. Our experimental discrimination capabilities and classification results are discussed in the next section; a sketch of the end-to-end evaluation chain follows.
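A minimal sketch of the evaluation chain just described, under stated assumptions: the feature matrices, the RBF kernel, and the hyperparameter values are illustrative placeholders (the paper does not publish its SVM settings), and scikit-learn is assumed.

    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def evaluate(train_feats, y_train, test_feats, y_test):
        """Train an SVM on the selected wavelet features; return the test PCC.

        train_feats/test_feats: magnitude range profiles projected onto the
        k selected bases (both polarizations concatenated); y_*: class labels.
        """
        clf = make_pipeline(StandardScaler(),
                            SVC(kernel="rbf", C=10.0, gamma="scale"))
        clf.fit(train_feats, y_train)
        return clf.score(test_feats, y_test)  # probability of correct classification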

5.2 Algorithm Trade Study Space

The dimensionality of the potential trade space is quite high (i.e., wavelet library x discriminant function x p-norm x unweighted/dynamic re-weighting), and it is beyond the scope of this paper to provide an exhaustive evaluation. However, we summarize our results to date.

5.2.1 K-S Best Bases vs. LDB

The classification efficacy of the Kolmogorov-Smirnov test is compared to the energy map approach of LDB for two wavelet libraries (Daubechies-4 and Daubechies-8). The classifier results for K-S Best Bases and LDB are shown in Figures 3 and 4 for a small number of features. The norms displayed for K-S Best Bases are the L1 and L2 norms, and they illustrate little or no difference between the two normalizations for these families. The results also indicate that a difference exists between the LDB and K-S approaches when the number of features is less than 10. However, with a higher number of features, LDB-selected features outperform those selected via K-S Best Bases, as illustrated in Figure 5.

On an interesting note, all algorithmic variants for feature selection converge to identical classification results when a complete wavelet basis (spanning the original signal space) is supplied to the classifier. The complete-basis classification results (PCC = 0.95) are independent of the wavelet library selected (Daubechies, Coiflets, etc.), 'how' the basis was selected (i.e., the discriminant function), the p-norm, and whether or not dynamic weights are applied. The complete-basis results were only a function of the SVM classifier hyperparameters, but further discussion is beyond the scope of this paper.

Figure 3. Number of features vs. 5-class identification performance for the 'best' 4 to 16 Daubechies (4) features.

Figure 4. Number of features vs. 5-class identification performance for the 'best' 4 to 16 Daubechies (8) features.

Figure 5. Number of features vs. 5-class identification performance for the 'best' 16 to 256 Daubechies (4) features.

5.2.2 Normalization of \vec{D}_{s,f,t}

Several experiments were performed to determine the classification efficacy of the various multi-class vector normalizations (section 3.2) on the problem under study. Other measures (e.g., the number of matching features) could be applied to quantify feature selection overlap, but since our focus is on classification efficacy, overall classification continued to be our metric of choice.

Figure 6 displays classification performance for the K-S discriminant function and the Daubechies (8) library using the L1, L2, L-infinity, and min() functions applied to \vec{D} for the best n features, with n varying from 4 to 256. Figure 7 displays the same information for the Daubechies (4) library.

Figure 6. Normalization effects on Daubechies (8) / K-S classification percentages for 4-256 bases.

Figure 7. Normalization effects on Daubechies (4) / K-S classification percentages for 4-256 bases.

As illustrated by the figures above, we discovered no particular \vec{D} normalization approach that was superior across the wavelet libraries or the number of bases selected. However, the choice of normalization approach did affect the performance of dynamic re-weighting of misclassification costs, illustrated in the next section.

5.2.3 Uniform vs. Dynamic Re-weighting

Several effects became apparent while quantifying the applicability of dynamic re-weighting of pairwise class misclassification costs. It was noted that dynamic re-weighting of pairwise 'utility' produces negligible effects when an L1 norm is applied to \vec{D}_{s,f,t}, regardless of the discriminant function or wavelet family applied. This is illustrated for the Daubechies (4) library in Figure 8, given a K-S discriminant function and the L1 norm. Similar results were obtained when LDB was modified with dynamic re-weighting, and no significant differences were detected across wavelet libraries. However, using the L-infinity or min() normalization, differences in behavior were detected. Figure 9 illustrates a small but consistent improvement in classification efficacy via dynamic adjustment of \vec{D}^K_{s,f,t}.

Figure 8. Dynamic re-weighting of pairwise costs produces minimal changes with L1 normalization.

Figure 9. Classification with/without dynamic re-weighting: K-S discriminant and min() norm.

5.2.4 Modifying LDB

As mentioned in a previous section, the algorithmic modifications developed in this research are not specific to the K-S best basis discriminant function, but are broadly applicable to any multi-class feature evaluation and selection technique. It was therefore of interest to see the effects of augmenting LDB with alternative normalization and/or dynamic re-weighting.

Adding dynamic re-weighting of pairwise costs alone into LDB (which implicitly uses an L1 norm) resulted in minimal changes. For small numbers of features, dynamic re-weighting together with alternative norms in LDB provided improved results for some wavelet families, and insignificant or poorer performance for others. The improvement is illustrated in Figure 10, while Figure 11 illustrates less significant dynamic re-weighting results. But again, the classification differences between the various techniques become negligible as the number of features increases.

Figure 10. Re-weighted/min-norm LDB vs. original LDB for the Daubechies (8) library.

Figure 11. Re-weighted/min-norm LDB vs. original LDB for the Daubechies (4) library.

6 Conclusions

This research has investigated three distinct modifications (discriminant function, multi-class discriminant normalization, and dynamic re-weighting of pairwise misclassification costs) to the original LDB library-base selection process, while attempting to quantify the classification efficacy of these approaches for the identification of vehicles from high-resolution radar signatures.

It is evident that the effects of our modifications become less significant as the number of features increases, with all techniques converging to identical classification results when a complete basis of wavelet features is provided to the support vector machine. Whether this is due to the robustness of the wavelet representations across families and techniques, or due to the capabilities of the support vector machine classifier, is a topic for investigation. It is important to note that similar results (convergence of classification efficacy as the number of wavelet bases increases) were reported by Saito et al. [5] when comparing LDB with the Kullback-Leibler based discriminant function (eq. 5).

Although overall classification efficacy was not the focus of the research, the classification results show demonstrable promise for the classification of ground vehicles using wavelet libraries for feature extraction and characterization of high-resolution radar signatures.

7 References

[1] Ronald R. Coifman, M. V. Wickerhauser, Entropy-based algorithms for best basis selection, IEEE Trans. Info. Theory, Vol. 38, no. 2, pp. 713-718, 1992.

[2] Jonathan Buckheit, David Donoho, Improved Linear Discrimination Using Time-Frequency Dictionaries, Technical report, Department of Statistics, Stanford University, http://www-stat.stanford.edu/~donoho/Reports

[3] Naoki Saito, Ronald R. Coifman, Improved discriminant bases using empirical probability density estimation, 1996 Proc. Computing Section of Amer. Statist. Assoc., pp. 312-321, 1997.

[4] Naoki Saito, Ronald R. Coifman, Local discriminant bases, Mathematical Imaging: Wavelet Applications in Signal and Image Processing, A. F. Laine, M. A. Unser, Editors, Proc. SPIE, Vol. 2303, 1994.

[5] Naoki Saito, Ronald R. Coifman, Frank B. Geshwind, Fred Warner, Discriminant feature extraction using empirical probability density estimation and a local basis library, Pattern Recognition, Vol. 35, pp. 2841-2852, 2002.

[6] David W. Scott, Multivariate Density Estimation, John Wiley & Sons, New York, 1992.

[7] S. Kullback, R. A. Leibler, On Information and Sufficiency, Annals of Mathematical Statistics, Vol. 22, pp. 79-86, 1951.

[8] Jerrold H. Zar, Biostatistical Analysis, Prentice Hall, New Jersey, 1984.

[9] Yoav Freund, Robert E. Schapire, A Short Introduction to Boosting, Journal of Japanese Society for Artificial Intelligence, Vol. 14, no. 5, pp. 771-780, 1999.

[10] Vladimir N. Vapnik, An Overview of Statistical Learning Theory, IEEE Trans. on Neural Networks, Vol. 10, no. 5, pp. 988-999, 1999.

[11] Nello Cristianini, John Shawe-Taylor, An Introduction to Support Vector Machines, Cambridge University Press, Cambridge, 2000.

[12] Dale E. Nelson, Janusz A. Starzyk, High Range Resolution Radar Signal Classification: A Partitioned Rough Set Approach, Proc. of the 33rd IEEE Southeastern Symposium on System Theory, pp. 21-24, 2001.

[13] Steven P. Jacobs, Joseph A. O'Sullivan, Automatic Target Recognition Using Sequences of High Resolution Range-Profiles, IEEE Trans. on Aerospace and Electronic Systems, Vol. 36, no. 2, pp. 364-381, 2000.

[14] Rob Williams, John Westerkamp, Dave Gross, Adrian Palomino, Automatic Target Recognition of Time Critical Moving Targets Using 1D High Range Resolution (HRR) Radar, IEEE Aerospace and Electronic Systems Magazine, pp. 37-43, April 2000.