Area-Based Automatic Image Registration Techniques and Quality Assessment Criteria

SAQIB YOUSAF, WILL HOSSACK

Institute of Meteorology, University of Edinburgh, UK

Abstract: - Automatic image registration has become an important issue for many types of applications and data. In this paper, we present a quantitative evaluation of various area-based registration techniques. To benchmark the accuracy of the final registration parameters, a new quality measure is proposed, which is also tested successfully as a similarity measure for wavelet-based image registration using the LL subband. The dependence of registration accuracy on the detail and contrast in images is also analyzed. Smoothing is found to be helpful for obtaining more reliable registration parameters because it suppresses noise and very high frequency components. The energy of the Canny edge image can be used to estimate the required degree of smoothing. Smoothing must be achieved with minimum effect on significant features and edges. The algorithms have been tested on Landsat Thematic Mapper (TM), aerial and medical (CT) images.

Key-Words: - Image registration, wavelet decomposition, multiresolution analysis, similarity measures, TM images, remote sensing.

1 Introduction

Image registration is required in many diverse fields such as medical imagery, computer vision, pattern recognition, robotics and remote sensing. The main registration problems can be categorized as (a) integrating multi-sensor/multi-spectral data, (b) inferring three-dimensional (3-D) information from shifted images, (c) change detection and (d) model-based object recognition. Examples of applications are data fusion, multi-sensor/multi-spectral classification, map updating (in cartography), environmental monitoring, mosaicing, combining CT and NMR data, tumour monitoring, target localization, visual inspection, stereo matching and object height estimation [[i]].

There can be no universal registration technique for all applications. However, most techniques consist of the following main tasks [[ii]]:

  • Feature detection
  • Feature matching
  • Transform model estimation
  • Similarity metric selection
  • Search space and strategy
  • Image resampling and transformation

The registration techniques are generally classified on the basis of feature space, search strategy and similarity measure.

This work is motivated by various automated area-based image-to-image registration methods for finding Rotation-Scale-Translation (RST) parameters. In the next two sections, we discuss the Fourier-Mellin transform (FMT) and wavelet-based techniques. Section 4 covers the quality assessment issues. Section 5 concludes the paper and suggests various possible schemes. Four different types of images are carefully selected for analysis, as shown in Fig. 1.

2 FMT-based Image Registration

FMT can be used to register images having RST-distortions due to its RST-invariance [[iii]].

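Two images f1 and f2 shifted by (xo, yo), i.e. f2(x, y) = f1(x - xo, y - yo), can be represented in the Fourier domain as follows:

$$F_2(u,v) = F_1(u,v)\, e^{-j 2\pi (u x_o + v y_o)} \tag{1}$$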

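The inverse FFT of the normalized cross power spectrum is a Dirac pulse centred at (xo, yo):

$$P(x,y) = \mathcal{F}^{-1}\left\{ \frac{F_2(u,v)\, F_1^{*}(u,v)}{\left| F_2(u,v)\, F_1^{*}(u,v) \right|} \right\} = \delta(x - x_o,\; y - y_o) \tag{2}$$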

Typically, P(x,y) contains a dominant peak at (xo, yo), denoted here as tpeak. The amplitude and energy of tpeak are a direct measure of the congruence between the two images. The normalization of the cross power spectrum makes the phase correlation method (PCM) quite robust to those types of noise that are correlated with the image, e.g., uniform variations of illumination, offsets in average intensity and fixed gain errors due to calibration.
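The phase correlation step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this paper: the function name `phase_correlation`, the small constant `eps` and the random test image are assumptions introduced for the example, and the shift is circular so that the peak is exact.

```python
import numpy as np

def phase_correlation(f1, f2, eps=1e-12):
    """Return the phase-correlation surface P and its peak location (row, col)."""
    F1 = np.fft.fft2(f1)
    F2 = np.fft.fft2(f2)
    cross = F2 * np.conj(F1)             # cross power spectrum
    cross = cross / (np.abs(cross) + eps)  # normalise: keep the phase only
    P = np.real(np.fft.ifft2(cross))     # ideally a single pulse at the shift
    peak = tuple(int(i) for i in np.unravel_index(np.argmax(P), P.shape))
    return P, peak

# Hypothetical test data: f2 is f1 circularly shifted by 5 rows and 12 columns.
rng = np.random.default_rng(0)
f1 = rng.random((64, 64))
f2 = np.roll(f1, shift=(5, 12), axis=(0, 1))
P, peak = phase_correlation(f1, f2)
tpeak = P[peak]                          # peak height, close to 1 for a pure shift
```

For real image pairs the shift is not circular and the overlap is partial, so the peak height falls below 1, which is what makes it usable as a congruence measure.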

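Two images f1 and f2 rotated by θ and shifted by (xo, yo) can be represented in the Fourier domain as follows:

$$F_2(u,v) = e^{-j 2\pi (u x_o + v y_o)}\, F_1(u\cos\theta + v\sin\theta,\; -u\sin\theta + v\cos\theta) \tag{3}$$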

Thus, |F2| is only a rotated replica of |F1|. In polar coordinates (setting u = r cos ψ, v = r sin ψ):

$$|F_2(r,\psi)| = |F_1(r,\psi - \theta)| \tag{4}$$

Thus, rotation appears as a shift along the ψ-axis, so PCM can be applied to the pair of polar Fourier images to recover θ.

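If image f2 is a (1/a, 1/b) scaled replica of f1, i.e. f2(x, y) = f1(ax, by):

$$F_2(u,v) = \frac{1}{|ab|}\, F_1\!\left(\frac{u}{a}, \frac{v}{b}\right) \tag{5}$$

Thus, the magnitude of F2 is scaled by the factor 1/|ab|.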

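If input image f2 is a 1/s scaled, θ-rotated and (xo, yo) shifted replica of reference image f1:

$$|F_2(u,v)| = \frac{1}{s^2} \left| F_1\!\left(\frac{u\cos\theta + v\sin\theta}{s},\; \frac{-u\sin\theta + v\cos\theta}{s}\right) \right| \tag{6}$$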

The FMT is obtained by using log-polar coordinates:

$$|F_2(\log r, \psi)| = \frac{1}{s^2}\, |F_1(\log r - \log s,\; \psi - \theta)| \tag{7}$$

Fig. 1. Left: Test images, Right: ‘+’ Artifacts (a) FFT of case-I image, (b) FFT of case-I image rotated by 45°, (c) FFT of case-I image with RST = [45, 1.25, (50 50)], (d) FFT of case-I image rotated by 90°, (e) FFT of case-IV image, (f) FFT of case-IV image rotated by 45°

Using the Fourier power spectrum of these functions, the shift (log s, θ) can be determined by PCM from the position of the peak, denoted here as rspeak. The exact location of both rspeak and tpeak can be determined by interpolation or from the centre of mass of the peak [[iv]]. Fig. 2 shows rspeak and tpeak versus individual RST-distortions and hence the overlap. The effect of the ‘+’ artifacts is obvious in the rotation curve. At 90°, tpeak increases sharply due to the absence of these artifacts.
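The centre-of-mass option for locating a peak with sub-pixel precision can be illustrated as follows. This is a sketch under stated assumptions: the function name `centroid_peak`, the neighbourhood `radius` and the synthetic surface are hypothetical, and the values around the peak are assumed to be non-negative.

```python
import numpy as np

def centroid_peak(P, radius=1):
    """Sub-pixel peak location from the centre of mass around the maximum of P."""
    r0, c0 = np.unravel_index(np.argmax(P), P.shape)
    rows = np.arange(r0 - radius, r0 + radius + 1)
    cols = np.arange(c0 - radius, c0 + radius + 1)
    # Wrap indices so that peaks on the border use the periodic neighbours.
    patch = P[np.ix_(rows % P.shape[0], cols % P.shape[1])]
    total = patch.sum()
    row_c = float((patch.sum(axis=1) * rows).sum() / total)
    col_c = float((patch.sum(axis=0) * cols).sum() / total)
    return row_c, col_c

# Synthetic correlation surface with an asymmetric peak around (5, 7).
P_demo = np.zeros((16, 16))
P_demo[5, 7] = 1.0
P_demo[5, 8] = 0.5
P_demo[4, 7] = 0.25
P_demo[6, 7] = 0.25
row_c, col_c = centroid_peak(P_demo)
```

In this example the extra mass at column 8 pulls the column estimate towards it, while the symmetric row neighbours leave the row estimate on the integer peak.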

Fig. 2. Effect of individual RST-distortions

2.1 Implementation issues

Eq. 1-7 treat images as continuous and infinite. The discreteness of images introduces sampling errors. Also, log-polar images naturally have poor resolution for pixels away from the centre. These errors may be reduced by using high-resolution log-polar images or better interpolation methods.

Due to noise and ‘+’ shaped artifacts in Fourier space, tpeak is much less than 1 even if the image is corrected by the true RS-parameters [[v]]. These artifacts arise from the sharp boundaries created when rotation and scale are rectified before finding the translation parameters (a consequence of the finite extent of images). They can only be avoided for rotations that are integer multiples of 90° (see Fig. 1). The image can be multiplied by a circularly symmetric or blurred-border window to reduce this effect [[vi]]. These errors are less significant for images having a uniform background, which can be the case for some medical images. Thus, tpeak is computed after RS-correction, and only an rspeak giving tpeak above a suitable value is considered reliable (see Fig. 6). Here we are not concerned with this suitable value of tpeak, because it does not generalize across image types; rather, we discuss the issues that can improve tpeak.

Images must be of the same size; in particular, log-polar conversion requires square images.

Let (Xs, Ys) be the position of rspeak or tpeak; then Xs must be replaced with Nc-Xs when Xs > Nc/2. Similarly, Ys must be replaced with Nr-Ys when Ys > Nr/2. Here Nr and Nc represent the number of rows and columns of the images. Such peaks correspond to negative shifts and rotations, or to a scale less than 1 [[vii]].
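The wrap-around rule can be written as a small helper. The sketch below returns a signed value, i.e. the negative of Nc-Xs for peaks in the upper half of the axis; the function name `signed_shift` and this sign convention are assumptions made for illustration.

```python
def signed_shift(peak_index, size):
    """Map a peak index in [0, size) to a signed shift.

    Indices above size/2 are interpreted as negative shifts of
    magnitude size - peak_index, following the rule in the text.
    """
    if peak_index > size / 2:
        return peak_index - size
    return peak_index

# Example for an image with Nc = 64 columns.
small_positive = signed_shift(5, 64)    # stays 5
wrapped = signed_shift(60, 64)          # becomes -4, magnitude Nc - Xs = 4
```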

The following high-pass filter is applied to the log-polar image because of its numerical instability near the origin [[viii]].

(8)

A window function must be applied to the images before calculating the FFT to reduce boundary discontinuity effects and suppress spurious peaks. We tested Hamming, Hann, Gaussian, Kaiser and Tukey (cosine-tapered) windows [[ix]]. The Hamming window has shown the best results for our cases.
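As a minimal sketch of this step, a 2-D Hamming window can be built as the outer product of two 1-D windows and multiplied into the image before the FFT. The separable construction and the name `hamming_2d` are assumptions; the text does not state how the 2-D window was formed.

```python
import numpy as np

def hamming_2d(shape):
    """Separable 2-D Hamming window (outer product of two 1-D windows)."""
    rows, cols = shape
    return np.outer(np.hamming(rows), np.hamming(cols))

# Hypothetical constant image: after windowing, the borders are strongly attenuated.
image = np.ones((64, 64))
windowed = image * hamming_2d(image.shape)
spectrum = np.fft.fft2(windowed)
```

The corner weight is 0.08 × 0.08 = 0.0064, so the discontinuity between opposite image borders, which causes the spurious peaks, is almost removed.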

An analysis based on the mean and variance of P(x,y) and the value of tpeak gives an indication of registration validity [[x]].

2.2 Decoupling Scale and Rotation

Integrating the FMT along the ψ-axis produces an RT-invariant function S(log r) called the scale signature. Similarly, integrating the FMT along the log r-axis produces an ST-invariant function R(ψ) called the rotation signature. The scale can be found by 1-D normalized cross correlation of the scale signatures S1 and S2 of the two images. According to M. McGuire [[xi]], a suitable filter must be applied to the FMT before integration to remove the artifacts. A Tukey filter with parameter 0.3 (which corresponds to a blurred-border window) has shown excellent results. Also, a moving average subtraction is performed on S1 and S2 to improve the signal-to-noise ratio. Then, using the same FMT, the rotation can be obtained by 1-D normalized cross correlation of the two rotation signatures R1 and R2. It can be useful to rectify the scale error before calculating the rotation.
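The signature idea can be sketched as follows, assuming the log-polar magnitude is stored as a 2-D array with ψ along the rows and log r along the columns. The function names and the random test array are hypothetical, the filtering and moving average subtraction are omitted, and a circular correlation is used for both axes, which is exact for the periodic ψ-axis but only an approximation for the log r-axis.

```python
import numpy as np

def signatures(fmt_mag):
    """Return (scale signature S(log r), rotation signature R(psi))."""
    scale_sig = fmt_mag.sum(axis=0)   # integrate over psi
    rot_sig = fmt_mag.sum(axis=1)     # integrate over log r
    return scale_sig, rot_sig

def circular_ncc_shift(a, b):
    """Circular lag at which b best matches a, and the NCC value at that lag."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    corr = np.real(np.fft.ifft(np.fft.fft(b) * np.conj(np.fft.fft(a))))
    lag = int(np.argmax(corr))
    return lag, float(corr[lag])

# Hypothetical log-polar magnitudes: the second is the first shifted by
# 7 samples along psi (rotation) and 11 samples along log r (scale).
rng = np.random.default_rng(0)
fmt1 = rng.random((90, 128))
fmt2 = np.roll(fmt1, shift=(7, 11), axis=(0, 1))
S1, R1 = signatures(fmt1)
S2, R2 = signatures(fmt2)
scale_lag, scale_ncc = circular_ncc_shift(S1, S2)
rot_lag, rot_ncc = circular_ncc_shift(R1, R2)
```

Each search is one-dimensional, so the two parameters are recovered independently of each other.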

Fig. 3 shows that about three more peaks are comparable to the peak corresponding to the actual value. Thus, it is better to test a list of scale parameters obtained from the top few peaks and to select the scale parameter giving the best tpeak. Also, the CT image has fewer artifacts because of its uniform background, which results in a larger peak and a smaller contribution from the false peaks.

Fig. 3. Cross correlation peak to find scale parameter separately

Fig. 4. Effect of various filters on cross correlation peak in finding rotation separately

Fig. 4 shows false peaks at intervals of about 90° due to the boundary discontinuity effect. Accordingly, the brain image, which has a uniform background, has no false peaks even when no filter is applied. A Kaiser window applied to the images before calculating the FFT removes these artifacts effectively when its parameter is increased. An edge-enhancing filter applied to the FMT can make this peak sharper.

Iterative improvement of the rotation/scale parameters is observed by increasing the resolution along the rotation/scale axis of the log-polar image. The 1-D cross-correlation is performed each time using a maximum lag corresponding to twice the previous estimate of the rotation parameter, which not only helps to avoid searching for irrelevant estimates but also reduces computational time. The iteration can be stopped when there is no significant change in the estimate. Images having a smaller rspeak do not show such iterative improvement, so this behaviour can be used as a check of registration validity.

3 Wavelets-based Image Registration

Wavelets are quickly decaying oscillatory waves. They are a preferred choice over the Fourier transform mainly due to their time localization and multiresolution properties. They can isolate singularities and irregular structures in signals because their mother wavelet can be scaled and translated. A tree-structured fast discrete wavelet transform (DWT) algorithm developed by Mallat [[xii]] can calculate lower resolution coefficients from the higher resolution coefficients by successive lowpass and highpass filtering using a two-channel Quadrature-Mirror Filter (QMF) bank [[xiii]] (see Fig. 5).

Fig. 5. Wavelet decomposition of an image

Four subband images can be obtained from the original image or previous LL image. The LH and HL images contain horizontal and vertical edge features respectively, while the HH contains mostly high frequency noise. The main steps in wavelets-based registration are shown in Fig. 6.

Fig. 6. Left: Steps in FMT-based image registration, Right: Steps in Wavelets-based image registration

3.1 Feature space

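Let fLH(2^j, x, y) and fHL(2^j, x, y) be the LH and HL images at scale 2^j; then we can define an image fMOD containing both horizontal and vertical edges as follows [[xiv]]:

$$f_{MOD}(2^j,x,y) = \sqrt{ \left| f_{LH}(2^j,x,y) \right|^2 + \left| f_{HL}(2^j,x,y) \right|^2 } \tag{9}$$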
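To make the feature space concrete, the sketch below computes one level of a Haar decomposition and a modulus image from the LH and HL subbands. It is an illustration only: the averaging normalization, the function names, and the reading of MOD as the per-pixel magnitude of the two edge subbands are assumptions.

```python
import numpy as np

def haar_level(image):
    """One level of a 2-D Haar decomposition (averaging normalization)."""
    a = image[0::2, 0::2]
    b = image[0::2, 1::2]
    c = image[1::2, 0::2]
    d = image[1::2, 1::2]
    ll = (a + b + c + d) / 4.0   # smooth approximation
    lh = (a + b - c - d) / 4.0   # difference between rows: horizontal edges
    hl = (a - b + c - d) / 4.0   # difference between columns: vertical edges
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh

def modulus(lh, hl):
    """Combine horizontal and vertical edge responses into one image."""
    return np.sqrt(lh ** 2 + hl ** 2)

# Hypothetical 8x8 image with a horizontal edge between rows 2 and 3.
image = np.zeros((8, 8))
image[3:, :] = 1.0
ll, lh, hl, hh = haar_level(image)
mod = modulus(lh, hl)
```

Only the second row of `mod` is non-zero here, because the edge falls inside the second pair of image rows; applying `haar_level` again to `ll` gives the next coarser level.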

Another tested feature is the histogram-thresholded (HT) image of this MOD image, containing the top n% intensities of the histogram [[xv]]. The mean μj and standard deviation σj at level 2^j can also be used to threshold MOD as follows [[xvi]]:

(10)

The threshold parameter of Eq. 10 and the percentage n can be used to control the number of feature points.

Fig. 7. NCC values for each level. Level 0 corresponds to correlation for rectified images. (a) For case-I image (b) for smoothed case-II image (c) for noisy case-I image (SNR=5) (d) for case-II image

Also, the first principal component image fPC(2^j, x, y) of the HL and LH images at scale 2^j can be used.

Fig. 7 shows that registration using MOD, HT and MT gives good results but PC is inconsistent. Smoothing results in better performance, while noise decreases the correlation values of levels 1 and 2. Case-II images give high correlation values for all subbands because they contain less detail.

3.2 Search space and search strategy

Wavelet multiresolution search strategies are fast and efficient for a search space composed of 2-D rotations and translations. The RT-parameters estimated at the lowest resolution are used as input to the next higher resolution image, giving a refined search space (see Table 1). The initial estimate of the rotation, θo, is determined by searching for the best angle in the interval [-90°, 90°]. The accuracy of the parameters doubles with each level when going from low resolution to high resolution. The computation can be reduced by searching for one parameter at a time, assuming the other's last estimate to be known [15].

We suggest another possible search strategy: rectifying the images using the estimated RT-parameters after the first step. Such a strategy needs smaller search steps at the first step to avoid pursuing a false path.

TABLE 1: SEARCH STRATEGY

Level / Image size / Rotation search space / Translation search space / Δθ / Δx, Δy / Results
4 / 32×32 / θo ± 8 / Txo ± 16, Tyo ± 16 / 4 / 8 / θ3, Tx3, Ty3
3 / 64×64 / θ3 ± 4 / Tx3 ± 8, Ty3 ± 8 / 2 / 4 / θ2, Tx2, Ty2
2 / 128×128 / θ2 ± 2 / Tx2 ± 2, Ty2 ± 2 / 1 / 2 / θ1, Tx1, Ty1
1 / 256×256 / θ1 ± 1 / Tx1 ± 1, Ty1 ± 1 / 1 / 1 / θ, Tx, Ty
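The coarse-to-fine refinement can be sketched for the rotation parameter alone. The schedule of (half-width, step) pairs below is our reading of the rotation columns of the table and should be treated as an assumption, as should the function names; the toy score function merely stands in for a similarity measure such as NCC evaluated at each level.

```python
def candidates(centre, half_width, step):
    """Values from centre - half_width to centre + half_width in increments of step."""
    n = int(round(2 * half_width / step))
    return [centre - half_width + i * step for i in range(n + 1)]

def coarse_to_fine(score, theta0, schedule):
    """Refine an estimate level by level, keeping the best-scoring candidate."""
    theta = theta0
    for half_width, step in schedule:
        theta = max(candidates(theta, half_width, step), key=score)
    return theta

# Assumed rotation schedule for levels 4, 3, 2, 1: (search half-width, step).
ROTATION_SCHEDULE = [(8, 4), (4, 2), (2, 1), (1, 1)]

# Toy similarity that peaks at a hypothetical true rotation of 5 degrees.
true_angle = 5
best = coarse_to_fine(lambda t: -abs(t - true_angle), 0, ROTATION_SCHEDULE)
```

Starting from θo = 0, this run evaluates 18 candidates in total (5, 5, 5 and 3 over the four levels), one more than the 17 needed to test every integer angle in [-8, 8]. The saving in practice comes mainly from scoring most candidates on the smaller low-resolution images.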

3.3 Wavelet filters

The choice of filter bank is an important issue in image registration [[xvii]], [[xviii]]. A higher order filter can have good frequency localization, which in turn increases the energy compaction, whereas a lower order filter is expected to have better time localization and therefore to preserve the edges. It is observed that spline biorthogonal (bior) wavelets are the best choice. Daubechies (Db) filters can be inconsistent, with the exception of the Haar filter. The reason may be that the Haar scaling function is discontinuous at only two points. Fig. 8 shows the performance of a few filters of different orders for the MOD subband. The Haar, Bior1.3 and Bior2.2 filters show good results, but Db2 and Db3 show poor and inconsistent results.

Fig. 8. NCC values for each level. Level 0 corresponds to correlation for rectified images. (a) For case-I image (b) for smoothed case-I image (c) for noisy case-I image (SNR=2.5) (d) for case-II image

3.4 Implementation Issues

Noise has more effect on the higher resolution subbands and decreases their normalized correlation coefficient (NCC) (see Fig. 7).

A gradual improvement of NCC as the level decreases indicates registration validity.

Wavelet coefficients (especially the high-frequency ones) are not translation invariant due to the use of convolution and subsampling [[xix]].

3.5 Similarity measure

The NCC is a widely used similarity measure, but it has a few limitations: it is computationally expensive, it behaves undesirably for images containing too much or too little fine structure, and it is highly sensitive to image skewing, vignetting, etc. It can be undefined (due to division by 0) if either image has uniform intensity. Also, NCC can remain the same as long as the local mean and/or histogram of pixel intensities are relatively unchanged [[xx]]. The minimum acceptable value of NCC is another issue. To deal with the limitations of NCC to some extent, a modified NCC is used, which is calculated by taking the average of the NCC values from four equal parts of the image.
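A minimal sketch of the NCC and of the four-part average is given below. The function names are hypothetical, the four parts are taken to be the image quadrants (an assumption), a trailing odd row or column is ignored, and a uniform image yields NaN rather than an exception.

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation coefficient of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0:
        return float("nan")      # undefined when either image is uniform
    return float((a * b).sum() / denom)

def quadrant_ncc(a, b):
    """Average of the NCC values computed on the four image quadrants."""
    r, c = a.shape[0] // 2, a.shape[1] // 2
    blocks = [(slice(0, r), slice(0, c)), (slice(0, r), slice(c, 2 * c)),
              (slice(r, 2 * r), slice(0, c)), (slice(r, 2 * r), slice(c, 2 * c))]
    return float(np.mean([ncc(a[blk], b[blk]) for blk in blocks]))

# Hypothetical data: NCC is invariant to gain and offset changes.
rng = np.random.default_rng(1)
x = rng.random((16, 16))
same_up_to_gain = ncc(x, 2.0 * x + 3.0)
four_part = quadrant_ncc(x, x)
undefined = ncc(np.ones((4, 4)), x[:4, :4])
```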

3.6 Wavelet-based Denoising

Fig. 7-9 show that sensible smoothing improves the registration performance. The energy of the Canny edge image (ECE) is found to be useful in quantifying the roughness. The ECE values for the images of cases I-IV were found to be 0.0665, 0.0486, 0.169 and 0.0559 respectively, and their standard deviations are 53.94, 48.02, 52.79 and 78.46 respectively. Thus, for the Landsat image ECE is large but the standard deviation is low (i.e., low contrast and a lot of detail). Smoothing is found to be effective for such images. For aerial images, ECE is low and the standard deviation is high. When uniform noise is added, both ECE and the standard deviation should increase.

Smoothing can be achieved by applying a Gaussian lowpass filter, but wavelet-based denoising methods are better as they are more flexible and can retain the edges. Donoho and Johnstone's WaveShrink [[xxi]], which is based on a soft threshold [[xxii]], is the most widely used method for denoising or smoothing images. We applied this method to smooth the images and used ECE to find the required value of the soft threshold. Moreover, a soft threshold on the high frequency subbands of level 1 or 2 can solve the problem of the decrease in correlation value for these levels as noise increases.
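The soft-threshold rule itself is short enough to state in code. The sketch below shows only the shrinkage of a set of coefficients; the function name and the example values are illustrative, and the choice of threshold (here simply 1.0) is not the ECE-based choice described above.

```python
import numpy as np

def soft_threshold(coeffs, t):
    """Shrink coefficients towards zero by t; magnitudes below t become zero."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

# Hypothetical detail coefficients: small values are removed, large ones reduced.
detail = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
shrunk = soft_threshold(detail, 1.0)
```

Unlike a hard threshold, the surviving coefficients are also reduced by t, which avoids introducing new discontinuities into the reconstructed image.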

Fig. 9 shows results for Landsat images using the Haar filter with the MOD subband. Denoising using the mean of the image as a soft threshold seems useful. Also, to register various Landsat bands, the use of subbands like MT or HT seems a better approach.

Fig. 9. Correlation values for each level. Here B# stands for Landsat band followed by the year of acquisition

4 Quality assessment criteria

As no registration algorithm can be perfect, scatter plots and difference images are used to inspect the performance. However, we are interested in a quantitative measurement of the accuracy of the final registration parameters in terms of a single number, called the quality measure.

4.1 Edge width in difference image (EWDI)

The difference image Id of the rectified input image and the reference image contains mostly zeros except near the edges of the objects (see Fig. 10). With more accurate RST-estimates, the lines in Id become sharper and thinner. We binarized Id by applying a threshold and calculated the energy Ed.

Here, (11)

An exponential decrease in Ed is observed as this threshold value increases (see Fig. 10). Also, a bad RST-estimate shows a slower decrease of the Ed-curve. These curves can be modelled as a decaying exponential of the threshold, whose decay-rate parameter quantifies the rate of decrease. This parameter can then be found by a simple least-squares estimate of the slope of the straight line obtained by plotting log(Ed) against the threshold. We call this parameter EWDI (edge width in difference image).
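The least-squares step can be sketched as a straight-line fit to log(Ed). This is an illustration under stated assumptions: the function name `decay_rate`, the threshold values and the synthetic energies are hypothetical, and the sign is chosen so that a faster decrease gives a larger value.

```python
import numpy as np

def decay_rate(thresholds, energies):
    """Least-squares slope of log(energy) versus threshold, with the sign flipped."""
    slope, _intercept = np.polyfit(thresholds, np.log(energies), 1)
    return float(-slope)

# Synthetic Ed-curve following an exact exponential with decay rate 0.3.
thresholds = np.arange(1, 11)
energies = 0.8 * np.exp(-0.3 * thresholds)
rate = decay_rate(thresholds, energies)
```

All energies must be strictly positive for the logarithm to be defined, so thresholds that leave an empty binarized image should be excluded from the fit.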

Fig. 10. Registration results (a) Reference image (b) Input image (c) Rectified input image (d) Difference image of (a) and (c), Right: Effect of threshold for dI on EdI. Each curve is plotted for a different shift

The effect of RST-distortions on various similarity measures is checked; e.g., Fig. 11 shows NCC and EWDI versus RST-distortions. The curves for other similarity measures differ in steepness, which shows their sensitivity with respect to distortion. Feature sizes, noise and image type also play a role in the shape of these curves. EWDI also has a peak at the RST-parameters that align the two images exactly. The value of this peak depends on the noise, the resampling method and the type of images used. EWDI not only decreases quite rapidly and smoothly but also remains almost unchanged once the distortion exceeds a significant amount. This property makes it useful for the final analysis of RST-parameters. To see the limitations of EWDI, we used various types of images. It was observed that for images with noise and very small features EWDI still showed a peak at the correct RST-values, but the value of the peak was reduced.

Fig. 11. Effect of RST-distortions

4.2 EWDI based registration

EWDI is also tested as a similarity measure in wavelet-based registration. EWDI cannot be used at level 3 or 4 because sharp edges are not available in Id. For these levels we therefore have to use NCC as the similarity measure. Moreover, only the LL subband can be used for registration in this case.