A Dissertation

Entitled

Novel Methods for Improved Fusion of Medical Images

By

Fayadh Alenezi

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the

Doctor of Philosophy Degree in Engineering

______Dr. Ezzatollah Salari, Committee Chair

______Dr. Mansoor Alam, Committee Member

______Dr. Junghwan Kim, Committee Member

______Dr. Richard Molyet, Committee Member

______Dr. Eddie Y Chou, Committee Member

______Dr. Cyndee Gruden, Dean

College of Graduate Studies

The University of Toledo
May 2019

Copyright 2019, Fayadh Alenezi

This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.

An Abstract of

Novel Methods for Improved Fusion of Medical Images

By

Fayadh Alenezi

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Doctor of Philosophy Degree in Engineering

The University of Toledo
May 2019

Medical image fusion (MIF) is a key technique for the analysis of diagnostic images in clinical applications. MIF aims to reduce the uncertainty and redundancy that arise from examining two or more separate multi-modal images by creating a single composite image that is more useful for human interpretation. However, current MIF techniques have not successfully addressed the poor textural properties and deficient edge formation of many resulting images. In order to address these shortcomings, this dissertation proposes a variety of algorithms aimed at exploiting different combinations of well-known image processing and fusion techniques. The first algorithm exploits the properties of Gabor filtering and links maximum pixel selection with fuzzy-based image fusion in order to improve the textural and edge properties of the fused medical images. The second algorithm focuses on reducing defects associated with single images created from different modalities by combining the action of Gabor filtering, maximum pixel intensity selection and Pulse-Coupled Neural Network (PCNN) implementation. The third algorithm seeks to increase image information content and provide a complementary context for anatomical and physiological information by using a space-variant Wiener filter, followed by image enhancement with lateral inhibition and excitation in a feature-linking PCNN under maximized normalization, and then fusion using a shift-invariant discrete wavelet transform (SIDWT). The fourth algorithm focuses on increasing the quality of the source images through a preprocessing technique which uses a greedy-iterative strategy for local contrast enhancement in order to minimize global image variance, together with global and local image contrast optimization based on the human visual system and a standard fusion algorithm. The fifth algorithm attains fusion in the Discrete Cosine Transform (DCT) domain under a novel Block Toeplitz matrix designed to enhance the finer details of all input images, followed by contrast adjustment and smoothing by bilateral filters using Gaussian kernels.

All the novel MIF methods are thoroughly described, applied to a set of medical images, and then evaluated and compared to existing fusion algorithms in terms of three objective measurements, namely pixel standard deviation, root-mean-square error and image entropy. Most of these performance figures show significant improvements over the reference fusion methods, thus suggesting that the newly developed algorithms represent a valuable contribution towards progress in this important application field.


Acknowledgments

Firstly, I would like to express my sincere gratitude to my advisor, Prof. Ezzatollah Salari, for his continuous support of my Ph.D. study and related research, and for his patience, motivation, and immense knowledge. His guidance helped me throughout the research and the writing of this dissertation. I could not have imagined having a better advisor and mentor for my Ph.D. study.

Besides my advisor, I would like to thank the rest of my dissertation committee: Prof. Mansoor Alam, Prof. Junghwan Kim, Prof. Richard Molyet, and Prof. Eddie Y Chou, for their insightful comments and encouragement, and for the hard questions which prompted me to widen my research from various perspectives.

Last but not least, I would like to thank my family: my parents, my brothers, and my sister, for supporting me spiritually throughout the writing of this dissertation and my life in general.

Most importantly, I wish to thank my loving and supportive wife, Safa, and my two wonderful children, Nour and Fares, who provide unending inspiration.


Table of Contents

Abstract………………………………………………………………………...……iii

Acknowledgment……………………………………………………………………..v

Table of Contents…………………………………………………………………….vi

List of Tables…………………………………………………………………………x

List of Figures……………………………………………………………………….xii

1 Introduction……………………………………………………………………...1

2 Literature Review………………………………………………………………..6

2.1 Image enhancement………………………………………………………….6

2.1.1 Image perception and quality…………………………………………. 7

2.1.2 Image Filtering…………………………………………………………9

2.1.3 Critical review of image enhancement……………………………….14

2.2 Image fusion………………….…………………………………………….23

2.2.1 Fusion rules-based medical image fusion…………………………….23

2.2.2 Critical review of image fusion…………..…………………………..27

3 Methodological Framework……………………………………………………33

3.1 Objectives……………...…………………………………………………...33

3.2 Research methods……………..……………………………………………34

4 Fuzzy-Based Medical Image Fusion Using a Combination of Maximum

Selection and Gabor Filters…………………………………………………….36

4.1 Introduction…………………………………………………..…………….36

4.2 Proposed method……………………………………………….…………..40

4.2.1 Gabor filters……………………………………..……………………40

4.2.2 Maximum selection…………………………………………………..41

4.2.3 Fuzzy logic……………………………………………………...……42

4.3 Simulation Results………………………………………….………………46

4.3.1 Experimental Setup………………………………………………….. 46

4.4 Discussion………………………………………………………………….. 54

4.5 Conclusion…………………………………………………………………. 55

5 A Novel Pulse-Coupled Neural Network using Gabor Filters for Medical Image

Fusion…………………………………………………………………………..57

5.1 Introduction………………………………………………………………... 57

5.2 Proposed method…………………………………………………………... 58

5.2.1 Overview and Background………………………………………... 58

5.2.2 Gabor filtering……………………………………………………... 62

5.2.3 Proposed PCNN…………………………………………………… 63

5.3 Simulation Results….……………………………………………………… 67

5.3.1 Experimental setup…………………………………………………. 67

5.4 Discussion…………………………………………………………………. 76

5.5 Conclusion………………………………………………………………… 78


6 A Novel Image Fusion Method Which combines Wiener Filtering, Pulse-Coupled

Neural Networks and Discrete Wavelet Transforms for Medical Imaging

Applications…………………………………………………………..79

6.1 Introduction……………………………………………………………...….79

6.2 Proposed method……………………………………………………………81

6.2.1 Overview and background…………………………………………….81

6.2.3 Proposed Feature linking Neural Network Model …………………….84

6.2.4 Shift-Invariant Discrete Wavelet Transform (SIDWT)………………. 91

6.3 Simulation results…………………………………………………………... 92

6.3.1 Experimental setup……………………………………………………92

6.4 Discussion …………………………………………………………………...98

6.5 Conclusion …………………………………………………………………100

7 Perceptual local contrast enhancement and global variance minimization of

medical images for improved fusion………………………………………….101

7.1 Introduction……………………………………………………………….. 101

7.2 Proposed method …………………………………………………………..103

7.2.1 Overview …………………………………………………………….103

7.2.2 Source Image Preprocessing ………………………………………...107

7.3 Simulation results ………………………………………………………….111

7.3.1 Experimental setup …………………………………………………..111

7.4 Discussion ………………………………………………………………….118

7.5 Conclusion …………………………………………………………………118


8 A novel Block Toeplitz matrix for DCT-based, perceptually enhanced image

fusion………………………………………………………………………….120

8.1 Introduction ………………………………………………………………..120

8.2 Proposed Method …………………………………………………………..123

8.2.1 Overview ……………………………………………………………..123

8.2.2 Block Toeplitz matrix ………………………………………………..124

8.2.3 Contrast adjustment using adaptive histogram equalization …………126

8.2.4 Smoothing by bilateral filter using Gaussian Kernel ………………...127

8.3 Simulation Results …………………………………………………………128

8.3.1 Experimental Setup …………………………………………………..128

8.4 Discussion …………………………………………………………………137

8.5 Conclusion …………………………………………………………………138

9. Summary and future studies…………………………………………………..139

References………………………………………………………………………….143


List of Tables

Table 4.1: Pixel-wise image fusion selection criteria ...... 45

Table 4.2: Performance evaluation measures for example 2 under different orientations

of Gabor filter...... 51

Table 4.3: Performance evaluation measures for example 3 under different orientations

of Gabor filter...... 52

Table 4.4: Tabulated results in comparison with existing techniques ...... 54

Table 5.1: List of PCNN parameters values used in this study...... 68

Table 5.2: Performance evaluation measures for example 1 (Figure 11) under different

orientations of Gabor filtering ...... 70

Table 5.3: Performance evaluation measures for example 2 (Figure 12) under different

orientations of Gabor filtering ...... 72

Table 5.4: Performance evaluation measures for example 3 (Figure 13) under different

orientations of Gabor filtering ...... 74

Table 5.5: Comparison of image quality metrics for different fusion algorithms...... 75

Table 6.1: List of proposed feature-linking PCNN parameter values used in the

simulation………………………………………………………………...94

Table 6.2: Comparison of image quality metrics for fusion algorithms ...... 98


Table 7.1: List of proposed local and global contrast enhancement parameters and

values used in this study ...... 113

Table 7.2: Comparison of image quality metrics for fusion algorithms ...... 117

Table 8.1: List of proposed parameters of the tridiagonal 1-Toeplitz matrix used in

experiments 1, 2 and 3...... 130

Table 8.2: Comparison of image quality metrics for various fusion algorithms,

including the proposed one ...... 134


List of Figures

Figure 4-1: Schematic representation of the proposed algorithm...... 40

Figure 4-2: Fuzzy fusion system flow chart...... 44

Figure 4-3: Membership functions...... 45

Figure 4-4: Inputs and Gabor filtered outputs under different orientations, and resulting

fused images for example 1...... 48

Figure 4-5: Inputs and Gabor filtered outputs under different orientations, and resulting

fused images for example 2...... 50

Figure 4-6: Inputs and Gabor filtered outputs under different orientations, and

resulting fused images for example 3...... 51

Figure 4-7: Comparison of results with other image fusion techniques...... 53

Figure 5-1: Schematic representation of the novel algorithm combining Gabor filtering,

masking to select the local maximum pixel intensities, and PCNN image

enhancement, followed by image fusion ...... 60

Figure 5-2: One neuron in a neural network consisting of Input, Linking and Generator

stages...... 61

Figure 5-3: Schematic representation of classical PCNN, using eight scalar parameters:

α_f, V_f, α_l, V_l, α_θ, V_θ, β and n...... 63

Figure 5-4: Example 1: Inputs are CT and MRI images as shown in Row 1. Gabor filter

results at different orientations (θ is 0, 90, 180 and 270 degrees) are shown


in column 1 and 3, with corresponding maximum value selection results in

columns 2 and 4...... 69

Figure 5-5: Example 2: Inputs are CT and MRI images as shown in Row 1. Gabor filter

results at different orientations (θ is 0, 90, 180 and 270 degrees) are shown

in column 1 and 3, with corresponding maximum value selection results in

columns 2 and 4...... 71

Figure 5-6: Example 3: Inputs are CT and MRI images as shown in Row 1. Gabor filter

results at different orientations (θ is 0, 90, 180 and 270 degrees) are shown

in column 1 and 3, with corresponding maximum value selection results in

columns 2 and 4...... 73

Figure 5-7: Fusion results on examples 1, 2, and 3 using the proposed method, CT,

DWT, SHFV, and FMG...... 76

Figure 6-1: Schematic representation of novel algorithm combining spatial variant

wiener filter, proposed feature linking PCNN and SIDWT...... 82

Figure 6-2: Schematic of proposed feature linking PCNN model with feeding input,

linking input, leaky integrator and spike generator...... 88

Figure 6-3: Schematic of linking inputs with excitatory and inhibitory neurons...... 88

Figure 6-4: Example 1: (a) inputs, CT and MRI images; (b) high-scale, spatially variant

Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT

output)...... 95

Figure 6-5: Example 2: (a) inputs, CT and MRI images; (b) high-scale, spatially variant

Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT

output)...... 96


Figure 6-6: Example 3: (a) inputs, CT and MRI images; (b) high-scale, spatially variant

Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT

output)...... 97

Figure 6-7: Fusion results on the original multimodality test images of examples 1, 2,

and 3 using the proposed method, PCNNGM, FMG, CT, DWT, and SHFV...... 99

Figure 7-1: Schematic representation of the proposed algorithm showing global and

local contrast enhancement stages...... 106

Figure 7-2: Example 1: Input images are the computed tomography (CT) (top left) and

magnetic resonance imaging (MRI) (bottom left) images. Proposed global

and local contrast enhancement methods, and results of first-stage fused

images ("Imfuse") are presented and correspondingly labeled. The result of

the final fusion is shown at the extreme right...... 114

Figure 7-3: Example 2: Input images are the CT (top left) and MRI (bottom left) images.

Proposed global and local contrast enhancement methods, and results of

first-stage fused images ("Imfuse") are presented and correspondingly

labeled. The result of the final fusion is shown at the extreme right...... 115

Figure 7-4: Example 3: Input images are the CT (top left) and MRI (bottom left) images.

Proposed global and local contrast enhancement methods, and results of

first-stage fused images ("Imfuse") are presented and correspondingly

labeled. The result of the final fusion is shown at the extreme right...... 116

Figure 7-5: Final fusion results for image examples 1, 2, and 3 using the proposed

method, PCNNGM, FMG, SHFV, CT and DWT...... 118


Figure 8-1: Schematic representation of a novel fusion algorithm combining the

proposed DCT with global and improved human perceptual quality...... 125

Figure 8-2: Example 1. Input images are the CT (above) and MRI (below) images. The

result of the final fusion is also shown (right)...... 131

Figure 8-3: Example 2. Input images are the CT (above) and MRI (below) images. The

result of the final fusion is also shown (right)...... 132

Figure 8-4: Example 3. Input images are the CT (above) and MRI (below) images. The

result of the final fusion is also shown (right)...... 132

Figure 8-5: Final fusion results for image examples 1, 2, and 3 using the proposed

method, CT, DWT and SHFV...... 135

Figure 8-6: The performance of the proposed method in relation to others, for Example

1 in Figure 8-2 ...... 136

Figure 8-7: The performance of the proposed method in relation to others, for Example

2 in Figure 8-3 ...... 137

Figure 8-8: The performance of the proposed method in relation to others, for Example

3 in Figure 8-4 ...... 138


Chapter 1

Introduction

Image fusion has been motivated by recent advances in the field of remote sensing

(Jiang, Zhuang, & Huang, 2013). Newly available image sensors feature higher resolution at a low cost. A variety of image sensors exhibit high spatial and spectral resolution while offering faster scan rates. These sensors provide output images that are more reliable, informative and better able to capture a complete picture of the scanned environment, which makes them appealing for use in many different applications.

The use of multiple image sensors helps improve the performance of imaging systems (York & Jain, 2011). Multiple image sensors have been used extensively in remote sensing, medical imaging and surveillance systems. The information value of the data collected from the images typically depends on the number of sensors used. Therefore, the quality of the image is enhanced by increasing the number of sensors for image processing.

However, the data collected from the sensors may fail to meet some properties that are desired for certain image applications. Image fusion is used in order to enhance these key image properties and thus ensure that the collected image data meets the application needs.

Image fusion is a process where two or more different images are combined to form a composite image which is more informative than any of the source images (Sanija &

Karthik, 2015). The process aims at improving the information content of the fused image


by preserving the information from the source images while minimizing the artifacts that may be present in them (Flusser, Sroubek, & Zitov, 2007). Image fusion therefore preserves and enhances spatial and spectral resolution from two or more complementary images.

Given its benefits, image fusion has become a topic of interest for many researchers. Image fusion has found applications in image classification, aerial and satellite imaging, medical imaging, weapon detection, multi-focus imaging and defense for situation awareness

(Cheng, Han, & Lu, 2017).

Image fusion methods can be categorized by the three different processing levels on which they operate: pixel-, feature- and decision-level (Sharma, 2016). Pixel-level fusion is performed on a pixel-by-pixel basis by creating a composite image where the information associated with each fused pixel depends on the source image pixels (Li, Kang, Fang, Hu, & Yin, 2017). Pixel-level methods are mostly used to improve the performance of specific image processing tasks like image segmentation. Feature-level fusion is based on extraction of objects from the source data. The extraction of features relies on image pixel intensities, edges or textures (Ross & Govindarajan, 2004). Similar features extracted from the source images are then fused to form a composite image which is more informative, exhibiting unique features not present in any source image. Lastly, decision-level fusion is based on integrating information at higher levels of abstraction and combining results from multiple algorithms (Ross & Govindarajan, 2004). Decision-level fusion entails individual preprocessing of source images for information extraction, followed by the application of decision rules that reinforce common interpretation.

Fused images must preserve as much as possible all relevant information contained in the input images (Sahu & Parsai, 2012). The fusion process must not introduce any artifacts


or inconsistencies which could destroy the information content of the final image (Blum,

Xue, & Zhang, 2005). The fused image should suppress irrelevant features and noise to the maximum extent. In summary, the fused image must contain the maximum amount of relevant information while minimizing irrelevant details, uncertainty, and redundancy.

Image fusion algorithms can also be categorized according to the data used during fusion and the purpose of such data (Al-Azzawi, 2015). Multi-view fusion of images operates on images taken from the same modality and at the same time, but from different viewpoints. Multimodal fusion is based on images taken by different types of sensors such as visible, infrared, panchromatic, and multispectral. Multi-temporal fusion uses images taken at different times in order to detect changes between them. Multi-focus fusion is based on images of a 3D scene taken repeatedly with various focal lengths. Fusion for image restoration operates on images of the same scene and modality, either blurred or noisy and may lead to de-blurred and de-noised images. A multichannel de-convolution approach is extended to super-resolution fusion where input blurred images of low spatial resolution are fused to provide an image with a higher resolution.

Image fusion methods can be further classified into spatial and transform domain methods. The spatial domain fusion approaches include the averaging method, the maximum/minimum selection method, high-pass filtering techniques, the Brovey method, and Principal Component Analysis (PCA)- and Intensity-Hue-Saturation (IHS)-based methods (Kaur & Kaur, 2015). The disadvantage of spatial domain approaches is that they produce spatial distortions in the fused image. Transform domain-based methods, such as those based on the Laplacian pyramid transform, the discrete wavelet transform and the curvelet


transform do not suffer from these spatial distortions. Image fusion techniques range from the simplest method of pixel averaging to sophisticated, state-of-the-art methods such as multi-resolution and neural network-based fusion.

Multi-resolution transforms have been used to analyze the information content of images for the purpose of image fusion, with the help of tools like the discrete wavelet transform, the Radon transform, the ridgelet transform and the curvelet transform (Kourav

& Sharma, 2015). These methods show a better performance in terms of spatial and spectral quality of the fused image. The notion of multi-resolution analysis was first formulated by

Burt & Adelson (1983), who introduced a multi-resolution image representation called

Gauss-Laplacian pyramid. Since then, multiresolution analysis has become a very useful tool for analyzing remote sensing images. The idea is to decompose an image into a set of bandpass-filtered component images, each of which represents a different range of spatial frequency. This notion was further extended by other researchers who established a multi- resolution analysis method for continuous functions in connection with wavelet transforms

(Addison, 2005). Wavelet transforms are favored over Fourier transforms because they capture both frequency and time information (Addison, 2005).

The wavelet transform can provide efficient localization in both the space and frequency domains. After comparing them with other multi-scale transforms, Addison (2005) suggested that wavelet transforms are more compact, better able to provide directional information across all transform domain bands, and contain unique information at different resolutions.

This dissertation describes the development of new methods for multimodal image fusion with the objective of improving the performance of existing fusion methods. The

methods seek to achieve improvement in visual properties (textural properties, edge formation) as well as in the information content of the final fused image. Image fusion is regarded as a comprehensive process that integrates source image pre-processing, image fusion itself and post-processing of the fused image, seeking to incorporate essential information from different modality sensors into a composite image that is better suited for decision-making, given its increased information content and better textural and perceptual properties.

This document is comprised of an extensive literature review section (chapter 2), a brief methodological framework (chapter 3), a thorough description and full evaluation of each of the novel MIF methods (chapters 4-8) and a summary and future work (chapter 9).


Chapter 2

Literature Review

There are many image fusion techniques which prioritize different features and seek to optimize specific criteria for best fusion results. This chapter presents a discussion and literature survey of image processing techniques that are important precursors or components of the image fusion pipeline. Section 2.1 focuses on strategies that are useful for enhancing features and removing artifacts from input images, such as edge detection, textural analysis and information content extraction.

Following the review of image enhancement, a literature review of relevant image fusion techniques is provided in section 2.2 as a segue to the methods proposed in this research. These techniques include fuzzy logic, Pulse-Coupled Neural Networks (PCNN),

the Feature-Linking Model (FLM), the Shift-Invariant Discrete Wavelet Transform (SIDWT) and a Block Toeplitz matrix for DCT-based, perceptually enhanced image fusion. A critical review of these methods is also offered in this section.

2.1 Image enhancement

The use of multiple image sensors helps improve the performance of imaging systems

(York & Jain, 2011). Multiple image sensors have been used extensively on remote sensing, medical imaging and surveillance systems. The information value of the data collected from the images typically depends on the number of sensors used. Therefore, to a certain extent, the quality of the image is enhanced by increasing the number of sensors for image processing. However, the data collected from the sensors may fail to meet some properties that are desired for certain image applications. Alternatively, the quality of an 6

image may be degraded during acquisition (He, Sun, & Tang, 2013), or affected by random noise variation which needs to be filtered out (Nixa, 2012), making it unsuitable for direct use. Image enhancement, therefore, is defined as an operation aimed at accentuating features of interest in the data (He et al., 2013) and/or removing irrelevant or degraded information. Defined as such, image filtering, texture analysis, edge detection, and other image processing techniques are routinely used to enhance key properties and thus ensure that the collected image data meets the application needs.

2.1.1 Image perception and quality

Medical images are used in a variety of environments (Liu & Wang, 2014). The perceived quality of medical images varies due to different technologies used to acquire, store, transmit and display images. Visual signal distortion like noise, unwanted artifacts, and variations in visual information arising from processing affect the perceptual quality of images. In order to maximize the usefulness of medical images for the purpose of clinical diagnosis, the human perception of medical image quality must be understood, and this knowledge must be used to develop algorithms aimed at improving it.

Image quality and subjective image quality have been used interchangeably to refer to perceptual image quality (Wang & Shang, 2006). Image quality is defined by the human visual system and is perceived by a human observer as dependent on the contrast and spatial frequency of each image feature (Wang & Shang, 2006). The perceived contrast of an image increases as its spatial frequency increases. Invisibility in the human visual system occurs when the contrast required at a given spatial frequency becomes greater than unity, which can happen only when the image is degraded.

Degradation in medical images can occur at any stage in image processing.

Medical images are often generated by MRI and CT imaging systems (Mehena, 2011).

MRI images are vulnerable to artifacts that tend to degrade their perceived quality. Artifacts in MRI images can arise from non-ideal hardware characteristics, intrinsic tissue properties and their possible changes during scanning, assumptions underlying the data acquisition and reconstruction processes, and a poor choice of scanning parameters (Wang & Shang,

2006). Artifacts can be mitigated by many strategies, such as post-processing; however, their complete eradication is not easy or straightforward, and therefore remains a challenge. Many researchers, such as Gjesteby et al. (2018), have proposed methods aimed at achieving optimal perceived image quality; however, the results have not been entirely adequate, prompting further research.

In order to understand and eradicate the artifacts in MRI and CT images used for medical diagnosis, artifact classification is important.

Artifacts can be classified into two categories: unstructured artifacts, such as random noise, and structured artifacts. Random noise can be white (flat frequency spectrum) or colored

(non-flat frequency spectrum). On the other hand, structured artifacts are any type of image distortion that represents anisotropy of the spectral content of the object being scanned. An example of a structured artifact is ghosting, which generates lower-intensity double images which are shifted with respect to the original image content. Structured artifacts can also be white or colored: white ghosting, similar to edge ghosting, can be understood as the superposition of the gradient of the originally scanned object, as a double image, onto the original one. Artifacts are best described generally as image features or textures, defined as localized rules of part arrangement that remain consistent throughout an image.


In image processing, texture is used in visual perception to acquire knowledge about environmental objects and events by extracting information from the light they emit or reflect (Nathan, 2018). Visual perception is concerned with the acquisition of knowledge; perceptual capabilities are not attainable by a camera alone, and relate to objects and events in the environment. Visual knowledge about the environment is obtained by extracting information from the image, which constitutes the information-processing approach to vision. Perceptual quality is therefore related to the information content of the image (Nathan, 2018).

2.1.2 Image Filtering

Image filters can be categorized as linear and non-linear (He et al., 2013). Linear image filters produce output pixel values that are linear combinations of the pixels in the original image. Linear methods are amenable to mathematical analysis, both in the spatial and in the frequency domains, and are consequently far better understood. Nonlinear filters are more powerful than linear filters, but also more difficult to characterize: they can potentially reduce noise levels without simultaneously blurring object edges (Nathan, 2018). However, they are not entirely safe, since they can introduce artifacts into images.

The simplest and most common of all filters is the "box" or moving average filter

(Nathan, 2018). Box filters replace each pixel by the average of the pixel values contained in a square centered at that pixel. A moving average filter may form a weighted average instead of a simple average.

The box filter is defined by (He et al., 2013)


g_{ij} = \sum_{k=-m}^{m} \sum_{l=-m}^{m} w_{kl} \, f_{i+k,\, j+l} \quad \text{for } i, j = (m+1), \ldots, (n-m), \qquad (2.1)

where g_ij denotes the pixel output values, f_ij the pixel input values, and w_kl the specified weights. The weights may depend on i and j, thus allowing for moving averages which vary across the image. Box filters are typically not used in medical image fusion since they discard image edges (Nathan, 2018), given that pixel values within the same neighborhood of g can be assigned the same values as those in the borders of f. Box filters can be used as smoothing filters when the pixels in the border of g are assigned the same values as those in the borders of f (He et al., 2013). Box filters can also be used as edge detection tools when the border pixels in g are set to zero. If all the elements in w are positive, then the effect of the box filter is to smooth the image. The two most commonly used filters of this type are the moving average and the Gaussian filters.

Gaussian filters are the only ones which are separable and, at least to a lattice approximation, circularly symmetric. They also overcome another drawback of moving average filters because their positive weights decay to zero in a gradual manner. Gaussian filters have weights specified by the probability density function of a bivariate Gaussian, or Normal, distribution with variance σ², that is,

w_{ij} = \frac{1}{2\pi\sigma^2} \exp\left\{ \frac{-(i^2 + j^2)}{2\sigma^2} \right\} \quad \text{for } i, j = -[3\sigma], \ldots, [3\sigma], \qquad (2.2)

for some specified value of σ²; [3σ] represents the integer part of 3σ. Limits of ±3σ are chosen since the Gaussian weights are negligibly small beyond this range. The denominator 2πσ² ensures that the weights sum (approximately) to unity, a feature shared with conventional smoothing filters (He et al., 2013).
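A short sketch of Eq. (2.2) follows; it is also a convenient way to check that the truncated weights sum to approximately unity (the helper name is illustrative):

    import numpy as np

    def gaussian_weights(sigma):
        # Weights of Eq. (2.2), truncated at +/- [3*sigma].
        r = int(3 * sigma)                    # integer part of 3*sigma
        i, j = np.mgrid[-r:r + 1, -r:r + 1]
        w = np.exp(-(i ** 2 + j ** 2) / (2 * sigma ** 2))
        return w / (2 * np.pi * sigma ** 2)

    print(gaussian_weights(1.5).sum())  # close to 1, as noted above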


2.1.2.1 Gabor filter

Gabor filters are orientation-sensitive filters used for texture analysis. If a Gabor filter with a given orientation gives a strong response, its orientation matches that of one of the gratings in the input image.

Gabor filters are sinusoidal plane waves modulated by a Gaussian envelope, making them robust to illumination, rotation, scale, and translation (Maruthi & Sankarasubramanian, 2008). Gabor filters are capable of removing noise from the original image. They also possess optimal localization properties in both the spatial and frequency domains. Because of these properties, Gabor filters excel at feature extraction, textural analysis, disparity estimation, and edge detection in image processing and computer vision.

Medical images contain dense textural properties, which must be preserved during fusion (Lee, Yeom, Guschin, Son, & Kim, 2009). Given that Gabor filters successfully preserve the textural and edge features so inherently important for diagnostic applications, their use in the type of fusion methods that are the subject of this research is quite appealing.

The design of Gabor filters for gathering the texture information of an image, by extracting data at different orientations, uses specific trigonometric operations (Grigorescu, Petkov, & Kruizinga, 2002). In the design process, the Gabor filter is considered as a tuple of data x and y, where x and y are the textural information orientations along axes defined according to the trigonometric functions of the orientation angle θ. Gabor filter coefficients are capable of extracting the spatial coordinates of an image at different orientation angles from 0 to 2π (Han & Ma, 2007).

Gabor filtering of an image enhances its textures. Image texture is an important component of human visual perception and is useful in identifying image regions (Yi &

Tian, 2011). Image texture reveals the shape distribution of an image, including its macrostructure and microstructure, as well as the presence of spatial patterns with homogeneity properties. Image textures have been used extensively in image processing and computer vision (Chang & Kuo, 1993). In image processing, texture features are formed by using the coefficients of a certain transform of the original pixel values or, more sophisticatedly, by statistics computed from these coefficients.

2.1.2.2 Wiener filtering

In image processing, the Wiener filter is an adaptive filter that tailors itself to the local image statistics, specifically the local variance (Benesty, Chen, Huang, & Doclo, 2005). If the variance is large, the Wiener filter performs little smoothing, and vice versa. This approach often produces better results than fixed linear filtering. The adaptive filter is more selective than a comparable fixed linear filter, preserving edges and other high-frequency parts of an image

(Benesty, Chen, Huang, & Doclo, 2005). In addition, there are no design tasks; the Wiener function handles all preliminary computations and implements the filter for any given input image, although it requires more computation time than fixed linear filtering. Wiener works best when the additive noise is white (constant power across all spatial frequencies), such as white Gaussian noise.

2.1.2.3 Fuzzy filters

Fuzzy filters provide promising results in image processing tasks, coping with some drawbacks of classical filters (Mythili & Kavitha, 2011). Fuzzy filters are capable of dealing with vague and uncertain information. Sometimes it is required to recover a heavily noise-corrupted image in which many uncertainties are present; in this case, fuzzy set theory is very useful. Each pixel in the image is represented by a membership function, and different types of fuzzy rules take into account the neighborhood or other information. While conventional filters remove noise but blur edges, fuzzy filters perform both edge preservation and smoothing.

Images and fuzzy sets can be modeled in a similar way. A fuzzy set is a class of points possessing a continuum of membership grades, where there is no sharp boundary between elements that belong to the class and those that do not (Klir & Yuan, 1995). This membership grade is expressed by a mathematical function called the membership function or characteristic function, which maps each element in the set to a membership grade between 0 and 1. In this way, the image is treated as a fuzzy set.
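The sketch below illustrates this idea with a triangular membership function that maps each pixel intensity to a grade in [0, 1]; the breakpoints a, b, and c are hypothetical, application-dependent choices:

    import numpy as np

    def triangular_membership(img, a, b, c):
        # Grade rises linearly from a to the peak at b, then falls to c;
        # everything outside [a, c] gets grade 0.
        x = img.astype(float)
        rising = (x - a) / (b - a)
        falling = (c - x) / (c - b)
        return np.clip(np.minimum(rising, falling), 0.0, 1.0)

    # Treat an 8-bit image as the fuzzy set of "medium-brightness" pixels.
    img = np.arange(256, dtype=np.uint8).reshape(16, 16)
    mu = triangular_membership(img, a=64, b=128, c=192)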

2.1.2.3.1 Fuzzy filtering for impulse noise removal

A simplified fuzzy filter can be used to remove impulse noise in an image. It uses a fuzzy thresholding technique to preserve edges and fine details of the image (Abreu, Lightstone, Mitra, & Arakawa, 1996). The pixels lying outside the trimming range after ranking in the filter are further tested for noisiness by fuzzy thresholding. The algorithm uses a range of threshold values rather than a crisp threshold, since the level of contamination varies from pixel to pixel. The modified value for a noisy pixel is calculated depending on the impulse noise present in it. The filter is comprised of two parts. The first determines whether the central sample of pixels lies in the trimming range of the rank-order set; if so, it is left unchanged. Otherwise, the second part compares it with its neighboring pixels that lie in the trimming range.
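The following simplified sketch conveys the flavor of such a filter: each pixel is compared against the rank-order statistics (here, the median) of its 3x3 window, and the correction is weighted by a membership grade that ramps between two thresholds instead of switching crisply. The thresholds t1 and t2 are illustrative, and this is not the exact scheme of Abreu et al. (1996):

    import numpy as np

    def fuzzy_impulse_filter(img, t1=20.0, t2=60.0):
        f = img.astype(float)
        g = f.copy()
        for i in range(1, f.shape[0] - 1):
            for j in range(1, f.shape[1] - 1):
                window = f[i - 1:i + 2, j - 1:j + 2]
                med = np.median(window)
                d = abs(f[i, j] - med)       # deviation from local ranking
                # Fuzzy noisiness grade: 0 below t1, 1 above t2.
                mu = np.clip((d - t1) / (t2 - t1), 0.0, 1.0)
                g[i, j] = (1 - mu) * f[i, j] + mu * med
        return g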


2.1.2.4 DCT Transforms

Image fusion can be performed in various domains (El-Hoseny, Elrahman, & El-

Samie, 2017). However, many existing image fusion techniques, particularly those operating in the spatial domain, are time-consuming and inefficient.

The Discrete Cosine Transform (DCT) is mainly used in image compression applications. DCT is popular since it is efficient and fast (Naidu & Elias, 2013), and it has been used in more applications than almost any other transform. Its widespread use can be linked to its energy compaction capability, defined as how quickly the transformed signal decays toward zero: the signal extension of the DCT decays to zero within the shortest possible time. Moreover, the symmetric signal extension of the DCT has no discontinuities at the borders and yields a smooth signal. The success of DCT in image processing has also been attributed to its basis functions, which provide a good approximation to the eigenvectors of Toeplitz matrices (Yaroslavsky, 2015).
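The energy compaction property is easy to demonstrate. In the minimal sketch below (the 8x8 block and the 3x3 set of retained coefficients are arbitrary illustrative choices), a smooth block is reconstructed almost perfectly from only its lowest-frequency DCT coefficients:

    import numpy as np
    from scipy.fft import dctn, idctn

    # A smooth 8x8 block: most energy lands in the top-left
    # (low-frequency) corner of its 2-D DCT.
    block = np.outer(np.linspace(0, 1, 8), np.linspace(0, 1, 8))
    coeffs = dctn(block, norm='ortho')

    # Keep only the 3x3 lowest-frequency coefficients and invert.
    mask = np.zeros_like(coeffs)
    mask[:3, :3] = 1.0
    approx = idctn(coeffs * mask, norm='ortho')
    print(np.abs(block - approx).max())  # reconstruction error stays small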

2.1.3 Critical review of image enhancement

The boundaries in medical images are noisy, inconsistent, and incomplete, and their overall accuracy typically needs to be improved (Yun, Zhanhuai, Yong, & Longbo, 2005).

Many research efforts, such as those by Abramoff, Magalhaes, & Ram (2004), and Bao,

Zhang, & Wu, (2005) have attempted to do just that. The following is a summary of techniques that have been proposed to improve the accuracy and reduce noise and inconsistencies associated with edge detection.

Gaussian-based methods are the most popular edge detection algorithms for image processing applications (Maheshwary, Shirvaikar, & Grecos, 2018). Classical methods do not consider smoothing, but Gaussian-based methods apply smoothing filters at different scales: first a smoothing filter is applied at different levels, and then the Laplacian of Gaussian (LoG) is applied for filtering. LoG is a second-order derivative operator that does not create new zero crossings, that is, it never produces new edges. However, this method does not perform well near edges and corners, and it works well only on images in which objects are well separated and the signal-to-noise ratio (SNR) is high.
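A minimal sketch of this scheme follows, assuming SciPy's combined Gaussian-smoothing-plus-Laplacian operator (scipy.ndimage.gaussian_laplace); the test image, scale, and zero-crossing test are illustrative:

    import numpy as np
    from scipy.ndimage import gaussian_laplace

    # Smooth at scale sigma and apply the Laplacian in one step.
    img = np.zeros((64, 64))
    img[20:44, 20:44] = 1.0
    log = gaussian_laplace(img, sigma=2.0)

    # Candidate edges are zero crossings: sign changes between vertical
    # or horizontal neighbors of the LoG response.
    edges = (np.sign(log[:-1, :]) != np.sign(log[1:, :]))[:, :-1] \
          | (np.sign(log[:, :-1]) != np.sign(log[:, 1:]))[:-1, :]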

Canny (1986) proposed a new algorithm, also based on Gaussian filtering, which has become the most widely used for many image processing applications. The Canny edge detection algorithm primarily focuses on two criteria: effective localization of image data and reliable detection of its edges. The algorithm was applied to 2-D images, using two Gaussian filters, one in the vertical direction and one in the horizontal direction, to remove white noise. The Canny algorithm has a drawback: it is very sensitive to weak edges and does not perform well on blurred and shaded images.

Multi-resolution methods entail the repetition of the edge detection process for different scales or frequency bands. Many algorithms have been proposed in this area.

Other researchers proposed a new method for multi-resolution analysis based on the Canny edge detection algorithm. As the scale increases, the number of edges is reduced, and only major edges are shown. On smaller scales, both major and weaker edges are properly detected. However, this algorithm does not specify how many filters are to be used, and it also loses important information at larger scales. On the other hand, Bergholm

(1987) introduced a new algorithm based on a Gaussian filter known as edge focusing, which is aimed at reducing the effect of blurring. Both the Marr-Hildreth and the Canny edge detectors can be used for edge focusing. While Bergholm (1987) suggested a range


for the maximum scale, no indications are provided regarding the minimum scale. Also, this algorithm has one major drawback: it splits one coarse edge into several fine edges.

Lacroix (1990) eliminated the problem of splitting edges while moving from fine to coarse resolution. This method uses the Canny algorithm with four parameters: detection scales, blurring scale, local contrast, and edge type. Although it removes the problem of splitting edges, it introduces the problem of localization error. Also, no scale information is specified.

Williams & Shah (1990) proposed an algorithm to find the contour of edges with the help of multiple scales. A Gaussian operator is used to verify sizes in order to analyze the movement of edge points. By using this information, edge linking points are detected at different scales. The method employs the Canny algorithm and non-maxima suppression

(NMS) because the resulting ridges can contain more than one pixel. These points must be thinned and linked with the help of an algorithm, which assigns weights based on four parameters: gradient magnitude, contour length, curvature, and noisiness. The set of points with maximum weight gets selected. However, information regarding the choice of the value of a key scale parameter is missing.

Goshtasby & Oneill (1994) proposed an algorithm which works on a better scale-space representation of an image. This proposed method has the advantage that no new edges are formed. There is one drawback: it also requires more memory for 3D image processing.

Deng & Cahill (1993) have proposed a new adaptive Gaussian filtering algorithm for edge detection. The proposed method relies on adapting the Gaussian filters according to noise and local image data variance. These improvements come at the cost of requiring increased computational power. 16

Bennamoun & Boashash (1997) proposed a new hybrid detector that separates the tasks of edge localization and noise suppression. The hybrid detector shows good performance in terms of improvement in localization and noise removal with respect to first-order and second-order detectors. The proposed algorithm has also introduced a method to determine the optimal scale and threshold through a cost function which maximizes the probability of edge detection and minimizes the probability of false edge detection.

The wavelet transform expresses an image in two terms, shift and scale, where shift captures spatial position and scale captures frequency content. A number of researchers have used wavelet methods to improve edge detection in images. For instance, Heric & Zazula (2007) have proposed a new edge detection algorithm which is based on the Haar wavelet transform. Haar wavelets are selected because they are pairwise orthogonal and compact. The algorithm proposed by Heric &

Zazula (2007) decreases the noise level and uses edge linkage to form a contour. Shih &

Tseng (2005) introduced a hybrid algorithm which combines gradient-based edge detection and a wavelet-based method, in which edges are detected by a contextual filter and an edge tracker is used for refinement of the image.

Statistical methods, such as the one suggested by Bezdek, Chandrasekhar, & Attikouzel (1998), propose edge detection algorithms that divide the process into four parts: conditioning, feature extraction, blending and scaling. Statistical methods have been useful for estimating digital gradients and statistical features, and for enhancing edge image blending functions.


Machine learning-based methods have also been proposed. Wu, Yin, & Xiong (2007) introduced a new multilevel fuzzy edge detection algorithm for blurry images which divides the process into two steps. First, the Fast Multilevel Fuzzy Enhancement (FMFE) algorithm is used to improve contrast; second, the edges are extracted based on gradient values. This algorithm has the advantage that it can detect even thin edges and remove false edges from the image. Wu et al. (2007) compared the results of their proposed algorithm with Sobel, Canny and the traditional fuzzy edge detection algorithm.

Lu, Wang, & Shen (2003) introduced a new edge detection algorithm based on a fuzzy neural network system which is able to recover missing edges and eliminate false ones.

The whole work was divided into three stages: fuzzification, edge detection, and edge enhancement. For edge detection, a 3-layer fuzzy neural network is used, and for edge enhancement, a Hopfield neural network is used. Bhandarkar, Zhang, & Potter (1994) suggested a new edge detection algorithm which uses a genetic algorithm (GA) as an optimization technique. The GA showed good performance because of its robustness in the presence of noise, its fast convergence and the good quality of the final output edge image.

Shrivakshan & Chandrasekar (2012) authored a comparative study among various edge detection algorithms such as Robert, Sobel, Prewitt, Canny, and Laplacian of

Gaussian (LoG). Shark Fish was taken as a case study to find out the relative advantages and disadvantages of each of these detectors. The results noted that the Sobel and Prewitt edge detection algorithms are known for their simplicity but are very sensitive to noise: as the noise increases, their accuracy keeps decreasing. The zero-crossing approach considered by Shrivakshan & Chandrasekar (2012) has the advantage of fixed characteristics in all directions, but it is sensitive to noise. The Canny edge detector has better performance even


under noisy conditions but it is computationally complex. According to this survey, the performance quality of these edge detection algorithms is, in decreasing order: Canny,

LoG, Sobel, Prewitt and Roberts.

Becerikli & Karan (2005) suggested a new approach for edge detection based on Fuzzy rules. Classical edge detectors produce edges of fixed thickness. But according to Becerikli

& Karan's proposed approach, the edge thickness can be varied by changing some rules.

The fuzzy-based edge detector can be adapted according to user needs anywhere and at any time. Finally, results are compared with classical edge detectors like Prewitt, Sobel, and

LoG.

Anver & Stonier (2004) presented a new edge detection technique which is based on multiple masks instead of a single one. Single mask detectors find edges of only a specific thickness. Anver & Stonier's proposed work uses three masks for finding edges with different thickness. While the Canny edge detection algorithm uses three parameters to obtain different levels, in Anver & Stonier's proposed work only two parameters are used.

The Canny edge detection algorithm provides a binary edge map while the proposed algorithm provides a strength value corresponding to each edge pixel by using fuzzy rules.

The Anver & Stonier algorithm is less complex and faster than Canny's, thus finding application in the area of remote sensing and also in medical imaging where edges need to be found rapidly and accurately.

Suliman, Boldisor, Bazavan, & Moldoveanu (2011) introduced a new algorithm for finding edges in a grey-scale image by using morphological operators in order to thin edges down to one pixel. The authors have also suggested that their work can be extended by combining it with edge linking algorithms in order to obtain continuous edges.

Sharifi, Fathy, & Mahmoudi (2002) presented a comparative study among various edge detection techniques such as Infinite Symmetric Exponential Filter (ISEF), Canny,

Marr-Hildreth, Sobel, Kirsch, and Laplacian. Sharifi et al. (2002) used 30 images for finding advantages and disadvantages of each category of edge detection algorithm. The image detection algorithms are divided into five categories: Classical edge detectors

(Sobel, Prewitt, Kirsch), Zero Crossing (Laplacian), Laplacian of Gaussian (LoG, Marr-

Hildreth), Gaussian (Canny, Shen-Castan), and Colored Edge Detectors (fusion methods, multidimensional gradient methods, and vector methods). Two criteria are used for comparison: Signal to Noise Ratio (SNR) and Average Risk (AVR). The resulting list in decreasing order of performance quality is ISEF, Canny, Marr-Hildreth, Kirsch, Sobel,

Laplacian.

Mehena (2011) proposed a new algorithm for finding edges in medical images. Traditional edge detection methods like Roberts, Sobel, Prewitt, and Canny are based on the use of a template, and no filtering is done by these algorithms themselves. Mehena's (2011) proposed algorithm depends on a filtering mechanism applied before the edge detection process.

Saxena, Kumar & Sharma, (2013) introduced a new concept for finding edges in medical images based on a Neuro-Fuzzy approach. The authors used three parameters for evaluation: number of edges, mean square error and peak signal to noise ratio and found satisfactory results in comparison to Robert and Sobel edge detection algorithms.

Information or feature extraction in image pre- or post-processing is a technique achieved by maximizing and minimizing features of interest in images. Researchers have studied texture from different perspectives. For instance, Mir, Hanmandlu & Tandon

(1995) proposed a new pixel-based model for texture analysis where texture is described as the distribution of grey levels in the image. This method remained popular for many years, partly because its measures and techniques were very simple and easy to define. It requires no changes in existing classification methods and is applicable to single-band images, although it requires more memory and computation time. The method is flexible because some parameters like window size, orientation, and radiometric resolution reduction can be adjusted.

The linear feature extraction technique proposed by Nevatia & Babu (1980) is the most basic of these methods. It maps parameter space to feature space with the help of a linear transformation matrix. A number of algorithms have since been proposed, but Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are the most commonly used. Jolliffe (1990) described PCA as a feature extraction technique. PCA has been applied to areas like pattern recognition and computer vision. PCA is an efficient algorithm because it has polynomial complexity. It has a drawback, however: it can be applied only to linear feature subspaces, not to non-linear ones. As a consequence, PCA fails as data become more complicated. Also, the user does not know how many principal components should be kept.
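As a minimal sketch of this idea (the feature matrix is synthetic and the helper name is illustrative), PCA centers the data and projects it onto the k directions of largest variance, obtained here from the singular value decomposition:

    import numpy as np

    def pca(X, k):
        # Center the data, then project onto the top-k principal
        # components (right singular vectors of the centered matrix).
        Xc = X - X.mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
        return Xc @ Vt[:k].T

    features = np.random.rand(100, 16)   # e.g. 16-D texture features
    reduced = pca(features, k=3)         # 3 principal components kept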

Soentpiet (1999) suggested a technique known as Kernel Principal Component Analysis (KPCA), an extension of PCA. While PCA can only be applied to linear feature subspaces, KPCA eliminates this limitation by extending the method to non-linear dimensions. KPCA can also handle large numbers of samples without massive computation.

For the purpose of content-based image retrieval, Ramamurthy & Varadarajan (2012) combined three types of features: shape, texture, and resolution. Seven types of moments were calculated for shape features; texture features were extracted using a grey-level co-occurrence matrix; and resolution features were extracted using grey-scale enhancement. This algorithm enhances the texture of the image by properly tuning the threshold, and it also considers color and edge information.

Khalid, Yusof & Meriaudeau, (2010) surveyed various feature extraction techniques for wood images, such as Grey Level Co-occurrence Matrix (GLCM), Local Binary

Patterns (LBP), wavelet transform, Law’s mask, and ranklet granulometry. The authors divided all feature extraction techniques into different categories; Linear, Quadratic

Classifier, Neural Network, Support Vector Machine, and K-nearest neighbor. All these feature extraction techniques are compared in terms of highest classification rate, running time and computational complexity. The LBP feature extraction technique was found to be the most appropriate for finding texture from wood images.

Nithya & Santhi (2011) presented a comparative study among various feature extraction techniques used for pattern recognition aimed at helping find breast cancer in mammographic images. Nithya & Santhi (2011) studied three types of feature extraction techniques: intensity histogram-based, GLCM, and intensity-based, using a total of 250 mammographic images, one half of which were normal, and the rest abnormal. For classification process, the proposed work uses a neural classifier based on the extraction of three types of features. Performance of each classifier is tested and then the classification success rate of the intensity histogram-based technique is found to be 92%, while for the intensity-based technique it is 96%. The highest performance is shown by GLCM, which is 98%. The proposed work can be extended by adding more classifiers for comparison.


Chadha, Mallik & Johar (2012) presented a comparative study among various feature extraction techniques, and also suggested a combination of them in order to achieve a performance improvement. Chadha, et al. (2012) studied various feature extraction techniques for improving Content-Based Image Retrieval (CBIR) in terms of speed and accuracy. CBIR is a technique for finding images based on their descriptors like texture, color, and intensity. The study focuses on texture features extraction techniques like average RGB, color moments, co-occurrence, local color histogram and geometric moment, which are compared in terms of accuracy, Redundancy Factor (RF) and required time. It was found that the average RGB technique has an accuracy of 56%, a RF of -0.105 and it requires a time of 8 seconds. The color moment technique resulted in output values of 48%, 3.45 and 12.4 seconds, respectively. The co-occurrence technique produced outcomes of 44%, -0.52 and 9.7 seconds, respectively. The local color histogram technique resulted in 39%, 2.68 and 12.8 seconds, respectively. The global color histogram technique exhibited 17%, -0.90 11.6 seconds, respectively. The authors further suggested that a combination of these approaches would result in an increased accuracy of 92% but at the cost of required time, which would grow to 51.3 seconds. Image cropping, a technique which concentrates the algorithm action only on a certain region of interest instead of the whole image, was used to reduce simulation load.

2.2 Image fusion

2.2.1 Fusion rules-based medical image fusion

Multimodal image fusion techniques have had a significant role in superimposing multi-sensor outputs into a single image. Existing fusion rules, such as the maximization and averaging rules, operate widely at the pixel level. Yang & Wei (2013) proposed a fuzzy-based image fusion rule, in which two fuzzy-based rules are used for infrared (IR) and visible light source images. The generalized fusion rules can be classified into fuzzy logic-based, Human Visual System (HVS)-based and artificial intelligence-based. Consequently, the literature review of fusion rules provided next is organized into three subsections, following this classification.

2.2.1.1 Fuzzy logic-based fusion rule

Fuzzy logic is presented in natural linguistic terms and has therefore found many applications in image processing (Ross, 2005). Fuzzy-based selection has been used widely in multimodal image fusion. Yue, Wu, Pan & Wang (2013) used a fuzzy clustering technique to fuse Electrical Tomography (ET) images; their experiments indicated that the proposed method produces high-quality ET images.

Balasubramaniam & Ananthi (2014) proposed a fusion technique using intuitionistic fuzzy sets and tested it on Magnetic Resonance (MR) and Computed Tomography (CT) medical images. Their experimental results show that the luminance and contrast of the proposed technique are better than those of all other compared methods.

Saeedi & Faez (2012) proposed an image fusion approach using fuzzy logic and a population-based optimization technique. Their proposed method uses a dual-tree discrete wavelet transform (DT-DWT) to decompose the source images, and applies a fuzzy-based approach to fuse the high-frequency wavelet coefficients of the IR and visible images. The final fused images show improved subjective and objective performance compared to previous image fusion methods.

Jiang & Tian (2011) proposed a fuzzy-based image fusion approach using a Self-Generating Neural Network (SGNN). Experimental results demonstrate that the proposed fuzzy fusion scheme outperforms most region-based fusion methods, such as region-based fusion using wavelet multi-resolution (MR) segmentation and region-based fusion using tree-structure wavelet MR segmentation. The experiments by Jiang & Tian (2011) produced results that are superior both in terms of visual effect and objective evaluation criteria.

2.2.1.2 Human Visual System-based Fusion Rule

Vision is the most important sensor of human beings (Mandal, 2003). Human Visual System (HVS)-based techniques have been used by many researchers in various image processing applications. Peng & Varshney (2015) proposed an image segmentation algorithm using an HVS technique, which applies an integrated energy function to segmentation quality metrics based on perceptual properties of the HVS. The energy function helps encode the HVS properties from both region-based and boundary-based perspectives. Experimental results showed improved objective performance of the proposed method when compared to existing methods.

Liu, Yin, Chai & Yang (2013) introduced a fusion approach using the HVS for remote sensing applications. This method uses a weighted fusion rule to fuse the high-frequency coefficients of the source images. Liu et al. (2013) combined their method with a pulse-coupled neural network (PCNN) fusion rule to exploit the low-frequency coefficients of CT and MRI images. Comparative experiments revealed that the proposed method produces better results than existing techniques.

Bhatnagar, Wu & Liu (2013) proposed an HVS-based medical image fusion technique, where MRI, CT, Single-Photon Emission CT (SPECT) and Positron Emission Tomography (PET) images are used to test the proposed technique. The technique decomposes all the source images using the framelet transform to help combine the low- and high-frequency coefficients. The low-frequency coefficients are used to imitate the visibility measure, while the high-frequency coefficients are used for texture information. Fused images are constructed by the inverse framelet transform. Experimental results show superior objective performance compared to existing methods.

2.2.1.3 Artificial Intelligence-Based Fusion Rule

Highly accurate fused image results are an essential requirement in medical imaging (Qu, Zhang, & Yan, 2002). Artificial intelligence-based fusion techniques have been found helpful in achieving this goal. Artificial intelligence-based medical image fusion methods are neural network-based techniques used to predict the output or forthcoming stages from a set of training features, while optimization algorithms can dynamically select adaptive network parameters such as weight, threshold and scale (James & Dasarathy, 2014). Many studies have proposed various fusion techniques based on artificial intelligence. Xu et al. (2016) proposed an adaptive Pulse-Coupled Neural Network (PCNN)-based multimodal fusion method using a modified Quantum-behaved Particle Swarm Optimization (QPSO) algorithm. The technique first processes the source images with the QPSO-PCNN model, which helps find the optimal parameters for the source images. The experimental results illustrated that the proposed method exhibited better performance than a set of reference fusion methods.

Singh, Gupta, Anand & Kumar (2015) proposed a biologically inspired spiking neural network for MR-CT image fusion. The technique combines the features of the non-subsampled shearlet transform (NSST) and a spiking neural network. NSST is used to provide a better representation of the image coefficients. The source images are first decomposed by NSST; the regional energies are used to fuse the low-frequency coefficients, while the high-frequency coefficients are fused using a PCNN model. The fused images are produced via inverse NSST. The combination of NSST and PCNN helps retain the edges and detail information of the source images. Experimental results showed improved objective performance.

2.2.1.4 Pulse-Coupled Neural Networks

Recent image fusion studies, such as those by Xu, Du & Li (2013) and Liu, Li & Caiyun (2012), have combined contourlet transforms, Pulse-Coupled Neural Networks (PCNN) and fuzzy logic, demonstrating some improvements. Several published studies have implemented PCNN to sort and filter source images before fusion (Broussard & Rogers, 1996; Blasch, 1999; Xu & Chen, 2004; Li & Zhu, 2005), and demonstrated improved edges and textures, or contrast preservation, in the resulting fusion. Other approaches focus on the way PCNN is implemented, by using sub-netted PCNN components (Zhang & Liang, 2004), by operating in a region-based schema (Li, Cai, & Tan, 2006), or by modifying the PCNN to automatically adjust its own linking coefficients (Wang & Ma, 2008). These approaches are time-consuming and require complex computations (Ma, Zhan, & Wang, 2010; Liu, Li, & Caiyun, 2012; Wang, Zhao, Dai, & Iwahori, 2015), and still fail to take into account the characteristics of the human visual system and edge detection. If optimized for human perception, results would take advantage of the improved edge completion and texture-recognition capabilities of the HVS (Broussard & Rogers, 1996).

2.2.2 Critical review of image fusion

Image fusion is dedicated to generating a single enhanced image that is more appropriate for human visual perception, object detection and target recognition (Sehasnainjot, 2014). A number of researchers have developed different types of image fusion. Sahu & Parsai (2012) discussed numerous fusion techniques, such as DWT-based, PCA-based, the average method, select maximum and select minimum. By comparing and contrasting these techniques, the authors help guide the selection of fusion techniques in future research work. The paper concludes that DWT and PCA with morphological processing will improve the quality of the fused image.

Anita & Moses (2013) considered problems of existing image fusion techniques, such as spatial distortion and color distortion, which arise while performing fusion with techniques like Hue-Intensity-Saturation (HIS), PCA and the Wavelet Transform. These authors also proposed a new fusion technique called linear pixel fusion, whose main advantages are that the fused image requires no additional transformations and that its color is similar to the natural color.

He, Wang & Amani (2004) suggested that the wavelet technique is far better than other fusion techniques. The authors proposed a new and original method to fuse a high-resolution image and a low-resolution image, regardless of the spectral relationship between them.

Naidu & Raol (2008) implemented pixel-level image fusion using the wavelet transform and principal component analysis in a Matlab® environment. The degraded performance of the average method is demonstrated. The paper concludes that DWT with a high-pass filter performed better than any other considered technique. Zhang & Cao (2013) introduced a method based on wavelet decomposition and different characteristics of wavelet theory. The authors used disassembled images (different frequency sub-bands) in order to preserve all information and thus attain a better fusion.

Desale & Verma (2013) discuss the formulation, process flow diagrams and algorithms of PCA-, DCT- (Discrete Cosine Transform) and DWT-based image fusion techniques. A comparative analysis of these techniques is performed and presented in the form of tables. PCA and DCT are shown to be more conventional fusion techniques with many drawbacks, whereas DWT-based techniques are more appealing, as they provide better results.

Huang & Chen (2002) proposed that pixel-level fusion by DWT on scene images removes imperfections such as images covered by clouds and their shadows, which cannot be achieved by using average fusion techniques. On the other hand, Yang (2010) proposed that DWT-based fusion of medical images helps extract features that are not visible to the naked eye. Furthermore, Sapkal & Kulkarni (2013) proposed an image fusion algorithm based on fast DCT with different fusion rules, and discussed the limitations of DWT regarding the fusion of images containing curves. In this algorithm, the statistical analysis of medical images was done using seven different quality metrics.

Prakash, Srivastava & Khare (2013) proposed a pixel-level image fusion scheme using multi-resolution Biorthogonal Wavelet Transform (BWT). Maximum fusion rule was used to fuse wavelet coefficients at different decomposition levels. BWT-based fusion is shown to be capable of preserving edge information while reducing the distortions in the fused image because of two important BWT properties, namely wavelet symmetry and linear phase.

Savic & Babic (2012) gave a review of related image fusion methods, such as pyramidal decomposition algorithms, DWT and Empirical Mode Decomposition (EMD). Subjective analysis of these methods, performed on an in-house multi-focus image dataset, shows the superiority of a fusion method based on the first level of EMD.

Lee, Yeom, Guschin, Son & Kim (2009) presented an algorithm for detecting objects concealed underneath a person's clothing by using images obtained through a form of electromagnetic radiation. The Symlet wavelet was used to fuse a visual image and a Passive Millimeter-Wave (PMMW) image, applying the minimum fusion rule to Correspondence Analysis (CA) coefficients and the average fusion rule to detail coefficients.

Huang, Liu, Zhang & Hou (2009) proposed a thresholding method based on wavelet fusion of color subband images. An RGB image is decomposed into three sub-bands of red, green and blue. Then, DWT is applied to each band, followed by the application of maximum and minimum fusion rules. The method produced images of a higher quality than the original image.

2.2.2.1 Existing fusion methods with edge, texture and information content enhancement

A study authored by Sudharani, Hemaltha & Deepa (2015) is aimed at generating a fused image with a more accurate description of the scene than any of the individual source images, more suitable for human and machine perception or for further image processing and analysis tasks. The study also aims at improving the information available for medical image fusion, which requires the fused image to contain additional clinical information not apparent in the separate images. The study used MRI and PET images from two different sources, with DWT at the initial stage. The fusion of these images occurs at the second stage, based on the fuzzy rules formulated in the algorithm. Then, an inverse DWT is applied at the consecutive level, which produces a single combined output image. The results of this study, which used a fuzzy logic-based algorithm and a combination of the wavelet transform with edge-based fuzzy rules, were better than those of other existing fusion methods, such as those by Javed, Riaz, Ghafoor, Ali & Cheema (2014) and Rao, Seetha & Prasad (2012). However, while the algorithm focused on enhancing the edges of the image, it ignored textual properties.

A study by Das & Kundu (2011), based on the Modified Spatial Frequency (MSF) in the Discrete Ripplet Transform (DRT) domain, is used to motivate Pulse-Coupled Neural Networks (PCNN). The low-frequency sub-bands are fused using the 'max selection' rule, and the high-frequency sub-bands are fused using the PCNN and the MSF. Both visual and quantitative performance evaluations are made and verified for the algorithm. Performance comparisons of the proposed method with some of the existing Medical Image Fusion (MIF) schemes show that the proposed method performs better. The focus of this algorithm is the contrast of the fused images.

Liu & Wang (2018) proposed an effective method for image fusion based on the non-subsampled contourlet transform (NSCT) and a modified PCNN (MPCNN). Their algorithm decomposes the source image into low-frequency and high-frequency components using NSCT, and then fuses the low-frequency components using regional statistical fusion rules. The high-frequency components, in turn, are fused by calculating the spatial frequency (SF), which is input into the MPCNN model to obtain the relevant coefficients according to the fire-mapping feature of the MPCNN. The final image is reconstructed by inverse transformation of the low-frequency and high-frequency components. Compared with the wavelet transform and the traditional NSCT algorithm, experimental results indicate that this method achieves improved results both in terms of human visual perception and objective evaluation.

Chapter 3

Methodological Framework

3.1 Objectives

The extensive literature review presented in the previous chapter has covered many research advances in the areas of image enhancement, edge detection, perceptual quality and image fusion. However, this review has also revealed that existing image fusion and enhancement methods have not successfully addressed persisting shortcomings of image fusion, such as:

- the poor visual properties of source images, leading to less-than-ideal information quality;

- the deficient edge formation of many resulting fused images;

- the loss of textural information and visual detail during image fusion, and

- the degradation of visual quality due to excessive image variance.

These findings indicate that new techniques need to be designed and evaluated in order to further improve the performance of multimodal image fusion techniques. In this dissertation, a set of novel methods are developed to address the problems highlighted above. These new methods are aimed at exploiting well-known properties of some of the image processing techniques presented in the previous section. Consequently, the objectives of this work are as follows:

1. To develop a fuzzy-based medical image fusion method using a combination of maximum selection and Gabor filtering;

2. To develop a novel Pulse-Coupled Neural Network using Gabor filtering for medical image fusion;

3. To develop a novel image fusion method that combines Wiener filtering, Feature Linking Model (FLM)-PCNN and the Shift-Invariant Discrete Wavelet Transform;

4. To develop a novel local and global contrast enhancement method for better medical image fusion;

5. To evaluate all of the newly developed methods in terms of a selected set of objective performance measurements;

6. To compare these results with a set of previously available fusion algorithms in terms of the same objective performance measures;

7. To discuss and analyze these results in order to assess the fulfillment of these objectives and provide research guidelines for future works.

3.2 Research methods

As mentioned above, a variety of MIF algorithms have been developed by combining well-known image processing and fusion techniques. Matlab 2016b is the software platform on which all algorithms have been implemented, using mostly originally written code plus some standard Matlab libraries from the Image Processing Toolbox. The parameters of the proposed algorithms have been chosen after several trial experiments in order to find those that produce optimal results. The optimal parameters are reported in this document, along with other implementation details of the newly designed methods.

A set of medical images has been obtained online in order to perform the experimental evaluation of the new techniques, as well as of the benchmarking reference algorithms. Each fusion experiment is conducted on two medical images of different nature: one Computed Tomography (CT) scan and one Magnetic Resonance Image (MRI). The comparison against the pre-existing reference methods is aimed at providing the necessary evidence to properly assess the potential value and relevance of the contributions stemming from this research effort. Both numeric and visual results of all MIF techniques are provided in this report. Finally, an overall assessment of the results will point at significant directions for future research and development.


Chapter 4

Fuzzy-Based Medical Image Fusion Using a Combination of Maximum Selection and Gabor Filters

4.1 Introduction

Image fusion (IF) techniques typically modify original images at one or more of four different levels, to varying degrees. These levels are termed the feature level, signal level, decision level and pixel level (Maruthi & Sankarasubramanian, 2008). Most IF methods operate at the pixel level, with changes made in either the spatial or the transform domain. A set of algorithmic (linear or non-linear) rules is often used in spatial domain methods (Manchanda & Sharma, 2016), while methods operating in the transform domain require that the images be transformed, modified, and then inverse-transformed back to the original image domain after fusion (Maruthi & Sankarasubramanian, 2008).

In medical image fusion (MIF), multi-modal images are fused, allowing for the visualization of information from different modalities. MIF regularly incorporates pixel-level fusion techniques to create fused images containing useful measurable quantities (Dammavalam, Maddala, & Prasad, 2012). Pixel-level fusion is easy to implement and computationally efficient. The most common fusion algorithms utilizing pixel-level techniques include the Average Method, the Maximum Selection Method, Principal Component Analysis and the Laplacian Pyramid Method (Rajkumar, Bardhan, Akkireddy, & Munshi, 2014; Singh, Gupta, Anand, & Kumar, 2015).

Multi-modality based diagnostic interpretation is a current standard in medical diagnosis. The use of MIF allows for the visualization of diagnostic, physiologic and anatomic detail, including changes in metabolism, blood flow, regional chemical composition and absorption (Rajkumar, Bardhan, Akkireddy, & Munshi, 2014; Dammavalam, Maddala, & Prasad, 2012). Image modalities such as CT, X-ray, DSA, MRI, PET and SPECT are regularly combined in order to obtain a diagnosis and design a treatment plan for patients. This process is improved when visualization of fused images is possible.

MIF seeks to combine the most useful information from each modality. For example, Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Ultrasonography (USG) and Magnetic Resonance Angiography (MRA) produce high-resolution images with excellent anatomical detail and precise localization capability (Swathi, Sheethal, & Paul, 2016). In contrast, Positron Emission Tomography (PET), Single-Photon Emission Computed Tomography (SPECT) and functional MRI (fMRI) produce low-spatial-resolution images containing functional information, suitable for the detection of cancer and related metabolic abnormalities, but lacking localization precision. Image fusion allows information from multiple modalities to be brought together to create more useful images (Abramoff, Magalhaes, & Ram, 2004). For instance, CT images capture the anatomical structure of bone tissues, while MRI images show the anatomical structure of soft tissues, organs and blood vessels (Broussard & Rogers, 1996). Thus, CT and MRI images largely complement each other, offering more useful information when they are fused than when they are individually visualized (Rajkumar, Bardhan, Akkireddy, & Munshi, 2014).

In other words, fused images resulting from CT and MRI sources can provide anatomical structural information about both hard and soft tissues (Das & Kundu, 2013).

However, inherent limitations have been found in current fusion methods, both at the pixel and transform-domain levels. Methods operating at the pixel level (spatial domain) typically yield low-contrast images (Mitchell, 2010). Similarly, transform-domain fusion methods based on multi-scale approaches tend to result in poor edge definition, and are therefore less useful for clinical diagnosis.

Even recently developed, improved MIF methods suffer from these limitations. For instance, techniques based on intensity-hue-saturation (IHS) and principal component analysis (PCA) exhibit spectral degradation, which compromises their usefulness. Furthermore, IHS deals only with color images and lacks the ability to produce results for black-and-white source images (Das & Kundu, 2012). Finally, pyramidal image fusion methods, because they lack any spatial orientation selectivity in the decomposition process, result in blocking effects (Das & Kundu, 2011).

In fact, each of the existing image fusion approaches has its advantages and disadvantages. For instance, simple averaging fusion techniques, which are among the most straightforward and easy-to-understand MIF algorithms, produce unclear fused images (Kuruvilla & Anitha, 2014). Simple maximum fusion, on the other hand, produces highly focused images, but this feature results in blurring, which affects the local contrast significantly (Mitchell, 2010). Other transform-domain fusion techniques, such as the Discrete Wavelet Transform (DWT), the Dual-Tree Complex Wavelet Transform (DT-CWT) and the Curvelet Transform, despite their complexity and excellent performance, underperform when handling long curved edges and are more sensitive to directional information (Tank, Shah, Vyas, Chotaliya, & Manavadaria, 2013). The widely used Wavelet Transform (WT) can indeed preserve spectral information efficiently but cannot express spatial characteristics as well, which results in the loss of salient features of the source images and the introduction of artifacts and inconsistencies in the fused results (Kaur & Kaur, 2016).

Recently developed algorithms meant to overcome these limitations have met with only limited success. Those based on multi-scale geometric analysis (MGA) tools such as Curvelet, Contourlet and Ripplet (El-Hoseny, Elrahman, & El-Samie, 2017) have improved the fused image results in some ways, but they have failed to resolve issues related to poor texture and deficient edges (Gonzalez & Woods, 1993).

Some researchers have noted that the use of fuzzy logic offers solutions to these problems. Fuzzy logic provides the basis for the approximate description of different functions; hence it has found numerous applications in sophisticated systems. The fuzzy transform (Perfilieva, 2006) can indeed preserve edge formation, remove noise and smooth the images (Manchanda & Sharma, 2016). These properties of the fuzzy transform have been applied successfully in image fusion.

Based on these observations, this section proposes a novel fuzzy-based medical image fusion method that combines maximum selection and Gabor filtering. The proposed method implements a fusion algorithm whose resulting fused images demonstrate improved edge clarity and texture features, as measured by high standard deviations of pixel values (Yang, Fang, & Lin, 2015). The method incorporates maximum selection and Gabor filtering, together with fuzzy logic, to create a fused image with very high contrast.

The rest of this chapter is organized as follows: section 4.2 introduces the proposed method, giving a detailed explanation of the tools it uses; section 4.3 presents some simulation results, which are then discussed in section 4.4; and finally, section 4.5 provides the summary of this research.

4.2 Proposed method

The proposed method is pictorially summarized in Figure 4-1, and the following subsections address each of its significant blocks.

Figure 4-1: Schematic representation of the proposed algorithm.

4.2.1 Gabor filters

Because of their properties, as examined in section 2.1.2.1, Gabor filters excel at feature extraction, texture analysis, disparity estimation and edge detection in image processing and computer vision (Singh, Agrawal, & Gupta, 2015).

The filtering block in Gabor filtering implements one or more convolutions of the input images with the 2-D Gabor function represented by (Rodriguez, Mitra, Thampi, & El-Alfy, 2016)

g_{\lambda,\theta,\varphi,\sigma,\gamma}(x, y) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(\frac{2\pi x'}{\lambda} + \varphi\right),   (4.1)


where

x' = x\cos\theta + y\sin\theta   and   y' = -x\sin\theta + y\cos\theta.

In equation 4.1, λ is the wavelength of the sinusoidal factor and is equal to 3.5, θ is the angular orientation of the filter normal to the Gabor function, φ is the phase, σ is the standard deviation of the Gaussian envelope and γ is the aspect ratio.

As seen in Figure 4-1, the Gabor filter constitutes the initial stage of the proposed technique. First, a Gabor filter with optimum parameters is created. The optimum parameters chosen for this study are as follows: the filter is set to a 10 x 10 size, with ktype values ranging from 0 to 255. Orientation angles (θ) of 0, 90, 180 and 270 degrees are considered, with γ = 1.25, ψ = 0, λ = 3.5 and a normalized bandwidth of 2.8. The images are then converted to gray scale. The texture of the fused image is extracted after transforming the RGB color to the YCbCr color space, where the Y value represents the luminance component and Cb, Cr represent the chrominance components of the image. The images are then passed through the Gabor filters represented by equation 4.1. The maximum responses obtained among the four considered orientations θ are then passed on to the maximum selection stage.
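To make this stage concrete, the following Matlab sketch builds a Gabor kernel directly from equation 4.1 and applies it in the four orientations listed above. It is a minimal illustration rather than the original implementation: the input file name, the 'replicate' boundary option and the bandwidth-to-σ conversion are assumptions.

```matlab
% Minimal sketch of the Gabor filtering stage (equation 4.1); parameter
% values follow the text above, everything else is an assumption.
I = im2double(rgb2gray(imread('ct_slice.png')));   % hypothetical input file
thetas = [0 90 180 270];
responses = zeros([size(I) 4]);
for k = 1:4
    responses(:,:,k) = imfilter(I, gaborKernel(10, 3.5, thetas(k), 0, 1.25, 2.8), 'replicate');
end
G = max(responses, [], 3);       % strongest response across the four orientations

function g = gaborKernel(ksize, lambda, thetaDeg, psi, gamma, bw)
    % sigma derived from the normalized bandwidth bw (standard conversion)
    sigma = lambda/pi * sqrt(log(2)/2) * (2^bw + 1)/(2^bw - 1);
    half = (ksize - 1)/2;
    [x, y] = meshgrid(-half:half, -half:half);
    th = deg2rad(thetaDeg);
    xp =  x*cos(th) + y*sin(th);                  % rotated coordinate x'
    yp = -x*sin(th) + y*cos(th);                  % rotated coordinate y'
    g  = exp(-(xp.^2 + gamma^2*yp.^2)/(2*sigma^2)) .* cos(2*pi*xp/lambda + psi);
end
```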

4.2.2 Maximum selection

Selecting the appropriate image pixel for the fused image is a critical stage in pixel-level image fusion (Yi & Tian, 2011). Unlike other fuzzy fusion techniques, the selection of a pixel for fusion in our algorithm is based on the centroid of the chosen area after smoothing or averaging. This procedure ensures that the pixel with maximum intensity is selected and used as the resultant pixel for the fuzzy rules and the membership sets.

In this way, no compromise needs to be made in separating the noise from the useful information contained in the original images. During pixel selection, the pixel intensities for the Gabor-filtered images corresponding to the input pairs are compared. A binary selection matrix and its inverse are then generated (Selvakumari & Aravindh, 2014).

The selection matrix is applied to the first Gabor-filtered image, while the inverse selection matrix is applied to the second Gabor-filtered image. The resulting pixel values become the multiple inputs to the Fuzzy Inference System (FIS) in the fuzzy logic stage.
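A minimal sketch of this selection step, assuming G1 and G2 are the two Gabor-filtered images (same size, double precision); the variable names are illustrative.

```matlab
% Binary selection matrix and its inverse, as described above.
S  = double(G1 >= G2);   % 1 where the first image has the larger intensity
Si = 1 - S;              % inverse selection matrix
P1 = S  .* G1;           % maximum-intensity pixels kept from image 1
P2 = Si .* G2;           % remaining pixels kept from image 2
% P1 and P2 are then supplied as the multiple inputs to the FIS stage.
```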

4.2.3 Fuzzy logic

4.2.3.1 Fuzzy logic in image fusion

Fuzzy logic exploits human reasoning expressed in natural language to solve uncertainty and redundancy problems (Jiang & Tian, 2011). Linguistic terms can be directly encoded as algorithmic rules, which enhances its usability in many applications. For example, it can logically manage unclear boundaries of potential regions of interest. Therefore, fuzzy logic has found applications in areas where uncertainty exists and no known mathematical relationships are available.

In image fusion, fuzzy logic has been used to create more useful images. Fuzzy logic fuses images based on pixel intensities using simple rules that are more easily implemented, compared to other methods (Mas, Monserrat, Torrens, & Trillas, 2007).

Fuzzy Inference Systems (FIS) are a set of methods for implementing fuzzy logic. FIS models can be of either Mamdani or Sugeno type (Mas, Monserrat, Torrens, & Trillas, 2007). Mamdani models have constant outputs, while Sugeno models permit polynomial outputs. In the proposed algorithm, only Mamdani fusion models are considered, which corresponds with the pixel-constant output required for image fusion.

FIS helps map multiple inputs into a single output. These multiple inputs are converted to linguistic variables before mapping to a single output using a set of predefined membership functions. This process is known as fuzzification and requires the determination of a degree of ownership of each input pixel in reference to a suitable fuzzy set. After fuzzification, the FIS engine is invoked to allow fuzzy operators to be applied to the fuzzified input images to produce the output image. The resulting images for the set of fuzzified input images are then aggregated and defuzzified to produce the final desired output as shown in Figure 4-2.

Figure 4-2: Fuzzy fusion system flow chart.


4.2.3.2 Membership function

The membership function of a fuzzy set (as shown in Figure 4-3) is defined as a mapping relationship of input pixels from X into the interval [0,1]. Therefore, the fuzzy membership function determines the "degree of belonging" of each input pixel intensity. Consequently, the fuzzy membership function dictates the appropriate fuzzy set for the FIS in the fuzzy logic.

The image pixel values, ranging from 0 to 255, are segmented into three regions labeled low, medium and high intensity. The resulting regions can be characterized as linguistic variables for use as fuzzy set membership functions. They are defined as

U = {low, medium, high},

where low, medium and high denote the regions, each represented by a membership function.

Figure 4-3: Membership functions.
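A sketch of one possible realization of these three membership functions is given below. The triangular shapes and the breakpoint at 128 are assumptions, since the exact shapes are given only graphically in Figure 4-3.

```matlab
% Illustrative triangular membership functions over the 0-255 intensity range.
x = 0:255;
muLow  = max(0, min(1, (128 - x)/128));    % 1 at intensity 0, falls to 0 at 128
muMed  = max(0, 1 - abs(x - 128)/128);     % peaks at 1 for intensity 128
muHigh = max(0, min(1, (x - 128)/127));    % 0 at 128, rises to 1 at 255
```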

4.2.3.3 Fuzzy rules

The fuzzy operator is logical rule-based and implemented in the form of IF-THEN statements that determine the pixel value of the result:

IF x IS a AND y IS b THEN z IS c,


which can also be represented by

I = a × b → c,

where x is input 1, y is input 2, z is the output, and a, b and c are the membership values.

Following the above notations, pixel-wise image fusion also takes a similar course using the following set of nine rules:

Table 4.1: Pixel-wise image fusion selection criteria.

Img 1     Img 2     ImF
Low       Low       Low
Low       Medium    Medium
Low       High      High
Medium    Low       Medium
Medium    Medium    Medium
Medium    High      High
High      Low       High
High      Medium    High
High      High      High

where Img 1 is a pixel of the Gabor filter output for image 1 (CT, the first input), Img 2 is a pixel of the Gabor filter output for image 2 (MRI, the second input), and ImF is the output (fused image).

4.2.3.4 Aggregation and defuzzification

After the fuzzified images are generated, all outputs are aggregated according to a maximum selection function:

\mu_A(x) = \max\{O_1, O_2, \ldots, O_9\},

where \mu_A(x) is the aggregated curve and O_k, for k = 1, 2, \ldots, 9, is the output of the k-th rule. The output is then defuzzified using the centroid of area (COA), resulting in the desired fused image pixel values. The COA is given by

X_{COA} = \frac{\int \mu_A(x)\, x\, dx}{\int \mu_A(x)\, dx}.   (4.2)
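For a single pixel pair, the whole chain (fuzzification, the nine rules of Table 4.1, max aggregation and COA defuzzification) can be sketched as follows. The min AND-operator, the clipped consequents and the triangular membership functions are standard Mamdani assumptions rather than details taken from the original code.

```matlab
% Sketch of fuzzify -> rules -> aggregate -> defuzzify for one pixel pair,
% where p1 and p2 are integer intensities in 0..255 from the selection stage.
x  = 0:255;                                    % output universe
mu = {max(0, min(1, (128 - x)/128)), ...       % low
      max(0, 1 - abs(x - 128)/128), ...        % medium
      max(0, min(1, (x - 128)/127))};          % high
cons = [1 2 3; 2 2 3; 3 3 3];                  % Table 4.1 (rows: p1 set, cols: p2 set)
agg = zeros(size(x));
for a = 1:3
    for b = 1:3
        w   = min(mu{a}(p1 + 1), mu{b}(p2 + 1));   % firing strength of rule (a,b)
        agg = max(agg, min(w, mu{cons(a,b)}));     % clip consequent, aggregate by max
    end
end
pF = sum(agg .* x) / sum(agg);                 % discrete COA, equation (4.2)
```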

4.3 Simulation Results

4.3.1 Experimental Setup

The proposed technique, implemented in Matlab® R2016b, was evaluated using three different sets of images as examples, presented in Figure 4-4, Figure 4-5 and Figure 4-6. As explained above, the experiments were conducted using Gabor filtering in four directions, θ = 0°, 90°, 180° and 270°; the results are presented in Figure 4-4, Figure 4-5 and Figure 4-6, along with the performance measurements shown in Table 4.2, Table 4.3 and Table 4.4. The best result (among all four possible orientations) from each example is also compared with fused images created by other researchers using existing IF techniques, as presented in Table 4.5 and Figure 4-7. The measurements used in the performance evaluation of the various algorithms in Table 4.5 are the standard deviation, the root mean square error (RMSE) and the entropy. The standard deviation is a measure of the contrast, textual properties and edge formation of the image (Naidu & Raol, 2008); images with a high standard deviation therefore have improved textual properties and better edge formation. The RMSE is a measure of the amount of change per pixel due to the processing: the lower the RMSE, the lower the level of noise in the image, and the better the image formation. The entropy is a measure of image quality in terms of information content: the higher the entropy, the more information the image contains.

The criterion used for the selection of the best output, as presented in Table 4.5, is based on choosing the output with either the lowest value of RMSE or the highest value of standard deviation or entropy. However, it must be noted that entropy values played an insignificant role in the selection of the outputs presented in Table 4.5, because the proposed technique focuses on improving textual properties and proper edge formation. Images with improved textual properties and edge formation are characterized by higher standard deviation values and lower RMSE.
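These three measurements can be computed as follows (a sketch, assuming F is the fused image scaled to [0,1] and R is the chosen reference image; the choice of reference for the RMSE is an assumption):

```matlab
% Objective measures used throughout this chapter.
ent  = entropy(F);                 % image entropy (Image Processing Toolbox)
sd   = std2(255*F);                % pixel standard deviation on the 0-255 scale
rmse = sqrt(mean2((F - R).^2));    % root mean square error against the reference
```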

Figure 4-4 shows the input images in the first column (CT and MRI images), the Gabor-filtered images under different orientations (θ of 0, 90, 180 and 270 degrees) in the second to fifth columns, and the corresponding fused images in the last row for example 1.


Figure 4-4: Inputs and Gabor filtered outputs under different orientations, and resulting fused images for example 1.

Table 4.2: Performance evaluation measures for example 1 under different orientations of the Gabor filter.

Performance evaluation parameter    0°         90°        180°       270°
Entropy                             4.3803     3.4507     4.0254     4.0502
Standard deviation                  102.9260   117.7324   109.3426   111.9463
RMSE                                0.1003     0.0736     0.1098     0.0960


Figure 4-5 shows the input images in the first column (CT and MRI images), the Gabor-filtered images under different orientations (θ of 0, 90, 180 and 270 degrees) in the second to fifth columns, and the corresponding fused images in the last row for example 2.

Figure 4-5: Inputs and Gabor filtered outputs under different orientations, and resulting fused images for example 2.


Table 4.3: Performance evaluation measures for example 2 under different orientations of the Gabor filter.

Performance evaluation parameter    0°         90°        180°       270°
Entropy                             3.4935     4.2900     3.8054     3.4524
Standard deviation                  120.4455   106.1650   118.0412   120.7756
RMSE                                0.0942     0.0852     0.1026     0.0886

Figure 4-6 shows the input images in the first column (CT and MRI images), Gabor filtered images under different orientations (θ is 0, 90, 180 and 270 degrees) in the second to fifth columns and the corresponding fused images in the last row for example 3.

Figure 4-6: Inputs and Gabor filtered outputs under different orientations, and resulting fused images for example 3.


Table 4.4: Performance evaluation measures for example 3 under different orientations of the Gabor filter.

Performance evaluation parameter    0°         90°        180°       270°
Entropy                             2.7584     2.4584     2.3886     2.6969
Standard deviation                  85.0618    94.5942    94.6862    89.202
RMSE                                0.0706     0.0446     0.0774     0.0643

Table 4.5 and Figure 4-7 display the performance of the proposed algorithm in comparison with other methods, including the Contourlet Transform (CT) (Al-Azzawi, 2015), the Discrete Wavelet Transform (DWT) (Al-Azzawi, 2015), and Shearlets and Human Feature Visibility (SHFV) (Al-Azzawi, 2015).


Figure 4-7: Comparison of results with other image fusion techniques.


Table 4.5: Tabulated results in comparison with existing techniques.

Example   Algorithm   Entropy   Standard deviation   RMSE
1         Proposed    3.4507    117.7324             0.0736
1         CT          7.1332    54.1504              0.1662
1         DWT         6.9543    47.2304              0.2703
1         SHFV        7.6572    56.7993              0.1164
2         Proposed    3.8054    118.0412             0.1026
2         CT          6.9351    46.6294              0.2538
2         DWT         6.6997    41.4623              0.2889
2         SHFV        7.3791    55.8533              0.2410
3         Proposed    2.3886    94.6862              0.0774
3         CT          6.8824    43.1963              0.2422
3         DWT         6.5198    42.0087              0.3142
3         SHFV        6.9467    44.2937              0.2133

4.4 Discussion

Experimental results have proven the proposed method to be highly applicable in the field of diagnostic medicine, where improved textual surface appearance is critical. The fused images obtained from the other existing techniques, as presented in Table 4.5 and Figure 4-7, suggest that our method is objectively superior, as demonstrated by its high values of standard deviation and lower values of RMSE.

Fused images created using the newly developed technique display the highest standard deviation values, thus indicating better textual properties and improved edge information.

Higher standard deviations are also associated with higher contrast. As the numbers suggest, the proposed fusion method yields images with a better contrast, which would be of great assistance for physicians in medical diagnostic applications.

Evaluated in terms of the desired objectives, the proposed method provides not only the highest values of standard deviation but also the lowest RMSE values, which proves that this technique produces images with lower noise, and hence improved image quality.

On the other hand, the reference methods showed higher entropy values than the new technique, which indicates potential room for improvement of the proposed method.

The better results obtained from the proposed method indicate the influence of Gabor filtering on the input images before fusion. This makes sense, because Gabor filters are known to remove noise from images, resulting in images with superior textual properties and longer, smoother edge curves. Longer edge curves indicate better edge information and, in turn, better textual properties. Images with longer edge curves and richer textual properties are easier to analyze during clinical diagnosis.

4.5 Conclusion

A novel medical image fusion method based on fuzzy logic, combining maximum selection with Gabor filtering, has been presented and described. Previously existing medical image fusion techniques have produced images with low perceptual quality in terms of textual properties and edge information, both of which are important for visualizing patterns in the images. The proposed technique was able to improve the textual properties and edge information of fused images, as measured by the standard deviation and the RMSE, by exploiting the properties of Gabor filtering, fuzzy logic and maximum selection to create more useful images for diagnosis. The study linked the Gabor filter properties and the maximum selection algorithm with the fuzzy logic approach to produce fused images that are more appropriate for assisting physicians in making a more accurate medical diagnosis. The resulting images, which combine lower information content, as measured by their entropy, with better textual properties and edge formation, should propel future studies seeking to improve the performance of the proposed technique.


Chapter 5

A Novel Pulse-Coupled Neural Network Using Gabor Filters for Medical Image Fusion

5.1 Introduction

This technique aims at improving poor illumination and shift variance, among other defects, in the input images (Nahvi & Sharma, 2014). These defects result in images that are sub-optimal for human visual perception, limiting their use in critical applications. By fusing multiple images, each with its own features, image fusion techniques attempt to optimize the results. Fused images are more informative and better suited for human visual perception (Xu, Du, & Li, 2013). The improved information content and readability of fused images have resulted in widespread applications in areas such as medical imaging, satellite reconnaissance, military operations and robotics.

These common applications include multi-view, multi-modal, multi-temporal or multi-focus fusion, as well as fusion for image restoration. The corresponding algorithms typically use methodologies that capture image features at the signal, pixel, feature or decision level (Suthakar & Monica, 2014) and attempt to incorporate this information into the fused images. Many such techniques have limitations when operating on real images because they do not preserve anisotropic features such as texture or edge clarity (Liu, Li, & Caiyun, 2012), or they lack shift invariance and result in image distortion (Zhang & Guo, 2009) or blocking effects (Maruthi & Lakshmi, 2017).

More recent fusion techniques combining contourlet transforms, Pulse-Coupled Neural Networks (PCNN) and fuzzy logic have demonstrated some improved results (Xu, Du, & Li, 2013). Several published studies have implemented PCNN to sort and filter source images before fusion (Li & Zhu, 2005) and demonstrated improved edges and textures or contrast preservation in the resulting fusion. Other approaches focus on the way the PCNN is implemented, by using sub-netted PCNN components (Zhang & Liang, 2004), by operating in a region-based schema (Li, Cai, & Tan, 2006), or by modifying the PCNN to automatically adjust its own linking coefficients (Wang & Ma, 2008; Wang, Wang, Zhu, & Ma, 2016). These approaches are time-consuming and require complex computations (Wang, Zhao, Dai, & Iwahori, 2015), and still lack the ability to take into account the characteristics of the human visual system (HVS). If optimized for human perception, results would take advantage of the improved edge completion and texture-recognition capabilities of the HVS (Broussard & Rogers, 1996).

The newly proposed method uses Gabor filtering and maximum-intensity pixel selection through a PCNN to enhance the source images before fusion, aiming to preserve the sharpness of the images.

The remainder of the chapter is organized as follows: section 5.2 introduces the proposed method, giving a detailed explanation of the Gabor filtering and PCNN methods used; section 5.3 presents the simulation results under a number of different Gabor filter orientations; section 5.4 discusses these results; and section 5.5 provides conclusions, which point in the direction of future research.

5.2 Proposed method

5.2.1 Overview and Background

The proposed method, as illustrated in Figure 5-1, begins with Gabor filtering of two or more source images. Gabor filters are special classes of band-pass filters, which allow a certain ‘band’ of frequencies to pass, while rejecting the others. They can be implemented in either the spatial or frequency domain and are oriented in specific directions to preserve edges and textures (Yang, Tong, Huang, & Lin, 2014).

The masked images are then used as input to the Pulse-Coupled Neural Network (PCNN). The PCNN is a self-organizing network that does not require training (Zhan, Zhang, & Ma, 2009). It has the ability to extract relevant information from a complex background in an iterative fashion (Johnson & Padgett, 1994). The general PCNN model was developed by Eckhorn based on practical observations of the synchronous pulse bursts in the visual cortices of cats and monkeys (Wang, Zhao, Dai, & Iwahori, 2015). The PCNN is characterized by global coupling and pulse synchronization of neighboring neurons, which share locally enhanced action potentials through neuronal linking mechanisms. The ability of the PCNN to couple and synchronize neurons makes it adaptable and easily optimized for image processing applications such as image fusion (Candes, Romberg, & Tao, 2006).


Figure 5-1: Schematic representation of the novel algorithm combining Gabor filtering, masking to select the local maximum pixel intensities, and PCNN image enhancement, followed by image fusion.

The PCNN typically used in image processing applications consists of three stages: the input feed (feedforward); the linking feed (feedback), which modulates the input; and the pulse generator, which yields the result (Wang, Zhao, Dai, & Iwahori, 2015). These stages are described in equations (5-1a)-(5-1e) and illustrated in Figure 5-2. The feeding input at each iteration, F_{ij}(n), is the masking result for each pixel with position indices i, j (5-1a), where S_{ij} is the input image value in grayscale. The linking input (5-1b) is derived from previous linking values and a combination of neighborhood results from the previous iteration:

F_{ij}(n) = S_{ij};   (5-1a)

L_{ij}(n) = \exp(-\alpha_L)\, L_{ij}(n-1) + V_L \sum_{kl} W_{ijkl}\, Y_{kl}(n-1);   (5-1b)

U_{ij}(n) = F_{ij}(n)\left(1 + \beta L_{ij}(n)\right);   (5-1c)

\theta_{ij}(n) = \exp(-\alpha_\theta)\, \theta_{ij}(n-1) + V_\theta\, Y_{ij}(n);   (5-1d)

Y_{ij}(n) = \begin{cases} 1, & U_{ij}(n) > \theta_{ij}(n) \\ 0, & U_{ij}(n) \le \theta_{ij}(n) \end{cases}   (5-1e)

where n is the time index of the iterations, (\alpha_L, \alpha_\theta) are time attenuation constants for the link inputs, (V_L, V_\theta) are normalization constants for the dynamic thresholds, W_{ijkl} are the neighborhood weighting values, and Y_{kl}(n-1) are the results of the previous iteration within the neighborhood (denoted in the equations by the subscripts k, l). Finally, L_{ij}(n) in (5-1b) and Y_{ij}(n) in (5-1e) are combinations of these values used during the linking and pulse generation stages, as illustrated in Figure 5-2.

Input values to each neuron consist of Y_{kl}, the output from maximum value selection, and Y_{ij}, the pulse output from neighboring neurons. The nonlinear modulation stage, with weighted linking constants \beta_{ij}, results in values U_{ij}, which are compared to the dynamic firing thresholds \theta_{ij}. When its threshold is exceeded, a neuron fires.

Figure 5-2: One neuron in a neural network consisting of Input, Linking and Generator stages.
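The following Matlab sketch iterates equations (5-1a)-(5-1e) over a whole image, using the parameter values later listed in Table 5.1; the 3x3 linking kernel W and the normalization of the stimulus are assumptions, not values from the original implementation.

```matlab
% Minimal PCNN iteration over a stimulus image S (values normalized to [0,1]).
alphaL = 1; alphaTheta = 0.2; beta = 0.5; VL = 1; Vtheta = 20;   % Table 5.1
W = [0.5 1 0.5; 1 0 1; 0.5 1 0.5];          % assumed 3x3 neighborhood weights
L = zeros(size(S)); Y = L; Theta = ones(size(S));
for n = 1:250                               % iteration count from Table 5.1
    F = S;                                               % (5-1a) feeding input
    L = exp(-alphaL)*L + VL*conv2(Y, W, 'same');         % (5-1b) linking input
    U = F .* (1 + beta*L);                               % (5-1c) modulation
    Y = double(U > Theta);                               % (5-1e) pulse output
    Theta = exp(-alphaTheta)*Theta + Vtheta*Y;           % (5-1d) threshold update
end
```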


The details of the proposed PCNN implementation, which is slightly different, are described in section 5.2.3 and illustrated in Figure 5-3. The output from the PCNN becomes the input to the fusion function, "imfuse", followed by "imfilter"; both functions are part of the Matlab Image Processing Toolbox (Matlab® R2016b). In this algorithm, we use the PCNN to extract image features; the PCNN can also extract information about an image's texture, edges and regional distribution, and performs well in image processing. "imfuse" rectifies input images of different sizes by padding, and allows for spatial referencing to account for variations in voxel dimensions and field of view (FOV) without resampling pixel intensities (Semmlow & Griffel, 2014). This capability is useful in multi-mode MIF, in which source images often have inherently different resolutions and FOVs. The "imfuse" outputs for each source image are then placed in different color channels of a multidimensional array, which is slightly smoothed, without significantly losing texture detail, using "imfilter". This produces the final image, examples of which are shown later.
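A sketch of this final step, assuming Y1 and Y2 are the two PCNN-enhanced images; the particular channel assignment and the Gaussian kernel size are assumptions:

```matlab
% Place the two enhanced images in separate color channels, then smooth lightly.
C     = imfuse(Y1, Y2, 'falsecolor', 'ColorChannels', [1 2 0]);  % Y1 red, Y2 green
fused = imfilter(C, fspecial('gaussian', [3 3], 0.5), 'replicate');
```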

5.2.2 Gabor filtering

As presented in section 2.1.2.1, Gabor filters possess optimal localization properties in both the spatial and frequency domains, while being useful for removing noise and enhancing visual features and edges (Prasad & Domke, 2005).

In the proposed algorithm, they are applied to the source images in order to enhance textural properties (Yang, Fang, & Lin, 2015). The 2-D Gabor function is described by equation (4.1). The optimum parameters chosen for the Gabor filter in this algorithm (determined by measurements of RMSE, standard deviation (SD) and entropy) are as follows: the filter is set to a 10x10 size with a given ktype (a classification of the type and range of values, 0-255, at each position in the Gabor kernel). Orientation angles (θ) of 0, 90, 180 and 270 degrees are considered, with γ = 1.25, ψ = 0, λ = 3.5 and a normalized bandwidth value of 2.8 (the ratio of the standard deviation of the Gaussian function to the preferred wavelength).

5.2.3 Proposed PCNN

The proposed PCNN model shown in Figure 5-3 is slightly different from the general PCNN model. In this case, a linking synapse is inspired by the 훾 band synchronization and dynamic threshold. As shown in Figure 5-2, a neuron has feeding synapses and lateral linking synapses. Feeding synapses are connected to a spatially corresponding stimulus, and lateral linking synapses are connected to outputs of neighboring neurons within a predetermined radius. Locally excitatory linking inputs have a negative globally inhibitory term that supports desynchronization (Eckhorn, Reitboeck,

Arndt, & Dicke, 1990; French & Stein, 1970), which is designed to simulate the refractory period of a neuron (Johnson & Padgett, 1994).

Figure 5-3: Schematic representation of the classical PCNN, using eight scalar parameters: \alpha_f, V_f, \alpha_l, V_l, \alpha_\theta, V_\theta, \beta and n.

As with traditional PCNNs, the proposed PCNN has three scalar variables (n, i and j) and five matrix variables (F_{ij}(n), L_{ij}(n), U_{ij}(n), Y_{ij}(n) and \Theta_{ij}(n)), where n (n \in [0, N]) is a time variable and (i, j) are spatial coordinate variables (Zhan, Shi, Wang, Xie, & Li, 2017).

The central terms of the membrane potential of the PCNN in Figure 5-3 are obtained by assuming a small linking strength \beta. In our method, \beta is set to 0.5, while the convolution kernel in the feeding input is also set to a low value of 0.111. The resulting membrane potential is then given by (Zhan, Shi, Wang, Xie, & Li, 2017)

U_{ij}(n) = f\, U_{ij}(n-1) + S_{ij},   (5.2)

where f is a constant describing the relationship between successive values in the sequence. By rearranging equation 5.2, U_{ij}(n) can be expressed in terms of the initial values as

U_{ij}(n) = \left( U_{ij}(0) - \frac{S_{ij}}{1-f} \right) f^{\,n} + \frac{S_{ij}}{1-f},   (5.3)

where f = \exp(-\alpha_\theta) is a simplification for ease of understanding, with 0 < f < 1.

The resulting characteristics of the PCNN operation are as follows, and are described further in the following subsections. The network is designed to be self-terminating after a single pass, in which each neuron generates an action potential only once. The firing rate of neurons in the network is governed by an iterative, dynamically adjusted threshold, which ensures that a neuron with a strong stimulus fires before a neuron with a weak stimulus. This firing pattern is also synchronized within a local neighborhood, governed by the linking inputs. The combined effect of all these mechanisms is that visual properties are enhanced by the PCNN action in the proposed image fusion method.

5.2.3.1 Single-pass

A single pass is a complete iteration of the PCNN time matrix used to extract visual properties for enhancement and edge completion (Johnson & Padgett, 1994). A single pass is completed when all the neurons have generated an action potential; the threshold amplification factor V_\theta is set to a high value so that each neuron generates the action potential only once (Zhan, Zhang, & Ma, 2009). In our method, V_\theta is set to 20, high enough that the iterative process stops automatically when all neurons have fired once, so that a complete time series has been collected (Johnson & Padgett, 1994).

5.2.3.2 Firing rate

Assuming the membrane potential is equal to the stimulus, the precise firing time occurs when the membrane potential is almost equal to the dynamic threshold:

\Theta_{ij}(n) = S_{ij}.   (5.4)

Subsequent firing activity will update the dynamic threshold in the next iteration according to

\Theta_{ij}(n+1) = g\,\Theta_{ij}(n) + V_\theta = g\, S_{ij} + V_\theta,   (5.5)

where g = \exp(-\alpha_\theta) is a simplification for ease of understanding, and 0 < g < 1.

The value of the threshold decreases exponentially from its initial value \Theta_{ij}(0) after the first firing time, and is delayed at each subsequent firing according to the value g\,S_{ij} + V_\theta. By rearranging equation 5.5 and substituting g = \exp(-\alpha_\theta), it is possible to express the value of n in terms of the other parameters, as given in equation 5.6:

n = \log_g \frac{S_{ij}}{\Theta_{ij}(n)} + \sum_{n>1} \log_g \frac{S_{ij}}{g\, S_{ij} + V_\theta}.   (5.6)

Assuming that the membrane potential is constant for a single neuron and that its analytical solution can be approximated through the firing frequency \mathcal{F}_{ij},

\mathcal{F}_{ij} = \frac{1}{\log_g\left( \frac{S_{ij}}{g\, S_{ij} + V_\theta} \right)},   (5.7)

it can be shown that a neuron with a strong stimulus fires earlier than a neuron with a weak stimulus (Zhan, Shi, Wang, Xie, & Li, 2017); neurons with weak stimuli are the last to fire.
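A quick numerical check of equation (5.7) illustrates this ordering, assuming stimuli normalized to [0,1] and the parameter values used in this chapter:

```matlab
% Change of base: log_g(x) = log(x)/log(g), with g = exp(-alpha_theta).
g = exp(-0.2); Vtheta = 20;
freq = @(S) log(g) ./ log(S ./ (g*S + Vtheta));   % equation (5.7)
[freq(0.9), freq(0.1)]   % approx. [0.064, 0.038]: stronger stimulus fires more often
```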

5.2.3.3 Synchronization

The proposed method also incorporates stimulus-induced synchronization, which is related to the linking wave (Gjesteby, et al., 2018). The linking wave propagates in circles from a central neuron, its radius increasing step by step, coupling all the neurons in the neighborhood so that they fire synchronously. The linking wave then preferentially selects neurons whose stimuli are similar to those of the previously fired neuron (Xia, Yin, & Ouyang, 2013).

5.2.3.4 Visual properties enhancement

Pulse synchronization in the PCNN enables image property enhancement through its time matrix (Al-Azzawi, 2015), because the membrane potential is approximately equal to the stimulus when neurons fire. Let T_{ij} be the time when the membrane potential exceeds its threshold, and let this time correspond to iteration n. In other words, neuron (i, j) fires at the time T_{ij} when the membrane potential U_{ij}(T_{ij}) is only slightly greater than its threshold \Theta_{ij}(T_{ij}) = g^{T_{ij}}\,\Theta_{ij}(n); hence, U_{ij}(T_{ij}) \cong g^{T_{ij}}\,\Theta_{ij}(n). Therefore, the firing time T_{ij} is given by

T_{ij} = \log_g S_{ij} - \log_g \Theta_{ij}(n).   (5.8)

The time matrix represented in equation 5.8 corresponds to the Weber-Fechner law, and has an approximately logarithmic relation with the stimulus matrix. Thus, neurons with a larger stimulus fire before neurons with a smaller stimulus (Xia, Yin, & Ouyang, 2013). The resulting image exhibits enhanced visual properties, with improved contrast and preserved edges and texture.

5.3 Simulation Results

5.3.1 Experimental setup

Three different examples, presented in Figure 5-4, Figure 5-5 and Figure 5-6, are used to evaluate the proposed method. The Gabor filters were oriented in four directions, θ = 0°, 90°, 180° and 270°. The PCNN parameters are defined in Table 5.1. The final results were evaluated using the same objective quantitative measurements postulated in chapter 3: standard deviation, root mean square error (RMSE) and entropy. These descriptors were also used as measurements in previously published results for other fusion methods, thus enabling a common comparison. As stated previously, images with a high standard deviation have improved human visual properties (textual properties and better edge formation), and a lower RMSE is an indication of lower noise levels in the image and thus improved visual features. Entropy measures image quality in terms of information content (Singh & Kapoor, 2014): images with higher entropy contain more information. Entropy also reflects textural uniformity, making it a preferable quality metric for fused medical images (Al-Azzawi, 2015).

The example images and results are shown in Figure 5-4, Figure 5-5 and Figure 5-6, and the corresponding measurements are presented in Table 5.2, Table 5.3 and Table 5.4. The best results were then compared with measurements from previously published (existing) techniques.

Table 5.1: List of PCNN parameter values used in this study.

Parameter           Value used
\alpha_L            1
\alpha_\theta       0.2
\beta               0.5
V_L                 1
V_\theta            20
Link arrangement    16
Iteration times     250


Figure 5-4: Example 1: Inputs are CT and MRI images, as shown in row 1. Gabor filter results at different orientations (θ of 0, 90, 180 and 270 degrees) are shown in columns 1 and 3, with the corresponding maximum value selection results in columns 2 and 4.


Table 5.2: Performance evaluation measures for example 1 (Figure 5-4) under different orientations of Gabor filtering.

Performance evaluation measure    0°        90°       180°      270°
Entropy                           8.9867    9.0049    9.1595    9.1559
Standard deviation                50.0335   63.0644   57.3573   53.3573
RMSE                              0.0177    0.0173    0.0193    0.0187


Figure 5-5: Example 2: Inputs are CT and MRI images, as shown in row 1. Gabor filter results at different orientations (θ of 0, 90, 180 and 270 degrees) are shown in columns 1 and 3, with the corresponding maximum value selection results in columns 2 and 4.


Table 5.3: Performance evaluation measures for example 2 (Figure 5-5) under different orientations of Gabor filtering.

Performance evaluation measure    0°        90°       180°      270°
Entropy                           8.8606    8.7162    8.9667    8.9273
Standard deviation                77.6431   61.3606   71.6609   76.7252
RMSE                              0.0178    0.0153    0.0187    0.0188


Figure 5-6: Example 3: Inputs are CT and MRI images as shown in Row 1. Gabor filter results at different orientations (θ is 0, 90, 180 and 270 degrees) are shown in columns 1 and 3, with corresponding maximum value selection results in columns 2 and 4.


Table 5.4: Performance evaluation measures for example 3 (Figure 5-6) under different orientations of Gabor filtering.

Performance evaluation measure    θ = 0°     θ = 90°    θ = 180°    θ = 270°
Entropy                           8.3932     8.7063     8.7121      8.6019
Standard deviation                49.7551    55.4118    56.8493     52.6867
RMSE                              0.0123     0.013      0.014       0.0125

Table 5.5 and Figure 5-7 describe the performance of the proposed algorithm in comparison with other methods, including the Contourlet Transform (CT) (Al-Azzawi, 2015), the Discrete Wavelet Transform (DWT) (Al-Azzawi, 2015), Shearlets and Human Feature Visibility (SHFV) (Al-Azzawi, 2015), and Fuzzy-Based fusion using Maximum Selection and Gabor filters (FMG) (Alenezi & Salari, 2018). The Entropy, SD, and RMSE values shown for these methods are taken from the sources referenced above.


Table 5.5: Comparison of image quality metrics for different fusion algorithms.

Example    Algorithm          Entropy    Standard deviation    RMSE
1          Proposed method    9.1595     57.3573               0.0193
           CT                 7.1332     54.1504               0.1662
           DWT                6.9543     47.2304               0.2703
           SHFV               7.6572     56.7993               0.1164
           FMG                3.4507     117.7324              0.0736
2          Proposed method    8.9667     71.6609               0.0187
           CT                 6.9351     46.6294               0.2538
           DWT                6.6997     41.4623               0.2889
           SHFV               7.3791     55.8533               0.2410
           FMG                3.8054     118.041               0.1026
3          Proposed method    8.7121     56.8493               0.014
           CT                 6.8824     43.1963               0.2422
           DWT                6.5198     42.0087               0.3142
           SHFV               6.9467     44.2937               0.2133
           FMG                2.3886     94.6862               0.0774


Figure 5-7: Fusion results on examples 1, 2, and 3 using the proposed method, CT, DWT, SHFV, and FMG.

5.4 Discussion

Image fusion is very important in data visualization. Medical image fusion, in particular, is critical in clinical medicine for non-invasive diagnosis. Human visual properties, in the form of textural properties and completeness of edge formation, have proven to be vital for medical imaging. Therefore, any medical image fusion technique must preserve the visual properties of the fused images.

The proposed method demonstrates significant improvements over previously published fusion methods. As summarized in Figure 5-7, the resulting fused images show improved textural properties and edge information, as reflected in the standard deviation and RMSE measurements.

A close analysis of the Entropy values, summarized for all examples in Table 5.5, confirms that the novel method yields better information content than all of the benchmark techniques, thus highlighting the overall improvements to textural properties, edge formation, and information content.

The improvement in RMSE presented in Table 5.5 is consistent: the obtained values are the lowest across all examples. These low RMSE values, which correspond to the highest SNR, can be attributed to the Gabor filtering of the input images, the maximum selection applied to the Gabor-filtered images, and the final filtering of the fused image.

Although the proposed method has yielded improved results in most cases, the results are mixed for the standard deviation. The high SD values of the new method are also a result of Gabor filtering the input images. However, the FMG algorithm in Table 5.5 shows the highest standard deviation values. The FMG method entailed double Gabor filtering, as each input image was Gabor filtered, which presumably explains its highest standard deviation measurements. In the new method, the input images are first Gabor filtered and then the fused image is slightly smoothed with the Matlab® function "imfilter". Consequently, the standard deviation of the result is somewhat lower than that of the FMG method, but higher than those of the other methods. Notably, the remaining existing methods did not employ Gabor filtering.
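As an illustration of the filtering chain discussed above, the fragment below sketches the Gabor filtering of both inputs at the four orientations of section 5.3.1, the maximum pixel selection, and the slight smoothing with imfilter. The wavelength, the rule for combining orientations, and the averaging kernel are assumptions; the fuzzy/PCNN fusion stages of the full method are not reproduced here.

% Hedged sketch: Gabor filtering, maximum selection, and smoothing.
I1 = im2double(imread('ct_slice.png'));    % hypothetical CT input
I2 = im2double(imread('mri_slice.png'));   % hypothetical MRI input
angles = [0 90 180 270];
wavelength = 4;                            % assumed Gabor wavelength (pixels)
M = zeros([size(I1) numel(angles)]);
for k = 1:numel(angles)
    G1 = imgaborfilt(I1, wavelength, angles(k));   % Gabor magnitude response
    G2 = imgaborfilt(I2, wavelength, angles(k));
    M(:,:,k) = max(G1, G2);                        % pixel-wise maximum selection
end
F = max(M, [], 3);                                 % combine orientations (assumed rule)
F = imfilter(F, fspecial('average', 3));           % slight smoothing of the result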


The use of PCNN also has an impact on the measured values. The implementation appears to reduce the loss of texture during fusion by including a local neighborhood within the linking phase. The end result is increased SNR, lower RMSE, and higher information content, as measured by higher entropy than for the other techniques.

5.5 Conclusion

The algorithm described in this chapter represents a novel medical image fusion technique, based on PCNN coupled with Gabor filtering and maximum selection, with smoothing of the fused image. Previously existing image fusion techniques have not yielded images with high information content or adequate visual properties, i.e., improved textural properties and edge information, both of which are critical in clinical diagnosis.

The proposed method produces fused images with better information quality and improved visual properties by exploiting the action of PCNN on images, maximum selection of pixels, and the properties of Gabor filtering. Based on the objective performance evaluation criteria, our method significantly outperforms leading-edge image fusion methods by making novel use of PCNN in combination with Gabor filtering. Future studies should examine the proposed method when PCNN-enhanced images are fused without Gabor filtering or maximum selection, in order to isolate the effect of Gabor filtering the input images before PCNN enhancement.


Chapter 6

A Novel Image Fusion Method Which Combines Wiener Filtering, Pulse-Coupled Neural Networks and Discrete Wavelet Transforms for Medical Imaging Applications

6.1 Introduction

To sharpen edges, improve geometric corrections, or even enhance certain features invisible in any of the input images, a new remedy is proposed in this chapter. The aim is to reduce visual ambiguity and minimize decision-making errors (Alenezi & Salari, 2018; Sharma, 2016).

As stated previously, image fusion preprocessing techniques may operate at four different levels: pixel, feature, symbol, or decision (Sharma, 2016). Preprocessing at the pixel level of each input image often results in better overall fusion results (Sharma, 2016). Feature-level preprocessing operates on specific features extracted from the images (Sharma, 2016). Preprocessing at the decision level operates on pixel blocks within the images and is based on multistage representations.

When combining the enhanced images, many techniques already exist, such as intensity-hue-saturation (IHS) transforms, which manipulate color properties to create a controlled visual representation of the fused image. This technique produces images with good visual effects but distorted colors. Image fusion by other techniques, such as average, wavelet, pyramid and other transforms, produces images with poor edges and distorted spectra. These techniques therefore appear unsuitable for medical image fusion.

Medical imaging has relied on image fusion to help improve the effectiveness of images for medical diagnosis. Recent techniques employed in medical image fusion have produced composite images with better edges and geometric properties (Alenezi, 2018; Alenezi & Salari, 2018; Al-Azzawi, 2015). Techniques such as those based on the wavelet transform with modulus maxima produce final images with better-preserved component information but poor edge formation (Qu, Zhang, & Yan, 2001). Results produced by discrete wavelet transforms are easier to interpret, but contain added noise and lower accuracy at curved edges (Chiorean & Vaida, 2009). Curvelet fusion of MRI and CT images yields results with clearer curved edges and less noise, but such fusion is limited in directional information and diagonal directions.

Medical image fusion based on the Redundant Discrete Wavelet Transform and the Contourlet Transform for multimodality fusion with quantitative analysis gives fused images with multi-resolution, localization, and directionality, but entails high computational complexity and high memory consumption (Rajkumar, Bardhan, Akkireddy, & Munshi, 2014). More recently, good-quality fused images have been produced using the redundant discrete wavelet, contourlet, or ripplet transforms, which likewise offer multi-resolution, localization, and directionality, but these methods are computationally complex and memory intensive (Das, Chowdhury, & Kundu, 2011). In the case of the non-subsampled contourlet transform (NSCT)-based variable-weight method, robust images are produced, but this method cannot be applied to positron-emission tomography (PET) or MRI source images. Directive contrast-based multimodal


medical image fusion in the NSCT domain produces images with better rendering of curved areas but high noise (Bhatnagar, Wu, & Liu, 2013).

The proposed method uses a high-scale Wiener filter in the source-image preprocessing step to extract detailed information from the original image (Xu, Weaver, Healy Jr, & Lu, 1994). The Wiener filter also helps optimize the complementary effects of inverse transformation and noise smoothing. The preprocessed image is then fed into a laterally inhibited and excited feature-linking pulse-coupled neural network (PCNN) to extract, boost, and preserve key features. Lastly, a shift-invariant discrete wavelet transform (SIDWT) fusion algorithm is used to create the fused image.

The remainder of this chapter is organized as follows. Section 6.2 provides an overview of the proposed method, followed by a detailed description of the preprocessing (Wiener filtering), feature extraction (PCNN), and fusion (SIDWT) steps. Section 6.3 presents simulation results and provides a quantitative comparison to existing methods. These results are discussed in section 6.4, and section 6.5 provides conclusions and discusses future research directions.

6.2 Proposed method

6.2.1 Overview and background

A block diagram of the proposed method is depicted in Figure 6-1. Initially, a space-variant, high-scale Wiener filter is applied to source images. This type of filter is chosen because it provides an optimal compromise between computational efficiency and the quality of the reconstructed image (Xu, Weaver, Healy Jr, & Lu, 1994). It also ensures that more details are captured in the output image.


Figure 6-1: Schematic representation of the novel algorithm combining a spatially variant Wiener filter, the proposed feature-linking PCNN, and SIDWT.

Wiener filters are applied in the frequency domain and are designed to minimize the mean square error between an ideal image and the input image, which also helps in noise smoothing (Chen, Benesty, Huang, & Doclo, 2006). The ideal image is estimated by creating a model of the noise present in the input image, which is then removed using a time-invariant finite impulse response (FIR) filter (Chen, Benesty, Huang, & Doclo, 2006). The Wiener-filtered image is then passed to a feature-linking pulse-coupled neural network. This feature-linking model (FLM) specifically encodes the times at which neurons fire, using these pulses to influence laterally linked neuronal membrane potentials (Vaseghi, 2008).

The timed signals encode image information in a manner consistent with human visual processing. The details of the output image are boosted while preserving the information from the Wiener-filtered image. The FLM is used because the generated time signals are invariant to rotation, dilation, or translation of the images. The FLM images are combined into one fused image using SIDWT, the preferred fusion method because of its shift-invariant properties (Zhan, Teng, Shi, & Li, 2016). SIDWT also leads to stable, unflickered results and overcomes the shift dependency of conventional wavelet fusion, ensuring that the final image is consistent with the input images used (Zhan et al., 2016). Wiener filtering, in turn, executes an optimal tradeoff between inverse filtering and noise smoothing, removing additive noise and inverting the blurring simultaneously, thereby enhancing the information quality of the final image (Chen, Benesty, Huang, & Doclo, 2006).

6.2.2 Proposed space-variant Wiener filter

The proposed space-variant Wiener filter solves the invariance problem associated with the general Wiener filter. This Wiener filter operates by optimizing a trade-off between noise power and signal power (see equations 6.1 and 6.2, below) (Cristobal, Schelkens, & Thienpont, 2013). In the method proposed here, Wiener filters are applied to the source images to enhance the information content of the input images (Yaroslavsky, 2004). The magnitude of image pixels is amplified so that their energies dominate over that of the noise. This result is achieved by setting to zero any spectral component energy of image pixels that is smaller than the noise energy. The Wiener filter output $\hat{f}(i)$ for the input image $i$ is described by the following expression,

$$\hat{f}(i) = m_f(i) + \frac{1}{\eta_r}\big(g(i) - m_f(i)\big), \qquad (6.1)$$

where $g$ represents the pixel intensities, $m_f$ is the local mean of the image pixel intensities, and

$$\eta_r = \max\left[\frac{|\beta_r|^2 + \sigma_n^2(i) + K}{|\beta_r|^2};\ 0\right], \qquad (6.2)$$

where $|\beta_r|^2$ is the variance of the local image intensities, $\sigma_n$ is the noise standard deviation, and $K$ is a Lagrange constant, which ensures a low filter response at high frequencies.


The first term inside the $\max[\cdot]$ operator in equation 6.2 ensures that the same filter is not applied throughout the image, making the filter spatially variant. The weight coefficients in this term depend on the spectrum of the input image, and have values that vary from 0 to 1, depending on the order of magnitude of the noise variance $\sigma_n^2$:

$$\eta_r = \begin{cases} 1, & \text{if } |\beta_r|^2 \geq Thr \\ 0, & \text{otherwise.} \end{cases}$$

This filter removes from the input any spectral components whose signal-to-noise ratio is lower than the predetermined threshold.

The proposed spatially variant Wiener filter allows for the preservation of more detailed information from source images, which is very important in medical imaging, where source images are typically characterized by poor contrast (Umarani, 2016; Singh & Khare, 2013). Thus, by varying the scale, there is enough flexibility to select the appropriately filtered image for further operations. The constant $K$ also ensures that power spectrum components that are hard to estimate, whether from the un-degraded image or from the noise, are filtered as well.
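MATLAB's built-in wiener2 performs a locally adaptive Wiener filtering based on neighborhood mean and variance estimates, which makes it a convenient stand-in for experimenting with the behavior described by equations 6.1 and 6.2. The window size below is an assumed "scale", and the Lagrange constant $K$ of the proposed filter has no counterpart in wiener2.

% Hedged stand-in for the space-variant Wiener stage.
I = im2double(imread('ct_slice.png'));   % hypothetical source image (grayscale)
[J, noiseVar] = wiener2(I, [5 5]);       % 5x5 local window (assumed scale); noise power estimated from I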

6.2.3 Proposed Feature linking Neural Network Model

The proposed feature-linking model (FLM), which is similar to the traditional pulse-coupled neural network (PCNN), has two inputs: feeding inputs and linking inputs (Eckhorn, Reitboeck, Arndt, & Dicke, 1990). While the traditional PCNN utilizes three leaky integrators, the FLM has only two, representing the membrane potential and the threshold of the neurons, one pair for each pixel in the network (Johnson & Padgett, 1994). This variation makes the FLM more effective in obtaining synchronization and desynchronization across different regions of a medical image, similar to actual human visual perception (Johnson & Padgett, 1999).


The FLM enhances image contrast by triggering off the timing of the first generated action potential, and keeping a time matrix record of action potential timing across the entire network (Zhan, Teng, Shi, Li, & Wang, 2016). More specifically, the first action potential (spike) is timed differently from other spikes, and this differential timing provides much of the image information. Each FLM pulse corresponds to the gray-scale intensity of the image based on the time matrix (Zhan, Shi, Wang, Xie, & Li, 2017). The time matrix is implemented as a single-pass record, which has been shown to have a logarithmic relationship to the stimuli matrix, and to be consistent with the Weber-Fechner law. The parameters of the FLM are therefore set carefully, in a manner analogous to the Mach band effect (an optical illusion in which the contrast between edges of slightly differing shades of gray is exaggerated as soon as they come into contact) in the image enhancement algorithm. The Mach band effect triggers edge detection in the human visual system.

In order to enhance the Mach band effect, the proposed method also introduces two constants in the linking inputs, $\varepsilon$ and $\varphi$, related to lateral inhibition (Montolio, Janssens, Stam, & Jansonius, 2016) and lateral excitation (Byrne, 2013), respectively (Figure 6-2). The lateral excitation ensures that only mutually exciting neurons relevant to stimuli are selected (Hoshino, 2005). Lateral inhibition ensures that irrelevant neurons are suppressed (Hoshino, 2005). The lateral excitatory and inhibitory synapses between neurons influence stimulus-evoked inter-neuronal activity, which has a great impact on detailed information extraction (Nakamura, Tsuboi, & Hoshino, 2008). The constants $\varepsilon$ and $\varphi$ suppress asymmetric activity within neuronal neighborhoods. Because the FLM


assumes that all activity is symmetric, the use of these constants more closely supports image detail enhancement (Nakamura, Tsuboi, & Hoshino, 2008).

The proposed FLM has three components: membrane potential, threshold, and action potential. Each dendrite receives postsynaptic action potentials through synapses from receptive fields (Eckhorn, Reitboeck, Arndt, & Dicke, 1990). The action potentials influence the membrane potentials of neighboring neurons through localized (linking) synapses on the dendrites. The combination of synaptic inputs may trigger spikes in neighboring neurons if the potential exceeds a certain threshold. Leaky integrators convert incoming synaptic pulses into a persistent signal. Neurotransmitters within the synapse are modeled via the time constant of the leaky integrator (Eckhorn, Reitboeck, Arndt, & Dicke, 1990). The neuron generates an action potential, or spike, if the potential is large enough to exceed a threshold.

6.2.3.1 Leaky integrator

Leaky integrators are the most crucial component of feature-linking neural networks (Schoenauer, Atasoy, Mehrtash, & Klar, 2002). They describe the dynamic potential $v(t)$ of a neural oscillator,

$$\frac{dv(t)}{dt} = -a\,v(t) + s \qquad (6.3)$$

where $t$ represents time, $s$ (the input stimulus) is the pixel value of the preprocessed image, and $a$ is the leak rate ($0 < a < 1$). Equation 6.3 can be discretized as

$$\frac{V(n) - V(n-1)}{n - (n-1)} = -a\,V(n-1) + s \qquad (6.4)$$

where $V(n)$ is the discretized potential and $n$ is the discrete time index. Equation 6.4 can be rewritten as

$$V(n) = b\,V(n-1) + s \qquad (6.5)$$

where $b = 1 - a$ is the attenuation time constant of the leaky integrator. Equation 6.5 represents the generic form of a leaky integrator.
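The recurrence of equation 6.5 is a one-liner; the sketch below iterates it for a constant stimulus (values assumed) to show the convergence toward the steady state $s/(1-b)$.

% Minimal sketch of equation 6.5 with assumed values; V converges to s/(1 - b) = 10.
b = 0.9;  s = 1.0;  N = 50;
V = zeros(1, N);
for n = 2:N
    V(n) = b*V(n-1) + s;   % V(n) = b V(n-1) + s
end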

6.2.3.2 Membrane potential

A cortical neuron is mostly bi-directionally connected; feeding synapses are of the feedforward type while linking synapses are of the feedback type (Eckhorn, Reitboeck, Arndt, & Dicke, 1990). Figure 6-2 shows the feature-linking model for the proposed method, with a neuron that has feeding synapses and lateral linking synapses. Feeding synapses are connected to a spatially corresponding stimulus. Lateral linking is shown in Figure 6-3, which illustrates how synapses are connected to the outputs of neighboring neurons within a predetermined radius $\sigma$ (Eckhorn, Reitboeck, Arndt, & Dicke, 1990; Johnson & Padgett, 1999). In this algorithm, $\sigma$ is large enough to permit effective filtering of the neurons. Locally excitatory linking inputs have a negative, globally inhibitory term that supports desynchronization (Stewart, Fermin, & Opper, 2002). Therefore, the dendritic signals to the neuron are the feeding inputs and the linking inputs, respectively, as expressed in the following equations,

$$F_{ij}(n) = \sum_{kl} M_{ijkl}\,Y_{kl}(n-1) + S_{ij}, \qquad (6.6)$$

$$L_{ij}(n) = \sum_{pq} W_{ijpq}\,Y_{pq}(n-1) - d + \varepsilon - \varphi, \qquad (6.7)$$

where indices $(i, j)$ denote each neuron, and indices $(k, l)$ and $(p, q)$ denote neighboring neurons; $F_{ij}(n)$ is the feeding input; $Y_{ij}(n-1)$ denotes the postsynaptic action potential; $S_{ij}$ is the stimulus for the neuron; $M_{ijkl}$ is a synaptic weight applied to feeding inputs; $L_{ij}(n)$ denotes a linking input, and $W_{ijpq}$ is a synaptic weight applied to linking inputs. The positive constant $d$ applies the global inhibition, $\varepsilon$ is a negative constant for lateral inhibition, and $\varphi$ is a positive constant for lateral excitation, as explained in subsection 6.2.3.

Figure 6-2: Schematic of proposed feature linking PCNN model with feeding input, linking input, leaky integrator and spike generator.

Figure 6-3: Schematic of linking inputs with excitatory and inhibitory neurons.

In order to enable synchronization, stimulus-driven feedforward streams are combined with stimulus-induced feedback streams (Brosch & Neumann, 2014). The leaky integrator driving the membrane potential is described by

$$U_{ij}(n) = f\,U_{ij}(n-1) + F_{ij}(n)\big(1 + \beta L_{ij}(n)\big) \qquad (6.8)$$


where $f$ is the attenuation time constant of the membrane potential and $\beta$ is the linking strength. Substituting equations 6.6 and 6.7 into 6.8, the neural membrane potential can finally be expressed as

$$U_{ij}(n) = f\,U_{ij}(n-1) + \Big(\sum_{kl} M_{ijkl} Y_{kl}(n-1) + S_{ij}\Big)\Big(1 + \beta\big(\sum_{pq} W_{ijpq} Y_{pq}(n-1) - d + \varepsilon - \varphi\big)\Big) \qquad (6.9)$$

6.2.3.3 Threshold

A leaky integrator is also used to represent the threshold of the neuron. The postsynaptic action potential $Y_{ij}(n-1)$ is the input to the threshold $\Theta_{ij}(n)$ according to

$$\Theta_{ij}(n) = g\,\Theta_{ij}(n-1) + h\,Y_{ij}(n-1), \qquad (6.10)$$

where $g$ is the attenuation time constant and $h$ is a magnitude adjustment. The postsynaptic action potential drives a dynamic increase of the threshold by an amount $h$, in order to suppress secondary action potentials during a refractory period. The threshold decays over time depending on the time constant $g$. Prior to the first action potential, the threshold $\Theta_{ij}(n)$ decreases exponentially from the initial threshold $\Theta_{ij}(0)$:

$$\Theta_{ij}(n) = g^n\,\Theta_{ij}(0) \qquad (6.11)$$

6.2.3.4 Action potential

The most significant element in neural coding is precision in pulse timing. An action potential $Y_{ij}$ of a neuron is produced during each iteration when the membrane potential of the neuron exceeds its threshold,

$$Y_{ij} = \begin{cases} 1, & \text{if } U_{ij}(n) > \Theta_{ij}(n) \\ 0, & \text{otherwise} \end{cases} \qquad (6.12)$$


The feature-linking model used in this chapter is summarized by equations 6.9, 6.10 and 6.12. Factorization of equation 6.9 indicates that the membrane potential is composed of a leaky integrator term, a stimulus, a feeding synapse term, a linking synapse term, and a multiplicative term. Each factor in the multiplicative term ($\beta$, $M_{ijkl}$, $W_{ijpq}$) ranges from 0 to 1, creating a much smaller modulation term, which can be omitted. The membrane potential can therefore be written as

$$U_{ij}(n) = f\,U_{ij}(n-1) + S_{ij} + \alpha \sum_{kl} M_{ijkl} Y_{kl}(n-1) + \beta S_{ij}\Big(\sum_{pq} W_{ijpq} Y_{pq}(n-1) - d + \varepsilon - \varphi\Big) \qquad (6.13)$$

where $\alpha$ is the feeding synapse strength, introduced to simplify the analysis of the model.

6.2.3.5 Single Pass Time matrix

The key action of the neurons is triggered by the first action potential. Therefore, a time matrix $T$ is defined for the first firing time of the neurons,

$$T_{ij}(n) = T_{ij}(n-1) + n\,Y_{ij}(n). \qquad (6.14)$$

The threshold amplification factor $h$ for the action potential is large enough to ensure that neurons fire only once. The single-pass creation of the time matrix $T$ is completed when all the neurons have generated their respective action potentials (Johnson & Padgett, 1999), and this also determines the neural network stopping condition (Zhan, Teng, Shi, & Li, 2016).
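Putting equations 6.10 to 6.14 together, a single pass of the FLM can be sketched as below, using the parameter values of Table 6.1. The 3x3 linking and feeding kernels, the initial threshold, the stimulus floor, the iteration cap, and the final mapping from firing times back to gray levels are all assumptions; the dissertation does not fix these details at this point.

% Hedged single-pass FLM sketch (equations 6.10 to 6.14, parameters of Table 6.1).
S = im2double(imread('wiener_out.png'));  % hypothetical Wiener-filtered input (grayscale)
S = max(S, 1e-2);                         % nonzero floor so every neuron eventually fires (assumed)
[r, c] = size(S);
f = 0.01; g = 0.98; h = 2e10; d = 2;      % Table 6.1
epsn = -0.2; phi = 1; beta = 0.03; alpha = 0.01;
W = [0.5 1 0.5; 1 0 1; 0.5 1 0.5];        % assumed 3x3 linking kernel
M = W;                                    % assumed feeding kernel
U = zeros(r, c); Y = zeros(r, c); T = zeros(r, c);
Theta = ones(r, c);                       % initial threshold Theta(0) (assumed)
n = 0;
while any(T(:) == 0) && n < 1000          % stop when all neurons have fired once
    n = n + 1;
    L = conv2(Y, W, 'same') - d + epsn - phi;              % linking input (eq. 6.7)
    U = f*U + S + alpha*conv2(Y, M, 'same') + beta*S.*L;   % membrane potential (eq. 6.13)
    Theta = g*Theta + h*Y;                                 % threshold update (eq. 6.10)
    Y = double(U > Theta);                                 % action potential (eq. 6.12)
    T = T + n*Y.*(T == 0);                                 % single-pass time matrix (eq. 6.14)
end
E = mat2gray(max(T(:)) - T);              % earlier firing -> brighter output (assumed mapping)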

6.2.3.6 Image enhancement by FLM

Each pixel of the Wiener-filtered images corresponds to one neuron of the network; therefore, a two-dimensional image matrix is represented by $r \times c$ neurons ($r$ being the number of image rows and $c$ the number of columns). The Wiener-filtered image intensity $I$ is normalized according to

$$S = \frac{1}{1 + k\,a_s/\bar{a}}\left(\frac{I - \min(I)}{\max(I) - \min(I)} + \vartheta\right), \qquad (6.15)$$

where S represents the enhanced image, min(퐼) returns the minimum value of 퐼, max(퐼) returns the maximum value of 퐼, and 휗 is a small positive constant which ensures nonzero pixel values, which has been set to the smallest gray scale value of the matrix 휗 =

1/(퐼). The multiplying term in equation 6.15 normalizes the pixel value across its local neighborhood: 푎푠 is the peak-to-mean amplitude of the neurons’ filter response to the edge,

푎̅ is the mean amplitude, used to achieve contrast invariance during normalization, and 푘 is a normalization constant, which is set to 0.5. The normalization matrix 푆 increases the lateral inhibition sharpening the visual and feature properties of the images (Kingdom,

2014; Tsofe, Spitzer, & Einav, 2009). These sharp-masked images are then inputs to the

SIDWT algorithm, the application of which is the final step in the fused image formation.
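A direct transcription of the normalization in equation 6.15 might look as follows; since $a_s$ and $\bar{a}$ depend on the neurons' filter response to edges, fixed placeholder values are assumed here, as is the reading of $\vartheta$ as the smallest nonzero gray level.

% Hedged sketch of equation 6.15; a_s and abar are placeholder amplitudes.
I = im2double(imread('wiener_out.png'));   % hypothetical Wiener-filtered image
k = 0.5;  a_s = 1.2;  abar = 1.0;          % k from the text; a_s, abar assumed
vtheta = min(I(I > 0));                    % small positive constant (assumed reading)
S = (1/(1 + k*a_s/abar)) * ((I - min(I(:)))./(max(I(:)) - min(I(:))) + vtheta);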

6.2.4 Shift-Invariant Discrete Wavelet Transform (SIDWT)

The proposed method uses a shift-invariant discrete wavelet transform to overcome the shift dependency inherent in the wavelet fusion method (Deshmukh & Bhosale, 2010), which ensures that the result is independent of the location of objects and consistent with the input images. SIDWT yields stable and unflickered fused images, preserving the level of detail from the sharp-masked images attained from the Wiener and FLM stages.

The input images are decomposed into a shift-invariant wavelet representation by splitting the output sequence of the proposed feature-linking PCNN into a wavelet (high-frequency) sequence, $w_i(n)$, and a scale (low-frequency) sequence, $s_i(n)$,

$$w_i(n) = \sum_k g(2^i k)\, s_i(n - k), \qquad (6.16)$$

$$s_{i+1}(n) = \sum_k h(2^i k)\, s_i(n - k), \qquad (6.17)$$


where $g(2^i k)$ is the analysis filter for the wavelet sequence, $h(2^i k)$ is the analysis filter for the scale sequence, and $i$ represents the decomposition level ($i = 1, 2, 3, 4$). The sequence $w_i(n)$ is stored while $s_i(n)$ acts as the input for the next decomposition level. The scale sequence at the zero-th level is set equal to the input sequence, $s_0(n) = f(n)$, hence defining the complete SIDWT scheme. The filters $g(2^i k)$ and $h(2^i k)$ at level $i$ are obtained by inserting the appropriate number of zeros between the filter taps of the prototype filters $g(k)$ and $h(k)$.

Once the coefficients are obtained, the input sequences are reconstructed by the inverse SIDWT using convolution. This process uses the reconstruction filters $\tilde{g}(2^i k)$ and $\tilde{h}(2^i k)$,

$$s_i(n) = \sum_k \tilde{h}(2^i n - k)\, s_{i+1}(n) + \sum_k \tilde{g}(2^i n - k)\, w_{i+1}(n). \qquad (6.18)$$

SIDWT uses wavelet decomposition without the traditional discrete wavelet down- sampling process. The resulting fused image has improved temporal stability and consistency.
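For the fusion stage, MATLAB's stationary wavelet transform (swt2/iswt2, Wavelet Toolbox) offers the undecimated, shift-invariant decomposition that the SIDWT relies on. The sketch below assumes a single decomposition level, 'db1' filters, a common maximum-magnitude rule for the detail bands, and averaging of the approximation band; none of these choices is prescribed by the text.

% Hedged sketch of shift-invariant wavelet fusion (image sides must be even).
A = im2double(imread('flm_ct.png'));    % hypothetical FLM-enhanced CT input
B = im2double(imread('flm_mri.png'));   % hypothetical FLM-enhanced MRI input
[a1, h1, v1, d1] = swt2(A, 1, 'db1');   % undecimated, shift-invariant decomposition
[a2, h2, v2, d2] = swt2(B, 1, 'db1');
pick = @(x, y) x.*(abs(x) >= abs(y)) + y.*(abs(x) < abs(y));  % keep larger-magnitude detail
F = iswt2((a1 + a2)/2, pick(h1, h2), pick(v1, v2), pick(d1, d2), 'db1');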

6.3 Simulation results

6.3.1 Experimental setup

Three different fusion examples, numbered 1, 2 and 3, presented in Figure 6-4, Figure 6-5, and Figure 6-6, are used to evaluate the proposed method. The lateral (inhibition and excitation) FLM parameters are listed in Table 6.1. The final results were evaluated using the same objective quantitative measurements postulated in previous chapters, and under the same justification: standard deviation (SD), root mean square error (RMSE) and entropy.


Medical images used as examples, alongside the resulting fused images, are shown in Figure 6-4, Figure 6-5, and Figure 6-6. Performance metrics are compared with results from previously published techniques (Al-Azzawi, 2015); a side-by-side display of this comparison is presented in Table 6.2 and Figure 6-7.

Table 6.1: List of proposed feature-linking PCNN parameter values used in the simulation.

Parameter    Value
f            0.01
g            0.98
h            2e10
d            2
ε            -0.2
φ            1
β            0.03
α            0.01

Figure 6-4: Example 1: (a) inputs, CT and MRI images; (b) high-scale, spatially variant Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT output).


Figure 6-5: Example 2: (a) inputs, CT and MRI images; (b) high-scale, spatially variant Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT output).


Figure 6-6: Example 3: (a) inputs, CT and MRI images; (b) high-scale, spatially variant Wiener filter; (c) enhanced images using FLM; (d) result of fusion (SIDWT output).

Table 6.2 and Figure 6-7 describe the performance of the proposed algorithm in comparison with some other methods, namely the Pulse-Coupled Neural Network using Gabor Filters method (PCNNGM) described in chapter 5, the Fuzzy-Based fusion using Maximum Selection and Gabor filters method (FMG) presented in chapter 4, the Contourlet Transform (CT) (Al-Azzawi, 2015), the Discrete Wavelet Transform (DWT) (Al-Azzawi, 2015), and Shearlets and Human Feature Visibility (SHFV) (Al-Azzawi, 2015).

Table 6.2: Comparison of image quality metrics for fusion algorithms.

Example    Algorithm    Entropy    Standard deviation    RMSE
1          Proposed     9.2141     76.8137               0.0689
           PCNNGM       9.1595     57.3573               0.0193
           FMG          3.4507     117.7324              0.0736
           CT           7.1332     54.1504               0.1662
           DWT          6.9543     47.2304               0.2703
           SHFV         7.6572     56.7993               0.1164
2          Proposed     9.1043     70.4425               0.049
           PCNNGM       8.9667     71.6609               0.0187
           FMG          3.8054     118.0412              0.1026
           CT           6.9351     46.6294               0.2538
           DWT          6.6997     41.4623               0.2889
           SHFV         7.3791     55.8533               0.2410
3          Proposed     8.9040     72.6212               0.0226
           PCNNGM       8.7121     56.8493               0.014
           FMG          2.3886     94.6862               0.0774
           CT           6.8824     43.1963               0.2422
           DWT          6.5198     42.0087               0.3142
           SHFV         6.9467     44.2937               0.2133


Figure 6-7: Fusion results on the original multimodality test images of examples 1, 2, and 3 using the proposed method, PCNNGM, FMG, CT, DWT, and SHFV.

6.4 Discussion

As can be seen from the results presented in section 6.3.1, the method proposed in this chapter demonstrates significant improvements over previously published fusion methods. As summarized in Figure 6-4, Figure 6-5, and Figure 6-6, the resulting fused images show improved geometric correlations, textural properties, feature enhancement and edge information, as measured by RMSE and standard deviation.

Analysis of the entropy values for the three examples indicates that the proposed method consistently yields higher entropy values than the previously available techniques, which suggests that the new method produces fused images with higher information content. Although some of the other results are mixed, the overall picture favors the proposed method over the set of benchmark algorithms.

The highest entropy values, depicting high information content, are visible in Figure 6-7, where the suppression of pixels with low intensities (caused by the introduction of lateral inhibition and excitation) and the redefined normalization of the image intensities in the FLM model yield improved precision. High entropy values can also be linked to the zeroing of noise energies in the input images by the space-variant Wiener filtering, resulting in images with higher information content. Varying the scale of the Wiener filter allows the information content drawn from the input images to be maximized.

A closer analysis of the results also indicates that the proposed method yields higher standard deviations than most existing methods, except for the FMG technique. The high standard deviation is likely the result of the increased lateral inhibition in the proposed FLM method, which increases the contrast, texture, and sharpness of the images.

The obtained RMSE figures are also lower (and therefore better) than those of the FMG, CT, DWT and SHFV techniques. The lower RMSE is presumably a result of increased precision during the suppression of neurons caused by the redefinition of lateral inhibition and excitation in the FLM model. Lower RMSE is also caused by the zeroing of noise energies in the input images by the space-variant Wiener filter, resulting in images with higher SNR values.

Finally, fusion by SIDWT reduces the loss of texture and features, and reduces additive noise during fusion, resulting in fused images with improved textural properties, greater information content, and increased geometric correlations. Fusion by SIDWT also enhances the visual appearance of the fused images more than any other method presented in Table 6.2 and Figure 6-7; this observation is a result of the shift invariance within pixels during the fusion process. The fused images also have better edge formation, unlike fused images formed with other wavelet fusion techniques. The improvements in edge formation result from the proposed method's elimination of subsampling, which ensures that a redundant wavelet representation is formed during fusion.

6.5 Conclusion

This chapter presents, explains, and evaluates a novel medical image fusion technique using SIDWT, based on the application of space-variant Wiener filtering, an FLM model with redefined lateral inhibition and excitation, and maximized normalization. Based on a set of objective performance evaluation criteria, when compared with a set of previously available fusion methods, the proposed method yields fused images exhibiting better geometric correlations, improved sharpness, information content, and visual impact. In order to isolate the effects of the space-variant Wiener filters and the SIDWT fusion method, future studies should test the proposed method with different fusion techniques, and also suppress the action of the proposed FLM model on pre-fused images.


Chapter 7

Perceptual local contrast enhancement and global variance minimization of medical images for improved fusion

7.1 Introduction

This chapter presents a novel preprocessing algorithm aimed at optimizing global and local image contrast based on human visual system (HVS) performance, prior to MIF. Local contrast enhancement is used to improve perceived quality, while global enhancement is used to reduce the overall variance of luminosity. The combined technique ultimately improves the quality of the fused image based on HVS performance.

Source images for fusion algorithms may have poor quality due to the use of low-quality sensors or patient movement during image acquisition. Low-quality images contain unwanted artifacts, noise, blur, or low contrast (Singh, Singh, Devi, & Sinam, 2012). Contrast is related to the difference between the darker and lighter pixels of the image, and is also dependent on the highly adaptive HVS (Singh, Singh, Devi, & Sinam, 2012). Contrast is quantified by measuring the relative variation of luminance and correlating it with the intensity gradient of the image. By improving luminance variance and the intensity gradient, it is therefore possible to improve the perceptual quality of the overall image (Gonzalez, et al., 2002).

Available contrast enhancement techniques, whether global or local, are only fast and reliable for some kinds of images (Singh et al., 2012). For other images, they tend to produce over-enhanced or over-saturated results (Lamberti, Montrucchio, & San, 2006), or else introduce changes in local details which give the images a washed-out appearance with poorly formed edges (Saleem, Beghdadi, & Boashash, 2012). Even the use of "imadjust", a Matlab® function designed to correct the mapping of intensity values, has failed to achieve better contrast, instead introducing luminance shift or saturation (Li & Bovik, 2010).

When the quality of local contrast in a source image is poor, it is often because image characteristics differ from region to region (Yousuf & Rakib, 2011). Local contrast enhancement methods, like those based on the Local Standard Deviation (LSD), have produced better results (Cvetkovic, Schirris, & de With, 2007); however, for medical images, local contrast enhancement can produce distracting artifacts in some image regions.

In view of the perceived importance of contrast as a key feature of pre-fusion images, and the aforementioned limitations of existing contrast enhancement techniques, a novel dual contrast enrichment method is proposed, in which local human perceptual quality is maximized and a modified global contrast enhancement algorithm is used to improve the finer image details. The proposed preprocessing method aims at optimizing global and local image contrast based on HVS performance: local contrast enhancement is used to improve perceived quality, while global enhancement is used to reduce overall variance of luminosity. The combined technique ultimately improves the quality of the fused image based on HVS performance (Singh, Singh, Devi, & Sinam, 2012). Lastly, image fusion is accomplished over two consecutive stages using a standard algorithm available in Matlab®.

The rest of this chapter is organized as follows: Section 7.2 introduces the proposed method, giving a detailed explanation of the local and global contrast enhancement methods used. Sections 7.3 and 7.4 present and discuss simulation results. Section 7.5 provides conclusions, which also point in the direction of future research.

7.2 Proposed method

7.2.1 Overview

Methods for global or local contrast enhancement are designed to address very specific contrast issues, making their area of application relatively narrow (Das & Kundu, 2013). Methods not specifically based on the HVS, which combine local and global features, have a broader scope (Beghdadi & Le Negrate, 1989).

For local enhancement, the proposed method optimizes for the contrast sensitivity of the human eye by adjusting the local luminance gradient at every point in the image; for global enhancement, it reduces the image variance. Control over the contrast discrimination sensitivity of the human eye is achieved during local contrast enhancement (Singh, Singh, Singh, & Devi, 2012), in which the images are treated as heightfields (a visual representation of a function which takes a two-dimensional point as input and returns a scalar value, the "height", as output) (Singh, Singh, Devi, & Sinam, 2012). Doing this maximizes the quality of the images based on human suprathreshold contrast discrimination sensitivity (Beghdadi & Le Negrate, 1989). Reducing image variance reduces non-uniform illumination in the image (Majumder & Irani, 2006). This is achieved by controlling the trade-off between image luminance and affine contrast stretching.

The proposed method, as illustrated in Figure 7-1, begins with the proposed local and global contrast enhancement of the input images.


Figure 7-1: Schematic representation of the proposed algorithm showing global and local contrast enhancement stages.

Global Histogram Equalization (GHE), one of the most commonly used contrast enhancement methods (Singh, Singh, Singh, & Devi, 2012; Lo & Puchalski, 2008), transforms the image over the interval $[0, L-1]$ such that (Maragatham & Roomi, 2015)

$$S_k = T(r_k) = (L-1)\sum_{j=0}^{k} p_r(r_j) = \frac{(L-1)}{MN}\sum_{j=0}^{k} n_j \qquad (7.1)$$

where $k = 0, 1, 2, \ldots, L-1$, $r_k$ is an input image pixel, $S_k$ is the corresponding pixel in the output image, $p_r(r_j)$ is the probability density function of $r_j$, $MN$ is the total number of pixels in the image, and $n_j$ is the number of pixels whose intensity equals $r_j$. Using GHE, some parts of the resulting image may be over-exposed or under-exposed (Singh, Singh, Singh, & Devi, 2012), which may result in loss of information. User control during enhancement is also lacking in GHE (Singh, Singh, Devi, & Sinam, 2012).


In order to add user control to the enhancement method, a contrast gain coefficient is included as a single user-defined parameter (Singh, Singh, Singh, & Devi, 2012). This semi-automatic global contrast enhancement is described by (Singh et al., 2012)

$$f_0 = (1 + C_g)(f_i - g_{mean}) + 0.5 \qquad (7.2)$$

where $f_0$ is the output pixel value, $f_i$ is the input pixel value, $C_g$ is the global contrast gain factor, and $g_{mean}$ is the global mean of the pixel values of the image. The application of equation 7.2 brings the global mean of the pixel values to 0.5, while the contrast gain of the output image is controlled by $C_g$. Increasing $C_g$ increases the global contrast of the image. The $g_{mean}$ approach is known to perform better than other methods; however, its application may suppress certain local properties, making it unsuitable in some applications (Singh et al., 2012).
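Equation 7.2 is a one-line point operation; the fragment below applies it with $C_g = 4.0$ from Table 7.1. The final clipping step is an assumption to keep the values displayable.

% Hedged illustration of equation 7.2.
fi = im2double(imread('ct_slice.png'));    % hypothetical input image
Cg = 4.0;                                  % global contrast gain (Table 7.1)
fo = (1 + Cg)*(fi - mean(fi(:))) + 0.5;    % equation 7.2
fo = min(max(fo, 0), 1);                   % clip to [0,1] (assumed)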

The proposed method also enhances local contrast by optimization of the local standard deviation (LSD). A modification of the commonly used LSD calculation prevents divide-by-zero errors by incorporating a negligible constant value (Polesel, Ramponi, & Mathews, 2000). The output image of this method can be represented by equation 7.3 (Singh et al., 2012),

$$f(i,j) = x(i,j) + \frac{C}{\sigma(i,j) + s}\,[x(i,j) - m(i,j)], \qquad (7.3)$$

where $x(i,j)$ is the gray-scale value of a pixel in the image, $f(i,j)$ is the enhanced value of $x(i,j)$, $m(i,j)$ is the local mean, $\sigma(i,j)$ is the LSD, $C$ controls the local contrast, and $s$ is a small quantity greater than zero, which acts as a lower (nonzero) bound on the LSD.
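Equation 7.3 can be implemented with local mean and local standard deviation maps, for example as follows; the 7x7 window and the value of $s$ are assumptions, while $C = 0.08$ follows Table 7.1.

% Hedged sketch of equation 7.3 (requires the Image Processing Toolbox).
x     = im2double(imread('ct_slice.png'));                % hypothetical input
m     = imfilter(x, fspecial('average', 7), 'replicate'); % local mean m(i,j)
sigma = stdfilt(x, true(7));                              % local standard deviation sigma(i,j)
C = 0.08;  s = 1e-3;                                      % C from Table 7.1; s assumed
f = x + (C ./ (sigma + s)) .* (x - m);                    % equation 7.3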


7.2.2 Source Image Preprocessing

7.2.2.1 Proposed local contrast enhancement method

The proposed method maximizes $x(i,j)$ using (Majumder & Irani, 2006)

$$x(i,j) = \frac{1}{4|\Omega|}\sum_{p\in\Omega}\ \sum_{q\in N_4(p)} \frac{f'(p) - f'(q)}{f(p) - f(q)} \qquad (7.4)$$

subject to a bound on the local gradients, which affect perception,

$$1 \leq \frac{f'(p) - f'(q)}{f(p) - f(q)} \leq (1 + \tau), \qquad (7.5)$$

and to the intensity range:

$$L \leq f'(p) \leq U, \qquad (7.6)$$

where the functions $f(p)$ and $f'(p)$ represent the gray value at pixel $p$ of the input and output images respectively, $|\Omega|$ denotes the cardinality of $\Omega$, and $N_4(p)$ is the set of four neighbors of $p$. $L$ and $U$ are the lower and upper intensity bounds of the gray values, respectively ($0 \leq L \leq U \leq 255$ for 8-bit images), and $\tau > 0$ is a constant that controls the amount of enhancement achieved, which is derived from the Weber law (Majumder & Irani, 2006)

$$\frac{\Delta C}{C} = \lambda, \qquad (7.7)$$

where $\lambda$ is a constant and $C$ is the suprathreshold contrast. Equation 7.7 indicates that visible contrast enhancement of higher-contrast patterns requires higher contrast increments. Equation 7.7 is therefore one of the bases for our local contrast enhancement technique, since it describes the relationship between the contrast and the gradient within the image (Vaseghi, 2008; Majumder & Irani, 2006),

$$C \propto \frac{\partial f}{\partial x}, \qquad (7.8)$$

which implies

$$\frac{\partial f}{\partial x} = \lambda C, \qquad (7.9)$$

where $f(x, y)$ is the image, $C$ is the contrast and $\lambda$ is the contrast proportionality constant. From equation 7.9, it is clear that in order to obtain the same perceived increase in contrast across an image, larger gradients must be stretched more than smaller gradients.

From equation 7.9, image stretching must be performed such that the contrast increment is proportional to the initial gradient, yielding (Majumder & Irani, 2006)

$$\frac{\partial f'}{\partial x} \geq (1 + \lambda)\frac{\partial f}{\partial x}, \qquad (7.10)$$

where $f'(x, y)$ is the contrast-enhanced image. Using equation 7.10, the contrast enhancement of an image $f(x, y)$ can be expressed in terms of $\tau$ as (Majumder & Irani, 2006)

$$\frac{\partial f'}{\partial x} \leq (1 + \tau)\frac{\partial f}{\partial x}, \qquad (7.11)$$

where $\tau \geq \lambda$. The lower bound on $\tau$ ensures that the sign of the gradient is preserved and never reduced below its original value, while its upper bound ensures the contrast is not over-enhanced. As suggested by (Mantiuk, Myszkowski, & Seidel, 2006), the overall bound can be represented as $(1 + \tau) \leq 2$; this preserves the finer details of the images. The constraint in equation 7.6 ensures that the image intensity values are never saturated. This also means that the pixels in the dark or bright regions of the image will still have their gradients enhanced.

A greedy iterative algorithm is proposed to maximize $x(i,j)$ subject to these constraints. This algorithm preserves the variation in local contrast across the image by using different degrees of enhancement at different locations in the image. The greedy algorithm is based on the notion that, given two neighboring pixels with gray values $\eta$ and $\varphi$, $\eta \neq \varphi$, scaling them by a factor of $(1 + \tau)$ results in $\eta'$ and $\varphi'$ such that (Majumder & Irani, 2006)

$$\frac{\eta' - \varphi'}{\eta - \varphi} = (1 + \tau) \qquad (7.12)$$

Simply scaling the values $f(p)$ for all $p \in (i,j)$ by a factor of $(1 + \tau)$ is not desirable because it could also lead to intensity saturation (Majumder & Irani, 2006). In the proposed greedy iterative strategy, the image $f$ is considered as an intensity surface along the $Z$ axis, sampled by an $m \times n$ uniform grid on the $XY$ plane. Each pixel $p$ is a grid point, and the height at $p$, $f(p)$, lies within the lower and upper bounds $L$ and $U$, respectively. During each iteration, the threshold plane $\Delta$ is governed by $L \leq \Delta \leq U$ (Majumder & Irani, 2006). We then generate an $m \times n$ masking matrix $\zeta$ by simply comparing $f$ to the threshold $\Delta$ to identify the pixel values above the threshold (Majumder & Irani, 2006),

$$\zeta(i,j) = \begin{cases} 1 & \text{if } f(i,j) > \Delta \\ 0 & \text{if } f(i,j) \leq \Delta. \end{cases} \qquad (7.13)$$

The matrix $\zeta$ has values $\zeta(i,j) = 1$ where two above-threshold vertices are adjacent to the image pixel, if they are neighbors in the image. The connected components in the mask are identified and represented by hillocks (the difference between minima and maxima) $\psi_i^\delta$, where the subscript $i$ is the component number or label and the superscript $\delta$ designates the threshold plane. Each hillock value is compared to the thresholds so that no pixel belonging to the hillock exceeds $U$, while still ensuring that the gradients of the pixels are enhanced by a factor no larger than $(1 + \tau)$, the unique scaling factor of each hillock (Majumder & Irani, 2006; Roli, 2005). The greedy iteration strategy cycles through each threshold plane $\delta_i$, with $L \leq \delta_i < U$, thus ensuring that the contrast of pixel values is enhanced within the bounds of the global and local constraints (Majumder & Irani, 2006).
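A minimal sketch of one pass of this greedy strategy is given below: the image is thresholded as in equation 7.13, the connected above-threshold components are treated as hillocks, and each hillock is stretched about its base by $(1 + \tau)$ only when the upper bound $U$ is respected. The threshold value, $\tau$, and the component handling are assumptions for illustration.

% Hedged sketch of one greedy iteration (Image Processing Toolbox required).
f = im2double(imread('ct_slice.png'));   % hypothetical input, values in [0,1]
tau = 0.2;  U = 1;  delta = 0.5;         % assumed enhancement and threshold values
zeta = f > delta;                        % masking matrix (equation 7.13)
cc = bwconncomp(zeta);                   % connected components = hillocks
for i = 1:cc.NumObjects
    idx = cc.PixelIdxList{i};
    base = min(f(idx));
    scaled = base + (1 + tau)*(f(idx) - base);   % scale gradients by (1 + tau)
    if max(scaled) <= U                          % apply only if the bound is respected
        f(idx) = scaled;
    end
end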

7.2.2.2 Proposed global contrast enhancement method

The proposed global enhancement method aims to minimize the global mean functional $g_{mean}(u)$ over the whole image, such that (Singh et al., 2012)

$$g_{mean}(u) = \int_\Omega |\nabla u - \nabla f|^2\,dx + \varrho \int_\Omega (u - \bar{u})^2\,dx, \qquad (7.14)$$

where $\varrho$ is a gain factor that controls the trade-off between the two additive terms in equation 7.14, $\nabla u$ is the image pixel gradient, and $\bar{u}$ is the mean pixel value of the image. The minimizer of $g_{mean}(u)$ is an image whose vector gradient is close to the vector gradient of $f$. Minimizing $g_{mean}(u)$ reduces the variance in the image while compensating for illumination inhomogeneities (Singh et al., 2012).

The solution of equation 7.14 is based on the Euler-Lagrange equation (Singh et al., 2012). If the mean value of $u$ coincides with the mean value of $f$, then the function $u$ that minimizes the functional $g_{mean}$ satisfies the Euler-Lagrange equation

$$\varrho(u - \bar{f}) - \Delta u + \Delta f = 0 \quad \text{over } \Omega, \qquad (7.15)$$

with the homogeneous Neumann boundary condition

$$\frac{\partial u}{\partial n} = 0 \quad \text{over } \partial\Omega, \qquad (7.16)$$

where $n$ is the vector normal to the boundary. The mean value of the solution is assumed to be irrelevant. To prove that this is true, let us consider the problem

$$\begin{cases} \varrho u - \Delta u = -\Delta f + \varrho K & \text{in } \Omega \\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \partial\Omega \end{cases} \qquad (7.17)$$


where $K$ is a constant, $\varrho > 0$, and $\Omega$ is the range of pixel values for eight-bit digital images ($0$ to $255$). Equation 7.17 has a unique solution when $\varrho \neq 0$. If $u_1$ and $u_2$ are the two solutions of equation 7.17 with constants $K_1$ and $K_2$ respectively, then it follows from the uniqueness of the solution of equation 7.17 that

$$u_1 - u_2 = K_1 - K_2 \qquad (7.18)$$

Setting $M_i = \max u_i$ and $m_i = \min u_i$, $i = 1, 2$, we get

$$\tilde{u}_i = \frac{255}{M_i - m_i}(u_i - m_i), \qquad i = 1, 2, \qquad (7.19)$$

but $M_1 = M_2 + K_1 - K_2$ and $m_1 = m_2 + K_1 - K_2$, thus implying $\tilde{u}_1 = \tilde{u}_2$. Therefore, the final solution is independent of the constant $K$. Consequently, for ease of workability, we set $K = 0$. Using this and previous results (Morel, Petro, & Sbert, 2014), the solution of equation 7.17 can be represented as

$$u(x, y) = f(x, y) - \frac{1}{2\pi}\Big(K_0\big(\sqrt{\varrho(x^2 + y^2)}\big) * f(x, y)\Big), \qquad (7.20)$$

where $u(x, y)$ is the globally enhanced output value of the original pixel value $f(x, y)$ at location $(x, y)$ of the input image, $K_0$ is the modified Bessel function of the second kind of order zero, and $*$ denotes convolution.
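Equation 7.20 can be approximated by an explicit convolution with a truncated kernel built from $K_0$, as sketched below; the kernel radius, the handling of the singularity at the origin, and the final rescaling (in the spirit of equation 7.19) are assumptions, so this should be read as an illustration rather than a faithful implementation.

% Hedged sketch of equation 7.20 with rho = 0.0005 from Table 7.1.
f = im2double(imread('ct_slice.png'));     % hypothetical input
rho = 0.0005;                              % gain factor (Table 7.1)
R = 15;                                    % assumed kernel radius (pixels)
[X, Y] = meshgrid(-R:R, -R:R);
r = sqrt(X.^2 + Y.^2);
r(R+1, R+1) = 0.5;                         % assumed half-pixel offset at the K0 singularity
kern = (1/(2*pi)) * besselk(0, sqrt(rho)*r);
u = f - imfilter(f, kern, 'replicate');    % u = f - (1/2pi) K0(sqrt(rho) r) * f
u = mat2gray(u);                           % rescale, in the spirit of equation 7.19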

7.3 Simulation results

7.3.1 Experimental setup

Three different examples, presented in Figure 7-2, Figure 7-3 and Figure 7-4, are used to evaluate the proposed method. The local and global contrast enhancement parameters are listed in Table 7.1. As in previous chapters, the final results were evaluated using objective quantitative measurements: standard deviation (SD), root mean square error (RMSE) and entropy. These were selected based on their use as descriptors of human perceptual image quality, information content, noise levels and signal-to-noise ratio (SNR), and because they are available for comparison with previously published fusion methods.

The example images and results are shown in Figure 7-2, Figure 7-3 and Figure 7-4; the corresponding measurements are presented in Table 7.2. The best results were then compared with measurements from previously published (existing) techniques. Comparison results are presented in Table 7.2 and Figure 7-5.

Table 7.1: List of proposed local and global contrast enhancement parameters and values used in this study.

Parameter    Value
C_g          4.0
ϱ            0.0005
C            0.08

Table 7.2 and Figure 7-5 describe the performance of the proposed algorithm in comparison with some other methods, including the Pulse-Coupled Neural Network using Gabor Filters method (PCNNGM), described in chapter 5; Fuzzy-Based fusion using Maximum Selection and Gabor filters (FMG), described in chapter 4; the Contourlet Transform (CT) (Yang, Guo, & Ni, 2008); the Discrete Wavelet Transform (DWT) (Pajares & De La Cruz, 2004); and Shearlets and Human Feature Visibility (SHFV) (Al-Azzawi, 2015).


Figure 7-2: Example 1: Input images are the computed tomography (CT) (top left) and magnetic resonance imaging (MRI) (bottom left) images. Results of the proposed global and local contrast enhancement methods, and of the first-stage fused images ("Imfuse"), are presented and correspondingly labeled. The result of the final fusion is shown at the extreme right.


Figure 7-3: Example 2: Input images are the CT (top left) and MRI (bottom left) images. Results of the proposed global and local contrast enhancement methods, and of the first-stage fused images ("Imfuse"), are presented and correspondingly labeled. The result of the final fusion is shown at the extreme right.


Figure 7-4: Example 3: Input images are the CT (top left) and MRI (bottom left) images. Results of the proposed global and local contrast enhancement methods, and of the first-stage fused images ("Imfuse"), are presented and correspondingly labeled. The result of the final fusion is shown at the extreme right.


Table 7.2: Comparison of image quality metrics for fusion algorithms.

Example    Algorithm    Entropy    Standard deviation    RMSE
1          Proposed     9.1732     70.4935               0.0143
           PCNNGM       9.1595     57.3573               0.0193
           FMG          3.4507     117.7324              0.0736
           CT           7.1332     54.1504               0.1662
           DWT          6.9543     47.2304               0.2703
           SHFV         7.6572     56.7993               0.1164
2          Proposed     9.0831     69.7102               0.0125
           PCNNGM       8.9667     71.6609               0.0187
           FMG          3.8054     118.0412              0.1026
           CT           6.9351     46.6294               0.2538
           DWT          6.6997     41.4623               0.2889
           SHFV         7.3791     55.8533               0.2410
3          Proposed     9.1066     70.2001               0.0159
           PCNNGM       8.7121     56.8493               0.014
           FMG          2.3886     94.6862               0.0774
           CT           6.8824     43.1963               0.2422
           DWT          6.5198     42.0087               0.3142
           SHFV         6.9467     44.2937               0.2133


Figure 7-5: Final fusion results for image examples 1, 2, and 3 using the proposed method, PCNNGM, FMG, SHFV, CT and DWT.


7.4 Discussion

The proposed method demonstrates significant improvements over previously published fusion methods. As summarized in Table 7.2, the resulting fused images show remarkable perceived quality and contrast, as measured by all three indicators.

Analysis of the entropy values for the three examples indicates higher entropy values than those measured for existing techniques, as tabulated in Table 7.2. Higher entropy values suggest finer image detail and better perceptual quality than available from other fusion techniques; the entropy results in Table 7.2 are consistently the highest in all examples. Higher entropy values are also associated with the reduction in non-uniform illumination achieved by the global contrast enhancement. The improved perceptual quality also increases the information content in comparison to the source images.

Although the proposed method has yielded improved results in most cases, it also produced mixed results in some other cases. The proposed method shows higher standard deviation values than all other benchmark methods, except for the FMG technique.

On the other hand, the RMSE results rank lowest among all techniques considered. The low RMSE is likely due to the greedy iterative algorithm, which ensures that the variation of local contrast across the images is appropriate to the local image features. The lowest RMSE values are also a consequence of the reduced image variance.

7.5 Conclusion

This chapter presents and describes a unique enhancement technique for image fusion, based on a reduction in image variance during global contrast enhancement and improvements in perceptual image quality during local contrast enhancement. Existing medical image fusion techniques have produced images which could benefit from further improvement. The proposed method produces fused images with reduced image variance, better perceptual image quality and improved contrast. Based on objective evaluation criteria, our method significantly outperforms existing leading image fusion techniques in terms of information content. Future studies should quantify the performance of the proposed method when the output of the proposed global contrast enhancement algorithm is used as the input to the proposed local contrast enhancement method, that is, using a serial rather than a parallel concatenation.


Chapter 8

A novel Block Toeplitz matrix for DCT-based, perceptually enhanced image fusion

8.1 Introduction

This chapter presents a novel Block Toeplitz matrix for DCT-based, perceptually enhanced image fusion. Image fusion techniques are generally chosen based on the application under consideration (Deshmukh & Bhosale, 2010). Pixel averaging is the simplest fusion technique, but it is often associated with a reduction in the contrast of the fused image. In order to overcome the side effects of the pixel-averaging method, many fusion techniques have been developed based on multi-resolution (Anish & Jebaseeli, 2012), multi-scale (Naidu & Raol, 2008) and statistical signal processing. Some researchers have also explored the fusion of images in a transformed domain, such as the Discrete Cosine Transform (DCT) domain, which is aimed at enhancing the contrast of the resulting fused image (Haghighat, Aghagolzadeh, & Seyedarabi, 2011).

The DCT has been used extensively in image compression since it is not computationally expensive (Naidu & Elias, 2013). Image fusion within the DCT domain is also popular since the process is highly efficient and less time consuming (Haghighat, Aghagolzadeh, & Seyedarabi, 2011). The DCT also has better energy compaction capabilities and is therefore preferred over other transforms (Yaroslavsky, 2014). Energy compaction is quantified by the extent to which the sum of squared transform coefficients is concentrated in a small fraction of the transformed image. The DCT tends to speed up the decay of the image spectra, which results in better energy compaction. The Discrete Fourier Transform (DFT), often used as a benchmark reference, has comparatively poor energy compaction, which leads to discontinuities at the signal borders.

Energy compaction in the DCT is enabled by a better image approximation, which is related to the eigenvectors of Toeplitz matrices (Zhang, Yu, Lou, Cham, & Dong, 2008). The symmetry of the Toeplitz matrices underlying the DCT is one reason for its computational efficiency (Yaroslavsky, 2015). These features have made the DCT suitable for a range of applications much wider than those of other fast transforms such as the DFT, the Discrete Fresnel Transform (DFrT), the Walsh-Hadamard Transform (WHT), the Karhunen-Loeve Transform (KLT) and the Haar Transform.

The 2D form of the DCT is the most common and is widely used for block-based image processing (Fracastoro, Fosson, & Magli, 2017). The main drawbacks of the 2D DCT, problems with smoothness and boundary discontinuities, have motivated many researchers to come up with various solutions and approaches, including modified implementations of the 2D DCT that address discontinuity issues along the boundaries of the formed image blocks. One of the main variations is based on the incorporation of direction information into the DCT (Directional DCT, or DDCT), which leads to the formation of smoother boundaries (Zeng & Fu, 2008). However, this approach has faced technical difficulties, such as a lack of coherence in the transform, leading to poorer results than those produced by the original DCT. Another attempt to address directionality in the DCT led to the development of mode-dependent directional transforms (MDDT) derived from the KLT, which use prediction residuals from training video data. In this approach, a Block Toeplitz matrix is used to reduce the number of transform matrices needed during coding (Tanizawa, Yamaguchi, Shiodera, Chujoh, & Yamakage, 2010). The Block Toeplitz matrix helps in giving a spectral interpretation of channel diversity (Gazzah, Regalia, & Delmas, 2001), meaning that more features are extracted for visualization. Images with better visualization of features can be interpreted more easily and are therefore more useful in diagnosis. While this approach has had some success, it also has drawbacks: it is time-consuming, and optimal results are specific to the data sets used. A recently developed approach, in which each image pixel is viewed as a node in a graph and edges are seen as connectivity relations, has resulted in the design of more efficient edge-aware transforms (Kim, Narang, & Ortega, 2012; Shen, et al., 2010). However, the results of this approach suffer from distorted and low-information-content images when compared to other initial transforms (Kim, Narang, & Ortega, 2012; Fracastoro, Fosson, & Magli, 2017).

In this chapter, we present a new variation of the DCT in which the smoothness and boundary discontinuity issues associated with the previous methods are mitigated by using a novel Block Toeplitz matrix. The resulting DCT coefficients are fused using the standard built-in MATLAB® function 'imfuse'. Following the fusion process, the contrast of the fused image is enhanced to reduce contrast inhomogeneities that may have resulted from the DCT processing. The final image is also smoothed through a Gaussian-kernel bilateral filter for noise suppression. These later stages of the algorithm aim at improving the finer image details.

The rest of this chapter is organized as follows: Section 8.2 introduces the proposed method, giving a detailed explanation of the novel Block Toeplitz matrix used; Section 8.3 presents simulation results; Section 8.4 discusses them; and Section 8.5 provides conclusions and points at directions for future research.

8.2 Proposed Method

8.2.1 Overview

Properties of linear discrete transforms such as scaling, shifting and convolution help achieve image transformation (Britanak, 2001), which in turn contributes to the removal of redundancy between neighboring pixels (Khayam, 2003). The action of the DCT on images is based on approximations of the eigenvectors of tridiagonal Toeplitz matrices (Strang, 1999).

Block Toeplitz matrices have a wide range of applications, from the discretization of partial differential operators to linear autoregressive models for images. Images are finite and non-periodic, since the boundaries of the image have no correlation among themselves (Khayam, 2003). A 2D image may contain 512 x 512 pixels, where the grey level of the pixel at position (i, j) is represented by an amplitude x(i, j) between 0 and 255, using 8 bits per pixel. The vector x can be filtered as x * h, where h is a filter vector (Strang, 1999). The main existing image processing efforts around the DCT have concentrated on improving the computation of the blocks while disregarding the role played by the vector h in improving the end results of the processed image. The method investigated here begins by noting that using an irreducible, singular, nonzero Block Toeplitz block in the DCT increases the correlation between the pixels (Fahmy, 2011). Increased correlation between pixels during transformation improves the complexity-accuracy trade-off over the pixel intensities (Unser, Aldroubi, & Eden, 1993). The proposed method is expected to enhance the finer details of the image by adjusting contrast and reducing the noise level of the final image.

The proposed method, as illustrated in Figure 8-1, begins with the proposed DCT implementation applied to the input images, followed by fusion of the resulting coefficients and by contrast adjustment using adaptive histogram equalization to maximize human perceptual quality; a final step implements smoothing by a bilateral filter with a Gaussian kernel to suppress any noise added during the earlier stages.

Figure 8-1: Schematic representation of the novel fusion algorithm combining the proposed DCT with contrast adjustment and smoothing for improved human perceptual quality.
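Putting these stages together, the following minimal end-to-end sketch follows the structure of Figure 8-1 under several assumed stand-ins: the standard 2D DCT replaces the proposed Block Toeplitz variant, plain coefficient averaging stands in for MATLAB's 'imfuse', and scikit-image's adaptive histogram equalization and bilateral filter play the roles of the stages described in Sections 8.2.3 and 8.2.4:

import numpy as np
from scipy.fft import dctn, idctn
from skimage import exposure, restoration

def fuse(ct, mri):
    """Fuse two registered float images in [0, 1] of equal shape."""
    # Forward 2-D DCT of each source image.
    C1, C2 = dctn(ct, norm='ortho'), dctn(mri, norm='ortho')
    # Fusion of the DCT coefficients (averaging as a stand-in for imfuse).
    fused = np.clip(idctn(0.5 * (C1 + C2), norm='ortho'), 0.0, 1.0)
    # Contrast adjustment by adaptive histogram equalization (Section 8.2.3).
    fused = exposure.equalize_adapthist(fused)
    # Smoothing by a bilateral filter with Gaussian kernels (Section 8.2.4).
    return restoration.denoise_bilateral(fused, sigma_color=0.05, sigma_spatial=2)

rng = np.random.default_rng(1)
out = fuse(rng.random((64, 64)), rng.random((64, 64)))
print(out.shape, out.min(), out.max())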

8.2.2 Block Toeplitz matrix

A block Toeplitz matrix is a special kind of matrix in which blocks are repeated along its diagonal (Gutierrez-Gutierrez & Crespo, 2012). Block Toeplitz matrices provide a good image approximation due to their eigenvectors, which to date has been the sole motivation behind their application to the DCT (Sanchez, Garcia, Peinado, Segura, & Rubio, 1995). A Toeplitz matrix constitutes the autocovariance matrix of a first-order stationary Markov process, and it yields better performance than earlier transforms such as the Discrete Fourier Transform (DFT) (Britanak, Yip, & Rao, 2010). Existing implementations of the DCT are based on a matrix with entries $\rho^{|j-k|}$, representing the covariance matrix of a useful class of signals, where the coefficient $\rho$ measures the correlation between nearest neighbors. The image resulting from these DCT implementations is lossy and often displays a lower quality than the input image (Khayam, 2003). The method proposed in this work seeks to overcome this.

In this study, a block Toeplitz matrix is proposed in which the correlation coefficient is $(\rho \pm \Delta x)$, where $\Delta x$ denotes a small change in pixel values, and the block size is $N = 8$. The consideration of small changes in image pixel intensities is viewed as the main contribution of this research effort. Modeling small variations in pixel intensities enhances the overall image contrast, hence improving human visual perception of the final image. Incorporating small pixel-intensity variations in the Toeplitz matrix design also increases the smoothness of the intensity trends in the final image and improves its finer details.

The 8 by 8 block Toeplitz matrix used is of the form represented by

$$A = \begin{bmatrix}
\beta & \alpha & \beta & \beta & \alpha & \beta & \xi & \beta \\
\alpha & \beta & \alpha & \beta & \beta & \alpha & \beta & \xi \\
\beta & \alpha & \beta & \alpha & \beta & \beta & \alpha & \beta \\
\beta & \beta & \alpha & \beta & \alpha & \beta & \beta & \alpha \\
\alpha & \beta & \beta & \alpha & \beta & \alpha & \beta & \beta \\
\beta & \alpha & \beta & \beta & \alpha & \beta & \alpha & \beta \\
\xi & \beta & \alpha & \beta & \beta & \alpha & \beta & \alpha \\
\beta & \xi & \beta & \alpha & \beta & \beta & \alpha & \beta
\end{bmatrix}, \qquad \xi < \alpha < \beta \le 1. \tag{8.1}$$
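Since every diagonal of A is constant, the matrix can be generated from its first row alone; the short sketch below (an illustrative reconstruction, using the parameter values later listed in Table 8.1) builds it explicitly:

import numpy as np
from scipy.linalg import toeplitz

beta, alpha, xi = 1.0, 0.9, 0.1    # Table 8.1 values; xi < alpha < beta <= 1
first_row = [beta, alpha, beta, beta, alpha, beta, xi, beta]

# toeplitz(c) builds the symmetric Toeplitz matrix whose first column (and,
# by symmetry, first row) is c, so every diagonal is constant, as in (8.1).
A = toeplitz(first_row)

assert np.allclose(A, A.T)         # the matrix is symmetric
print(A)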

The output coefficients of the Block Toeplitz DCT step are fused via 'imfuse', and the results are then processed for contrast adjustment and bilateral smoothing (Acton & Bovik, 1999).


8.2.3 Contrast adjustment using adaptive histogram equalization

Contrast enhancement using adaptive histogram equalization has been adopted due to its simplicity and efficiency (Zhu & Huang, 2012). Adaptive histogram equalization has performed excellently on both natural and medical images (Pizer, et al., 1987). Contrast enhancement through adaptive histogram equalization uses a probability distribution to adjust the gray levels of an image.

In the proposed algorithm presented in Figure 8-1, the pixels of the imfused image are gray-mapped and transformed so that the resulting histogram is smooth and uniform (Zhu & Huang, 2012). This mapping and transformation of the imfused image pixels ensures that contrast enhancement is achieved. Assume the gray value of the imfused image is r, with 0 < r < 1, and that its probability density function is given by p(r); the gray value of a pixel of the output image (the image that will then be smoothed by the bilateral filter with a Gaussian kernel) is s, with 0 < s < 1. The probability density function of the output image is given by p(s), and the mapping function is represented by s = T(r). If every bar of the histogram has equal height, then

$$p_s(s)\,ds = p_r(r)\,dr \tag{8.2}$$

Suppose the mapping function s = T(r) is monotonically increasing on the interval of interest and that its inverse r = T^{-1}(s) is also monotonic. Since the equalizing map is the cumulative distribution function of r, its derivative satisfies ds/dr = p_r(r); substituting into (8.2), we deduce that (Zhu & Huang, 2012)

$$p_s(s) = \left[\, p_r(r)\,\frac{dr}{ds} \,\right]_{r=T^{-1}(s)} = \left[\, p_r(r)\,\frac{1}{p_r(r)} \,\right]_{r=T^{-1}(s)} = 1 \tag{8.3}$$


Based on (8.3), the entropy of the final image is given by (Zhu & Huang, 2012)

$$E = \sum_{r=0}^{n-1} e(r) = -\sum_{r=0}^{n-1} p_r \log p_r \tag{8.4}$$

The maximum entropy of the whole image is achieved by ensuring that the histogram of the image has a uniform distribution, that is, when $p_0 = p_1 = \cdots = p_{n-1} = \frac{1}{n}$. The image with maximum entropy from this stage is then taken to the next level for smoothing and edge completeness.
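A minimal sketch of the mapping s = T(r) of (8.2)-(8.3) is given below; it implements global histogram equalization through the cumulative distribution (the adaptive variant used in this work applies the same mapping over local tiles, which is not reproduced here):

import numpy as np

def equalize(img, n=256):
    """Global histogram equalization of an 8-bit image via s = T(r)."""
    hist, _ = np.histogram(img, bins=n, range=(0, n))
    p_r = hist / hist.sum()          # gray-level pdf p_r(r)
    T = np.cumsum(p_r)               # monotonic mapping s = T(r)
    return (T[img] * (n - 1)).astype(np.uint8)

rng = np.random.default_rng(2)
img = rng.normal(128, 20, (64, 64)).clip(0, 255).astype(np.uint8)
out = equalize(img)
# The histogram of `out` is approximately uniform, which by (8.4) drives
# the entropy toward its maximum of log2(n) bits.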

8.2.4 Smoothing by bilateral filter using Gaussian Kernel

Image edges represent high frequency components, and as such, they can be smoothed by lowpass filtering. In the proposed algorithm, images are processed post-fusion in order to suppress noise and promote uniform smoothing by a bilateral filter that is defined by the combination of a domain kernel and a range kernel, as described next.

Let us assume that the image created in Section 8.2.3 has a pixel of interest $p$ with intensity $y_p$. Pixel averaging is set to operate within an $\Omega$-neighborhood of $p$. If $q$ is a pixel within the $\Omega$-neighborhood of $p$ with intensity $y_q$, the output of the filter obtained by using the kernel $\phi_{p,q}$ is represented as (Venkatesh, Mohan, & Seelamantula, 2015)

$$\hat{x}_p = h_p^{-1} \sum_{q \in \Omega} \phi_{p,q}\, y_q, \tag{8.5}$$

where $h_p$ is a normalizing factor given by $h_p = \sum_{q \in \Omega} \phi_{p,q}$, and the filter kernel $\phi_{p,q}$ is formed by combining the domain and range kernels as

$$\phi_{p,q}(y_p, y_q) = w_{p-q}\, r(y_p - y_q), \tag{8.6}$$

where $r(y_p - y_q)$ is the range kernel and $w_{p-q}$ represents the domain kernel.


In the range kernel, pixels with relatively similar intensities are assigned higher weights, and vice versa. The range kernel therefore quantifies the similarity between the pixel intensity $y_p$ and a neighborhood pixel intensity $y_q$:

$$r(y_p - y_q) = \exp\!\left(-\frac{|y_p - y_q|^2}{2\sigma_r^2}\right), \tag{8.7}$$

where $\sigma_r$ controls the rate of Gaussian decay and directly determines the degree of smoothing. The domain kernel, in turn, depends only on the geometric distance between the pixel of interest $p$ and a neighboring pixel $q$, also following a Gaussian shape:

$$w_{p-q} = \exp\!\left(-\frac{|p - q|^2}{2\sigma_d^2}\right), \tag{8.8}$$

where $|p - q|$ is the geometrical distance between the pixels $p$ and $q$, and $\sigma_d$ is the Gaussian decay parameter. Pixels that are closer to $p$ are assigned higher weights; the domain kernel coefficients decay according to $\sigma_d$ as the distance increases. This kernel acts only on the spatial domain between pixels $p$ and $q$, which explains why it is referred to as the domain kernel. Pixels outside this domain are not considered in the weighted average defined in (8.5).
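For concreteness, a direct (brute-force) rendering of (8.5)-(8.8) is sketched below; it is an illustration rather than an efficient implementation, and the window half-width and decay parameters are arbitrary assumptions:

import numpy as np

def bilateral(y, sigma_d=2.0, sigma_r=0.1, half=3):
    """Brute-force bilateral filter following (8.5)-(8.8)."""
    out = np.zeros_like(y)
    H, W = y.shape
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - half), min(H, i + half + 1)
            j0, j1 = max(0, j - half), min(W, j + half + 1)
            patch = y[i0:i1, j0:j1]                  # Omega-neighborhood of p
            ii, jj = np.mgrid[i0:i1, j0:j1]
            w = np.exp(-((ii - i) ** 2 + (jj - j) ** 2) / (2 * sigma_d ** 2))  # (8.8)
            r = np.exp(-((patch - y[i, j]) ** 2) / (2 * sigma_r ** 2))         # (8.7)
            phi = w * r                              # combined kernel, (8.6)
            out[i, j] = (phi * patch).sum() / phi.sum()   # weighted mean, (8.5)
    return out

rng = np.random.default_rng(3)
noisy = np.clip(0.5 + 0.1 * rng.standard_normal((32, 32)), 0, 1)
print(noisy.std(), bilateral(noisy).std())   # the filtered image is smoother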

8.3 Simulation Results

8.3.1 Experimental Setup

Three different examples are used to evaluate the proposed method. Each example aims at fusing one Computed Tomography (CT) image and one Magnetic Resonance Imaging (MRI) image into one composite image. The Block Toeplitz parameter values proposed for the matrix defined in (8.1) are listed in Table 8.1. The final results were evaluated using three objective quantitative measurements: standard deviation (SD), root mean square error (RMSE) and entropy. These evaluation criteria were selected based on their use as descriptors of human perceptual image quality, information content and signal-to-noise ratio (SNR). In addition, these criteria were chosen because they are available for comparison with previously published fusion methods. SD is a measure of contrast; images with a high standard deviation show better human visual properties. RMSE measures the amount of change per pixel due to processing; a lower RMSE indicates lower noise levels in the image and thus improved visual features and finer image details. Entropy measures image quality in terms of information content and visibility of finer details, including textural uniformity (Singh & Kapoor, 2014).
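The three measures can be computed as in the generic sketch below (the exact reference image paired with each fused result for the RMSE is not restated here, so the usage lines rely on an assumed synthetic pair):

import numpy as np

def sd(img):
    """Standard deviation of pixel intensities: a proxy for contrast."""
    return float(img.std())

def rmse(img, ref):
    """Root mean square error: per-pixel change relative to a reference."""
    return float(np.sqrt(np.mean((img.astype(float) - ref.astype(float)) ** 2)))

def entropy(img, n=256):
    """Shannon entropy of the gray-level histogram, in bits."""
    hist, _ = np.histogram(img, bins=n, range=(0, n))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(4)
ref = rng.integers(0, 256, (64, 64)).astype(np.uint8)
fused = np.clip(ref + rng.normal(0, 5, ref.shape), 0, 255).astype(np.uint8)
print(sd(fused), rmse(fused, ref), entropy(fused))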

The example images and results are shown in Figure 8-2, Figure 8-3 and Figure 8-4; the corresponding measurements are presented in Table 8.2. The best results were then compared with measurements from previously published (existing) techniques. Comparison results are presented in Table 8.2 and in Figures 8-6 through 8-8.

Table 8.1: Proposed parameters of the Block Toeplitz matrix defined in (8.1), used in experiments 1, 2 and 3.

Parameter    Value
β            1
ξ            0.1
α            0.9


Figure 8-2: Example 1. Input images are the CT (above) and MRI (below) images. The result of the final fusion is also shown (right).


Figure 8-3: Example 2. Input images are the CT (above) and MRI (below) images. The result of the final fusion is also shown (right).

Figure 8-4: Example 3. Input images are the CT (above) and MRI (below) images. The result of the final fusion is also shown (right).

Table 8.2 and Figure 8-6 describe the performance of the proposed algorithm in comparison with other methods, including the Contourlet Transform (CT) (Yang, Guo, & Ni, 2008), the Discrete Wavelet Transform (DWT) (Pajares & De La Cruz, 2004), and Shearlets and Human Feature Visibility (SHFV) (Al-Azzawi, 2015).

The results presented in Table 8.2 are also plotted for convenient visualization in Figures 8-6, 8-7 and 8-8. The numbers clearly show the improvements the proposed method offers in comparison to the benchmark methods. The entropy values of the proposed method are greatly improved due to the use of the proposed DCT matrix and the smoothing process after fusion. Similarly, the SD values are higher than in all the other methods, which indicates that the fused results have high contrast, which is desirable. Finally, the low RMSE values of the proposed method suggest that the final fused results demonstrate improved noise suppression.


Table 8.2: Comparison of image quality metrics for various fusion algorithms, including the proposed one.

Example   Algorithm   Entropy   Standard Deviation   RMSE
1         Proposed    8.1634    82.9216              0.0179
          CT          7.1332    54.1504              0.1662
          DWT         6.9543    47.2304              0.2703
          SHFV        7.6572    56.7993              0.1164
2         Proposed    8.1403    78.3379              0.0129
          CT          6.9351    46.6294              0.2538
          DWT         6.6997    41.4623              0.2889
          SHFV        7.3791    55.8533              0.2410
3         Proposed    6.9895    94.2736              0.0177
          CT          6.8824    43.1963              0.2422
          DWT         6.5198    42.0087              0.3142
          SHFV        6.9467    44.2937              0.2133


Figure 8-5: Final fusion results for image examples 1, 2, and 3 using the proposed method, CT, DWT and SHFV.

A summary of the performance of the proposed method in relation to existing medical image fusion methods based on other transforms is also shown in the bar graphs of Figure 8-6, Figure 8-7 and Figure 8-8. The bar graphs demonstrate the improvements in the key objective measures achieved by the proposed method relative to the benchmark methods used in this chapter. The orange bars represent standard deviation values and show clearly that the proposed method yields results up to 123% greater than the existing methods used in the comparison. The blue bars illustrate entropy values and show that the proposed method yields the highest values, with an improvement of 24%. Lastly, the gray bars represent the RMSE values and show that the proposed method yields the lowest of these, with a reduction of up to 96%.

[Bar graph: performance evaluation comparison for Example 1, showing entropy and standard deviation (left axis) and RMSE (right axis) for the Proposed, CT, DWT and SHFV methods.]

Figure 8-6: The performance of the proposed method in relation to others, for example 1 in Figure 8-2.


[Bar graph: performance evaluation comparison for Example 2, showing entropy and standard deviation (left axis) and RMSE (right axis) for the Proposed, CT, DWT and SHFV methods.]

Figure 8-7: The performance of the proposed method in relation to others, for example 2 in Figure 8-3.


[Bar graph: performance evaluation comparison for Example 3, showing entropy and standard deviation (left axis) and RMSE (right axis) for the Proposed, CT, DWT and SHFV methods.]

Figure 8-8: The performance of the proposed method in relation to others, for example 3 in Figure 8-4.

8.4 Discussion

The summary of results presented in Table 8.2 and in Figures 8-6, 8-7, and 8-8 shows that the proposed method exhibits better quality indicators than any of the reference methods, as measured by entropy, standard deviation and RMSE.

Further analysis of Table 8.2 and Figures 8-6, 8-7 and 8-8 indicates that the entropy values for the proposed algorithm are greater than for any other technique. Images with higher entropies depict visibly finer details and less inhomogeneity, which in turn suggests better human perceptual quality and increased smoothness. The higher entropy values are due to better extraction and enhancement of the finer image details, which can be attributed to the use of the Block Toeplitz DCT matrix, resulting in an increase in the amount of information contained in the final images as compared to the input images.

The proposed method also yielded results with a better standard deviation than any other reference method. Post-fusion processing aimed at enhancing contrast and edge completeness is the probable reason behind these very high standard deviation values. A significant reduction in RMSE is also observed, which can likewise be associated with the contrast enhancement and Gaussian-kernel bilateral smoothing stages of the proposed technique.

8.5 Conclusion

Existing state-of-the-art medical image fusion techniques have failed to address the loss of fine details and the lack of image contrast. This chapter presents a new transform-domain image fusion technique based on a Block Toeplitz DCT matrix, followed by a post-processing stage aimed at reducing noise, improving edge completeness and smoothing the fused image. The proposed method has produced fused images with lower RMSE, higher SD and better entropy than any of the selected benchmark methods. Future studies should assess the performance of the proposed technique in the absence of the contrast adjustment and Gaussian-kernel bilateral smoothing stages.


Chapter 9

Summary and future studies

9.1 Summary

This dissertation consists of a research effort directed at the problem of improving the quality of images resulting from the fusion of two or more multi-modal medical images. In the context of this project, the concept of fused medical image quality represents one or several of the following characteristics: dense textural properties, edge formation, adequate contrast, visual detail, illumination balance, rich information content, and improved image smoothness and finer details; a series of features generally associated with better perceptual properties and, consequently, improved usefulness for human diagnosis.

This effort is motivated by the perceived deficiencies of existing fusion algorithms, a perception supported by an exhaustive literature review on the subject.

Within the framework of this research, image fusion is conceived as a combination of pre-processing of the source images, the fusion process itself, and post-processing of the fused image. Such an integral, multi-dimensional conception of the fusion process allows for increased complexity, versatility and power of the proposed algorithms in terms of the desired objectives.

A variety of different approaches have been attempted in this effort, each of them inspired by a novel application and combination of well-known image processing and fusion tools. The motivation for such multiple approaches is the need to explore different combinations of techniques, as well as the intention of improving different sets of properties in the fused image. The first approach, inspired by the appealing properties of Gabor filters (invariance, noise rejection and edge detection, among others), maximum selection and fuzzy-based combining (resolution of uncertainty and redundancy), seeks to enhance the textual properties and edge formation in the fused image. The second postulated method exploits the aforementioned qualities of Gabor filtering, together with maximum combining and the image enhancing qualities of a modified pulse-coupled neural network (in an attempt to recreate the properties of the human visual system), in order to mitigate some of the common defects in source medical images, such as poor illumination and shift variance. The third proposed algorithm is based on a self-adjusting, space-variant Wiener filter followed by a feature-linking model implemented through a self-organizing pulse-coupled neural network, with fusion using a shift-invariant discrete wavelet transform; its purpose is to sharpen the edges, improve geometric corrections, or even enhance certain features invisible in the original images. The fourth proposed method consists of a dual enhancement strategy (local contrast enhancement in parallel with global variance minimization) operating on the source images, followed by standard fusion, and it is aimed at improving the quality of the fused image in terms of human visual system perception. Finally, the last approach entails enhancing the finer details of the inputs followed by improving the perceptual quality of the fused images.

All the methods have been tested and evaluated using three pairs of medical images and a set of three standard objective performance measures. The performance figures have been compared to those obtained from the application of existing fusion techniques to the same set of images. The third and fourth proposed methods have also been contrasted against the first two proposed algorithms. These comparisons are presented both visually and quantitatively. The outcome of such quantitative comparisons mostly favors the newly proposed methods over the benchmark techniques, in most cases decisively; furthermore, the third and fourth algorithms appear to prevail over the fifth, first and second ones. It can be concluded, consequently, that the techniques presented here represent a significant contribution toward further development of the key area of medical image fusion.

9.2 Future studies

Some directions for future work derived from this dissertation are related to the need to isolate the effect of some of the key building blocks of the proposed methods. For instance, there is a perceived need to assess the effect of Gabor filtering on the first algorithm, which could be done by suppressing this block from the processing chain and observing the results on the fused image. The same could be said about the second method, which also uses Gabor filtering. Maximum selection could also be the subject of analysis: its real impact on the performance of the first and second algorithms can be evaluated by excluding this processing strategy. The high-scale Wiener filter used in the third algorithm is also viewed as a potential candidate for this impact-isolation strategy.

Another set of future efforts can be envisioned in the area of postulating alternative processing blocks and schemes. For instance, alternative fusion approaches can be tested in lieu of the shift-invariant discrete wavelet transform that is at the core of the third algorithm. Serial concatenation of the local contrast enhancement strategy and the global variance minimization method used in the fourth algorithm may replace the currently used parallel concatenation structure, which would additionally reduce the algorithmic complexity by decreasing the number of required fusions. The application of a Block Toeplitz matrix as part of the Discrete Cosine Transform (DCT) can also be subjected to further testing with different sets of values to evaluate the effect of every parameter in the matrix.

Finally, the proposed methods can be subjected to a more extensive evaluation that includes a larger set of source images. New evaluations could also contemplate alternative objective measurements, as well as the possibility of subjective evaluation, which would require specialized visual assessment by trained observers.


References

[1] Abramoff, M. D., Magalhaes, P. J., & Ram, S. J. (2004). Image processing with ImageJ. Biophotonics international, 11(7), 36-42. [2] Abreu, E., Lightstone, M., Mitra, S. K., & Arakawa, K. (1996). A new efficient approach for the removal of impulse noise from highly corrupted images. IEEE transactions on image processing, 5(6), 1012-1025. [3] Acton, S. T., & Bovik, A. C. (1999). Piecewise and local image models for regularized image restoration using cross-validation. IEEE Transactions on Image Processing, 8(5), 652-665. [4] Addison, P. S. (2005). Wavelet transforms and the ECG: a review. Physiological measurement, 26(5), 155. [5] Al-Azzawi, N. A. (2015). Medical Image Fusion based on Shearlets and Human Feature Visibility. International Journal of Computer Applications, 125(12), 1-12. [6] Alenezi, F., & Salari, E. (2018, March). A Fuzzy-Based Medical Image Fusion Using a Combination of Maximum Selection And Gabor Filters. International Journal of Scientific & Engineering Research, 9(3), 118-129. [7] Alenezi, F., & Salari, E. (2018). A Novel Pulse-Coupled Neural Network using Gabor Filters for Medical Image Fusion. International Journal of Computer Science And Technology, 9(2), 72-81. [8] Alenezi, F., & Salari, E. (2018). Perceptual local contrast enhancement and global variance minimization of medical images for improved fusion. International Journal of Imaging Science and Engineering (IJISE), 10(3), 1-11. [9] Anish, A., & Jebaseeli, T. J. (2012). A Survey on Multi-Focus Image Fusion Methods. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(8), 319. [10] Beghdadi, A., & Le Negrate, A. (1989). Contrast enhancement technique based on local detection of edges. Computer Vision, Graphics, and Image Processing, 46(2), 162-174. [11] Benesty, J., Chen, J., Huang, Y. A., & Doclo, S. (2005). Study of the Wiener filter for noise reduction. Speech Enhancement (pp. 9-41). Springer.


[12] Bhatnagar, G., Wu, Q. J., & Liu, Z. (2013). Directive contrast based multimodal medical image fusion in NSCT domain. IEEE Transactions on Multimedia, 14(5), 1014-1024. [13] Blasch, E. P. (1999). Biological information fusion using a PCNN and belief filtering. Neural Networks, 1999. IJCNN'99. International Joint Conference on. 4, pp. 2792-2795. IEEE. [14] Blum, R. S., Xue, Z., & Zhang, Z. (2005). An overview of image fusion. In R. S. Blum, & Z. Liu, Multi-Sensor Image Fusion and Its Applications (pp. 16-50). CRC Press. [15] Britanak, V. (2001). Discrete Cosine and Sine Transforms. The Transform and Data Compression HandbookEd. KR Rao et al. Boca Raton, CRC Press LLC. [16] Britanak, V., Yip, P. C., & Rao, K. R. (2010). Discrete cosine and sine transforms: general properties, fast algorithms and integer approximations. Elsevier. [17] Brosch, T., & Neumann, H. (2014). Interaction of feedforward and feedback streams in visual cortex in a firing-rate model of columnar computations. Neural Networks, 54, 11-16. [18] Broussard, R. P., & Rogers, S. K. (1996). Physiologically motivated image fusion using pulse-coupled neural networks. Applications and science of artificial neural networks II. 2760, pp. 372-384. International Society for Optics and Photonics. [19] Byrne, J. H. (2013). Introduction to neurons and neuronal networks. Textbook for the Neurosciences. https://nba.uth.tmc.edu/neuroscience/s1/introduction.html [20] Candes, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on information theory, 52(2), 489-509. [21] Chang, T., & Kuo, C.-C. (1993). Texture analysis and classification with tree-structured wavelet transform. IEEE Transactions on image processing, 2(4), 429-441. [22] Chen, J., Benesty, J., Huang, Y., & Doclo, S. (2006). New insights into the noise reduction Wiener filter. IEEE Transactions on audio, speech, and language processing, 14(4), 1218-1234. [23] Cheng, G., Han, J., & Lu, X. (2017). Remote sensing image scene classification: benchmark and state of the art. Proceedings of the IEEE, 105(10), 1865-1883. [24] Chiorean, L., & Vaida, M.-F. (2009). Medical image fusion based on discrete wavelet transform using Java technology. Information Technology Interfaces, 2009. ITI'09. Proceedings of the ITI 2009 31st International Conference on (pp. 55-60). IEEE. [25] Cristobal, G., Schelkens, P., & Thienpont, H. (2013). Optical and digital image processing: fundamentals and applications. John Wiley & Sons.


[26] Cvetkovic, S. D., Schirris, J., & de With, P. H. (2007). Locally-adaptive image contrast enhancement without noise and ringing artifacts. Image Processing, 2007. ICIP 2007. IEEE International Conference on. 3, pp. III--557. San Antonio, TX, USA: IEEE. [27] Dammavalam, S. R., Maddala, S., & Prasad, M. H. (2012, February). Quality Assessment of Pixel-Level Image Fusion Using Fuzzy Logic. International Journal on Soft Computing, 3(1), 11-23. [28] Das, S., & Kundu, M. K. (2013). A neuro-fuzzy approach for medical image fusion. IEEE transactions on biomedical engineering, 60(12), 3347-3353. [29] Das, S., Chowdhury, M., & Kundu, M. K. (2011). Medical image fusion based on ripplet transform type-I. Progress In Electromagnetics Research, 30, 355-370. [30] Deshmukh, M., & Bhosale, U. (2010). Image fusion and image quality assessment of fused images. International Journal of Image Processing (IJIP), 4(5), 484. [31] Eckhorn, R., Reitboeck, H. J., Arndt, M. T., & Dicke, P. (1990). Feature linking via synchronization among distributed assemblies: Simulations of results from cat visual cortex. Neural computation, 2(3), 293-307. [32] El-Hoseny, H. M.-S., Elrahman, W. A., & El-Samie, F. E. (2017). Medical image fusion techniques based on combined discrete transform domains. Radio Science Conference (NRSC), 2017 34th National (pp. 471-480). IEEE. [33] Fahmy, G. (2011). Fast Multiplier-less Implementation of Bspline Basis with Enhanced Compression Performance. American Journal of Signal Processing, 1(1), 6-11. [35] Flusser, J., Sroubek, F., & Zitov, B. (2007). Image Fusion: Principles, Methods, and Applications, Tutorial EUSIPCO 2007. Institute of Information Theory and Automation Academy of Sciences of the Czech Republic, 182(8), 1-60. [36] Fracastoro, G., Fosson, S. M., & Magli, E. (2017). Steerable discrete cosine transform. IEEE Transactions on Image Processing, 26(1), 303-314. [37] French, A. S., & Stein, R. B. (1970). A flexible neural analog using integrated circuits. IEEE Transactions on Biomedical Engineering(3), 248-253. [38] Gazzah, H., Regalia, P. A., & Delmas, J.-P. (2001). Asymptotic eigenvalue distribution of block Toeplitz matrices and application to blind SIMO channel identification. IEEE Transactions on Information Theory, 47(3), 1243-1251. [39] Gjesteby, L., Shan, H., Yang, Q., Xi, Y., Claus, B., Jin, Y., . . . Wang, G. (2018). Deep Neural Network for CT Metal Artifact Reduction with a


Perceptual Loss Function. The fifth international conference on image formation in X-ray computed tomography, (pp. 439-443). [40] Gonzalez, R. C., & Woods, R. E. (1993). Digital image processing. Addison-Wesley Publishing Company. [41] Grigorescu, S. E., Petkov, N., & Kruizinga, P. (2002). Comparison of texture features based on Gabor filters. IEEE Transactions on Image processing, 11(10), 1160-1167. [42] Gutierrez-Gutierrez, J., & Crespo, P. M. (2012). Block Toeplitz matrices: Asymptotic results and applications. Foundations and Trends in Communications and Information Theory, 8(3), 179-257. [43] Haghighat, M. B., Aghagolzadeh, A., & Seyedarabi, H. (2011). Multi-focus image fusion for visual sensor networks in DCT domain. Computers & Electrical Engineering, 37(5), 789-797. [44] Han, J., & Ma, K.-K. (2007). Rotation-invariant and scale-invariant Gabor features for texture image retrieval. Image and vision computing, 25(9), 1474-1481. [45] Hoshino, O. (2005). Cognitive enhancement mediated through postsynaptic actions of norepinephrine on ongoing cortical activity. Neural computation, 17(8), 1739-1775. [46] Jain, A. K. (1979). A sinusoidal family of unitary transforms. IEEE Transactions on Pattern Analysis and Machine Intelligence(4), 356-365. [47] James, A. P., & Dasarathy, B. V. (2014). Medical image fusion: A survey of the state of the art. Information Fusion, 19, 4-19. [48] Jiang, D., Zhuang, D., & Huang, Y. (2013). Investigation of image fusion for remote sensing application. In D. Jiang, D. Zhuang, & Y. Huang, New Advances in Image Fusion (pp. 1-24). InTech. [49] Jiang, H., & Tian, Y. (2011). Fuzzy image fusion based on modified Self-Generating Neural Network. Expert Systems with Applications, 38(7), 8515-8523. [51] Johnson, J. L., & Padgett, M. L. (1999). PCNN models and applications. IEEE transactions on neural networks, 10(3), 480-498. [52] Kaur, R., & Kaur, S. (2016). An Approach for Image Fusion using PCA and Genetic Algorithm. Image, 15(16). [53] Kaur, V., & Kaur, J. (2015). Comparison of Image Fusion Techniques: Spatial and Transform Domain based Techniques. International Journal Of Engineering And Computer Science, 4(5). [54] Khayam, S. A. (2003). The discrete cosine transform (DCT): theory and application. Michigan State University, 114. [55] Kim, W.-S., Narang, S. K., & Ortega, A. (2012). Graph based transforms for depth video coding. Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on (pp. 813-816). IEEE.


[56] Kingdom, F. A. (2014). Mach bands explained by response normalization. Frontiers in human neuroscience, 8, 843. [57] Klir, G., & Yuan, B. (1995). Fuzzy sets and fuzzy logic (Vol. 4). Prentice hall New Jersey. [58] Kourav, A. K., & Sharma, A. (2015). Multiresolution Transform Techniques in Digital Image. International Journal of Computer Applications, 123(12), 0975 – 8887. [59] Kuruvilla, S., & Anitha, J. (2014). Comparison of registered multimodal medical image fusion techniques. Electronics and Communication Systems (ICECS), 2014 International Conference on (pp. 1-6). IEEE. [60] Kyrki, V., Kamarainen, J.-K., & Kalviainen, H. (2004). Simple Gabor feature space for invariant object recognition. Pattern recognition letters, 25(3), 311-318. [61] Lamberti, F., Montrucchio, B., & San, A. (2006). CMBFHE: a novel contrast enhancement technique based on cascaded multistep binomial filtering histogram equalization. IEEE Transactions on Consumer Electronics, 52(3), 966-974. [62] Lee, H., Yeom, S., Guschin, V., Son, J.-y., & Kim, S.-H. (2009). Image fusion of visual and millimeter wave images for concealed object detection. Infrared, Millimeter, and Terahertz Waves, 2009. IRMMW-THz 2009. 34th International Conference on (pp. 1-2). IEEE. [63] Li, C., & Bovik, A. C. (2010). Content-weighted video quality assessment using a three-component image model. Journal of Electronic Imaging, 19(1), 011003(1-9). [64] Li, H., Manjunath, B. S., & Mitra, S. K. (1995). Multisensor image fusion using the wavelet transform. Graphical models and image processing, 57(3), 235-245. [65] Li, M., Cai, W., & Tan, Z. (2006). A region-based multi-sensor image fusion scheme using pulse-coupled neural network. Pattern Recognition Letters, 27(16), 1948-1956. [66] Li, S., Kang, X., Fang, L., Hu, J., & Yin, H. (2017). Pixel-level image fusion: A survey of the state of the art. Information Fusion, 33, 100-112. [67] Li, W., & Zhu, X.-F. (2005). A new image fusion algorithm based on wavelet packet analysis and PCNN. Machine Learning and Cybernetics, 2005. Proceedings of 2005 International Conference on. 9, pp. 5297-5301. IEEE. [68] Liu, F., Li, J., & Caiyun, H. (2012). Image fusion algorithm based on simplified PCNN in nonsubsampled contourlet transform domain. Procedia Engineering, 29, 1434-1438. [69] Lo, W. Y., & Puchalski, S. M. (2008). Digital image processing. Veterinary Radiology & Ultrasound, 49, S42-S47. [70] Lu, S., Wang, Z., & Shen, J. (2003). Neuro-fuzzy synergism to the intelligent system for edge detection and enhancement. Pattern Recognition, 36(10), 2395-2409.


[71] Ma, Y., Zhan, K., & Wang, Z. (2010). Applications of Pulse-Coupled Neural Networks. Heidelberg-New York: Springer. [72] Maheshwary, P., Shirvaikar, M., & Grecos, C. (2018). Blind image sharpness metric based on edge and texture features. Real-Time Image and Video Processing 2018. 10670, p. 1067004. International Society for Optics and Photonics. [73] Majumder, A., & Irani, S. (2006). Contrast enhancement of images using human contrast sensitivity. Proceedings of the 3rd symposium on Applied perception in graphics and visualization (pp. 69-76). Irvine, California: ACM. [74] Mallat, S. G. (1989). A theory for multiresolution signal decomposition: the wavelet representation. IEEE transactions on pattern analysis and machine intelligence, 11(7), 674-693. [75] Manchanda, M., & Sharma, R. (2016). A novel method of multimodal medical image fusion using fuzzy transform. Journal of Visual Communication and Image Representation, 40, 197-217. [76] Mandal, M. K. (2003). The Human Visual System and Perception. Multimedia Signals and Systems (pp. 33-56). Boston, MA: Springer. [77] Mantiuk, R., Myszkowski, K., & Seidel, H.-P. (2006). A perceptual framework for contrast processing of high dynamic range images. ACM Transactions on Applied Perception (TAP), 3(3), 286-308. [78] Maragatham, G., & Roomi, S. M. (2015). A review of image contrast enhancement methods and techniques. Research Journal of Applied Sciences, Engineering and Technology, 9(5), 309-326. [80] Maruthi, R., & Lakshmi, I. (2017). Multi-Focus Image Fusion Methods – A Survey. Computer Engineering, 19(4), 9-25. Retrieved from www.iosrjournals.org [81] Maruthi, R., & Sankarasubramanian, K. (2008). Pixel level multifocus image fusion based on fuzzy logic approach. Asian Journal of Information Technology, 7(4), 168-171. [82] Mas, M., Monserrat, M., Torrens, J., & Trillas, E. (2007). A survey on fuzzy implication functions. IEEE Transactions on fuzzy systems, 15(6), 1107-1121. [83] Mehena, J. (2011). Medical Image edge detection based on mathematical morphology. International Journal of Computer and communication technology, 2(6), 45-48. [84] Mir, A. H., Hanmandlu, M., & Tandon, S. N. (1995). Texture analysis of CT images. IEEE Engineering in Medicine and Biology Magazine, 14(6), 781-786. [85] Mitchell, H. B. (2010). Image fusion: theories, techniques and applications. Springer Science & Business Media.


[86] Montolio, F. G., Janssens, M. S., Stam, L., & Jansonius, N. M. (2016, March 8). Lateral inhibition in the human visual system in patients with glaucoma and healthy subjects: a case-control study. PloS one, 11(3), 1-11. [87] Morel, J.-M., Petro, A.-B., & Sbert, C. (2014). Screened Poisson equation for image contrast enhancement. Image Processing On Line, 4, 16-29. [88] Mythili, C., & Kavitha, V. (2011). Efficient technique for color image noise reduction. The research bulletin of Jordan, ACM, (pp. 41-44). [89] Nahvi, N., & Sharma, O. C. (2014). Comparative Analysis of Various Image Fusion Techniques For Biomedical Images: A Review. Engineering Research and Applications, 4(5), 81-86. [90] Naidu, V. P., & Raol, J. R. (2008). Pixel-level image fusion using wavelets and principal component analysis. Defence Science Journal, 58(3), 338. [91] Naidu, V., & Elias, B. (2013). A novel image fusion technique using DCT based Laplacian pyramid. International Journal of Inventive Engineering and Sciences (IJIES), ISSN 2319-9598. [93] Nakamura, Y., Tsuboi, K., & Hoshino, O. (2008). Lateral Excitation between Dissimilar Orientation Columns for Ongoing Subthreshold Membrane Oscillations in Primary Visual Cortex. International Conference on Artificial Neural Networks (pp. 318-327). Springer. [94] Nathan, H. (2018). Practical Approaches to Virtual Acoustics. Game Audio Programming 2 (pp. 289-306). AK Peters/CRC Press. [95] Nevatia, R., & Babu, K. R. (1980). Linear feature extraction and description. Computer Graphics and Image Processing, 3(13), 257-269. [96] Nithya, R., & Santhi, B. (2011). Classification of normal and abnormal patterns in digital mammograms for diagnosis of breast cancer. International journal of computer applications, 28(6), 21-25. [97] Pajares, G., & De La Cruz, J. M. (2004). A wavelet-based image fusion tutorial. Pattern recognition, 37(9), 1855-1872. [98] Peng, R., & Varshney, P. K. (2015). A human visual system-driven image segmentation algorithm. Journal of Visual Communication and Image Representation, 26, 66-79. [99] Perfilieva, I. (2006). Fuzzy transforms: Theory and applications. Fuzzy sets and systems, 157(8), 993-1023. [100] Pizer, S. M., Amburn, E. P., Austin, J. D., Cromartie, R., Geselowitz, A., ter Haar Romeny, B., . . . Zuiderveld, K. (1987). Adaptive histogram equalization and its variations. Computer vision, graphics, and image processing, 39(3), 355-368. [101] Polesel, A., Ramponi, G., & Mathews, V. J. (2000). Image enhancement via adaptive unsharp masking. IEEE transactions on image processing, 9(3), 505-510.


[102] Prakash, O., Srivastava, R., & Khare, A. (2013). Biorthogonal wavelet transform based image fusion using absolute maximum fusion rule. Information \& Communication Technologies (ICT), 2013 IEEE Conference on (pp. 577-582). IEEE. [103] Prasad, V. S., & Domke, J. (2005). Gabor filter visualization. J. Atmos. Sci, 13, 2005. [104] Pratt, W. K. (2017). Digital Image Processing: College Edition. Wiley. [105] Qi, J., Lyu, B., AlAli, A., Machado, G., Hu, Y., & Marfurt, K. (2018). Image processing of seismic attributes for automatic fault extraction. Geophysics, 84(1), 1-67. [106] Qu, G., Zhang, D., & Yan, P. (2001). Medical image fusion by wavelet transform modulus maxima. Optics Express, 9(4), 184-190. [107] Rajkumar, S., Bardhan, P., Akkireddy, S. K., & Munshi, C. (2014). CT and MRI image fusion based on Wavelet Transform and Neuro-Fuzzy concepts with quantitative analysis. Electronics and Communication Systems (ICECS), 2014 International Conference on (pp. 1-6). IEEE. [108] Ramamurthy, N., & Varadarajan, S. (2012, June). The Robust digital image watermarking using quantization and fuzzy logic approach in DWT domain. International Journal of Computer Science and Network (IJCSN), 1(5). [109] Ranganath, H. S., Kuntimad, G., & Johnson, J. L. (1995). Pulse coupled neural networks for image processing. Southeastcon'95. Visualize the Future., Proceedings., IEEE (pp. 37-43). IEEE. [110] Rao, D. S., Seetha, M., & Prasad, M. (2012). Comparison of fuzzy and neuro fuzzy image fusion techniques and its applications. International Journal of Computer Applications, 43(20), 31-37. [111] Renato, A.-N., Petronilho, J., & Quintero, N. R. (2005). On some tridiagonal k-Toeplitz matrices: Algebraic and analytical aspects. Applications. Journal of computational and applied mathematics, 184(2), 518-537. [112] Rockinger, O. (1997). Image sequence fusion using a shift-invariant wavelet transform. Image Processing, 1997. Proceedings., International Conference on. 3, pp. 288-291. IEEE. [113] Rodriguez, J. M., Mitra, S., Thampi, S. M., & El-Alfy, E.-S. (2016). Intelligent systems technologies and applications 2016 (Vol. 530). Springer. [114] Roli, F. (2005). Image Analysis And Processing Iciap 2005: 13th International Conference Cagliari, Itlay, September 6-8, 2005 Proceedings (Vol. 3617). Springer Science and Business Media. [115] Ross, A., & Govindarajan, R. (2004). Feature level fusion in biometric systems. proceedings of Biometric Consortium Conference (BCC) (pp. 1- 2). West Virginia University, Morgantown, West Virginia: Lane Dept. of Computer Science and Electrical Engg.


[116] Ross, T. J. (2005). Fuzzy logic with engineering applications. John Wiley & Sons. [117] Roushdy, M. (2006). Comparative study of edge detection algorithms applying on the grayscale noisy image using morphological filter. GVIP journal, 6(4), 17-23. [118] Saeedi, J., & Faez, K. (2012). Infrared and visible image fusion using fuzzy logic and population-based optimization. Applied Soft Computing, 12(3), 1041-1054. [119] Sahu, D. K., & Parsai, M. (2012). Different image fusion techniques--a critical review. International Journal of Modern Engineering Research (IJMER), 2(5), 4298-4301. [120] Saleem, A., Beghdadi, A., & Boashash, B. (2012). Image fusion-based contrast enhancement. EURASIP Journal on Image and Video Processing, 2012, 1-10. [121] Sanchez, V., Garcia, P., Peinado, A. M., Segura, J. C., & Rubio, A. J. (1995). Diagonalizing properties of the discrete cosine transforms. IEEE transactions on Signal Processing, 43(11), 2631-2641. [123] Sanija, T. S., & Karthik, M. (2015). Image fusion techniques and performance evaluation for clinical applications: a review. International Journal of Computer Science and Information Technology & Security (IJCSITS), 5(4), 339-345. [124] Sapkal, R. J., & Kulkarni, S. M. (2013). Innovative image fusion algorithm based on fast discrete curvelet transform with different fusion rules. Information & Communication Technologies (ICT), 2013 IEEE Conference on (pp. 1070-1074). IEEE. [125] Savic, S., & Babic, Z. (2012). Fusion of low contrast multifocus images. Telecommunications Forum (TELFOR), 2012 20th (pp. 658-661). IEEE. [126] Saxena, S., Kumar, S., & Sharma, V. K. (2013). Edge detection using soft computing in Matlab. International journal of advanced research in Computer Science and Software Engineering, 3(6). [127] Sayood, K. (2012). Introduction to data compression. Newnes. [128] Schoenauer, T., Atasoy, S., Mehrtash, N., & Klar, H. (2002). NeuroPipe-Chip: A digital neuro-processor for spiking neural networks. IEEE Transactions on Neural Networks, 13(1), 205-213. [129] Sehasnainjot, S. (2014, December). Literature survey on Image Fusion Techniques. An International Journal of Engineering Sciences, 2, 40-43. [130] Selvakumari, G., & Aravindh, R. (2014). A New Medical Image Fusion based on Non-Subsampled Contourlet Transform. Journal of Computer Applications (JCA), 7(1), 2014.


[131] Semmlow, J. L., & Griffel, B. (2014). Biosignal and medical image processing. CRC press. [132] Sharifi, M., Fathy, M., & Mahmoudi, M. T. (2002). A classified and comparative study of edge detection algorithms. Information Technology: Coding and Computing, 2002. Proceedings. International Conference on (pp. 117-120). IEEE. [133] Sharma, M. (2016). A review: image fusion techniques and applications. Int J Comput Sci Inf Technol, 7(3), 1082-1085. [135] Shen, G., Kim, W.-S., Narang, S. K., Ortega, A., Lee, J., & Wey, H. (2010). Edge-adaptive transforms for efficient depth map coding. Picture Coding Symposium (PCS), 2010 (pp. 566-569). IEEE. [136] Shih, M.-Y., & Tseng, D.-C. (2005). A wavelet-based multiresolution edge detection and tracking. Image and Vision Computing, 23(4), 441-451. [137] Shrivakshan, G. T., & Chandrasekar, C. (2012). A comparison of various edge detection techniques used in image processing. International Journal of Computer Science Issues (IJCSI), 9(6), 269. [138] Singh, B., & Singh, A. P. (2008). Edge Detection in Gray Level Images based on the Shannon Entropy 1. Citeseer. [139] Singh, K., & Kapoor, R. (2014). Image enhancement using exposure based sub image histogram equalization. Pattern Recognition Letters, 36, 10-14. [141] Singh, L., Agrawal, S., & Gupta, P. (2015, July). Review on medical image fusion based on neuro-fuzzy. International Journal of Scientific Research Engineering & Technology (IJSRET), 4(7), 777-781. [142] Singh, R., & Khare, A. (2013). Multiscale medical image fusion in wavelet domain. The Scientific World Journal, 2013, 1-11. [143] Singh, S. S., Singh, T. T., Devi, H. M., & Sinam, T. (2012). Local contrast enhancement using local standard deviation. International Journal of Computer Applications, 47(15). [144] Singh, S. S., Singh, T. T., Singh, N. G., & Devi, H. M. (2012). Global-Local Contrast Enhancement. International Journal of Computer Applications, 54(10), 7-11. [145] Singh, S., Gupta, D., Anand, R., & Kumar, V. (2015). Nonsubsampled shearlet based CT and MR medical image fusion using biologically inspired spiking neural network. Biomedical Signal Processing and Control, 18, 91-101. [146] Soentpiet, R. (1999). Advances in kernel methods: support vector learning. MIT press.


[147] Stewart, R. D., Fermin, I., & Opper, M. (2002). Region growing with pulse-coupled neural networks: an alternative to seeded region growing. IEEE Transactions on Neural Networks, 13(6), 1557-1562. [148] Strang, G. (1999). The Discrete Cosine Transform. SIAM Review, 41(1), 135-147. [149] Su, B.-B., Gu, M.-H., Wang, M.-M., & Wang, Z.-L. (2018). Anisotropic Gaussian kernels edge detection algorithm based on the chromatic difference. Tenth International Conference on Digital Image Processing (ICDIP 2018) (p. 108062Q). International Society for Optics and Photonics. [150] Sudharani, B., Hemaltha, M., & Deepa, B. (2015). Wavelet Transform for a Fuzzy Based Image Fusion . International Journal of Advances in Electrical and Electronics Engineering, 58-67. [151] Suliman, C., Boldisor, C., Bazavan, R., & Moldoveanu, F. (2011). A fuzzy logic based method for edge detection. Bulletin of the Transilvania University of Brasov. Engineering Sciences. Series I, 4(1), 159. [152] Sun, C.-C., Ruan, S.-J., Shie, M.-C., & Pai, T.-W. (2005). Dnamic contrast enhancement based on histogram specification. IEEE Transactions on Consumer Electronics, 51(4), 1300-1305. [153] Suthakar, J. R., & Monica E. M. E., A. D. (2014). Study of Image Fusion- Techniques, Method and Applications. International Journal of Computer Science and Mobile Computing (IJCSMC), 3(11), 469-476. [154] Swathi, P. S., Sheethal, M. S., & Paul, V. (2016). Survey on Multimodal Medical Image Fusion Techniques. International Journal of Science, Engineering and Computer Technology, 6(1), 33. [155] Tang, J. (2004). A contrast based image fusion technique in the DCT domain. Digital Signal Processing, 14(3), 218-226. [156] Tanizawa, A., Yamaguchi, J., Shiodera, T., Chujoh, T., & Yamakage, T. (2010). Improvement of intra coding by bidirectional intra prediction and 1 dimensional directional unified transform. Doc. JCTVC-B042, MPEG- H/JCT-VC. [157] Tank, V. P., Shah, D. D., Vyas, T. V., Chotaliya, S. B., & Manavadaria, M. S. (2013, Jan-Feb). Image Fusion Based On Wavelet And Curvelet Transform. IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) ISSN, 1(5), 32-36. [158] Tsofe, A., Spitzer, H., & Einav, S. (2009). Does the Chromatic Mach bands effect exist? Journal of vision, 9(6), 20-20. [159] Umarani, A. (2016). Enhancement of coronary artery using image fusion based on discrete wavelet transform. Biomedical Research, 27(4), 1118- 1122. [160] Unser, M., Aldroubi, A., & Eden, M. (1993). B-Spline Signal Processing: Part I Theory. IEEE transactions on signal processing, 41(2), 821-833. [161] Varadarajan, S. (2014). Texture structure analysis. Arizona State University.


[162] Vaseghi, S. V. (2008). Advanced digital signal processing and noise reduction (2 ed.). Southern Gate, Chichester, West Sussex, United Kingdom: John Wiley and Sons. [163] Venkatesh, M., Mohan, K., & Seelamantula, C. S. (2015). Directional bilateral filters for smoothing fluorescence microscopy images. AIP Advances, 5(8), 084805. [164] Vishwakarma, A. K., & Mishra, A. (2012). Color image enhancement techniques: a critical review. Indian J. Comput. Sci. Eng, 3(1), 39-45. [165] Wang, A., Zhao, J., Dai, S., & Iwahori, Y. Z. (2015). Medical Image Fusion based on Pulse Coupled Neural Network Combining with Compressive Sensing. International Journal of Signal Processing, Image Processing and Pattern Recognition, 8(5), 223-230. [166] Wang, D., Mao, K., & Ng, G.-W. (2017). Convolutional neural networks and multimodal fusion for text aided image classification. Information Fusion (Fusion), 2017 20th International Conference on (pp. 1-7). IEEE. [167] Wang, Z. (1984). Fast algorithms for the discrete W transform and for the discrete Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(4), 803-816. [168] Wang, Z., & Ma, Y. (2008). Medical image fusion using m-PCNN. Information Fusion, 9(2), 176-185. [169] Wang, Z., & Shang, X. (2006). Spatial pooling strategies for perceptual image quality assessment. Image Processing, 2006 IEEE International Conference on (pp. 2945-2948). IEEE. [170] Wang, Z., Wang, S., Zhu, Y., & Ma, Y. (2016). Review of image fusion based on pulse-coupled neural network. Archives of Computational Methods in Engineering, 23(4), 659-671. [171] Williams, D. J., & Shah, M. (1990). A fast algorithm for active contours. Computer Vision, 1990. Proceedings, Third International Conference on (pp. 592-595). IEEE. [172] Wu, J., Yin, Z., & Xiong, Y. (2007). The fast multilevel fuzzy edge detection of blurry images. IEEE signal processing letters, 14(5), 344. [173] Xia, W., Yin, S., & Ouyang, P. (2013). A high precision feature based on LBP and Gabor theory for face recognition. Sensors, 13(4), 4499-4513. [174] Xu, B., & Chen, Z. (2004). A multisensor image fusion algorithm based on PCNN. Intelligent Control and Automation, 2004. WCICA 2004. Fifth World Congress on (pp. 3679-3682). IEEE. [175] Xu, L., Du, J., & Li, Q. (2013). Image fusion based on nonsubsampled contourlet transform and saliency-motivated pulse coupled neural networks. Mathematical Problems in Engineering, 2013. [176] Xu, Y., Weaver, J. B., Healy Jr, D. M., & Lu, J. (1994). Spatially Selective Noise Filtration Technique. IEEE transactions on image processing, 3(6), 747. [177] Yang, B., Jing, Z., & Zhao, H.-t. (2010). Review of pixel-level image fusion. Journal of Shanghai Jiaotong University (Science), 15, 6-12.


[178] Yang, F., & Wei, H. (2013). Fusion of infrared polarization and intensity images using support value transform and fuzzy combination rules. Infrared Physics & Technology, 60, 235-243. [179] Yang, H., Fang, Y., & Lin, W. (2015). Perceptual quality assessment of screen content images. IEEE Transactions on Image Processing, 24(11), 4408-4421. [181] Yang, L., Guo, B. L., & Ni, W. (2008). Multimodality medical image fusion based on multiscale geometric analysis of contourlet transform. Neurocomputing, 72(1-3), 203-211. [183] Yang, Y. (2010). Multimodal medical image fusion through a new DWT based technique. Bioinformatics and Biomedical Engineering (iCBBE), 2010 4th International Conference on (pp. 1-4). IEEE. [184] Yang, Y., Tong, S., Huang, S., & Lin, P. (2014). Log-Gabor energy based multimodal medical image fusion in NSCT domain. Computational and mathematical methods in medicine, 2014. [185] Yaroslavsky, L. (2004). Digital holography and digital image processing: principles, methods, algorithms. New York: Springer Science+Business Media. doi:10.1007/978-1-4757-4988-5 [186] Yaroslavsky, L. P. (2014). Fast transforms in image processing: compression, restoration, and resampling. Advances in Electrical Engineering, 2014, 23. [187] Yaroslavsky, L. P. (2015). Compression, restoration, resampling, 'compressive sensing': fast transforms in digital imaging. Journal of Optics, 17(7), 073001. [188] Yi, C., & Tian, Y. (2011). Text detection in natural scene images by stroke gabor words. Document Analysis and Recognition (ICDAR), 2011 International Conference on (pp. 177-181). IEEE. [190] York, T., & Jain, R. (2011). Fundamentals of image sensor performance. jain/cse567-11/ftp/imgsens/index. html, 1-8. [191] Yousuf, M. A., & Rakib, M. R. (2011). An effective image contrast enhancement method using global histogram equalization. Journal of scientific research, 3(1), 43. [192] Yue, S., Wu, T., Pan, J., & Wang, H. (2013). Fuzzy clustering based ET image fusion. Information Fusion, 14(4), 487-497.


[193] Yun, J., Zhanhuai, L., Yong, W., & Longbo, Z. (2005). Joining associative classifier for medical images. Hybrid Intelligent Systems, 2005. HIS'05. Fifth International Conference on (pp. 6-10). IEEE. [194] Zeng, B., & Fu, J. (2008). Directional discrete cosine transforms—a new framework for image coding. IEEE transactions on circuits and systems for video technology, 18(3), 305-313. [195] Zhan, K., Shi, J., Wang, H., Xie, Y., & Li, Q. (2017). Computational mechanisms of pulse-coupled neural networks: a comprehensive review. Archives of Computational Methods in Engineering, 24(3), 573-588. [196] Zhan, K., Teng, J., Shi, J., Li, Q., & Wang, M. (2016). Feature-linking model for image enhancement. Neural computation, 28(6), 1072-1100. [197] Zhan, K., Zhang, H., & Ma, Y. (2009). New spiking cortical model for invariant texture retrieval and image processing. IEEE Transactions on Neural Networks, 20(12), 1980-1986. [198] Zhang, C., Yu, L., Lou, J., Cham, W.-K., & Dong, J. (2008). The technique of prescaled integer transform: concept, design and applications. IEEE Transactions on Circuits and Systems for Video Technology, 18(1), 84-97. [199] Zhang, H., & Cao, X. (2013). A way of image fusion based on wavelet transform. Mobile Ad-hoc and Sensor Networks (MSN), 2013 IEEE Ninth International Conference on, 498-501. [200] Zhang, J. Y., & Liang, J. L. (2004). Image fusion based on pulse-coupled neural networks. Computer Simulation, 21(4), 102-104. [201] Zhang, Q., & Guo, B.-l. (2009). Multifocus image fusion using the nonsubsampled contourlet transform. Signal processing, 89(7), 1334- 1346. [202] Zhu, Y., & Huang, C. (2012). An adaptive histogram equalization algorithm on the image gray level mapping. Physics Procedia, 25, 601- 608.
