Fractal Initialization for High-Quality Mapping with Self-Organizing Maps
Neural Comput & Applic, DOI 10.1007/s00521-010-0413-5
ORIGINAL ARTICLE

Iren Valova · Derek Beaton · Alexandre Buer · Daniel MacLean
Computer and Information Science, University of Massachusetts Dartmouth, 285 Old Westport Rd, North Dartmouth, MA 02747, USA
e-mail: [email protected]

Received: 15 July 2008 / Accepted: 4 June 2010
© Springer-Verlag London Limited 2010

Abstract  Initialization of self-organizing maps is typically based on random vectors within the given input space. The implicit problem with random initialization is the overlap (entanglement) of connections between neurons. In this paper, we present a new method of initialization based on a set of self-similar curves known as Hilbert curves. Hilbert curves can be scaled in network size for the number of neurons based on a simple recursive (fractal) technique, implicit in the properties of Hilbert curves. We have shown that when using Hilbert curve vector (HCV) initialization in both the classical SOM algorithm and a parallel-growing algorithm (ParaSOM), the neural network reaches better coverage and faster organization.

Keywords  Hilbert curves · Self-organizing maps · Initialization · Neural networks

1 Introduction

Self-organization is a principle by which a system internally organizes itself to its environment. Self-organization falls within the domain of unsupervised learning, as no outside factor influences the organization.

1.1 Biological foundations

Progress in neurophysiology and the understanding of brain mechanisms prompted an argument by Changeux [5] that man and his thought process can be reduced to the physics and chemistry of the brain. One logical consequence is that a replication of the functions of neurons in silicon would allow for a replication of man's intelligence. Artificial neural networks (ANN) form a class of computation systems inspired by early simplified models of neurons.

Neurons are the basic biological cells that make up the brain. They form highly interconnected communication networks that are the seat of thought, memory, consciousness, and learning [4, 6, 15]. The simple model of the neuron is drawn from various biological observations: a neuron is equipped with multiple dendrites and an axon; the dendrites collect electric signals from other neurons' axons; these signals are weighted according to the strength of the connections and summed; if the sum is above a specific threshold, the neuron in turn fires a signal along its own axon to other neurons whose dendrites are connected to it [12]. Reinforcement mechanisms exist that modify the connection weights for each neuron and allow the network to adapt and learn [12].

Computer scientists use this model to simulate, to a small extent, the functionality of the brain. Recent advances in neurophysiology show that biological neurons' functionality relies not only on electrical signals along the axons and dendrites but also on numerous chemical processes. ANN can therefore no longer be considered a model of biological neurons, but rather a different paradigm of computation. This paradigm is referred to as connectionism [4]. Connectionism is the field of study focused on connected networks of simple units that define specific cognitive processes such as recognition, memory, learning, and behavior.
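The weighted-sum-and-threshold behaviour described in the neuron model above can be written compactly. The following sketch is illustrative only; it is not code from the paper, and the function name, weights, and threshold value are made up for the example:

```python
import numpy as np

def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the inputs exceeds the threshold.

    Mirrors the simplified description above: dendrites collect signals,
    each signal is weighted by the strength of its connection, the weighted
    signals are summed, and the neuron fires only above a threshold.
    """
    activation = np.dot(inputs, weights)  # weighted sum of incoming signals
    return 1 if activation > threshold else 0

# Hypothetical example with three incoming signals and arbitrary weights.
signals = np.array([0.9, 0.2, 0.7])
strengths = np.array([0.5, 0.8, 0.3])
print(threshold_neuron(signals, strengths, threshold=0.6))  # prints 1
```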
1.2 Self-organization

A number of computational intelligence algorithms exploit self-organization, such as swarm intelligence and learning vector quantization [13, 14] and, most famously, the self-organizing map (SOM) [15]. This paper focuses on the latter. SOMs are unsupervised and are a priori unaware of the input and input features.

Neurons are initialized and organized in topologies that are preset by the designer of the network. SOMs have been applied as data compression tools, mapping high-dimensional data onto low-dimensional structures while maintaining important details and features [15]. The topological properties reflect those of maps in the brain, which are mostly two-dimensional folded sheets or one-dimensional strings, such as the auditory cortex [15].

The intent of this study is to present an initialization method for self-organizing systems (specifically, SOM and its variations), wherein the initialization plays a critical role in convergence speed [26, 29]. Although random initialization is currently the common method in unsupervised learning, it represents one of the challenges of SOM. Kohonen [15] proposed that initialization should be based on random vectors in the input space. This reflects the lack of dependence between the neurons and the input at initialization. Kohonen also proposed achieving faster convergence by initializing neurons to random input vectors or by linear initialization.

Initialization methods can include some form of preprocessing for a better initial topological start, but that can negatively affect the convergence time of a SOM. Several studies in the recent literature reflect the general opinion that weight initialization is crucial to the success of SOM mapping. Ritter and Schulter [22] have shown that SOM may exhibit local minima behavior. As the initialization affects the final map, the investigation of such methods is of interest. The authors in [1, 2, 25, 26, 29] propose different methods to alleviate the problem, although some of them are too application specific. In [26], a three-stage method is proposed, which requires finding a hypercube to cover all of the input space. While the system is simple in nature, it requires preprocessing of the input. The authors in [30] propose three approaches to initialization. Although these can be regarded as application specific, they require either preliminary clustering or some form of input space preprocessing. Their conclusion is in line with all other studies, i.e., that random initialization has the drawback of scrambling the neurons, thus requiring greater processing time to finish the mapping satisfactorily.

This paper proposes a method of initialization to supplement a staple unsupervised algorithm. As indicated in [15], when a one-dimensional SOM maps a two-dimensional input state of uniform probability, the resulting chain resembles a Peano curve, which is defined as self-similar [19]. Following this observation, we explore initialization of the SOM using curves that belong to the Peano curve family, namely, Hilbert curves. The unsupervised initialization approach is to set the neurons to an initial topology consistent with the Hilbert curve at one of several stages. We show that Hilbert initialization in self-organizing systems directly leads to a faster stable network state and a better topological mapping than random initialization of an equivalent number of neurons. The proposed initialization technique is tested on two very different SOMs. One is the well-known classical SOM, which processes the input space on a vector-by-vector basis. The other network, ParaSOM, although based on the principles of self-organization, is a growing architecture that processes the input in parallel: the winner is computed for each signal in parallel, and the neurons operate with regions of coverage rather than single input vectors. Given the diversity of the tested SOM models and the nature of the reported results, we confidently support the Hilbert curve as a model for SOM initialization.
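The paper does not give pseudocode at this point, but the idea behind Hilbert curve vector (HCV) initialization can be sketched: generate a Hilbert curve of a chosen order and lay the initial neuron chain along it, so that neighbours in the chain are also neighbours in the input space and the map starts out untangled. Everything below (the function names, the standard index-to-coordinate routine, and the choice to centre the points in their cells and scale them to the unit square) is an assumption made for illustration, not the authors' implementation:

```python
import numpy as np

def hilbert_point(order, d):
    """Map an index d in [0, 4**order) to integer (x, y) coordinates on the
    order-`order` Hilbert curve (standard index-to-coordinate conversion;
    coordinates lie in [0, 2**order))."""
    x = y = 0
    t = d
    s = 1
    side = 2 ** order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                         # rotate/reflect the sub-quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hcv_initial_weights(order):
    """Initial 2-D weight vectors for a chain of 4**order neurons laid along
    the Hilbert curve, centred in their cells and scaled to [0, 1]^2, so that
    neighbours in the chain start out adjacent in the input space."""
    side = 2 ** order
    pts = [hilbert_point(order, d) for d in range(4 ** order)]
    return (np.array(pts, dtype=float) + 0.5) / side

# Example: 16 initial neurons on the order-2 curve (compare Fig. 3).
weights = hcv_initial_weights(order=2)
print(weights.shape)  # (16, 2)
```

Because the order-k curve supplies 4^k points, the curve order directly controls the number of neurons, which is one way to read the "simple recursive (fractal) technique" for scaling the network size mentioned in the abstract.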
1.3 Fractals and self-similarity

Mandelbrot's study of geometry and fractals [18] found that the deterministic equations that model nature combine into dynamic systems that account for the complexity of the world and preclude precise determinism of the whole system. However, the debate between a deterministic and materialistic science, and a science with a view of the world structured only by determinism, is still present.

Fractals [19] are geometric objects that exhibit intricate structure on a smaller scale that is self-similar to the whole object. A typical example of a fractal is the Koch curve. The curve is constructed by recursively applying a geometric transformation to each segment, as illustrated in Fig. 1: the segment is cut into three equal pieces, and the middle piece is replaced with the two other edges of an equilateral triangle. Starting with a straight-line segment, the algorithm proceeds in stages. At each stage, the recursion traverses the curve obtained previously; every time it encounters a straight-line segment, it replaces it with the same four-segment pattern that was first applied to the original line segment. After just a few iterations, the curve becomes very convoluted (Fig. 2).

Fig. 1  Three iterations of the Koch fractal curve
Fig. 2  Koch curve after several recursive iterations

Space-filling curves were discovered by Giuseppe Peano in 1890. Peano was looking for continuous curves that would provide a bijection between part of the plane in R^2 and a curve in R [19]. They are similar to fractals in their definition and construction, being defined as the limit curve of a set of curves constructed by recursion and produced with self-similar patterns like fractals [7]. However, these curves are not fractals, as they have an integer dimension, whereas by definition fractals have a fractional dimension. They are called space filling because the limit curve they define fills a higher-dimensional space. Any point within the area covered by the curve can be approached to an arbitrary precision by choosing an appropriate iteration of the series; furthermore, any point of the area is reached by the limit curve [23].

Fig. 3  Hilbert space-filling curve: first three iterations
Fig. 4  Hilbert H(6) curve

[…] speed of organization. The simulation organization is presented in Sect. 4, while the results and discussion are given in Sect. 5. The conclusions follow in Sect. 6.

2 Algorithmic overview of SOMs used in this study

The brain of higher animals is organized by specific function; the visual cortex processes the information received through the optical nerve from the eyes; the auditory cortex processes the sounds perceived by the ears;