Hexagonal Image Processing in the Context of Machine Learning: Conception of a Biologically Inspired Hexagonal Deep Learning Framework
Total Page:16
File Type:pdf, Size:1020Kb
Hexagonal Image Processing in the Context of Machine Learning: Conception of a Biologically Inspired Hexagonal Deep Learning Framework Tobias Schlosser , Michael Friedrich , and Danny Kowerko Junior Professorship of Media Computing, Chemnitz University of Technology, 09107 Chemnitz, Germany, {firstname.lastname}@cs.tu-chemnitz.de Abstract—Inspired by the human visual perception system, While the structure and functionality of artificial neural hexagonal image processing in the context of machine learning networks is inspired by biological processes, they are also deals with the development of image processing systems that limited by their underlying structure due to the current state combine the advantages of evolutionary motivated structures based on biological models. While conventional state-of-the-art of the art of recording and output devices. Therefore, mostly image processing systems of recording and output devices almost square structures are used, which also significantly restrict exclusively utilize square arranged methods, their hexagonal subsequent image processing systems, as the set of allowed counterparts offer a number of key advantages that can benefit operations is reduced depending on the arrangement of the both researchers and users. This contribution serves as a general underlying structure [4]. application-oriented approach the synthesis of the therefore designed hexagonal image processing framework, called Hexnet, In comparison to the current state of the art in machine the processing steps of hexagonal image transformation, and learning, the human visual perception system suggests an alter- dependent methods. The results of our created test environment native, evolutionarily-based structure, which manifests itself in show that the realized framework surpasses current approaches the human eye. The retina of the human eye displays according of hexagonal image processing systems, while hexagonal artificial to Curcio et al. [5] a particularly strong hexagonal arrangement neural networks can benefit from the implemented hexagonal architecture. As hexagonal lattice format based deep neural of sensory cells, whereas following Middleton and Sivaswamy networks, also called H-DNN, can be compared to their square [6] the processing of incoming signals is handled much more counterparts by transforming classical square lattice based data efficiently. Through the layers of the retina the reduction of sets into their hexagonal representation, they can also result in a the received information to the nerve fibers takes place, which reduction of trainable parameters as well as result in increased are connected to the brain via afferents over the optic nerve. training and test rates. Hubel and Wiesel Index Terms—Computer Vision, Pattern and Image Recogni- [7] demonstrated that the image in the visual tion, Deep Learning, Convolutional Neural Networks, Hexagonal cortex is projected retinotopic, whereas adjacent structures of Image Processing, Hexagonal Lattice, Hexagonal Sampling the retina are preserved and processed through the following areas of the brain. I. INTRODUCTION AND MOTIVATION The use of hexagonal structures emerges therefore as an evolutionarily motivated approach. Compared to its square Following the developments of recent years, machine learn- counterpart, the hexagonal lattice format (Fig.1) features a ing methods in the form of artificial neural networks are be- number of decisive advantages. These include the homogeneity arXiv:1911.11251v7 [cs.LG] 18 Mar 2021 coming increasingly important. Convolutional neural networks of the hexagonal lattice format, whereby the equidistance and (CNN) for object recognition and classification in this context uniqueness of neighborhood as well as an increased radial are one of the approaches that are in the focus of current symmetry are given. Together with an by 13:4 % increased research in the domain of deep learning. In order to meet transformation efficiency as shown by Petersen and Middleton the ever-increasing demands of more complex problems, novel [8] and Mersereau [9], hexagonal representations enable the application areas, and larger data sets, following Krizhevsky et storage of larger amounts of data on the same number of al. [1], more and more novel models and procedures are being sampling points, which following Golay [10] result in reduced developed, as their complexity and diversity following Szegedy computation times, less quantization errors, and an increased et al. [2] steadily increases. However, novel architectures efficiency in programming. for artificial neural networks, including convolutional and pooling layers, are being developed that exceed the conven- A. Related work tional cartesian-based architectures. These include, but are not Current hexagonal image processing research includes ap- limited to, spherical or non-Euclidean manifolds, as current plications in the fields of observation, experiment, and simula- research by Bronstein et al. [3] demonstrates. tion in ecology by Birch et al. [11], the design of geodesic grid 978-1-7281-4550-1/19/$31.00 © 2019 IEEE, 10.1109/ICMLA.2019.00300 y y z 0 1 2 3 2 (0, 0) (1, 0) (2, 0) (0, 1) (1, 1) 3 4 5 4 0 1 1 x p 1 1 x (0, 1) (1, 1) (2, 1) (-1, 0) (0, 0) (1, 0) 2 6 7 8 5 6 (0, 2) (1, 2) (2, 2) (-1, -1) (0, -1) (a) Linewise address- (b) Spiral addressing ing [27], [28] (a) Square lattice format (b) Hexagonal lattice format 23 22 Figure 1: Square and hexagonal lattice format in comparison. (1, 4) (2, 4) 24 20 21 13 12 (0, 3) (1, 3) (2, 3) (3, 3) (4, 3) 33 32 25 26 14 10 11 systems by Sahr et al. [12], sensor-based image processing and (-2, 2) (-1, 2) (0, 2) (1, 2) (2, 2) (3, 2) (4, 2) 34 30 31 3 2 15 16 remote sensing [13], [14], [15], medical imaging [16], [17], (-3, 1) (-2, 1) (-1, 1) (0, 1) (1, 1) (2, 1) (3, 1) 35 36 4 0 1 63 62 [18], and image synthesis [19]. (-3, 0) (-2, 0) (-1, 0) (0, 0) (1, 0) (2, 0) (3, 0) Furthermore, first experimental results for supply demand 43 42 5 6 64 60 61 (-3, -1) (-2, -1) (-1, -1) (0, -1) (1, -1) (2, -1) (3, -1) forecasting by Ke et al. (2018) [20] and the analysis of atmo- 44 40 41 53 52 65 66 spheric telescope data for event detection and classification (-4, -2) (-3, -2) (-2, -2) (-1, -2) (0, -2) (1, -2) (2, -2) 45 46 54 50 51 by Erdmann et al. (2018) [21] and Shilon et al. (2019) [22] (-4, -3) (-3, -3) (-2, -3) (-1, -3) (0, -3) give an overview over currently emerging application areas for 55 56 hexagonal image processing in the context of machine learn- (-2, -4) (-1, -4) ing. These application areas show promising opportunities as (c) Combination of (a) and (b) current research often deploys an initial processing step to Figure 2: Construction of hexagonal addressing schemes. transform the given hexagonal lattice format based image data into a square representation [23], [24]. As the amount of research which combines principles from As this contribution aims to benefit both researchers and hexagonal image processing and machine learning is quite lim- users alike by providing the necessary tools to ease the ited, this contribution also aims to overcome the downsides of application and development of hexagonal models, we will currently developed hexagonal machine learning approaches. also provide the tools to evaluate said models with focus These include not only the development of hexagonal convo- on their performance and transformation quality as compared lutional layers following Hoogeboom et al. (2018) [25] and to classical square and hexagonal lattice format based ap- Steppa and Holch (2019) [26], but also the implementation proaches. The related developed architectures, procedures, of the underlying addressing scheme as well as the necessary and test results, including future developments and further processing steps for transformation and visualization. application-specific contributions, are made publicly available to the research community via open science and are found on B. Contribution of this work the project page of Hexnet and its repository1. In order to combine the advantages of the research fields of hexagonal image processing and machine learning in the II. FUNDAMENTALS AND METHODS form of deep neural networks, this contribution serves as To allow the addressing, transformation, and visualization a general application-oriented approach the conception of a of images in their hexagonal representation, the necessary first hexagonal deep learning framework. While serving the processing steps as well as their underlying addressing scheme recognition and classification of objects in larger data sets have to be determined. The following sections introduce our where the original image resolution often needs to be reduced proposed hexagonal addressing scheme as identified for its as an initial processing step, the necessary information content applicability in comparison to currently deployed square and should also be maximized. With focus on the implementation hexagonal lattice format based architectures. In conjunction of the underlying structure, the designed framework, called with the proposed architecture, we will then introduce the Hexnet, includes the necessary processing steps as well as its principles of hexagonal image processing in the context of underlying hexagonal addressing scheme. deep learning. However, as hexagonal image data has to be captured using a hexagonal sensor, and the corresponding hardware is A. The hexagonal addressing scheme either rare or limited in its applicability, alternatives include While the one-dimensional spiral architecture addressing the transformation or synthesis of hexagonal image data. scheme (SAA) introduced by Burt [27] and Gibson and Lucas Therefore, the question arises, in which way and to what [28] established itself as the preferred addressing scheme degree currently deployed square lattice format based data sets for hexagonal lattice format based image processing systems can be leveraged in the context of hexagonal deep learning approaches. 1https://github.com/TSchlosser13/Hexnet Input (75 × 86) FM2 (6 @ 8 × 9) FCLs + Output FM3 (6 @ 3 × 3) FM1 (6 @ 25 × 28) .