
An Effective Multi-Resolution Hierarchical Granular Representation based Classifier using General Fuzzy Min-Max Neural Network

Thanh Tung Khuat, Fang Chen, and Bogdan Gabrys, Senior Member, IEEE

T.T. Khuat (email: [email protected]) and B. Gabrys (email: [email protected]) are with the Advanced Analytics Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia. F. Chen (email: [email protected]) is with the Data Science Centre, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW 2007, Australia.

arXiv:1905.12170v3 [cs.LG] 3 Dec 2019

Abstract—Motivated by the practical demands for simplification of data towards being consistent with human thinking and problem solving as well as tolerance of uncertainty, information granules are becoming important entities in data processing at different levels of data abstraction. This paper proposes a method to construct classifiers from multi-resolution hierarchical granular representations (MRHGRC) using hyperbox fuzzy sets. The proposed approach forms a series of granular inferences hierarchically through many levels of abstraction. An attractive characteristic of our classifier is that it can maintain a high accuracy in comparison to other fuzzy min-max models at a low degree of granularity based on reusing the knowledge learned from lower levels of abstraction. In addition, our approach can reduce the data size significantly as well as handle the uncertainty and incompleteness associated with data in real-world applications. The construction process of the classifier consists of two phases. The first phase is to formulate the model at the greatest level of granularity, while the later stage aims to reduce the complexity of the constructed model and deduce it from data at higher abstraction levels. Experimental analyses conducted comprehensively on both synthetic and real datasets indicated the efficiency of our method in terms of training time and predictive performance in comparison to other types of fuzzy min-max neural networks and common machine learning algorithms.

Index Terms—Information granules, granular computing, hyperbox, general fuzzy min-max neural network, classification, hierarchical granular representation.

I. INTRODUCTION

HIERARCHICAL problem solving, where the problems are analyzed in a variety of granularity degrees, is a typical characteristic of the human brain [1]. Inspired by this ability, granular computing was introduced. One of the critical features of granular computing is to model the data as high-level abstract structures and to tackle problems based on these representations similar to structured human thinking [2]. Information granules (IGs) [3] are underlying constructs of granular computing. They are abstract entities describing important properties of numeric data and formulating knowledge pieces from data at a higher abstraction level. They play a critical role in the concise description and abstraction of numeric data [4]. Information granules have also contributed to quantifying the limited numeric precision in data [5]. Utilizing information granules is one of the problem-solving methods based on decomposing a big problem into sub-tasks which can be solved individually. In the world of big data, one regularly departs from specific data entities and discovers general rules from data via encapsulation and abstraction. The use of information granules is meaningful when tackling the five Vs of big data [6], i.e., volume, variety, velocity, veracity, and value. The granulation process gathering similar data together contributes to reducing the data size, and so the volume issue is addressed. The information from many heterogeneous sources can be granulated into various granular constructs, and then several measures and rules for uniform representation are proposed to fuse base information granules, as shown in [7]. Hence, the data variety is addressed. Several studies constructed evolving information granules to adapt to the changes in the streams of data, as in [8]. The variations of information granules in a high-speed data stream assist in tackling the velocity problem of big data. The process of forming information granules is often associated with the removal of outliers and dealing with incomplete data [6]; thus the veracity of data is guaranteed. Finally, the multi-resolution hierarchical architecture of various granular levels can disregard some irrelevant features but highlight facets of interest [9]. In this way, the granular representation may help with cognitive demands and capabilities of different users.

A multi-dimensional hyperbox is a fundamental conceptual vehicle to represent information granules. Each fuzzy min-max hyperbox is determined by the minimum and maximum points and a fuzzy membership function. A classifier can be built from the hyperbox fuzzy sets along with an appropriate training algorithm. We can extract a rule set directly from hyperbox fuzzy sets or by using them in combination with other methods such as decision trees [10] to account for the predictive results. However, a limitation of hyperbox-based classifiers is that their accuracy at the low level of granularity (corresponding to large-sized hyperboxes) decreases. In contrast, classifiers at the high granularity level are more accurate, but the building process of classifiers at this level is time-consuming, and it is difficult to extract a rule set interpretable for predictive outcomes because of the high complexity of resulting models. Hence, it is desired to construct a simple classifier with high accuracy. In addition, we expect to observe the change in the predictive results at different data abstraction levels. This paper proposes a method of constructing a high-precision classifier at a high data abstraction level based on the knowledge learned from lower abstraction levels. On the basis of classification errors on the validation set, we can predict the change in the accuracy of the constructed classifier on unseen data, and we can select an abstraction level satisfying both acceptable accuracy and simple architecture of the resulting classifier. Furthermore, our method is likely to expand to large-sized datasets due to the capability of parallel execution during the constructing process of core hyperboxes at the highest level of granularity.

In our method, the algorithm starts with a relatively small value of the maximum hyperbox size (θ) to produce base hyperbox fuzzy sets, and then this threshold is increased in succeeding levels of abstraction whose inputs are the hyperbox fuzzy sets formed in the previous step. By using many hierarchical resolutions of granularity, the information captured in earlier steps is transferred to the classifier at the next level. Therefore, the classification accuracy is still maintained at an acceptable value when the resolution of training data is low.

Data generated from complex real-world applications frequently change over time, so the machine learning models used to predict behaviors of such systems need an efficient online learning capability. Many studies considered the online learning capability when building machine learning models, such as [11], [12], [13], [14], [15], [16], and [17]. The fuzzy min-max neural network proposed by Simpson [11] and many of its improved variants only work on input data in the form of points. In practice, due to the uncertainty and some abnormal behaviors in the systems, the input data include not only crisp points but also intervals. To address this problem, Gabrys and Bargiela [12] introduced the general fuzzy min-max (GFMM) neural network, which can handle both fuzzy and crisp input samples. By using hyperbox fuzzy sets for the input layer, this model can accept input patterns in the granular form and process data at a high-level abstract structure. As a result, our proposed method uses a similar mechanism to the general fuzzy min-max neural network to build a series of classifiers through different resolutions, where the small-sized hyperbox fuzzy sets generated in the previous step become the input to be handled at a higher level of abstraction (corresponding to a higher value of the allowable hyperbox size).

Going through different resolution degrees, the valuable information in the input data is fuzzified and reduced in size, but our method helps to preserve the amount of knowledge contained in the original datasets. This capability is illustrated via the slow decline in the classification accuracy. In some cases, the predictive accuracy increases at higher levels of abstraction because the noise existing in the detailed levels is eliminated.

Building on the principles of developing GFMM classifiers with good generalization performance discussed in [18], this paper employs different hierarchical representations of granular data with various hyperbox sizes to select a compact classifier with acceptable accuracy at a high level of abstraction. Hierarchical granular representations using consecutive maximum hyperbox sizes form a set of multi-resolution hyperbox-based models, which can be used to balance the trade-off between efficiency and simplicity of the classifiers. A model with high resolution corresponds to the use of a small value of the maximum hyperbox size, and vice versa. A choice of a suitable resolution level results in better predictive accuracy of the generated model. Our main contributions in this paper can be summarized as follows:
• We propose a new data classification model based on the multi-resolution of granular data representations in combination with the online learning ability of the general fuzzy min-max neural network.
• The proposed method is capable of reusing the learned knowledge from the highest granularity level to construct new classifiers at higher abstraction levels with a low trade-off between simplification and accuracy.
• The efficiency and running time of the general fuzzy min-max classifier are significantly enhanced in the proposed algorithm.
• Our classifier can perform on large-sized datasets because of its parallel execution ability.
• Comprehensive experiments are conducted on synthetic and real datasets to prove the effectiveness of the proposed method compared to other approaches and baselines.

The rest of this paper is organized as follows. Section II presents existing studies related to information granules as well as briefly describes the online learning version of the general fuzzy min-max neural network. Section III shows our proposed method to construct a classifier based on data granulation. Experimental configuration and results are presented in Section IV. Section V concludes the main findings and discusses some open directions for future works.

II. PRELIMINARIES

A. Related Work

There are many approaches to representing information granules [19]. Several typical methods include intervals [20], fuzzy sets [21], shadowed sets [22], and rough sets [23]. Our study only focuses on fuzzy sets and intervals. Therefore, related works only mention the granular representation using these two methods.

The existing studies on granular data representation have deployed a specific clustering technique to find representative prototypes, and then build information granules from these prototypes and optimize the constructed granular descriptors. The principle of justifiable granularity [24] has usually been utilized to optimize the construction of information granules from available experimental evidence. This principle aims to make a good balance between the coverage and specificity properties of the resulting granule concerning the available data. The coverage property relates to how much data is located inside the constructed information granule, whereas the specificity of a granule is quantified by the length of its interval such that the shorter the interval, the better the specificity. Pedrycz and Homenda [24] made a compromise between these two characteristics by finding the parameters to maximize the product of the coverage and specificity.
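To make the coverage-specificity compromise concrete, the short sketch below scores a one-dimensional interval granule built around a numeric prototype and picks the radius maximizing the product of the two criteria. It is only an illustration of the principle of justifiable granularity on synthetic data; the linear specificity measure, the data, and the function names are our own choices, not those of [24].

```python
import numpy as np

def coverage(data, a, b):
    """Fraction of the data falling inside the interval granule [a, b]."""
    return np.mean((data >= a) & (data <= b))

def specificity(a, b, lo, hi):
    """1 for a degenerate interval, 0 when [a, b] spans the whole data range."""
    return 1.0 - (b - a) / (hi - lo)

data = np.random.default_rng(0).normal(0.5, 0.1, 1000)
prototype, lo, hi = np.median(data), data.min(), data.max()

# pick the symmetric radius around the prototype that maximizes coverage * specificity
radii = np.linspace(0.0, (hi - lo) / 2, 200)
scores = [coverage(data, prototype - r, prototype + r) *
          specificity(prototype - r, prototype + r, lo, hi) for r in radii]
best = radii[int(np.argmax(scores))]
print(f"granule: [{prototype - best:.3f}, {prototype + best:.3f}]")
```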

Instead of just stopping at the numeric prototypes, partition matrices, or dendrograms for data clustering, Pedrycz and Bargiela [25] offered a concept of granular prototypes to capture more details of the structure of the data to be clustered. Granular prototypes were formed around the resulting numeric prototypes of clustering algorithms by using some degree of granularity. Information granularity is allocated to each numeric prototype to maximize the quality of the granulation-degranulation process of generated granules. This process was also built as an optimization problem steered by the coverage criteria, i.e., maximization of the original number of data included in the information granules after degranulation.

In [5], Pedrycz developed an idea of granular models derived from the establishment of optimal allocation of information granules. The authors gave motivation and plausible explanations in bringing the numeric models to the next abstraction levels to form granular models. In the realization flow of the general principles, Pedrycz et al. [4] introduced a holistic process of constructing information granules through a two-phase procedure in a general framework. The first phase focuses on formulating numeric prototypes using fuzzy C-means, and the second phase refines each resulting prototype to form a corresponding information granule by employing the principle of justifiable granularity.

When the problem becomes complicated, one regularly splits it into smaller sub-tasks and deals with each sub-task on a single level of granularity. These actions give rise to the appearance of multi-granularity computing, which aims to tackle problems from many levels of different IGs rather than just one optimal granular layer. Wang et al. [1] conducted a review of previous studies of granular computing and claimed that multi-granularity joint problem resolving is a valuable research direction to enhance the quality and efficiency of solutions based on using multiple levels of information granules rather than only one granular level. This is the primary motivation for our study to build suitable classifiers from many resolutions of granular data representations.

All the above methods of building information granules are based on clustering techniques and affected by a pre-determined parameter, i.e., the number of clusters. The resulting information granules are only a summarization of the original data at a higher abstraction level, and they did not use the class information in the constructing process of granules. The authors have not used the resulting granules to deal with classification problems either. Our work is different from these studies because we propose a method to build classifiers from various abstraction levels of data using hyperbox fuzzy sets while maintaining the reasonable stability of classification results. In addition, our method can learn useful information from data through an online approach and the continuous adjustment of the existing structure of the model.

In the case of formulating models in a non-stationary environment, it is essential to endow them with some mechanisms to deal with the dynamic environment. In [26], Sahel et al. assessed two adaptive methods to tackle data drifting problems, i.e., retraining models using evolving data and deploying incremental learning algorithms. Although these approaches improved the accuracy of classifiers compared to non-adaptive learners, the authors indicated a great demand for building robust techniques with high reliability for dynamic operating environments. To meet the continuous changes in data and the adaptation of the analytic system to this phenomenon, Peters and Weber [27] suggested a framework, namely Dynamic Clustering Cube, to classify dynamic granular clustering methods. Al-Hmouz et al. [8] introduced evolvable data models through the dynamic changing of temporal or spatial IGs. The number of information granules formed from prototypes of data can increase or decrease through merging or splitting existing granules according to the varying complexity of data streams. In addition to the ability to merge existing hyperboxes for the construction of granular hierarchies of hyperboxes, our proposed method also has the online learning ability, so it can be used to tackle classification problems in a dynamic environment.

Although this study is built based on the principles of GFMM classifiers, it differs from the GFMM neural network with adaptive hyperbox sizes [12]. In [12], the classifier was formed at the high abstraction level with large-sized hyperboxes, and then the process of traversing the entire training dataset was repeated many times to build additional hyperboxes at lower abstraction levels with the aim of covering patterns missed by the large-sized hyperboxes due to the contraction procedure. This operation can make the final classifier complex, with a large number of hyperboxes at different levels of granularity coexisting in a single classifier, and the overfitting phenomenon on the training set is more likely to happen. In contrast, our method begins with the construction process of the classifier at the highest resolution of training patterns with small-sized hyperboxes to capture detailed information and relationships among data points located in the vicinity of each other. After that, at higher levels of abstraction, we do not use the data points from the training set. Instead, we reuse the hyperboxes generated from the preceding step. For each input hyperbox, the membership values with the hyperboxes in the current step are computed, and if the highest membership degree is larger than a pre-defined threshold, the aggregation process is performed to form hyperboxes with higher abstraction degrees. Our research is also different from the approach presented in [28], where the incremental algorithm was employed to reduce the data complexity by creating small-sized hyperboxes, and then these hyperboxes were used as training inputs of an agglomerative learning algorithm with a higher abstraction level. The method in [28] only constructs the classifier with two abstraction levels, while our algorithm can generate a series of classifiers at hierarchical resolutions of abstraction levels. In addition, the agglomerative learning in [28] is time-consuming, especially on large-sized datasets. Therefore, when the number of hyperboxes generated by the incremental learning algorithm on large-sized training sets is large, the agglomerative learning algorithm takes a long time to formulate hyperboxes. On the contrary, our method takes advantage of the incremental learning ability to rapidly build classifiers through different levels of the hierarchical resolutions.

B. General Fuzzy Min-Max Neural Network

The general fuzzy min-max (GFMM) neural network [12] is a generalization and extension of the fuzzy min-max neural network (FMNN) [11]. It combines both classification and clustering in a unified framework and can deal with both fuzzy and crisp input samples. The architecture of the general fuzzy min-max neural network comprises three layers, i.e., an input layer, a hyperbox fuzzy set layer, and an output (class) layer, shown in Fig. 1.

[Fig. 1 diagram: 2·n input nodes x_1^l, ..., x_n^l and x_1^u, ..., x_n^u connected to hyperbox nodes B_1, ..., B_m through the minimum-point matrix V and the maximum-point matrix W, with the hyperbox nodes connected to class nodes c_0, c_1, ..., c_p through the binary matrix U.]

Fig. 1. The architecture of GFMM neural network.

The input layer contains 2 · n nodes, where n is the number of dimensions of the problem, to fit with the input sample X = [X^l, X^u], determined within the n-dimensional unit cube I^n. The first n nodes in the input layer are connected to the m nodes of the second layer, which contains hyperbox fuzzy sets, by the minimum points matrix V. The remaining n nodes are linked to the m nodes of the second layer by the maximum points matrix W. The values of the two matrices V and W are adjusted during the learning process. Each hyperbox B_i is defined by an ordered set: B_i = {X, V_i, W_i, b_i(X, V_i, W_i)}, where V_i ⊂ V, W_i ⊂ W are the minimum and maximum points of hyperbox B_i respectively, and b_i is the membership value of hyperbox B_i in the second layer; it is also the transfer function between input nodes in the first layer and hyperbox nodes in the second layer. The membership function b_i is computed using (1) [12].

b_i(X, V_i, W_i) = \min_{j=1\ldots n}\Big(\min\big([1 - f(x_j^u - w_{ij}, \gamma_j)],\ [1 - f(v_{ij} - x_j^l, \gamma_j)]\big)\Big)   (1)

where

f(r, \gamma) = \begin{cases} 1, & \text{if } r\gamma > 1 \\ r\gamma, & \text{if } 0 \le r\gamma \le 1 \\ 0, & \text{if } r\gamma < 0 \end{cases}

is the threshold function and γ = [γ_1, ..., γ_n] is a sensitivity parameter regulating the speed of decrease of the membership values.

Hyperboxes in the middle layer are fully connected to the third-layer nodes by a binary valued matrix U. The elements in the matrix U are computed as follows:

u_{ij} = \begin{cases} 1, & \text{if hyperbox } B_i \text{ represents class } c_j \\ 0, & \text{otherwise} \end{cases}   (2)

where B_i is the hyperbox of the second layer, and c_j is the jth node in the third layer. The output of each node c_j in the third layer is a membership degree to which the input pattern X fits within the class j. The transfer function of each node c_j among the p + 1 nodes belonging to the third layer is computed as:

c_j = \max_{i=1}^{m} b_i \cdot u_{ij}   (3)

Node c_0 is connected to all unlabeled hyperboxes of the middle layer. The values of the nodes in the output layer can be fuzzy if they are computed directly from (3), or crisp in the case that the node with the largest value is assigned to 1 and the others get a value of zero [12].

The incremental learning algorithm for the GFMM neural network to adjust the values of the two matrices V and W includes four steps, i.e., initialization, expansion, hyperbox overlap test, and contraction, in which the last three steps are repeated. In the initialization stage, each hyperbox B_i is initialized with V_i = 1 and W_i = 0. For each input sample X, the algorithm finds the hyperbox B_i with the highest membership value representing the same class as X to verify two expansion conditions, i.e., maximum allowable hyperbox size and class label compatibility, as shown in the supplemental file. If both criteria are met, the selected hyperbox is expanded. If no hyperbox meets the expansion conditions, a new hyperbox is created to accommodate the input data. Otherwise, if the hyperbox B_i was expanded in the prior step, it will be checked for an overlap with other hyperboxes B_k as follows. If the class of B_i is equal to zero, then the hyperbox B_i must be checked for overlap with all existing hyperboxes; otherwise, the overlap checking is only performed between B_i and hyperboxes B_k representing other classes. If an overlap occurs, a contraction process is carried out to remove the overlapping area by adjusting the sizes of the hyperboxes according to the dimension with the smallest overlapping value. Four cases of the overlap checking and contraction procedures were presented in detail in the supplemental file and can be found in [12].
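For illustration, the following sketch evaluates the membership function (1) and the class-node outputs (2)-(3) with NumPy. It is a reimplementation from the formulas above rather than the authors' released code; the array layout (one row per hyperbox) and the default γ = 1 are assumptions.

```python
import numpy as np

def threshold(r, gamma):
    """f(r, gamma) from (1): ramp function clipped to [0, 1]."""
    return np.minimum(1.0, np.maximum(0.0, r * gamma))

def membership(xl, xu, V, W, gamma=1.0):
    """b_i(X, V_i, W_i) in (1) for an input box X = [xl, xu] against all m
    hyperboxes with minimum points V (m x n) and maximum points W (m x n)."""
    upper = 1.0 - threshold(xu[None, :] - W, gamma)   # penalty for exceeding each max point
    lower = 1.0 - threshold(V - xl[None, :], gamma)   # penalty for falling below each min point
    return np.minimum(upper, lower).min(axis=1)       # min over the n dimensions

def class_outputs(b, U):
    """c_j = max_i b_i * u_ij in (3); U is the m x (p+1) binary class matrix of (2)."""
    return (b[:, None] * U).max(axis=0)

# toy example: two 2-D hyperboxes of classes 1 and 2, and one crisp input point
V = np.array([[0.1, 0.1], [0.6, 0.6]])
W = np.array([[0.3, 0.3], [0.9, 0.9]])
U = np.array([[0, 1, 0], [0, 0, 1]])   # columns: c_0 (unlabelled), c_1, c_2
x = np.array([0.32, 0.28])
print(class_outputs(membership(x, x, V, W), U))   # [0.   0.98 0.68]
```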

III. PROPOSED METHODS

A. Overview

The learning process of the proposed method consists of two phases. The first phase is to rapidly construct small-sized hyperbox fuzzy sets from similar input data points. This phase is performed in parallel on training data segments. The data in each fragment can be organized according to two modes. The first way is called the heterogeneous mode, which uses the data order read from the input file. The second mode is homogeneous, in which the data are sorted according to groups; each group contains data from the same class. The main purpose of the second phase is to decrease the complexity of the model by reconstructing the phase-1 hyperboxes with a higher abstraction level.
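The first phase lends itself to a simple data-parallel skeleton such as the sketch below, which distributes chunks of the training data to workers and switches between the heterogeneous and homogeneous orderings. The build_phase1_hyperboxes worker is a hypothetical placeholder for the modified GFMM procedure of phase 1, not code from the paper.

```python
import numpy as np
from multiprocessing import Pool

def build_phase1_hyperboxes(args):
    """Placeholder for the modified GFMM procedure F_j run on one worker."""
    X_chunk, y_chunk, theta0 = args
    return []   # would return the small hyperboxes built with maximum size theta0

def phase1(X, y, theta0=0.1, n_workers=4, homogeneous=False):
    if homogeneous:                       # group the samples by class label first
        order = np.argsort(y, kind="stable")
        X, y = X[order], y[order]
    chunks = [(Xc, yc, theta0) for Xc, yc in
              zip(np.array_split(X, n_workers), np.array_split(y, n_workers))]
    with Pool(n_workers) as pool:
        parts = pool.map(build_phase1_hyperboxes, chunks)
    return [box for part in parts for box in part]   # merged before pruning
```

On platforms that spawn processes, a call to phase1 should be placed under an `if __name__ == "__main__":` guard.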

[Fig. 2 diagram: in phase 1, the training data are loaded in chunks and distributed to workers; each worker constructs hyperbox fuzzy sets by incremental learning and pattern centroid building, using either the heterogeneous (no grouping) or the homogeneous (group by labels) ordering and the phase-1 maximum hyperbox size; the workers' hyperboxes are then merged, contained hyperboxes are deleted, and pruning produces the phase-1 hyperbox fuzzy sets. In phase 2, these hyperboxes are passed incrementally through expansion/creation of hyperboxes and overlap resolving, controlled by the maximum hyperbox size and the minimum membership threshold, to build the phase-2 hyperbox fuzzy sets.]

Fig. 2. Pipeline of the training process of the proposed method.

In the first step of the training process, the input samples are split into disjoint sets and are then distributed to different computational workers. On each worker, we build an independent general fuzzy min-max neural network. When all training samples are handled, all created hyperboxes at different workers are merged to form a single model. Hyperboxes completely included in other hyperboxes representing the same class are eliminated to reduce the redundancy and complexity of the final model. After combining hyperboxes, the pruning procedure needs to be applied to eliminate noise and low-quality hyperboxes. The resulting hyperboxes are called phase-1 hyperboxes.

However, phase-1 hyperboxes have small sizes, so the complexity of the system can be high. As a result, all these hyperboxes are put through phase 2 of the granulation process with a gradual increase in the maximum hyperbox sizes. At a larger value of the maximum hyperbox size, hyperboxes at a low level of abstraction will be reconstructed with a higher data abstraction degree. Previously generated hyperboxes are fetched one at a time, and they are aggregated with newly constructed hyperboxes at the current granular representation level based on a similarity threshold of the membership degree. This process is repeated for each input value of the maximum hyperbox sizes. The whole process of the proposed method is shown in Fig. 2. Based on the classification error of the resulting classifiers on the validation set, one can select an appropriate predictor satisfying both simplicity and precision. The following part provides the core concepts for both phases of our proposed method in the form of mathematical descriptions. The details of Algorithm 1 for phase 1 and Algorithm 2 corresponding to phase 2, as well as their implementation aspects, are shown in the supplemental material. The readers can refer to this document to find more about the free text descriptions, pseudo-codes, and implementation pipeline of the algorithms.

B. Formal Description

Consider a training set of N_Tr data vectors, X^(Tr) = {X_i^(Tr) : X_i^(Tr) ∈ R^n, i = 1, ..., N_Tr}, and the corresponding classes, C^(Tr) = {c_i^(Tr) : c_i^(Tr) ∈ N, i = 1, ..., N_Tr}; and a validation set of N_V data vectors, X^(V) = {X_i^(V) : X_i^(V) ∈ R^n, i = 1, ..., N_V}, and the corresponding classes, C^(V) = {c_i^(V) : c_i^(V) ∈ N, i = 1, ..., N_V}. The details of the method are formally described as follows.

1) Phase 1:

Let n_w be the number of workers to execute the hyperbox construction process in parallel. Let F_j(X_j^(Tr), C_j^(Tr), θ_0) be the procedure to construct hyperboxes on the jth worker with maximum hyperbox size θ_0 using training data {X_j^(Tr), C_j^(Tr)}. Procedure F_j is a modified fuzzy min-max neural network model which only creates new hyperboxes or expands existing hyperboxes. It accepts the overlapping regions among hyperboxes representing different classes, because we expect to rapidly capture similar samples and group them into specific clusters by small-sized hyperboxes without spending much time on the computationally expensive hyperbox overlap test and resolving steps. Instead, each hyperbox B_i is given a centroid G_i of the patterns contained in that hyperbox and a counter N_i to store the number of data samples covered by it, in addition to its maximum and minimum points. This information is used to classify data points located in the overlapping regions. When a new pattern X is presented to the classifier, the operation of building the pattern centroid for each hyperbox (line 12 and line 15 in Algorithm 1) is performed according to (4).

G_i^{new} = \frac{N_i \cdot G_i^{old} + X}{N_i + 1}   (4)

where G_i is the sample centroid of the hyperbox B_i, and N_i is the number of current samples included in B_i. Next, the number of samples is updated: N_i = N_i + 1. It is noted that G_i is the same as the first pattern covered by the hyperbox B_i when B_i is newly created.

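The centroid bookkeeping in (4) can be sketched as follows; the Hyperbox class and attribute names are illustrative, and the maximum-hyperbox-size check of procedure F_j is omitted for brevity.

```python
import numpy as np

class Hyperbox:
    """Minimal phase-1 bookkeeping: min/max points plus centroid G and counter N."""
    def __init__(self, x, label):
        self.V = np.array(x, dtype=float)   # minimum point
        self.W = np.array(x, dtype=float)   # maximum point
        self.label = label
        self.G = np.array(x, dtype=float)   # centroid equals the first covered pattern
        self.N = 1

    def add_pattern(self, x):
        """Expand the box to cover x and update the centroid with (4)."""
        x = np.asarray(x, dtype=float)
        self.V = np.minimum(self.V, x)
        self.W = np.maximum(self.W, x)
        self.G = (self.N * self.G + x) / (self.N + 1)
        self.N += 1

b = Hyperbox([0.20, 0.40], label=1)
b.add_pattern([0.30, 0.20])
print(b.V, b.W, b.G, b.N)   # [0.2 0.2] [0.3 0.4] [0.25 0.3 ] 2
```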
After the process of building hyperboxes in all workers finishes, the merging step is conducted (lines 19-23 in Algorithm 1), and it is mathematically represented as:

M = \bigcup_{j=1}^{n_w} \{B_i \mid B_i \in F_j(X_j^{(Tr)}, C_j^{(Tr)}, \theta_0)\}   (5)

where M is the model after performing the merging procedure. It is noted that hyperboxes contained in larger hyperboxes representing the same class are eliminated (line 24 in Algorithm 1), and the centroids of the larger hyperboxes are updated using (6).

G_1^{new} = \frac{N_1 \cdot G_1^{old} + N_2 \cdot G_2^{old}}{N_1 + N_2}   (6)

where G_1 and N_1 are the centroid and the number of samples of the larger sized hyperbox, and G_2 and N_2 are the centroid and the number of samples of the smaller sized hyperbox. The number of samples in the larger sized hyperbox is also updated: N_1 = N_1 + N_2. This whole process is similar to the construction of an ensemble classifier at the model level shown in [29].

The pruning step is performed after merging hyperboxes to remove noise and low-quality hyperboxes (line 26 in Algorithm 1). Mathematically, it is defined as:

H_0 = \begin{cases} M_1 = M \setminus \{B_k \mid A_k < \alpha \lor A_k = Nil\}, & \text{if } E_V(M_1) \le E_V(M_2) \\ M_2 = M \setminus \{B_k \mid A_k < \alpha\}, & \text{otherwise} \end{cases}   (7)

where H_0 is the final model of stage 1 after applying the pruning operation, E_V(M_i) is the classification error of the model M_i on the validation set {X^(V), C^(V)}, α is the minimum accuracy of each hyperbox to be retained, and A_k is the predictive accuracy of hyperbox B_k ∈ M on the validation set, defined as follows:

A_k = \frac{\sum_{j=1}^{S_k} R_{kj}}{\sum_{j=1}^{S_k} (R_{kj} + I_{kj})}   (8)

where S_k is the number of validation samples classified by hyperbox B_k, R_{kj} is the number of samples predicted correctly by B_k, and I_{kj} is the number of incorrectly predicted samples. If S_k = 0, then A_k = Nil.

The classification step of unseen samples using model H_0 is performed in the same way as in the GFMM neural network, with an exception in the case of many winning hyperboxes with the same maximum membership value. In such a case, we compute the Euclidean distance from the input sample X to the centroids G_i of the winning hyperboxes B_i using (9). If the input sample is a hyperbox, X is the coordinate of the center point of that hyperbox.

d(X, G_i) = \sqrt{\sum_{j=1}^{n} (x_j - G_{ij})^2}   (9)

The input pattern is then classified to the hyperbox B_i with the minimum value of d(X, G_i).

2) Phase 2:

Unlike phase 1, the input data in this phase are the hyperboxes generated in the previous step. The purpose of stage 2 is to reduce the complexity of the model by aggregating hyperboxes created at a higher resolution level of granular data representations. At the high level of data abstraction, the confusion among hyperboxes representing different classes needs to be removed. Therefore, the overlapping regions formed in phase 1 have to be resolved, and there is no overlap allowed in this phase. Phase 2 can be mathematically represented as:

H_H(\Theta, m_s) = \{H_i \mid H_i = G(H_{i-1}, \theta_i, m_s)\}, \ \forall i \in [1, |\Theta|], \ \theta_i \in \Theta   (10)

where H_H is a list of models H_i constructed through different levels of granularity represented by maximum hyperbox sizes θ_i, Θ is a list of maximum hyperbox sizes, |Θ| is the cardinality of Θ, m_s is the minimum membership degree of two aggregated hyperboxes, G is a procedure to construct the models in phase 2 (it uses the model at the previous step as input), and H_0 is the model built in phase 1. The aggregation rule of hyperboxes, G, is described as follows:

For each input hyperbox B_h in H_{i-1}, the membership values between B_h and all existing hyperboxes with the same class as B_h in H_i are computed. We select the winner hyperbox with the maximum membership degree with respect to B_h, denoted B_k, to aggregate with B_h. The following constraints are verified before conducting the aggregation:
• Maximum hyperbox size:

\max(w_{hj}, w_{kj}) - \min(v_{hj}, v_{kj}) \le \theta_i, \ \forall j \in [1, n]   (11)

• The minimum membership degree:

b(B_h, B_k) \ge m_s   (12)

• Overlap test. The new hyperbox aggregated from B_h and B_k does not overlap with any existing hyperboxes in H_i belonging to other classes.

If hyperbox B_k has not met all of the above conditions, the hyperbox with the next highest membership value is selected, and the process is repeated until the aggregation step occurs or no hyperbox candidate is left. If the input hyperbox cannot be merged with existing hyperboxes in H_i, it will be directly inserted into the current list of hyperboxes in H_i. After that, the overlap test operation between the newly inserted hyperbox and hyperboxes in H_i representing other classes is performed, and then the contraction process will be executed to resolve overlapping regions. The algorithm is iterated for all input hyperboxes in H_{i-1}.

The classification process for unseen patterns using the hyperboxes in phase 2 is realized as in the GFMM neural network. A detailed description of the implementation steps for the proposed method can be found in the supplemental material.
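The aggregation constraints (11) and (12) can be checked as in the sketch below. The membership and overlap tests are passed in as callables because their details (equation (1) and the four overlap cases) are given elsewhere; the function names are illustrative, not an API from the paper.

```python
import numpy as np

def can_aggregate(Vh, Wh, Vk, Wk, theta, ms, membership_fn, overlap_free_fn):
    """Return True if hyperboxes B_h and B_k may be merged at granularity level theta."""
    V_new = np.minimum(Vh, Vk)
    W_new = np.maximum(Wh, Wk)
    # (11): the merged box must not exceed theta in any dimension
    if np.any(W_new - V_new > theta):
        return False
    # (12): the membership degree between the two boxes must reach ms
    if membership_fn(Vh, Wh, Vk, Wk) < ms:
        return False
    # overlap test: the merged box must not overlap boxes of other classes
    return overlap_free_fn(V_new, W_new)

# toy usage with stand-in callables
ok = can_aggregate(np.array([0.10, 0.10]), np.array([0.20, 0.20]),
                   np.array([0.15, 0.20]), np.array([0.30, 0.35]),
                   theta=0.3, ms=0.4,
                   membership_fn=lambda *_: 1.0,        # stand-in for b(B_h, B_k)
                   overlap_free_fn=lambda V, W: True)   # stand-in for the overlap test
print(ok)   # True
```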

C. Missing Value Handling

The proposed method can deal with missing values since it inherits this characteristic from the general fuzzy min-max neural network, as shown in [30]. A missing feature x_j is assumed to be able to receive values from the whole range, and it is presented by a real-valued interval as follows: x_j^l = 1, x_j^u = 0. By this initialization, the membership value associated with the missing value will be one, and thus the missing value does not cause any influence on the membership function of the hyperbox. During the training process, only observed features are employed for the update of hyperbox minimum and maximum vertices, while missing variables are disregarded automatically. For the overlap test procedure in phase 2, only the hyperboxes which satisfy v_{ij} ≤ w_{ij} for all dimensions j ∈ [1, n] are verified for the undesired overlapping areas. The second change is related to the way of calculating the membership value for the process of hyperbox selection or the classification step of an input sample with missing values. Some hyperboxes' dimensions have not been set, so the membership function shown in (1) is changed to b_i(X, min(V_i, W_i), max(W_i, V_i)). With the use of this method, the training data uncertainty is represented in the classifier model.

IV. EXPERIMENTS

Marcia et al. [31] argued that data set selection poses a considerable impact on conclusions about the accuracy of learners, and then the authors advocated for considering the properties of the datasets in experiments. They indicated the importance of employing artificial data sets constructed based on previously defined characteristics. In these experiments, therefore, we considered two types of synthetic datasets with linear and non-linear class boundaries. We also changed the number of features, the number of samples, and the number of classes for the synthetic datasets to assess the variability in the performance of the proposed method. In practical applications, the data are usually not ideal, not following a standard distribution rule, and including noisy data. Therefore, we also carried out experiments on real datasets with diversity in the numbers of samples, features, and classes.

For medium-sized real datasets such as Letter, Magic, White wine quality, and Default of credit card clients, the density-preserving sampling (DPS) method [32] was used to separate the original datasets into training, validation, and test sets. For large-sized datasets, we used the hold-out method for splitting datasets, which is the simplest and the least computationally expensive approach to assessing the performance of classifiers, because more advanced resampling approaches are not essential for large amounts of data [32]. The classification model is then trained on the training dataset. The validation set is used for the pruning step and for evaluating the performance of the constructed classifier aiming to select a suitable model. The testing set is employed to assess the efficiency of the model on unseen data.

The experiments aim to answer the following questions:
• How is the classification accuracy of the predictor using multi-resolution hierarchical granular representations improved in comparison to the model using a fixed granulation value?
• How good is the performance of the proposed method compared with other types of fuzzy min-max neural networks and popular algorithms based on other data representations such as support vector machines, Naive Bayes, and decision tree?
• Whether we can obtain a classifier with high accuracy at high abstraction levels of granular representations?
• Whether we can rely on the performance of the model on validation sets to select a good model for unseen data, which satisfies both simplicity and accuracy?
• How good is the ability of handling missing values in datasets without requiring data imputation?
• How critical are the roles of the pruning process and the use of sample centroids?

The limitation of runtime for each execution is seven days. If an execution does not finish within seven days, it will be terminated, and the result is reported as N/A. In the experiments, we set up the parameters as follows: n_w = 4, α = 0.5, m_s = 0.4, γ = 1, because they gave the best results on a set of preliminary tests with validation sets for the parameter selection. All datasets are normalized to the range of [0, 1] because of the characteristic of the fuzzy min-max neural networks. Most of the datasets, except the SUSY dataset, utilized the value of 0.1 for θ_0 in phase 1, and Θ = {0.2, 0.3, 0.4, 0.5, 0.6} for different levels of granularity in phase 2. For the SUSY dataset, due to the complexity and the limitation of runtime for the proposed method and other compared types of fuzzy min-max neural networks, θ_0 = 0.3 was used for phase 1, and Θ = {0.4, 0.5, 0.6} was employed for phase 2. For Naive Bayes, we used the Gaussian Naive Bayes (GNB) algorithm for classification. The radial basis function (RBF) was used as a kernel function for the support vector machines (SVM). We used the default setting parameters in the scikit-learn library for Gaussian Naive Bayes, SVM, and decision tree (DT) in the experiments. The performance of the proposed method was also compared to other types of fuzzy min-max neural networks such as the original fuzzy min-max neural network (FMNN) [11], the enhanced fuzzy min-max neural network (EFMNN) [33], the enhanced fuzzy min-max neural network with a K-nearest hyperbox expansion rule (KNEFMNN) [34], and the general fuzzy min-max neural network (GFMMNN) [12]. These types of fuzzy min-max neural networks used the same pruning procedure as our proposed method.

Synthetic datasets in our experiments were generated by using Gaussian distribution functions, so Gaussian Naive Bayes and SVM with the RBF kernel, which use Gaussian distribution assumptions to classify data, will achieve nearly optimal error rates because they match perfectly with the underlying data distribution. Meanwhile, fuzzy min-max classifiers employ hyperboxes to cover the input data, thus they are not an optimal representation for the underlying data. Therefore, the accuracy of hyperbox-based classifiers on synthetic datasets cannot outperform the predictive accuracy of Gaussian NB or SVM with the RBF kernel. However, Gaussian NB is a linear classifier, and thus it only outputs highly accurate predictive results for datasets with a linear decision boundary. In contrast, decision tree, fuzzy min-max neural networks, and SVM with the RBF kernel are universal approximators, and they can deal effectively with both linear and non-linear classification problems.
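A minimal sketch of the baseline setup described above, assuming the scikit-learn defaults named in the text (Gaussian Naive Bayes, RBF-kernel SVM, and decision tree) and the [0, 1] rescaling; the helper name and the use of MinMaxScaler are illustrative choices, not a prescription from the paper.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import MinMaxScaler

def run_baselines(X_train, y_train, X_test):
    """Fit the three reference classifiers on [0, 1]-normalized data."""
    scaler = MinMaxScaler().fit(X_train)
    X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)
    models = {
        "GNB": GaussianNB(),                 # Gaussian Naive Bayes
        "SVM": SVC(kernel="rbf"),            # RBF kernel, default C and gamma
        "DT": DecisionTreeClassifier(),      # default settings
    }
    return {name: m.fit(X_train, y_train).predict(X_test) for name, m in models.items()}
```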

All experiments were conducted on a computer with a Xeon 6150 2.7GHz CPU and 180GB RAM. We repeated each experiment five times to compute the average training time. The accuracy of the fuzzy min-max neural networks remains the same through different iterations because they only depend on the data presentation order, and we kept the same order of training samples during the experiments.

A. Performance of the Proposed Method on Synthetic Datasets

The first experiment was conducted on the synthetic datasets with the linear or non-linear boundary between classes. For each synthetic dataset, we generated a testing set containing 100,000 samples and a validation set with 10,000 instances using the same probability density function as the training sets.

1) Linear Boundary Datasets:

Increase the number of samples: We kept both the number of dimensions n = 2 and the number of classes C = 2 the same, and only the number of samples was changed to evaluate the impact of the number of patterns on the performance of classifiers. We used the Gaussian distribution to construct synthetic datasets as described in [35]. The means of the Gaussians of the two classes are given as follows: µ1 = [0, 0]^T, µ2 = [2.56, 0]^T, and the covariance matrices are as follows:

\Sigma_1 = \Sigma_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}

With the use of these configuration parameters, training sets with different sizes (10K, 1M, and 5M samples) were formed. Fukunaga [35] indicated that the general Bayes error of the datasets formed from these settings is around 10%. We generated an equal number of samples for each class to remove the impact of the imbalanced class property on the performance of classifiers. Fig. 3 shows the change in the error rates of different fuzzy min-max classifiers on the testing synthetic linear boundary datasets with the different numbers of training patterns when the level of granularity (θ) changes. The other fuzzy min-max neural networks used a fixed value of θ to construct the model, while our method builds the model starting from the defined lowest value of θ (phase 1) to the current threshold. For example, the model at θ = 0.3 in our proposed method is constructed with θ = 0.1, θ = 0.2, and θ = 0.3.

It can be seen from Fig. 3 that the error rates of our method are lower than those of other fuzzy min-max classifiers, especially at high abstraction levels of granular representations. At high levels of abstraction (corresponding to high values of θ), the error rates of other classification models are relatively high, while our proposed classifier still maintains a low error rate, just a little higher than the error at a high-resolution level of granular data. The lowest error rates of the different classifiers on validation (E_V) and testing (E_T) sets, as well as the total training time for six levels of abstraction, are shown in Table I. Best results are highlighted in bold in each table. The training time reported in this paper consists of the time for reading training files and model construction.

TABLE I. THE LOWEST ERROR RATES AND TRAINING TIME OF CLASSIFIERS ON SYNTHETIC LINEAR BOUNDARY DATASETS WITH DIFFERENT NUMBERS OF SAMPLES (n = 2, C = 2)

N   | Algorithm  | min E_V | min E_T | θ_V | θ_T | Time (s)
10K | He-MRHGRC  | 10.25   | 10.467  | 0.1 | 0.1 | 1.1378
10K | Ho-MRHGRC  | 10.1    | 10.413  | 0.1 | 0.1 | 1.3215
10K | GFMM       | 11.54   | 11.639  | 0.1 | 0.1 | 8.6718
10K | FMNN       | 10.05   | 10.349  | 0.1 | 0.1 | 46.4789
10K | KNEFMNN    | 12.07   | 12.232  | 0.1 | 0.1 | 9.4459
10K | EFMNN      | 10.44   | 10.897  | 0.1 | 0.1 | 48.9892
10K | GNB        | 9.85    | 9.964   | -   | -   | 0.5218
10K | SVM        | 9.91    | 9.983   | -   | -   | 1.5468
10K | DT         | 15.33   | 14.861  | -   | -   | 0.5405
1M  | He-MRHGRC  | 10.31   | 10.386  | 0.3 | 0.3 | 20.0677
1M  | Ho-MRHGRC  | 10.24   | 10.401  | 0.1 | 0.1 | 16.0169
1M  | GFMM       | 11.47   | 11.783  | 0.1 | 0.1 | 405.4642
1M  | FMNN       | 10.98   | 11.439  | 0.2 | 0.2 | 13163.1404
1M  | KNEFMNN    | 10.36   | 10.594  | 0.1 | 0.1 | 413.8296
1M  | EFMNN      | 11.61   | 11.923  | 0.6 | 0.6 | 10845.1280
1M  | GNB        | 9.87    | 9.972   | -   | -   | 5.0133
1M  | SVM        | 9.86    | 9.978   | -   | -   | 21798.2803
1M  | DT         | 14.873  | 14.682  | -   | -   | 9.9318
5M  | He-MRHGRC  | 10.11   | 10.208  | 0.5 | 0.5 | 101.9312
5M  | Ho-MRHGRC  | 10.04   | 10.222  | 0.1 | 0.1 | 75.2254
5M  | GFMM       | 13.14   | 13.243  | 0.1 | 0.1 | 1949.2138
5M  | FMNN       | 12.68   | 12.751  | 0.6 | 0.6 | 92004.7253
5M  | KNEFMNN    | 17.31   | 17.267  | 0.1 | 0.1 | 1402.1173
5M  | EFMNN      | 12.89   | 13.032  | 0.1 | 0.1 | 41888.5296
5M  | GNB        | 9.88    | 9.976   | -   | -   | 22.9343
5M  | SVM        | N/A     | N/A     | -   | -   | N/A
5M  | DT         | 15.253  | 14.692  | -   | -   | 70.2041

We can see that the accuracy of our method on unseen data using the heterogeneous data distribution (He-MRHGRC) regularly outperforms the accuracy of the classifier built based on the homogeneous data distribution (Ho-MRHGRC) using large-sized training sets. It is also observed that our method is less affected by overfitting when increasing the number of training samples while keeping the same testing set. For other types of fuzzy min-max neural networks, their error rates frequently increase with the increase in training size because of overfitting. The total training time of our algorithm is lower than that of other types of fuzzy min-max classifiers since our proposed method executes the hyperbox building process at the lowest value of θ in parallel, and we accept the overlapping areas among hyperboxes representing different classes to rapidly capture the characteristics of sample points located near each other. The hyperbox overlap resolving step is only performed at higher abstraction levels with a smaller number of input hyperboxes.

Our proposed method also achieves better classification accuracy compared to the decision tree, but it cannot overcome the support vector machines and Gaussian Naive Bayes methods on synthetic linear boundary datasets. However, the training time of the support vector machines on large-sized datasets is costly, and it even becomes unacceptable on training sets with millions of patterns. The synthetic datasets were constructed based on the Gaussian distribution, so the Gaussian Naive Bayes method can reach the minimum error rate, but our approach can also obtain error rates relatively near these optimal error values. We can observe that the best performance of the He-MRHGRC is attained at quite high abstraction levels of granular representations because some noisy hyperboxes at high levels of granularity are eliminated at lower granulation levels. These results demonstrate the efficiency and scalability of our proposed approach.
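For reference, the linear-boundary data described above can be reproduced with a few lines of NumPy from the stated means and identity covariances; the helper name, random seed, and class labels are arbitrary.

```python
import numpy as np

def make_linear_boundary(n_samples, n_features=2, seed=0):
    """Two equal-sized Gaussian classes: mu1 = 0, mu2 = (2.56, 0, ..., 0), Sigma = I."""
    rng = np.random.default_rng(seed)
    half = n_samples // 2
    mu1 = np.zeros(n_features)
    mu2 = np.zeros(n_features)
    mu2[0] = 2.56
    X = np.vstack([rng.multivariate_normal(mu1, np.eye(n_features), half),
                   rng.multivariate_normal(mu2, np.eye(n_features), half)])
    y = np.repeat([0, 1], half)
    return X, y

X, y = make_linear_boundary(10_000)   # e.g., the 10K training set used in Table I
```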

[Fig. 3 plots: error rate (%) versus the maximum hyperbox size θ (0.1 to 0.6) for Heterogeneous MRHGRC, Homogeneous MRHGRC, GFMMNN, FMNN, KNEFMNN, and EFMNN on the linear boundary training sets: (a) 10K samples, (b) 1M samples, (c) 5M samples.]

Fig. 3. The error rate of classifiers on synthetic linear boundary datasets with the different number of samples.

TABLE II. THE LOWEST ERROR RATES AND TRAINING TIME OF CLASSIFIERS ON SYNTHETIC LINEAR BOUNDARY DATASETS WITH DIFFERENT NUMBERS OF CLASSES (N = 10K, n = 2)

C  | Algorithm  | min E_V | min E_T | θ_V | θ_T | Time (s)
2  | He-MRHGRC  | 10.25   | 10.467  | 0.1 | 0.1 | 1.1378
2  | Ho-MRHGRC  | 10.10   | 10.413  | 0.1 | 0.1 | 1.3215
2  | GFMM       | 11.54   | 11.639  | 0.1 | 0.1 | 8.6718
2  | FMNN       | 10.05   | 10.349  | 0.1 | 0.1 | 46.4789
2  | KNEFMNN    | 12.07   | 12.232  | 0.1 | 0.1 | 9.4459
2  | EFMNN      | 10.44   | 10.897  | 0.1 | 0.1 | 48.9892
2  | GNB        | 9.85    | 9.964   | -   | -   | 0.5218
2  | SVM        | 9.91    | 9.983   | -   | -   | 1.5468
2  | DT         | 15.33   | 14.861  | -   | -   | 0.5405
4  | He-MRHGRC  | 19.76   | 19.884  | 0.4 | 0.4 | 1.0754
4  | Ho-MRHGRC  | 19.97   | 21.135  | 0.1 | 0.1 | 1.5231
4  | GFMM       | 22.34   | 22.515  | 0.1 | 0.1 | 10.8844
4  | FMNN       | 20.00   | 20.350  | 0.1 | 0.1 | 65.7884
4  | KNEFMNN    | 20.54   | 20.258  | 0.1 | 0.1 | 12.5618
4  | EFMNN      | 21.75   | 21.736  | 0.1 | 0.1 | 55.1921
4  | GNB        | 19.35   | 19.075  | -   | -   | 0.5492
4  | SVM        | 19.34   | 19.082  | -   | -   | 1.6912
4  | DT         | 26.94   | 27.014  | -   | -   | 0.5703
16 | He-MRHGRC  | 30.11   | 30.996  | 0.1 | 0.4 | 1.2686
16 | Ho-MRHGRC  | 28.70   | 30.564  | 0.1 | 0.1 | 1.8852
16 | GFMM       | 32.66   | 33.415  | 0.1 | 0.1 | 18.0554
16 | FMNN       | 29.78   | 31.035  | 0.1 | 0.1 | 69.6761
16 | KNEFMNN    | 33.42   | 34.670  | 0.1 | 0.1 | 22.3418
16 | EFMNN      | 31.80   | 33.239  | 0.1 | 0.1 | 76.0920
16 | GNB        | 27.12   | 28.190  | -   | -   | 0.5764
16 | SVM        | 27.29   | 28.103  | -   | -   | 1.6455
16 | DT         | 38.813  | 39.644  | -   | -   | 0.6023

Increase the number of classes: The purpose of the experiment in this subsection is to evaluate the performance of the proposed method on multi-class datasets. We kept the number of dimensions n = 2 and the number of samples N = 10,000, and only changed the number of classes to form synthetic multi-class datasets with C ∈ {2, 4, 16}. The covariance matrices stay the same as in the case of changing the number of samples.

Fig. 4 shows the change in error rates of fuzzy min-max classifiers with different numbers of classes on the testing sets. It can be easily seen that the error rates of our method are the lowest compared to other fuzzy min-max neural networks on all multi-class synthetic datasets at high abstraction levels of granular representations. At high abstraction levels, the error rates of other fuzzy min-max neural networks increase rapidly, while the error rate of our classifier still maintains its stability. In addition, the error rates of our method also augment slowly, in contrast to the behaviors of the other considered types of fuzzy min-max neural networks, when increasing the abstraction level of granular representations. These facts demonstrate the efficiency of our proposed method on multi-class datasets. The lowest error rates of classifiers on validation and testing sets, as well as the total training time, are shown in Table II. It is observed that the predictive accuracy of our method outperforms all considered types of fuzzy min-max classifiers and the decision tree, but it cannot overcome the Gaussian Naive Bayes and support vector machine methods. The training time of our method is shorter than that of the other fuzzy min-max neural networks and support vector machines on the considered multi-class synthetic datasets.

Increase the number of features: To generate the multi-dimensional synthetic datasets with the number of samples N = 10K and the number of classes C = 2, we used similar settings as in the generation of datasets with different numbers of samples. The means of the classes are µ1 = [0, ..., 0]^T, µ2 = [2.56, 0, ..., 0]^T, and the covariance matrices are as follows:

\Sigma_1 = \Sigma_2 = \begin{bmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{bmatrix}

The size of each expression corresponds to the number of dimensions n of the problem. Fukunaga [35] stated that the general Bayes error is 10%, and this Bayes error stays the same even when n changes.

Fig. 5 shows the change in the error rates with different levels of granularity on multi-dimensional synthetic datasets. In general, with a low number of dimensions, our method outperforms other fuzzy min-max neural networks. With high dimensionality and a small number of samples, the high levels of granularity result in high error rates, and the misclassification results drop considerably when the value of θ increases. The same trend also happens to the FMNN, whose accuracy at θ = 0.5 or θ = 0.6 is quite high. Apart from the FMNN on high dimensional datasets, our proposed method is better than the three other fuzzy min-max classifiers at high abstraction levels. Table III reports the lowest error rates of classifiers on validation and testing multi-dimensional sets as well as the total training time through six abstraction levels of granular representations.

[Fig. 4 plots: error rate (%) versus the maximum hyperbox size θ (0.1 to 0.6) for Heterogeneous MRHGRC, Homogeneous MRHGRC, GFMMNN, FMNN, KNEFMNN, and EFMNN on the 10K-sample linear boundary datasets with (a) 2 classes, (b) 4 classes, (c) 16 classes.]

Fig. 4. The error rate of classifiers on synthetic linear boundary datasets with the different number of classes.

[Fig. 5 plots: error rate (%) versus the maximum hyperbox size θ (0.1 to 0.6) for Heterogeneous MRHGRC, Homogeneous MRHGRC, GFMMNN, FMNN, KNEFMNN, and EFMNN on the 10K-sample, two-class linear boundary datasets with (a) 2 features, (b) 8 features, (c) 32 features.]

Fig. 5. The error rate of classifiers on synthetic linear boundary datasets with the different number of features.

TABLE III. THE LOWEST ERROR RATES AND TRAINING TIME OF CLASSIFIERS ON SYNTHETIC LINEAR BOUNDARY DATASETS WITH DIFFERENT NUMBERS OF FEATURES (N = 10K, C = 2)

n  | Algorithm  | min E_V | min E_T | θ_V | θ_T | Time (s)
2  | He-MRHGRC  | 10.250  | 10.467  | 0.1 | 0.1 | 1.1378
2  | Ho-MRHGRC  | 10.100  | 10.413  | 0.1 | 0.1 | 1.3215
2  | GFMM       | 11.540  | 11.639  | 0.1 | 0.1 | 8.6718
2  | FMNN       | 10.050  | 10.349  | 0.1 | 0.1 | 46.4789
2  | KNEFMNN    | 12.070  | 12.232  | 0.1 | 0.1 | 9.4459
2  | EFMNN      | 10.440  | 10.897  | 0.1 | 0.1 | 48.9892
2  | GNB        | 9.850   | 9.964   | -   | -   | 0.5218
2  | SVM        | 9.910   | 9.983   | -   | -   | 1.5468
2  | DT         | 15.330  | 14.861  | -   | -   | 0.5405
8  | He-MRHGRC  | 10.330  | 10.153  | 0.3 | 0.3 | 21.9131
8  | Ho-MRHGRC  | 10.460  | 10.201  | 0.3 | 0.3 | 23.0554
8  | GFMM       | 12.170  | 12.474  | 0.1 | 0.2 | 196.0682
8  | FMNN       | 10.250  | 10.360  | 0.6 | 0.6 | 302.8683
8  | KNEFMNN    | 12.720  | 12.844  | 0.1 | 0.1 | 618.2524
8  | EFMNN      | 11.300  | 10.907  | 0.4 | 0.4 | 579.3113
8  | GNB        | 9.940   | 9.919   | -   | -   | 0.5915
8  | SVM        | 9.980   | 9.927   | -   | -   | 2.0801
8  | DT         | 15.383  | 15.087  | -   | -   | 0.6769
32 | He-MRHGRC  | 11.070  | 10.995  | 0.5 | 0.5 | 226.3193
32 | Ho-MRHGRC  | 11.070  | 10.995  | 0.5 | 0.5 | 226.0611
32 | GFMM       | 12.390  | 12.625  | 0.3 | 0.3 | 847.6977
32 | FMNN       | 11.830  | 11.637  | 0.5 | 0.6 | 1113.6836
32 | KNEFMNN    | 17.410  | 18.395  | 0.1 | 0.4 | 837.9571
32 | EFMNN      | 13.890  | 13.766  | 0.4 | 0.4 | 1114.4976
32 | GNB        | 10.280  | 10.088  | -   | -   | 0.7154
32 | SVM        | 10.220  | 10.079  | -   | -   | 4.5937
32 | DT         | 15.400  | 15.201  | -   | -   | 1.0960

The training time of our method is much shorter than that of other types of fuzzy min-max neural networks. Generally, the performance of our proposed method overcomes the decision tree and other types of fuzzy min-max neural networks, but its predictive results cannot defeat the Gaussian Naive Bayes and support vector machines. It can be observed that the best performance on validation and testing sets is obtained at the same abstraction level of granular representations on all considered multi-dimensional datasets. This fact indicates that we can use the validation set to choose the best classifier at a given abstraction level among the constructed models through different granularity levels.

2) Non-linear Boundary: To generate non-linear boundary datasets, we set up the Gaussian means of the first class: µ1 = [−2, 1.5]^T, µ2 = [1.5, 1]^T, and the Gaussian means of the second class: µ3 = [−1.5, 3]^T, µ4 = [1.5, 2.5]^T. The covariance matrices for the first class, Σ1 and Σ2, and for the second class, Σ3 and Σ4, were established as follows:

\Sigma_1 = \begin{bmatrix} 0.5 & 0.05 \\ 0.05 & 0.4 \end{bmatrix}, \quad \Sigma_2 = \begin{bmatrix} 0.5 & 0.05 \\ 0.05 & 0.3 \end{bmatrix}, \quad \Sigma_3 = \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix}, \quad \Sigma_4 = \begin{bmatrix} 0.5 & 0.05 \\ 0.05 & 0.2 \end{bmatrix}

The number of samples for each class was equal, and the generated samples were normalized to the range of [0, 1].

created only a testing set containing 100,000 samples and a validation set with 10,000 patterns. Three different training sets containing 10K, 100K, and 5M samples were used to train the classifiers. We aim to evaluate the predictive results of our method on the non-linear boundary data when changing the size of the training set.

TABLE IV
THE LOWEST ERROR RATES AND TRAINING TIME OF CLASSIFIERS ON SYNTHETIC NON-LINEAR BOUNDARY DATASETS WITH DIFFERENT NUMBERS OF SAMPLES (n = 2, C = 2)

N     Algorithm    min EV   min ET   θV    θT    Time (s)
10K   He-MRHGRC     9.950    9.836   0.2   0.2   0.9616
10K   Ho-MRHGRC     9.820    9.940   0.1   0.1   1.1070
10K   GFMM         10.200    9.787   0.4   0.5   10.5495
10K   FMNN          9.770    9.753   0.5   0.5   61.1130
10K   KNEFMNN       9.890    9.505   0.2   0.2   16.1099
10K   EFMNN         9.750    9.565   0.1   0.4   60.6073
10K   GNB          10.740   10.626   -     -     0.5218
10K   SVM           9.750    9.490   -     -     1.5565
10K   DT           14.107   13.831   -     -     0.5388
100K  He-MRHGRC    10.130    9.670   0.3   0.3   2.5310
100K  Ho-MRHGRC     9.910    9.412   0.1   0.1   2.3560
100K  GFMM         11.810   11.520   0.1   0.1   44.7778
100K  FMNN         10.880   10.575   0.1   0.1   588.4412
100K  KNEFMNN      12.470   11.836   0.1   0.1   42.9151
100K  EFMNN        11.020   10.992   0.1   0.1   485.7613
100K  GNB          10.830   10.702   -     -     0.9006
100K  SVM           9.650    9.338   -     -     93.4474
100K  DT           14.277   13.642   -     -     1.1767
5M    He-MRHGRC    10.370   10.306   0.1   0.6   91.7894
5M    Ho-MRHGRC     9.940    9.737   0.1   0.1   69.5106
5M    GFMM         15.260   14.730   0.1   0.1   1927.6191
5M    FMNN         13.160   13.243   0.1   0.1   53274.4387
5M    KNEFMNN      15.040   14.905   0.1   0.1   1551.5220
5M    EFMNN        15.660   15.907   0.2   0.2   54487.6978
5M    GNB          10.840   10.690   -     -     22.9849
5M    SVM          N/A      N/A      -     -     N/A
5M    DT           13.790   13.645   -     -     49.9919

Fig. 6 shows the changes in the error rates through different levels of granularity of the classifiers on the non-linear boundary datasets. It can be observed that the error rates of our proposed method trained on the large-sized non-linear boundary datasets are better than those of the other types of fuzzy min-max neural networks, especially at high abstraction levels of granular representations. While the other fuzzy min-max neural networks show an increase in the error rate as the value of θ grows, our method is capable of maintaining stable predictive results even at high abstraction levels. When the number of samples increases, the error rates of the other fuzzy min-max classifiers usually rise, whereas the error rate of our approach only fluctuates slightly. These results indicate that our approach may reduce the influence of overfitting because higher abstraction levels of granular data representations are constructed by reusing the knowledge learned at lower abstraction levels.

The best performance of our approach does not often occur at the smallest value of θ on these non-linear datasets. The results regarding accuracy on the validation and testing sets reported in Table IV confirm this statement. These figures also illustrate the effectiveness of the processing steps in phase 2. Unlike on the linear boundary datasets, our method overcomes Gaussian Naive Bayes to become, along with SVM, one of the two best classifiers among those considered. Although SVM outperformed our approach, its runtime on the large-sized datasets is much longer than that of our method. The training time of our algorithm is much shorter than that of the other types of fuzzy min-max neural networks and SVM, but it is still slower than the Gaussian Naive Bayes and decision tree techniques.

B. Performance of the Proposed Method on Real Datasets

Aiming to attain the fairest comparison, we used 12 datasets with diverse numbers of samples, dimensions, and classes. These datasets were taken from the LIBSVM [36], Kaggle [37], and UCI [38] repositories, and their properties are described in Table V. For the SUSY dataset, the last 500,000 patterns were used for the test set as in [39]. From the results on the synthetic datasets, we can see that the performance of the multi-resolution hierarchical granular representation based classifier using the heterogeneous data distribution technique is more stable than that utilizing the homogeneous distribution method. Therefore, the experiments in the rest of this paper were conducted for the heterogeneous classifier only.

Table VI shows the number of generated hyperboxes for the He-MRHGRC on the real datasets at different abstraction levels of granular representations. It can be seen that the number of hyperboxes at θ = 0.6 is significantly reduced in comparison to that at θ = 0.1. However, the error rates of the classifiers on the testing sets at θ = 0.6 do not change much compared to those at θ = 0.1. This fact is illustrated in Fig. 7 and the figures in the supplemental file. From these figures, it is observed that at high values of the maximum hyperbox size, such as θ = 0.5 and θ = 0.6, our classifier achieves the best performance compared to the other considered types of fuzzy min-max neural networks. We can also observe that the prediction accuracy of our method is usually much better than that of the other fuzzy min-max classifiers at most of the data granulation levels. The error rate of our classifier increases only slowly with the increase in the abstraction level of granules, and in some cases it even declines at a high abstraction level of granular representations. The best performance of the classifiers on the validation and testing sets, as well as the training time through six granularity levels, are reported in the supplemental file.

Although our method cannot achieve the best classification accuracy on all considered datasets, its performance is in the top two for all datasets. The Gaussian Naive Bayes classifier obtained the best predictive results on the synthetic linear boundary datasets, but it fell to the last position and became the worst classifier on the real datasets because the real datasets are highly non-linear. On datasets with highly non-linear decision boundaries, such as covtype, PhysioNet MIT-BIH Arrhythmia, and MiniBooNE, our proposed method still produces good predictive accuracy.

The training process of our method is much faster than that of the other types of fuzzy min-max neural networks on all considered datasets. Notably, on some large-sized complex datasets such as covtype and SUSY, the training time of the other fuzzy min-max classifiers is costly, but their accuracy is worse than that of our method, which takes less training time. Our approach is frequently faster than SVM and can deal with datasets containing millions of samples, on which the SVM approach could not complete training.
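Because the abstraction level that is best on the validation set is also (nearly) best on the testing set in these experiments, the final classifier can be picked by a simple selection loop over the models trained at the different granularity levels. The sketch below assumes scikit-learn-style model objects with a `predict` method; it illustrates only the selection step, not the authors' implementation.

```python
import numpy as np

def select_best_level(models_by_theta, X_val, y_val):
    """Return (theta, model, validation error) of the classifier with the
    lowest validation error among models trained at different maximum
    hyperbox sizes theta (e.g. 0.1, ..., 0.6)."""
    best_theta, best_model, best_err = None, None, np.inf
    for theta, model in sorted(models_by_theta.items()):
        err = float(np.mean(model.predict(X_val) != y_val))
        if err < best_err:
            best_theta, best_model, best_err = theta, model, err
    return best_theta, best_model, best_err
```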

Fig. 6. The error rate of classifiers on synthetic non-linear boundary datasets with different numbers of samples: (a) 10K samples, (b) 100K samples, (c) 5M samples. (Error rate versus θ for the heterogeneous and homogeneous MRHGRC, GFMMNN, FMNN, KNEFMNN, and EFMNN.)

TABLE V
THE REAL DATASETS AND THEIR STATISTICS

Dataset                          #Dimensions  #Classes  #Training samples  #Validation samples  #Testing samples  Source
Poker Hand                       10           10        25,010             50,000               950,000           LIBSVM
SensIT Vehicle                   100          3         68,970             9,852                19,706            LIBSVM
Skin NonSkin                     3            2         171,540            24,260               49,257            LIBSVM
Covtype                          54           7         406,709            58,095               116,208           LIBSVM
White wine quality               11           7         2,449              1,224                1,225             Kaggle
PhysioNet MIT-BIH Arrhythmia     187          5         74,421             13,133               21,892            Kaggle
MAGIC Gamma Telescope            10           2         11,887             3,567                3,566             UCI
Letter                           16           26        15,312             2,188                2,500             UCI
Default of credit card clients   23           2         18,750             5,625                5,625             UCI
MoCap Hand Postures              36           5         53,104             9,371                15,620            UCI
MiniBooNE                        50           2         91,044             12,877               26,143            UCI
SUSY                             18           2         4,400,000          100,000              500,000           UCI

TABLE VI
THE CHANGE IN THE NUMBER OF GENERATED HYPERBOXES THROUGH DIFFERENT LEVELS OF GRANULARITY OF THE PROPOSED METHOD

Dataset                          θ = 0.1   0.2     0.3     0.4     0.5     0.6
Skin NonSkin                     1012      248     127     85      64      51
Poker Hand                       11563     11414   10905   3776    2939    2610
Covtype                          94026     13560   5224    2391    1330    846
SensIT Vehicle                   5526      2139    1048    667     523     457
PhysioNet MIT-BIH Arrhythmia     60990     26420   15352   8689    5261    3241
White wine quality               1531      676     599     559     544     526
Default of credit card clients   2421      529     337     76      48      29
Letter                           9236      1677    952     646     595     556
MAGIC Gamma Telescope            1439      691     471     384     335     308
MiniBooNE                        444       104     24      10      6       6
SUSY                             -         -       26187   25867   16754   13017

Fig. 7. The error rate of classifiers on the Letter dataset through data abstraction levels (error rate versus θ for the heterogeneous MRHGRC, GFMMNN, FMNN, KNEFMNN, and EFMNN).

On many datasets, the best predictive results on the validation and testing sets were achieved at the same abstraction level of granular representations. In the cases where the best model on the validation set is at a different abstraction level from the best model on the testing set, the error rate obtained on the testing set when using the best classifier selected on the validation set is still close to the minimum error. These figures show that our proposed method is stable, and it can achieve a high predictive accuracy on both synthetic and real datasets.

C. The Vital Role of the Pruning Process and the Use of Sample Centroids

This experiment aims to assess the important roles of the pruning process and the use of sample centroids in the performance of the proposed method. The experimental results related to these issues are presented in Table VII. It is easily observed that the pruning step contributes to significantly reducing the number of generated hyperboxes, especially on the SensIT Vehicle, Default of credit card clients, and SUSY datasets.
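The pruning step removes poorly performing hyperboxes after training. The snippet below is only one plausible realisation of such a rule, assuming each validation sample has already been assigned to a winning hyperbox; the threshold and the exact criterion are illustrative assumptions, not the paper's precise algorithm.

```python
import numpy as np

def prune_hyperboxes(boxes, winner_idx, y_val, min_accuracy=0.5):
    """Keep only hyperboxes whose validation accuracy (over the samples they
    win) reaches `min_accuracy`; boxes that win no validation samples are
    also dropped in this illustrative version.

    boxes      : list of dicts such as {"V": min_point, "W": max_point, "label": c}
    winner_idx : (n_val,) index of the winning hyperbox for each validation sample
    y_val      : (n_val,) true class labels of the validation samples
    """
    kept = []
    for i, box in enumerate(boxes):
        mask = winner_idx == i
        if mask.any() and float(np.mean(y_val[mask] == box["label"])) >= min_accuracy:
            kept.append(box)
    return kept
```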

TABLE VII
THE ROLE OF THE PRUNING PROCESS AND THE USE OF SAMPLE CENTROIDS

Dataset                          Hyperboxes           Error rate (%)       Centroid-classified samples   Centroid-classified samples
                                 before / after       before / after       before pruning                after pruning
                                 pruning              pruning              (total / wrong)               (total / wrong)
Skin NonSkin                     1,358 / 1,012        0.1726 / 0.0974      1,509 / 73                    594 / 30
Poker Hand                       24,991 / 11,563      53.5951 / 49.8128    600,804 / 322,962             725,314 / 362,196
SensIT Vehicle                   61,391 / 5,526       23.6730 / 20.9073    2 / 1                         0 / 0
Default of credit card clients   9,256 / 2,421        22.3822 / 19.7689    662 / 291                     312 / 127
Covtype                          95,971 / 94,026      7.7335 / 7.5356      2,700 / 975                   2,213 / 783
PhysioNet MIT-BIH Arrhythmia     61,419 / 60,990      3.6589 / 3.5492      49 / 9                        48 / 8
MiniBooNE                        1,079 / 444          16.4289 / 13.9043    14,947 / 3,404                11,205 / 2,575
SUSY                             55,096 / 26,187      30.8548 / 28.3456    410,094 / 145,709             370,570 / 124,850
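The last columns of Table VII count the samples that fall into overlapping regions and are therefore classified by the Euclidean distance to the sample centroids stored with the hyperboxes. A minimal sketch of such a distance-based decision follows; the `centroid` and `label` fields are assumed bookkeeping for illustration, not the authors' exact data structures.

```python
import numpy as np

def classify_by_centroid(x, candidate_boxes):
    """Among hyperboxes that tie on membership but carry different class
    labels, predict the class of the hyperbox whose stored sample centroid
    is closest to x in Euclidean distance."""
    distances = [np.linalg.norm(x - box["centroid"]) for box in candidate_boxes]
    return candidate_boxes[int(np.argmin(distances))]["label"]
```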

When the poorly performing hyperboxes are removed, the accuracy of the model increases considerably. These figures indicate the critical role of the pruning process with regard to reducing the complexity of the model and enhancing the predictive performance.

We can also see that the use of sample centroids and the Euclidean distance can correctly predict from 50% to 95% of the samples located in the overlapping regions between different classes. The predictive accuracy depends on the distribution and complexity of the underlying data. With the use of sample centroids, we do not need to use the overlap test and contraction process in phase 1 at the highest level of granularity. This strategy accelerates the training process of the proposed method compared to the other types of fuzzy min-max neural networks, especially on large-sized datasets such as covtype or SUSY. These facts point to the effectiveness of the pruning process and the usage of sample centroids in improving the performance of our approach in terms of both accuracy and training time.

D. Ability to Handle Missing Values

This experiment was conducted on two datasets containing many missing values, i.e., the PhysioNet MIT-BIH Arrhythmia and MoCap Hand Postures datasets. The aim of this experiment is to demonstrate the ability of our method to handle missing values and preserve the uncertainty of the input data without any pre-processing steps. We also generated three other training datasets from the original data by replacing the missing values with the zero, mean, or median value of each feature. These values were then used to fill in the missing values of the corresponding features in the testing and validation sets. The obtained results are presented in Table VIII.

TABLE VIII
THE TRAINING TIME AND THE LOWEST ERROR RATES OF OUR METHOD ON THE DATASETS WITH MISSING VALUES

Dataset                                                    Training time (s)   min EV              min ET
Arrhythmia with missing values replaced by zero values     53,100.2895         3.0762 (θ = 0.1)    3.5492 (θ = 0.1)
Arrhythmia with missing values replaced by mean values     60,980.5110         2.6879 (θ = 0.1)    3.3848 (θ = 0.1)
Arrhythmia with missing values replaced by median values   60,570.4315         2.7031 (θ = 0.1)    3.2980 (θ = 0.2)
Arrhythmia with missing values retained                    58,188.8138         2.6955 (θ = 0.1)    3.1473 (θ = 0.1)
Postures with missing values replaced by zero values       5,845.9722          6.6482 (θ = 0.1)    7.7529 (θ = 0.4)
Postures with missing values replaced by mean values       5,343.0038          8.5370 (θ = 0.1)    9.7631 (θ = 0.3)
Postures with missing values replaced by median values     4,914.4475          8.4089 (θ = 0.1)    9.9936 (θ = 0.3)
Postures with missing values retained                      2,153.8121          14.5662 (θ = 0.4)   13.7900 (θ = 0.4)

The predictive accuracy of the classifier trained on the datasets with missing values retained cannot surpass that of the classifiers trained on the datasets imputed with the median, mean, or zero values. However, the training time is reduced, and the characteristics of the proposed method are still preserved: the accuracy of the classifier is maintained at high levels of abstraction, and its behavior is nearly the same on both validation and testing sets. The replacement of missing values by other values is usually biased and inflexible in real-world applications. The capability of learning directly from data with missing values ensures that the online learning property of the fuzzy min-max neural network is maintained on incomplete input data.
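The three imputed variants described above can be produced with a small helper. This is a sketch of the experimental protocol only (fill values are computed on the training data and reused for the validation and testing sets); the proposed classifier itself works on the incomplete data directly.

```python
import numpy as np

def impute_missing(train, val, test, strategy="mean"):
    """Replace NaNs per feature by zero, or by the feature's mean/median
    computed on the training set, and apply the same fill values to the
    validation and testing sets."""
    if strategy == "zero":
        fill = np.zeros(train.shape[1])
    elif strategy == "mean":
        fill = np.nanmean(train, axis=0)
    elif strategy == "median":
        fill = np.nanmedian(train, axis=0)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    filled = []
    for X in (train, val, test):
        X = X.copy()
        rows, cols = np.where(np.isnan(X))
        X[rows, cols] = fill[cols]
        filled.append(X)
    return filled
```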
E. Comparison to State-of-the-art Studies

The purpose of this section is to compare our method with recent studies of classification algorithms on large-sized datasets in physics and medical diagnostics. The first experiment was performed on the SUSY dataset to distinguish between a signal process producing super-symmetric particles and a background process. For this purpose, Baldi et al. [39] compared the performance of a deep neural network with boosted decision trees using the area under the curve (AUC) metric. In another study, Sakai et al. [40] evaluated different methods of AUC optimization in combination with support vector machines to enhance the efficiency of the final predictive model. The AUC values of these studies, along with that of our method, are reported in Table IX. It can be seen that our approach outperforms all approaches in Sakai's study, but it cannot outperform the deep learning methods and boosted trees on this dataset.

The second experiment was conducted on a medical dataset (PhysioNet MIT-BIH Arrhythmia) containing electrocardiogram (ECG) signals used for the classification of heartbeats. There are many studies on ECG heartbeat classification, such as a deep residual convolutional neural network [41], a 9-layer deep convolutional neural network trained on an augmented version of the original data [42], and combinations of a discrete wavelet transform with neural networks, SVM [43], and random forests [44]. The PhysioNet MIT-BIH Arrhythmia dataset contains many missing values, and the above studies used a zero padding mechanism for these values. Our method can handle the missing values directly without any imputation. The accuracy of our method on the datasets with missing values and with zero padding is shown in Table X, along with the results taken from the other studies. It is observed that our approach on the dataset including missing values outperforms all other methods considered. From these comparisons, we can conclude that our proposed method is highly competitive with other state-of-the-art studies published on real datasets.
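The AUC figures in Table IX can be reproduced for any classifier that outputs a score for the signal class; a minimal sketch using scikit-learn is shown below. Whether the hyperbox classifier exposes such a score (e.g. a membership degree or class probability) is an assumption made here for illustration.

```python
from sklearn.metrics import roc_auc_score

def auc_on_test(model, X_test, y_test):
    """Compute the area under the ROC curve from the model's score for the
    positive (signal) class on the held-out test split."""
    scores = model.predict_proba(X_test)[:, 1]   # assumed scoring interface
    return roc_auc_score(y_test, scores)
```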

TABLE IX
THE AUC VALUES OF THE PROPOSED METHOD AND OTHER METHODS ON THE SUSY DATASET

Method                                                              AUC
Boosted decision tree [39]                                          0.863
Deep neural network [39]                                            0.876
Deep neural network with dropout [39]                               0.879
Positive-negative and unlabeled data based AUC optimization [40]    0.647
Semi-supervised rankboost based AUC optimization [40]               0.709
Semi-supervised AUC-optimized logistic sigmoid [40]                 0.556
Optimum AUC with a generative model [40]                            0.577
He-MRHGRC (our method)                                              0.799

TABLE X
THE ACCURACY OF THE PROPOSED METHOD AND OTHER METHODS ON THE PhysioNet MIT-BIH Arrhythmia DATASET

Method                                                      Accuracy (%)
Deep residual convolutional neural network [41]             93.4
Augmentation + deep convolutional neural network [42]       93.5
Discrete wavelet transform + SVM [43]                       93.8
Discrete wavelet transform + NN [43]                        94.52
Discrete wavelet transform + Random Forest [44]             94.6
Our method on the dataset with missing values               96.85
Our method on the dataset with zero padding                 96.45

V. CONCLUSION AND FUTURE WORK

This paper presented a method to construct classification models based on multi-resolution hierarchical granular representations using hyperbox fuzzy sets. Our approach can maintain good classification accuracy at high abstraction levels with a low number of hyperboxes. The best classifier on the validation set usually produces the best predictive results on unseen data as well. One of the interesting characteristics of our method is its capability of handling missing values without the need for imputation. This property makes it flexible for real-world applications, where data incompleteness usually occurs. In general, our method outperformed the other typical types of fuzzy min-max neural networks that use the contraction process for dealing with overlapping regions, in terms of both accuracy and training time. Furthermore, our proposed technique can be scaled to large-sized datasets based on the parallel execution of the hyperbox building process at the highest level of granularity to form core hyperboxes from sample points rapidly. These hyperboxes are then refined at higher abstraction levels to reduce the complexity and maintain consistent predictive performance.

The patterns located in the overlapping regions are currently classified using the Euclidean distance to the sample centroids. Future work will focus on deploying a probability estimation measure to deal with these samples. The predictive results of the proposed method depend on the order of presentation of the training patterns because it is based on the online learning ability of the general fuzzy min-max neural network. In addition, the proposed method is sensitive to noise and outliers as well. In real-world applications, noisy data are frequently encountered; thus, they can lead to serious stability issues. Therefore, outlier detection and noise removal are essential issues which need to be tackled in future work. Furthermore, we also intend to combine hyperboxes generated at different levels of granularity to build an optimal ensemble model for pattern recognition.

ACKNOWLEDGMENT

T.T. Khuat acknowledges FEIT-UTS for awarding his PhD scholarships (IRS and FEIT scholarships).

REFERENCES

[1] G. Wang, J. Yang, and J. Xu, "Granular computing: from granularity optimization to multi-granularity joint problem solving," Granular Computing, vol. 2, no. 3, pp. 105–120, 2017.
[2] J. A. Morente-Molinera, J. Mezei, C. Carlsson, and E. Herrera-Viedma, "Improving supervised learning classification methods using multigranular linguistic modeling and fuzzy entropy," IEEE Transactions on Fuzzy Systems, vol. 25, no. 5, pp. 1078–1089, 2017.
[3] L. A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 90, no. 2, pp. 111–127, 1997.
[4] W. Pedrycz, G. Succi, A. Sillitti, and J. Iljazi, "Data description: A general framework of information granules," Knowledge-Based Systems, vol. 80, pp. 98–108, 2015.
[5] W. Pedrycz, "Allocation of information granularity in optimization and decision-making models: Towards building the foundations of granular computing," European Journal of Operational Research, vol. 232, no. 1, pp. 137–145, 2014.
[6] J. Xu, G. Wang, T. Li, and W. Pedrycz, "Local-density-based optimal granulation and manifold information granule description," IEEE Transactions on Cybernetics, vol. 48, no. 10, pp. 2795–2808, 2018.
[7] W. Xu and J. Yu, "A novel approach to information fusion in multi-source datasets: A granular computing viewpoint," Information Sciences, vol. 378, pp. 410–423, 2017.
[8] R. Al-Hmouz, W. Pedrycz, A. S. Balamash, and A. Morfeq, "Granular description of data in a non-stationary environment," Soft Computing, vol. 22, no. 2, pp. 523–540, 2018.
[9] C. P. Chen and C.-Y. Zhang, "Data-intensive applications, challenges, techniques and technologies: A survey on big data," Information Sciences, vol. 275, pp. 314–347, 2014.
[10] T. T. Khuat, D. Ruta, and B. Gabrys, "Hyperbox based machine learning algorithms: A comprehensive survey," CoRR, vol. abs/1901.11303, 2019.
[11] P. K. Simpson, "Fuzzy min-max neural networks. I. Classification," IEEE Transactions on Neural Networks, vol. 3, no. 5, pp. 776–786, 1992.
[12] B. Gabrys and A. Bargiela, "General fuzzy min-max neural network for clustering and classification," IEEE Transactions on Neural Networks, vol. 11, no. 3, pp. 769–783, 2000.
[13] J. de Jesús Rubio, "SOFMLS: Online self-organizing fuzzy modified least-squares network," IEEE Transactions on Fuzzy Systems, vol. 17, no. 6, pp. 1296–1309, 2009.
[14] X.-M. Zhang and Q.-L. Han, "State estimation for static neural networks with time-varying delays based on an improved reciprocally convex inequality," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 4, pp. 1376–1381, 2017.
[15] J. de Jesús Rubio, "A method with neural networks for the classification of fruits and vegetables," Soft Computing, vol. 21, no. 23, pp. 7207–7220, 2017.
[16] M.-Y. Cheng, D. Prayogo, and Y.-W. Wu, "Prediction of permanent deformation in asphalt pavements using a novel symbiotic organisms search–least squares support vector regression," Neural Computing and Applications, pp. 1–13, 2018.
[17] J. d. J. Rubio, D. R. Cruz, I. Elias, G. Ochoa, R. Balcazar, and A. Aguilar, "ANFIS system for classification of brain signals," Journal of Intelligent & Fuzzy Systems, no. Preprint, pp. 1–9, 2019.
[18] B. Gabrys, "Learning hybrid neuro-fuzzy classifier models from data: to combine or not to combine?" Fuzzy Sets and Systems, vol. 147, no. 1, pp. 39–56, 2004.
[19] W. Pedrycz, "Granular computing for data analytics: A manifesto of human-centric computing," IEEE/CAA Journal of Automatica Sinica, vol. 5, no. 6, pp. 1025–1034, 2018.
[20] R. E. Moore, R. B. Kearfott, and M. J. Cloud, Introduction to Interval Analysis. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2009.
[21] L. Zadeh, "Fuzzy sets," Information and Control, vol. 8, no. 3, pp. 338–353, 1965.
[22] W. Pedrycz, "Interpretation of clusters in the framework of shadowed sets," Pattern Recognition Letters, vol. 26, no. 15, pp. 2439–2449, 2005.
[23] Z. Pawlak and A. Skowron, "Rough sets and boolean reasoning," Information Sciences, vol. 177, no. 1, pp. 41–73, 2007.
[24] W. Pedrycz and W. Homenda, "Building the fundamentals of granular computing: A principle of justifiable granularity," Applied Soft Computing, vol. 13, no. 10, pp. 4209–4218, 2013.
[25] W. Pedrycz and A. Bargiela, "An optimization of allocation of information granularity in the interpretation of data structures: Toward granular fuzzy clustering," IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), vol. 42, no. 3, pp. 582–590, 2012.
[26] Z. Sahel, A. Bouchachia, B. Gabrys, and P. Rogers, "Adaptive mechanisms for classification problems with drifting data," in Knowledge-Based Intelligent Information and Engineering Systems, B. Apolloni, R. J. Howlett, and L. Jain, Eds. Springer Berlin Heidelberg, 2007, pp. 419–426.
[27] G. Peters and R. Weber, "DCC: a framework for dynamic granular clustering," Granular Computing, vol. 1, no. 1, pp. 1–11, 2016.
[28] B. Gabrys, "Agglomerative learning algorithms for general fuzzy min-max neural network," Journal of VLSI Signal Processing Systems for Signal, Image and Video Technology, vol. 32, no. 1, pp. 67–82, 2002.
[29] ——, "Combining neuro-fuzzy classifiers for improved generalisation and reliability," in Proceedings of the 2002 International Joint Conference on Neural Networks, vol. 3, 2002, pp. 2410–2415.
[30] ——, "Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems," International Journal of Approximate Reasoning, vol. 30, no. 3, pp. 149–179, 2002.
[31] N. Macia, E. Bernado-Mansilla, A. Orriols-Puig, and T. K. Ho, "Learner excellence biased by data set selection: A case for data characterisation and artificial data sets," Pattern Recognition, vol. 46, no. 3, pp. 1054–1066, 2013.
[32] M. Budka and B. Gabrys, "Density-preserving sampling: Robust and efficient alternative to cross-validation for error estimation," IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 1, pp. 22–34, 2013.
[33] M. F. Mohammed and C. P. Lim, "An enhanced fuzzy min-max neural network for pattern classification," IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 3, pp. 417–429, 2015.
[34] M. F. Mohammed and C. P. Lim, "Improving the fuzzy min-max neural network with a k-nearest hyperbox expansion rule for pattern classification," Applied Soft Computing, vol. 52, pp. 135–145, 2017.
[35] K. Fukunaga, Introduction to Statistical Pattern Recognition, 2nd ed. San Diego, CA, USA: Academic Press Professional, Inc., 1990.
[36] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 3, pp. 27:1–27:27, 2011.
[37] Kaggle, "Kaggle datasets," 2019. [Online]. Available: https://www.kaggle.com/datasets
[38] D. Dua and E. Karra Taniskidou, "UCI machine learning repository," 2017. [Online]. Available: http://archive.ics.uci.edu/ml
[39] P. Baldi, P. Sadowski, and D. Whiteson, "Searching for exotic particles in high-energy physics with deep learning," Nature Communications, vol. 5, p. 4308, 2014.
[40] T. Sakai, G. Niu, and M. Sugiyama, "Semi-supervised AUC optimization based on positive-unlabeled learning," Machine Learning, vol. 107, no. 4, pp. 767–794, 2018.
[41] M. Kachuee, S. Fazeli, and M. Sarrafzadeh, "ECG heartbeat classification: A deep transferable representation," in IEEE International Conference on Healthcare Informatics (ICHI), 2018, pp. 443–444.
[42] U. R. Acharya, S. L. Oh, Y. Hagiwara, J. H. Tan, M. Adam, A. Gertych, and R. S. Tan, "A deep convolutional neural network model to classify heartbeats," Computers in Biology and Medicine, vol. 89, pp. 389–396, 2017.
[43] R. J. Martis, U. R. Acharya, C. M. Lim, K. M. Mandana, A. K. Ray, and C. Chakraborty, "Application of higher order cumulant features for cardiac health diagnosis using ECG signals," International Journal of Neural Systems, vol. 23, no. 04, p. 1350014, 2013.
[44] T. Li and M. Zhou, "ECG classification using wavelet packet entropy and random forests," Entropy, vol. 18, no. 8, 2016.

Thanh Tung Khuat received the B.E. degree in Software Engineering from the University of Science and Technology, Danang, Vietnam, in 2014. Currently, he is working towards the Ph.D. degree at the Advanced Analytics Institute, Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, Australia. His research interests include machine learning, fuzzy systems, knowledge discovery, evolutionary computation, intelligent optimization techniques, and applications in software engineering. He has authored and co-authored over 20 peer-reviewed publications in the areas of machine learning and computational intelligence.

Fang Chen is the Executive Director Data Science and a Distinguished Professor with the University of Technology Sydney, Ultimo, NSW, Australia. She is a thought leader in AI and data science. She has created many world-class AI innovations while working with Jiaotong University, Intel, Motorola, NICTA, and CSIRO, and has helped governments and industries utilise data to significantly increase productivity, safety, and customer satisfaction. Through these impactful successes, she has gained many recognitions, such as the ITS Australia National Award 2014 and 2015, and the NSW iAwards 2017. She is the NSW Water Professional of the Year 2016 and received the National and NSW Research and Innovation Awards from the Australian Water Association. She was the recipient of the "Brian Shackle Award" 2017 for the most outstanding contribution with international impact in the field of human interaction with computers and information technology. She is the recipient of the "Oscar" of Australian science, the Australian Museum Eureka Prize 2018 for Excellence in Data Science. She has 280 publications and 30 patents in 8 countries.

Bogdan Gabrys received the M.Sc. degree in electronics and telecommunication from Silesian Technical University, Gliwice, Poland, in 1994, and the Ph.D. degree in computer science from Nottingham Trent University, Nottingham, U.K., in 1998. Over the last 25 years, he has been working at various universities and research and development departments of commercial institutions. He is currently a Professor of Data Science and a Director of the Advanced Analytics Institute at the University of Technology Sydney, Sydney, Australia. His research activities have concentrated on the areas of data science, complex adaptive systems, computational intelligence, machine learning, predictive analytics, and their diverse applications. He has published over 180 research papers, chaired conferences, workshops, and special sessions, and been on the programme committees of a large number of international conferences with data science, computational intelligence, and machine learning themes. He is a Senior Member of the Institute of Electrical and Electronics Engineers (IEEE), a Member of the IEEE Computational Intelligence Society, and a Fellow of the Higher Education Academy (HEA) in the UK. He is frequently invited to give keynote and plenary talks at international conferences and lectures at internationally leading research centres and commercial research labs. More details can be found at: http://bogdan-gabrys.com