
International Journal of Pure and Applied Mathematics
Volume 118 No. 14 2018, 135-140
ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu
Special Issue

SELF ADAPTIVE FEATURE SELECTION IN DATA MINING USING FRUIT FLY ALGORITHM

Shashikala B 1, Saravanakumar R 2
1 Research Scholar, Department of CSE, B.T.L I T, Bangalore, Karnataka 560099, India
2 Associate Professor, Department of Computer Science & Engineering, Dayananda Sagar Academy of Technology & Management, Bangalore 560078, India
[email protected]

Abstract: In the current trend of growing data complexity and volume and the advent of big data, feature selection plays a vital role when dealing with high-dimensional datasets in machine learning problems. Feature selection has shown its effectiveness in many applications by building simpler and more comprehensible models, improving learning performance, and preparing clean, understandable data. In this work, an adaptive fruit fly algorithm is proposed with the objective of performing feature selection for better learning performance. The approach divides the population of fruit flies into two groups, with one group searching for the optimum solution in the wide space and the other group searching in the optimum solution space; it also uses a chaotic vector map for the convergence of the algorithm. This method is shown to be a more robust and effective optimization method than other well-known methods.

Index Terms—Fruit Fly Algorithm, Chaotic Map, Feature Selection, Optimization
I. INTRODUCTION

Machine learning is a field of computer science that gives computers the ability to learn without being explicitly programmed. In machine learning, feature selection is the process of selecting a subset of relevant features which efficiently describe the input data. Feature selection techniques are often used in domains where there are many features and comparatively few samples. They are used to simplify models so that they are easier for researchers and users to interpret, to shorten training times, and to avoid the curse of dimensionality. The central principle when using a feature selection technique is that the data contains many features that are either redundant or irrelevant and can thus be removed without incurring much loss of information.

Swarm optimization algorithms, with their simple steps and efficient search methods, have become the most widely used algorithms in optimization problems which require performance. Particle swarm optimization (PSO) [1,2], ant colony optimization (ACO) [3], the artificial bee colony algorithm (ABC) [4], the Simulated Annealing (SA) algorithm [5], Bacterial Colony Chemotaxis (BCC) [6] and the Fruit Fly Optimization Algorithm (FOA) [7,8] are some of them. The fruit fly algorithm, introduced by Wen Tsao Pan in 2011, originated from the foraging behavior of fruit flies.

In this article, a much simpler, adaptive and more robust optimization method is proposed: the Fruit Fly Optimization Algorithm with a chaotic map. The chaotic dynamics of fruit flies are combined into the position updating rules to improve the diversity of solutions and to avoid being trapped in local optima. This approach, which combines the strengths of swarm optimization and chaotic dynamics, is then applied to feature selection.
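The idea of folding chaotic dynamics into the position update can be sketched as follows. The paper does not state which chaotic map it uses, so the logistic map below is only an illustrative assumption; a chaotic sequence replaces the uniform random step so that successive offsets are deterministic yet non-repeating, which helps diversify the search.

```python
# Hypothetical sketch: the logistic map is an assumed choice of chaotic map;
# the paper only says "chaotic map" without specifying one.

def logistic_map(x, r=4.0):
    """One step of the logistic map; r = 4.0 gives fully chaotic behavior."""
    return r * x * (1.0 - x)

def chaotic_sequence(seed, n):
    """Generate n chaotic values in (0, 1] to replace uniform random steps."""
    values = []
    x = seed
    for _ in range(n):
        x = logistic_map(x)
        values.append(x)
    return values

# Example: rescale chaotic values to [-1, 1] and use them as search offsets
# in place of the usual RandomValue term of the position update.
offsets = [2.0 * v - 1.0 for v in chaotic_sequence(seed=0.7, n=5)]
```

Because the sequence is fully determined by the seed, runs are reproducible while still avoiding the short cycles a poor pseudo-random generator might fall into.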
II. RELATED WORK

A. Feature Selection

A Feature Selection Algorithm (FSA) is a computational solution that is motivated by a certain definition of relevance; the purpose of an FSA is to identify relevant features according to that definition. FSA is actually a model selection problem, and an NP-hard one (it cannot be solved in polynomial time). The selection of optimal features adds an extra layer of complexity to the modelling: instead of just finding optimal parameters for the full set of features, an optimal feature subset must first be found, and then the model parameters optimised for it [9]. Evolutionary algorithms can be used to intelligently search the solution space.

FSAs can be broadly divided into filter and wrapper approaches.

In the filter approach, the attribute selection method is independent of the data mining algorithm to be applied to the selected attributes, and assesses the relevance of features by looking only at the intrinsic properties of the data. In most cases a feature relevance score is calculated, and low-scoring features are removed. The subset of features left after feature removal is presented as input to the machine learning algorithms. Advantages of filter techniques are that they easily scale to high-dimensional datasets, are computationally simple and fast, and, since the filter approach is independent of the mining algorithm, feature selection needs to be performed only once, after which different classifiers can be evaluated.
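A filter approach of this kind can be sketched in a few lines. Variance is used as the relevance score purely for illustration; the paper does not prescribe any particular score.

```python
# Minimal filter-style selection sketch. Variance as the relevance score is an
# assumption for illustration only.

def variance(column):
    """Population variance of one feature column."""
    mean = sum(column) / len(column)
    return sum((v - mean) ** 2 for v in column) / len(column)

def filter_select(rows, threshold):
    """Return indices of features whose variance score exceeds the threshold.

    Note the selection looks only at the data itself, never at a classifier,
    which is the defining property of the filter approach.
    """
    n_features = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_features)]
    return [j for j, col in enumerate(columns) if variance(col) > threshold]

# Feature 1 is constant, so it scores 0 and is removed.
data = [[1.0, 5.0, 0.1],
        [2.0, 5.0, 0.9],
        [3.0, 5.0, 0.5]]
selected = filter_select(data, threshold=0.01)  # → [0, 2]
```

Since no learner is involved, the same selected subset can then be handed to any number of different classifiers, exactly as the text describes.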
Wrapper methods embed the model hypothesis search within the feature subset search. In the wrapper approach, the attribute selection method uses the result of the data mining algorithm to determine how good a given attribute subset is. In this setup, a search procedure over the space of possible feature subsets is defined, and various subsets of features are generated and evaluated. The major characteristic of the wrapper approach is that the quality of an attribute subset is directly measured by the performance of the data mining algorithm applied to that subset. The wrapper approach tends to be much slower than the filter approach, as the data mining algorithm is applied to each attribute subset considered by the search. In addition, if several different data mining algorithms are to be applied to the data, the wrapper approach becomes even more computationally expensive [10]. Advantages of wrapper approaches include the interaction between feature subset search and model selection, and the ability to take feature dependencies into account. A common drawback of these techniques is that they have a higher risk of overfitting than filter techniques, and they are very computationally intensive.

Another category of feature selection technique has also been introduced, termed the embedded technique, in which the search for an optimal subset of features is built into the classifier construction and can be seen as a search in the combined space of feature subsets and hypotheses. Just like wrapper approaches, embedded approaches are specific to a given learning algorithm.
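The wrapper approach described above can be sketched with an exhaustive subset search, which also makes its cost visible: the learner is re-run for every candidate subset. The 1-nearest-neighbour scorer is only an illustrative stand-in; the paper does not fix a particular data mining algorithm.

```python
# Wrapper-style selection sketch: score every feature subset by running an
# induction algorithm on it. The 1-NN leave-one-out scorer is an assumed,
# illustrative learner.
from itertools import combinations

def knn_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of 1-NN restricted to the chosen features."""
    correct = 0
    for i, row in enumerate(rows):
        best, best_dist = None, float("inf")
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[k] - other[k]) ** 2 for k in subset)
            if dist < best_dist:
                best, best_dist = labels[j], dist
        correct += best == labels[i]
    return correct / len(rows)

def wrapper_select(rows, labels):
    """Return the feature subset on which the learner performs best."""
    n = len(rows[0])
    subsets = [s for r in range(1, n + 1) for s in combinations(range(n), r)]
    return max(subsets, key=lambda s: knn_accuracy(rows, labels, s))
```

The exponential number of subsets (2^n - 1) is exactly why real wrappers use heuristic searches, and why evolutionary methods such as FOA are attractive here.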
B. Fruit Fly Algorithm

The fruit fly optimization algorithm is a recent evolutionary computation technique proposed by Wen Tsao Pan in 2011. The Fruit Fly Optimization Algorithm (FOA) is a new intelligent method based on the food finding behavior of the fruit fly. The fruit fly itself is superior to other species in sensing and perception, especially in osphresis (smell) and vision. The osphresis organs of fruit flies can detect all kinds of scents floating in the air; a fruit fly can even smell a food source from 40 km away [7, 8]. Then, after it gets close to the food location, it uses its sensitive vision to find the food and the flock's location, and flies towards that direction. The behavior of the fruit flies is illustrated in Figure 1.

Figure 1. Illustration of the group iterative food searching of the fruit fly

The fruit fly's food finding behavior is divided into several necessary steps, as shown in Figure 1. The steps of the algorithm are given below.
________________________________________
Algorithm 1: Fruit Fly Algorithm
________________________________________
1) Initialize the fruit fly swarm location with random values: [X_axis, Y_axis].
2) Search in a random direction:
   Xi = X_axis + RandomValue
   Yi = Y_axis + RandomValue
3) Since the food's position (the optimal solution) is unknown, the distance (Dist) to the origin is estimated first, and the smell concentration judgment value (S), which is the inverse of the distance, is then calculated:
   Disti = sqrt(Xi^2 + Yi^2)
   Si = 1 / Disti
4) Substitute the smell concentration judgment value (S) into the smell concentration judge function (also called the fitness function) to find the smell concentration (Smelli) of the individual fruit fly's location:
   Smelli = ObjectiveFunction(Si)
5) Identify the fruit fly with the best smell concentration (maximum value):
   [bestSmell, bestIndex] = max(Smell)
6) Keep the best smell concentration value and the corresponding x, y coordinates; the fruit fly swarm then uses vision to fly towards that location:
   Smellbest = bestSmell
   X_axis = X(bestIndex)
   Y_axis = Y(bestIndex)
7) Enter iterative optimization, repeating steps 2-5; if the current best smell concentration is superior to that of the previous iteration, implement step 6.
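The seven steps above can be sketched directly in code. The objective (smell concentration judge) function is left problem-specific by the algorithm, so the one used in the example is an illustrative assumption.

```python
# Sketch of Algorithm 1 (basic FOA). The objective function passed in plays
# the role of the smell concentration judge function; the example objective
# below is an assumption for illustration.
import math
import random

def foa(objective, pop_size=20, iterations=100, seed=42):
    rng = random.Random(seed)
    # Step 1: random initial swarm location [X_axis, Y_axis].
    x_axis, y_axis = rng.uniform(-1, 1), rng.uniform(-1, 1)
    best_smell = float("-inf")
    for _ in range(iterations):
        smells, coords = [], []
        for _ in range(pop_size):
            # Step 2: each fly searches in a random direction.
            xi = x_axis + rng.uniform(-1, 1)
            yi = y_axis + rng.uniform(-1, 1)
            # Step 3: distance to origin, then S = 1 / Dist.
            dist = math.hypot(xi, yi) or 1e-12
            s = 1.0 / dist
            # Step 4: smell concentration from the fitness function.
            smells.append(objective(s))
            coords.append((xi, yi))
        # Step 5: fly with the maximum smell concentration.
        idx = max(range(pop_size), key=lambda i: smells[i])
        # Steps 6-7: move the swarm only when the best value improves.
        if smells[idx] > best_smell:
            best_smell = smells[idx]
            x_axis, y_axis = coords[idx]
    return best_smell, (x_axis, y_axis)

# Illustrative objective: maximize -(S - 2)^2, whose optimum is at S = 2.
best, location = foa(lambda s: -(s - 2.0) ** 2)
```

Note how the swarm location only moves in step 6, so the flies always scatter around the best point found so far; this is the behavior the chaotic map in the proposed work is meant to diversify.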
C. Proposed Work

In this research proposal, the swarm optimization algorithm, the Fruit Fly Algorithm, is used along with a chaotic technique to perform feature selection. According to the given parameters, the population of flies is generated and the algorithm is run to retrieve the best fly, which represents the selected feature subset. As shown in the block diagram in Figure 2, the dataset and the fruit fly parameters are fed into the self-adaptive feature selection algorithm using the fruit fly algorithm, which outputs the best fruit fly representing the selected features.

Figure 2. Block diagram of the proposed work

E. Algorithm

The algorithm for the proposed work is given below. A fruit fly is composed of n values corresponding to the number of attributes of the dataset, as depicted in Figure 3.

X1 X2 ... Xn

Figure 3. Representation of a fruit fly

X1 represents the first feature, X2 represents the second feature, and so on.
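The representation in Figure 3 can be sketched as below. The paper states only that a fly holds one value per attribute; thresholding each value at 0.5 to obtain the selected feature indices is an assumed decoding rule, used here purely for illustration.

```python
# Sketch of the fly representation (Figure 3): one value per dataset
# attribute. The 0.5 threshold decoding is an assumption; the paper does not
# state how a fly's values map to a feature subset.
import random

def random_fly(n_attributes, rng):
    """A fly is a vector of n values, one per attribute (X1 .. Xn)."""
    return [rng.random() for _ in range(n_attributes)]

def decode(fly, threshold=0.5):
    """Map the fly's values to the indices of the selected features."""
    return [i for i, x in enumerate(fly) if x > threshold]

rng = random.Random(7)
fly = random_fly(5, rng)
subset = decode(fly)  # indices of the attributes kept for the learner
```

Under this encoding, the smell concentration of a fly would be the learning performance obtained on its decoded subset, tying the FOA search of section B to the feature selection task.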