Association Rules Mining from the Educational Data of ESOG Web-Based Application Stefanos Ougiaroglou, Giorgos Paschalis
Total Page:16
File Type:pdf, Size:1020Kb
Association Rules Mining from the Educational Data of ESOG Web-Based Application Stefanos Ougiaroglou, Giorgos Paschalis To cite this version: Stefanos Ougiaroglou, Giorgos Paschalis. Association Rules Mining from the Educational Data of ESOG Web-Based Application. 8th International Conference on Artificial Intelligence Applications and Innovations (AIAI), Sep 2012, Halkidiki, Greece. pp.105-114, 10.1007/978-3-642-33412-2_11. hal-01523060 HAL Id: hal-01523060 https://hal.archives-ouvertes.fr/hal-01523060 Submitted on 16 May 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution| 4.0 International License Association Rules Mining from the educational data of ESOG web-based application Stefanos Ougiaroglou1 and Giorgos Paschalis2 1 Dept. Of Applied Informatics, University of Macedonia, Thessaloniki Greece 2 Human-Computer Interaction Group, University of Patras, Patra, Greece [email protected], [email protected] Abstract. Many researchers have focused on the mining of educational data stored in databases of educational software and Learning Management Systems. The goal is the knowledge discovery that can help educators to support their students by managing effectively educational units, redesigning student’s activities and finally improving the learning outcome. A basic data mining technique concerns the discovery of hidden associations that exist in data stored in educational software Databases. In this paper, we present the KDD process which includes the application of the Apriori algorithm for the association rules mining from the educational data of ESOG Web-based application. Keywords: Association Rules, Apriori Algorithm, ESOG, Educational Data. 1 Introduction Data Mining (DM) or Knowledge Discovery in Databases (KDD) is the automatic extraction of implicit and interesting patterns from large data collections [2,3,8]. The results of the whole process are taken into consideration by experts in important future decisions in various fields. Educational Data Mining (EDM) is an emerging interdisciplinary research area [13]. EDM deals with the development of methods to explore data originating in an educational context. EDM uses computational approaches to analyze educational data in order to study educational questions. The results can be used not only to learn the model for the learning process [9] or student modeling [10] but also to evaluate and to improve e-learning systems [11] by discovering useful learning information from learning portfolios [12]. Educational Data are data coming from two types of educational systems: Traditional classroom and Distance (Web-based) Education. Today, there are a lot of terms used to refer to web-based education such as e-learning, e-training, online instruction, web-based learning, web-based training, web-based instruction, etc. And there are different types of web-based systems: synchronous and asynchronous, collaborative and non-collaborative, closed corpus and open corpus, etc. According to [13], educators and academics are responsible in charge of designing, planning, building and maintaining the educational systems. Students use and interact with them. Starting from all the available information about courses, students, usage and interaction, different data mining techniques can be applied in order to discover useful knowledge that helps to improve the e-learning process. The discovered knowledge can be used not only by providers (educators) but also by own users (students). KDD process, as presented in [14, 2, 3], is the process of using DM methods to extract knowledge according to the specification of measures and thresholds, using a database along with the appropriate preprocessing, and data transformation of the database. There are considered five KDD stages executed in the following order: • Selection: This stage consists of creating a target data set, or focusing on a subset of variables or data samples, on which discovery is to be performed; • Pre-processing: It consists of the target data cleaning and pre processing in order to obtain consistent data; • Transformation: It includes the transformation of the data using dimensionality reduction or other transformation methods; • Data Mining: It includes procedures which search for patterns of interest in a particular representational form, depending on the DM objective; • Interpretation/Evaluation: It includes the interpretation and evaluation of the mined patterns. The most difficult and time-consuming of the five mentioned stages are the stages of the data pre-processing and transformation. It is worth mentioning that many researchers consider these two stages as one under the label “preprocessing”. One of the most useful and well studied mining methods is the Association Rule Mining (ARM) [5, 2, 3]. During the last decades, many scientists and practitioners of various fields apply data mining algorithms to explore and discover relations between attributes of large databases to create corresponding rules. Such rules associate one or more attributes of a dataset with another attribute between sets of items in large databases, producing an if–then statement concerning attribute values. The application of such techniques, usually, is not a simple process, but requires the successive application of the five steps of the KDD process. ARM has been successfully applied in educational content. An interesting review of relevant applications on data of learning management system can be found in [19]. The work in [20] presents the application of ARM in the data of an e-Examination system. Other interesting relevant works can be found in [17, 18]. This work is also focused in the application of ARM in educational data. The motivation is the discovery of hidden associations that probably exist in the educational data of the database of the ESOG web-based application [1]. These data were collected by the use of the particular application from Greek secondary education students of various parts of Greece. The contribution of this paper is the application of the KDD process on the educational data of ESOG and the discovery of the corresponding association rules. For the mining process, the well-known Apriori algorithm [4] is applied through the machine learning software WEKA [6]. The rest of this work is organized as follows: The next section outlines the ESOG application as well as the data that it manages. Then, the paper presents briefly issues that concern the association rules mining and the Apriori algorithm. Section 4 considers in details the KDD process for the export of the association rules from the ESOG data using the Apriori algorithm. The paper concludes by summarizing the whole effort and defining some certain directions for future work. 2 The web-based Application ESOG 2.1 ESOG System ESOG1 [1] is a web-based application that supports students in their school occupational guidance orientation. Particularly, students in Greece, in their last year of studies in Secondary Education (SE) have to choose the university department where they are going to continue their studies. So, according to their degrees in the annual Pan-Hellenic exams, they have to complete a form stating their preferences in University Departments (UDs). However, the knowledge domain of the UD that they choose may be out of their interest. The factors that may lead students to a wrong choice include (i) Lack of accurate information and proper guidance, (ii) Influence from relatives, (iii) selecting of a UD based exclusively on placement that it provides. Consequently, many students may face various learning difficulties during their studies. These students maybe would have a successful evolution in another science. Thus, there is an obvious need to help students to take a right decision. ESOG is a software tool that may be useful in such cases. In particular, it is an intelligent decision support system which allows the students to indicate their degree of interest for courses have been taught in Secondary Education (SE). Then, it executes the multi-criteria analysis method ELECTRE I [7, 16] and produce a ranked based on students interests UD list. In general, multi-criteria analysis methods [15], such as ELECTRE I, are assessments and evaluations methods based in the principle that complicated problems of decision-making, are not possible to be solved mono dimensionally. All criteria that influence the decision-making are necessary to be taken into consideration for the problem solution. The use of such methods is necessary when various parameters of a problem must be taken into account to make a decision. The family of ELECTRE methods [7, 16] includes multi-criteria analysis methods that act into two stages. The first stage develops one or several outranking relations, which aims at comparing in a comprehensive way each pair of actions. In the second stage, an exploitation procedure elaborates the recommendations obtained in the first stage making consistent exploration and analysis in support of decision makers. ESOG