Educare: a Decision Support System for Anticipating Behaviors Among Educational Actors
Cláudia Antunes
Instituto Superior Técnico, Av Rovisco Pais, 1049-001 Lisboa, Portugal
[email protected]

Abstract. Decision support systems play a core role in data management, due to their ability to provide the tools needed to support the decision-making process. Education is just another application field in which those systems may contribute to improvement. The educare project aims to design a decision support system specifically created to enable the discovery of hidden information about students' and teachers' performance.

Keywords: Decision support systems, Educational data mining, Multi-dimensional model for education

1 Introduction

The educational process produces large amounts of data, which can and should be used to understand its actors' behaviors and to identify the causes of failure and success. Although data have been collected for years, even in digital form, education poses a set of particular challenges with respect to data analysis and information discovery. Educational Data Mining (EDM) [1] is an emerging discipline whose goal is to apply data mining (DM) techniques to data that come from educational settings, such as computer-based tutoring systems or the traditional teaching process. In either case, the data hide students' usual behaviors, process definitions and coordination, teaching strategies and so on. In all these cases, data are framed in a particular and well-defined context, which can and should be used to better understand the data. The reason for creating a dedicated discipline lies in a set of peculiarities that characterize educational data, in particular: the temporality of records, the impact of previous events (data) on future results, and the impact of the context on behaviors.
Together, these three factors mean that applying basic DM tools (like decision trees or association rules) does not adequately answer the questions posed by education professionals. See for example [2], where classification is performed under new considerations, using estimations instead of fully recorded data. The educare project provides a tool that follows the best practices in decision support, creating a data warehouse and developing the mining operations needed to reach those goals. In this paper, we describe the educare data warehouse, presenting its general architecture and stating its main dimensions and facts. Each data mart is described in detail, with particular attention to its granularity. The rest of this paper is structured as follows: next we describe the data warehouse architecture; in Section 3, we define the data architecture, paying special attention to the main dimensions, and present the different star schemas in the data warehouse. The paper closes with conclusions and guidelines for the tools that operate over the data warehouse.

2 The Educational Process in educare

The educare project aims to provide the educational community with a prototype of a decision support system that covers the major aspects of educational data analysis, from students' behaviors to teachers' strategies, including the organization of programs and subjects. The basic idea is to consider two main entities in the educational context, actors (students and teachers) and curricular units, and to deal with different levels of abstraction for actors (individual, working group and set of actors) and for curricular units (subjects, groups of subjects and programs). Using this conceptual framework, it will then be possible to understand the entire educational process, and to prevent and correct problematic situations whenever possible.
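The two entity kinds and their levels of abstraction can be pictured as a small nested data structure. The following is only a minimal sketch under stated assumptions: the class names (`ActorLevel`, `UnitLevel`, `Entity`), the `atomic_members` helper and the example subject and program names are all hypothetical and do not come from the educare design itself.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ActorLevel(Enum):
    """Levels of abstraction for actors (students and teachers)."""
    INDIVIDUAL = 1
    WORKING_GROUP = 2
    SET_OF_ACTORS = 3


class UnitLevel(Enum):
    """Levels of abstraction for curricular units."""
    SUBJECT = 1
    GROUP_OF_SUBJECTS = 2
    PROGRAM = 3


@dataclass
class Entity:
    """An actor or curricular unit at some level of abstraction (hypothetical type)."""
    name: str
    level: Enum
    members: List["Entity"] = field(default_factory=list)

    def atomic_members(self) -> List["Entity"]:
        """Return the lowest-level entities reachable from this one."""
        if not self.members:
            return [self]
        found: List["Entity"] = []
        for member in self.members:
            found.extend(member.atomic_members())
        return found


# Hypothetical example data: a program made of one group of two subjects.
databases = Entity("Databases", UnitLevel.SUBJECT)
data_mining = Entity("Data Mining", UnitLevel.SUBJECT)
info_systems = Entity("Information Systems", UnitLevel.GROUP_OF_SUBJECTS,
                      [databases, data_mining])
program = Entity("Example MSc", UnitLevel.PROGRAM, [info_systems])

print([e.name for e in program.atomic_members()])
```

Because actors and curricular units share the same nesting shape in this sketch, an analysis written against `Entity` could in principle be applied at any level, which is one way to read the idea that all entities can be addressed by similar approaches.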
The distinction between educare and other systems lies in the fact that each educational entity can be addressed by similar approaches, since all of them share the educational context. Indeed, the strategy is to create a system that can be guided by contextual information (background knowledge), in order to anticipate failure situations from the point of view of either actors or curricular units.

2.1 Business processes

The educational process can be seen as a cycle (as shown in Figure 1), which involves students' participation, the teaching process and quality assurance.

Figure 1 Educational process (cycle: student participation, teaching process, quality assurance)

Accordingly, the educare project considers three main business processes in education: student enrollment and evaluation (SEE), teaching activity (TA) and quality assurance (QA). The first process comprises five atomic activities: application, admittance, subject enrollment, lesson attendance and item evaluation.

Figure 2 Students' enrollment and evaluation process (application, admittance, subject enrollment, lesson attendance, item evaluation)

The second process currently covers only teaching activities, although in the future it may also cover educators' research and management tasks. The last process comprises quality assurance, covering both subjects' organization and teaching performance.

3 System architecture

Like any decision support system, educare must deal with historical and consolidated data, which should be stored in a way that enables its analysis, and it naturally follows an architecture similar to the typical one for this kind of system. Figure 3 presents the generic architecture of the educare system (see [3]), which, like any decision support system, is centered on a data warehouse holding historical and consolidated data.
Figure 3 educare architecture (components shown: operational data, external data, data warehouse, metadata, knowledge base, data exploration, educational data mining, data and information visualization)

The data warehouse should be fed by operational data and updated every curricular term, in order to stay up to date. This storage should follow the multi-dimensional model, to ease data analysis and to allow data to be explored at different granularities. In this way, the development of mining processes will be easier, since data are available at different levels of detail, as needed. The interaction with the data warehouse will be distributed across three modules: the data exploration, the data mining, and the data and information visualization tools. Unlike the typical decision support system architecture, educare will maintain an additional repository, hereafter called the knowledge base (KB). This repository will contain an ontology dedicated to the educational context and process, and should have as instances the information discovered by the mining tools, such as sequential patterns describing frequent behaviors, or simply knowledge about curricular plans. Figure 4 presents the detailed architecture of the system, including its back room.

Figure 4 System detailed architecture with back room and front room (components shown: metadata, operational data, extraction, data staging area, transformation, loading, DW bus, data marts with atomic data, data marts with aggregated data, presentation servers, data exploration tools, data and information visualization, educational data mining, knowledge base)

In order to fill the data warehouse, the system has to provide an additional tool for extracting, transforming and loading the data from the operational databases into the data warehouse, hereafter called the educare ETL tool.
This tool comprises services for extracting the data from operational data sources, for transforming them into the most adequate formats and for loading the transformed data into the data warehouse. These services should handle both the first batch (already existing data) and subsequent incremental updates. The entire transformation path between the operational data sources and the data warehouse will be documented in the metadata catalog, enabling later maintenance and extension of the data warehouse.

3.1 Data architecture in educare

In order to represent all those processes and atomic activities, the data will be organized around seven main dimensions: Student, Teacher, Subject, Program, Term, Curricular day and QAItem. Additional dimensions will also be considered: Lesson type, Curricular topic, Subject QA survey, Teaching QA survey and Working group.

Student
The student dimension is responsible for holding identification, personal and application data. It is the key to understanding all issues related to students and their results.

Teacher
Similarly, the teacher dimension represents educators, storing their identification, personal and professional data. It may be used for analyzing teaching behaviors.

Subject
The subject dimension gathers the data about each course available at a school, from its goals to its description, including the responsibility hierarchy (department and group of disciplines).

Term
The term dimension is the main temporal entity, corresponding to the finest granularity of interest in the main business processes. It holds only descriptive data and the academic year.

Curricular day
The curricular day dimension is the most detailed time unit, and describes each day of a specific term. This dimension should contain attributes that distinguish the characteristics of different days in the term, such as the day's ordinal position in the term, the day of the week, etc.
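As an illustration of the two time-related dimensions just described, the sketch below derives curricular day rows for one term. It is a minimal sketch, not the educare schema: the field names, the surrogate keys, the example term and the rule that only Monday to Friday count as curricular days are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

# Fixed English names, indexed by date.weekday() (0 = Monday).
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


@dataclass(frozen=True)
class Term:
    term_key: int        # surrogate key (assumed)
    description: str
    academic_year: str


@dataclass(frozen=True)
class CurricularDay:
    day_key: int         # surrogate key (assumed)
    term_key: int        # reference to the owning term
    calendar_date: date
    ordinal_in_term: int
    day_of_week: str


def build_curricular_days(term: Term, start: date, end: date,
                          first_key: int = 1) -> List[CurricularDay]:
    """Create one row per weekday between start and end, inclusive."""
    days: List[CurricularDay] = []
    current = start
    while current <= end:
        if current.weekday() < 5:  # assumption: weekends are not curricular days
            days.append(CurricularDay(
                day_key=first_key + len(days),
                term_key=term.term_key,
                calendar_date=current,
                ordinal_in_term=len(days) + 1,
                day_of_week=DAY_NAMES[current.weekday()],
            ))
        current += timedelta(days=1)
    return days


# Hypothetical term and date range, for illustration only.
term = Term(term_key=1, description="1st semester", academic_year="2010/2011")
days = build_curricular_days(term, date(2010, 9, 13), date(2010, 9, 19))
for day in days:
    print(day.ordinal_in_term, day.calendar_date, day.day_of_week)
```

Storing the ordinal and the day of the week as explicit attributes, rather than recomputing them at query time, lets analyses group records by position in the term without any date arithmetic.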
Evaluation Item
An evaluation item is an item that students must complete in order to be evaluated in a subject; examples of such items are exams, homework and laboratory reports.

Program
Program