Applicability of Educational Data Mining in Afghanistan: Opportunities and Challenges
Abdul Rahman, Sherzad
Lecturer at Computer Science Faculty of Herat University, Afghanistan
Ph.D. Student at Technische Universität Berlin, Germany
[email protected] || [email protected]

ABSTRACT

The author's own experience as a student and later as a lecturer in Afghanistan has shown that the methods used in the educational system are not only flawed, but also fail to provide even minimal guidance to students in selecting a proper course of study before they take the national university entrance (Kankor) exam. This often results in high attrition rates and poor performance in higher education.

Based on studies done in other countries, and on online questionnaires that the author distributed to university students in Afghanistan, it was found that proper procedures and specialized studies in high schools can help students select their major field of study more systematically.

Additionally, large amounts of data are available for mining purposes, but the methods that the Ministry of Education and the Ministry of Higher Education use to store and produce their data only enable them to derive simple facts and figures. Furthermore, the results indicate that there are potential opportunities for applying educational data mining in the domain of Afghanistan's education systems. Finally, this study provides readers with approaches for using Educational Data Mining to improve educational business processes, for instance, to predict a proper field of study for high school graduates, or to identify first-year university students who are at high risk of attrition.

Keywords

Educational data mining; data mining; major study prediction; placement class; computer uses in education; value of information; Kankor, University Entrance Exam, Afghanistan education systems.

1. INTRODUCTION

Recent advances in big data, data mining, and educational data mining (EDM) enable educational institutions to transform their data into valuable information and to find hidden patterns that yield clear results and specific targets for improving educational settings.

Educational institutions in Afghanistan currently generate only facts and figures (e.g., total number of students and teachers by gender, geographic location, school or university, and some other conditions). However, these simple facts and figures do not help educational institutions improve educational settings, for example, by predicting proper major fields of study for high school graduates participating in the Kankor exam, identifying first-year university students who are at high risk of attrition, or recommending the right courses for students to enroll in.

Online questionnaires (with open-ended and closed-ended questions) were prepared and distributed through social networking platforms to students and alumni of public and private universities in Afghanistan. They asked about participants' familiarity with the Kankor exam, the factors involved in choosing a proper major in it, attrition rates and poor quality of education, and whether offering specialized studies at high school is a good decision. The results indicated that lack of familiarity with the Kankor exam and its processes, inappropriate selection of major fields of study, and the lack of specialized studies at school are the major reasons that most students are at risk of dropping out or performing poorly in their higher education studies.

The main goal of this paper is to study the opportunities and challenges of applying EDM in the Afghanistan education context, to help educational institutions better prepare students for their studies in schools and universities. This paper also tries to answer the following questions:

1. What are the potential challenges to implementing EDM in the Afghanistan context?
2. How can EDM be implemented effectively in the Afghanistan education system?
3. What is the status of data availability in Afghanistan's education systems?

This paper first describes big data, data mining, and educational data mining. Next, it briefly explains the education systems and the existence and availability of large amounts of data in Afghanistan's education systems. Finally, it presents studies in the area of EDM and discusses their applicability in Afghanistan's education environment, using case studies to support the argumentation.

1.1 Big Data

Since the start of the 21st century, there has been a massive increase in data generation with the prevalence of mobile technologies, social media networks, sensor devices for environmental data gathering, and online shopping sites, to name a few. These vast amounts of generated data are too big and complex to be handled and processed effectively by traditional approaches [34, 35].

Big data does not just mean very large amounts of data, though size is one of its characteristics. Big data is identified by three main cornerstones known as the three V's: Volume, Variety and Velocity [34, 35]. More recently, other V's such as Value, Veracity, and Visualization were added to define the term more precisely [32, 16].

In a nutshell, big data refers to rapidly growing structured and unstructured data with sizes beyond the ability of traditional database tools (RDBMSs) to store, manage, and analyze. This has led to new big data technologies, for example NoSQL, Apache Hadoop, Apache Spark and Apache Flink, each with an ecosystem built on top of it, which store and process big data in batch or streaming mode using distributed computing in a reliable and scalable fashion [5, 8, 12, 31].

It is worth mentioning that other non-distributed and non-scalable software and tools are available for data mining purposes, e.g., Weka, SPSS Clementine, and packages for the R and Python programming languages. Furthermore, big data solutions certainly do not replace the traditional approaches. Likewise, data mining is not only for big data.

1.1.1 The Three V's of Big Data

Volume means very large amounts of data. For example, in 2012, approximately 2.5 exabytes of data were generated per day [17].

Velocity means how rapidly data is generated. According to InternetLiveStats, every second more than 10,000 Tweets are sent, 100,000 YouTube videos are viewed, 49,000 Google searches are performed, 2,300,000 emails are sent, and 27,700 gigabytes of Internet traffic are generated [11, 20].

Variety refers to the diversity of data forms: about 80% of all existing data is unstructured, e.g., videos, audio, books, news articles, blogs, and clickstream data [29].

1.2 Data Mining

Data mining sits at the crossroads of statistics, machine learning and artificial intelligence algorithms, and databases and data warehouses [10, 25].

Data mining enables organizations to explore the past by means of univariate and bivariate exploration techniques and to predict the future by means of classification and clustering techniques and methods [26].

According to the KDnuggets 2014 poll [13], data mining and analytics are applied in a wide range of industries, e.g., banking, healthcare, fraud detection, science, advertising, education, and many other sectors.

1.3 Educational Data Mining

Researchers apply and test data mining techniques to explore and mine the data generated in educational settings (e.g., data from schools and universities, and the traces that students leave when they interact with learning management systems and intelligent tutoring systems) in order to better understand students and the settings in which they learn. This field is called EDM [4, 6, 24].

The EDM research community has undergone remarkable growth. Since 2008, an international conference on EDM has been held every year to bring together researchers from computer science, education, psychology, and statistics who analyze large datasets to answer educational research questions. In 2009 the Journal of Educational Data Mining (JEDM) was established for sharing and disseminating research results.

The main goals of EDM are to predict students' future learning behavior, to study the effects of educational support that can be realized through tutoring and learning systems, to discover new models or improve existing ones, and finally to advance scientific knowledge about learning [4].

The applications of EDM are very broad, including analysis and visualization of data, providing feedback to support instructors, recommendations for students, predicting student performance, detecting undesirable student behaviors, grouping students, constructing courseware, planning and scheduling, and others [23].

2. STATUS OF AFGHANISTAN EDUCATION SYSTEMS

General education in Afghanistan comprises K-12 (primary, secondary and high school), Islamic studies, Teacher Training, and Technical and Vocational schools and institutes, all of which are administered by the Ministry of Education (MoE). The Ministry of Higher Education (MoHE) supervises the universities, which provide Bachelor, Master, and Ph.D. degree programs.

The establishment of the new democracy in Afghanistan in 2002 gave people hope and made them more optimistic about the future. The education systems have been going through a nationwide rebuilding process, and despite obstacles, numerous public and private educational institutions were established across the country

Data needs context