Technische Universität München
Fakultät für Informatik
Fachgebiet Didaktik der Informatik

Investigating Knowledge Structures in Computer Science Education

Andreas Michael Mühling

Vollständiger Abdruck der von der Fakultät für Informatik der Technischen Universität München zur Erlangung des akademischen Grades eines

Doktors der Naturwissenschaften (Dr. rer. nat.) genehmigten Dissertation.

Vorsitzende: Univ.-Prof. Gudrun J. Klinker, Ph.D.
Prüfer:
1. Univ.-Prof. Dr. Peter Hubwieser
2. Prof. Christopher Hundhausen, Ph.D., Washington State University, USA
3. Univ.-Prof. Dr. Johannes Magenheim, Universität Paderborn

Die Dissertation wurde am 19. Februar 2014 bei der Technischen Universität München eingereicht und durch die Fakultät für Informatik am 5. Mai 2014 angenommen.

Abstract

To be knowledgeable in a subject matter is a prerequisite for being competent. Experts typically possess densely connected structural knowledge of the concepts in their field of expertise. Therefore, investigating the knowledge structures of learners remains a central aspect of educational research in computer science as well as in other subjects, despite the ongoing trend towards outcome- and competence-oriented assessment.

This thesis presents concept landscapes - a novel way of investigating the state and development of knowledge structures using concept maps. Instead of focusing on the assessment and evaluation of single maps, the data of many persons is aggregated and data mining approaches are used in analysis. New insights into the “shared” knowledge of groups of learners are possible in this way. The educational theories underlying the approach, the definition of concept landscapes, and the accompanying analysis methods are presented.

Since both data mining techniques and the collection of concept maps are well suited for computer-based approaches, three software projects have been realized in the course of this thesis. They allow the computer-based drawing and assessment of concept maps, the subsequent analysis of concept maps and concept landscapes, and the extraction of salient concepts and propositions from texts.

The methods and tools have been applied in three research projects that investigate structural knowledge in computer science education. Among others, the structural knowledge of CS students entering university has been analyzed. It has been found that attending the compulsory school subject “Informatics” in Bavaria has visible effects on the knowledge structures of beginning students. The results of these studies can be used to better understand the possibilities, limits, and effects of teaching methods, curricula, or materials in computer science education and other subjects.

Zusammenfassung

Wissen ist eine Voraussetzung, um in einem Fach kompetent zu sein. Experten besitzen üblicherweise eine dicht vernetzte Wissensstruktur hinsichtlich der Konzepte ihres Spezialgebiets. Daher ist die Untersuchung von Wissensstrukturen von Lernenden ein zentraler Aspekt von Lehr-/Lernforschung in Informatik und anderen Fächern, trotz eines erkennbaren Trends hin zur Outcome- und Kompetenzorientierung.

Diese Arbeit präsentiert concept landscapes - einen neuartigen Weg, um den Zustand und die Entwicklung von Wissensstrukturen mit Hilfe von Begriffsnetzen zu untersuchen. Anstatt den Fokus bei der Bewertung und Auswertung auf einzelne Netze zu richten, werden die Daten von vielen Personen aggregiert und zur Analyse Methoden des Data Mining angewendet. Dadurch sind neue Einblicke in das “gemeinsame” Wissen einer Gruppe von Lernenden möglich. Die Arbeit präsentiert die zugrundeliegenden Theorien der Lehr-/Lernforschung, die Definition von concept landscapes und die möglichen Analysemethoden.

Da sich Techniken des Data Mining und die Erhebung von Begriffsnetzen inhärent für rechnerbasierte Verfahren eignen, sind drei Softwareprojekte im Laufe dieser Arbeit realisiert worden. Diese erlauben das rechnerbasierte Zeichnen und Bewerten von Begriffsnetzen, die darauf folgende Analyse von Begriffsnetzen und concept landscapes sowie das Extrahieren von zentralen Konzepten und Propositionen aus Texten.

Die Methoden und Werkzeuge wurden in drei Forschungsprojekten angewendet, die Wissensstrukturen im Bereich der Informatikausbildung untersuchen. Unter anderem wurde das strukturelle Wissen von Informatik-Studienanfängern analysiert. Dabei hat sich gezeigt, dass der Besuch des Pflichtfachs Informatik in Bayern sichtbaren Einfluss auf die Wissensstrukturen von Studienanfängern hat. Die Ergebnisse der Studien helfen, die Möglichkeiten, Grenzen und den Einfluss von Lehrmethoden, Lehrplänen und Lehrmaterialien in der Ausbildung in Informatik und anderen Fächern besser zu verstehen.

Contents

I Introduction 1

1 Problem Setting 3

2 Detailed Overview 11

II Theoretical Background and Related Work 17

3 Knowledge and Learning 19

3.1 Knowledge ...... 19

3.1.1 Psychological Foundations ...... 20

3.1.2 Epistemological Foundations ...... 24

3.2 Learning ...... 25

3.2.1 Psychological Foundations ...... 26

3.2.2 Constructivism ...... 27

3.2.3 Conceptual Change ...... 30

3.2.4 Meaningful Learning ...... 33

3.2.5 Models of Learning ...... 36

3.3 Assessment ...... 37

3.3.1 Learning Objectives ...... 39

3.3.1.1 Taxonomies ...... 39

3.3.2 Learning Outcomes ...... 41

3.3.2.1 SOLO Taxonomy ...... 41

3.3.2.2 Competencies ...... 42

4 Concept Maps 45

4.1 Elements ...... 46

4.1.1 Map Structure ...... 50

4.2 Applications ...... 52

4.2.1 Learning and Teaching ...... 54

4.2.2 Assessment ...... 56

4.2.2.1 Scoring System ...... 60

5 Analysis Methods 67

5.1 Graph Theory ...... 67

5.2 Pathfinder Networks ...... 68

5.2.1 Construction ...... 69

5.2.2 Investigating Structural Knowledge ...... 71

5.3 Cluster Analysis ...... 75

5.3.1 Partitioning Methods ...... 75

5.3.2 Model Based Methods ...... 77

5.4 Text Mining ...... 78

5.4.1 Existing Software Solutions ...... 80

III From Concept Maps to Concept Landscapes 83

6 Possibilities and Limitations of Concept Maps 85

6.1 Cognitive View ...... 85

6.1.1 Observing the Externalization ...... 92

6.2 Epistemological View ...... 97

6.2.1 Concept Maps as Graphs ...... 98

6.2.1.1 Representations in Computers ...... 101

6.3 Educational View ...... 103

6.3.1 Assessing Learning ...... 104

6.3.2 Scoring Concept Maps ...... 108

7 Concept Landscapes 111

7.1 Definition ...... 112

7.1.1 Formal Definition ...... 117

7.1.1.1 Vertical ...... 117

7.1.1.2 Horizontal ...... 118

7.2 Analysis Methods ...... 119

7.2.1 Cluster Analysis ...... 119

7.2.1.1 Similarity Based Clustering ...... 122

7.2.1.2 Latent Class Clustering ...... 123

7.2.1.3 Example ...... 125

7.2.2 Pathfinder ...... 131

7.2.2.1 Example ...... 135

7.2.3 Graph Measures ...... 138

7.2.3.1 Simple Graph Measures ...... 138

7.2.3.2 Advanced Graph Measures ...... 139

7.2.3.3 Community Detection ...... 141

7.2.3.4 Frequent Subgraph Analysis ...... 142

7.2.4 Visualization ...... 143

7.2.4.1 Vertical Landscapes ...... 144

7.2.4.2 Horizontal Landscapes ...... 145

8 Software Support for Concept Landscapes 149

8.1 CoMapEd ...... 150

8.1.1 Requirements Analysis ...... 150

8.1.2 Design and Implementation ...... 152

8.2 CoMaTo ...... 155

8.2.1 Requirements Analysis ...... 155

8.2.2 Design and Implementation ...... 156

8.3 ConEx ...... 156

8.3.1 Requirements Analysis ...... 157

8.3.2 Design and Implementation ...... 157

IV Case Studies 161

9 Overview 163

10 CS1: Computer Science Education for Non-Majors 167

10.1 Description of the Setting ...... 167

10.2 Data Collection & Research Questions ...... 169

10.3 Analysis and Results ...... 173

10.3.1 RQ1: Development of Knowledge ...... 174

10.3.2 RQ2: Common Knowledge and Misconceptions . . . . . 177

10.3.3 RQ3: The Process of Learning ...... 182

10.4 Discussion ...... 182

11 CS2: Knowledge Structures of Beginning CS Students 187

11.1 Description of the Setting ...... 187

11.1.1 Description of the Subject “Informatics” ...... 188

11.2 Data Collection & Research Questions ...... 190

11.3 Analysis and Results ...... 193

11.3.1 RQ1: Prior Knowledge of Beginning CS Students . . . . . 193

11.3.2 RQ2: Effect of CS Education in Secondary Schools . . . 198

11.4 Discussion ...... 199

12 CS3: Conceptual Knowledge and Abilities 205

12.1 Description of the Setting ...... 205

12.2 Data Collection & Research Questions ...... 206

12.3 Analysis and Results ...... 209

12.3.1 RQ1: Development of Structural Knowledge ...... 210

12.3.2 RQ2: Connections Between Knowledge and Abilities . . . 215

12.4 Discussion ...... 220

V Conclusion 225

13 Summary 227

14 Discussion 231

14.1 Case Studies ...... 235

15 Further Research 239

15.1 Identifying Threshold Concepts of CS ...... 240

15.2 Combining Knowledge Space Theory and Concept Maps . . . . 241

16 Acknowledgements 243

Appendix A Example Maps for Pathfinder Analysis 245

Appendix B Additional Diagrams for CS1 247

Appendix C Additional Diagrams for CS3 249

List of Figures 259

List of Tables 264

References 265

Part I

Introduction

1 Problem Setting

This thesis presents data-driven methods and the results of empirical research dealing with learning in the context of computer science (CS). As such, it is interdisciplinary in nature, depending both on computer science and on research about education and learning. The focus lies on investigating and visualizing the state and development of structural knowledge. Research concerned specifically with computer science education is a relatively recent field of interest (cf. Fincher & Petre 2004, p. 1), especially when compared to the corresponding research in other subjects like mathematics, physics, history, or languages. These all have a much longer tradition in the curricula of schools and universities. Consequently, the knowledge about and experience in teaching these subjects are far more elaborate and widespread than for computer science, a subject that emerged merely half a century ago. This is especially true for learning how to program:

“Huge numbers of papers have been published in computing education conferences and journals in the past 40 years, so we would expect much to have been learnt about teaching and learning in computing. However, after decades of research, we still have only a vague understanding of why it is so difficult for many students to learn programming, the basis of the discipline, and consequently of how it should be taught.” (Malmi, Sheard, Simon, Bednarik, Helminen, Korhonen, Myller, Sorva & Taherkhani 2010, p. 9)

The advancement of educational research in other subjects is a benefit, of course, as researchers in computer science education are not forced to repeat every misstep made by others in the past. Also, research about education and learning in general has advanced greatly over the last decades. Constructivism, a learning theory backed by findings of psychology and presented in more detail in the third chapter, forms the basis of the current understanding of how learning progresses. Teaching should therefore adapt accordingly. Luckily, many parts of computer science are well suited for constructivist approaches anyway - as described in (Mühling, Hubwieser & Brinda 2010, p. 60). Concerning research about education in general, the last decades have brought about a shift in perspective:

“Since the end of the 1980s, the introduction of new oversight strategies for governmental intervention worldwide has led to a stronger focus on ‘outputs’ and ‘outcomes’ at all levels of the educational system, from elementary through secondary and tertiary education up to vocational and adult education. These outcomes - or the value added to them - are used as criteria for the productivity of entire educational systems, the quality of individual educational institutions, and the learning achievements of individuals. The role of educational research, then, is to render this educational productivity measurable, to develop models that can explain how educational processes take place, evaluate their effectiveness and efficiency, and propose and analyze strategies for intervention.” (Klieme, Hartig & Rauch 2008, p. 3)

Placing the focus more on outcomes also necessitates new ways of testing, which in these scenarios is typically done in the form of competence-based assessments. One example of a major testing effort is the international large-scale study PISA (PISA 2009 Technical Report 2012, for 2009). Testing competences, as briefly described in chapter 3, requires established models that define what a competence entails and test items that actually assess this particular competence. Establishing such models and tests for CS clearly is a valuable and necessary area of research (cf. Linck, Ohrndorf, Schubert, Stechert, Magenheim, Nelles, Neugebauer & Schaper 2013, p. 1), as is the investigation of teaching competencies specifically for CS (cf. Hubwieser, Magenheim, Mühling & Ruf 2013, p. 1). However, beyond making educational productivity measurable, the development of explanatory models for learning computer science and the analysis of strategies for intervention are also valid research goals, especially when taking into account the currently rather barren research landscape concerning CS education. This calls for fundamental research about the inner processes involved in learning and, ultimately, also teaching computer science.

A detailed analysis of theories of instruction reveals that there are typically three components (cf. Glaser & Bassok 1989, p. 631):

1. A description of the desired goal of the learner concerning competent performances that encompass both knowledge and skills.

2. An analysis of the initial state of these performances, i.e. of the learner’s knowledge and ability.

3. A model of learning that explains the transition from the initial state to the desired state and can be implemented in an instructional setting.

All three components encompass an element of knowledge. Therefore, when analyzing processes of learning and teaching CS, one major aspect is the state and development of (conceptual) knowledge, as defined in chapter 3. To become a competent computer scientist as well as a competent programmer, a person must acquire a certain set of skills as well as a certain body of knowledge. There is no way to become an expert without rich and highly connected conceptual and factual knowledge. “Thus, one goal of instruction should be to help students acquire expert-like knowledge structures in their domain of study” (Trumpower & Goldsmith 2004, p. 427). Even basic programming skills require factual knowledge about syntax elements and conceptual knowledge about program flow, for instance. Modern constructivist teaching implies that such a body of structured knowledge cannot passively be transported from teachers to learners. Instead, teaching in a constructivist setting requires getting students to actively engage in the process of knowledge construction. Educational research, in turn, must find ways of investigating the idiosyncratically developing knowledge, which is not easily done:

“It should be recognized that we are trying to probe into a person’s cognitive structure and ascertain what concepts and propositions that person has relevant to a particular topic and how are these integrated and organized? This is a profoundly challenging task, and yet it is fundamental to improving teaching and learning in any field. It is also essential for the capture and archiving of expert knowledge [...].” (Cañas & Novak 2006, p. 497)

Investigating conceptual knowledge will remain relevant as Goldstone & Kersten (2003, p. 601) note: “Concepts are useful when they provide informative or diagnostic ways of structuring this world. An excellent way of understanding the mental world of an individual, group or scientific community, or culture is to find out how they organize their world into concepts.”

All of this sets the stage for modern computer science education research and defines the general context of this work. Fincher & Petre (2004, p. 3ff.) identify ten general areas of motivation for further research in computer science education. This thesis is centered in the first of these ten, which concerns the understanding of students:

“The area of student understanding is characterized by investigation of students’ mental and conceptual models, their perceptions and misconceptions. The kinds of question that researchers find motivating in this area are concerned with why students have trouble with some of the things they have trouble with, what distinguishes good students from bad students, and what the differences are between how students understand things and how experts understand things.”

Fig. 1: A general model for monitoring the effects of a learning stimulus on the knowledge of a person. Even though one is generally interested in the development in the left half of the diagram (i.e. the real person), only the right half is accessible to research. The transfers in the form of an externalization and the analysis of the stimulus are in general not lossless and are subject to many influences.

Since established models of learning are not yet present for CS, a reasonable research approach is to conduct exploratory studies. It was with this goal in mind that several investigations were conducted by the research group “Didaktik der Informatik” at the TU München in the years 2010 to 2013. The investigations all aimed to find out how the conceptual knowledge of learners is influenced by differing forms of computer science education. These forms ranged from an introductory course on programming with only minimal input, to a lecture, to the question of how far attending a (newly introduced) compulsory school subject influences conceptual knowledge about basic CS concepts. While the studies all differed in their respective details, they all had in common a basic research design that is shown in Fig. 1.

The focus of interest is always the current state or development of knowledge at or over some point(s) in time. This knowledge is influenced by educational processes, learning materials, activities of the learner and more - all subsumed under the general notion of “learning stimulus” in Fig. 1. The word “stimulus” was chosen over the word “input” to indicate that, as also explained in the next part, it cannot be assumed that the knowledge is “formed” by an input, or that an input is “transported” in some way. The knowledge of a person is not directly observable. It is, however, indirectly observable by using some form of externalization (see section 3.1.2). This externalization can be rather direct, like conducting an interview, or indirect, like having the person take a multiple-choice test. For all of the studies that this thesis is based on, concept maps, as described in chapter 4 and investigated more closely in chapter 6, were chosen as the format of knowledge representation and concept mapping as the process of externalization. The analysis of the stimulus, which may take on such different forms as a textbook, a curriculum, slides used in lectures, or a talk given by a teacher, is also important, as it forms the basis for what can be “looked for” in the knowledge development. Even though there will never be and cannot be a simple “transfer” of knowledge from teacher to student, the divergences between the two nevertheless may point to problem areas of the educational process. There will usually be very many ways to represent the stimulus in a form that is suitable for this kind of analysis; any form will have its advantages and drawbacks. In a similar setting, Trumpower, Sharara & Goldsmith (2010, p. 6) describe a process consisting of three phases: “1) knowledge elicitation, 2) knowledge representation, and 3) knowledge evaluation”. Following the descriptions given for each phase, the elicitation corresponds to the externalization of knowledge and the second phase, representation, corresponds to the analysis of the stimulus, where they have “suggested careful task analysis or consideration of curriculum documents, textbook content, and other pedagogical material as a starting point” (Trumpower et al. 2010, p. 25). The evaluation is the analysis of the correlation between the two.
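
To make the evaluation phase more concrete, the following sketch in R (the language of the analysis tool-chain presented in chapter 8) compares the propositions externalized by one learner with a reference set derived from a learning stimulus. The toy propositions, the helper function proposition_overlap, and the simple Jaccard measure are illustrative assumptions only, not the procedure used in the case studies.

    # Illustrative sketch only: comparing externalized propositions with a
    # reference set derived from a learning stimulus (the "evaluation" phase).
    # The data and the Jaccard measure are assumptions for demonstration.
    stimulus_props <- c("class - has - attribute",
                        "class - has - method",
                        "object - instance of - class")
    student_props  <- c("class - has - attribute",
                        "object - instance of - class",
                        "object - has - attribute")

    # Jaccard overlap of two proposition sets: |A intersect B| / |A union B|.
    proposition_overlap <- function(a, b) {
      length(intersect(a, b)) / length(union(a, b))
    }

    proposition_overlap(stimulus_props, student_props)
    # 0.5 for the toy data above: half of all distinct propositions are shared.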

While research about CS education has many facets and learning processes are of high importance, the evaluation of educational processes (i.e. teaching) is also pivotal. When using the above schema of investigating knowledge in this context, a shift in perspective must occur: Instead of externalizing and analyzing the knowledge of a single person, the analysis must focus on the many persons taking part in the same educational process. It is in this combined evidence that the artifacts of an educational process can show. The same is done in other studies that are concerned with evaluating educational processes, like PISA - measurements of many individuals are combined for analysis. A simple example, such as a school exam, shows how results can be interpreted from different points of view with different outcomes: On the one hand, a person can be interested in the particular result of one student. A teacher, for example, might be interested to know that this person failed the exam, or failed multiple exams in a row. On the other hand, a person can also be interested in the results over all students. The school’s principal might be interested to know that almost all students failed the exam. In the first case, no particular insight about the exam can be gained - unless the student would usually not fail an exam. In the second case, no particular insight about the individual students can be gained, in general.

This work presents - in chapter 7 - a new way of analyzing the externalized knowledge (with concept maps) in the light of measurements of many persons and also of several points in time. It is the result of three research studies that are presented in chapters 10 to 12. The ideas developed over the course of these studies and are summarized in this work; they are inspired by the ideas of data mining.
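
Before turning to data mining proper, a minimal sketch may help to convey the underlying idea of aggregation: counting how often each proposition occurs across the externalized maps of several persons. The data are invented and the counting is only an illustration; the actual notion of concept landscapes is defined in chapter 7.

    # Minimal sketch, assuming invented data: aggregating the maps of several
    # persons by counting how often each proposition occurs across the group.
    maps <- list(
      person1 = c("object-is a-class", "class-has-attribute"),
      person2 = c("object-is a-class", "class-has-method"),
      person3 = c("object-is a-class", "class-has-attribute", "class-has-method")
    )

    # Frequent propositions hint at the "shared" knowledge of the group.
    sort(table(unlist(maps)), decreasing = TRUE)
    # "object-is a-class" appears in all three maps, the others in two each.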

In recent years, Educational Data Mining (EDM) has emerged as a new discipline in the field of educational research. It is “[...] concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in. [...] Educational data mining methods are drawn from a variety of literatures, including data mining and machine learning, psychometrics and other areas of statistics, information visualization, and computational modeling” (Baker & Yacef 2009, p. 4). According to Baker (2010, Table 1), the five primary categories of Educational Data Mining and their respective goals are (a small illustrative sketch of the clustering category follows after the list):

Prediction Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables).

Clustering Find data points that naturally group together, splitting the full data set into a set of categories.

Relationship Mining Discover relationships between variables.

Discovery with models A model of a phenomenon developed with prediction, clustering, or knowledge engineering, is used as a component in further prediction or relationship mining.

Distillation of data for human judgment Data are distilled to enable a human to quickly identify or classify features of the data.
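
As an illustration of the clustering category in the present context, the following R sketch groups a handful of hypothetical concept maps by the similarity of their proposition sets. The toy maps, the Jaccard distance, and the complete-linkage choice are assumptions made purely for demonstration; they do not reproduce the analyses of the case studies.

    # Illustrative sketch of the "clustering" category: grouping persons by
    # the similarity of their (invented) proposition sets.
    maps <- list(
      p1 = c("class-has-attribute", "class-has-method"),
      p2 = c("class-has-attribute", "class-has-method", "object-isa-class"),
      p3 = c("object-has-state",    "object-has-behaviour"),
      p4 = c("object-has-state",    "object-has-behaviour", "class-has-method")
    )

    # Pairwise Jaccard distance between two proposition sets.
    jaccard_dist <- function(a, b) 1 - length(intersect(a, b)) / length(union(a, b))

    n <- length(maps)
    d <- matrix(0, n, n, dimnames = list(names(maps), names(maps)))
    for (i in seq_len(n))
      for (j in seq_len(n))
        d[i, j] <- jaccard_dist(maps[[i]], maps[[j]])

    # Hierarchical clustering on the distance matrix; two groups in this toy case.
    hc <- hclust(as.dist(d), method = "complete")
    cutree(hc, k = 2)
    # p1 and p2 end up in one group, p3 and p4 in the other.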

The research presented in this work is concerned with the second, third and fifth of these categories. The chosen methods of analysis are described in chapter 5, applied to the particular context in chapter 7 and tested on real-world scenarios in chapters 10 to 12. While the externalization of knowledge, its analysis and the analysis of the learning stimuli do not require computer support per se, the use of data mining techniques and computer-supported analysis allows the processing of large bodies of data. Since such large amounts of data cannot be processed manually in reasonable time, computer support is more than a way of effectively saving time: It allows new insights into the (often statistical) properties of the data. Using larger amounts of data often results in new findings that cannot be found in small samples due to noise or statistical insignificance. The proposed methods can all readily be implemented in software; for the most part, using a computer (e.g. for drawing concept maps) is a natural extension of the task and greatly enlarges the amount of data that can be handled. In this way, it becomes feasible to collect and immediately analyze data from many students worldwide without additional manual labor. Therefore, all proposed methods have been implemented as part of this thesis, as presented in chapter 8.

2 Detailed Overview

The research questions underlying this thesis are twofold. First, there is a general question that drove the development of the method presented in this thesis:

How can the knowledge structures of a group of persons be investigated with regard to common elements and differences between the individuals’ structures?

Derived from this general question while also taking into account the historical development of the thesis described below, the following more specific research questions are answered in the third part:

1. How can methods of data mining be applied to sets of concept maps in order to identify common elements and differences between the individual maps?

2. How can software support the workflow of the research design presented in Fig. 1?

In addition to these questions, the different research projects each had their own specific research questions dealing with computer science education. These are presented briefly below and in greater detail in the fourth part, separately for each case study, as they depend on the particular context of each study.

The rest of this thesis is divided into three major parts. First, the literature of related prior research is presented. It forms the theoretical background for the method presented in chapter 7. Instead of simply focusing on the relevant details, the chapters of the second part try to give a slightly more complete overview of the relevant aspects in order to present everything that is necessary to understand the rest of the thesis without having to resort to the references too often. It encompasses a rather broad field, as described in more detail below, ranging from psychological and neurological models of learning, through learning theories, to statistical methods of analyzing patterns in data. The next part then contains the first half of the contributions of this work: an investigation of concept maps and the development of the notion of concept landscapes. Analysis methods that are suitable for working with them are presented, as well as the software “tool-chain” that has been developed in the course of this thesis. These contributions are described in theory first and are then applied to actual research studies, which form the second half of the contributions of this thesis; the three case studies are based on actual investigations. Each study is presented self-contained in its own chapter. The structure of each chapter is identical and resembles the organization of a research paper.

In each of the studies, concept maps were collected from students in order to analyze conceptual knowledge. In the course of these investigations, the focus shifted more and more from the investigation of personal knowledge structures towards the common elements of groups of persons. This development then culminated in the idea of concept landscapes, which formalize this approach. The following gives a brief summary of the contexts in the temporal order in which they were conducted:

Knowledge development in a setting with minimal input One of several preparatory courses offered to beginning CS students at the TU München deals with basics of object-oriented programming (OOP), as presented in (Hubwieser & Berges 2011). The course design offers only minimal input and instead focuses on self-guided, active application of programming concepts. The development of the knowledge in this setting and the interplay between knowledge and programming abilities were the central aspects of this study, conducted in 2010. The original study has been published in (Berges, Mühling & Hubwieser 2012). The results show that the minimal input together with practical exercises is enough to initiate visible alterations in students’ mental models. Cluster analysis identifies groups of students with markedly different developments in their mental models. Also, the interrelation between conceptual knowledge and programming abilities is seemingly not as straightforward as one would hope.

Investigation of knowledge development in a lecture An introductory lecture in computer science for non-majors was investigated in 2010. Of particular interest was how conceptual knowledge structures of core concepts of object orientation (OO) develop over time. The results of the original study have been published in (Hubwieser & Mühling 2011c). The study shows that the learning of the students visibly progresses over time. Nevertheless, the process is rather complex and misconceptions remain prevalent throughout the lecture.

Influence of compulsory CS education in secondary schools on beginning students Bavaria, one of the federal states of Germany, introduced a new subject “Informatics”, as described e.g. in (Hubwieser 2012). As part of a larger research project in 2011, the conceptual knowledge of beginning CS students regarding central concepts of the curriculum of this subject has been analyzed. Of particular interest were the differences from students who did not attend this compulsory school subject. The results have not been published previously. The influence of the school subject on the mental models is clearly visible. The relevant prior knowledge of the beginning students is surprisingly complex. Also, groups can be identified with regard to the structure of the knowledge that exist regardless of the influence of the school subject.

The idea of concept landscapes has been developed over the course of these research projects. It was therefore not fully present (or named) at the time of the studies. The analysis methods that were used during the studies were revisited and adapted afterwards in order to make them compatible with the new approach of concept landscapes. Then, the data of each case study was re-analyzed in the course of writing this thesis in the light of the novel techniques. The original projects thus serve as case studies for the practical application of the idea in this thesis, but historically, they came first. Several hundred students have taken part in the studies and the results are of actual value for computer science education. Therefore, they are a fundamental part of this thesis and significantly more than just “toy examples”.

Since the idea behind concept landscapes is based on data mining, computer support is an important aspect for each of the three subtasks identified in Fig. 1: The externalization in the form of concept mapping as well as the analysis of both the input and the externalized knowledge can all be supported effectively with software. The tool-chain that has been developed for this contains the following three projects:

CoMapEd An online editor that allows the creation of concept maps in a browser and enables researchers to easily and comfortably collect numerous concept maps in the course of a survey.

ConEx A Java¹-based program that analyzes text data and extracts salient concepts as well as the sentences in which these concepts appear. It can be used to automatically analyze large amounts of textual input for educational processes (a simplified sketch of this idea follows below).

CoMaTo A package for the statistical language R² that allows performing the analysis methods presented in this thesis.
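
The following sketch conveys, in strongly simplified form, the kind of task ConEx automates: picking frequent terms from a text as candidate concepts and returning the sentences in which they appear. It is a naive frequency count on invented sentences, not ConEx's actual algorithm.

    # Naive illustration of salient-concept extraction by term frequency.
    # Invented example text; ConEx itself works differently.
    text <- c("A class defines attributes and methods.",
              "An object is an instance of a class.",
              "Each object stores its own attribute values.")

    # Tokenize, drop a few common words, and count term frequencies.
    tokens    <- unlist(strsplit(tolower(text), "[^a-z]+"))
    stopwords <- c("a", "an", "and", "is", "of", "the", "its", "own", "each")
    tokens    <- tokens[nchar(tokens) > 0 & !(tokens %in% stopwords)]
    freq      <- sort(table(tokens), decreasing = TRUE)

    # Terms occurring at least twice serve as candidate concepts ...
    concepts <- names(freq)[freq >= 2]

    # ... and the sentences containing each candidate are returned with it.
    sapply(concepts, function(w) text[grepl(w, tolower(text), fixed = TRUE)])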

Due to the historical development described above, the concept mapping editor was not used for the actual studies. However, CoMapEd has been used in other contexts; the resulting data are analyzed and presented in chapter 6. CoMaTo has been used for the analysis of the case studies and is, in part, a collection of the software originally used to analyze the studies.

¹ http://www.java.com
² http://www.r-project.org

ConEx was also developed too late to be of actual use in the studies, but a comparison between the results obtained manually in the case study and those of ConEx is presented.

The work in this thesis draws upon different research areas. Fig. 2 illustrates their interconnections with the research of this thesis. First, there is computer science, from which the methods of analysis, most notably data mining, are derived. The software developed in the course of this thesis is, in the end, also based on computer science, and computer science provides the context for the case studies. Second, there is educational research, which provides much of the theoretical background for the research questions this thesis poses; methods of investigating knowledge are also taken from this area. Third, there is didactics of informatics, which sets the context for the research questions of this thesis and provides theoretical background for the results. These results are in turn relevant in two ways: The results of the analysis in the case studies are relevant first and foremost for didactics of informatics, while the methods presented are also relevant for educational research in general.

This interplay of different areas is typical for research of the educational aspects of a specific subject. Shulman (1986, p. 10) identifies three types of knowledge relevant for teachers, namely “(a) subject matter knowledge, (b) pedagogical content knowledge, and (c) curricular knowledge”. While the latter is mostly covered by educational research, it is pedagogical content knowledge which is mostly the result of research in didactics of a specific subject matter (cf. Hubwieser et al. 2013, p. 1).

The terms computer science and informatics are used interchangeably throughout this thesis. In German, “Informatik” is often translated as computer science, whereas the term informatics often is seen as comprising more than just computer science. However, as this thesis is not concerned with the details of the different subjects, informatics and computer science are both taken to mean the German term Informatik, which is probably closer to the term informatics than to the term computer science, even though the latter is more widely used in English literature.

Also, for all software that has been used in the course of this work, a link to the relevant website is given as a footnote at the first appearance in the text.

Fig. 2: The areas of research that provide the basis for this thesis.

Part II

Theoretical Background and Related Work

3 Knowledge and Learning

“‘All men by nature desire to know.’ With these words, Aristotle began his Metaphysics. But what is knowledge? What do people have inside their heads when they know something? Is knowledge expressed in words? If so, how could one know things that are easier to do than to say, like tying a shoestring or hitting a baseball? If knowledge is not expressed in words, how can it be transmitted in language? How is knowledge related to the world? And what are the relationships between the external world, knowledge in the head, and the language used to express knowledge about the world?” (Sowa 1984, p. 1)

Since this thesis deals with the investigation of knowledge structures, it is inevitable to take a look at the concept of “knowledge”. Dating back to Aristotle and beyond, there have been countless attempts, theories, and results regarding the investigation and classification of knowledge. While some fields of research treat knowledge as an abstract entity of its own, others are concerned with memory rather than with knowledge. In one way or another, the two are equivalent, since knowledge and the reasoning about and inquiry into knowledge are always tied to the human brain as the creator, processor, distributor, and analyzer of knowledge.

The next three sections present the theoretical background and related work concerning the organization of knowledge, the acquisition of knowledge (learning) and the assessment of knowledge as far as it is relevant for this work. Psychological, neurological, and pedagogical results are taken into account.

3.1 Knowledge

This section presents relevant literature concerning the organization of knowledge. This encompasses both the organization of knowledge in the brain as well as the organization of knowledge outside of the human mind, as far as it is relevant for this work.

3.1.1 Psychological Foundations

Knowledge organization from a psychological or neurological point of view encompasses results and theories regarding the human brain and the way that knowledge is stored in and retrieved from it. Historically, theories about memory organization are often based on observing human behavior and subsequently establishing a model of (parts of) the mind that explains the observed phenomena. Alternatively, from a neurological point of view it is also possible to observe not human behavior but actual activity of the brain as an organ. For example, partitioning the brain into several lobes or regions (cf. Sousa 2009, p. 16) and measuring the activity of these regions during different tasks provides a theory about the structure of the mind as an organ. In contrast, the “information processing model” (cf. Sousa 2009, p. 38ff.) is a more abstract separation of the human mind into functional components (based on observation of humans) which may or may not map to actual regions of the brain. Following this popular model, the processing of information is based on the immediate memory, which is concerned with sensory input from the outside world (cf. Sousa 2009, p. 42), and the working memory, which is where the conscious act of thinking occurs (cf. Sousa 2009, p. 45). Both of these parts form the short-term memory, while storage and retrieval of information is the task of long-term memory (cf. Sousa 2009, p. 51).

Since this section is concerned with knowledge organization, it focuses on details of the long-term memory. Concerning the organization of this memory (or storage system), a general distinction can be made between declarative memory and non-declarative memory (cf. Novak 2002, p. 553). “Declarative memory (also called conscious or explicit memory) describes the remembering of names, facts, music, and objects [...], and is processed by the hippocampus and cerebrum. [...] Declarative memory can be further divided into episodic memory and semantic memory” (Sousa 2009, p. 81). Whereas episodic memory mostly deals with events of our lifetime and has a strong connection to the passing of time, it is the semantic memory that holds what is commonly called “declarative knowledge” in the context of learning and teaching, like facts or knowledge about concepts and their interrelations. Squire (1987, p. 152) describes the specifics of declarative memory even more succinctly: “Declarative memory is memory that is directly accessible to conscious recollection. It can be declared. It deals with the facts and data that are acquired through learning”. The separation of these types of memory in the brain has been shown in medical studies (cf. Goldstein & Vanhorn 2011, p. 158f.); however, there are also interrelations between episodic and semantic memory. For example, Goldstein & Vanhorn (2011, p. 160) note that “semantic memories that have personal significance are easier to remember than semantic memories that are not personally significant”. In addition to this, semantic memory can influence how people allocate their attention.

Non-declarative memory also encompasses several sub-categories, most notably procedural memory. “Procedural memory refers to the learning of motor and cognitive skills and remembering how to do something [...]” (Sousa 2009, p. 82). There are other classifications of the components of human memory as well, which are typically refinements of the general distinction made above. For example, Trumpower & Goldsmith (2004, p. 430) distinguish between propositional and configural knowledge, the former referring to facts and the latter meaning the interconnected structure of knowledge. In an attempt to summarize different studies that focus on the task of problem solving, de Jong & Ferguson-Hessler (1996) offer a classification system for types of knowledge. It distinguishes between the type and the quality of knowledge, which according to the authors form two separate dimensions. The types that are distinguished are situational, conceptual, procedural, and strategic knowledge. The qualities are level, structure, automation, modality and generality (de Jong & Ferguson-Hessler 1996, Table 1).

Conceptual knowledge, again, is defined as “static knowledge about facts, concepts and principles that apply within a certain domain” (de Jong & Ferguson-Hessler 1996, p. 107). Situational knowledge and strategic knowledge are concerned with “situations as they typically appear in a particular domain” (de Jong & Ferguson-Hessler 1996, p. 106) and “students organizing their problem-solving process” (de Jong & Ferguson-Hessler 1996, p. 107) respectively. The qualities of knowledge each represent a certain attribute that can take on different values (cf. de Jong & Ferguson-Hessler 1996, Table 1):

Level Is it surface or deep knowledge?

Structure Does the knowledge consist of isolated elements or is it structured knowledge?

Automation Is the knowledge declarative or compiled?

Modality Is the knowledge stored in verbal or pictorial form?

Generality Is the knowledge general or domain specific?

While surface level knowledge is “stored in memory more or less as a copy of external information” (de Jong & Ferguson-Hessler 1996, p. 107), deep level knowledge is “firmly anchored in a person’s knowledge base and [...] has been translated to basic concepts, principles, or procedures [...] [and] is different from the concrete appearance of the external information from which it stems” (de Jong & Ferguson-Hessler 1996, p. 107). The quality of structure distinguishes between elements of knowledge that are “loosely connected [...] or structured in a logical way” (de Jong & Ferguson-Hessler 1996, p. 108) and is related to the level of knowledge: “Only the introduction of deep elements makes possible the generalizations and abstractions that are required for the [...] building of a hierarchical structure” (de Jong & Ferguson-Hessler 1996, p. 108). Declarative knowledge is defined as above, in contrast to compiled knowledge that “is tacit, implicit, or not easily expressed” (de Jong & Ferguson-Hessler 1996, p. 109). Modality refers to the representation of knowledge “as a set of either propositions or images” (de Jong & Ferguson-Hessler 1996, p. 109). Generality refers to the fact that knowledge or strategies (e.g. heuristics) “may be general and domain independent [...] but frequently they are bound to a domain” (de Jong & Ferguson-Hessler 1996, p. 109).

The structure of organized knowledge is important for the development of learners and should therefore receive attention. “In general, structured knowledge enables inference capabilities, assists in the elaboration of new information, and enhances retrieval. It provides potential links between stored knowledge and incoming information, which facilitate learning and problem solving” (Glaser & Bassok 1989, p. 648). Trumpower et al. (2010, p. 5) also state that “experts possess more knowledge and, perhaps more importantly, better organize knowledge than novices”. The term knowledge structure or cognitive structure (cf. Ausubel 1968, p. 10) will be used to denote the structural organization of a person’s conceptual knowledge.

The very name “conceptual knowledge” and its definitions rely on the notion of a concept, which is also fundamental for the later definition of concept maps. From a psychological perspective, Solomon, Medin & Lynch (1999, p. 99) state that “[c]oncepts are the building blocks of thought. How concepts are formed, used, and updated are therefore central questions in ”. The following definition will be used: “A concept is a mental representation that is used for a variety of cognitive functions, including memory, reasoning, and using and understanding language.” (Goldstein & Vanhorn 2011, p. 240)

The mental representation can be of something that exists as a concrete object, like a tree (cf. Solomon et al. 1999, p. 101), or an abstract notion, like a disease (cf. Solomon et al. 1999, p. 100) or a color (cf. Goldstein & Vanhorn 2011, p. 245). Mental representations are usually connected to a mental function or process that the representation is useful for (cf. Goldstone & Kersten 2003, p. 603). For concepts, this process is often categorization: the classification of entities into categories. Categories and concepts are closely related. Categorization happens through concepts (cf. Solomon et al. 1999, p. 99); in other words, for each category there must also be a concept. The category then refers to the classified set of entities in the category, whereas the concept refers to the abstract notion of what this category entails. “Categorization is not an end in itself, but rather it serves to connect old to new: categorizing novel entities allows the cognitive system to bring relevant previous knowledge to bear in the service of understanding the novel entity. [...] Not only are new entities understood in terms of old, but new entities also modify and update concepts. That is concepts support learning” (Solomon et al. 1999, p. 99). Especially the first part is important for the theories of conceptual change that are presented below. Categories are organized in hierarchies (cf. Goldstein & Vanhorn 2011, p. 247). There are other functions beyond categorization as well that are based on concepts, like inference, reasoning (cf. Solomon et al. 1999, p. 99) and integration, which “refers to the process of finding a relationship that meaningfully links two concepts together” (Solomon et al. 1999, p. 102). The process of integration, in turn, “appears central to inductive reasoning” (Solomon et al. 1999, p. 103).

In general, the organization of concepts in memory is assumed to be pivotal for the quality of a person’s knowledge (cf. Trumpower et al. 2010, p. 5). Without a structured connection to other concepts, a concept will, in general, not be kept in long-term memory (cf. Sousa 2009, p. 88). As Ruiz-Primo & Shavelson (1996, p. 570) point out:

“Most cognitive theories share the assumption that concept interrelatedness is an essential property of knowledge. [...] As expertise in a domain is attained through learning, training and/or experience, the elements of knowledge become increasingly interconnected. [...] Assuming that knowledge within a content domain is organized around central concepts, to be knowledgeable in the domain thus includes having a highly integrated structure among these concepts.”

An expansion on the view of knowledge organization is schema theory. “A schema is a high-level conceptual structure or framework that organizes prior experience and helps us to interpret new situations. The key function of a schema is to provide a summary of our past experiences by abstracting out their important and stable components [...] a distinguishing feature of schemas is that they are structured mental representations made up of multiple components” (Gureckis & Goldstone 2011, p. 725). Schema theory holds that the mind possesses schemas for various situations that allow very fast information processing, as a schema is retrieved from memory as one chunk.

3.1.2 Epistemological Foundations

While the last section dealt with the organization of knowledge inside the brain, this section deals with how knowledge can be documented outside the human mind. This capacity is essential for teaching (cf. Cooke 1994, p. 801) or, more generally, for preserving the knowledge of each human generation for the next. What sets the epistemological perspective on knowledge apart from the psychological view is that the documentation of knowledge can also arise from the mental models of many persons (cf. Cooke 1994, p. 821) instead of just the mental model of one person. This aspect is central for this thesis. Also, as Cooke (1994, p. 802) notes, knowledge can be elicited from sources other than the human mind, e.g. texts - which is done in this thesis as well. “Knowledge elicitation is the process of collecting from a human source of knowledge, information that is thought to be relevant to that knowledge” (Cooke 1994, p. 802). It is often part of a process called knowledge acquisition. “The overall goal of knowledge acquisition is to externalize knowledge in a form that can be implemented in a computer” (Cooke 1994, p. 802). Often, though, the term is meant specifically in the context of expert systems and expert knowledge (cf. Shaw & Woodward 1990, p. 179). The term externalization of knowledge sometimes refers only to making tacit knowledge explicit (cf. Dierkes 2001, p. 495); in this work, however, externalization will refer to a general process of eliciting any form of knowledge - usually semantic, declarative knowledge - from humans regardless of their expertise.

Following the constructivist view presented below, it is paramount to acknowledge that: “[K]nowledge acquisition is modeling or construction, not mining. Therefore, the result of knowledge acquisition is a model. Like all models it represents the object of the modeling enterprise to different degrees of accuracy” (Cooke 1994, p. 802). Put differently, the object of interest - the mental model - is by itself unobservable. Instead, only a model of this model can be observed (cf. Shaw & Woodward 1990, p. 184). Each method of elicitation and each method of representation makes assumptions about the knowledge itself (cf. Shaw & Woodward 1990, p. 180). The process of externalization is also subject to uncertainties. The influences that incur these uncertainties are numerous, like problems in or the amount to which the knowledge that is to be externalized is compiled and more (cf. Cooke 1994, p. 803); personal variables, like the degree of introversion, have also been found to influence the process (cf. Hoffman, Shadbolt, Burton & Klein 1995, p. 146). While all methods inherently suffer from these uncertainties, the specific influences and their extent depend on the method and context. The same goes for the requirements on the elicitor (e.g. to be a subject-matter expert) or the analysis methods that are suitable afterwards (cf. Cooke 1994, p. 814).

There are many different methods of elicitation and ways of representation. Mandl & Fischer (2000) present an overview of several graphical representations. Hoffman et al. (1995) present a classification of different methods. Cooke (1994, p. 805ff.) also presents a classification that identifies three different families of elicitation techniques: “Observations and interviews”, “Process tracing”, and “Conceptual techniques”.

A technique of the last family “produces representations of domain concepts and their structure or interrelations” (Cooke 1994, p. 821) and is especially suited for the aggregation of knowledge from many persons (cf. Cooke 1994, p. 821) and for automation (Cooke 1994, p. 835), making such techniques especially relevant for this thesis. The technique of concept mapping, described in the next chapter, is a representative of the type “graph construction” that belongs to the group of “structural analysis” techniques in this family (cf. Cooke 1994, Table 3). It has been chosen as the method of externalization in this thesis. The reasons for this are given in chapter 4 and chapter 6.
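
To make the “graph construction” idea concrete, the following R sketch turns a handful of propositions (concept, linking phrase, concept) into a directed, labeled graph and computes a few basic structural measures. The igraph package and the toy propositions are assumptions made for illustration only; chapter 6 discusses the treatment of concept maps as graphs in detail.

    # Sketch of graph construction: propositions become a directed, labeled
    # graph. Uses the igraph package; the propositions are an invented example.
    library(igraph)

    propositions <- data.frame(
      from  = c("object", "class", "class"),
      to    = c("class", "attribute", "method"),
      label = c("is instance of", "has", "has"),
      stringsAsFactors = FALSE
    )

    g <- graph_from_data_frame(propositions, directed = TRUE)

    # Basic structural measures of the externalized map.
    vcount(g)   # number of concepts
    ecount(g)   # number of propositions
    degree(g)   # how strongly each concept is connected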

While the “differential access hypothesis” assumes that different methods of elicitation tap into different types of knowledge (cf. Hoffman et al. 1995, p. 142), the approach of externalizing the structure of conceptual knowledge is considered valuable beyond that specific type of knowledge: “Collectively, these studies suggest that [...] [assessing structural information of conceptual knowledge] allows valid inferences to be made about overall domain knowledge across a diverse array of domains, ranging from those that are more procedural (e.g., computer programming) to those that are more conceptual (e.g., evolution)” (Trumpower et al. 2010, p. 10).

The chosen representation also affects what types of knowledge can be documented. Giving an overview of all kinds of knowledge representations and their respective strengths and weaknesses is beyond the scope of this work, however. The chosen method of concept mapping determines the representation (concept maps) anyway. For the analysis methods, different representations of the same information are also used, based on the requirements of the methods.

3.2 Learning

“Obviously, all knowledge constructed in a discipline is first constructed in some individual’s cognitive structure. To understand how knowledge is constructed in any field, it is therefore essential to understand how individual human beings construct knowledge. To understand how an individual constructs his or her conceptual frameworks, we need to understand the psychology underlying human meaning making.” (Novak 2002, p. 562)

Beyond the organization of knowledge inside the mind, there are also the elementary processes of creating and altering this knowledge. Since humans are born with almost no knowledge whatsoever, everything must be developed through experience and learning. With regard to the process of learning, it is again possible either to focus on the mind or brain itself, or to focus on the person, possibly in a larger context that does not isolate a single person but acknowledges the society and environment in which the person lives while learning. While the psychological foundations are primarily concerned with the former, pedagogical learning theories are usually concerned with the latter. There are numerous learning theories, especially when also considering their historical developments. All theories are usually more or less heavily influenced by the zeitgeist of their date of creation. This section focuses on the basic psychological processes of learning and on currently established learning theories that are supported by these.

3.2.1 Psychological Foundations

Learning, especially concerning declarative knowledge, encompasses both the storing of new information and the reorganization or alteration of information already present in the memory. Following the distinction in memory organization, “the acquisition of new declarative knowledge, development of a cognitive skill, organization of knowledge into more effective representations, and discovery and inference of new information are differentiated forms of learning, and their characterization varies” (Glaser & Bassok 1989, p. 634).

The modification of knowledge, as also postulated by current learning theories, is fundamentally personal: “Learning is highly idiosyncratic and progresses over time” (Cañas & Novak 2006, p. 495). Knowledge is actively and subjectively constructed by each person’s memory. Nevertheless, with increasing expertise and depending on the subject domain, the structure of the created knowledge often shows a similar configuration across many persons (cf. Trumpower & Goldsmith 2004, p. 426f.). Even more, this configuration is typically similar to that of experts in the domain (cf. Ausubel 2000, p. 76). Trumpower & Goldsmith (2004, p. 441) also note that the structural organization of knowledge may have an influence on the ability of beginners to learn new material of a subject domain.

The addition of new information into long-term memory - a process that is often called retention (cf. Sousa 2009, p. 86) - is influenced by many factors outside of the actual memory (e.g. motivation). Concerning the actual processes in memory, the retention of conceptual knowledge is critically dependent on the rehearsing of new information (cf. Sousa 2009, p. 86). There are two types of rehearsal: rote rehearsal, which “is used when the learner needs to remember and store information exactly as it is entered into working memory” (Sousa 2009, p. 87), and elaborative rehearsal, for “when it is more important to associate the new learnings with prior learnings to detect relationships” (Sousa 2009, p. 87), see also (Goldstein & Vanhorn 2011, p. 174f.). This distinction is also made by the theory of meaningful learning, presented below. Learning in the context of educational research is often seen as elaborative learning. In this case, prior knowledge becomes important. “The prevailing view of cognitivists today is that humans store knowledge as associative networks of ideas, concepts, procedures, and other forms of knowledge. During learning, new knowledge is integrated into the network by linking it to semantically relevant prior knowledge” (Trumpower et al. 2010, p. 5).

With increasing training and expertise in a given field, conceptual knowledge becomes restructured and is then accessible in larger chunks of connected knowledge. Also, as skills develop, the necessary knowledge is compiled into procedures. As Glaser & Bassok (1989, p. 634) put it: “At various stages of learning, there exist different integrations of knowledge, different degrees of proceduralized and compiled skill, and differences in rapid access to memory [...]. These differences signal advancing expertise or possible blockages in the course of learning”.

3.2.2 Constructivism

Constructivism as a learning theory is best understood in opposition to the theories of Behaviorism and Cybernetics, in which the prevalent idea is that teaching is nothing more than a transfer from the outside world into the mind of the learner. As Glasersfeld (1983, p. 41) puts it:

“Ten or 15 years ago, it would have been all but inconceivable to subject educators or educational researchers to a talk that purported to deal with a theory of knowledge. Educators were concerned with getting knowledge into the heads of their students, and educational researchers were concerned with finding better ways of doing it. There was, then, little if any uncertainty as to what the knowledge was that students should acquire, and there was no doubt at all that, in one way or another, knowledge could be transferred from a teacher to a student. The only question was, which might be the best way to implement that transfer – and educational researchers, with their criterion-referenced tests and their sophisticated statistical methods, were going to provide the definitive answer.”

In contrast, the central aspect of Constructivism is that “[k]nowledge [...] cannot be imposed or transferred intact from the mind of one knower to the mind of another. Therefore, learning and teaching cannot be synonymous: we can teach, even well, without having students learn.” (Karagiorgi & Symeou 2005, p. 18)

Fundamentally, Constructivism is built upon two principles (Glasersfeld 1989, p. 162):

1. “knowledge is not passively received but actively built up by the cognizing subject” and

2. “the function of cognition is adaptive and serves the organization of the experiential world, not the discovery of ontological reality”.

The first principle, when followed strictly, does imply that there is no “true” knowledge, since all knowledge is only constructed inside a person’s mind and cannot, in general, be transferred in or out of that mind. Consequently, Glasersfeld (1983, p. 65) describes knowledge in the constructivist sense as: “A knowledge that fits observation. It is knowledge that human derives from experience. It does not represent a picture of the ‘real’ world but provides structure and organization to experience. As such it has an all-important function: it enables us to solve experiential problems”. It is, however, especially the second principle that separates Constructivism from other theories. “The revolutionary aspect of Constructivism lies in the assertion that knowledge cannot and need not be ‘true’ in the sense that it matches ontological reality, it only has to be ‘viable’ in the sense that it fits within the experiential constraints that limit the cognizing organism’s possibilities of acting and thinking” (Glasersfeld 1989, p. 162).

Historically, the theory is based on studies by Jean Piaget (Piaget 1929). Meanwhile, research has offered enough insight into the neurological mechanisms of learning that the constructivist theory can justifiably be seen as an explanatory model of human learning and knowledge construction. Sabitzer (2011) provides insights into the neurological aspects of teaching computer science, stating that learning is the process of subjectively constructing knowledge. Wittrock (1992, p. 536) states from the perspective of cognitive research: “[L]earning is not the internalization of information given to us whole by experience or analyzed by a teacher. Instead, learning consists of the active generation of meaning, not the passive recording of information.” Finally, Altenmüller, Gruhn, Parlitz & Liebert (2000, p. 49) present the results of a study in music education. The results show that the activation patterns of the brain differ between a group that was taught using “verbal explanations, visual aids, notations, verbal rules and some musical examples which were played to the subjects, but never sung or performed [and a second] group who participated in musical experiences for establishing genuine musical representations by singing and playing, improvising with corresponding rhythmic and tonal elements or performing examples from the musical literature”. The second group also showed superior results in an assessment task.

There are many consequences for teaching and educational research when adopting this theory. “Above all, it will shift the emphasis from the student’s ‘correct’ replication of what the teacher does, to the student’s successful organization of his or her own experience” (Glasersfeld 1983, p. 69). Teaching in a constructivist setting also implies that the knowledge structure of individuals may differ from what the physical world dictates or from what the experts of a field agree upon. In this case, it seems reasonable to assume that the learner will benefit from attempts that result in a modification of the knowledge structure. Constructivism also puts a limit on the results one can expect from assessing a person’s knowledge by comparing it to expert knowledge - which is also in accordance with the description of “deep” knowledge presented in the first section of this chapter. Radical Constructivism places difficult burdens on educational systems. The following two quotes from the same publication exemplify the difficulty of reconciling a constructivist mode of learning with the long-established tradition of education at, for example, universities.

“I now believe that the days of straight lecturing in introductory science courses are numbered. We can no longer afford to ignore the inefficiency of the traditional lecture method, regardless of how lucid or inspiring our lectures are. The time has come to offer our students in introductory science classes more than a mere regurgitation of printed material.” (Mazur 1996, p. 14)

“[...] in the sciences, as in the humanities, the first exposure to new material should come from reading printed material.” (Mazur 1996, p. 13)

3.2.3 Conceptual Change

“[E]ffective learning changes the way we see the world. The acquisition of information in itself does not bring about such a change, but the way we structure that information. Thus, education is about conceptual change, not just the acquisition of information.” (Biggs & Tang 2011, p. 23). The theory of conceptual change is typically attributed to Posner, Strike, Hewson & Gertzog (1982) and was revised by Strike & Posner (1992). Learning can, very generally, be described as change. A change of a person’s knowledge structure, a change of a person’s behavior or attitudes can all be the results of learning processes. The learning theories presented so far acknowledge this - for example, meaningful learning, as described in the next section, is by its very definition the change of an existing mental model. The idea of conceptual change, in contrast, is subtly different in that it puts the focus on the very act of change instead of just acknowledging its existence. The change of a person’s knowledge structure can be additive in the sense that an existing model of the world is expanded in the light of new experiences or facts, or it can require that the model be restructured or even completely abandoned because it is not compatible with new knowledge. Piaget calls these two ways of change assimilation and accommodation (cf. Geber 2006, p. 5), (cf. Goldstone & Kersten 2003, p. 603). Duit & Treagust (2003, p. 672) note that “[t]he most common analysis is that there are two types of conceptual change, variously called weak knowledge restructuring, assimilation or conceptual capture and strong radical knowledge restructuring, accommodation or conceptual exchange”.

It is especially this last part that plays an important role in conceptual change since, naturally, there is in general an impediment to restructuring or abandoning one’s mental models. This becomes especially important in science education, since typically everybody has a preconceived notion about how, for instance, the physical world works. As Duit & Treagust (2003, p. 671) put it: “Findings from many studies over the past three decades show that students do not come into science instruction without any pre-instructional knowledge or beliefs about the phenomena and concepts to be taught. Rather, students already hold deeply rooted conceptions and ideas that are not in harmony with the science views or are even in stark contrast to them”. Such a model that is not in harmony with the science view has been given several names (cf. Novak 2002, p. 555), often “misconception”. Novak (2002, p. 555) suggests “Limited or Inappropriate Propositional Hierarchies (LIPH’s)”.

Just like other learning theories, the theory of conceptual change has direct implications for teaching. “The classical conceptual change approach involved the teacher making students’ alternative frameworks explicit prior to designing a teaching approach consisting of ideas that do not fit the students’ existing ideas and thereby promoting dissatisfaction. A new framework is then introduced based on formal science that will explain the anomaly” (Duit & Treagust 2003, p. 673). However, Novak (2002, p. 562) also points out that, although the process of instruction can influence conceptual change, it is inherently a decision of the learner: “What becomes central to ‘conceptual change’ from my perspective is the necessity for meaningful learning to occur. [...] The fundamental challenge to ‘conceptual change teaching’ is therefore to help learners understand how they must choose to modify their concept and propositional hierarchies [...]. Changing their ‘conceptual ecology’ requires that the learner recognize explicit ways where their concept/propositional frameworks are limited, inappropriate or poorly organized into hierarchies”. Glaser & Bassok (1989, p. 642) note that “conceptual change is self-directed, in the sense that humans are intrinsically motivated to understand the world around them”.

What is especially interesting is the event of a major change in a knowledge structure. Threshold concepts provide a theory for this. Meyer & Land (2006, p. 3) describe them as: “[A]kin to a portal, opening up a new and previously inaccessible way of thinking about something. It represents a transformed way of understanding, or interpreting, or viewing something without which the learner cannot progress. As a consequence of comprehending a threshold concept there may thus be a transformed internal view of subject matter, subject landscape, or even world view”. For example, the concepts of complex numbers and limits are seen as threshold concepts in mathematics (cf. Meyer & Land 2006, p. 5). The characteristics of threshold concepts within a discipline are as follows (cf. Kinchin, Cabot, Kobus & Woolford 2011, p. 2):

Transformative The learning results in a change of , values, or attributes.

Irreversible The change is unlikely to be forgotten.

Integrative New interrelations to other concepts of the subject are discovered.

Bounded The concept is pivotal to an area of an academic discipline.

(Potentially) troublesome The concept typically poses problems for students when they are presented with the new perspective that it offers.

Threshold concepts therefore offer a new aspect to the theory of conceptual change, insofar as they provide a reasoning for problems in learning that is largely independent of the prior knowledge and centered only on the characteristics of the threshold concept itself.

Since the focus of conceptual change lies on the change of knowledge structures, the theory must also make assumptions about the organization of knowledge. Özdemir & Clark (2007) present an overview of two opposing theories on conceptual change regarding the structure of mental models, dubbed “knowledge-as-theory” and “knowledge-as-elements”.

Knowledge-as-theory means that the mental model is “an overarching hierarchical conceptual structure with theory-like properties that constrains a student’s interpretation of subordinate models and ideas” (Özdemir & Clark 2007, p. 352).

“This theory-like knowledge is hypothesized to involve coherent structures grounded in persistent ontological and epistemological commitments. Because novices unconsciously develop these coherent structures through collections of daily experiences, their ‘theories’ are not available for hypothesis testing in a manner similar to scientists’ theories. However, novices’ alternative conceptions do constrain future learning and allow novices to make consistent predictions across conceptual domains. Knowledge-as-theory perspectives hypothesize revolutionary change in knowledge structures through various mechanisms. [...] [T]hey all assert that learners at any given time maintain a small number of well-developed coherent naïve theories based on their everyday experiences and that these theories have explanatory power to make consistent predictions and explanations across significant domains.” (Özdemir & Clark 2007, p. 354)

Conversely, knowledge-as-elements describes a structure in which “elements interact with each other in an emergent manner where the combinatorial complexity of the system constrains students’ interpretations of phenomenons” (Özdemir & Clark 2007, p. 352). This corresponds to the quality of structure in the classification by de Jong & Ferguson-Hessler (1996), presented in section 3.1.1.

“[K]nowledge-as-elements perspectives hypothesize that naïve knowledge structures consist of multiple conceptual elements including, but not limited to, phenomenological primitives, facts, facets, narratives, concepts, and mental models at various stages of development and sophistication. Novices spontaneously connect and activate these knowledge pieces according to the relevance of the situation. During the conceptual change process, the elements and interactions between the elements are revised and refined through addition, elimination, and reorganization to strengthen the network. From this perspective, conceptual change involves a piecemeal evolutionary process rather than a broad theory replacement process” (Özdemir & Clark 2007, p. 355).

3.2.4 Meaningful Learning

David Ausubel proposed a learning theory that deals with the differences between rote learning of information and meaningful learning. It is also fundamentally tied to concept mapping, which is presented in the next chapter. In short, a meaningful learning process can be defined as follows:

“The essence of the meaningful learning process [...] is that new symbolically expressed ideas (the learning task) are related in a nonarbitrary, and nonverbatim fashion, to what the learner already knows (his cognitive structure in a particular subject-matter field), and that the product of this active and integrative interaction is the emergence of a new meaning reflecting the substantive and denotative nature of this interactive product.” (Ausubel 2000, p. 67f.)

Following the distinctions made for the mental process of rehearsal above, the theory of meaningful learning acknowledges rote learning and meaningful learning as two different learning activities with different expected outcomes. The central idea of opposing modes of learning, with rote learning being one of them, can be found elsewhere as well. For example, Biggs & Tang (2011) distinguish between “surface” and “deep” learning, with similar characteristics. Also, the learning process as observed by neurologists seems to validate the central idea of meaningful learning: “we only memorize what’s good and meaningful for us, learning is especially effective, when it makes sense. [...] we are learning by making associations and linking new information to current knowledge” (Sabitzer 2011, p. 168 f.). Meaningful learning results in deep, structured knowledge in the sense of the classification by de Jong & Ferguson-Hessler (1996) presented in section 3.1.1: “‘Meaningful Learning’, by definition, involves the acquisition of new meanings. New meanings, conversely, are the end-products of meaningful learning. That is, the emergence of new meanings in the learner reflects the prior operation and completion of a meaningful learning process” (Ausubel 2000, p. 67).

The above definition contains two prerequisites of a successful meaningful learning task: the learners themselves must choose to relate new information to their current cognitive structure in a nonarbitrary and nonverbatim fashion, which in turn requires that the information presented to them is potentially meaningful, so that there actually is a way in which the information can be related to the learners’ cognitive structures. These two prerequisites are called meaningful learning set and potential meaningfulness (cf. Ausubel 2000, p. 68). To possess potential meaningfulness, information must be logically meaningful. This means that it must be “sufficiently nonarbitrary itself (i.e. nonrandom, plausible, sensible) so that it could be related on a nonarbitrary and nonverbatim basis to correspondingly relevant ideas that lie within the realm of what human beings are capable of learning” (Ausubel 2000, p. 69). Whereas logical meaningfulness is an inherent property of the information and the way it is presented to a learner, potential meaningfulness is a property of the learner, or more specifically, of the learner’s knowledge structure. If there is no way that a piece of information can be related meaningfully to a certain cognitive structure, then the person to whom this structure belongs will not be able to learn this information meaningfully, regardless of whether the person chooses to learn meaningfully or not (cf. Ausubel 2000, p. 70).

Concerning the actual learning process, Ausubel (2000, p. 84ff.) distinguishes between three types of learning:

1. Vocabulary or representational learning, that describes “learning the meanings of single words, or learning what single words represent”.

2. Concept learning, which describes the acquisition of a new concept and can happen in two ways:

(a) Concept formation, in which “the criterial attributes of the concept are acquired as a consequence of direct experience through successive stages of hypothesis generation, testing, and generalization”.
(b) Concept assimilation, during which “the criterial attributes of new concepts can be ascertained by use in new combinations of existing referents (words as well as images) available in the child’s cognitive structure”.

3. Propositional learning, which describes a task that “consists of a composite idea and is expressed verbally in a sentence containing both denotative and connotative word meanings and the syntactic functions of and relations between words”.

Concepts are defined as “objects, events, situations, or properties that possess common criterial attributes and are designated by the same sign or symbol” (Ausubel 2000, p. 88). Also, concept formation, according to Ausubel (2000, p. 88), is mostly relevant for young children. For this work, only concept learning and propositional learning are relevant.

Each learning process, if done meaningfully, will integrate new ideas into an existing cognitive structure. This integration can follow certain hierarchical patterns (cf. Ausubel 2000, p. 89ff.), of which the following three are relevant for concept assimilation and propositional learning: Subordinate learning or subsumption describes the integration of new information under an already existing superordinate element of a cognitive structure. In contrast, superordinate learning occurs when several existing elements of a cognitive structure are subsumed under a newly learned idea. Finally, if neither of the two patterns is applicable to the result of a meaningful learning process, the learning is called combinatorial learning. A succession of subsumptive learning processes, i.e. starting with a general concept, is called progressive differentiation. The opposite, a succession of superordinate learning processes starting from a very specific concept, is called integrative reconciliation (cf. Ausubel 2000, p. 102).

The underlying idea of these learning processes, especially concept assimilation, is “that new meanings are acquired by the interaction of new, potentially meaningful ideas (knowledge) with previously learned concepts and propositions. This interactional process results in a modification of both the potential meaning of the new information and of the meaning of the concepts or propositions to which it is anchored, and also creates a new ideational product which constitutes its new meaning to the learner” (Ausubel 2000, p. 102). It is relevant for education, as “[c]lassroom or subject-matter learning is primarily concerned with the acquisition, retention, and use of large bodies of meaningful information such as facts, propositions, principles, and vocabulary in the various disciplines” (Ausubel 2000, p. 67). Since the information is meaningful, it should be learned as such. Additionally, Ausubel (2000, p. 77) points out the importance of meaningful learning in education “because it is the human mechanism par excellence for acquiring and storing the vast quantity of ideas and information represented by any field of knowledge”. Furthermore, rote learning does not remediate misconceptions held by a learner (cf. Novak & Cañas 2010, p. 1), as it is by definition always an assimilation of knowledge and not a radical restructuring.

3.2.5 Models of Learning

Every learning theory typically has an underlying model of how persons actually learn. This model is not always explicitly stated, though. For example, the theory of meaningful learning is based around the model of a person who must be willing to integrate new material in a non-arbitrary and non-verbatim fashion. If this integration occurs, the person has learned (meaningfully). There are, however, several explicit models of how a person learns. All of these models have in common that learning is seen as a personal change that occurs if a certain set of preconditions is met. Learning models are relevant since they can help to explain artifacts in the knowledge structures of persons when monitoring an educational process, as shown later in the first case study in chapter 10. Also, many learning models stress the importance of existing prior knowledge. It is therefore paramount for teaching and learning to take this into account, which requires identifying the existing knowledge.

Kolb & Fry (1975) and Kolb (1984) describe a basic model of learning that is based on the principles of Constructivism presented above - explicitly on the work of Lewin, Dewey and Piaget. According to this model, learning progresses through four stages as displayed in Fig. 3. It begins with a concrete experience of, for example, a phenomenon in the real world. This triggers a reflective observation of the encountered phenomenon. The result of this stage is an abstract conceptualization, which in turn is refined and validated through active experimentation (cf. Kolb 1984, p. 39). The model is influenced by Constructivism since it emphasizes that the person is in charge of creating the knowledge subjectively and since the process of learning originates in a concrete experience. The four stages form the basis of four learning styles that Kolb defines based on a person’s preference for one of the different aspects of the learning cycle. As Hay, Kinchin & Lygo-Baker (2008, p. 296) state: “The approach subsumes the notions of difference, since it suggests that, while different people may have different affinities for one part or other of the cycle, ultimately learning occurs only when the cycle as a whole is complete.”

Jarvis (1992) and more recently Jarvis (2012) present a complex model of (adult) learning which emphasizes that learning is a personal endeavor that will result in a change of the learner, if successful.

Hay et al. (2008) integrate the ideas of the learning models of Kolb, Jarvis and the theory of meaningful learning into a more general model in order to measure the quality of learning. It is shown in Fig. 4. They identify several possible outcomes of learning that are visible in concept maps (see next chapter). Aside from the two extremes of rote learning and meaningful learning, there are also other possible developments of the knowledge. For example, rote learning can lead to the point where new concepts cannot be integrated into the current knowledge structure, leading to a collapse and a subsequent re-emergence of the structure in a different and (hopefully) more meaningful configuration.

Fig. 3: Kolb’s learning cycle

There are several other models of learning as well. For example, Wittrock (1992, p. 532) suggests a generative learning model in which “comprehension and understanding result from the processes of generating relations both among concepts and between experience of prior learning and new information. [...] This active generation process is quite different from the process of getting learners to store information for reproduction on lists”. The model, again, indicates the importance of existing knowledge.

3.3 Assessment

Learning is a process of changing the mind, and the two processes of learning and organization of knowledge - as presented in the last two sections - are intricately related. Assessment is seen, in this work, in the specific context of education. It usually entails a scoring or grading of some externalized (see section 3.1.2) artifacts, like a response to a math problem, an essay, or a picture. Typically, the person assessed is not assumed to be an expert. As has been mentioned before, the externalization of knowledge requires a form of representation. “How do we assess and represent an individual’s knowledge? [...] The two processes, assessment and representation, are obviously related. In our view the approach to representation is more fundamental in that assumptions regarding the organization of knowledge have implications for how we assess knowledge” (Goldsmith & Johnson 1990, p. 241).

Fig. 4: A model of learning, adapted from Hay et al. (2008, Fig. 7). All boxes with no outgoing arrows represent final states of the model: the two green ones on the left represent an outcome in which the student has learned something; the red box on the right represents an outcome of non-learning.

An ideal assessment task “is objective and reliable, minimizes the influence of context on responses, and captures something of the structural nature of the subjects’ knowledge” (McClure, Sonak & Suen 1999, p. 476). Additionally, any method of assessing knowledge should be both valid and reliable. “Reliability is an expression of the proportion of the variation among scores that are due to object of measure. As variation due to error goes to zero, the reliability of an assessment goes to 1” (McClure et al. 1999, p. 477). Put differently: “A test is reliable when individuals with the same fund of knowledge obtain the same scores, or if a given individual obtains the same scores when the exam is repeated with no change of knowledge occurring between tests” (Novak 2010, p. 221). A valid test measures the construct that it is supposed to measure (cf. McClure et al. 1999, p. 478).
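Read in classical test theory terms, the quoted description corresponds to the familiar variance-ratio definition of reliability. The formula below is added purely as an illustration under that assumption; it is not given in this form by McClure et al. (1999):

\[ \text{reliability} = \frac{\sigma^2_{\text{object of measure}}}{\sigma^2_{\text{object of measure}} + \sigma^2_{\text{error}}} \]

so that the reliability approaches 1 as the error variance approaches 0.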

The next two sections present two major ways of defining the desired results of an assessment. The first focuses on small, specifically defined, testable items; the second puts more emphasis on complex interactions.

3.3.1 Learning Objectives

One way of characterizing a desired outcome that can be assessed is by formulating explicit learning objectives. A learning objective consists of a verb and a noun. “The verb generally describes the intended cognitive process. The noun generally describes the knowledge students are expected to acquire or construct” (Anderson & Krathwohl 2001, p. 4f.). This definition is in accordance with others found previously in the literature (cf. Anderson & Krathwohl 2001, p. 12). The noun and verb of the learning objective can be used directly in order to construct an assessment. For example, the learning objective “the student remembers that the second world war ended in 1945” can be directly formulated into an assessment question: “when did the second world war end?”

3.3.1.1 Taxonomies

When looking at different learning objectives, it is usually self-evident that some are more difficult to achieve than others. Partly, this difficulty arises from the specific element of knowledge that is part of each learning objective. But there is also a difference in difficulty arising from the cognitive process. Clearly, remembering that the second world war ended in 1945 is less complex than the learning objective “the student is able to divide two polynomials”. These two objectives differ in the cognitive function that is required (remembering a fact versus applying an algorithm) and in the type of knowledge that is required (a single fact versus an algorithm or procedure).

Bloom (1956) was the first to propose a taxonomy of learning objectives by classifying the cognitive process into six categories: Knowledge, Comprehension, Application, Analysis, Synthesis and Evaluation. This taxonomy is “[o]ne of the most enduring and useful models” (Sousa 2009, p. 248), especially since Anderson & Krathwohl (2001) present a revised version of Bloom’s taxonomy of learning objectives. In this revision, a second dimension is introduced that classifies the element of knowledge that is part of each learning objective. Also, to correspond better to the actual wording of learning objectives, Bloom’s original dimension has been rephrased into verbs and the last two categories were interchanged.

The knowledge dimension categorizes different types of knowledge appearing in learning objectives. The types are factual-, conceptual-, procedural-, and metacognitive knowledge (cf. Anderson & Krathwohl 2001, p. 27). The second dimension deals with the cognitive process that is needed for a learning objective. The possible cognitive processes are: remember, understand, apply, analyze, evaluate and create (cf. Anderson & Krathwohl 2001, p. 30).

Taking, for example, the learning objective “the student is able to sort a list of numbers using Quicksort”: In this case, “being able to sort” points to an application of knowledge, so the cognitive process dimension would be “apply”. The knowledge element is “Quicksort” which, arguably, is a form of “procedural knowledge”. Hence, this learning objective would be classified as “applying procedural knowledge”.
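To make this two-dimensional classification concrete, the following short Python sketch represents the grid programmatically. The category names are taken from the text above; the function name, the data structures, and the example calls are illustrative assumptions and not part of Anderson & Krathwohl’s taxonomy itself.

KNOWLEDGE_TYPES = ("factual", "conceptual", "procedural", "metacognitive")
COGNITIVE_PROCESSES = ("remember", "understand", "apply", "analyze", "evaluate", "create")

def classify(process, knowledge):
    """Place a learning objective into one cell of the two-dimensional grid."""
    if process not in COGNITIVE_PROCESSES:
        raise ValueError("unknown cognitive process: " + process)
    if knowledge not in KNOWLEDGE_TYPES:
        raise ValueError("unknown knowledge type: " + knowledge)
    return (process, knowledge)

# The two objectives discussed above:
classify("remember", "factual")    # "... remembers that the second world war ended in 1945"
classify("apply", "procedural")    # "... is able to sort a list of numbers using Quicksort"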

Particularly in the context of computer science, Fuller, Johnson, Ahoniemi, Cukierman, Hernán-Losada, Jackova, Lahtinen, Lewis, Thompson, Riedesel & Thompson (2007) argue that the taxonomy of Anderson and Krathwohl is lacking a distinction between more practically oriented cognitive processes and more theoretically oriented ones. They propose a three-dimensional taxonomy by further splitting up the cognitive process dimension into a plane, with the categories remember, understand, analyze, and evaluate forming one dimension (called “Interpreting”). The second dimension (called “Producing”) is made up of the three levels: none, apply, and create. They continue to describe certain “pathways” through this two-dimensional plane that are typical for certain approaches, like “trial and error” (cf. Fuller et al. 2007, p. 164).

3.3.2 Learning Outcomes

While the terms “learning objective” and “learning outcome” are sometimes used synonymously, in this work a learning outcome is a more complex artifact that has been created as the result of an assessment task (or learning process) and has not been defined a priori by the assessor - in contrast to learning objectives. A learning outcome could thus be, for example, a report written by a student which is then assessed or analyzed. The next section presents a way of characterizing and classifying such an outcome with regard to the knowledge and abilities that went into its creation. Competencies, which are briefly described afterwards, take this idea even further.

3.3.2.1 SOLO Taxonomy

Biggs & Collis (1982) describe a taxonomy that - instead of classifying learning objectives based on their difficulty - focuses purely on the structure of the actual learning outcome. The taxonomy is concerned with reception learning of existing knowledge and is based on the theory of meaningful learning (cf. Biggs & Collis 1982, p. 3). It also puts more emphasis on the learner, instead of the learning material. “The difference, essentially, is that the Bloom levels are a priori ones, imposed in advance by the teacher, whereas we would prefer to use levels that arise ‘naturally’ in the understanding of the material” (Biggs & Collis 1982, p. 13).

Table 3.1 shows the taxonomy and some explanations of the levels. Capacity “refers to the amount of working memory, or attention span, that the different levels of SOLO require” (Biggs & Collis 1982, p. 26). The relating operation “refers to the way in which the cue and response interrelate” (Biggs & Collis 1982, p. 26). Biggs & Collis (1982) additionally give the attribute of “Consistency and closure”, referring to the felt need of the learner to come to a conclusion that is consistent with the data and other possible conclusions (cf. Biggs & Collis 1982, p. 27), which increases with the levels of the taxonomy.

While learning objectives are formulated in a manner that allows direct testing of them and therefore also of the different levels of the taxonomy, using the SOLO taxonomy requires a different approach. The assessment must be designed in such a way as to elicit a response that accurately captures the level of the respondent (cf. Biggs & Collis 1982, p. 177). Failure to do so will typically result in observed levels being too low. Higher levels of the SOLO taxonomy require elaborate reasoning and the inclusion of external facts of knowledge.

SOLO Level | Capacity | Relating operation
Prestructural | Minimal: cue and response confused | Denial, tautology, transduction. Bound to specifics.
Unistructural | Low: cue and one relevant datum | Can “generalize” only in terms of one aspect.
Multistructural | Medium: cue and isolated relevant data | Can “generalize” only in terms of a few limited and independent aspects.
Relational | High: cue and relevant data and interrelations | Induction: Can generalize within given or experienced context using related aspects.
Extended Abstract | Maximal: cue and relevant data and interrelations and hypotheses | Deduction and induction. Can generalize to situations not experienced.

Table 3.1: The levels of the SOLO taxonomy as described by Biggs & Collis (1982, Table 2.1)

Failure to accommodate this fact in the design of an assessment will often lead to less elaborate answers than a respondent could produce.

Even though the focus of the SOLO taxonomy is on the response of a person, the assigned level is still not considered a personal attribute: “The SOLO taxonomy makes no attempt to infer a cognitive processing level although it might be argued that to perform at a relational level or an extended abstract level involves greater cognitive processing than that required for unistructural or multistructural since the learners not only have to be able to recall items, they have to show the relationship among items (relational) and draw conclusions (extended abstract)” (Fuller et al. 2007, p. 155).

3.3.2.2 Competencies

Taking the idea of e.g. the SOLO taxonomy even further, a competency is a personal attribute that can be measured and that is based solely on an actual (observable) outcome. The idea behind competencies and competence based assessment is that “in a modern industrial society, education and professional qualifications can no longer be described according to a rigid canon of knowledge in specific subjects passed on from generation to generation. Instead building competencies has been identified as the main objective of education” (Klieme et al. 2008, p. 3).

In his seminal publication, McClelland (1973, p. 7) criticized a testing mentality that is focused on intelligence or general aptitude tests and suggested instead that “[i]f you want to test who will be a good policeman, go find out what a policeman does. Follow him around, make a list of his activities, and sample from that list in screening applicants.” There, already, one of the central ideas of competence based assessment is visible: testing should be done as close to real-life scenarios as possible: “It seems wiser to abandon the search for pure ability factors and to select tests instead that are valid in the sense that scores on them change as the person grows in experience, wisdom, and ability to perform effectively on various tasks that life presents to him” (McClelland 1973, p. 8).

There is no single accepted definition of competence. Specifically for large scale assessments, competence can be defined as “context-specific cognitive dispositions that are acquired by learning and needed to successfully cope with certain situations or task in specific domains” (Klieme et al. 2008, p. 9). This, in turn, is based on a definition given by Weinert (2001, p. 27f.), which lists several components of competencies - among them knowledge, but also e.g. volition. By measuring competencies in authentic (though maybe simulated) real world tasks, a greater validity is reached than by purely assessing knowledge, which may be rote-learned (cf. Klieme et al. 2008, p. 9). “Novices can know a principle, or a rule, or a specialized vocabulary without knowing the conditions of effective application. In contrast, when experts access knowledge, it is functional or bound to conditions of applicability. [...] The progression from declarative knowledge to well-tuned functional knowledge is a significant dimension of developing competence” (Glaser & Bassok 1989, p. 635).

Education with the goal of acquiring competence must aim for the students to choose to learn meaningfully. “Beginners’ knowledge of a domain is spotty, consisting of isolated definitions and superficial understandings of central terms and concepts. As proficiency develops, these items become structured and are integrated with past organizations of knowledge” (Glaser & Bassok 1989, p. 647). It is obvious, though, that while declarative knowledge is in most cases a prerequisite of the competent performance of complex skills, competence is not a direct consequence of declarative knowledge. For example, the knowledge required to evaluate a mathematical proof and the knowledge required to generate one are nearly equivalent, but the execution of each skill is different (cf. Glaser & Bassok 1989, p. 653).

Assessing competencies is a non-trivial task. It involves defining a cognitive model of competencies, which is difficult because of the contextualized nature of competencies (cf. Klieme et al. 2008, p. 10). It also requires defining suitable psychometric models that relate the theoretical construct of competences to empirical assessments (cf. Klieme et al. 2008, p. 12). These two have to be “translated into concrete empirical measurement procedures” (Klieme et al. 2008, p. 13). “Valid competence models basically serve two goals: they are used to define educational standards and to measure the development of competences in diverse settings” (Linck et al. 2013, p. 1). Currently, there exists only a limited number of theoretically defined competence models - for computer science, the MoKoM project tries to establish a model for certain aspects of the scientific field (cf. Linck et al. 2013).

Nevertheless, conceptual knowledge is, in general, an important component of competence. “Even in highly procedural domains, acquiring knowledge of concept relatedness may improve subsequent performance” (Trumpower & Goldsmith 2004, p. 443). The idea of competencies does have an influence on instruction, however, insofar as “all investigators agree that useful knowledge is not acquired as a set of general propositions, but by active application during problem solving in the context of specific goals” (Glaser & Bassok 1989, p. 659). Additionally, differing skills require differing amounts of declarative knowledge, which should be reflected in the teaching (cf. Glaser & Bassok 1989, p. 660).

4 Concept Maps

Historically, concept maps¹ were invented in the 1970s as a tool to help structure and visualize the responses of children in clinical interviews (cf. Novak & Cañas 2010, p. 1). The interviews were part of a twelve-year research project. It investigated the premise “that substantive learning of basic science concepts was possible — if quality instruction could be offered” (Novak & Musonda 1991, p. 118) by providing interactive lessons on audio tapes to first and second grade children. The researchers then used a technique based on Piaget’s clinical interviews (Piaget 1929) to monitor the change of the knowledge structures of the participants over the course of twelve years. To help evaluate the interviews and extract the information about relevant knowledge structures, a visual representation of the occurring science concepts and their connections was developed, which in the end became concept maps.

“Prior to development of the concept mapping technique, our analyses of conceptual change focused primarily on the kind and frequency of propositions made by interviewees before and after relevant instruction. We found this to be useful, but there was still a lack of clarity as to how concept meanings were related to one another over the whole domain of knowledge represented in the interview. The construction of concept maps permitted us to begin with the most general, most inclusive concept dealt with in the interview and to show propositional structures in a hierarchical arrangement, also illustrating important interrelationships among concepts included in different interviewee statements.” (Novak & Musonda 1991, p. 126)

Later on, the use of concept maps shifted from a specific technique for data analysis to a general technique for learning, teaching, and assessing structural knowledge. Usually, the learners create the concept maps themselves in these settings (cf. Novak & Cañas 2008, p. 5, p. 11ff.). The technique of concept mapping is fundamentally based on the ideas of Constructivism and meaningful learning (cf. Novak & Musonda 1991, p. 126). To this day, concept maps have been successfully used as learning and teaching aids as well as for the assessment and investigation of persons’ knowledge structures in countless scenarios, studies, and subject domains, see e.g. (Al-Kunifed & Wandersee 1990). Novak & Cañas (2010) present an in-depth review of the relevant literature and many areas of application. Also, there is a biennial international conference solely dedicated to concept mapping². Attempting to give an exhaustive review of the relevant literature is beyond the scope of this work. The subject domains found range from human resource development (Daley, Conceicao, Mina, Altman, Baldor & Brown 2010) and political science (Zimmaro, Zappe, Parkes & Suen 1999) through mathematics (Ozdemir 2005), biology (Kinchin 2000), training in dental medicine (Kinchin & Cabot 2009) and nursing (Akinsanya & Williams 2004), physics (Mistades 1999), and didactics of informatics (Gouli 2007) to computer programming (Keppens & Hay 2008), computer science (Sanders, Boustedt, Eckerdal, McCartney, Moström, Thomas & Zander 2008), (Ertl & Mok 2010), and beyond. Also, concept maps have been used for other tasks in the context of education, like the planning (Novak & Cañas 2008) or comparison (Anohina-Naumeca, Graudina & Grundspenkis 2012) of curricula. The next sections present the related work concerning the definition of concept maps and their applications in more detail.

¹ The term “concept map” has been trademarked, serial number 75230079, registered at the United States Patent and Trademark Office but abandoned in 1998.
² See http://cmc.ihmc.us

4.1 Elements

Fig. 5 shows an example of a concept map by Novak (2010). It contains three elements of interest: Above all, there is a focus question. Its purpose is to define a context and to help focus on relevant aspects more easily (cf. Novak & Cañas 2008, p. 11). The rest consists of graphical boxes that all contain a label and represent a concept. Concepts are defined as “perceived regularities or patterns in events or objects, or records of events or objects, designated by a label” (Novak 2010, p. 25). Two concepts that are linked by a connection form the basis of a proposition. “When two or more concepts are related by the use of what we will call linking words, propositions are formed. These become the fundamental units of meaning stored in our cognitive structure.” (Novak 2010, p. 26). A proposition is composed of the two concepts and the label of the connection itself. Originally, only those links that are meant to be “read” either horizontally or upwards should show an arrow-head (cf. Åhlberg 2004, p. 26). However, this is clearly not the case in the concept map shown in Fig. 5. Instead, all links without an arrow-head should be read from top to bottom (cf. Safayeni, Derbentseva & Cañas 2005, p. 743). To avoid misinterpretation, it is also customary to always use arrow-heads, though. The basic rules for creating a concept map are summarized by Hay & Kinchin (2006, p. 129) as follows:

Fig. 5: A concept map illustrating the concept “proposition” and its relation to knowledge. Adapted from Novak (2010, p. 26)

• “The concepts that an individual deems important in illustrating their personal understanding of a topic are placed in text-boxes and arranged hierarchically on a page (so that broad and inclusive concepts are at the top and detail or illustrative example, at the bottom).”

• “Concepts are then linked with arrows that are annotated with “linking statements” to explain the nature of the link.”

• “Concepts may be listed only once, but any number of links may be made between any number of concepts at any number of conceptual links.”

A concept map, in its original intent, is meant to answer a focus question (cf. Valerio, Leake & Cañas 2008, p. 122). Research has shown that the formulation of the focus question has a direct impact on the structure of the resulting maps (cf. Cañas & Novak 2012, p. 250). Also, it is necessary to define the context of the map (cf. Cañas, Carff, Hill, Carvalho, Arguedas, Eskridge, Lott & Carvajal 2005, p. 207), since concepts often have different meanings depending on the context, for example the concept “class” in the contexts of object orientation, biology, or education. However, when the context is clear to participants, the focus question is often more a task than an actual question that is meant to be answered by the map, which “usually leads to a descriptive concept map instead of an explanatory map” (Cañas & Novak 2012, p. 250). Omitting a specific focus question is especially appropriate for evaluating prior knowledge (cf. Cañas & Novak 2012, p. 251). Also, it is not necessarily seen as part of the concept map itself (cf. Cañas & Novak 2006, p. 496).
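The elements described so far (concepts, linking phrases, propositions, and an optional focus question) can be captured in a small data structure. The following Python sketch only illustrates these definitions; it is not the software developed in this thesis, and the class and field names are assumptions made for the example.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Proposition:
    source: str          # label of the first concept
    linking_phrase: str  # the linking words that annotate the connection
    target: str          # label of the second concept

    def as_statement(self):
        # A proposition reads as "<concept> <linking phrase> <concept>".
        return "%s %s %s" % (self.source, self.linking_phrase, self.target)

@dataclass
class ConceptMap:
    focus_question: str = ""                    # may be empty, e.g. for prior-knowledge maps
    concepts: set = field(default_factory=set)
    propositions: list = field(default_factory=list)

    def add_proposition(self, source, phrase, target):
        self.concepts.update({source, target})
        self.propositions.append(Proposition(source, phrase, target))

# Hypothetical usage:
cmap = ConceptMap(focus_question="What is a concept map?")
cmap.add_proposition("concept maps", "are composed of", "propositions")
cmap.add_proposition("propositions", "connect", "concepts")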

Safayeni et al. (2005) investigate the notion of concepts and propositions more closely. The results show that a concept in a concept map can stand for a psychological category (cf. Safayeni et al. 2005, p. 475 ff.). Also, it may be that certain, especially abstract, concepts “exist in the mind at the level of label with minimal description without references to any entity” (Safayeni et al. 2005, p. 748). Concerning propositions, the following taxonomy of possible relationships between concepts, together with examples, has been identified (cf. Safayeni et al. 2005, p. 480f.):

Static
Inclusion: a square is a geometric shape.
Common membership: squares and triangles are geometric shapes.
Intersection: squares have one more side than triangles.

Dynamic
Causality: travel time is an inverse function of the speed for a given distance.
Correlation/Probability: academic performance in high school is a good predictor for academic performance at university.

Concept maps in their traditional form are typically not used to display dynamic relationships (cf. Safayeni et al. 2005, p. 481). Therefore, Miller & Cañas (2008, p. 2) slightly alter this definition by defining a dynamic proposition as involving “1) physical movement, 2) action, 3) change of state, or 4) [...] dependency or causal relationship”. Furthermore, dynamic propositions can be causative or non-causative: “In order for a dynamic proposition to be causative, one part of the proposition must embody the ‘cause’ or ‘probable cause’, while the other part must correspond to the ‘effect’. Alternatively, one part of the proposition must be identifiable as the ‘source’ from which that which the effect stated in the other part of the proposition originates” (Miller & Cañas 2008, p. 2). Causative propositions can also be quantified. “Quantified causative propositions explicitly indicate the manner in which a certain change in one concept induces a corresponding change in the other concept, unlike non-quantified propositions that make no reference to directionality or any other measure of the causal relationship” (Miller & Cañas 2008, p. 366). Based on this modification, both types of relationships can be expressed in concept maps, even though quantified dynamic relationships are rarely found in practice (cf. Miller & Cañas 2008).
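As an illustration of this classification, the relationship types could be attached to propositions as simple tags. The enum below mirrors the categories named by Safayeni et al. (2005) and Miller & Cañas (2008); the way of attaching them to proposition triples is an assumption made only for this example.

from enum import Enum

class RelationType(Enum):
    STATIC_INCLUSION = "inclusion"
    STATIC_COMMON_MEMBERSHIP = "common membership"
    STATIC_INTERSECTION = "intersection"
    DYNAMIC_CAUSAL = "causality"                    # may additionally be causative and/or quantified
    DYNAMIC_CORRELATION = "correlation/probability"

# Examples taken from the taxonomy above:
tagged_propositions = [
    (("square", "is a", "geometric shape"), RelationType.STATIC_INCLUSION),
    (("speed", "inversely determines", "travel time"), RelationType.DYNAMIC_CAUSAL),
]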

Most of the specific details of concept mapping as originally defined by Novak have been altered over time, allowing for a wide variety of concept mapping tasks to be found in literature - sometimes showing almost no similarity to the original (cf. Sousa 2009, p. 200ff.). Cañas et al. (2005, p. 208) define a “well-constructed” concept map as one where “[e]ach pair of concepts, together with their joining linking phrase, can be read as an individual statement or proposition that makes sense”, “[c]oncepts and linking phrases are as short as possible, possibly single words” and “[t]he structure is hierarchical and the root node of the map is a good representative of the topic of the map.” Similarly, Novak insists on his definition:

“Unlike so many ‘concept maps’ appearing in the literature. [sic!] what our team developed was a knowledge representation tool showing concepts and explicit propositions forming a hierarchical structure. So-called concept maps that do not specify the links between ‘nodes’ fail to construct propositions which we see as the essential elements in representing meanings. The lack of hierarchy fails to indicate what concepts are most inclusive, or most salient for a given context to which the knowledge structure is to be applied.” (Novak 2002, p. 553)

4.1.1 Map Structure

The structure of a concept map is relevant, since it is supposed to reflect the knowledge structures of a person. As Trumpower et al. (2010, p. 5) note: “[K]nowledge organization has been recognized as important in the fields of education and educational assessment”. This is based on the premise that “knowledge requires not only acquiring facts, procedures, and concepts, but also having an understanding of the interrelationships among those facts, procedures and concepts” (Trumpower et al. 2010, p. 5). McClure et al. (1999, p. 491) assert that it is “the organizational component captured by concept maps that may allow teachers to identify and correct student misconceptions”. Koponen & Pehkonen (2010, p. 1670) have analyzed concept maps and conclude: “[W]ith more connections the structure also becomes more ordered. This suggests that students that are able to provide more connection (having more knowledge at their command) are also better at organizing that knowledge”.

A very basic structural aspect is the occurrence of small structural patterns between concepts. Novak (2010, p. 235) notes, without further elaboration, that “[o]nce nodes (concepts) have been placed in a map, they are related to one another to form larger graphic structures, usually triads”. Koponen & Pehkonen (2010, p. 1656) link several basic patterns, namely “different types of hierarchies, cliques, transitive patterns and cycles”, to observable artifacts of procedures of knowledge construction.

In contrast to these minimal structural elements of which a concept map is composed, it is also possible to investigate the overall structure of the map. Cañas & Novak (2012, p. 249) note that “[t]he overall structure of the concept map provides an idea of the global organization of the map, showing for example clusters of concepts in subdomains, whether the concept map is ‘balanced’ or whether one subdomain includes a much larger number of concepts and links than other subdomains”. As noted above, a concept map is traditionally hierarchical. Propositions that connect non-adjacent levels of the hierarchy are called cross-links. Later on, cross-links are more generally described as “relationships or links between concepts in different segments or domains of the concept map” (Cañas & Novak 2012, p. 2). The hierarchy of a map becomes most important when concept maps are used as a teaching aid - alluding to the process of progressive differentiation: “Because meaningful learning proceeds most easily when new concepts or concept meanings are subsumed under broader, more inclusive concepts, concept maps should be hierarchical” (Novak & Gowin 1984, p. 15). In contrast, Ruiz-Primo, Shavelson & Schultz (1997, p. 7) investigate the necessity of a hierarchical structure and conclude that “[m]ethodologically and conceptually, there is no need to impose a hierarchical structure on concept maps if the structure of the content domain to be mapped is not hierarchical. In fact, it may be that different map structures are needed to represent different types of content structures”. They continue to point out that there is no clear definition of a hierarchical concept map and that defining it - especially in the presence of cross-links - is a non-trivial task (cf. Ruiz-Primo et al. 1997, p. 8f.). One possible definition will be given later in this work. Åhlberg (2004, p. 27) suggests that in order to avoid ambiguities, concept maps should best be organized around a central concept, instead of forming a hierarchy. Safayeni et al. (2005, p. 745) note that the question whether or not a concept map must always be hierarchical is an open debate.
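Since defining “hierarchical” is non-trivial, the following sketch shows only one naive operationalization, namely checking that the directed link structure contains no cycles. This is an assumption made purely for illustration, it is not the definition given later in this work, and it ignores the treatment of cross-links entirely.

from collections import defaultdict

def is_acyclic(links):
    """Kahn's algorithm: True if the directed link structure contains no cycle."""
    successors = defaultdict(list)
    in_degree = defaultdict(int)
    nodes = set()
    for source, target in links:
        successors[source].append(target)
        in_degree[target] += 1
        nodes.update({source, target})
    queue = [node for node in nodes if in_degree[node] == 0]
    visited = 0
    while queue:
        node = queue.pop()
        visited += 1
        for nxt in successors[node]:
            in_degree[nxt] -= 1
            if in_degree[nxt] == 0:
                queue.append(nxt)
    return visited == len(nodes)

# A small tree-like map passes the check, a cyclic one does not:
is_acyclic([("animals", "mammals"), ("animals", "birds"), ("mammals", "dogs")])  # True
is_acyclic([("a", "b"), ("b", "c"), ("c", "a")])                                 # False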

Leake, Maguitman & Reichherzer (2005) and Leake, Reichherzer, Cañas, Carvalho & Eskridge (2004) investigate elaborate ways of using a concept map’s structure to determine the importance of single concepts for the map. Additionally, Hay & Kinchin (2006) present “conceptual typologies” of concept maps. According to the authors, concept maps can be divided into three major types of morphology:

Spoke One central concept acts as a “hub” and other concepts are connected directly to that “hub”.

Chain The concepts are linked consecutively one to the other.

Net Several concepts form a (densely) connected network of propositions.

Fig. 6 shows an abstract example of the three different structures. Kinchin (2000) describes the process that led to the identification of these three types. They are the result of a research approach based on grounded theory and using different sets of concept maps created in the context of biology. However, e.g. Sanders et al. (2008, p. 333) report that using only these types of morphology did not work for a study in object-oriented programming, and Koul, Clariana & Salehi (2005) report a high variability in the correlation between qualitatively derived concept map scores based on these types of morphology and scores of an essay writing task - in contrast to a quantitatively derived score. Yin, Vanides, Ruiz-Primo, Ayala & Shavelson (2005) suggest adding the types “tree” and “circular” to the list.
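Kinchin (2000) derives the three types qualitatively. Purely as an illustration, a rough quantitative heuristic based on node degrees might look as follows; the thresholds and the function itself are assumptions made for this sketch and are not part of the original typology.

from collections import Counter

def morphology(links):
    """Crude, degree-based guess at the spoke/chain/net typology."""
    degree = Counter()
    for a, b in links:
        degree[a] += 1
        degree[b] += 1
    n_nodes, n_links = len(degree), len(links)
    max_degree = max(degree.values())
    if n_links == n_nodes - 1 and max_degree == n_nodes - 1:
        return "spoke"  # one hub linked directly to every other concept
    if n_links == n_nodes - 1 and max_degree <= 2:
        return "chain"  # concepts linked consecutively one to the other
    return "net"        # denser, cross-linked structure

morphology([("hub", "a"), ("hub", "b"), ("hub", "c")])               # "spoke"
morphology([("a", "b"), ("b", "c"), ("c", "d")])                     # "chain"
morphology([("a", "b"), ("b", "c"), ("c", "a"), ("a", "d")])         # "net"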

As has been mentioned at several instances in chapter 3, the structure and organization of knowledge are assumed to differ between experts and novices. Therefore, it seems plausible to expect these differences to also show in the structure of concept maps. Table 4.1 presents differences between concept maps of experts and concept maps of novices that were found by Kinchin (2000) regarding the different types of morphology.

Fig. 6: The three different types of morphology of concept maps: (a) spoke, (b) chain, (c) net. Adapted from Kinchin (2000, p. 5-42)

Concerning the development of knowledge, the structural developments visible in concept maps are closely tied to the learning model of Hay et al. (2008) shown in Fig. 4. Beginning with some form of prior knowledge, rote- and non-learning outcomes are assumed to be reflected by new concepts and propositions being added in a chain-like manner to the existing structure, whereas meaningful learning is represented by new information being integrated into the knowledge structure, which makes the concept map more and more net-like. A disjuncture that is caused by new facts which are irreconcilable with the existing knowledge structure manifests itself in a sudden collapse of the structure of the concept maps and the re-emergence of a more net-like map structure (cf. Hay et al. 2008, Fig. 6).

4.2 Applications

While there are many different applications of concept maps, the most prominent ones are arguably supporting teaching, learning and assessing structural knowledge. Other applications include the archiving of expert knowledge (cf. Cañas et al. 2005, p. 208) or supporting cooperative learning (cf. Mandl & Fischer 2000, p. 7).

“The widespread use of concept maps is based on the notion that a concept map is a reflection of the builder’s cognitive structure and thus portrays his or her understanding of the domain depicted in the map. For example, a concept map built by a student will show misconceptions as well as concepts that are not clearly understood, and at the same time it makes evident what the student does understand.” (Cañas & Novak 2012, p. 247)

Table 4.1: Differences between concept maps of experts and novices as identified by Kinchin (2000, Table 5-4)

Connectedness - Novice: disjointed structure dominated by linear arrangements in isolated clusters. Expert: highly integrated structure with numerous cross-links.

Link quality - Novice: links are often inappropriate, usually single words that add little to the meaning and use non-specialist terminology. Expert: appropriate linking phrases which add to the meaning of concepts, using the specialist language of the domain.

Link variety - Novice: the same linking words are used for a number of links, suggestive of a narrow range of thought processes. Expert: a diversity of linking phrases illustrating a range of thought processes.

Dynamism - Novice: stable over time, suggesting a lack of active engagement in knowledge restructuring. Expert: changes over time, reflecting active interaction with alternative knowledge structures.

Concepts - Novice: concentration on specific concepts indicating a limited perspective. Expert: concentration on overarching concepts to create an overview.

When creating a concept map, there are four different possible scenarios. The simplest case is one person drawing a concept map at one point in time. Next, a person can continue to work on a concept map over a longer period of time, like creating a concept map over the course of a lecture by revisiting and changing the map over and over again. This approach has been called “recursive concept mapping” by Kern & Crippen (2008, p. 33). Alternatively, the concept map can be created collaboratively by several persons (cf. Chiu, Huang & Chang 2000), again at one instance or over a longer period of time.

4.2.1 Learning and Teaching

Since the initial use of concept mapping has been broadened to include teaching and learning, it has a long history of application in both settings.

“A concept map is a schematic device for representing a set of concept meanings embedded in a framework of propositions. Concept maps work to make clear to both students and teachers the small number of key ideas they must focus on for any specific learning task. A map can also provide a kind of visual road map showing some of the pathways we may take to connect meanings of concepts in propositions. After a learning task has been completed, concept maps provide a schematic summary of what has been learned.” (Novak & Gowin 1984, p. 15)

Concept maps are regarded as well suited for a constructivist approach to teaching and learning (cf. Kinchin, Hay & Adams 2000, p. 45). Especially for science education, Kinchin (2000, p. 63) remarks that, in contrast to possible alternative graphical organizers, “it is concept mapping [...] about which the literature has been so consistently positive”. Maps can be used to construct a shared meaning between persons, for example students and teachers, and they can help students become aware of their own knowledge structures. Rye & Rubba (1998, p. 522) point out that concept mapping may foster meta-cognition in learners. Also, a concept map can serve as an advance organizer (cf. Ausubel 1968, p. 136f.) for students, helping them to integrate newly presented material into their personal cognitive structure. The usefulness of presenting knowledge to be learned in an organized way is generally accepted (cf. Goldstein & Vanhorn 2011, p. 178ff.). Additionally, Trumpower & Goldsmith (2004), Kinchin & Hay (2007), and Hay et al. (2008) point out that instruction often requires the teacher to present the dense interrelations among concepts in a linear fashion. Using concept maps can help students to see the original structure beyond the linear presentation. Novak & Gowin (1984) and Trumpower & Goldsmith (2004, p. 428) explain how the propositional structure of concept maps facilitates the learning of new meanings, especially regarding concept assimilation:

“Except for a relatively small number of concepts acquired very early by children through a discovery learning process, most concept meanings are learned through the composite of propositions in which the concept to be acquired is embedded. Although concrete empirical props may facilitate concept learning, the regularity represented by the concept label is given additional meaning through propositional statements that include the concept.” (Novak & Gowin 1984, p. 15)

Nesbit & Adesope (2006) provide a meta-study that gives an overview of the effectiveness of concept maps as a learning aid. They also give a theoretical insight into why concept mapping may be effective. If viewing a concept map is akin to reading a map (i.e. a more visually oriented processing of the material), the authors argue that the dual coding theory might provide a basis for the effectiveness of concept maps. “Viewing or constructing concept maps in conjunction with semantically equivalent text or spoken presentations may facilitate cognitive representation of the information in both verbal and visuospatial memory. [...] Links between verbal and visuospatial codes provide additional retrieval paths for both types of information.” (Nesbit & Adesope 2006, p. 417). If viewing a concept map is seen as more akin to reading a text, a different argument can be made for the effectiveness of concept maps when compared to text: “In maps, a concept is represented by a single node regardless of how many relationships it has with other concepts. That is, maps visually integrate propositions dealing with the same concept. In contrast, a concept may be represented at several places scattered throughout a text passage, and it may be represented by different words” (Nesbit & Adesope 2006, p. 418).

Regardless of the theoretical model underlying the effectiveness, the meta-study shows a clear trend toward the usefulness of concept maps when compared to other ways of presenting information like “activities such as reading text passages, attending lectures, and participating in class discussions” (Nesbit & Adesope 2006, p. 434). Also, when presenting preconstructed maps to students, there seems to be a higher gain for students with lower reading abilities (cf. Nesbit & Adesope 2006, p. 434).

Karakuyu (2010) reports a measurable positive effect on achievement in and attitude towards physics when students constructed concept maps as part of the teaching. Kinchin (2000) also reports that the attitudes of teachers and students towards concept mapping are generally positive. “Concept mapping is a valuable tool that has enormous potential to support teaching and learning at all levels. Using concept maps should not be seen as an add-on activity [...], but as a core activity to stimulate the processing and synthesis of information” (Kinchin 2011).

4.2.2 Assessment

Using concept maps as a form of assessment is a natural extension of using them as teaching and learning aids. Kern & Crippen (2008, p. 33) explain how concept mapping can provide guidance for teachers in the planning of their lessons: “In a relatively short period of time, teachers can glean the following by viewing student concept maps: prior knowledge, misconceptions, and the acquisition and accommodation of new knowledge as maps are modified over time. The information derived from analyzing student concept maps can be used to tailor lessons to the immediate needs of students, resulting in a richer, more meaningful science learning experience”.

The extent of what a concept-map-based assessment can cover is limited, though. “A concept map as an assessment can be thought of as a set of procedures used to measure important aspects of the structure/organization of a student’s declarative knowledge” (Ruiz-Primo 2004, p. 555). Within these limits, however, Novak (2010, p. 231) and Novak & Cañas (2010, p. 3) point to the successful application of concept maps as a method for assessing or evaluating students’ knowledge structures across many different fields of study. In particular, it has been shown that assessments based on concept mapping can differentiate between the knowledge of experts and novices as well as between meaningful learning and rote-learning (cf. Derbentseva, Safayeni & Cañas 2007, p. 450). Also, concept mapping may be especially useful in assessing misconceptions (cf. McClure et al. 1999, p. 476; Hay & Kinchin 2006, p. 130).

When used as an assessment task, the validity and reliability of concept mapping must necessarily be of concern. Establishing the reliability of an assessment task using concept maps is not easy: “[S]ources of error in a concept map test include: (a) variations in students’ concept mapping proficiency, (b) variations in the content knowledge (domain expertise) of those evaluating the concept maps, and (c) the consistency with which the concept maps are evaluated” (McClure et al. 1999, p. 477). However, at least under certain conditions, the reliability has been established by several different studies (cf. Novak 2010, p. 232).

Concerning the validity, the results found in the literature are generally positive: “The validity issue is relatively transparent since it is obvious that the fundamental characteristics of constructivist learning is exemplified in a well-constructed concept map” (Novak 2010, p. 231). Rosas & Kane (2012) draw a similar conclusion. Albert & Steiner (2005) present a more detailed overview of the problems regarding validity and suggest methods of determining it. One aspect that may interfere with validity is the meta-cognitive thinking that concept mapping fosters, as noted above. Going even further, Cañas et al. (2005, p. 208) establish that concept mapping is more than a passive externalization of an existing knowledge structure: “During concept map construction, meaning making occurs as the learner makes an effort to link the concepts to form propositions.” Leake, Maguitman & Cañas (2002, p. 168) comment that experts who are constructing a concept map “are not simply externalizing pre-existing internal knowledge but are also doing knowledge construction”. Additionally, Passmore (1999) provides a reasoning that integrates schema theory and concept maps, concluding that concept maps can be representations of mental models. This is interesting insofar as it provides a connection between concept mapping and the thinking process itself. Instead of seeing a concept map merely as an externalization of a person’s knowledge, it can also be seen as the representation of a mental model that existed in working memory during the thinking process of that person.

In general, every externalization will be influenced by many different variables which are neither completely known nor easily (or at all) measurable. The extent to which the knowledge of a person has been externalized in a concept map cannot be quantified. As Cañas & Novak (2012, p. 249) put it: “However, even when using the best rubrics that include both content and graphical structure, the educator needs to understand that the type and quality of the concept map may be more a reflection of the process and conditions under which the concept map was constructed than of the student’s understanding of the domain”.

Concerning the assessment task itself, Ruiz-Primo & Shavelson (1996) present the results of a qualitative analysis of a large body of literature in order to present and organize the multitude of concept map assessment tasks. Their results show that concept mapping is interpreted and used in a highly variable way throughout the literature. Their analysis results in the following classification system (the examples given are also taken from Ruiz-Primo & Shavelson (1996, p. 573ff.) but have been adapted):

1. Assessment task

(a) task demands (e.g. filling in blanks in a concept map as opposed to constructing a complete concept map from scratch),
(b) task constraints (e.g. being asked to construct a hierarchical map),
(c) task content structure (e.g. the subject that is mapped has an inherently hierarchical structure).

2. Response format

(a) response mode (e.g. drawing a map or taking part in an interview),
(b) characteristics of the response format (e.g. a skeleton map is given on a sheet of paper),
(c) mapper (e.g. a student as opposed to an interviewer).

3. Scoring system

The items of the classification system overlap to some degree, i.e. they are not fully independent from one another. For example, if the task consists of taking part in an interview, the response format will normally not be a concept map. Therefore, Ruiz-Primo (2004) mapped the complete classification system onto a one-dimensional continuum of “Degree of Directedness”. Anohina-Naumeca & Graudina (2012) present a detailed discussion of the different possibilities of directedness. A very low degree, for example, is creating a concept map without any further restrictions, and a very high degree of directedness is filling in blanks in a pre-constructed concept map (cf. Ruiz-Primo, Schultz, Li & Shavelson 2001). Ruiz-Primo (2004, p. 561) indicates that the decisions along the dimensions of the classification system are complex. “Which technique(s) should be considered the most appropriate for large scale assessment? Practical issues, though, cannot be the only criterion for selection. We have shown that the constraints and affordances imposed by different forms of assessments affect the student’s performance. This means that different mapping techniques may lead to different conclusions about students’ knowledge”.

The task demands are relevant, since they may influence the process of externalization. To support this, Ruiz-Primo et al. (2001) compare the results of a low-directed “construct a concept map from scratch” task with a highly directed “fill in the blanks in a concept map” task. The results show that the two tasks, when compared to a multiple choice test, do not seem to measure the same aspects of student knowledge. Also, filling in blank concepts and filling in blank propositions do not seem to be equivalent assessment tasks. Yin et al. (2005) investigate the difference between choosing edge labels from a list of possible labels and freely creating edge labels without restrictions. Again, the tasks seem to measure different aspects of a person’s knowledge, with the non-restricted task being better suited for capturing misconceptions and partial knowledge, pointing to a higher validity (cf. Yin et al. 2005, p. 181). The restricted task, however, showed a higher reliability concerning the scoring of propositions.

The very low directed task of creating a concept map from scratch without any task constraints has been referred to as the “gold standard” of concept map assessments (cf. Yin et al. 2005, p. 167). However, Cañas & Novak (2012) list several common restrictions, among which is providing a list of concepts that should be used for map construction. They note that “research has shown that the same students construct better maps when given a list of concepts [...]. More specifically, even if the number of concepts is similar in both cases (with the given list and without), the structure of the maps is different. Without the list of concepts the students tend to use one central concept and the maps have ‘star’ structure” (Cañas & Novak 2012, p. 5). Even more restrictive is an assessment task in which the participants may only use the concepts from the given list, which is then called a restricting list of concepts. “Providing a restricting list of concepts is an effective way of determining the students’ prior knowledge at the beginning of a study unit” (Cañas & Novak 2012, p. 251).

The task content structure of the subject matter underlying the task is assumed to reflect back upon the concept maps. It is therefore also an attribute of the assessment task. For example, if the content structure is hierarchical, one would expect concept maps to also reflect that hierarchy. If the content structure is more linear, a more linear map is to be expected from an assessment task (cf. Ruiz-Primo & Shavelson 1996, p. 578). However, Kinchin (2013, p. 100) points out that “whilst some structures are more or less contextually appropriate in a given situation, the student needs to appreciate this and to construct understanding accordingly”. In other words, the replication of the content structure as agreed upon by experts in a concept mapping task may not be a feature of the assessment task, but more of the person being assessed.

Concerning the response format, mostly the response mode is of interest for this thesis, as it encompasses the use of pen and paper or software for concept map creation (cf. Ruiz-Primo & Shavelson 1996, p. 579f.). Using software to draw the concept map allows, for example, refining a map over time more easily than when using traditional methods. Kern & Crippen (2008, p. 33) believe that this fosters metacognition: “Metacognition, the act of thinking about one’s own thinking, is a critical component in the conceptual change process. Recursive concept mapping, which involves building on and restructuring the same concept map over time,

can scaffold this change by providing an avenue for representing and evaluating students’ thinking”. Kwon & Cifuentes (2009) investigated the difference between digitally drawn concept maps and pen and paper based ones and concluded that digital creation seems to have a positive influence on students’ motivation during the task. The next section investigates the last part of the classification in greater detail - the scoring of concept maps.

4.2.2.1 Scoring System

“A scoring system is a systematic method with which students’ concept maps can be evaluated accurately and consistently. As expected, a myriad of alternative scoring systems can be found. However, they can be classified into three general scoring strategies: (a) score the components of the student’s map, (b) compare the student’s map with a criterion map and (c) use a combination of both strategies” (Ruiz-Primo & Shavelson 1996, p. 581f.). Besides the overview of Ruiz-Primo & Shavelson (1996), Anohina-Naumeca & Grundspenkis (2009) and more recently Strautmane (2012) also present systematic evaluations of many different quantitative and qualitative scoring methods. So, instead of trying to give yet another overview, this section only presents general results concerning the quantitative and qualitative evaluation of concept maps, illustrated by selected case studies of typical approaches. Whether scoring a map at all is a valid approach is not without doubt, especially from the standpoint of radical constructivism. Novak & Gowin (1984, p. 97) point out that “[c]oncept maps can be similar to paintings; you either like one or you do not”. However, given a more moderate world view, Valerio et al. (2008, p. 122) state:

“Despite the variety of concept maps that arise from the differences among map builders, some maps can be considered ‘better’ than others, based on a variety of criteria. [...] Among the features that can be used to assess the quality of a concept map, we can distinguish between topological features (e.g. hierarchical structure, linking phrases, number links into and out of concepts, etc.), and semantic features (are the propositions correct? how expressive are the linking phrases? is the focus question answered by the concept map?)”

There are two general approaches that are apparent in the scoring schemes presented in the literature. First, it is possible to treat the concept map as a mere accumulation of propositions that are scored independently. The concept map is then only a graphical representation of that set of propositions. Alternatively, the concept map can be seen as one big “whole” that is scored holistically. Such a scoring can for example take the specific layout of the map into account, or the hierarchical structure, like the scoring system presented in the next paragraph. Structure and propositions are not always independent, though. For example, if the links have arrows, then usually only the specific wording of the linking phrase will determine the direction of the arrows, by using the active or passive form. Incorporating the structure of a concept map into its evaluation means abandoning the idea of the map being a mere accumulation of propositions and acknowledging instead that there is a structural aspect besides the propositional “content” of a map (cf. Cañas & Novak 2012, p. 248). Also, it is worth noticing that the structural aspects of a map can always be judged, whereas the propositional content can only be judged if the map’s content is in some way “objective” and not, e.g., externalizing beliefs (cf. Cañas 2008).

Historically, the first scoring system for concept maps was most probably given by Novak & Gowin (1984). In contrast to many other scoring schemes, it is based directly on a learning theory, namely the theory of meaningful learning:

“The primary basis for our scoring schemes is Ausubel’s cognitive learning theory, especially three ideas in it [...]: (1) Cognitive structure is hierarchically organized, with more inclusive, more general concepts and propositions superordinate to less inclusive, more specific concepts and propositions. (2) Concepts in cognitive structure undergo progressive differentiation, wherein greater inclusiveness and greater specificity of regularities in objects or events are discerned and more propositional linkages with other related concepts are recognized. and (3) Integrative reconciliation occurs when two or more concepts are recognized as relatable in new propositional meanings and/or when conflicting meanings of concepts are resolved.” (Novak & Gowin 1984, p. 97)

Novak & Gowin (1984, p. 107) give general rules for the scoring system as well as a more precisely defined, algorithmic approach (cf. Novak & Gowin 1984, p. 36):

1. Propositions: For each meaningful, valid proposition shown score 1 point.

2. Hierarchy: Is each subordinate concept more specific and less general than the concept drawn above it? Score 5 points for each valid level of the hierarchy.

3. Cross links: Is the relationship shown significant and valid? Score 10 points for each cross-link that is both valid and significant and 2 points for each cross-link that is valid but does not illustrate a synthesis between sets of related concepts or propositions. Unique or creative cross links might receive special recognition, or extra points.

4. Examples: Specific events or objects that are valid instances of those designated by the concept label can be scored 1 point each.

Following those four rules, a fifth rule states: “In addition, a criterion concept map may be constructed, and scored, for the material to be mapped, and the student scores divided by the criterion map score to give a percentage for comparison. (Some students may do better than the criterion and receive more than 100% on this basis)” (Novak & Gowin 1984, p. 36). This is noteworthy, since many later scoring systems employ such a criterion map, often in order to increase the reliability between several raters.
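As a concrete illustration of how the four rules and the optional criterion map translate into a numeric score, the following sketch simply aggregates counts that a human rater would provide. The function names and the idea of passing pre-counted values are illustrative assumptions, not part of the original scheme beyond the point values stated above.

```python
# Minimal sketch of the Novak & Gowin point scheme described above.
# The input counts are assumed to have been determined by a human rater;
# the function only aggregates them with the stated point values.

def novak_gowin_score(valid_propositions: int,
                      hierarchy_levels: int,
                      significant_cross_links: int,
                      valid_cross_links: int,
                      examples: int) -> int:
    """Aggregate rater counts into a single map score (1/5/10/2/1 points)."""
    return (1 * valid_propositions
            + 5 * hierarchy_levels
            + 10 * significant_cross_links
            + 2 * valid_cross_links
            + 1 * examples)

def relative_score(student_score: int, criterion_score: int) -> float:
    """Fifth rule: student score as a percentage of the criterion map score
    (may exceed 100%)."""
    return 100.0 * student_score / criterion_score
```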

McClure et al. (1999) investigated the theoretical and practical issues of several scoring methods. These encompass a holistic approach, an approach that puts most emphasis on the structure of the concept maps, and an approach that scores the maps’ propositions (“relations”) on their own. Each method was tested with and without a criterion (“master”) map. All methods except for “structural with master map” show a significant concurrent validity when comparing the scores with the similarity of the concept maps to the master map using a measure of structural similarity presented in the next chapter (cf. McClure et al. 1999, p. 489). This similarity measure can therefore also serve as a scoring method and is used for analysis in this work, as presented in chapter 7.

The method “relational with master map”, i.e. scoring each proposition on its own and using a criterion map, was found to provide the highest reliability between several raters. Fig. 7 displays the method in the form of a simple diagram. In this thesis, an adaptation of this scoring scheme using zero, one or two points has been adopted and found to work very well in practice.
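A relational scheme of this kind can be sketched in code, but note that the concrete matching rules below are hypothetical: the text only states that zero, one, or two points are awarded per proposition, so the criteria used here (full match of concepts and linking phrase versus a match of the concept pair only) stand in for what is, in practice, a rater's judgment.

```python
# Hypothetical sketch of relational scoring against a master map: 2 points if a
# student proposition matches a master proposition exactly, 1 point if only the
# concept pair matches, 0 otherwise. The real criteria are applied by human raters.

def score_proposition(prop, master_props):
    """prop and master_props hold (concept_a, linking_phrase, concept_b) triples."""
    a, _link, b = prop
    if prop in master_props:
        return 2
    if any({a, b} == {ma, mb} for ma, _, mb in master_props):
        return 1
    return 0

def relational_score(student_props, master_props):
    master = set(master_props)
    return sum(score_proposition(p, master) for p in student_props)
```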

There are also more complex quantitative scores. For example, Ifenthaler (2006) presents a scoring strategy based on three “levels” called SMD. The first level, Surface, is simply the count of propositions in a concept map and should serve as a rough indicator of its complexity (cf. Ifenthaler 2006, p. 45f.). The next level, Matching, is the diameter of the concept map when treated as a graph (as described in more detail in chapter 7), in other words the maximum of the shortest paths between each pair of nodes. It should provide deeper insight into the structure of the concept maps (cf. Ifenthaler 2006, p. 47). Finally, the last level, Deep, is the similarity of the concept map to a master map. The similarity is calculated using a measure first introduced by Tversky (1977). This level should provide an insight into the semantic contents of the concept map (cf. Ifenthaler 2006, p. 49). The SMD method has been compared to several other scoring methods, like the ACMM method by O’Connor, Johnson & Khalil (2004), and found to be a reliable and valid way of scoring concept maps (cf. Ifenthaler 2006, p. 98). The SMD system is also, in contrast to those presented above, explicitly suited for computer-aided or automated scoring.
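The following sketch illustrates the three SMD levels under simplifying assumptions: a map is given as a set of propositions plus an undirected adjacency structure, the map is assumed to be connected, and the Deep level is approximated by Tversky's ratio model applied to proposition sets. Ifenthaler's exact feature sets and parameters may differ.

```python
# Sketch of the three SMD levels under simplifying assumptions (see lead-in).
from collections import deque

def surface(propositions):
    """Surface level: the number of propositions in the map."""
    return len(propositions)

def matching(adjacency):
    """Matching level: the diameter of the (assumed connected) map graph."""
    def eccentricity(start):
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())
    return max(eccentricity(v) for v in adjacency)

def deep(propositions, master_propositions, alpha=0.5, beta=0.5):
    """Deep level approximation: Tversky ratio similarity of proposition sets."""
    a, b = set(propositions), set(master_propositions)
    common = len(a & b)
    return common / (common + alpha * len(a - b) + beta * len(b - a))
```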

In the same vein, Gouli, Gogoulou, Papanikolaou & Grigoriadou (2005b) present a scoring system that is fully automated. It identifies several types of errors based on an expert map, and a numerical score is computed from this. To improve scoring in this system, the use of WordNet (see next chapter) for identifying synonyms has been suggested (cf. Kornilakis, Grigoriadou, Papanikolaou & Gouli 2004). Taricani & Clariana (2006) describe a method that converts a single concept map into relatedness ratings to which a Pathfinder analysis (see also next chapter) is applied. The same is done for an expert concept map, and the resulting Pathfinder networks are then compared and scored automatically. This method is interesting insofar as it departs from incorporating the actual propositions in the scoring and instead fully relies on the structural information contained in the concept maps. Anohina-Naumeca, Grundspenkis & Strautmane (2011) also present an elaborate automated scoring system that works in conjunction with an expert map.

From a constructivist perspective, using a referent concept map might be inherently problematic, as Cañas et al. (2005, p. 207) argue: “The strength of concept maps lies in their ability to measure a particular person’s knowledge about a given topic in a specific context. Therefore, concept maps constructed by different persons on the same topic are necessarily different, as each represents its creator’s personal knowledge. Similarly, we cannot refer to the correct concept map about a particular topic, as there can be many different representations of the topic that are correct.”

So far, the scoring systems presented have all been quantitative. They typically assign some score to a map, which then allows ranking several maps according to that score. However, it is worth noticing that an overly quantitative approach is not universally accepted:

“The scoring of only ‘valid links’ also misses the point that ‘invalid’ links may have a value to the student by supporting more valid links (sometimes temporarily) and so contributing to the overall knowledge structure that he or she is using as a basis for further learning. The usual emphasis on ‘valid links’ seems to contradict the constructivist philosophy underlying the use of concept maps by failing to recognize the significance of students’ perspectives. [...] This suggests that a more informative assessment of concept maps is required that could be used to bring benefits to the students’ learning experience while not placing unrealistic demands on the classroom teacher. To satisfy these requirements, a more qualitative description may be appropriate.” (Kinchin et al. 2000, p. 46)

Fig. 7: Relational scoring method. Adapted from McClure et al. (1999, p. 482)

There are also other reports that support the idea of not scoring the correctness of propositions (cf. Taricani & Clariana 2006, p. 68). Nicoll (2001, p. 870ff.) presents a three-tier coding scheme that assigns attributes to propositions based on their “utility”, “stability”, and “complexity”. This allows a more fine-grained analysis of the propositions in a map, as they are not only classified with regard to their correctness, but also with regard to their type (e.g. “example”) and whether or not a student seems to be sure about the proposition.

Following their criticism of a purely quantitative approach, Kinchin et al. (2000) present a qualitative approach that classifies concept maps according to their “overall structure”. The approach is based on the three types of morphology “spoke”, “chain” and “net” as presented above. This approach, again, places no importance on the “correctness” of the propositions in a map. In contrast to all methods described previously, it also does not assign a score to a map. Consequently, it cannot be used to rank concept maps. However, maps can be partitioned into classes of differing structure. “As ‘invalid links’ are seen as being of equal importance to ‘valid links’ (in terms of teacher-awareness), the time-consuming (and sometimes arbitrary) process of assessing the validity of links is avoided. [...] Implicit in this classification is the development of increasing integration of a conceptual framework from spoke structures towards net structures” (Kinchin et al. 2000, p. 46). The clustering described in chapter 7 is an automated way of forming such classes, with no prior definition of their structure.

Novak (2010, p. 234ff.) presents an evolved version of the scoring scheme presented above that places more emphasis on qualitative aspects and can be seen as a combination of quantitative and qualitative scoring. It is based on two separate parts: “The taxonomy dealing with general structure of the concept maps we call the topological taxonomy and the rubric dealing with the quality of meanings we call the ‘semantic’ rubric”. The taxonomy is based on five different criteria: “concept recognition, presence of linking phrases, degree of

ramification, depth, and presence of cross-links”. The criteria for the rubric are: “concept relevance and completeness, presence of dynamic propositions, number and quality of cross-links, and presence of cycles”. Even though not completely left out, the correctness of the actual propositions is of considerably less importance than in the quantitative scoring schemes above and is at least separated from the structural aspects of the maps.

5 Analysis Methods

This chapter presents the theoretical background and related work concerning the analysis methods that were applied in the course of this thesis. The methods themselves are presented in chapter 7 and applied in practical settings in chapters 10 to 12. Additionally, the basics of graph theory are presented here, as they are a prerequisite of Pathfinder networks and are also used in the next part. As has been noted in the first chapter, the analysis methods are inspired by the field of educational data mining, in particular the two categories “clustering” and “distillation of data for human judgment”. Cluster analysis obviously is a method for the first category; Pathfinder networks are employed for the second. Additionally, the automated analysis of texts, which is used in the software described in more detail in chapter 8, is briefly presented in this chapter.

5.1 Graph Theory

Graphs are mathematical models consisting of nodes and edges. The exact details of the definitions vary from source to source. The following notation is based on Schvaneveldt, Dearholt & Durso (1988, pp. 337-338), but similar definitions can be found in numerous other works on graph theory, such as Balakrishnan & Ranganathan (2012).

A (directed) graph G = (V, E) consists of a finite set of nodes (vertices) V and edges E, which form a subset of V × V. If E = V × V, the graph is said to be complete. Two nodes i and j are adjacent (connected by an edge) if (i, j) ∈ E. For any edge e = (i, j), the vertices i and j are said to be incident to edge e. For undirected graphs, the edges (i, j) and (j, i) are equivalent, often notated as {i, j}. The number of nodes and edges is denoted by n = |V| and m = |E| respectively. N_G(v) is the neighborhood of vertex v in graph G, i.e. the set of nodes that are adjacent to v. An induced subgraph G′ = (V′, E′) of G = (V, E) is defined by a subset V′ ⊂ V and letting E′ = {(a, b) ∈ E | a, b ∈ V′}.

Nodes and edges can be labeled, and edges can also be weighted. In this case the graph is also called labeled and/or weighted. This is modeled by label functions l_v : V → L_v and l_e : E → L_e for some sets of labels L_v and L_e. Accordingly, a weight function w : E → ℝ maps a real-valued weight to each edge. Sometimes the weights are given as a matrix W ∈ ℝ^{n×n}, with w_{ij} denoting the weight of edge (i, j) if this edge exists - usually it is set to ∞ otherwise.

A set of distinct nodes v_1, v_2, ..., v_k that are connected by edges (v_1, v_2), (v_2, v_3), ..., (v_{k−1}, v_k) forms a path of length k − 1. If v_1 = v_k, the path is also called a circle. If there is no path between two nodes v_i and v_j, the two nodes are in different components of the graph. A graph that contains no circles is a forest; a forest with exactly one component is called a tree. For weighted graphs, the distance d_{ij} between two nodes v_i and v_j is the minimal combined edge weight of all edges along any path from v_i to v_j according to some metric. Often, the metric is simply the sum of the individual edge weights, but other metrics, like the maximum of the weights along a path, may be used as well. If there is no path between v_i and v_j, the distance is set to infinity, d_{ij} = ∞.

When implementing graphs in a computer, they are often stored as an adjacency matrix A ∈ B^{n×n}, with a_{ij} = 1 if edge (i, j) exists and a_{ij} = 0 otherwise. Thus, for undirected graphs, the adjacency matrix is symmetric. For weighted graphs, the adjacency matrix is often defined as a_{ij} = w_{ij} if edge (i, j) exists and a_{ij} = ∞ otherwise.
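To make the matrix conventions concrete, the following sketch builds a weight matrix for a small, hypothetical undirected graph, with 0 on the diagonal and infinity for missing edges, as described above; the node labels and weights are arbitrary.

```python
# Sketch of the weight-matrix convention described above for a small undirected
# graph: entries hold the edge weight where an edge exists and infinity otherwise.
import math

nodes = ["A", "B", "C", "D"]                               # hypothetical labels
edges = {("A", "B"): 1.0, ("B", "C"): 2.0, ("A", "C"): 4.0, ("C", "D"): 1.5}

n = len(nodes)
index = {v: i for i, v in enumerate(nodes)}
W = [[math.inf] * n for _ in range(n)]
for i in range(n):
    W[i][i] = 0.0                                          # zero weight on the diagonal
for (u, v), weight in edges.items():
    W[index[u]][index[v]] = weight                         # undirected: store both directions
    W[index[v]][index[u]] = weight
```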

5.2 Pathfinder Networks

Pathfinder networks as described by Schvaneveldt, Durso & Dearholt (1989) are graph-based representations of the similarity (or dissimilarity) of entities. Originally, the data that is represented consists of pairwise similarity ratings given by persons, usually using a numeric scale. The similarity ratings can be modeled as a weighted, complete graph in which each entity becomes a node and the weight of each edge is the similarity value of the pair of entities incident to that edge. Such a representation is called a “network”. Schvaneveldt et al. (1989, p. 252) note that “[a]s psychological models, networks entail the assumption that concepts and their relations can be represented by a structure consisting of nodes (concepts) and links (relations). Strength of relations are reflected by link weights and the intentional meaning of a concept is determined by its connections to other concepts”. Algorithmic methods can then be used to analyze such a network or extract prominent features. The Pathfinder algorithm is one such method; an alternative is, for example, multi-dimensional scaling (MDS) developed by Kruskal (cf. Bartholomew, Steele, Moustaki & Galbraith 2008, p. 55ff.).

The next sections will present the Pathfinder algorithm that is used to create the Pathfinder network and explain how it can be used in the context of structural knowledge.

5.2.1 Construction

As described by Dearholt & Schvaneveldt (1990, p. 2ff.), a Pathfinder network is always constructed from an existing, non-negative weight matrix, which represents a weighted graph. The Pathfinder network is itself again a graph that is directed if and only if the input graph was directed. It consists of the same nodes and components as the input graph but of a subset of its edges, with their weights preserved. The edges are chosen such that the final network provides a path of minimal distance between each pair of nodes according to a special metric (called Minkowski- or r-metric) that depends on a parameter r > 0. The weight of a path consisting of edges e_1, e_2, ..., e_k with weights w_1, w_2, ..., w_k according to the r-metric is defined as:

(w_1^r + w_2^r + ... + w_k^r)^{1/r}

For r = 1 the r-metric defaults to the sum of the single edge weights, for r = 2 it is the Euclidean distance, and for r = ∞ the path weight is the maximum edge weight along the path (cf. Dearholt & Schvaneveldt 1990, p. 3). These three values represent commonly used metrics and are called Manhattan distance, Euclidean distance, and Chebyshev distance respectively (cf. Han & Kamber 2010, p. 73).
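A small sketch of the r-metric path weight with the three special cases just named; the function name is an arbitrary choice.

```python
# Sketch of the r-metric path weight defined above: r = 1 gives the sum of the
# edge weights, r = 2 the Euclidean combination, and math.inf the maximum weight.
import math

def path_weight(weights, r):
    """Combined weight of a path with the given edge weights under the r-metric."""
    if math.isinf(r):
        return max(weights)
    return sum(w ** r for w in weights) ** (1.0 / r)

print(path_weight([2.0, 3.0], 1))         # 5.0   (Manhattan)
print(path_weight([2.0, 3.0], 2))         # ~3.61 (Euclidean)
print(path_weight([2.0, 3.0], math.inf))  # 3.0   (Chebyshev / dominance)
```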

Additionally, a Pathfinder network with n nodes is guaranteed to be q-triangular for a parameter q ∈ {1, 2, ..., n − 1}. This means that the weight of any edge (i, j) is less than or equal to the weight, according to the chosen r-metric, of any path between i and j that is of length at most q (cf. Dearholt & Schvaneveldt 1990, p. 3). In other words, when ignoring paths longer than q, the triangle inequality holds for each pair of nodes in the graph. If q is set to the maximal value of n − 1, the (regular) triangle inequality always holds.

Following the description of Dearholt & Schvaneveldt (1990, p. 6), the Pathfinder network PFNET(r, q) (with two parameters r and q) of a weighted graph G = (V, E) is constructed as follows:

1. Start with the graph G′ = (V, E′ = ∅).

2. Order all edges of E non-decreasingly according to their weight.

3. Following this ordering, for each edge e = (i, j) add e to E′ if and only if the minimal weight according to the r-metric of any path of at most q edges between nodes i and j is at least as big as w_{ij}. In other words, e is added only if including it keeps G′ q-triangular.

4. Return G′ as the Pathfinder network.

Since the edges in step 2 are added in order of non-decreasing weight, G′ will remain q-triangular throughout the algorithm. For every edge (i, j) that is added to E′, no path of less weight between i and j can be found later on, as each edge added in a later step has a weight at least as big as the direct path already provided by e, thus keeping the triangle inequality valid.

A problem with the algorithm as described above is that it allows for arbitrary inclusion of edges in case of ties. So, in general, there are several possible Pathfinder networks for a given graph. For interpretation, it would be preferable to have a unique network that is generated for a given graph and given parameter values. There is an additional way of creating Pathfinder networks that does not suffer from this problem. The algorithm described by Dearholt & Schvaneveldt (1990, p. 7f.) is reminiscent of matrix-based shortest-path algorithms like the Floyd-Warshall algorithm (cf. Cormen, Leiserson, Rivest & Stein 2001, p. 629ff.). It works as follows:

Let W be the n × n non-negative weight matrix of a (possibly directed) graph G, where all weights w_{ii} are assumed to be 0 and the entries for non-existing edges are assumed to be ∞. Starting with W^1 = W, compute W^{i+1} like this:

w^{i+1}_{jk} = min_{1 ≤ m ≤ n} ((w^i_{jm})^r + (w_{mk})^r)^{1/r}

Here r, as before, defines the metric that is used to calculate the node distances. For r = 1 this simply defaults to the update rule of the shortest-path algorithm. Additionally, compute the distance matrix for at most i steps, D^i, according to the following rule:

d^i_{jk} = min(w^1_{jk}, w^2_{jk}, ..., w^i_{jk})

If all weights are non-negative, this rule ensures that d^i_{jj} = 0 always holds. The algorithm first computes D^q according to the rules given above and then deletes all edges (i, j) from G where w_{ij} ≠ d^q_{ij}. The resulting graph is a unique superset of all possible Pathfinder networks of the input graph and is itself a valid Pathfinder network. Thus, it includes every edge that does not explicitly violate the q-triangularity and therefore may include more edges than strictly necessary. It makes interpretation and comparison of graphs more reliable, however. All Pathfinder networks in this thesis were constructed using this matrix algorithm to ensure that no edges are missing arbitrarily. The algorithm can be implemented in a straightforward way in time Θ(qn³) with space requirements of Θ(n²).
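The matrix-based construction can be sketched directly from the update rules above. This is an illustrative implementation under the stated conventions (zero diagonal, infinity for missing edges), not the exact code used for this thesis; it compares the direct weight with the q-step distance using a floating-point tolerance.

```python
# Sketch of the matrix-based Pathfinder construction described above, assuming a
# dense weight matrix W with 0 on the diagonal and math.inf for missing edges.
import math

def r_combine(a, b, r):
    """Combine two path weights under the r-metric."""
    if math.isinf(a) or math.isinf(b):
        return math.inf
    if math.isinf(r):
        return max(a, b)
    return (a ** r + b ** r) ** (1.0 / r)

def pathfinder(W, q, r):
    """Return the PFNET(r, q) adjacency: True where an edge of W is kept."""
    n = len(W)
    Wi = [row[:] for row in W]                 # W^1
    D = [row[:] for row in W]                  # D^1
    for _ in range(q - 1):                     # build W^2 ... W^q and fold into D
        Wnext = [[min(r_combine(Wi[j][m], W[m][k], r) for m in range(n))
                  for k in range(n)] for j in range(n)]
        D = [[min(D[j][k], Wnext[j][k]) for k in range(n)] for j in range(n)]
        Wi = Wnext
    # Keep exactly those edges whose direct weight equals the minimal q-step distance.
    return [[j != k and not math.isinf(W[j][k])
             and math.isclose(W[j][k], D[j][k], rel_tol=1e-9)
             for k in range(n)] for j in range(n)]
```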

There are other algorithms that generate the Pathfinder network faster, especially for the often used case of q = n − 1, where the running time of the original algorithm degenerates to Θ(n⁴). Guerrero-Bote, Zapico-Alonso, Espinosa-Calvo, Gómez Crisóstomo & Moya-Anegón (2006) describe a faster implementation called “Binary Pathfinder”. Quirin, Cordón, Santamaría, Vargas-Quesada & Moya-Anegón (2008) present an algorithm with cubic running time that only works for q = n − 1. For the very special (but frequently used) case of q = n − 1 and r = ∞, Quirin, Cordón, Guerrero-Bote, Vargas-Quesada & Moya-Anegón (2008) present a fast alternative that works with minimal spanning trees. However, for small graphs (all graphs in this thesis are considered small), the running time of the straightforward algorithm does not pose restrictions on its practicality.

5.2.2 Investigating Structural Knowledge

There are many studies that have successfully employed Pathfinder analysis on similarity data in order to gain insights into aspects of structural knowledge. Since this work is not concerned with similarity data, this section only presents some results regarding the general applicability of the method in the context of knowledge evaluation.

First, Pathfinder analysis has been shown to capture latent organizational traits of knowledge. Trumpower & Goldsmith (2004) found that structuring learning material according to its similarity structure in a Pathfinder network (as opposed to, for example, alphabetically) enhances learning. Similarly, Durso & Coggins (1990) report enhanced capabilities of recalling words from a list when these are ordered according to their similarity in a Pathfinder network: “Pathfinder revealed some latent structure useful for predicting recall order beyond that which could be linearly predicted from the ratings” (Durso & Coggins 1990, p. 42). They also found that the structure of the network matches networks that were intuitively drawn by researchers (cf. Durso & Coggins 1990, p. 33). In the same vein, Pathfinder networks have also been used in the automatic summarizing of texts (cf. Patil & Brazdil 2007), which can also be seen as a form of organizational trait of knowledge. This also indicates the independence of the Pathfinder method from the relatedness judgments that are typically used as input.

Second, Pathfinder analysis can be used to make a restructuring of knowledge visible. Dayton, Durso & Shepard (1990) investigated the difference between knowledge structures following a moment of insight when solving a puzzle: “We have shown that people who solve an insight problem have a much different knowledge organization than do people who do not solve it or who are unaware of the problem. We have also shown that the correct organization is not achieved merely by exposure to the relevant information, and we believe this is evidence that the difference

in structure between problem solvers and nonsolvers is due to a sudden and substantial shift in connections[...]” (Dayton et al. 1990, p. 277).

Finally, Pathfinder representations of structural knowledge have been used as a method of assessment and to identify differences between novices and experts. “In summary, there is some evidence that experts can be distinguished from novices based on their cognitive structures. Classifications based on Pathfinder was successful at uncovering the latent structure inherent in the empirical ratings. Thus, a comparison of experts with novices supplies some validation of the psychological utility of Pathfinder” (Durso & Coggins 1990, p. 40). Trumpower et al. (2010) describe a system for assessing structural knowledge based on similarity ratings, Pathfinder networks and comparison to expert networks. Goldsmith & Johnson (1990) report the results of a study that investigates the usefulness of Pathfinder networks for assessing the knowledge of learners: “[It is based on] the idea that configural properties of representations reflect important characteristics of an individual’s cognitive system. We further assume that these configural characteristics can be compared in network representations by employing a method for assessing structural similarity between graphs” (Goldsmith & Johnson 1990, p. 244). The similarity ratings of students were analyzed using both the established methods of MDS and Pathfinder analysis. The raw similarity ratings themselves performed well as a predictor for the final course grade: 37% of the variance associated with students’ final grades is accounted for by the correlation between a student’s similarity ratings and those of the course instructor (cf. Goldsmith & Johnson 1990, p. 249). The question then was whether scaling techniques like MDS or Pathfinder could yield a better prediction than the raw similarity data itself. The study compares several indexes to the final course points of the students. The indexes under comparison are the raw proximity data, MDS applied to the raw data, the correlation between the graph-theoretic distances of a student’s Pathfinder network and the instructor’s network, and the C value (see below) of a student’s and the instructor’s Pathfinder network as described in Goldsmith & Davenport (1990).

“[T]aken together, these results imply that Pathfinder networks do indeed contain unique predictive variance over the proximity ratings and MDS, and that a configural assessment of networks is a better index for assessing network similarity than correlations. Apparently, C better reflects commonalties between structures that happen to be important in assessing knowledge. We assume that the characteristics common to a student’s structure and instructor’s structure that are predictive of knowledge attainment exist at a global or configural level within those representations. This, of course, is exactly the type of information that C is assumed to be good at assessing.” (Goldsmith & Johnson 1990, p. 250)

Concerning the actual analysis, Goldsmith & Johnson (1990) have tested several measures of graph similarity. They conclude that the structural similarity can be measured by comparing the neighborhoods of all nodes between two given graphs.

Specifically, let G_1 = (V, E_1) and G_2 = (V, E_2) be two graphs that share a common set of vertices V = {v_1, v_2, ..., v_n}. Then the similarity of G_1 and G_2 can be calculated by summing the similarity of each vertex neighborhood as follows:

C = (1/|V|) · Σ_{v∈V} |N_{G_1}(v) ∩ N_{G_2}(v)| / |N_{G_1}(v) ∪ N_{G_2}(v)|

The value of C varies between 0 and 1, where 1 denotes structural identity and 0 denotes a completely different structure. Note that the fraction is undefined if the union of the two neighborhoods is empty, i.e. when a node is unconnected in both graphs. As such nodes are then structurally identical in both graphs, however, it is convenient to define the value of the fraction as 1 in this case.
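A sketch of the neighborhood-based similarity C, taking the two graphs as dictionaries that map each shared vertex to its neighbor set; the convention of counting vertices that are isolated in both graphs as fully similar is applied as discussed above, and the example labels are hypothetical.

```python
# Sketch of the neighborhood-based similarity C defined above, for two graphs
# given as neighborhood sets over the same vertex set.

def c_similarity(nbhd1, nbhd2):
    """Average neighborhood overlap of every shared vertex."""
    total = 0.0
    for v in nbhd1:                     # both dicts share the same vertex set
        union = nbhd1[v] | nbhd2[v]
        if not union:
            total += 1.0                # isolated in both graphs: identical
        else:
            total += len(nbhd1[v] & nbhd2[v]) / len(union)
    return total / len(nbhd1)

# Example with hypothetical concept labels.
g1 = {"loop": {"variable"}, "variable": {"loop", "type"}, "type": {"variable"}}
g2 = {"loop": {"variable"}, "variable": {"loop"}, "type": set()}
print(c_similarity(g1, g2))             # (1 + 1/2 + 0) / 3 = 0.5
```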

Concerning the choice of parameters for the Pathfinder algorithm, in general, the higher the values, the sparser the resulting network (cf. Cooke 1994, p. 832). However, the special context of structural knowledge has some implications. For example, Durso & Coggins (1990, p. 32) state:

“With an exponent of infinity, Pathfinder makes only ordinal assumptions about the data. The second parameter, q, is a restriction on the number of edges in a path that Pathfinder will use in deciding if two concepts are already connected. The sparsest PFNET will result when Pathfinder is permitted to consider paths of any length, that is when q is equal to one less than the number of nodes. The most dense graphs result when Pathfinder can only consider a path as consisting of two edges, that is q = 2. [...] Although decisions about the r parameter can be justified on measurement assumptions, the decision concerning q is more difficult. [...] Both when picking q and when picking the dimensionality, several factors, including the illuminating power of the solution, must be considered”.

Gammack (1990) quotes an example that serves to show the problem of higher q values:

“In a linked set of countries, Jamaica may be considered similar to Cuba with a strength of (say) 2 units, and Cuba may be considered similar to USSR, also with a strength of 2. Particularly under the dominance metric, Jamaica and USSR are very close in network terms, but such implied proximity is psychologically hardly very meaningful. For such multifaceted concepts, relatedness may be judged on numerous grounds, and this must be considered when interpreting a network. The criticism may not apply to all networks but is particularly relevant to those with semantic properties. [...] This may imply that values of q > 2 should be treated with caution.” (Gammack 1990, p. 222)

Additionally, Gammack (1990) critically investigated the stability of Pathfinder networks in the light of different methods of eliciting proximity data. The results show that the Pathfinder networks, when analyzed statistically, show a relatively high agreement, but differ considerably in visual structure. “Until closely examined, it seems disturbing that the networks are so structurally different, despite demonstrable statistical agreement. Whereas the effects of noise or task context presumably contribute to this phenomenon, there is nevertheless the assumption that the network represents only a slightly distorted form of a network structure actually existent in memory. Questioning this assumption suggests no reason to suppose direct correspondence between a network and a pre-existing memory structure” (Gammack 1990, p. 222). However, the author concedes that the effect may just be the result of the inherent problem that the process of externalization is influenced by external factors:

“[T]he assumption that this knowledge structure would be transparently elicitable by objective methods having no influence on underlying structure also looks questionable. Since the tasks were different in nature, it may have been unreasonable to expect anything else. Clearly the tasks have had an effect in that they have elicited different structures, suggesting that they may be addressing different aspects of the same knowledge. Knowledge itself is surely stable, but representations of it are not, despite their specific value in particular contexts” (Gammack 1990, p. 224).

Gammack concludes that “[t]he instability of representation is neither caused by, nor unique to Pathfinder [...]. Despite some surface variation, Pathfinder descriptions were found to be particularly useful in the context of knowledge elicitation where concise and meaningful representations of expert domain conception were reliably produced” (Gammack 1990, p. 226).

5.3 Cluster Analysis

“Cluster analysis or simply clustering is the process of partitioning a set of data objects (or observations) into subsets. Each subset is a cluster, such that objects in a cluster are similar to one another, yet dissimilar to objects in other clusters.” (Han & Kamber 2010, p. 444) The result of a cluster analysis is also called a clustering. Since every clustering is a partition, each element of the data is assigned to exactly one cluster. For the probabilistic approach presented below, the probability of each observation belonging to each cluster is calculated instead. Cluster analysis is a typical research method in exploratory analyses, as it identifies patterns in the data which can then be interpreted (cf. Bartholomew et al. 2008, p. 17).

Concerning the clustering method, a general distinction can be made between model-free methods and model-based (or latent class) methods. According to Han & Kamber (2010, p. 449f.), model-free methods can be further divided into partitioning, hierarchical, density-based, and grid-based methods. The descriptions in the following sections are restricted to the methods that are actually applied in this thesis. Details about this selection are given in chapter 7.

5.3.1 Partitioning Methods

A partitioning method takes all observations and assigns each observation to exactly one cluster, thus forming a partition of the data. The number of clusters typically must be given as a parameter of the algorithm beforehand (cf. Han & Kamber 2010, p. 451). The assignment of observations to clusters is determined by the “distance” of the observations from each other according to some measure of distance or similarity. Typical choices for this measure are the L1 or L2 norms, which are identical to the Manhattan and Euclidean distance, which are in turn identical to the Minkowski distance for r = 1 and r = 2 as defined in the last section (cf. Han & Kamber 2010, p. 72).

Two very commonly used clustering algorithms of the partitioning variety are k-means and k-medoids (cf. Han & Kamber 2010, p. 451). Both methods work in an iterative way, starting from a random initial clustering and converging to a locally optimal solution. Also, both methods compute the quality of a clustering (which is optimized by the algorithms) based on the sum of distances from all the observations of a cluster to its “center point”. The difference is that this “center point” is one of the actual observations for k-medoids, and the mean of all observations of a cluster for k-means. The latter only makes sense if calculating the mean is a valid operation on the

data, i.e. if the data is at least interval-scaled (cf. Bortz 2005, p. 21), which is usually not the case for the data used in this work.

The k-medoids clustering is described by Kaufman & Rousseeuw (2005, p. 68ff.). The most common implementation, described below, is called Partitioning Around Medoids (PAM). The input consists of a distance matrix of the observations and the number of clusters k that the algorithm should produce. The distance matrix D ∈ ℝ^{n×n} for n observations contains, for each pair of observations, their distance according to the chosen measure of similarity. After initializing the k medoids with random samples from the observations, the basic steps of PAM are:

1. For each observation o and each cluster c: Calculate the distance between o and the medoid of c. Assign o to the cluster with the minimal distance.

2. For each cluster c and each observation o belonging to c: Calculate the total sum of distances from o to all other observations of c. Choose the observation with the minimal sum as the new medoid of c.

This is repeated until the assignment of observations to clusters (and hence also the set of chosen medoids) no longer changes. Therefore, the solution depends only on the initial choice of medoids and the input. For this reason, the algorithm is often run several times with different start values to increase the chance of finding the (globally) optimal solution.
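To make these two steps concrete, the following minimal sketch implements PAM on a precomputed distance matrix using NumPy. It is an illustrative re-implementation under the assumptions stated in its comments, not the code used in this thesis; in practice, a library routine (such as the pam function of the R package cluster) would typically be preferred.

import numpy as np

def pam(dist: np.ndarray, k: int, max_iter: int = 100, seed: int = 0):
    """Partitioning Around Medoids for a symmetric (n x n) distance matrix.

    Returns the indices of the k medoids and the cluster index of each observation.
    """
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    medoids = rng.choice(n, size=k, replace=False)   # random initial medoids

    for _ in range(max_iter):
        # Step 1: assign every observation to the cluster of its nearest medoid.
        assignment = np.argmin(dist[:, medoids], axis=1)

        # Step 2: within each cluster, choose the member with the minimal total
        # distance to all other members as the new medoid.
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(assignment == c)[0]
            if len(members) == 0:          # degenerate case: keep the old medoid
                continue
            totals = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(totals)]

        if np.array_equal(new_medoids, medoids):      # no change: converged
            break
        medoids = new_medoids

    return medoids, np.argmin(dist[:, medoids], axis=1)

Since the result depends on the random initial medoids, such a function would be run several times with different seeds, keeping the solution with the smallest total distance of the observations to their medoids.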

Concerning the optimal number of clusters, additional indexes are often used to judge the quality of a clustering. Gordon (1999, p. 60ff.) presents and compares several indexes. Of these, a well-performing and easily computed index is used in this thesis. It is called G1 and is also known as the Calinski-Harabasz pseudo F-statistic. Another well-known index is the so-called Silhouette (Rousseeuw 1987) of a clustering. G1 is defined as:

\[
G1(c) = \frac{B / (c - 1)}{W / (n - c)}
\]

B denotes the total sum of squared distances between all pairs of observations not sharing the same cluster (“between-cluster”). W denotes the total sum of squared distances between all pairs of observations sharing the same cluster (“within-cluster”); c and n denote the number of clusters and the number of observations respectively. The index effectively measures how close together the observations of a cluster are and how dispersed the clusters are from one another. A higher value indicates a “better” clustering in the sense that the clusters are either more spread apart from each other, or the observations within the clusters are very similar, or both. So, the index can be calculated for several given clusterings with differing numbers of clusters and then the optimal value of c can be chosen.
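A direct translation of this definition, assuming a precomputed distance matrix and one integer cluster label per observation, might look as follows; it implements the pairwise-distance formulation given above and requires more than one cluster.

import numpy as np

def g1_index(dist: np.ndarray, labels: np.ndarray) -> float:
    """Calinski-Harabasz pseudo F-statistic: (B / (c - 1)) / (W / (n - c)),
    with B and W the between- and within-cluster sums of squared pairwise distances."""
    n = dist.shape[0]
    c = len(np.unique(labels))
    same_cluster = labels[:, None] == labels[None, :]
    squared = dist ** 2
    # every unordered pair appears twice in the full matrix, hence the division by 2
    W = squared[same_cluster].sum() / 2.0
    B = squared[~same_cluster].sum() / 2.0
    return (B / (c - 1)) / (W / (n - c))

Computing this index for the clusterings produced, e.g., by the pam sketch above with different values of k and choosing the k with the highest value then yields a candidate for the number of clusters.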

A partitioning algorithm will produce a result in any case, even if the data itself is perfectly homogeneously distributed in the multidimensional space and should not actually be clustered. However, G1 and other indexes that can be used to judge the quality of a clustering are not defined for just one cluster (for G1, the term c − 1 becomes zero in this case). Therefore it is paramount to analyze the data prior to clustering regarding its tendency to actually produce meaningful clusters. This tendency is, in other words, the “non-uniformity” of the data in the multidimensional space that it is sampled from. The Hopkins index is a well-known indicator for this non-randomness of data (cf. Han & Kamber 2010). It is calculated as follows:

\[
H = \frac{\sum_{i=1}^{n} U_i}{\sum_{i=1}^{n} W_i + \sum_{i=1}^{n} U_i}
\]

Given a data set D of k > n observations, n of those observations are sampled uniformly and randomly. Additionally, n artificial, new “observations” are created randomly and uniformly distributed in the same multidimensional space as D.

Wi then denotes the minimal distance of the i-th sampled observation to any observation from D (except itself) and Ui is the minimal distance of the i-th generated “observation” to any point in D. H is a measure of how non-uniform the observations of D are in the underlying space, by comparing the actual minimal distances with those that would be expected in a uniform distribution of the observations. According to Lawson & Jurs (1990), if n is chosen to be 5% of the number of observations, H ≥ 0.75 indicates a non-randomness of the data set on a 90% confidence level.
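A sketch of this computation for a numeric data matrix is given below. Since the sampling space is not specified further here, the bounding box of the data is used for the uniformly generated artificial points; this is an assumption of the sketch.

import numpy as np

def hopkins(data: np.ndarray, sample_frac: float = 0.05, seed: int = 0) -> float:
    """Hopkins index H for a data matrix with one observation per row."""
    rng = np.random.default_rng(seed)
    k, d = data.shape
    n = max(1, int(sample_frac * k))                 # 5% of the observations by default

    sample_idx = rng.choice(k, size=n, replace=False)           # real observations
    lo, hi = data.min(axis=0), data.max(axis=0)
    uniform = rng.uniform(lo, hi, size=(n, d))                   # artificial observations

    def nearest(points, exclude_idx=None):
        d_mat = np.linalg.norm(points[:, None, :] - data[None, :, :], axis=2)
        if exclude_idx is not None:
            d_mat[np.arange(len(points)), exclude_idx] = np.inf  # ignore the point itself
        return d_mat.min(axis=1)

    W = nearest(data[sample_idx], exclude_idx=sample_idx)   # W_i: to nearest other real point
    U = nearest(uniform)                                    # U_i: to nearest real point
    return U.sum() / (W.sum() + U.sum())

Following Lawson & Jurs (1990) as cited above, a value of H ≥ 0.75 would then be taken as an indication that the data is worth clustering at all.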

5.3.2 Model Based Methods

Model based clustering differs from the methods of the last section, in that there is an explicit (mostly stochastic) model underlying the clustering that allows calculating a probability of an observation belonging to a cluster. Clustering is then no longer the task of assigning observations to clusters, instead, “the task of probabilistic model-based cluster analysis on a data set, D, is to find a set C of k probabilistic clusters such that P (D|C) is maximized” (Han & Kamber 2010, p. 503). In other words: to determine values for the parameters of the model, such that the conditional

probability of the observations given the model (also called the likelihood of the model) is maximized. Typically, the model parameters cannot be determined analytically (cf. Han & Kamber 2010, p. 503f.). Often the EM algorithm as described by Dempster, Laird & Rubin (1977) is used to find a (local) maximum in the parameter space in an iterative way. Starting from an initial guess of the parameters and the given observations, the EM algorithm performs the following two steps consecutively and repeatedly:

E-Step Calculate the expected value of the log-likelihood. That is, calculate the probability of the observations, given the current parameter values.

M-Step Calculate new parameter values that maximize this expected value, given the observations and current parameter values.

The EM algorithm converges to a local maximum (cf. Han & Kamber 2010, p. 508), so typical stopping criteria are either a fixed number of iterations or the amount of change of the parameters between iterations. If the number of clusters changes, the probability of the model also changes. This allows - in contrast to model free methods - estimating the quality of the clustering directly based on the model. In particular, the case of just one cluster can be estimated as well and taken as an indication of whether or not the data is actually suited to clustering. When just trying several different numbers of clusters and estimating the parameters, there is a risk of overfitting: Often, the models can be made to fit the data (nearly) perfectly by introducing more clusters (parameters). To prevent this, instead of maximizing the likelihood of the model parameters and observations by itself, a different term is maximized that includes a penalty for introducing too many model parameters (cf. Stibor 2008). In this work, the often used AIC, or Akaike Information Criterion (Akaike 1974), will be used. It is defined as:

\[
AIC(X, k) = 2k - 2L(X)
\]

where k is the number of model parameters and L(X) is the log-likelihood of the observed data (given the model parameters). An alternative to AIC is, among others, the BIC (Schwarz 1978).
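To illustrate how EM estimation and AIC-based model selection are combined, the sketch below fits mixture models with an increasing number of clusters and keeps the one with the lowest AIC (including the single-cluster case). It uses the Gaussian mixture implementation of scikit-learn purely as an illustration of the selection procedure; it is not meant to be the specific model applied to the data of this thesis.

import numpy as np
from sklearn.mixture import GaussianMixture

def select_by_aic(X: np.ndarray, max_clusters: int = 10, seed: int = 0):
    """Fit mixture models via EM for k = 1..max_clusters and return the model
    with the lowest AIC together with its AIC value."""
    best_model, best_aic = None, np.inf
    for k in range(1, max_clusters + 1):
        model = GaussianMixture(n_components=k, n_init=5, random_state=seed).fit(X)
        aic = model.aic(X)            # 2 * (number of parameters) - 2 * log-likelihood
        if aic < best_aic:
            best_model, best_aic = model, aic
    return best_model, best_aic

The cluster membership probabilities mentioned at the beginning of this section are then available from the fitted model (predict_proba in scikit-learn).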

5.4 Text Mining

In the course of this work, text mining approaches have been used in order to automatically detect salient concepts of a given text (e.g. the textbook of a lecture) and propositions among a given set of concepts. Since automated analysis of texts is a vast topic of research, any in-depth description is beyond the scope of this work. Instead, only the existing approaches that have been used for the software described in chapter 8 are described here. There may be more advanced and better-performing methods available, though.

The basic approach of using computer support for extracting the most important concepts of a text has already been suggested, for example, by Trumpower et al. (2010). For concepts, which are often equivalent to nouns, a simple measure of salience is the frequency of occurrence of the word in a text. While this often works for specific terms of a subject matter (e.g. “class” in a text about object-oriented programming), there are usually also many words that are not specific to the subject matter, like “figure”, “page”, “exercise” or “student” (for curricula, for example), that appear very frequently. Therefore, two possible approaches to a more useful measure of salience are described here:

The first uses a term weighting that offsets the relative frequency by some additional factor. The so-called term frequency - inverse document frequency (TF*IDF) method (Rajaraman & Ullman 2011, p. 8) has been used, for example, to generate search queries from concept maps that are used as support when creating concept maps electronically (cf. Leake et al. 2004). It offsets the frequency of occurrence of a word in a document by the frequency of occurrence of the same word in a set of documents (corpus):

\[
\mathrm{TF{*}IDF} = \frac{n_x}{n} \cdot \log \frac{N}{N_x}
\]

Here, $n_x$ and $n$ are the number of occurrences of word $x$ in a document and the total number of words in this document, respectively. $N$ and $N_x$ are the total number of documents and the number of documents in which the word $x$ appears, respectively. A salient word for a given document is therefore a word that appears much more frequently in this document than in the others. Clearly, this measure depends on a well-chosen corpus. To exclude words like “exercise” by this measure, the corpus should best be composed of textbooks and not of, e.g., newspaper articles.
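A minimal sketch of this weighting for already tokenized documents is given below; tokenization and the choice of the corpus are assumed to be handled elsewhere.

import math
from collections import Counter

def tf_idf(document, corpus):
    """TF*IDF = (n_x / n) * log(N / N_x) for every word x of a tokenized document,
    relative to a corpus given as a list of tokenized documents."""
    counts = Counter(document)
    n = len(document)                    # total number of words in the document
    N = len(corpus)                      # total number of documents in the corpus
    scores = {}
    for word, n_x in counts.items():
        N_x = sum(1 for doc in corpus if word in doc)   # documents containing the word
        if N_x > 0:                      # words absent from the corpus are skipped here
            scores[word] = (n_x / n) * math.log(N / N_x)
    return scores

Sorting the resulting scores in descending order yields the candidate concepts; as noted above, their usefulness depends strongly on a well-chosen corpus.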

Another possible approach to identifying salient words for a given text is based on the observation that such words usually are not spread out uniformly over the complete text (as a Poisson model of word occurrence would suggest), but are instead concentrated in specific parts of the text (cf. Montemurro & Zanette 2013).

5.4.1 Existing Software Solutions

Two software solutions that have been employed in the course of this work are the natural language processing toolkit (NLP)1 and WordNet2, as described in Miller (1995) and Fellbaum (1999).

The NLP toolkit provides, among other components, a tokenizer that splits an input text into word tokens and a part-of-speech tagger (cf. Toutanova, Klein, Manning & Singer 2003) that determines, for several languages (only English and German are relevant for this work), the grammatical role that each word has in a sentence. This is especially useful for identifying nouns, which are candidates for salient concepts. The tagger distinguishes the 36 different functions of the Penn Treebank Tag-Set (cf. Mitchell, Santorini & Marcinkiewicz 1993), including, among others, the following noun tags (a small tagging sketch follows the list):

• “noun, common, singular or mass”,

• “noun, proper, singular ”,

• “noun, proper, plural”, and

• “noun, common, plural ”.
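The sketch below illustrates this noun-extraction step. It uses the NLTK library instead of the toolkit named above, since NLTK’s default English tagger also produces Penn Treebank tags, so the filtering carries over; the example sentence is invented and the required NLTK models are assumed to have been downloaded.

from collections import Counter
import nltk  # the tokenizer and tagger models must be downloaded once via nltk.download()

def noun_candidates(text, top=10):
    """Return the most frequent nouns of an English text as concept candidates."""
    tokens = nltk.word_tokenize(text)
    tagged = nltk.pos_tag(tokens)                    # Penn Treebank tags
    nouns = [word.lower() for word, tag in tagged
             if tag.startswith('NN')]                # NN, NNS, NNP, NNPS
    return Counter(nouns).most_common(top)

print(noun_candidates("A class describes the attributes and the methods of its objects."))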

WordNet is a semantic network of the English language. However, as it describes a universal modelling structure for languages, there are also WordNets for other languages. All entries in the WordNet database consist of a form and a sense, where a form is “a string over a finite alphabet, and a sense [...] is an element from a given set of meanings” (Miller 1995, p. 39). “Each form with a sense in a language is called a word in that language. A dictionary is an alphabetical list of words. A word that has more than one sense is polysemous; two words that share at least one sense in common are said to be synonymous” (Miller 1995, p. 39).

The entities (words) of the semantic network are classified into the categories: noun, verb, adjective and adverb (cf. Miller 1995, p. 40). Between those entities, the possible semantic relations in WordNet are given in Table 5.1.

The basic relation of WordNet is Synonymy, as the word senses are represented by a set of synonyms of a given word form (a so-called synset) (cf. Miller 1995, p. 40). It can also be used to reduce words to a base form and therefore helps to detect, for example, that a noun in singular and a noun in plural are actually referring to the same concept, even though the actual word forms are different. WordNet has, among many other applications, successfully been used for automated support of software-based concept map drawing (cf. Cañas, Valerio, Lalinde-Pulido, Carvalho & Arguedas 2003).
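Both uses - looking up the synsets of a word form and reducing an inflected form to its base form - can be illustrated with NLTK’s interface to WordNet; this is a minimal sketch and assumes the WordNet data has been downloaded via nltk.download().

from nltk.corpus import wordnet as wn
from nltk.stem import WordNetLemmatizer

# every sense of the form "dog" is represented by a synset (a set of synonyms)
for synset in wn.synsets('dog'):
    print(synset.name(), '-', synset.definition())

# reduction to a base form: "classes" and "class" refer to the same concept
print(WordNetLemmatizer().lemmatize('classes'))   # -> 'class'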

1 http://nlp.stanford.edu/software
2 Registered Trademark of Princeton University, see http://wordnet.princeton.edu

Semantic Relation        Possible between                   Example
Synonymy (similar)       Noun, Verb, Adjective, Adverb      pipe - tube
Antonymy (opposite)      Noun, Verb, Adjective, Adverb      wet - dry
Hyponymy (subordinate)   Noun                               maple - tree
Meronymy (part)          Noun                               ship - fleet
Troponymy (manner)       Verb                               march - walk
Entailment               Verb                               drive - ride

Table 5.1: The possible semantic relations in WordNet (cf. Miller 1995, Table 1).

Part III

From Concept Maps to Concept Landscapes

6 Possibilities and Limitations of Concept Maps

Chapter 4 has introduced concept maps as a tool for (among others) learning, teaching, and assessing. Also, several different aspects of knowledge organization, learning, and assessing have been presented in chapter 3. Fig. 8 integrates these different parts and shows how they are related in the context of this thesis. Concept maps are presented in more detail in this chapter. Based on the relevant literature of the last part, a detailed analysis and discussion of the foundations of concept maps in the light of everything presented in chapters 3 and 4 is performed. Care is taken to clearly distinguish between results that have been taken from the literature and own conclusions based on these findings.

The following sections will present three different views on concept mapping: First, it can be seen as a (psychological) instrument that externalizes structural knowledge. Questions regarding psychological processes and influences on the externalization are important in this view. Second, it can be seen as a mere tool of displaying knowledge. The restrictions of concept maps regarding the encoding of complex facts of knowledge must then be taken into account. Finally, third, it can be seen as an educational assessment tool. This view in a way reconciles the limitations defined by the previous two views. Finding reliable scoring methods and analyzing how artifacts of learning and teaching can be measured by concept maps are pivotal for the educational view. The selection of these views and the aspects they focus on is based on the specific context of this thesis and its use of concept mapping. Clearly, there are other views on concept mapping as well.

6.1 Cognitive View

This section investigates the types of memory a concept mapping task can externalize and the psychological foundations of concept maps. This includes observations made on the process of concept map creation from data collected in the course of this thesis.

[Fig. 8: Schematic overview of the interconnections of the topics of chapter 3 and chapter 4 in the context of concept maps.]

Following the distinctions made in section 3.1.1 concerning the different types of knowledge, it is clear that concept maps can be seen as an externalization of declarative knowledge as it has been “declared” by the concept mapper (cf. Ruiz-Primo & Shavelson 1996, p. 570) - or at least in the context of a concept mapping task. Thus, a concept map, in general, is an externalization of parts of the declarative, semantic memory of a person. As has been mentioned in chapter 4, concept maps are not restricted to knowledge or semantic memory, as it is also possible to externalize beliefs or personal experiences. Concerning the types of knowledge identified by de Jong & Ferguson-Hessler (1996) that have been presented in section 3.1.1, concept maps are evidently suited to measure conceptual knowledge. They might also be able to measure situational knowledge, especially when taking into account that examples can be included in concept maps (see below). Procedural knowledge “contains actions or manipulations” (de Jong & Ferguson-Hessler 1996, p. 107). Since neither is an aspect of concept map creation - except for the procedural knowledge about concept mapping itself - it cannot be measured with concept maps. Strategic knowledge could in theory be measured by a concept map; however, as strategic knowledge “is applicable to a wider variety of types of problems” and concept mapping tasks are typically bound to a subject domain (cf. Ruiz-Primo & Shavelson 1996, p. 570), measuring it probably does not capture aspects of interest concerning the knowledge structures relevant for a subject domain. Regarding the qualities of knowledge identified by de Jong & Ferguson-Hessler (1996), also presented before, the following can be said for concept mapping:

Modality Since concepts and propositions are labeled with text, the modality must be verbal.

Automation Based on the definition of compiled knowledge given in chapter 3, the knowledge expressible in concept maps must be declarative.

Generality Given a subject domain, the generality is typically domain specific.

Level Both surface and deep knowledge can be externalized with concept maps, evidently, as a proposition expressed in a map can have been learned meaningfully or not.

Structure Both isolated elements and structured knowledge can be expressed in concept maps, which is directly visible in the map's structure.

Glaser & Bassok (1989, p. 35) give an explanation of why concept mapping may be inherently problematic when used for externalizing the knowledge of experts: “Experts and novices may be equally competent at recalling specific items of information, but experts chunk these items in memory in cause and effect sequences that relate to the goals and sub-goals of problem solution and use this information for further action. The progression from declarative knowledge to well-tuned functional

knowledge is a significant dimension of developing competence.” In other words, the particular type of knowledge organization in which experts differ from novices may not be adequately expressible with concept maps. The compilation of knowledge or, more generally, the “chunking” of information that occurs with learning, as e.g. proposed by schema theory, may present a natural impediment to the assessment of a person's knowledge structure, as the chunks are often transformed in such a way that explicit declaration of their “contents” becomes difficult.

Next, the two central elements of concept maps, namely concepts and propositions, are investigated more closely. In the last part, definitions were given in the context of concept mapping in section 4.1 and in chapter 3, both for the theory of meaningful learning (section 3.2.4) and concerning the organization of knowledge in the brain (section 3.1.1):

Ausubel “objects, events, situations, or properties that possess common criterial attributes and are designated by the same sign or symbol” (Ausubel 2000, p. 88).

Novak “perceived regularities or patterns in events or objects, or records of events or objects, designated by a label” (Novak 2010, p. 25)

Goldstein “a mental representation that is used for a variety of cognitive functions, including memory, reasoning, and using and understanding language” (Goldstein & Vanhorn 2011, p. 240).

Solomon “the building blocks of thought” (Solomon et al. 1999, p. 99).

The definitions of Novak and Ausubel are more oriented towards epistemology than the ones of Solomon or Goldstein. But, as Goldstone & Kersten (2003, p. 600) argue, both are valid in their own right: “If one assumes the primacy of external categories of entities, then one will tend to view concept learning as the enterprise of inductively creating mental structures that predict these categories. [...] If one assumes the primacy of internal mental concepts, then one tends to view external categories as the end product of applying these internal concepts to observed entities.”

Differences between both definitions exist, however: For example, Novak explicitly states the inclusion of “specific examples of events or objects” (Novak & Cañas 2008, p. 2) for clarification, which is in accordance with the second part of his concept definition but not necessarily in accordance with the definition of Goldstein. Whether or not a specific example is an actual concept depends on the way that this particular element of knowledge is stored in memory. It may be a rote-learned fact of, e.g., a person's name or it may be a more elaborate association of this name with some relevant information. In the latter case, there may indeed exist a concept for that particular person, while in the former case the person may just be a concrete entity placed in some category (e.g. “Politician”). As has been noted in section 4.1, concepts in concept maps have also been linked to psychological categories. Taking the example of “dog” (cf. Goldstone & Kersten 2003, p. 600), which can be both a concept and a category (subsuming all the entities of dogs encountered), using “dog” in a concept map may refer to:

1. the mental idea of a dog (i.e. the concept) as expressed by a proposition like “a dog has a tail”, or

2. a subsumption of different examples of dogs (i.e. the category) as expressed by propositions like “Lassie is a dog”.

Very often, concept mapping seems to be based on the notion that a concept is a technical term of the subject matter. This is the case for example when using “instance” as a concept in computer science (cf. Sanders et al. 2008), “double helix structure” (cf. Kinchin 2011) or “fermentation” (cf. Passmore 1999) in biology, “anatomy” in dental medicine (cf. Kinchin 2013), “momentum” in physics (cf. Îngeç 2009), or even a formula (cf. Koponen & Pehkonen 2010). In general, this is not in accordance with the definition of Novak, as a technical term like “momentum” is neither an event nor an object. It can be in accordance with the psychological definition, as there can be a mental representation of the abstract notion of “momentum”, for example. It can also be a simple rote-learned name or label and therefore clearly not a concept. In summary, a concept map consists of concepts that can be:

1. a (psychological) concept,

2. a (psychological) category, or

3. a rote-learned fact (label)

Propositions or associations - the connections between concepts - can then be seen as the result of the mental process of integration (see section 3.1.1). Integration finds connections between two concepts, which is exactly what a proposition in a concept map constitutes. In contrast to the epistemological view presented below, the restriction of propositions to be between two concepts only is psychologically valid.

In conclusion, a concept map can externalize the results of the mental process of integration and displays the interconnections of a set of concepts.

Goldstone & Kersten (2003, p. 615) provide a reasoning about the validity of this approach beyond concept mapping: “[I]t is likely that all of our concepts are embedded in a network in which each concept's meaning depends on other concepts as well as on perceptual processes and linguistic labels. The proper level of analysis may not be individual concepts, as many researchers have assumed, but systems of concepts”. Clearly, the concept mapping task itself may influence the results. For example, presenting a list of concepts may help retrieve an isolated fact from memory without the corresponding concept actually being a (psychological) concept for the mapper. Conversely, presenting a list of labels to be used as linking words (i.e. the labels of connections) can foster the creation of propositions that would not have been created without the provided list (cf. Yin et al. 2005, p. 177), or even the arbitrary creation of propositions that are not retrieved from memory but simply guessed.

Due to the nature of concept mapping, it is paramount to acknowledge that, in contrast to other forms of externalization like relatedness judgments, for which missing links may be interpreted (cf. Rye & Rubba 1998, p. 543), it is in general not possible to deduce anything from missing elements in a concept map. It merely represents an externalized part of a person's knowledge. This externalization process, as has been mentioned in section 3.1.2, can be influenced by a number of variables, for example the volition of the person, the prior experience with concept mapping, the time given to create the map, and more. If concepts or propositions are not contained in a concept map, the knowledge structure of the person might nevertheless contain them. This is not generally heeded, though: for example, the scoring system presented by Gouli et al. (2005b, p. 426) explicitly counts missing elements - when compared to an expert's map - which negatively impact the total score of a map. Concerning the elements that are present in a concept map, they are typically treated as an artifact of a person's knowledge structure. It is of course possible that this person has only guessed them and that they are not the result of a process of integration and thus do not reflect anything of the person's knowledge. Whether or not this has been the case is typically not observable in the outcome, though. However, nearly all methods of externalization are prone to this flaw, as the process itself is not easily observable.

The rest of this section investigates some of the (arguably) most influential factors on the externalization in a concept mapping task. Clearly, the motivation of the mappers is pivotal for the results. Also, the time for the concept mapping task and the training in concept mapping are heavily influential factors. In the literature, there does not seem to be a clear indication of how much time is needed for a given complexity of the maps. Some results concerning this problem are presented in the next section. It is also unclear how much and what kind of training is necessary before concept maps can successfully be used as a method of externalization. Typically, the training found in the literature is based on some explanations about concept mapping, an example map, and, often, a training map that persons are asked to produce before the “real” map. The aspects of motivation and training are especially important for studies consisting of several, repeated measurements of the same person. In this case, a development in the maps' complexity, positive or negative, might just be the result of an increasing fluency in concept mapping or a decreasing motivation to keep drawing maps. In general, it will be very difficult to quantify the extent of influence that these variables have on the result of the externalization. So, it is probably best to judge on a case-by-case basis whether it seems plausible to suspect a visible influence on research results, based on the actual data.

Since concept maps rely on language, it is also important to keep in mind the effect that expressing a mental idea in language has: “Concepts also take part in a bidirectional relationship with language. In particular, one's repertoire of concepts may influence the types of word meanings that one learns, whereas the language that one speaks may influence the types of concepts that one forms” (Goldstone & Kersten 2003, p. 613). Concerning a list of concepts, Cooke (1994, p. 825) notes that “it must be kept in mind that the stimuli are words, not concepts, which may be interpreted differently by different individuals”. Going even further than that, Solomon et al. (1999, p. 100) state that “[c]ommunicating about an entity also appears to affect conceptual structure and categorization”. While this is usually taken as a good sign, in the sense that it fosters learning, it also succinctly expresses the inherent problems of externalizing knowledge: As has been argued in chapter 4, drawing a concept map is a form of communicating, fosters meta-cognition, and also is an act of knowledge creation. This, in turn, may affect the knowledge structure that is to be externalized. In other words, the method of measurement may affect the very aspect to be measured. Consequently, Trumpower et al. (2010, p. 9f.) suggest that using relatedness judgments instead of concept mapping is a better way of arriving at structural information because it does not require persons to label the links and may therefore be less dependent on language skills. Also, the explicit structure of a map, as a person is drawing it, can bias the externalization process since it may foster the creation of “visually or structurally appealing” (Trumpower et al. 2010, p. 10) maps regardless of the actual knowledge structure that should be externalized. However, at least at the end of a concept map creation process, the changes in the knowledge structure that were triggered by concept mapping will most probably also have been integrated into the map. At this point, the knowledge structure and the map are consistent again.

In conclusion, it is assumed - based on the vast amount of literature on successful concept mapping tasks and the psychological backing of the method - that at least the “gold standard” task as described in section 4.2.2, with the common restrictions of a given list of concepts or a restricting list of concepts, is a valid approach to externalizing parts of the conceptual knowledge of a person. However, the externalization is influenced to an unknown extent by a set of variables that is not fully known, and therefore all results must be treated with caution in analysis.

6.1.1 Observing the Externalization

This section presents several results concerning the process of concept map creation that have been gathered in the course of this thesis. They are concerned with three different aspects: the general complexity of concept maps achieved in assessment tasks, the actual process of creating them - especially regarding the required time - and the choice of labels. All three aspects are interesting, since they provide insights into the influence of variables on the process of knowledge externalization during concept mapping. There are two possible methods of investigating this process: Either the process is observed (unobtrusively), or the drawing persons themselves are asked about the externalization process, for example by interviewing them afterwards or by having them comment on the creation process. This itself is an externalization again. Both methods will have their respective strengths and weaknesses and one cannot completely replace the other. Yin et al. (2005) used the think-aloud technique to investigate the generation process. They found that supplying a list of pre-defined linking phrases seemingly changes the cognitive process underlying map creation. Concerning the value of observing the creation of a concept map, Cañas (2008, p. 62) notes: “Following the evolution of propositions over a given time span, furthermore, can help visualize the process of meaningful learning as revealed by subsumption, progressive differentiation and integrative reconciliation of concepts, link reworking, and overall map reorganization.”

As part of this thesis, an online editor for concept map creation has been designed, as described in chapter 8. This allows a completely unobtrusive monitoring of the process of map creation. The editor has been used in several different applications and contexts: Some of the maps were used as part of an exam at the university, others were drawn completely voluntarily as a learning aid, and some were collected as part of a survey. Also, in some of the cases, a list of concepts or a restricting list of concepts was used, while in others there were no restrictions whatsoever. Most of the maps were drawn by learners, some were done by experts for the respective setting, namely secondary school teachers and advanced students.

One general observation is that, when asking learners to draw concept maps, the result will typically be rather sparse, tree-like graphs (see section 6.2.1 below). Of several hundred concept maps that were collected in various contexts, including all of the maps used in the case studies of the fourth part, the number of concepts and propositions were counted. When correlating the number of concepts to the number of propositions using Pearson's product-moment correlation, the coefficient lies, with p < 0.01, in the interval between 0.98 and 0.99. So, there is a clear linear dependence between the number of concepts and the number of propositions. Also, the mean value of the quotients of the number of propositions divided by the number of concepts over all maps is 0.98. So, it seems that not only is there a linear trend between the number of concepts and propositions, but the linear factor is close to 1, i.e. concept maps mostly resemble trees. While it may seem that densely connected conceptual knowledge is preferable, a very high proposition-to-concept ratio is not considered optimal elsewhere in the literature (cf. Glöggler 1997, p. 136).

To gauge the effect of using a software-based approach, all of the maps that were not drawn using any form of software were compared to all the maps that were drawn using the editor. The average ratio of propositions to concepts for the non-software maps is 0.92. Also, a t-test performed on both sets of these ratios (number of propositions divided by number of concepts for every single map) reveals a significant difference (p = 0.014) between the two populations. So, it seems that using a software-based approach influences the density of the maps in a small but positive way.

Next, the actual process of creating the maps is analyzed in more detail. The online editor records a snapshot of the concept map after each of the basic operations of adding, removing or renaming a concept or proposition. The simple moving of concepts on the plane is not recorded. The snapshots can be used to identify the working pattern used by the creators. One typical working pattern might be, for example, to always create a new concept and then a new connection using this concept. A different working pattern, and one that might more closely resemble an exhaustive externalization process, would be to create a few concepts and then to create many propositions between these concepts. To analyze the pattern, the difference between consecutive snapshots was identified and the operation that was used between the snapshots was marked with a code (e.g. ‘C’ for creating a concept). Then, the sequence of codes of all the snapshots of a map describes the way the concept map's structure has been created over time. For example, if the codes ‘C’ and ‘P’ stand for adding a concept or a proposition respectively, then one pattern of creation might have the code “CPCPCPCPCP”; another pattern could be “CCPPPCCCPPCPCPP”. The results of all collected maps in the editor show

that the average length of consecutive operations of either adding only concepts or adding only propositions is between 1.5 and 2.5. In other words, in general, at most two concepts are added before a proposition is added, or at most two propositions are added before a new concept is added. This clearly points to a working pattern of repeatedly connecting new concepts to an existing map structure - which will result in a tree-like structure.
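Computing this average run length from a coded operation sequence is straightforward; the sketch below uses the two invented example sequences from above and distinguishes only the codes ‘C’ and ‘P’.

from itertools import groupby

def mean_run_length(sequence: str) -> float:
    """Average length of runs of identical consecutive operation codes."""
    runs = [len(list(group)) for _, group in groupby(sequence)]
    return sum(runs) / len(runs)

print(mean_run_length("CPCPCPCPCP"))       # 1.0: strictly alternating additions
print(mean_run_length("CCPPPCCCPPCPCPP"))  # 1.875: slightly longer runs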

A variable that will heavily influence the externalization process is time: If not enough time is given, the externalization of a person's knowledge in a concept map must necessarily fall short regarding validity. In the literature, there does not seem to be an established amount of time that is necessary for a concept mapping task. The time spent between consecutive actions in the concept mapping editor is recorded and can be analyzed. In the course of two runs of the same lecture described in section 6.3.2, the students had to draw concept maps as part of their final grade but were not given any time restrictions (apart from a final due date), as the concept mapping task was done at home on their own schedule. When not differentiating between creating concepts and creating propositions, a histogram of the time spent between each action can be seen in Fig. 9. An average of about 20 seconds (17.51 s) is spent on each action; the standard deviation is 14.56 seconds. Since the collected maps are also rather sparse, with a concept to proposition ratio close to 1 as reported above, it can be assumed that, if given a list of n concepts, a time of at least n minutes should be enough to not negatively influence the externalization - at least when the maps are drawn using software. Whether or not this also holds true for maps created with pen and paper is unclear, but at least the actual performance of typing and clicking versus writing can be assumed to be of roughly equal duration. However, time is only one of the constraints to consider regarding the number of concepts given in a concept mapping test. For example, Clariana & Taricani (2010) report that the predictive ability of concept map scores decreased when increasing the number of concepts that students should use.

Finally, the labels that are used in concept mapping are investigated. Going even further than the categorization of different types of propositions presented in section 4.1, from a researcher's perspective it would be favorable to have the mapper choose linking words from a given list of labels, since this would greatly ease automated analysis and scoring of large amounts of maps. However, such a list could also negatively influence the externalization. While Yin et al. (2005) found a difference in cognitive processes when supplying a list of labels, this list had been extracted from an expert's map. If, instead, a list could be found that was independent of the subject matter context and provided a reasonable amount of coverage over possible concept relations, a different outcome could be expected.


Fig. 9: The time between consecutive actions in the concept mapping editor for 58 (rather large) concept maps of students of two runs of the same lecture. The maps were drawn with no time restrictions and, in general, over several drawing sessions.

Cooke (1983) presents a classification of link types for similarity ratings concerning the semantic relations specifically between concepts of computer programming. The types are can be done to/is operated on, is part of/contains, interchangeable/interchangeable, is a type of/superset, is a type of/could be, and can be done to/is used with (cf. Cooke 1990, p. 238). The two wordings are given since the types can be used in both directions for a pair of concepts. Also, Sousa (2009, p. 200) gives the results of a study that identified nine types of cognitive relationships for concept mapping with the following names and descriptions:

1. Classification: A is an example of B.

2. Defining/subsuming: A is a property of B.

3. Equivalence: A is identical to B.

4. Similarity: A is similar to B.

5. Difference: A is unlike B.

6. Quantity: A is greater/less than B.

7. Time sequence: A occurs before/after B.

8. Causal: A causes B.

9. Enabling: A enables/allows B.

Clearly, from an epistemological view as presented in the next section, these categories are not enough to encode arbitrary propositional facts, like “the earth rotates around the sun”, which is clearly neither a causal nor an enabling relationship between the concepts earth and sun, nor any of the others. The system of Cooke is not applicable due to the different context, of course. In the course of the case study presented in chapter 10 in the next part, the labels that students were actually using in a concept mapping task were analyzed. The results have been published in Hubwieser & Mühling (2011c). Students were free to choose any labels during creation. Later on, these labels were categorized by a qualitative approach. To this end, each label was rephrased by stripping unnecessary prepositions and converting it into an active present tense form, as far as possible. These rephrased labels were then categorized by combining them under some appropriate label. For example, the labels “characterizes” and “describes” are combined into the category “describes”. The process is subjective to some extent. To increase objectivity, there was also a category “too complex” that contained all labels that could not easily be placed into a category described by a present tense verb. The categories of the labels that are not too complex and that were actively used by at least three students are only the following 29 (translated from German):

activates, belongs to, calculates, changes, concretizes, connects, consists of, contains, controls, corresponds to, creates, describes, determines, enables, has the effect of, fulfills, gets, influences, is divided into, is entered, is opposite of, is rejected, works with, names, needs, repeats, results in, uses.

Two of these only appeared in passive form and were therefore left that way. Many of the categories can be placed in the two systems presented above. For example, “enables” and “needs” belong to “Enabling”, while “determines” and “results in” can be placed into the “Causal” category. Some of the resulting categories, like “calculates” or “activates”, are probably a result of the specific context of computer science. In other contexts, like biology, other categories will most probably appear, while “calculates”, for example, probably won't.

So, it seems that providing a list of linking words that is encompassing enough to allow for arbitrary concept mapping tasks is a difficult endeavor. Also, it remains unclear to what extent it would influence the externalization process. Automated scoring of propositions would be enhanced by a given list of labels. Using computer support for drawing concept maps allows a more flexible approach to this problem, as also done by Weber & Schumann (2000, p. 162): Instead of giving a fixed list of labels, a computer system might suggest several labels while still allowing users to create their own linking phrases if they like. In this way, users would not be restricted but guided toward reusing labels of other users.

6.2 Epistemological View

From an epistemological point of view, mainly the propositions encoded in concept maps are of interest. The same information can also be conveyed by, e.g., a list of the propositions as sentences. The structure of the concept map (e.g. hierarchical or taxonomy-like) is then merely a chosen way of displaying the knowledge encoded in the propositions that makes the map easier to grasp visually, for example. From an epistemological perspective, a proposition should state some form of “fact” in the given subject matter of the concept map and not just encode trivial connections that could be made between almost any pair of concepts, like “has something to do with”.

If, as in the last section, concept mapping is assumed to be based on the mental process of integration, the fact that a proposition encompasses (exactly) two concepts is not a restriction. Novak, as presented in section 4.1, defines a proposition such that it connects two or more concepts, and Larraza-Mendiluze & Garay-Vitoria (2013, Figure 8) show an example of a triadic proposition in a concept map that is split up for better readability. However, it remains unclear how such propositions are compatible with the way concept maps work. While, strictly speaking, a proposition that connects a concept to itself (e.g. “objects are communicating with objects”) is not covered by the definition, it is no problem to integrate such an association in a concept map conceptually and visually. Concerning propositions with more than two concepts, there is no obvious solution, however. For example, a simple propositional fact like “The result of a transition (for a given deterministic automaton) is determined by its current state and the next input” is not easily transformed into a proposition between two concepts. If the context of a given deterministic automaton is clear, the concepts that are present in this sentence are: transition, state, and input. Therefore, the proposition is not dyadic (based on two concepts) but triadic.

A concept map containing these three concepts with meaningful connections regardless of the given proposition can easily be created. However, the aforementioned propositional fact cannot be encoded in such a map in an unambiguous manner. Basically, there are two different ways in which this can be attempted without changing

the syntax of concept mapping. The triadic proposition can be split up into several dyadic ones, as in Fig. 10(a). Then, the propositions are formed correctly, but taken in isolation they are wrong. Instead, they would have to be interpreted as a set of propositions that are combined in some way. Without specifically hinting at this problem, Cañas (2008, p. 62) states that a proposition in a concept map is not valid if, among others, “it is not autonomous, i.e., it is a fragment or continuation of a larger grammatical structure such as a sentence, and has no meaning independently of this bigger structure”. Alternatively, the proposition can be encoded by using more “complex” linking words encompassing other concepts, as in Fig. 10(b). In this case, each proposition is correct by itself and, in this example, redundant. As a matter of fact, the concept map of Fig. 10(b) would contain the same information if one of the concepts input or state was missing completely. However, this defies the semantics of concept maps, according to which concepts appear as separate elements. Also, it defies the notion of a well-constructed concept map presented in section 4.1, according to which only short labels should be used for propositions. The same goes for using more complex concept labels instead of complex propositions. The first case study presented in chapter 10 will shed some more light on this restriction by analyzing different patterns of splitting up a proposition.

6.2.1 Concept Maps as Graphs

This section will establish a formal definition of several typical forms of concept maps, in order to provide a basis for the following definitions of concept landscapes. The formalism is kept minimal on purpose in order to make the definition usable and easy to communicate.

It is self-evident that concept maps in their most general form can be modeled mathematically as little more than a labeled, directed graph, as noted e.g. by Leake et al. (2005), Koponen & Pehkonen (2010), or Anohina-Naumeca (2012). The concepts form the nodes of the graph and the labeled arrows form the edges. Care must be taken, however, that a concept should appear only once in a concept map.

So, the basic model of a concept map $CM = (V, E, L_V, L_E)$ consists of a finite (and usually non-empty) set of vertices $V$ and a set of edges $E$ that form a directed graph, and two functions $L_V : V \to \Sigma^*$ and $L_E : E \to \Sigma^*$ that map the vertices and edges to the set of all words $\Sigma^*$ formed over some alphabet $\Sigma$. Additionally, it must hold that $L_V$ is injective, i.e. that $L_V(x) = L_V(y)$ if and only if $x = y$. A proposition for some edge $p = (x, y)$ is then the triple of labels $(L_V(x), L_E(p), L_V(y))$. The usual application of concept maps assumes that this triple, if read as a “sentence”, will be understandable for human readers.

(a) Splitting the proposition up

(b) Using complex linking words

Fig. 10: A triadic proposition encoded in a concept map.

The labels can be defined more precisely (e.g. by formal languages), to restrict concepts to a single word, for example. Also, though not very common, self-loops are allowed in this definition and are also sometimes found in real-world concept maps, as already noted above.
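As an illustration of this basic model, the following sketch defines a minimal concept map data structure; the class and its fields are hypothetical and not taken from the software described in chapter 8. Identifying concepts by their label directly enforces the injectivity of $L_V$.

from dataclasses import dataclass, field

@dataclass
class ConceptMap:
    """A concept map as a labeled directed graph; concepts are identified by their
    unique label, so the injectivity of L_V holds by construction."""
    concepts: set = field(default_factory=set)
    propositions: list = field(default_factory=list)   # (concept, link label, concept) triples

    def add_proposition(self, source: str, label: str, target: str) -> None:
        self.concepts.update((source, target))          # each concept appears only once
        self.propositions.append((source, label, target))

cm = ConceptMap()
cm.add_proposition('object', 'is instance of', 'class')
cm.add_proposition('class', 'defines', 'method')
print(cm.propositions)   # readable (concept, link, concept) triples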

A concept map that is restricted to a list of concepts and/or to a list of edge labels can be modeled as follows - both restrictions can be applied simultaneously:

Concepts Let $C \subseteq \Sigma^*$ be a set of concepts (given as labels). Set $V_C = \{v_1, v_2, \ldots, v_{|C|}\}$ and define $L_C : V_C \to C$ as a bijective mapping function. Then the concept map $CM_C = (V_C, E, L_C, L_E)$, for some sets of edges and edge labels, is restricted to the concepts of $C$.

Edges Let $L \subseteq \Sigma^*$ be a set of edge labels. Define $L_E : E \to L$ as a mapping function. Then the concept map $CM_E = (V, E, L_V, L_E)$ is restricted to the edge labels of $L$.

Given a set of concepts $C$, it is sometimes convenient for analysis to restrict an existing map to the concepts of $C$. This is equivalent to using the subgraph induced by the vertex set $L_V^{-1}(C)$, i.e. applying the inverse function of $L_V$ to a set of labels.

Adding the restriction that a concept map should be hierarchical is not straightforward. A hierarchical graph or network in graph theory is typically seen as one that “divides naturally into groups and these groups themselves divide into subgroups, and so on until we reach the level of individual vertices” (Clauset, Moore & Newman 2007, p. 2). A hierarchical concept map on the other hand is one where the concepts can be divided into several levels, starting from one central concept. One level subsumes the nodes of the next level, structurally and semantically, “that is, the more general, more inclusive concepts should be at the top of the map, with progressively more specific, less inclusive concepts arranged below them” (Novak & Gowin 1984, p. 15f.).

While this bears some resemblance with the graph theoretic definition of a forest (a graph containing no cycles), a hierarchical concept map can also contain edges within one level of the hierarchy and also edges that are not strictly between adjacent levels (cross-links) - which may potentially form cycles in the graph. One solution is to define a hierarchical concept map as a regular concept map with the restriction that there must be a partition of the set of nodes, such that:

1. One subset of the partition contains exactly one node. This node is the “root”, i.e. it forms the topmost level of the hierarchy / concept map.

2. There must be an ordering of the subsets of the partition, such that “most” edges that are not connecting nodes within a subset are connecting nodes of adjacent subsets according to the ordering. The edges that are connecting nodes of non-adjacent subsets are cross-links.

Then, each subset forms a level of the hierarchy. For ways of determining the root node of a concept map, see e.g. Valerio et al. (2008). Also, this definition of cross-links only holds for their original meaning. The different meaning of a cross-link connecting different segments of a map can be formalized using graph communities, as presented in section 7.2.3.3.
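One possible - and deliberately simplified - operationalization of these two conditions is sketched below: the levels are taken as breadth-first distances from a chosen root (ignoring edge direction), and edges whose endpoints lie more than one level apart are flagged as cross-links. This is an illustrative assumption, not the formalization used later in this thesis, and it presumes a connected map.

from collections import deque

def levels_and_crosslinks(edges, root):
    """Assign each concept a level (BFS distance from the root, ignoring edge
    direction) and return the edges connecting non-adjacent levels as cross-links."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    level, queue = {root: 0}, deque([root])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in level:
                level[nxt] = level[node] + 1
                queue.append(nxt)

    cross_links = [(a, b) for a, b in edges if abs(level[a] - level[b]) > 1]
    return level, cross_links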

In the analysis of concept maps, it is often required that the propositions of a map are scored. This score can then be used for a more detailed evaluation. A scored concept map $CM = (V, E, w, L_V, L_E)$ is a weighted graph, where the weight function $w(i)$ denotes the score of the proposition formed by edge $i$ and its incident nodes.

Also, for the concept landscapes presented in the next chapter, a graph can have a layout and a temporal development (i.e. a development over time). The layout can be modeled by a function $P : V \to \mathbb{R} \times \mathbb{R}$ that places each vertex on the two-dimensional Euclidean plane. While the concept of time is a very complex subject by itself, dynamic sets only need a very limited definition of time. There are several discrete and distinct points in time (without any duration) $t_1, t_2, \ldots, t_n$ and a (total) temporal ordering of these points. In other words, for each pair $t_i, t_j$ it must be possible to determine whether $t_i$ was before $t_j$ or vice versa. This is a simple notion of discrete time events. In the literature, there are several models of such temporal or dynamic graphs to be found. Commonly, they are either modeled as a series of graphs (cf. Holme 2003) or as a series of operations (e.g. adding an edge) on a graph (cf. Mehta 2005). Both methods are applicable to concept maps as well. While the analysis presented above concerning the sequence of operations is more influenced by the second way, the analysis of concept landscapes is often more akin to the first way of modeling. A possible way of formalization is based on dynamic sets of vertices, edges and labels, denoted as $V(i)$, $E(i)$, $L_V(i)$ and $L_E(i)$ respectively; $i$ denotes a point in time and the dynamic set may be different, i.e. contain different elements, for each point in time.

6.2.1.1 Representations in Computers

When working with concept maps in computer-based analysis, it is necessary to represent the concept map digitally in a suitable format. The most common representations of graphs in a computer are (cf. Cormen et al. 2001, p. 527ff.):

Adjacency Matrix For a set of $n$ vertices, an $n \times n$ binary matrix is constructed, where an entry $a_{ij} = 1$ if and only if the nodes represented by the $i$-th row and $j$-th column are connected (adjacent) in the graph. For undirected graphs, this matrix is symmetric.

Adjacency List For a set of n vertices, an ordered set containing n lists is created, where the i-th list contains all adjacent nodes of the i-th node.

Incidence Matrix For a set of $n$ vertices and $m$ edges, an $m \times n$ binary matrix is constructed, where an entry $a_{ij} = 1$ if and only if the edge represented by row $i$ is incident to the vertex represented by column $j$.

Incidence List (Edge List) For a set of $m$ edges, a list with $m$ entries is created, where the $i$-th entry contains the two nodes that are incident to the edge corresponding to that entry. It can also be represented as an $m \times 2$ matrix, where each row corresponds to an edge and contains the two incident nodes of that edge in the two columns.

For concept maps, it is sometimes convenient to include the edge labels in the representation (e.g. by not storing an edge list, but instead a “proposition list”, or by using the edge labels as entries in the adjacency matrix itself). When the chosen method of analysis ignores the edge labels, it is usually more convenient to treat the concept map as an undirected graph. The direction of an edge is basically only dependent on the chosen wording of the proposition and usually an equivalent alternative wording can be found that reverses the direction of the edge. In this case, an undirected graph contains the same structural information, as also noted by Anohina-Naumeca (2012).

In addition to these representations, this work will make use of the following two representations of concept maps, even though information is lost in the transformation (a small sketch of both is given after the definitions):

Concept Vector For a given (ordered) set of concepts $C$, a binary vector $v \in \mathbb{B}^{|C|}$ is created, where $v_i = 1$ if and only if the $i$-th concept of $C$ has at least one incident edge in the concept map.

Edge Vector For a given (ordered) set of concepts $C$ and the (ordered) set of edges $E_C$ of the complete graph using $C$ as its set of nodes, a binary vector $v \in \mathbb{B}^{|E_C|}$ is created, where $v_i = 1$ if and only if the two concepts that are connected by the $i$-th edge of $E_C$ are connected in the concept map.
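The sketch below illustrates both vector representations for an invented concept list and map. The edge vector enumerates the unordered concept pairs of the complete graph over $C$ in a fixed order (here via itertools.combinations) and ignores edge direction, as discussed above.

from itertools import combinations

def concept_vector(concepts, edges):
    """1 at position i iff the i-th concept has at least one incident edge."""
    used = {c for edge in edges for c in edge}
    return [1 if c in used else 0 for c in concepts]

def edge_vector(concepts, edges):
    """1 at position i iff the i-th edge of the complete graph over the concepts
    (ordered by itertools.combinations) is present in the map, ignoring direction."""
    present = {frozenset(edge) for edge in edges}
    return [1 if frozenset(pair) in present else 0
            for pair in combinations(concepts, 2)]

concepts = ['class', 'object', 'method', 'attribute']
edges = [('object', 'class'), ('class', 'method')]
print(concept_vector(concepts, edges))  # [1, 1, 1, 0]
print(edge_vector(concepts, edges))     # [1, 1, 0, 0, 0, 0]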

6.3 Educational View

“Consistent with constructivist epistemology and cognitive psychology, the theoretical framework that supports the use of concept mapping hinges on the notion of meaning. Because meaning is both constructed and shared, the expectation of the student as an active agent in participating in the development of understanding is implicit.” (Edmondson 2005, p. 36)

The “educational view” on concept mapping is, in a way, a reconciliation of the previous rather extreme views. It is - in this work - based on the notion that learners create concept maps as a form of assessment or measurement of their structural knowledge. This requires that concept mapping is both valid and reliable and that its limits and potentials regarding assessment are clear. From the previous two perspectives, the validity concerning both the externalization of structural information of conceptual knowledge and the encoding of knowledge in the form of (dyadic) propositions has been established. Using the “gold standard” of concept mapping requires that persons draw “real” concept maps, in the sense that they must explicitly give useful edge labels. Even if the labels are ignored in the subsequent analysis of the maps, this ensures that the properties of the structural configuration are not overestimated. A connection between concepts should only be given if there is an associated “element” of (conceptual) knowledge that the mapper retrieved from memory by using the mental process of integration or (at least) the recall of some factual information.

Concept mapping from an educational perspective also often entails an educational process that is assumed to influence the knowledge structure. In other words, concept mapping may be used to “monitor” how a learning opportunity affects a person's memory. As in most forms of assessment, however, the result of a concept mapping task is first and foremost an attribute of the person assessed and not of the educational process. Implicitly, it will often be assumed that it is this process that principally determines the result of the assessment. For example, when students take a math test in school, the results of the test (hopefully) measure the students' knowledge or abilities in math. These may or may not have been a result of the math classes they attended. Implicitly, though, the assumption will often be made that it is indeed a direct result of the attendance, obviously combined with personal attributes and efforts of the students (e.g. preparing for the test).

In this work, it is assumed that the most gain from analyzing concept maps comes from the structural information of the maps and not from the list of propositions that

they encode. This structure is present both on a macro level, encompassing all concepts, and on a micro level, encompassing only a small number of concepts and their interrelations. There are, of course, other possibilities of arriving at such an externalization. For example, the structural information could be externalized by relatedness judgments and the propositional information could be collected afterwards (cf. Albert & Steiner 2005). Also, concerning the epistemological limitations, when concept mapping is mostly seen as a way of organizing and displaying knowledge, there are better alternatives, due to the restrictions of concept maps. However, concept mapping has the inherent advantage that it is easy to learn for most people - in contrast to e.g. describing a knowledge base as an ontology - and it offers a direct benefit for the mapping persons - in contrast to e.g. relatedness judgments - especially when the mappers can keep or continue working on their concept maps. Also, concept maps are a valuable tool of formative assessment, in contrast to many other methods of externalizing knowledge. Formative assessments encompass “all those activities undertaken by teachers, and/or by their students, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged” (Black & Wiliam 1998, p. 7). This makes concept maps valuable for applications in educational settings, as there is more to be gained than “just” the externalized knowledge.

6.3.1 Assessing Learning

Concept maps, when used as a form of assessment, may be used to measure whether or not an intended learning objective (see section 3.3.1) has been met by students. This section deals with the question of which types of learning objectives a concept mapping task can cover, in principle. It is based on the taxonomies described in section 3.3.1.1. Cañas & Novak (2012, p. 248) note without further elaboration that “when concept maps are used to facilitate learning, they can also be used as an assessment tool capable of assessing not only the recall of information but also those higher order skills that are described in Bloom taxonomies”. And, according to Novak & Cañas (2008, p. 13), the process of identifying cross-links involves “high levels of cognitive performance, namely evaluation and synthesis of knowledge”. Again, the authors don’t give a reason for this classification.

The prototypical learning objective that can be assessed with concept mapping, following the taxonomy of Anderson and Krathwohl, is “understanding conceptual knowledge”. This is an oversimplification, though. As has been argued above, drawing a proposition can be a simple task of recalling a fact (“remembering factual knowledge”) or it can require understanding a more complex interplay of several knowledge elements, depending on whether or not the proposition (e.g. “parameters can be passed by value”) has been rote-learned as a “fact” by the person drawing the map. Clearly, by design, concept maps will fall short for assessing learning objectives in the realm of procedural knowledge. Similarly, the cognitive process “apply” is typically not easily covered by a concept mapping task. This combination is not surprising: “Apply involves using procedures to perform exercises or solve problems. Thus, Apply is closely linked with Procedural knowledge” (Anderson & Krathwohl 2001, p. 77). However, when including aspects like the layout of the concept map, producing a “good” layout can be considered an application of meta-cognitive knowledge.

To investigate whether or not the higher-order cognitive processes can be reached by concept mapping, the subtasks that are needed to draw a concept map are analyzed, as if they were expressed as actual learning outcomes for some (hypothetical) subject matter. For example the subtask of “identifying key concepts” that is analyzed in the following actually stands for an operationalized learning outcome that could be, for example, “the student is able to identify key concepts of object orientation”. The knowledge component is taken to be either factual or conceptual knowledge. The task of creating a concept map is defined by Novak & Cañas (2008) as:

“Given a selected domain and a defined question or problem in this domain, the next step is to identify the key concepts that apply to this domain. Usually 15 to 25 concepts will suffice. These concepts could be listed, and then from this list a rank ordered list should be established from the most general, most inclusive concept, for this particular problem or situation at the top of the list, to the most specific, least general concept at the bottom of the list.[...] The next step is to construct a preliminary concept map.[...] Once the preliminary map is built, cross-links should be sought. [...] Finally, the map should be revised, concepts re-positioned in ways that lead to clarity and better over-all structure, and a ‘final’ map prepared.” (Novak & Cañas 2008, pp. 11-14)

Anderson & Krathwohl (2001, Table 5.1) give detailed descriptions about the cognitive processes of the taxonomy. Applying these to the task description above, the following matches can be identified by interpreting the description of the concept mapping task as described and looking for correspondences to the descriptions of the taxonomy - especially by using the list of synonyms that is given for each of the cognitive processes. Also, Sousa (2009, p. 250ff.) presents a description of the different cognitive processes that is used for the interpretation. The analysis yields:

• “identify the key concepts” While “Identifying” is given as an alternative name of the sub-category “Remember-Recalling”, in this case it means “selecting” key concepts. “Select” is given as an alternative name for the sub-category “Analyze-Differentiating”. Also “identifying parts” is given as a description of the process “Analyze”.

• “from this list a rank ordered list should be established” Ordering is a form of structuring, which belongs to the sub-category “Analyze-Organizing”, also “recognizing organizational principles” is given as a description of the process “Analyze”.

• “construct a preliminary concept map” Constructing a concept map is rather general. Particularly, it involves arranging the selected concepts and deciding which to link and how to name the links. Therefore, the task encompasses retrieving relevant information about the concepts from memory (category “Remember-Recalling”), selecting propositions to be included in the map (category “Analyze-Differentiating”) and integrating those propositions into the map (category “Analyzing-Organizing”). “Recall of semantic memory” is a description of the category “Remember” and “examining the relationships of the parts to each other and to the whole” is given as a description of “Analyze”.

• “cross-links should be sought” Again, this subtask is too general to be directly analyzed. It encompasses tasks of retrieving information, selecting and integrating, and therefore fits the cognitive process of “Analyzing”. For example, “organize and reorganize information into categories” is given as a description of “Analyze” which can be applicable for finding cross-links. As there are no “criteria and standards” that drive an evaluation process, choosing “Evaluating” as the cognitive process is not applicable in this case.

• “the map should be revised, concepts re-positioned in ways that lead to clarity and better over-all structure” Assuming that a concept map should be hierarchical, this can be seen as a criterion. Then this subtask may be seen as “Evaluate-Critiquing”. Also, “learners tend to consolidate their thinking” is given as a description for “Evaluate”. Otherwise, it is more a form of structuring and classification, as above, corresponding to the sub-categories “Analyze-Organizing” and “Understand-Classifying” respectively.

In conclusion, a concept mapping task encompasses several subtasks that are placed differently in the taxonomy. The “highest” category that a regular concept mapping task can “safely” reach according to this analysis is “analyzing conceptual knowledge”. When using the taxonomy of Fuller et al. (2007), a concept mapping task can cover almost every aspect of the “Interpreting” dimension - which makes concept mapping especially interesting in the context of computer science, on which this taxonomy is specifically based.

In a related investigation, Passmore (1999) applies the SOLO taxonomy of observed learning outcomes to elements of concept maps. Propositions are assessed according to the taxonomy and this in turn is used to judge the quality of a concept map. According to his reasoning, propositions can be of any type up to “extended abstract”. However, this is only possible because the propositions are extended with actual explanations of the underlying reasoning and concept maps are used to summarize an instructional text instead of purely externalizing knowledge:

“Method 1 [out of three Methods of assessing concept maps] is an adaption of Biggs and Collis’ SOLO knowledge assessment method. It rates concept map links against the SOLO response scale. Links consists of one or more items of data and a linking statement. Each item of data expresses one element of the relationship between two concepts. [...] The understanding contained in this connection is expressed in the linking statement. Method 1 determines the quality of this understanding.” (Passmore 1999, p. 71)

It is questionable whether a simple proposition of a typical concept map can be accurately assessed using the SOLO taxonomy. However, it seems plausible that an entire map, especially when contrasted with some form of instructional input, can be assessed this way.

Clearly, concept maps are not a suitable tool to measure competencies - except for the competence of concept mapping itself. However, as has been presented in section 3.3.2.2, a competence usually entails a cognitive (or knowledge) component that is a prerequisite of competent performance. As has been established in this chapter, concept maps can be used to assess this cognitive aspect of a competence - at least as far as it concerns conceptual or factual knowledge. Therefore concept maps are deemed to be a suitable instrument of measurement even in the light of competence based assessment scenarios.

              Concept Map   SUM      PT        EXP       WIKI
Concept Map   -             0.50∗    (0.17)    (0.04)    (-0.04)
SUM                         -        (-0.07)   (0.19)    (0.20)
PT                                   -         0.87∗∗    (0.12)
EXP                                            -         (0.09)
WIKI                                                     -

Table 6.1: Correlation between different elements of the assessment. Spearman’s rank correlation was used (∗∗ = p < 0.01, ∗ = p < 0.05, () = p > 0.05).

6.3.2 Scoring Concept Maps

Scoring concept maps has a long tradition both in assessment and research. Often, the underlying assumption is that not all propositions should be treated equally. A proposition may be in contradiction to the views of experts on a subject matter and therefore indicate an inappropriate proposition or misconception. Also, a given proposition may be valued as “better” (mostly in the sense of “less trivial”) than another proposition. Therefore, it often seems reasonable to score the propositions and subsequently use that scoring in the process of assessment (or analysis). However, as has also been mentioned in section 4.2.2, a scoring often introduces a subjective element and the reliability of the scoring scheme is highly important.

If concept maps are used as a form of formative assessment, the scoring will have to be done in a way that allows insights into a person’s particular strengths and weaknesses concerning the conceptual knowledge expressed in a concept map. This often entails scoring each proposition. In the case studies of the next part, a scoring with three values has been adopted, ranging from “clearly wrong or uninterpretable” to “clearly correct”, with a middle value for cases where neither of the extreme values is applicable. In analyses, the propositions with this middle value can be left out in order to further increase the reliability of the scoring. The scoring method is inspired by the “relational with master map” method presented in section 4.2.2. In the case study presented in chapter 10, the correlation of the summed scores of concept maps using this scheme to other structural attributes of the concept maps is investigated. The results, which are presented in Table 10.1, show that there is indeed a medium to large effect present for the number of concepts, propositions, or components.

So, it seems that scoring each proposition in this way may only reveal information that is - in general - also present in other attributes of the maps. Since scoring each proposition of a set of maps takes a considerable amount of time, possibilities of reducing the necessary manual efforts are worth investigating. When the scoring is done for analysis or grading only and not for feedback to the learner, this becomes especially interesting. One such method is presented here. The lecture that formed the basis for the first case study was completely restructured in the winter term of 2011 according to constructivist principles, as described in more detail in German by Berges, Mühling, Hubwieser & Steuer (2013). Instead of an exam, the students had to create a portfolio that consisted of several elements as described below - including five concept maps for the five major topics of the lecture. The concept maps were scored using the following very basic scoring scheme:

1. Count the number of concepts that are used (c).

2. Evaluate the overall quality of the map holistically and assign a value of 0, 0.5 or 1 (q).

3. Calculate the score as $\min(\lfloor c/10 \rfloor \cdot q, 10)$ (a code sketch of this scheme is given directly below).
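The following is a minimal sketch of this scoring scheme in Python; the function name and the way the inputs are passed are illustrative assumptions, only the formula and the admissible values of q are taken from the scheme above.

```python
def concept_map_score(num_concepts: int, quality: float, max_score: float = 10) -> float:
    """Score a concept map from its number of concepts c and a holistic
    quality rating q in {0, 0.5, 1}: min(floor(c / 10) * q, 10)."""
    assert quality in (0, 0.5, 1), "q is assigned holistically as 0, 0.5 or 1"
    return min((num_concepts // 10) * quality, max_score)

# Example: a map with 47 concepts judged to be of medium quality (q = 0.5)
print(concept_map_score(47, 0.5))  # -> 2.0
```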

The maximal score of 10 points was set according to the maximal number of points that were awarded to the concept maps. The division by 10 was used for simplicity and because it fits the actual number of concepts well (in the sense that the results are spread out across the spectrum of 0 to 10 points). The holistic scoring was derived by checking whether or not the propositions are actually labeled in a useful way and whether or not the concepts make sense in the context of the lecture. So, the scoring scheme does not take the correctness of the propositions into account, based on the notion that misconceptions as expressed by the students may be valuable and should not negatively impact the score. The scoring system is not completely automatic, but the second step can be done very quickly by hand. This is akin to the manual classification of concept maps into the types of “spoke”, “chain”, or “net”, which has been suggested to be a fast and reliable alternative to manual scoring of propositions (see section 4.2.2). The method has been shown to be reliable between two different coders. To investigate the validity of the scoring system, the resulting scores of the concept maps were correlated with several other artifacts that were used for determining the grades, namely:

SUM The students were asked to write a summary of the topics of the lecture (at least 1000 words).

PT The students had to hand in several programming tasks over the course of the lectures. These could be done in groups, though.

EXP At the end of the semester, each student had to take part in an interview, explaining the solutions to the programming task individually.

WIKI As a creative element of the assessment, the students could write entries for a Wiki concerning the topics of the lecture.

Spearman’s rank correlation of each of these elements with the score of the concept maps can be seen in Table 6.1. The strong (and significant) correlation of the concept map scores with the summary is the second best value over all correlations - only bettered by the correlation of PT and EXP. A high correlation between concept mapping and written text has been identified before by e.g. Zimmaro et al. (1999). Also, seemingly, the different parts of the assessment are somewhat independent, with the programming tasks and explanations only weakly correlating with the rest. Overall, this indicates that actual programming tasks only partly capture conceptual knowledge. The case study presented in chapter 12 investigates this phenomenon more closely. In summary, the results presented here and in the case study show that a combination of qualitative and quantitative scoring with only few different score values and clear distinctions between quality levels works well and correlates with other methods that assess knowledge and abilities.

7 Concept Landscapes

Based on the constructivist theory of learning and the cognitive and neurological backings of this theory as presented in chapter 3, it must be acknowledged that learning is a subjective, personal process and that teaching is far more a fostering of learning than it is a transportation of knowledge. Nevertheless, the knowledge organization of experts tends to be similar, and when teaching larger groups, teaching will most probably affect the learners in similar ways. Investigating knowledge in a constructivist setting therefore can focus on two aspects:

1. The subjectively constructed, idiosyncratic knowledge structure of a person.

2. The commonalities and differences between the knowledge structures of several persons.

Concept maps have been established as a method of investigating the first aspect, namely the subjectively constructed knowledge structure of a person. Based on the findings of the last chapter and the related literature presented in the last part, it is clear that concept maps, when interpreted with caution, are a valuable tool in externalizing structural knowledge.

In this thesis and particularly in this chapter a generalization is presented that allows using concept mapping for the second aspect, i.e. to investigate the “knowledge structure” of groups of persons in several distinctly different ways. One specific way of analyzing shared aspects of concept maps, which is also encompassed by this generalization, is the creation and subsequent inspection of a weighted graph that is formed from a set of maps. This approach is, according to Eckert (2000, p. 5), typical for investigating shared knowledge structures and has recently been used by e.g. Larraza-Mendiluze & Garay-Vitoria (2013), who investigate a set of concept maps with techniques of social network analysis. Also, Glöggler (1997) presents a study that investigates the development of knowledge by using a technique similar to concept maps and using the data to create a weighted graph that is then analyzed.

The specific setting requires new ways of working with concept map data but also presents new opportunities for insights into the data that are not possible when focusing on single concept maps. For example, as has been argued above, it is not valid to interpret missing elements for single maps. For groups this is different, as a hypothetical example in an instructional setting shows: If a single person is missing a

particular concept or connection, then this may be for a variety of reasons unrelated to the learning or the instruction this person received. If, however, every person of, for example, a class misses a particular concept or proposition, it becomes far more likely that this is due to their specific shared learning environment or the input they received.

As has been mentioned in the last chapter, the externalization process during concept mapping is influenced by several variables. This influence can also be seen as “noise” that affects the measurement “concept mapping”. That the measurement of knowledge is inherently “noisy” has been observed at least for expert knowledge: “[A]lthough different experts may show variability in their judgments of concept relations, this variability often appears to be the result of random error rather than systematic differences in thinking” (Trumpower et al. 2010, p. 8). While the extent of “noise” in concept mapping in general is not known, it is at least plausible that for many of the influencing variables, the “noise” over many measurements will rather cancel out than systematically add up. This is especially true for personal variables that influence the externalization. For example, when a group of persons is creating concept maps, it is reasonable to assume that the motivation will vary from persons with only little motivation for the task to persons with rather high motivation. Overall, therefore, the influence of a lack of motivation for concept mapping that is detrimental when assessing only a single, unmotivated person is far less severe for a group where only a few persons are unmotivated. The same can be expected to hold, within reasonable bounds, for the training necessary to create concept maps and the time given to create the map. Therefore, it is expected that the combination of many concept maps for analysis effectively reduces the influence of the main variables that may impact the externalization in a negative way.

The next section presents concept landscapes, which encompass the knowledge structures of a group of persons expressed in concept maps. Then, various analysis methods for concept landscapes are presented. These are applied in the case studies presented in the next part.

7.1 Definition

A concept landscape is a general notion for aggregating (or combining) the data of multiple concept maps with the goal of analyzing this combination instead of the single maps. The novel aspect therefore is to no longer treat a concept map as a single entity that is analyzed by itself, but instead to focus on analyzing sets of concept maps as a whole.

While in theory any set of maps can be used to form a concept landscape, in practical settings the maps are typically taken from either of two scenarios: The first is the combination of maps that have been created by several persons in the same context, e.g. all students of a class at the same point in time. The second is the combination of maps that have been created by the same person but at different points in time. Fig. 11 shows these two scenarios as a diagram. It assumes that a group of persons was asked to produce concept maps at several points in time. Then, allowing for some missing maps, the data will consist of a number of concept maps, where each map is associated with a person and a point in time. A vertical combination will be made up of all maps at a given point in time. A horizontal combination will combine the maps of all points in time for a single person. More generally:

Vertical Combines a set of concept maps without a temporal ordering of the maps.

Horizontal Combines a set of concept maps with a temporal ordering of the maps.

Data mining approaches, for example, are useful with vertical aggregations. Horizontal aggregations can be used to visualize the development of knowledge structures. Besides the choice of which maps to combine into a landscape, there are also two different ways in which this combination can be done.

Amalgamation The individual maps are combined into a single graph that is then analyzed. Often, this newly formed graph will be weighted in order to reflect some of the statistical properties of the original set of concept maps.

Accumulation The maps are still treated as individual entities; however, information contained in them is combined in some form for analysis. Often, the maps will be transformed into a numeric vector and a matrix will then be formed from all the vectors.

Fig. 12 displays the difference graphically. An accumulation still treats each concept map as an entity that is recognizable in the data. For example, when forming a data matrix, each row will correspond to a specific map. When amalgamating, the single maps are integrated into a whole and the individual maps can no longer be identified. Usually both methods will involve a loss of information: It is not possible to reconstruct the single maps from the aggregated data. Since the result of an amalgamation is a graph (void of edge labels and typically weighted) which is (almost) a concept map again, this can be used to combine several approaches: For example, the students of a lecture can be asked to create concept maps at several points in time.

(a) Vertical Combination (b) Horizontal Combination

Fig. 11: A visualization of the two types of aggregations. Each ‘x’ marks the concept map of one person at one point in time. Some are missing. The aggregations either combine all maps of one point of measurement or all maps of one person.

Each point in time offers a set of maps that can be amalgamated vertically and then, the resulting graphs can be aggregated in either way horizontally in order to display the development of the group’s knowledge. This is schematically shown in Fig. 13 and applied in practice in the first case study. Other combinations may be used as well, of course. For example, a typical approach would be to first identify clusters of a vertical accumulation and then, in order to describe the clusters more closely, combine the maps that form each cluster again in a new way and search for the “common” elements in each cluster. This is possible because an accumulation still allows the identification of the constituent maps.

For vertical combinations, the focus of interest usually will determine which method is used: An amalgamation should be chosen when the analysis focuses on the common elements of a group of maps. As the single entities are combined into a whole, the common elements typically will show more distinctly. When the focus of the analysis is placed more on the differences between the constituent maps of a concept landscape, an accumulation should be used. The single maps are still recognizable entities in this case and suitable analysis methods can identify how the single maps differ in the information contained in them. The two analysis methods that are described here and used in the case studies use these two approaches:

(a) Accumulation

(b) Amalgamation

Fig. 12: Accumulations still treat each map as an identifiable entity in the resulting data. Amalgamations result in a graph or concept map. In both cases, the newly formed data is the input for the subsequent analysis steps.

Fig. 13: A typical combination of aggregations: First, the maps of several points of measurement are amalgamated vertically, the resulting, new set of maps is then aggregated horizontally, using the times of measurements for the temporal ordering.

Pathfinder networks are used in identifying the “common” structural elements of an amalgamation of concept maps. For accumulations, cluster analysis is used to identify groups in the data.

By taking the type of landscape and the method of combining the maps into account, there are four distinct ways in which a concept landscape can be formed. By the nomenclature of typical research designs, a vertical accumulation corresponds to a cross-sectional, or “one shot cross-sectional”, study (cf. Seidel & Prenzel 2008, p. 282), in which a group of persons is studied at one point in time. A horizontal accumulation is a longitudinal study: “In a longitudinal evaluation, one or more clearly defined outcome measures are assessed repeatedly on a sample of subjects (e.g., students) over a period of time” (Ma 2010, p. 757).

For each of the four ways, a possible technique of working with the landscape in analysis is presented below. The next section will give a formal definition of the four types, built upon the formal definition of concept maps given previously.

7.1.1 Formal Definition

The formal definitions are based on the general notion of concept landscapes but, more specifically, on the analysis methods that are used in this work. Different methods may need a different formalism, but care is taken to keep the approach as general as possible.

For each case, it is assumed that there are $k$ concept maps (as defined in section 6.2.1) $CM_1 = (V_1, E_1, w_1, L_{V_1}, L_{E_1})$, $CM_2 = (V_2, E_2, w_2, L_{V_2}, L_{E_2})$, $\dots$, $CM_k = (V_k, E_k, w_k, L_{V_k}, L_{E_k})$ that are used for the landscape. Since the maps may be scored for analysis purposes, scored maps are assumed. For concept maps that are not scored, some constant weighting function like $w(i) = 0$ can be used.

7.1.1.1 Vertical

Vertical landscapes are simply a set of concept maps, combined into a new structure. A vertical amalgamation can be modeled as a weighted graph $CL = (V, E, w, L_V, L_E)$, where $V = \bigcup_{i=1}^{k} V_i$, $E = \bigcup_{i=1}^{k} E_i$, $L_V(x) = \bigcup_{i=1}^{k} L_{V_i}(x)$ and $L_E(x) = \bigcup_{i=1}^{k} L_{E_i}(x)$. Based on the above definition, a single graph is created from the $k$ concept maps. The definition of the label functions ensures that no labels are “lost”. However, typically edge labels are not useful in the analysis of amalgamations. For concept labels, the definition again ensures no loss of information. It is most useful when amalgamating a set of concept maps that share the same nodes and labels, i.e. $V_1 = V_2 = \dots = V_k$ and $L_{V_1}(x) = L_{V_2}(x) = \dots = L_{V_k}(x)$. In this case, the set of concepts and their respective labels stay the same for the landscape as well.

It is convenient to create a weighted graph and store some information about the constituent maps in the edge weights. Which information is used depends on the desired results and intended method of analysis, of course. For example, the information can consist of counting all the maps that connect a given pair of concepts and then using that number as the weight of the connection of this pair. Alternatively, instead of forming the simple count, some additional filtering or transformation may be applied. For example, if the maps are scored, only maps with a score higher than some threshold could be used. Or only edges that are used in more than a given number of maps might be regarded in the analysis. The simplest form of only counting edges is defined as $w(e_i) = |\{1 \le j \le k \mid e_i \in E_j\}|$, for $w: E \to \mathbb{N}$. A filtering that regards only edges with a score higher than $t$ is defined as $w(e_i) = |\{1 \le j \le k \mid e_i \in E_j \wedge w_j(e_i) > t\}|$. Finally, a filtering that regards only edges that appear in more than $t$ maps is defined as $w(e_i) = \max\{|\{1 \le j \le k \mid e_i \in E_j\}| - t, 0\}$.
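As an illustration, a weighted vertical amalgamation with the simple counting weights and the support-based filtering can be sketched as follows. The representation of a concept map as a set of undirected edges (frozensets of two concept labels) is an assumption made only for this sketch.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Set

Edge = FrozenSet[str]

def amalgamate(maps: Iterable[Set[Edge]]) -> Dict[Edge, int]:
    """Counting weights: w(e) = |{j | e in E_j}|, i.e. in how many maps e appears."""
    weights: Counter = Counter()
    for edges in maps:
        weights.update(edges)  # each map is a set, so it contributes at most 1 per edge
    return dict(weights)

def filter_by_support(weights: Dict[Edge, int], t: int) -> Dict[Edge, int]:
    """Support filtering: w(e) = max(|{j | e in E_j}| - t, 0), dropping zero weights."""
    return {e: w - t for e, w in weights.items() if w > t}

# Three small maps over a shared concept set
m1 = {frozenset({"class", "object"}), frozenset({"object", "attribute"})}
m2 = {frozenset({"class", "object"}), frozenset({"class", "method"})}
m3 = {frozenset({"class", "object"})}
w = amalgamate([m1, m2, m3])
print(w[frozenset({"class", "object"})])  # 3
print(filter_by_support(w, 1))            # keeps only edges appearing in more than one map
```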

For vertical accumulations, each constituent map is still an identifiable part of the “whole”. While there are many different possibilities, in the course of this work it is convenient (and sufficient) to model the landscape as a numerical matrix with $k$ rows, where row $i$ is a vector that is chosen to represent map $CM_i$. The concept and edge vector of a concept map as defined in section 6.2.1 are two possibilities for this representation. In this case, the resulting matrix will be called concept matrix and edge matrix, respectively, for brevity. If the maps of the landscape do not share a common set of nodes, the definitions of edge and concept vector have to be adapted to use the set of nodes or edges of the landscape instead of the set of the concept map. A more general model can be based on a mapping function that assigns each map a vector representation. So, a vertical accumulation is a matrix $CL \in \mathbb{R}^{k \times j}$, such that the $i$-th row is given by the value $f(CM_i, \theta)$ of a function $f: \{CM_1, CM_2, \dots, CM_k\} \times \Theta \to \mathbb{R}^j$ defined appropriately, for some parameter space $\Theta$. The function may use further parameters $\theta$. One example of such a function that goes beyond the concept or edge vector and is used in the case studies is a distance matrix of concept maps. The mapping function $f: \{CM_1, CM_2, \dots, CM_k\} \times \{CM_1, CM_2, \dots, CM_k\} \to \mathbb{R}^k$ is defined as $f(CM_i, \{CM_1, CM_2, \dots, CM_k\}) = v_i \in \mathbb{R}^k$ such that the $j$-th element $v_{ij}$ of the resulting vector is the graph similarity $C(CM_i, CM_j)$ as defined in section 5.2.2. This matrix will be called the graph similarity matrix later on.
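The three accumulation variants used later (concept matrix, edge matrix, graph similarity matrix) can be sketched along the same lines; the edge-set representation is again an assumption, and the similarity argument only stands in for the graph similarity C of section 5.2.2.

```python
import numpy as np
from itertools import combinations

def concept_matrix(maps, concepts):
    """k x |C| binary matrix: entry (i, j) is 1 iff concept j has at least one
    incident edge in map i (row i is the concept vector of map i)."""
    M = np.zeros((len(maps), len(concepts)), dtype=int)
    for i, edges in enumerate(maps):
        used = set().union(*edges) if edges else set()
        M[i] = [1 if c in used else 0 for c in concepts]
    return M

def edge_matrix(maps, concepts):
    """k x |E_C| binary matrix over the edges of the complete graph on C."""
    all_edges = [frozenset(e) for e in combinations(concepts, 2)]
    return np.array([[1 if e in edges else 0 for e in all_edges] for edges in maps])

def graph_similarity_matrix(maps, similarity):
    """k x k matrix with entry (i, j) = similarity(CM_i, CM_j)."""
    k = len(maps)
    return np.array([[similarity(maps[i], maps[j]) for j in range(k)] for i in range(k)])
```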

7.1.1.2 Horizontal

In contrast to vertical landscapes, a horizontal landscape, by definition, encompasses a temporal aspect. The simple notion of discrete time events as presented in section 6.2.1 is enough, however.

Modeling a horizontal amalgamation requires some form of temporal information for graphs. By the definition given above, an amalgamation is a graph where the maps forming the graph are no longer identifiable as single entities. A horizontal landscape, however, has a time component that must be kept. The concept landscape is then defined - in analogy to the concept maps based on dynamic sets - as a dynamic structure for a point in time $i$: $CL(i) = (V(i), E(i), L_{v_i}, L_{e_i})$, where $L_{v_i}: V(i) \to L(i)$ and $L_{e_i}: E(i) \to L(i)$ are defined as labeling functions that are now using dynamic sets instead of static ones.

Accumulating concept maps means that the constituent maps are still identifiable as single entities in the aggregation. Therefore, similar to the modeling of temporal aspects of a concept map, it can be defined as a dynamic set $CL(i)$ of the concept maps $CM_1, CM_2, \dots, CM_k$ that form the accumulation, i.e. $CL(i) = CM_i$ for all $1 \le i \le k$, where each $CM_i$ is one of the $k$ measurements that are used for the horizontal landscape. In other words, $CL$ defines a time series of concept maps.

Additionally, it is convenient to define additional dynamic sets $V(i), E(i), L_v(i), L_e(i)$ that map onto the corresponding part of the $i$-th concept map. For example, if $CL(i) = (V_i, E_i, L_{v_i}, L_{e_i})$, then $V(i) = V_i$, $E(i) = E_i$ and so on. Note that $L_{v_i}$ and $L_{e_i}$ by definition are sets of functions. Then, for example, the super set of all concepts used in the aggregation over all time is described by $\bigcup_{i=1}^{k} V(i)$, or the set of concepts that has been added between the two points in time $i$ and $j$ is given as $V(j) \setminus V(i)$, assuming that $j$ is after $i$ in the temporal ordering.

7.2 Analysis Methods

This section presents several ways of working with or analyzing concept landscapes. The methods are not new, but are adapted to work in this novel context. As has been noted above, the focus of the first method, cluster analysis, is on the inherent differences in the data by way of identifying sub-groups that differ in marked ways from one another. The focus of the Pathfinder analysis presented next, in contrast, lies on the common structural elements of the data. Additionally, the use of graph measures and visualization techniques for concept landscapes is presented. Fig. 14 shows how the different methods relate to concept landscapes and how typical analyses might work.

The chosen methods have been used in the case studies and are also implemented in the software presented in the next chapter. All methods are inherently suited for computer based analysis. As the case studies will show, however, most can be gained by combining this automated analysis with a manual interpretation. The interpretation of the results is often based on quantitative (e.g. using a hypothesis to test the difference in concept map complexity between two groups of persons) as well as qualitative (e.g. visually inspecting and describing the differences in structure between two concept landscapes) analysis methods.

7.2.1 Cluster Analysis

A typical workflow of using concept landscapes in conjunction with cluster analysis is as follows:

1. Accumulate the concept maps (vertically) into a matrix using any desired mapping function.

Fig. 14: Schematic overview over the different analysis methods and their relation to the different types of concept landscapes.

2. Apply a clustering algorithm to the resulting (often binary) matrix.

3. Analyze the quality of the clustering and possibly revert to the second step (using different parameters) until the solution can be considered locally optimal.

4. Analyze the clusters regarding their differences, for example by forming new landscapes of the maps of each cluster separately.

Concerning the first step, when combining the data of a set of concept maps, often the result will be a matrix of binary indicator variables - for example when using the concept matrix or edge matrix as defined above. As the results of the case studies have shown, using the edge matrix seems to give less satisfactory results than the concept matrix. This may be due to the sparseness of the edge matrix, which has $\Theta(n^2)$ columns for $n$ concepts. Clustering then requires either only very few concepts or very many concept maps, since the number of observations (maps) must usually exceed the number of dimensions (columns) to provide good results for clustering. Other methods of aggregation are possible as well; the best method will usually depend on the particulars of the data. Note that not all aggregations are useful for every clustering method. The graph similarity matrix, for instance, is mostly suited for distance based partitioning methods, as it effectively is the distance matrix of the observations. In a different context, Valerio et al. (2008, p. 125) give several features of concepts that could also be used to encode the map, like the number of concepts that can be reached from each concept or the number of incident edges for each concept.

The last step of analyzing the clusters is of major importance, as the clustering algorithms themselves may or may not produce clusters that are actually worthwhile from a researcher’s perspective. Also, it is important to note that in general the clusters can only be interpreted by additional analysis steps. Otherwise it may happen that the algorithm produces a clustering based on the influencing factors of the externalization that the concept landscape tries to minimize: If, for example, the maps of one cluster are very sparse, then maybe this cluster is formed by all the persons that didn’t understand concept mapping or weren’t motivated at the time of creation to produce larger maps. Since a clustering itself is nothing more than several sets of maps again (or in other words, concept landscapes), all techniques can also be applied to the clusters individually and the results compared. The case studies offer some insights into how this may be used to identify differences between clusters.

In the following, two methods of clustering are presented that are applicable for the second (and third) step and were successfully applied in experimental studies. One

uses a partitioning method and distance measure, the other uses a latent class approach. While not every clustering method is appropriate - like using Gaussian mixture models as shown below - there are clustering methods, like hierarchical clustering, that can be applied in theory but are not used here. The reason for this is that the partitioning methods and the chosen latent-class approach are by definition suitable for the task and achieve the desired goal whereas, for example, a hierarchical clustering would yield information that cannot directly be interpreted in the context of concept landscapes. It might still prove to be a useful clustering method, though. However, the case studies have shown that the chosen methods give the most promising results. But identifying an optimal algorithm for clustering concept maps, or comparing clustering algorithms in order to gain insights into their particular strengths and weaknesses on a larger scale, is beyond the scope of this thesis. The algorithm that is used for clustering can (and should) be adapted or changed if different methods provide better results for a particular data set.

7.2.1.1 Similarity Based Clustering

Following the straightforward definition of clustering, a cluster should be comprised of elements that are “similar” to each other, yet “dissimilar” from the elements of the other clusters. Similarity-based clustering works exactly based on this notion. For concept landscapes, as noted above, the input data is often a binary matrix. So, before a similarity based clustering algorithm can be applied, a distance measure must be chosen. This measure will be applied to pairs of observations, i.e. pairs of binary vectors, and must yield a positive real value that is small for “similar” observations. These values form the distance matrix as defined in section 5.3.1. As noted in section 5.3.1, common choices are the Manhattan or Euclidean distance. Conceptually, for binary data resulting from concept maps, the Euclidean distance doesn’t have a straightforward interpretation though. The Manhattan distance ($L_1$ norm), usually defined as $\sum_{i=1}^{n} |x_i|$ for an $n$-dimensional vector $x$, applied to the difference of two binary vectors effectively only encodes the number of positions in which the two vectors differ. This makes sense for, e.g., the concept vectors of two concept maps: Clearly two maps that share concepts are more similar than two that don’t. On the other hand, at least when using the concept matrix, much structural information of the maps is lost and thus can’t be incorporated in calculating the similarity. Using the graph similarity matrix doesn’t suffer from this problem and shows interesting results, as shown in the second and third case studies in chapters 11 and 12 respectively.

Concerning the clustering algorithm itself, any algorithm is appropriate in principle, as long as it doesn’t pose restrictions (or assumptions) on the input data that cannot be met. As noted in section 5.3.1, k-means clustering implicitly assumes that the Euclidean norm is valid for the given data, for example, which is usually not the case for concept landscapes. Apart from such restrictions, the algorithm itself is less important than the actual distance measure though, since only this measure determines how “different” or “similar” single observations are. Different similarity based clustering algorithms should therefore arrive at similar results when using the same measure of distance. In the experiments in this section and in the case studies only the k-medoids algorithm as presented in section 5.3.1 will be used, in the form of its most prominent implementation PAM. As has also been noted in section 5.3.1, to ensure that a chosen number of clusters is optimal, both the non-uniformity of the data and the quality of a found clustering must be taken into account; the Hopkins index and the G1 index as defined there are used throughout this thesis for these tasks.
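A stripped-down sketch of this similarity-based step: Manhattan distances between the rows of a binary concept matrix are computed and a simple medoid-swap partitioning is run on the resulting distance matrix. This is a stand-in for illustration only, not the PAM implementation used in the case studies, and the data here is random.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def k_medoids(D, k, n_iter=100, seed=0):
    """Minimal medoid-swap clustering on a precomputed distance matrix D."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)   # assign each observation to its nearest medoid
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) > 0:                    # new medoid: member with minimal total distance
                new_medoids[c] = members[np.argmin(D[np.ix_(members, members)].sum(axis=1))]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return np.argmin(D[:, medoids], axis=1), medoids

# Rows of X play the role of concept vectors; the Manhattan distance counts differing positions
X = np.random.default_rng(1).integers(0, 2, size=(50, 10))
D = squareform(pdist(X, metric="cityblock"))
labels, medoids = k_medoids(D, k=2)
```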

7.2.1.2 Latent Class Clustering

This section presents an alternative approach to using distance measures and similarity for clustering, based on a probabilistic model of the data. Such a model must be able to generate the observations used for clustering. As has been noted in section 5.3.2, one advantage of this approach is the possibility to quantify the probability of different model parameters. This can be used for model selection which, in particular, encompasses the subtask of identifying the best number of clusters.

What probabilistic model can be used in order to describe an accumulated set of concept maps? At least for the case of using the concept or edge matrix, the data is a binary matrix with rows corresponding to concept maps and columns encoding the presence or absence of a particular feature of the map. So, there are a number of observations of a number of binary variables and often there also will be correlations among the variables. For example for a concept vector, a 1 in one column will always indicate the presence of another 1, as each edge has two incident concepts. One possible model is built upon the Bernoulli distribution. The basic definitions and the derivation of the clustering in the next paragraphs can be found for example in (Wolfe 1970) and (Stibor 2008).

A Bernoulli distribution with parameter $p$ models the probability of a random experiment with two possible outcomes. One outcome has probability $p$, the other has $1 - p$. The probability function is $P(x) = p^x (1 - p)^{1 - x}$, where $x \in \{0, 1\}$. A vector of $k$ binary random variables can be modeled with a multivariate Bernoulli distribution of length $k$. In this case, the parameter $\Theta$ consists of $k$ values $(p_1, p_2, \dots, p_k)$. The probability of an observation $x = (x_1, x_2, \dots, x_k) \in \{0, 1\}^k$ is given by the function $P(x) = \prod_{i=1}^{k} p_i^{x_i} (1 - p_i)^{1 - x_i}$. To denote the dependence of $P$ on its parameter, it is customary to use the notation $P(x|\Theta)$.
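For illustration, the probability function of a single multivariate Bernoulli distribution can be written down directly from this definition (computed in log-space here, which is also what the mixture model below requires):

```python
import numpy as np

def log_bernoulli(x, theta, eps=1e-9):
    """log P(x | theta) = sum_i (x_i * log(p_i) + (1 - x_i) * log(1 - p_i))
    for a binary vector x and parameter vector theta = (p_1, ..., p_k)."""
    theta = np.clip(theta, eps, 1 - eps)  # avoid log(0)
    return np.sum(x * np.log(theta) + (1 - x) * np.log(1 - theta))

x = np.array([1, 0, 1])
theta = np.array([0.9, 0.2, 0.7])
print(np.exp(log_bernoulli(x, theta)))  # 0.9 * 0.8 * 0.7 = 0.504
```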

A multivariate Bernoulli distribution does not allow correlations among the dimensions of the observations. Each dimension is independently assumed to take on a value according to this dimension’s probability. Following e.g. Stibor (2008), higher order correlations can be included by extending the model to a mixture of multivariate Bernoulli distributions. So, instead of having only one multivariate Bernoulli distribution, a set of such distributions is used to model observations. For clustering, each of the mixture’s components is taken to represent one cluster. Every observation in turn has a probability of “belonging” to each cluster. A multivariate Bernoulli mixture model (MBMM) can be used to model concept map data. By finding the most probable values of its parameters, a clustering can be determined. See the example below for a demonstration of the capabilities regarding the identification of correlations.

The parameters of a MBMM consist of $m$ vectors $\Theta = (\Theta_1, \Theta_2, \dots, \Theta_m)$, with each vector describing a single multivariate distribution, and $m$ mixing coefficients $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_m)$. It holds that $\sum_{i=1}^{m} \alpha_i = 1$. The $\alpha_i$ encode the contribution of each distribution to the clustering. Each $\Theta_i$ consists of $k$ values, so a MBMM has $m \cdot k + m - 1$ parameters, as there are $m - 1$ mixing coefficients that are free to choose; the last one is completely determined by the others. These must be chosen (optimally) to identify a clustering. The probability function for an observation $x = (x_1, x_2, \dots, x_k) \in \{0, 1\}^k$ is $P(x) = P(x|\Theta, \alpha) = \sum_{i=1}^{m} \alpha_i P(x|\Theta_i)$. Finding the optimal parameter values for given input data corresponds to maximizing the likelihood of the observed data. As is customary, instead of maximizing the likelihood, the log-likelihood is maximized, as it is numerically more convenient to sum many log-probabilities than to multiply many small probabilities. The log-likelihood of $n$ observations $X = x_1, x_2, \dots, x_n$ is

$$L(X|\Theta, \alpha) = \log\left(\prod_{i=1}^{n} P(x_i|\Theta, \alpha)\right) = \sum_{i=1}^{n} \log(P(x_i|\Theta, \alpha))$$

As there is no analytic solution to this maximization problem, the EM algorithm presented in section 5.3.2 is used to find locally optimal values. For a model built on multivariate Bernoulli mixtures, the E- and M-steps are:

E-Step Calculate for each mixture component $m$ and each observation $x_i$ of the data the posterior probability using Bayes’ theorem:

$$P(m|x_i, \Theta, \alpha) = \frac{P(x_i|m, \Theta, \alpha) \, P(m)}{P(x_i)}$$

M-Step Calculate for each mixture component $m$ new parameter values $\Theta'_m$, $\alpha'_m$ using the current values $\Theta_m$ and $\alpha_m$, by:

$$\alpha'_m = \frac{1}{n} \sum_{i=1}^{n} P(m|x_i, \Theta, \alpha)$$

and

$$\Theta'_m = \frac{1}{n \alpha'_m} \sum_{i=1}^{n} P(m|x_i, \Theta, \alpha) \, x_i$$

The computation can be effectively implemented using matrix operations. The output of the EM-algorithm consists of the locally optimal (i.e. most probable) values of the probabilities Θ and mixing coefficients α. These values can be used to calculate the posterior probability (as given above) for each observation and each cluster. This, in other words, is the probability for each observation to belong to any of the clusters. Usually, the maximal value of these probabilities is then taken as the cluster this observation is assigned to. However, the probability values can also be used to model uncertainty in the cluster assignment or to filter out observations below a given threshold of probability for any of the clusters.
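A compact sketch of this EM procedure using matrix operations in numpy. It follows the E- and M-steps given above, but it is an illustrative simplification, not the implementation realized in the software presented in the next chapter.

```python
import numpy as np

def mbmm_em(X, m, n_iter=200, tol=1e-4, seed=0):
    """EM for a multivariate Bernoulli mixture with m components.
    X is an (n, k) binary matrix; returns mixing coefficients alpha,
    parameters theta and the posterior probabilities P(component | observation)."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    alpha = np.full(m, 1.0 / m)                   # mixing coefficients
    theta = rng.uniform(0.25, 0.75, size=(m, k))  # Bernoulli parameters per component
    for _ in range(n_iter):
        # E-step: posterior P(m | x_i) via Bayes' theorem, computed in log-space
        log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(alpha)
        log_p -= log_p.max(axis=1, keepdims=True)
        post = np.exp(log_p)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: alpha'_m = mean posterior, theta'_m = posterior-weighted mean of the observations
        alpha_new = post.mean(axis=0)
        theta_new = (post.T @ X) / (n * alpha_new[:, None])
        theta_new = np.clip(theta_new, 1e-6, 1 - 1e-6)
        converged = (np.max(np.abs(theta_new - theta)) < tol
                     and np.max(np.abs(alpha_new - alpha)) < tol)
        alpha, theta = alpha_new, theta_new
        if converged:
            break
    return alpha, theta, post
```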

The clustering process performs the EM-algorithm for a range of different numbers of components and chooses the solution with the lowest AIC value (see section 5.3.2). This will then select not only the most probable parameters of the MBMM model, given the observations, but also the most probable number of components (clusters). The MBMM clustering approach is employed in the second and third case study presented in the next part.
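The model selection step can then be sketched as fitting the mixture for each candidate number of components and computing AIC = 2·(m·k + m − 1) − 2·L with the parameter count derived above; the mbmm_em function from the previous sketch is assumed.

```python
import numpy as np

def mbmm_aic(X, alpha, theta):
    """AIC = 2 * (m*k + m - 1) - 2 * L(X | Theta, alpha)."""
    log_p = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(alpha)
    # sum_i log sum_m alpha_m P(x_i | Theta_m); a log-sum-exp would be more robust numerically
    log_lik = np.sum(np.log(np.exp(log_p).sum(axis=1)))
    m, k = theta.shape
    return 2 * (m * k + m - 1) - 2 * log_lik

def select_components(X, candidates=range(1, 6)):
    """Fit a MBMM for each candidate m and return the solution with the lowest AIC."""
    fits = {m: mbmm_em(X, m) for m in candidates}
    best = min(candidates, key=lambda m: mbmm_aic(X, fits[m][0], fits[m][1]))
    return best, fits[best]
```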

7.2.1.3 Example

To investigate how the two clustering approaches work, two examples using randomly generated, artificial data are presented here.

For the first example, the data consists of a matrix of binary values with 500 rows and 10 columns, representing, e.g. the concept matrix of a vertical accumulation of 500 concept maps that were restricted to a common set of 10 concepts. Typically, some of these concepts will appear in many maps, as they are, for example, “basic”

or often used concepts of the subject matter. Conversely, there will be concepts that appear only in few maps, for example because those concepts are more “advanced” or seldom used. Similarly, one expects to find some persons that only use few concepts in their maps and persons that use more or nearly all concepts. Finding clusters that show these differences between the concepts or between the persons is one of the goals that cluster analysis should achieve.

The random data is generated to reflect both the structural differences between concepts, as well as the differences between persons. Each (artificial) observation is drawn from one of two multivariate Bernoulli distributions with 10 dimensions. The order is random, but there are 250 observations drawn from each distribution. Both distributions have increasing probabilities for the dimensions, the first ranges from $p_1 = 0.025$ to $p_{10} = 0.25$ and the second from $p_1 = 0.525$ to $p_{10} = 0.75$. So, the data is generated such that it naturally consists of two clusters, one with a rather low probability for each dimension and one with a higher probability, but also reflecting the same increasing probabilities between the concepts.
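The artificial data of this first example can be generated along the following lines; the seed and the shuffling are arbitrary choices for this sketch and the exact procedure of the original experiments may differ in such details.

```python
import numpy as np

rng = np.random.default_rng(42)

# Two multivariate Bernoulli distributions with 10 increasing probabilities each
p_low = np.linspace(0.025, 0.25, 10)   # "sparse" cluster
p_high = np.linspace(0.525, 0.75, 10)  # "dense" cluster

# 250 observations per distribution, shuffled into a random order
X = np.vstack([
    (rng.random((250, 10)) < p_low).astype(int),
    (rng.random((250, 10)) < p_high).astype(int),
])
rng.shuffle(X)  # shuffles the rows in place
print(X.shape)  # (500, 10)
```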

A clustering algorithm, when applied to this matrix, should identify the two clusters and it should also be possible to show that two is the optimal number of clusters. Both the PAM algorithm using Manhattan distance and the MBMM clustering were applied. The experiment was repeated several hundred times with new randomly generated input to ensure that the results were not by chance. The Hopkins index applied to the data always yielded values at or above 0.9, clearly indicating the non- uniformity of the data set. The results of the clustering algorithms stayed predictably constant, in the sense that the optimal number of clusters was always two and the clustering itself followed more or less exactly the generating model. Fig. 15 shows the results of both algorithms for one of the runs. For this artificial data set, both PAM and MBMM clustering were able to accurately capture the underlying structure.

Fig. 15(a) and Fig. 15(b) show the result of the clustering itself by using a different character and color for each observation, depending on its assigned cluster. An observation is plotted as a point, using the number of dimensions that have a 1 as the plotted value. For the MBMM approach, the mixture component with the highest probability of generating the observation was used as the cluster. The two clusters are visually identifiable as the dense regions at the bottom (low probability for each dimension) and the top (high probability for each dimension). Both PAM and the MBMM clustering identify the clusters “correctly” in the sense that the clusters are comprised exactly of the 250 observations generated for each distribution. The extreme cases of vectors with all zeros or all ones are identified as the medoids by the PAM algorithm. Fig. 15(c) and Fig. 15(d) show that two is the optimal number of clusters, when disregarding more than five clusters.

(a) MBMM clustering   (b) PAM clustering   (c) AIC values   (d) G1 values

Fig. 15: MBMM and k-medoid clustering for the dataset of the first experiment in comparison

The EM-algorithm was set to stop as soon as the absolute change in value for any of the parameters in one iteration was less than 0.0001. For the plotted run, eight iterations were made. The likelihood of each point to belong to the chosen cluster is less than 0.9 for only eight observations. Five of those are from the fringe. The mixing coefficients in the end were both 0.5. The probabilities of the distributions are estimated to range from 0.02 to 0.24 for the first cluster and from 0.58 to 1.00 for the second. So, the values don’t represent the original probabilities perfectly, but the basic trend is clearly captured by the algorithm.

For the second example, the clustering should find more indirect dependencies in the data. As has been noted above, for concept vectors the single dimensions are not strictly independent, as each edge has two incident concepts. In a real world task, it seems reasonable to assume that this dependency will not spread uniformly over all dimensions for each map. Instead, several combinations will be more probable than others, as typically a concept will often only be connected with very few other concepts in a set of maps. An exploratory analysis that can retrieve such “patterns” therefore provides interesting insights.

So, the data for the second example consists of 100 observations (rows) of five dimensions (columns) of binary values. Again, this can be taken to represent the concept matrix of a vertically accumulated concept landscape. This time, instead of randomly generating each column independently, the columns are made statistically dependent on each other. Specifically, the first three columns are created as uniformly distributed binary values. The fourth column is identical to the second column, indicating a statistical dependence (perfect correlation). The fifth column is the negation of the third column, i.e. derived by subtracting the third column from 1 for each row, again indicating a statistical dependence (perfect negative correlation). To make the data more realistic, noise was added to make the correlations less perfect by randomly choosing two observations and negating the fourth and fifth column, corresponding to a “noise level” of 2%.
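The second data set can be sketched in the same way: the dependencies between the columns are built in explicitly and the small amount of noise is added afterwards (again, seeding and similar details are arbitrary choices of this sketch).

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100

# Three independent, uniformly distributed binary columns
X = rng.integers(0, 2, size=(n, 3))
col4 = X[:, 1].copy()  # identical to the 2nd column (perfect correlation)
col5 = 1 - X[:, 2]     # negation of the 3rd column (perfect negative correlation)
X = np.column_stack([X, col4, col5])

# 2% noise: negate the 4th and 5th column for two randomly chosen observations
for i in rng.choice(n, size=2, replace=False):
    X[i, 3] = 1 - X[i, 3]
    X[i, 4] = 1 - X[i, 4]
```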

Optimally, the clustering algorithms should identify these dependencies in the data. To correctly capture them, four clusters are needed, based on the combination of values of the 2nd and 4th and the 3rd and 5th column. Of the 16 possible combinations that the four binary variables could have, only four actually appear, as the values of the 2nd and 3rd column determine the other two values. Again, PAM using Manhattan distance as well as the MBMM approach were used and the experiments were repeated in order to ensure that the results were not by chance.

In this example, only the MBMM approach is able to capture the underlying information. Fig. 16 shows the results.

1/1 4 4 4 4 44444 44 44 444444 4 44 1/1 11 111111 11111111 1 1 111 11 1 1 1

1/0 333 333333 333 333 333333 33 3 33 1/0 1 133 13 3 3 3 3 331 1 331 113 31

0/1 111311 11111 1 11 111111111 1 1 1111 0/1 333 32 322 2 3 3 333 323 3 1 2 Values of 2nd and 3rd column Values of 2nd and 3rd column 0/0 22 22 222 222 2222 22 2 22223 2 0/0 22322 2232333 3222 32233333 2222333 10 20 30 40 50 60 70 80 90 10 20 30 40 50 60 70 80 90 Observation 100 Observation 100

(a) MBMM clustering (b) PAM clustering 1400 700 650 1000 G1 AIC 600 600 550 200 500 2 4 6 8 10 2 4 6 8 10 Number of clusters Number of clusters

(c) AIC values (d) G1 values

Fig. 16: MBMM and k-medoid clustering for the dataset of the second experi- ment in comparison 130 7.2. ANALYSIS METHODS

determined using the AIC values and MBMM, it is not as clear-cut when using PAM and the value of G1. As shown in Fig. 16(a) and Fig. 16(b), the MBMM clustering is able to capture these dependencies perfectly, whereas PAM clustering is oblivious to the structural dependencies. When looking at the learned parameters for the four mixture components, the following values have been identified after only 10 iterations of the EM algorithm:

              Column 1   Column 2   Column 3   Column 4   Column 5
Component 1     0.46       0.00       1.00       0.00       0.00
Component 2     0.27       0.00       0.00       0.00       1.00
Component 3     0.68       0.93       0.04       1.00       0.96
Component 4     0.36       1.00       1.00       1.00       0.00

It can clearly be seen how the noise affects the parameters of the third component and how the dependencies have been captured: aside from the noise, the second and fourth columns show the same values, and the fifth column shows the negated values of the third column.

To underline the importance of a valid model for the input data: when a clustering based on Gaussian mixture models is used instead, i.e. assuming that the observations are generated from a number of Gaussian rather than Bernoulli distributions, the clustering with the most probable fit according to BIC consists of 169 clusters for the first and 95 clusters for the second example. The test was done using the package MClust1 for R.
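For reference, such a Gaussian mixture clustering with BIC-based model selection can be obtained roughly as sketched below (again assuming the matrix x from above); note that by default only 1 to 9 components are considered, so reproducing the reported numbers would require widening that range.

```r
# Minimal sketch: Gaussian mixture clustering of the binary matrix x from
# above, with the number of components selected by BIC.
library(mclust)
gmm <- Mclust(x)          # default: 1 to 9 mixture components are considered
summary(gmm)
```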

These two examples show that MBMM clustering is superior for concept matrices, since it is able to capture different aspects of hidden structural information. However, it remains the task of the researcher to identify what the hidden structural information in the clusters actually is. Therefore, it may well be worth the effort to use different clustering algorithms on the same data and check for differences and similarities in the identified clusters. If, for example, PAM and MBMM show clearly diverging results, it may be worthwhile to check for hidden dependencies between the variables of the observations. If both find more or less the same clustering, this may indicate that the structural information in the data is simpler. One advantage of PAM is that it can also be used with other dissimilarity measures, such as the graph similarity matrix.

1http://www.stat.washington.edu/mclust

7.2.2 Pathfinder

As described in section 5.2, the original intended use of a Pathfinder network is to identify the salient information prevalent in multi-dimensional similarity ratings. The lengths of paths in the network contain information about how “close” or similar the connected concepts are in the data. From a graph theoretical point of view, constructing a Pathfinder network is simply an algorithmic method of (edge-)pruning a graph by keeping all nodes and systematically removing edges. Taken together, this is also the reason why the Pathfinder algorithm was chosen as a suitable analysis method for concept landscapes: On the one hand, in contrast to other scaling techniques like MDS, it can work directly on a graph as input and it also produces a graph, making it suitable for the particular format of concept landscapes. On the other hand, its original intention is directly related to analyzing the structure of conceptual knowledge, in contrast to other graph pruning techniques like minimal spanning trees or simple edge removal based on some threshold, as for example in (Larraza-Mendiluze & Garay-Vitoria 2013). This makes it suitable for the particular information contained within concept landscapes. Even though, strictly speaking, a concept landscape is not a “network” in the psychological sense underlying the term “Pathfinder network”, the terms “Pathfinder network”, “Pathfinder analysis” or simply “Pathfinder” are used synonymously throughout this work.

In contrast to cluster analysis, Pathfinder analysis works by explicitly discarding information of the original data that is not considered salient enough. When applying the Pathfinder algorithm to a concept landscape, the goal is to gain insight into the “common” structural elements of the concept maps forming the landscape. In other words, if two concepts are connected in the resulting graph, it should be possible to assume that these two concepts are “typically” seen as connected, given the original concept maps. Therefore, it is best not to think about “similarity” in the context of this thesis, but instead about “connectedness”. Given a graph (concept landscape) that contains measures of the frequency of connectedness of pairs of concepts, the Pathfinder algorithm identifies the salient structural information regarding the most frequent connections between any pair of concepts. This information about the common connections is of course already present in the landscape. Further analysis is only needed in order to make it visible, as the combination of a set of concept maps will typically contain numerous structural elements, some appearing in almost every constituent map and some appearing in only very few, or even just a single map. An analysis of concept landscapes with Pathfinder networks typically follows these steps:

1. Amalgamate the maps vertically, using an appropriate method for defining the edge weights.

2. Create the Pathfinder network from the graph generated in the first step.

3. Analyze the generated Pathfinder network manually or automatically, e.g. by using graph measures.

An appropriate definition of the edge weights must somehow reflect their “commonness”. The simple summing of edges as defined in section 7.1.1 above cannot be used directly, as the Pathfinder algorithm expects a smaller edge weight to indicate a more favorable edge. Therefore, a transformation of the weights is necessary. It is not directly obvious, though, that the Pathfinder algorithm will work as expected, since such a transformation is closely related to the problem of finding simple paths of maximal weight in graphs, which has been shown to be NP-hard (cf. Lawler 2001, p. 9). So, unless P = NP, there is no polynomial transformation of a graph’s edge weights such that searching for shortest paths with the transformed values would be equivalent to searching for longest paths with the original values. The Pathfinder algorithm, however, while closely related to the problem of finding shortest paths, works in a different way, and a transformation is possible, as the next paragraphs show.

A way that keeps the linear distances between the edge weights as well as their range the same is to use the number of concept maps plus one and subtract from this the number of concept maps in which the edge is present. Formally, for a combination of k concept maps the weight function of the amalgamation w : E → N is defined as w(ei) = k + 1 − |{1 ≤ j ≤ k | ei ∈ Ej}|. An edge that appears in every map is assigned a weight of 1 and an edge that appears in only one map is assigned a weight of k.
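As an illustration, the following sketch applies this transformation in R to a small hypothetical edge-count matrix of a vertical amalgamation; following the convention used in the example below, edges that occur in no map are assigned a weight of ∞.

```r
# Minimal sketch (hypothetical counts): transforming the edge frequencies of
# a vertical amalgamation of k = 10 concept maps into Pathfinder edge weights.
k <- 10
concepts <- c("class", "object", "method")
counts <- matrix(c(0, 9, 7,
                   9, 0, 2,
                   7, 2, 0),
                 nrow = 3, dimnames = list(concepts, concepts))

weights <- ifelse(counts > 0, k + 1 - counts, Inf)  # absent edges get weight Inf
weights   # the diagonal stays Inf; self-loops do not occur in concept maps
```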

As noted in section 5.2.1, the Pathfinder algorithm will make the graph q-triangular: An edge is removed if and only if there is a path of length at most q with a smaller distance (using the r-metric) than the weight of this edge. To see that the transformation does what it is intended to do, at least for the two cases r = 1 and r = ∞ that will be used exclusively, assume that an edge ei is removed by the algorithm. This means that there must exist a path of length at most q, consisting of edges e1, e2, . . . , eq, such that (for r = 1) the inequality w(e1) + w(e2) + ... + w(eq) < w(ei) holds. Without loss of generality, the length of the path can be assumed to be exactly q. As ei is removed, it must violate the q-triangularity. Let ci be the number of maps in which edge ei is present. Then w(ei) = k + 1 − ci for the constant k. It follows that:

(k + 1 − c1) + (k + 1 − c2) + ... + (k + 1 − cq) < k + 1 − ci

qk + q − (c1 + c2 + ... + cq) < k + 1 − ci

(q − 1)k + q − 1 − (c1 + c2 + ... + cq) < −ci

−(c1 + c2 + ... + cq) < −ci        (since (q − 1)k + q − 1 = (q − 1)(k + 1) ≥ 0)

c1 + c2 + ... + cq > ci

In other words, there exists a path with a combined sum of “edge occurrences” that is greater than the number of “edge occurrences” of the edge that is removed. By the same reasoning, r = ∞ works as well, by simply taking the maximum weight instead of the sum in the inequality.

There are other ways of defining the edges’ weights of course. For example, if scores are present, it seems reasonable that the identification of “common” elements of knowledge (or misconceptions for that matter) should be based only on edges that have a certain score.

As has been noted before, a Pathfinder network will always contain the same components and the same nodes as the original graph. Therefore, if for example, a concept is used in only one map of the aggregation, this concept will remain in the Pathfinder network using the one edge that is present in the data, because pruning this edge would create a new component. Clearly, however, this can hardly be taken as an indicator that this edge is representative of the “common” knowledge structure, as it is only there as an artifact of an idiosyncratic knowledge construct that is not removed by the analysis method.

So, it is important to filter out these idiosyncrasies by manual removal. There are different ways of doing so, with different results. In general, either nodes or edges can be removed, and this can happen either before or after applying the Pathfinder algorithm. Removing concepts amounts to summing the rows or columns of the weight matrix of the amalgamation and removing those whose value is higher (using the transformed weights) than some chosen threshold. This first establishes a set of “common knowledge” concepts and then searches for the most frequent structural connections between them. Removing edges amounts to setting all entries in the weight matrix to ∞ that are higher than some chosen threshold - this may leave concepts unconnected. In contrast to the first approach, this first establishes a common set of propositions and then finds the most common arrangement of concepts. The second approach is stricter, in the sense that, given the same threshold, it will always leave unconnected a superset of the concepts that would be filtered out by the first approach. In the case studies, both approaches were used; there was, however, not much difference, if any, between the methods. There is no single best way to arrive at a threshold for filtering, but percentiles or a gap statistic provide reasonable approaches. If there is a value where decreasing the threshold a bit further doesn’t lead to more removals, this can be seen as the “gap”. Generally, it seems advisable to try several threshold values and inspect the effects on the network.

The creation of a Pathfinder network itself depends on the choice of the parameters q and r. In general, higher values for either parameter will produce sparser networks. As mentioned in section 5.2.2, it is advisable to try several values and judge the quality of the results. An argument against higher q values has also been presented there; it is, however, only valid in the context of assessing similarity data. From the perspective of graph pruning, the typical choice for q is |V| − 1. Smaller values would only be useful if there were a theory about why the structural information of paths exceeding a certain number of intermediate steps is less important than the information of shorter paths. For r, the extreme values of 1 and ∞ are most appropriate. The Euclidean distance (r = 2) might be interesting if it can be assumed that the spatial placement of concepts in a map is indeed an important aspect of the organization of knowledge. If an appropriate aggregation is used that somehow respects this spatial information, the Pathfinder network with r = 2 may produce relevant insights. Other values between these extreme cases can be seen as gradually approaching the respective boundary, but there is generally no theory about the effect of using them.

The pruning of the Pathfinder algorithm differs fundamentally from a manual removal of edges with a low (or high) weight. It is therefore paramount for the analysis of a Pathfinder network to keep in mind that it is in general not possible to compare, for example, two different pairs of concepts based on the fact that the edge between one of the pairs was removed by the Pathfinder algorithm but the edge between the other was not. If a concept has only one incident edge, for example, this edge will never be pruned. If the weight of this edge is high, it may happen that every edge that is pruned by the Pathfinder algorithm actually has a lower weight than this one. So, the manual filtering must make sure that every edge is “common” enough (in the given context) before the Pathfinder algorithm is applied. Also, it may be advisable to keep the actual edge weights in mind during analysis and to focus only on the “strongest” edges for interpretation.

7.2.2.1 Example

As before with the clustering algorithms, an example using artificially created data is used to illustrate how the Pathfinder analysis of concept landscapes works. Real world applications can then be found in the case studies presented in the next part.

The example is based on an assumed concept mapping survey using the concepts class, object, attribute, method, visibility and algorithm. The 10 concept maps based on this list are shown in part A of the appendix. Amalgamating them vertically results in the graph shown in Fig. 17. The edge weights are formed by using 11 minus the number of concept maps that contain the given edge. Filtering is unnecessary in this case, as the graph is densely connected: Removing, for example, all edges with a weight of 10 (lowest quartile of the original edge weights), or even all edges with weights down to 7 (lower half of the original edge weights), would neither leave a concept unconnected nor increase the number of components.

Fig. 17: The amalgamated concept landscape of 10 concept maps. The edge weights are transformed by using 11 minus the number of maps in which the given edge is present. The dashed edges are non-existent in the original concept maps and thus have an assumed weight of ∞.

Using the Pathfinder algorithm on this graph with the two most prominent values for q, namely q = 2 and q = n − 1 = 5, and the values 1 and ∞ for r gives four different resulting networks, shown in Fig. 18(a) to Fig. 18(d) respectively. As expected, the number of remaining edges decreases when increasing q, increasing r, or both. When taking a closer look at how the edges are pruned, several things are worth noticing:

• Each Pathfinder network contains the three “strongest” links (i.e. smallest weights) between class, object, method and algorithm. These links clearly can be seen as important when manually inspecting the input graph, too.

• Not every Pathfinder network has pruned all of the three weakest links (algorithm - class, algorithm - attribute and attribute - object), but the sparser networks no longer contain them, and two of the three have been removed in all networks. In general, higher values of q will foster the removal of weak links, since more possible alternative paths are inspected by the algorithm.

• The sparsest graph is produced by q = 5, r = ∞ and is also a minimal spanning tree of the graph.

• The difference between setting r = 1 and r = ∞ is obvious for example for the link between method and visibility: As each alternative path, when summed, weighs more than the direct edge, it is never pruned for r = 1. However, with a weight of 9 it doesn’t seem to represent a “common” understanding shown in the set of maps.

Concerning the interpretation of the graphs, it is important to note that even though the Pathfinder algorithm is actually inspecting paths, the interpretation of the pruned networks is centered around links. Take, for example, the connection between algorithm and attribute that has been pruned in each of the Pathfinder networks of this example: It is questionable to say that this means the concept algorithm is seen as related to the concept attribute only through the concept method, as the concept maps themselves do not contain information that extends beyond a single edge. However, it is valid to say that, in general, the concepts algorithm and attribute are not seen as related, while the concepts algorithm and method as well as method and attribute are.

Following the observations above, it seems advisable to set q to its highest value when using concept map data and also to use r = ∞. This will produce the sparsest graphs, showing only the strongest connections that are needed to keep the graph connected. Using the maximal value of q ensures that every alternative path is inspected for pruning and no artificial cut-off length is used. With the value of q fixed, the value of r can then be seen as simply pruning gradually more and more edges. Depending on the analysis that is intended afterwards, either keeping more edges by setting r = 1, or pruning as many edges as possible by setting r = ∞, is possible.

(a) q = 2, r = 1 (b) q = 2, r = ∞

(c) q = 5, r = 1 (d) q = 5, r = ∞

Fig. 18: Pathfinder networks created from the input graph in Fig. 17 using four different sets of parameters.

7.2.3 Graph Measures

For amalgamations, the resulting data structure is, by definition, a graph. It is therefore possible to use graph measures for analysis that are regularly used in the analysis of single concept maps (cf. Mandl & Fischer 2000, p. 5), for example by counting the propositions. However, as the measures are simply a different (usually numeric) representation of attributes of a graph, they should only be employed when there is some theoretical reasoning about why a certain attribute is informative for the task at hand. Measuring attributes of graphs is especially useful when comparing aggregations, for example in order to identify differences between the concept maps of clusters that were identified in the data. They can also be used to gain further insight into Pathfinder networks.

The following sections present - in no particular order and without claiming completeness - some useful measures. Leake et al. (2005), for example, investigate several different graph measures specifically for identifying important concepts of a concept map. These are investigated in more detail by Valerio et al. (2008). Most of the measures presented here are applied in the case studies in the next part.

7.2.3.1 Simple Graph Measures

Clearly, the simple number of concepts and edges can be used as indicators for the analysis. As shown in section 4.2.2, scoring schemes based on the number of propositions are commonly employed for concept maps. More (correct) edges are, by the basic nature of concept maps, an indicator of a more densely connected knowledge structure. More concepts can be seen as an indicator of the breadth of the knowledge structure. Instead of using the number of edges directly, the ratio of edges to concepts may also be used. In other words, instead of measuring the number of concepts and the number of edges, it may be more appropriate to measure the number of concepts (breadth of knowledge) and the density of the propositions (connectedness of the knowledge). Also, the number of components of the graph can be measured; this is, for example, done by Ifenthaler (2006) under the term “ruggedness”. While a less “rugged” concept map is typically seen as better, the knowledge-as-elements view of conceptual change suggests that a more rugged map is not necessarily worse than a completely connected one.
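For illustration, the sketch below computes these simple measures for a small hypothetical concept map with the igraph package (on which CoMaTo's implementations are also based); note that igraph's edge density relates the number of edges to the possible maximum rather than to the number of concepts.

```r
# Minimal sketch (hypothetical map): simple graph measures with igraph.
library(igraph)
g <- graph_from_edgelist(rbind(c("class",  "object"),
                               c("class",  "method"),
                               c("object", "attribute"),
                               c("method", "algorithm")),
                         directed = FALSE)
vcount(g)              # number of concepts (breadth of knowledge)
ecount(g)              # number of propositions
edge_density(g)        # connectedness, relative to the maximal number of edges
count_components(g)    # "ruggedness" in the sense of Ifenthaler (2006)
```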

For horizontal landscapes, the development of the measures over time is often interesting for the analysis. If concept map data has been combined first vertically and then horizontally, for example the mean and standard deviation or percentiles of the measures can be plotted and inspected, as shown in the first case study.

For vertical landscapes, one is often interested in identifying the differences between groups, for example when using cluster analysis first and then aggregating the maps of each cluster separately. The simple difference between groups is then usually not enough for interpretation. Instead, statistical tests should be used in order to find out whether or not the difference is significant and not just a random artifact of the particular data. For example, for the number of concepts and two groups, a t-test can be employed in order to find out whether or not the two groups have significantly different means. This requires that the two groups are independent samples, which they typically are, and that the number of concepts is normally distributed with equal variance between the groups, which can be checked with additional tests. Usually, a normal distribution can be assumed if the groups are large enough.
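With purely hypothetical numbers, such a comparison could look as follows.

```r
# Minimal sketch (hypothetical data): comparing the mean number of concepts
# per map between two groups, assuming independent samples and equal variance.
concepts_group1 <- c(12, 15, 9, 14, 11, 13)
concepts_group2 <- c(8, 10, 7, 9, 11, 8)
t.test(concepts_group1, concepts_group2, var.equal = TRUE)
```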

7.2.3.2 Advanced Graph Measures

There is a range of different graph measures that can be used. A selection will be presented here with a short description and an explanation of why this particular measure can be assumed to work well for concept maps/landscapes. The second case study uses several of these measures.

Centrality The centrality of a node is a measure originating from research on social (and communication) networks and has been applied to the analysis of structural knowledge by e.g. Glöggler (1997). Also, Larraza-Mendiluze & Garay-Vitoria (2013, p. 68) point to the “similarities between concept maps and social networks”. As Freeman (1978) points out, there are several concurring definitions of what a “central” node might be. The centrality measure of a node can be based upon its degree, the number of shortest paths that this node lies upon, or the sum of minimal distances from this node to all others (cf. Freeman 1978, p. 219). All measures of centrality share the common behavior that the central node of a star is assigned the maximum centrality value of this graph and all nodes in a fully connected graph are assigned the same centrality value. Also, for each of the three methods, Freeman (1978) gives a method of calculation that is independent of the size of the graph, in order to allow comparisons between graphs of differing size. For concept maps, using the degree of a concept certainly yields a valuable insight, as a higher degree can be seen as indicating a concept that is better connected in memory. The betweenness-centrality of Freeman (1978) of a node or an edge is defined based upon the number of shortest paths in the graph that use this node or edge. In general, a “central” concept of a concept map or concept landscape should be one that is fundamental for the knowledge structure, i.e. one on which the connections of other concepts depend.
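As a sketch, the degree- and betweenness-based variants can be computed directly with igraph for the hypothetical map g introduced above.

```r
# Minimal sketch: centrality measures for the hypothetical map g from above.
degree(g)              # degree centrality of each concept
betweenness(g)         # Freeman's betweenness centrality of the concepts
edge_betweenness(g)    # betweenness of the propositions
closeness(g)           # based on the distances from a concept to all others
```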

Connectivity The edge-connectivity of two given nodes is the minimal number of edges that must be removed in order to separate the two nodes in the graph (cf. Balakrishnan & Ranganathan 2012, p. 53). Clearly, a high edge-connectivity between two concepts can be seen as an indication of a strongly developed structural knowledge surrounding these concepts. Conversely, a low edge-connectivity means that the concepts are only connected by a few paths and can easily be separated. Accordingly, the graph-connectivity is the minimum of all edge-connectivity values over all pairs of nodes. A high graph-connectivity means that each pair of nodes has a high connectivity (which would typically be assumed for the densely connected knowledge of an expert). A low graph-connectivity only means that at least one pair of nodes has a low connectivity and is, as such, more difficult to interpret in a useful way.

Diameter The diameter of a graph is the longest among the shortest paths (taking the edge weights into account if the graph has weights) between any pair of nodes in the graph (cf. Ifenthaler 2006). In other words, without artificially prolonging a path by using circles or “expensive” edges, the diameter is the longest path that has to be traveled between any two nodes of the graph. A linear, chain-like graph with n nodes and no weights will have a large diameter of n − 1, whereas a completely connected graph without weights will always have a diameter of 1. So, a small diameter is in general preferable, as the concepts are then more densely connected - in contrast to Ifenthaler (2006, p. 48), who wrongly attributes a large diameter to a more complex graph. The diameter is also best seen in relation to its maximal possible value, as the absolute values of different graphs are usually not comparable - unless, for example, the number of nodes is held constant.

Node Degrees The degree of a node is the number of neighbors of the node. Used by itself, it can be taken as an indicator of how well connected a concept is in a knowledge structure. Additionally, when taking the degrees of all the nodes into account - also called the degree sequence (cf. Balakrishnan & Ranganathan 2012, p. 10) - it can serve as an indicator of the three types of morphology “spoke”, “chain”, and “net” as described in section 4.1.1. Specifically, if the average node degree is low - for a connected graph with more than two nodes, the minimal value is between 1 and 2 - and there is not much variation, the graph will resemble a “chain”. If the average node degree is high - the maximal value is the number of nodes less one, for a complete graph - and there is not much variation, the graph will resemble a “net”. Finally, if the average node degree is neither particularly low nor high and there is variation (few nodes have a high degree, many have a low degree), the graph will resemble a “spoke”. For the prototypical concept maps of the three types, as shown in Fig. 6, with n nodes, the average node degree for a chain map as well as for a spoke map is 2 − 2/n, and for a (completely connected) net map it is n − 1. The standard deviation for the chain approaches zero as the number of nodes grows. The standard deviation for the net is zero, as each node has the same degree, and the standard deviation of a spoke grows with the number of nodes.
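These measures are likewise available in igraph; the following sketch again uses the hypothetical map g from above.

```r
# Minimal sketch: connectivity, diameter and degree-based measures for g.
edge_connectivity(g)     # graph connectivity (minimum over all pairs of nodes)
diameter(g)              # longest of the shortest paths between two concepts
mean(degree(g))          # average node degree
sd(degree(g))            # variation of the degree sequence
```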

7.2.3.3 Community Detection

Community detection is a form of partitioning (or clustering) the nodes of a graph. A community is a subset of nodes such that the nodes within a community are densely connected, whereas the nodes of differing communities are not or only sparsely connected. The components of a graph are a natural way of forming communities. However, there are also methods that identify communities within a connected graph, yielding new insights into the structure. In terms of interpretation, a community in a concept map can be seen as a set of concepts that “belong” together. There are several algorithms that try to identify communities based on different assumptions about how a community can best be characterized. Among others, the algorithms called “Walktrap” (Pons & Latapy 2006) and “Spinglass” (Reichardt & Bornholdt 2006), and a greedy algorithm described by Clauset, Newman & Moore (2004), have been used successfully with concept landscapes.

All of these algorithms are designed with much larger and denser graphs in mind; therefore, the results for concept landscapes of typical sizes do not differ very much (if at all) between the algorithms, and running time can be neglected completely. The second case study uses community detection as part of the analysis. Also, the layout described below relies on communities.
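All three algorithms are, for example, available in igraph; the sketch below applies them to the hypothetical map g from above (real concept landscapes are of course larger, and Spinglass additionally requires a connected graph).

```r
# Minimal sketch: community detection on the hypothetical map g from above.
cluster_walktrap(g)              # Pons & Latapy (2006)
cluster_spinglass(g)             # Reichardt & Bornholdt (2006)
cluster_fast_greedy(g)           # greedy modularity optimization (Clauset et al. 2004)
membership(cluster_walktrap(g))  # community assignment per concept
```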

7.2.3.4 Frequent Subgraph Analysis

Induced subgraphs can be used in the context of concept landscapes, for example, to reduce concept maps to a common set of concepts before aggregating them, or to reduce a landscape to a small number of concepts that one is interested in for analysis, as shown in the first case study.

However, subgraphs can also be the focus of interest themselves, concerning, for example, the distribution and frequency of occurrence of the different structural combinations that a (small) number of concepts can have in a set of maps. It is unlikely that all structural configurations of, for example, four different concepts appear in equal numbers. Instead, there will probably be only a subset of the (exponentially) many configurations appearing at all, and the distribution might show a clear preference for one or two structural configurations. This, in turn, may reveal something about the way these concepts are typically connected in memory.

When restricting analysis to subgraphs up to a certain (small) size and using efficient methods, concept landscapes can be analyzed in this vein. Hubwieser & Mühling (2011b) present an algorithm developed in the course of this work that counts the occurrences of all structurally different subgraphs up to a given number of nodes. A similar approach is suggested by Yoo & Cho (2012). Grundspenkis & Strautmane (2010) suggest using graph similarity for small patterns in the context of automatically scoring concept maps.

For concept landscapes, the subgraphs are typically undirected and the nodes are distinguishable entities: In general, it makes a difference whether concept A is connected to concept B or to concept C. The investigation of subgraphs nonetheless becomes computationally expensive rather quickly, as the number of different structural configurations grows exponentially with graph size, which also makes interpretation difficult. The software package CoMaTo presented in the next chapter therefore restricts the analysis of subgraphs to those of three or four concepts. For two concepts, the task resolves to finding out whether or not the concepts are connected in the graph. If concepts connected to themselves are not allowed, there are 2^((n²−n)/2) different structural configurations for n concepts: There are n² − n positions in the adjacency matrix outside the main diagonal that can be filled with either 1 or 0, but since the matrix must remain symmetrical, only half of them can be chosen freely. For five concepts, this already amounts to more than 1000 different structures - which is clearly beyond a size that can be interpreted manually. Since fast algorithms usually do not treat nodes as distinguishable entities, CoMaTo uses an exhaustive search that first creates all possible subgraphs of a given size and set of concepts. Then, each concept map of an accumulated concept landscape is first restricted to the subgraph of the given concepts and then matched to one of the possible subgraphs. For the amounts of data used in the case studies and subgraphs of at most four concepts, the running time doesn’t impose problems on the analysis, so there is no need for computationally faster approaches. An analysis of the frequency of subgraph structures is presented in the first case study; it is a convenient way of analyzing the different structural combinations of a small subset of concepts in more detail. In theory, subgraphs can also be investigated independently of specific concepts. The result would then show whether or not certain structural arrangements are preferred per se.
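CoMaTo's own implementation is not reproduced here, but the basic idea of tabulating the structural configurations of a chosen concept subset can be sketched with plain igraph and two tiny hypothetical maps:

```r
# Minimal sketch (hypothetical maps): counting the structural configurations
# of a chosen set of concepts across a set of concept maps.
library(igraph)
m1 <- graph_from_edgelist(rbind(c("class", "object"), c("object", "method")),
                          directed = FALSE)
m2 <- graph_from_edgelist(rbind(c("class", "object"), c("class", "method")),
                          directed = FALSE)
maps <- list(m1, m2)
chosen <- c("class", "object", "method")

configuration <- function(map) {
  sub <- induced_subgraph(map, intersect(chosen, V(map)$name))
  el <- as_edgelist(sub)
  if (nrow(el) == 0) return("(no edges)")
  # canonical encoding: order the endpoints of each edge, then the edges
  paste(sort(paste(pmin(el[, 1], el[, 2]), pmax(el[, 1], el[, 2]), sep = "--")),
        collapse = "; ")
}
table(vapply(maps, configuration, character(1)))
```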

7.2.4 Visualization

As concept landscapes are mostly a tool for data analysis, their visualization in this thesis deals with the task of visualizing the information that is used for analysis and contained within the landscape. As Chen (2002, p. 1) puts it:

“[I]nformation visualization can be broadly defined as a computer-aided process that aims to reveal insights into an abstract phenomenon by transforming abstract data into visual-spatial forms. The intention of information visualization is to optimize the use of our perceptual and visual-thinking ability in dealing with phenomena that might not readily lend themselves to visual-spatial representations. [...] [It] traditionally focuses on finding meaningful and intuitive ways to represent non-spatial and non-numerical information to people.”

This thesis only deals with the visualization of graphs, which has proven useful in analysis. The visualization of, for example, vertical accumulations in the form of a binary matrix is an interesting topic in itself, though.

In a way, a concept map itself is a visualization of the abstract phenomenon of conceptual knowledge - which is especially obvious when using it as a teaching aid - and this has aptly been called “knowledge visualization” (cf. Cañas et al. 2005, p. 205). While a concept map usually already has a layout, determined by its creator during drawing, a concept landscape doesn’t, as it is aggregated from many different maps. Also, aggregations often yield much larger graphs that cannot easily be grasped visually without a good layout. The task of visualizing amalgamated concept landscapes therefore amounts to the visualization of graphs which may be weighted, labeled, or show a temporal development.

Clearly, there are many available methods for visualizing graphs that can be used or adapted for the task at hand. Visualizing graphs is a broad task that can focus on many different, sometimes even mutually exclusive aspects. For concept landscapes the task focuses on the structure of a graph that is usually moderate in size. However, the structural information is only useful when combined with at least the labels of the nodes - the concepts. Part of the structural information is often contained in (usually numeric) attributes of the edges, like weights or scores. So, in contrast to the visualization of “big data”, like the traffic of the major internet backbones, for example, the focus usually does not lie on the “big picture” but more on a number of details that must be visible.

7.2.4.1 Vertical Landscapes

Visualizing vertical landscapes is equivalent to finding a “good” node layout for a graph and deciding if and how additional information like edge weights should be visualized. Visualizing edge weights is easily done by using differing line widths, as e.g. in (Eckert 2000). Other measures, like the centrality of a node, are sometimes displayed by differing node sizes, as e.g. in (Diethelm, Hubwieser & Klaus 2012). Glöggler (1997, p. 164ff.) describes a visualization of a set of graphical networks similar to concept maps that uses different node sizes depending on their frequency of occurrence in the set.

There are many algorithmic solutions for automatically determining “good” layouts. What constitutes a good layout depends on the context of its use: It may be “good” to avoid crossing edges, to find a visually appealing layout, or even to keep the layout constant between different visualizations (e.g. a circle layout). A good layout for a concept map will present structural information to the reader while retaining a pleasing visual appearance and good readability.

For the figures in the next part, mostly a force-directed layout was used2, with some manual adaptations. A force-directed layout calculates node placement based on a physical simulation of attracting and repelling forces between the nodes, derived from the structure (cf. Kaufmann & Wagner 2001, p. 71ff.). A similar method is used for analyzing single concept maps by Koponen & Pehkonen (2010).
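In igraph, a comparable force-directed layout is available as the Fruchterman-Reingold algorithm; the one-liner below is merely a stand-in for the yEd layout actually used for the figures.

```r
# Minimal sketch: force-directed layout of the hypothetical map g from above.
plot(g, layout = layout_with_fr(g))
```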

Additionally, the following algorithmic approach has been successfully applied to concept landscapes, as presented in (Hubwieser & Mühling 2011a) and is also implemented in the CoMaTo package described in the next chapter:

2“Organic layout” as implemented in yEd, http://www.yworks.com/yed

1. Calculate a set of communities in the graph using a suitable algorithm.

2. Allocate space for each community around a circle.

3. For each community: Place all nodes of the community along a circle, which is then sized appropriately and placed in the allocated space of step 2.

4. Optimize the position of each node and of each community by using a suitable optimization algorithm.

Tests have shown that using the Spinglass algorithm in step 1 and “Simulated Annealing” (Kirkpatrick, Gelatt & Vecchi 1983) in step 4 provides good results. The optimization works by switching the positions of nodes within each community as well as switching the positions of whole communities (at random), using the total length of the edges on the Euclidean plane as a measure of quality. With increasing runtime (i.e. lower “temperature”), a change that doesn’t increase the quality is rejected with increasing probability.

Leaving out the last step results in a graph that has basically the same structure (many small circles of concepts arranged around a larger circle of communities), but it will often be hard to read, as there are many unnecessary edge intersections and long edges connecting communities placed far apart. Fig. 19 shows the result of applying this method to a larger concept landscape taken from the first case study.
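The sketch below reproduces only the basic placement of steps 1 to 3 for the hypothetical map g from above; the simulated annealing of step 4 is omitted, and Walktrap stands in for Spinglass simply because it is more robust for very small graphs. It is not the CoMaTo implementation.

```r
# Minimal sketch: steps 1 to 3 of the layout, without the optimization of step 4.
comms <- membership(cluster_walktrap(g))       # step 1: community per concept
k <- max(comms)
coords <- matrix(0, nrow = vcount(g), ncol = 2)
for (i in seq_len(k)) {
  members <- which(comms == i)
  # step 2: allocate space for community i on a large circle
  centre <- 4 * c(cos(2 * pi * i / k), sin(2 * pi * i / k))
  # step 3: place the community's concepts on a small circle in that space
  angle <- 2 * pi * seq_along(members) / length(members)
  coords[members, ] <- cbind(centre[1] + cos(angle), centre[2] + sin(angle))
}
plot(g, layout = coords)
```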

7.2.4.2 Horizontal Landscapes

For horizontal landscapes, the major point of interest usually is the development of a set of concept maps over time. Therefore, the visualization should focus on this aspect. In the course of this work horizontal landscapes were only used in order to display the development of graphs; either as a horizontal accumulation of a set of maps, or as the horizontal accumulation of a set of vertical amalgamations. In both cases, the data consists of a “time series” of maps that should be visualized.

One way of achieving this is by computing a “master” layout for the concept landscape that determines a fixed position for each occurring concept and then using this layout consistently throughout the visualization of the constituent maps. This has been previously suggested for the comparison of different concept maps (cf. Koponen & Pehkonen 2010, p. 1659). When visualizing the development of a concept map, it makes sense to use the layout of the latest map, as far as applicable. Some concepts will not appear in the last map and are therefore not covered by this layout; these can then be placed manually, for example on a circle around the layout of the final map. CoMapEd, presented in the next chapter, uses this scheme in order to visualize the creation process of a user’s concept map by horizontally accumulating all of the intermediate steps as separate maps. Also, in the first case study in chapter 10 there is an application of this scheme (Fig. 29 and Fig. 30).
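As a sketch of this idea (not CoMapEd's actual code), a fixed layout can be computed once from the last map of a series and then reused for every map; here, the two hypothetical maps from the subgraph example above stand in for a time series, and all concepts are assumed to appear in the last map.

```r
# Minimal sketch: reuse a "master" layout taken from the last map of a series
# so that every concept keeps its position across the visualizations.
series <- maps                          # hypothetical "time series" of maps
last   <- series[[length(series)]]
master <- layout_with_fr(last)
rownames(master) <- V(last)$name
for (g_t in series) {
  plot(g_t, layout = master[V(g_t)$name, , drop = FALSE])
}
```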

Fig. 19: The result of applying the layout algorithm to a concept landscape taken from the first case study. It is based on communities that are aligned on a circle; the largest community forms the center. Even though the node placement has been optimized, there are several improvements that could be made.

8 Software Support for Concept Landscapes

Fig. 20: The three software projects that support the pivotal points of the schema presented in Fig. 1

Following the central theme of this thesis - new ways of analyzing aggregated concept map data - the usefulness of computer support is self-evident: The more data there is, the more interesting the results will be, in general. But the more data there is, the more tedious or even virtually impossible it becomes to collect, aggregate, and analyze it manually. This has also become obvious in the course of the studies that form the basis of this work, where some parts have only been manageable by employing several students to help with the amounts of data. Therefore, if possible, software should be used at pivotal points to increase the amount of data that can be handled. Referring to the general research setting of Fig. 1, there are three processes that can be supported by software: the externalization of knowledge, the analysis of concept landscapes, and the analysis of (textual) input. Consequently, three software projects have been developed in the course of this thesis; Fig. 20 shows how they support the workflow. The next sections present these projects. Since software engineering itself is not the focus of this work, only a short overview of available software and its shortcomings, a list of requirements, and some notes regarding the implementation and the final software are given.

8.1 CoMapEd

The collection of concept maps can happen in many ways. However, if the subsequent analysis is done digitally, the maps must either be collected electronically in the first place or be digitized later on. Obviously, the way of collecting the maps should not impede the externalization of knowledge itself. As noted in section 6.1.1, electronic solutions for creating concept maps are already in use and it seems safe to assume that there is no negative impact of using a software-based approach to concept mapping; it can even be assumed that there is a positive impact. Another effect in favor of software-based drawing is that the specific graphical syntax of concept maps can be “enforced” much more easily. Pen-and-paper based maps, as the experience of the case studies has shown, often vary greatly in syntax in ways not consistent with concept maps, such as arrows connecting multiple concepts at once. This section presents the software tool CoMapEd (Concept Map Editor), which was designed and implemented for this purpose.

There are existing software solutions for drawing concept maps in various settings. Most notably, there is Cmap Tools1, as described by Cañas, Hill, Carff, Suri, Lott & Gómez (2004). Most solutions are either not exclusively intended for concept mapping but are more general drawing tools, or they implement enhanced versions of concept maps that go beyond the basic structure, like COMPASS2, described by Gouli, Gogoulou, Papanikolaou & Grigoriadou (2005a). Also, all of these solutions seem to be based on a program running on the user’s computer.

1http://cmap.ihmc.us
2http://hermes.di.uoa.gr/compasseng.htm

8.1.1 Requirements Analysis

A typical use of the system encompasses two kinds of users: Participants use the system to draw a concept map. Researchers use the system to collect the concept maps of participants in order to analyze them.

For participants, the system should present itself as an easy to use editor that allows them to draw concept maps in their basic form - labeled concepts that are connected by labeled propositions. The editor should also offer some additional benefits for the participants in order to keep their motivation for externalizing their knowledge as high as possible; a negative impact on motivation will most probably also have a negative impact on the complexity of the results. The additional benefits include, for example, the possibility to save the map and continue to work on it later, or the possibility to export the map in order to print it or continue working on it with other programs. The operation of the system must be mostly self-explanatory so that no or only very minimal prior training is required for the participants - assuming the participant has at least some experience as a computer user. Also, it should run on as many computer platforms as possible, with as few additional requirements as possible, so that the participants can use it at home or at any other place.

A researcher using the software should be able to manage typical research scenarios, which include longitudinal and latitudinal surveys. Also, it must be easy to give access to participants. It should be possible to restrict the actual concept mapping task in the typical ways described in chapter 6 - restricting the list of concepts and restricting the list of edge labels that may be used. The collected data should be exportable by the system and basic analysis should be provided directly within the system. Also, the process of concept map creation should be observable for the researchers.

The following is a summary of the functional (FR) and non-functional (NFR) requirements derived from these considerations:

FR1 A participant must be able to draw a concept map consisting of graphical symbols with labels and labeled connections between those symbols.

FR2 It must be possible for a participant to save and later load a map in order to continue working on it.

FR3 It must be possible for a participant to export the map in formats that can be used by other programs.

FR4 A researcher must be able to create and manage surveys for longitudinal as well as latitudinal studies.

FR5 A researcher must be able to provide and restrict access to a survey.

FR6 It must be possible to restrict the process of concept map creation to a certain set of concepts or to a certain set of possible labels for the connections between concepts. Also it must be possible to display a focus question or task description and to give a concept map as starting point for the participants.

FR7 The concept maps belonging to a survey must be accessible/exportable by the researcher.

FR8 Basic analysis of the concept maps of a survey should be supported within the system for the researcher.

NFR1 The software must be easily operable for the participant, i.e. a person with typical experience in using computer programs should be able to draw a concept map with the software without prior training.

NFR2 The software must have only minimal system-requirements in order to be accessible to a wide variety of participants.

NFR3 The concept maps of the participants must be stored centrally, so that they are automatically accessible to the researcher, without the participant taking special action.

NFR4 The system must record a history of each concept map.

NFR5 A larger-scale survey with reasonably sized concept maps must not pose any performance problems for the system.

8.1.2 Design and Implementation

The design of CoMapEd is based on several decisions that were made according to the requirements. In particular:

• To fulfill NFR2 and NFR3 from the list of requirements, a web-based system that runs in a browser is an obvious choice. Alternatives, like a client/server system, would always require a program to be downloaded and run, which in turn would typically involve a browser anyway and would impose more requirements than a simple web-based system. On the downside, drawing concept maps within a browser requires modern technology, which might affect NFR2 in a negative way. Therefore, care was taken to keep the actual requirements for the browser as lean as possible. In the case of CoMapEd, this means that an approach using HTML5 and Javascript was chosen, instead of relying on other technologies like Adobe Flash.

• The software consists of a front end that the participants use and a back end for the researchers. Concept maps are organized in surveys, as required. Surveys in turn are organized into projects, as this makes managing a longitudinal survey with several assessments easier.

• Concept maps are identified by a code (“slug”) that the participants can use to reload a map at any time. This also gives researchers a possibility to invite participants individually. Additionally, surveys can be made accessible by a code, which serves as a way of creating new maps anonymously. Each map will still have a personal code, but the researcher does not, in general, know which code belongs to which person.

The design was implemented as a website using the Ruby on Rails3 Framework. Additionally, for a modern and clean appearance, the Bootstrap4 CSS-Framework was used. The drawing of the concept maps is done with the SVG support built into HTML5 and is based on the D35 (Data-Driven-Documents) Javascript library.

Fig. 21(a) shows the main screen that a participant uses to create a map. The drawing area forms the largest part of the screen. All actions are available in the menu on the left-hand side of the screen. The drawing itself can be done either from the menu or by using the mouse on the drawing area. Additional information, if necessary, is presented in a box at the top. Directly after logging in, an introductory text is presented there, which can be defined by the researcher and can include the focus question. Additionally, online help and instructions can be shown there if the user selects this action from the menu. Also, if a concept map is opened for the very first time, a short introductory text about concept mapping, including an example, is presented to the user before the main screen appears. Currently, the map can be exported in two formats: as a vector graphic (SVG) or in the simple graph format (TGF) that can, for example, be opened by yEd.

Fig. 21(b) shows the back end view that is presented to the researchers. The projects and surveys are organized in the left vertical menus. The concept maps of a survey are then presented horizontally, including a preview of the map, basic information and the possibility to view the history of its creation. For this, the layout described in section 7.2.4.2 for horizontal landscapes is used. Maps, surveys, and projects can be exported and imported. Additionally, a “filter” view allows creating and exporting an amalgamated landscape from the set of all concept maps present in the projects of a researcher.

The system has been used to collect almost 1000 maps in several settings so far. Both front end and back end are available in English and German, depending on which language the browser requests (English being the default). The feedback of users has been generally positive.

3http://rubyonrails.org
4http://getbootstrap.com
5http://d3js.org

(a) Frontend

(b) Backend

Fig. 21: Screenshots of the front- and back end of CoMapEd.

8.2 CoMaTo

The analysis methods presented in the last chapter were all described with the goal of implementing them in software. This section presents the software package CoMaTo (Concept Mapping Tools), which allows working with digitally stored concept maps in GNU R.

Cañas, Bunch & Reiska (2010) present analysis software that is extensible and is built upon the data format of Cmap Tools. It may be possible to employ it for use with concept landscapes. However, especially when considering that an analysis of concept landscapes can include many different aspects of statistics and can often be exploratory in nature, using a platform like R for the task provides many benefits. There are ways of using R together with existing packages like igraph to analyze concept landscapes directly. Nevertheless it is worthwhile to implement the methods in an explicit package for reuse and for ease of analysis. Consequently, all analyses presented in the next part have been done with CoMaTo.

8.2.1 Requirements Analysis

The central application of CoMaTo is the computer aided analysis of concept landscapes using the methods presented in the last chapter. The following list of requirements has been identified:

FR1 The package must be able to import concept map data in different formats. Especially, it must be able to import the export of CoMapEd.

FR2 The analysis techniques as presented in section 7.2, if applicable, must be supported by the package.

FR3 The package must provide data structures that allow applying other analysis techniques as well.

FR4 It must be possible to display or export the results (e.g. for preparing scientific publications).

NFR1 The software must be easy to use for a person already experienced with GNU R and its package system.

NFR2 A typical survey with reasonably sized concept maps must be analyzable without posing problems regarding memory usage and running time to a current computer.

8.2.2 Design and Implementation

The package is based on the “S3” object orientation of the R programming language and offers two basic data structures/classes: concept.map embodies a single map and concept.maps a set of maps. Each concept map is represented as a graph that is stored in an igraph object. Concerning the input capabilities, the differing formats of data that appeared in the various case studies have been incorporated. This includes concept maps stored in spreadsheets as lists of propositions as well as the two graph formats TGF and GraphML. The system can also be used flexibly, as there exists a constructor that simply accepts an R matrix as input. Concerning the output, the package can generate R plots and TGF files, e.g. for Pathfinder networks.

Based on the two data structures, a set of functions is available to work with the data. Most notably, the function landscape implements the aggregation of a concept.maps object. Vertical amalgamations and accumulations are supported in an extensible manner. The creation of Pathfinder networks from a set of concept maps and the clustering using multivariate Bernoulli mixtures as described in the last chapter are implemented as analysis methods. Both basic and advanced graph measures are also available - mostly based on the corresponding implementations of the igraph package. Visualizations and horizontal landscapes are supported in the form of plotting both a single map and a series of maps using the layout algorithm described in section 7.2.4.

8.3 ConEx

This section presents the software ConEx (Concept Extraction), which allows the automated analysis of texts. While not a central aspect of this thesis, the analysis of texts that are used as input for learning is nevertheless a worthwhile endeavor. The goal is to automatically extract salient concepts, and propositions based on these concepts, from text data. While a manual analysis is possible and has been done for all of the case studies of the next part, an automated analysis is far easier, more reproducible, less prone to spurious errors and allows the processing of much larger texts. This becomes especially important for content domains where experts don’t necessarily agree on the structure and importance of concepts (cf. Trumpower et al. 2010, p. 24) and querying an expert might therefore not yield the desired results.

There are many existing suggestions and implementations of programs that work with texts, also for the specific task of extracting salient terms from a corpus. In the course of the research projects, the systems described by Eynard, Marfia & Matteucci (2010) and Schöneberg (2010) have been investigated; MaxQDA6 has also been used to identify nouns in texts together with their absolute frequency and the positions where they appear in a text corpus.

8.3.1 Requirements Analysis

The following list of requirements has been identified:

FR1 The software must work with texts in English and should also work with texts in German.

FR2 The software must be able to extract salient concepts (according to some measure) and a list of sentences in which these concepts appear from text stored in a file.

FR3 The user must be able to modify the list of salient concepts prior to the extraction of associations.

FR4 The results must be exportable in a suitable format.

NFR1 The software must be extensible concerning languages and measures of salience.

NFR2 The software should offer support for different common file formats regarding its input.

8.3.2 Design and Implementation

ConEx has been implemented as a stand-alone Java program. It does not present a novel approach to text analysis, but is a software solution specifically designed for the task at hand that builds upon established methods and existing software. It is based on the NLP toolkit and WordNet, presented in section 5.4.1.

6http://www.maxqda.com

Fig. 22: Screenshot of the graphical user interface of ConEx.

Fig. 22 shows the graphical user interface7. A version that can be operated from the command line is also implemented. The user interface follows, from top to bottom, the basic workflow of choosing input files and general settings, then extracting concepts, and finally extracting propositions (associations, or sentences) from a subset of the extracted concepts. Both PDF and plain text files are supported as input. Currently, English and German are supported as input languages (WordNet is only available for English). As measures of salience, both the absolute frequency and the TF*IDF weighting scheme (see section 5.4) can be used. In the latter case, a corpus of texts has to be given which defines the “baseline” frequency of the words in the weighting. Both the concepts and the associations can then be exported in the CSV format. The object-oriented design of the program and a suitable class hierarchy allow the easy inclusion of new languages, file formats, or translations of WordNet. The extraction algorithm for the concepts works as follows (a simplified sketch in code is given after the steps):

7 The user interface has been developed by Reinhard Hahn as part of a student project.

1. Tokenize the input text.

2. Classify the tokens of each sentence.

3. For each noun:

(a) If available for the chosen language: create a baseform of the noun using WordNet.

(b) Store the number of occurrences for the noun. Nouns are treated as equal if they have the same baseform (or by simple string comparison if no baseform is available).

4. Return the salience of each noun as defined by the chosen measure.
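
The following is a heavily simplified R sketch of the frequency part of this procedure; it merely tokenizes a text and counts terms, deliberately omitting the POS tagging of step 2 and the WordNet baseform reduction of step 3.

# Simplified sketch: count term frequencies in a text.
# POS tagging and baseform reduction are omitted here.
extract_frequencies <- function(text) {
  tokens <- unlist(strsplit(tolower(text), "[^a-zäöüß]+"))
  tokens <- tokens[nchar(tokens) > 0]
  sort(table(tokens), decreasing = TRUE)
}

extract_frequencies("The state of an object is determined by the values of its attributes.")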

The associations are then extracted by using the tokenized sentences. Each sentence in which at least one of the selected concepts appears is added to the output. This may include sentences in which not every appearing concept is also on the list of selected concepts. Selecting only sentences in which every appearing concept is on the list of selected concepts would, in general, incur the loss of a lot of information. Therefore, subsequent analysis steps must take into account that not every sentence of the generated output may be viable for use in the analysis of learners’ knowledge artifacts.

Part IV

Case Studies

9 Overview

This part presents the results of three case studies CS1 to CS3 that were originally conducted as research projects and led to the development of concept landscapes. As part of this thesis, the analyses were redone using the new notions. Each chapter describes one of the studies and they are all structured alike, reminiscent of a research publication.

First, the general context and specific setting of the study are presented in detail. References to related research and literature are also included where appropriate. The choice to exclude these references from the related work part was made intentionally, as the settings of the case studies are not directly relevant for any other parts of this thesis. Next, the collected data is presented in detail together with the research question that the experiment focuses on. The data is always centered around concept maps, usually accompanied by additional information that was collected or otherwise available for analysis. The specifics of the concept mapping task are described there as well. Then, the analysis methods and results are presented. Care is taken to clearly state the aggregation method and the analysis technique. The presentation of the results is separated from their discussion, which forms the final section. There, the answers to the research questions are given, as far as the results allow it.

Two of the studies - including the experimental results - have been previously published in the course of this thesis. The relevant publications are cited at the beginning of each chapter. Also, some text parts of the following chapters have been taken verbatim from these publications but are not shown as quotes here, for better readability. All of the analysis results and figures were re-done for this thesis - a close similarity to the respective publications is necessarily still present, though.

Taken together, the case studies serve to give an overview over the typical workflow of an analysis using concept landscapes as presented in section 7.1 and defined in section 7.1.1, as well as the analysis methods presented in section 7.2. The strengths and weaknesses of the techniques as well as the typical way of interpreting the results will be shown. As has been mentioned in chapter 2 already, the case studies are all research projects in their own right. So, the results also provide valuable insight into the fundamentals of computer science education and are more than mere test beds for the new analysis methods. The following description gives a quick overview over the major facts of each project:

CS1 Computer science education for non-majors

Focus: Monitoring the knowledge development in the course of an introductory CS lecture to non-major students in their first semester.
Results: Despite the difficult setting concerning the students’ motivation, they don’t necessarily resort to rote-learning, and the learning process according to the models of learning presented in chapter 3 can be made visible.
Conducted: October 2010 to February 2011
Collected concept maps: 90
Participants: Non-major students of an introductory lecture
Analysis methods: Basic graph measures, subgraph frequencies, horizontal visualization
Results previously published: Hubwieser & Mühling (2011a) and Hubwieser & Mühling (2011c)

CS2 Knowledge structures of beginning CS students

Focus: Beginning CS students and their conceptual knowledge about core CS concepts. Also, measuring the impact of a newly introduced compulsory school subject “Informatics”.
Results: The knowledge structure of a beginning CS student is rather complex. In general, students who attended the school subject have a different knowledge structure than the others, showing artifacts of a more formal CS education.
Conducted: April 2011 and October 2011
Collected concept maps: 590
Participants: Beginning major students
Analysis methods: Pathfinder networks, advanced graph measures, community detection, cluster analysis
Results previously published: -

CS3 Conceptual knowledge and abilities

Focus: Development of the knowledge of concepts related to object-oriented programming in a preparatory course for beginning CS students. Only minimal teaching input was given - instead, self-guided learning-by-doing was stressed. Additionally, the source code was analyzed in order to relate the conceptual knowledge to programming abilities.

Results: It is possible to have students develop their mental models with only very little input but practical programming experience. Also, the knowledge about concepts and the ability to use them successfully are interrelated in different ways.
Conducted: October 2010
Collected concept maps: 188
Participants: Beginning major students
Analysis methods: Pathfinder networks, cluster analysis
Results previously published: Berges et al. (2012)

10 CS1: Computer Science Education for Non-Majors

Most parts of the results - in particular the description of the context and the collected data - have been previously published in (Hubwieser & Mühling 2011c). Additionally, some results have been presented in (Hubwieser & Mühling 2011a).

Today’s world is heavily dependent on computing technologies and electronics. Consequently, the demand for computer science education in nearly every institution has risen (cf. Guzdial 2003). Especially at universities, fields of study with a more technical background will typically have some form of computer science education in their curriculum. For example, almost every major of engineering at the TU München has compulsory CS education in the first semesters. Sometimes, the courses are specialized for the particular field of study (e.g. they focus primarily on writing short scripts in Matlab), or they provide a general overview over the basics of programming and computer science.

Teaching computer science to non-majors is a challenge: “Studies of the problem point to the overemphasis in computer science classes on abstraction over application, technical details instead of usability, and the stereotypical view of programmers as loners lacking creativity” (Forte & Guzdial 2004, p. 1). Also, the motivation of students in a non-major course is usually different from those in major courses, as typically, their main area of interest is not computer science but some other field. “Historians, writers, architects, and engineers (just to name a few) have diverse interests and require different kinds of computational proficiency to perform the tasks that are important to them” (Forte & Guzdial 2004, p. 3). Therefore, and given that typically the prior knowledge of CS related concepts will be only weakly developed, it is much more likely that non-major students will resort to rote-learning in computer science classes, unless they see a clear relevance, are motivated, and decide to learn meaningfully. Investigating the knowledge structures of non-majors can reveal whether or not this actually happens.

10.1 Description of the Setting

This case study presents the results of investigating a lecture that is held over the first two semesters for students with a major of geodesy at the TU München. The lecture has been held by the working group “Didactics of Informatics” as part of

a more than 10-year cooperation. This chapter presents results of a study that was done in the winter term of 2010/2011, where the course was held as a typical combination of lecture and problem session. In the following winter term it was completely restructured, described in more detail in German by Berges et al. (2013).

The course was designed as a one semester lecture dealing mostly with object- oriented programming, presented in an objects-first approach, closely following the 10th grade of the subject Informatics in Bavarian secondary schools presented in the next case study. In detail, the course contents were:

Chapter 1: Modeling Informatics: main subject areas, typical working methods; Functional modeling: data flow diagrams; modeling techniques in computer science.

Chapter 2: Object Oriented Modeling Objects in documents: object, class, attribute, method, class card, object card; artificial languages: grammars, BNF; states of objects: state, transition, state diagram, real and program objects; object diagram, association, class diagram, multiplicity of associations, compound objects: creation of objects as values of attributes.

Chapter 3: Algorithms The concept of algorithms: programming languages, class definition: definition and declaration, signature of methods, access modifier, attribute declaration, definition of methods; structure of algorithms: graphical representation of algorithms, structural components of algorithms, nesting of components, input and output of algorithms; properties of algorithms: terminating, deterministic, determined.

Chapter 4: Object Oriented Programming Definition of classes: structure of object-oriented programs, definition and declaration, signature of methods, access modifier, attribute declaration, definition of methods; assignment statement, ring exchange, assignment in constructor methods, encapsulation, equality; translation of computer programs, compiler vs. interpreter, execution of programs, course of events of a program; communication by methods: input, output, side effects, local and global variables/attributes; creating objects at runtime, constructor method, references, removal of objects; implementation of algorithms: structure elements in programming languages: sequence, conditional statement, repetition; arrays, index.

Chapter 5: State Modeling Finite automatons, triggering and triggered action, state chart; Implementation of automatons: switch statement; conditional transitions: complete state modeling, implementation of conditional transitions.

Chapter 6: Interaction and Recursion Implementation of associations: unidirectional, bidirectional, 1:1, 1:n, m:n multiplicities, association class; sequence charts: calling of methods, sequence charts; Recursive algorithms: linear and cascading recursion.

Chapter 7: Inheritance Generalization: Sub- and super classes, specialization, inheritance; implementation of specialization, overriding of methods, generalization, class hierarchies; polymorphism: calling methods of foreign classes, abstract classes.

10.2 Data Collection & Research Questions

Fig. 23: Overview over the study, based on the schema of Fig. 1.

The students of the lecture were investigated closely in a longitudinal research setting in order to monitor the development of their knowledge. The students were repeatedly asked to draw concept maps anonymously. A code was given to the students that they should use instead of their names. In this way, the concept maps of each person could be matched without being able to match them to the actual person. Additionally, the students were asked to participate in a voluntary short exercise in the middle of the semester, again using their code. In this way, there was some way of relating conceptual knowledge to the results of applying this knowledge

in a short exam-like exercise. The students were asked to draw concept maps at four points in time over the course of the semester:

1. Right before the very first lecture (pre test, PRE),

2. after 4 weeks of lecture (mid test 1, MT1),

3. after 8 weeks of lecture (mid test 2, MT2), and

4. after 14 weeks, right after the last lecture (post test, POST).

The written exercise was one week before MT2, the final exam was after POST. The data of the study consists of the concept maps of the four points of measurement and the scores of the exercise.

Participation began with all 38 students of the course in the pre test. MT1 was attended by 33 students. For MT2 and POST, 19 students participated. The exercise was completed by 17 students.

Concerning the concept mapping task itself, the students were always given a (restricting) list of 40 concepts. They were told to use only concepts from this list and to use only those concepts that they were familiar with. The focus question was to try to draw a concept map from as many of the concepts they were familiar with as possible. The students produced a new concept map each time. The first two tests were done using pen and paper and the second two tests were done using the graph drawing software yEd. Fig. 24 shows one of the maps produced with yEd in the second midtest. The participants had roughly 30 minutes for each concept mapping task. They received a short oral introduction to concept mapping at the first test. A written introduction was given to them each time.

Fig. 24: An example map of MT2 as originally produced by one of the students in German.

To arrive at the list of concepts, the following semi-automatic procedure was used on the slides as well as on the main recommended textbook of the course:

1. Reduce the text material of the course to simple statements by removing all explanatory sentences, examples and sentences that don’t carry relevant information. Remaining are sentences like: “If an attribute is marked as private, only objects of the same class are allowed to read or write its value.”

2. Remove all non-nouns from the text and convert all nouns to singular and to the nominative case.

3. Sort these descending by their frequency of occurrence.


4. Separate combinations of nouns that have their own meaning in the context (e.g. in German “Attributwert” is separated into “Attribut” and “Wert”).

5. Combine all nouns that have no meaning on their own in the context (e.g. “Garbage” and “Collection” are combined to “Garbage Collection”).

6. Remove nouns that are not part of the subject matter, that are too general (e.g. “Model”), too specific, technical (e.g. “RAM”), or proper nouns (e.g. “Pascal”).

While the last step of the process is not fully objective, it turned out to be largely reproducible, as verified by letting a second person follow the same guidelines and comparing the results. The result was the following list of 40 concepts:

action, aggregation, algorithm, array, assignment, association, attribute, class, condition, conditional statement, data encapsulation, data, data type, flow, function, function value, generalization, identifier, inheritance, initialization, input parameter, instantiation, interface, loop, method, method call, object, output parameter, polymorphism, program, reference, specialization, state, state machine, state transition, structural elements, subclass, transition, value, variable.

ConEx, as described in chapter 8, was also used on the same (unprocessed) input data. When taking the first 40 concepts sorted decreasingly according to their frequency and manually filtering out different word forms (e.g. plural), there is an agreement of more than 60%. The following concepts appear in both lists, sorted by their frequency in descending order: class, object, algorithm, reference, subclass, value, attribute, flow, program, identifier, loop, conditional statement, association, variable, function.

Additionally, the following concepts do not appear verbatim, but have a close correspondence to the concepts given in parentheses: call (method call), parameter and input (both input parameter), return value (output parameter), and structure (structural elements).

Finally, the following concepts do not appear in the manually derived list in any form (again sorted according to frequency): java, set, count, case, order, type, superclass, definition, language, statement, implementation, programming language, page, sequence, description, constructor, relation, form. Most of them would have been filtered out in the last step, which is missing in the fully automated concept extraction. All in all, using ConEx would have made the concept extraction easier; even without any manual intervention, the result would have an agreement of more than 60%.

Using these concepts, the text material was analyzed semi-automatically according to the following process:

1. Identify all propositions/sentences in the course material automatically by searching for occurrences of one or more of the extracted concepts.

2. As concept maps can only reflect monadic or dyadic associations, check the arity of the associations and keep only those.

3. Translate the information that is contained in the remaining sentences into propositions by qualitative means. If this is not possible unambiguously (e.g. due to a too complicated structure of a sentence), disregard the sentence.

Among the original 161 sentences, 101 contained two, 40 contained three, 17 contained four, and 3 contained five concepts. Therefore, most of the associations (63%) are dyadic and can be represented in a concept map. These sentences were then used as propositions for four “expert” concept maps (EM1 - EM4) containing the propositions and concepts that were presented to the students up to each point of measurement (including the exercise). When a pair of concepts was used in several propositions, only the first was added to the map. Therefore, the expert maps are mostly relevant concerning their structure. Fig. 25 shows the first expert map, EM1. The others can be found in the Appendix. Also Fig. 19 shows the structure of an amalgamation of the four maps EM1 to EM4 without regard to the temporal ordering.

Fig. 23 shows how this particular study can be represented in the basic scheme presented in the introduction of this thesis. The research questions are:

RQ1 How does the knowledge of the students develop?

RQ2 Which concepts are “common knowledge” among the students and which concepts provide the most difficulties to the students?

RQ3 Can the investigation reveal insights about the process of learning?

10.3 Analysis and Results

The propositions of the concept maps were scored using the technique “relational scoring with master map” as described by McClure et al. (1999). The “master map” is the amalgamation of EM1 to EM4. The propositions of the student maps were

Fig. 25: The first expert map containing all the propositions that were presented up to MT1.

scored using the three values 0, 1 and 2, where 0 and 2 represent the cases of incorrect and correct propositions with regard to the master map. The value of 1 is for cases where neither 0 nor 2 can clearly be used, i.e. a “partially correct” proposition. Also, for some of the analyses presented below, transformed scores of -1, 0, and 1 instead of 0, 1, and 2 were used. Summing these score values gives subtly different results in the sense that only the extreme values are actually affecting the sum. Also, a negative and a positive score will cancel out in summation.
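
To make the difference between the two scoring variants concrete, the following sketch uses made-up proposition scores of a single map.

# Made-up scores of one map on the 0/1/2 scale:
# two correct, one partially correct, two incorrect propositions.
scores <- c(2, 2, 1, 0, 0)

sum(scores)      # plain sum: 5
sum(scores - 1)  # transformed (-1/0/1) sum: 0 -- correct and incorrect
                 # propositions cancel out, partially correct ones do not count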

10.3.1 RQ1: Development of Knowledge

To analyze the knowledge development, the concept maps of each point of measurement were accumulated vertically by using four different mapping functions that map each concept map to its number of nodes, edges, components (i.e. ruggedness), and sum of edge scores, respectively. Fig. 26 shows the development of these basic graph measures over the course of the lecture, using the mean and standard deviation of the vectors resulting from each accumulation.
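
A minimal sketch of such a vertical accumulation, assuming the maps of one measurement are available as a list of igraph objects called maps and carry an edge attribute score for the scored propositions (the package's own helpers are not shown here):

library(igraph)

# Apply a mapping function to every map of a set.
accumulate <- function(maps, f) sapply(maps, f)

n_concepts     <- accumulate(maps, vcount)            # number of nodes
n_propositions <- accumulate(maps, ecount)            # number of edges
ruggedness     <- accumulate(maps, count_components)  # number of components
edge_scores    <- accumulate(maps, function(g) sum(E(g)$score))  # assumed attribute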

r              Concepts   Propositions   Clusters   Score
Concepts          -          0.95**       0.39**    0.60**
Propositions                   -          (0.18)    0.56**
Clusters                                     -      0.33**

Table 10.1: Spearman’s rank correlation between the four basic graph measures. As score, the sum of all proposition scores of a map was used (∗∗ = p < 0.01, () = p > 0.05).

It is interesting to see how closely the number of concepts and the number of propositions follow each other. As the maximal number of edges for n nodes is n(n−1)/2, the number of concepts determines an upper bound on the number of propositions. However, the data shows that there are approximately only as many edges as there are nodes in the concept maps, as also noted in section 6.1.1. Also, except for the edge scores, the variance increases over time. The increase in ruggedness (i.e. the number of components) and the stagnation of the edge scores are another interesting aspect.

To investigate the usefulness of simple graph measures further, the correlation between each pair of measures was calculated using all maps of all four points in time and Spearman’s rank correlation. This measures the degree of monotonic dependence between any two of the measures while only assuming an ordinal scale. The score of a map was calculated using the transformed scores and summing over all proposition scores of a map. Table 10.1 shows the results. Not surprisingly, the numbers of vertices and edges correlate almost perfectly. For ruggedness there is no large effect visible when compared with any of the other properties, which is interesting to note. The sum of scores shows a medium to large effect with all of the other measures. All values are significant with p < 0.01, except for the correlation between the number of edges and the number of clusters.
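
The entries of Table 10.1 can be reproduced with Spearman's rank correlation as implemented in base R; the values below are made up for illustration.

# Spearman's rank correlation between two per-map measures (made-up values).
concepts <- c(12, 18,  9, 21, 15)
scores   <- c( 5,  9, -1, 11,  6)

cor.test(concepts, scores, method = "spearman")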

Fig. 26: Development of the basic graph measures over the four points of measurement; panels (a) Concepts, (b) Propositions, (c) Components, (d) Edge Scores. Plotted is the mean of each measure; the top and bottom lines show the standard deviation from the mean. Where these would have exceeded the minimal or maximal actually occurring value, those values are used instead, indicating an asymmetrical distribution.

Finally, the subgraph frequencies of the four concepts attribute, object, state, and value are investigated. The lecture material contains the (tetradic) proposition: “The state of an object is determined by the values of its attributes”. As has been argued in section 6.2, it is not possible to encode such a complex proposition in a concept map unambiguously. The semi-automatic processing of the lecture material leads to a structural configuration of a circle (as also shown in Fig. 25), based on the four propositions:


1. attribute describes object.

2. object has state.

3. value determines state.

4. attribute has value.

Fig. 27 shows how the students chose to connect these four concepts. There are 64 different structural configurations, most of which have never been used. When ignoring the configurations appearing only once, five different subgraphs occur. The majority of maps show a completely empty subgraph (number 64 in the diagram), i.e. there aren’t any connections between the four concepts. The other four are shown in Fig. 28. For MT2, only the empty subgraph and the ones with numbers 24 and 28 were used by students in more than one map.

When analyzing the subgraph frequencies of the three concepts object, class, and attribute - which form a fully connected triangle in the master map - the subgraph occurring most frequently (12 out of 33 maps) is missing an edge between class and attribute. The two patterns appearing most frequently after that (with 6 maps each) are the triangle just like in the expert map and the empty graph.
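
A possible way to enumerate such configurations is sketched below, assuming each map is an igraph object whose vertex names are the concept labels. Each of the 2^6 = 64 possible edge patterns among four concepts is mapped to a number; note that this numbering is arbitrary and need not coincide with the numbering used in Fig. 27.

library(igraph)

# Map the induced subgraph of a set of concepts to a configuration number.
subgraph_id <- function(g, concepts) {
  A <- as_adjacency_matrix(g, sparse = FALSE)
  pairs <- combn(concepts, 2)                  # the 6 potential edges
  bits <- apply(pairs, 2, function(p) {
    all(p %in% V(g)$name) &&                   # a concept may be absent from a map
      (A[p[1], p[2]] + A[p[2], p[1]]) > 0      # either direction counts
  })
  sum(bits * 2^(seq_along(bits) - 1)) + 1      # configuration number in 1..64
}

# Frequencies over a list of maps:
# table(sapply(maps, subgraph_id,
#              concepts = c("attribute", "object", "state", "value")))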

10.3.2 RQ2: Common Knowledge and Misconceptions

To investigate the “difficulty” of concepts, only a subset of the list of concepts was used, since it is hard to visually keep an overview over the evolution of too many concepts. The lecture focuses mostly on object orientation. Therefore the analysis was restricted to concepts from this particular topic. A list of core concepts for this part of CS has been suggested as “quarks” by Armstrong (2006). There are nine verbatim correspondences of the quarks and the concepts in the list: aggregation, association, attribute, class, inheritance, instantiation, method, object, polymorphism. Additionally, the following six concepts of the list have a close correspondence to one or more of the quarks (given in parentheses): data encapsulation (encapsulation, information hiding), data type (abstract data type, typing), generalization (abstraction, class hierarchy), method call (interaction, message passing), specialization (class hierarchy, reuse), subclass (class hierarchy, extensibility, relationship, reuse).

To select concepts systematically and based on literature, these nine plus six concepts that are directly contained in or closely resembled by the list of quarks were used. The list is thus: aggregation, association, attribute, class, data encapsulation, data type, generalization, inheritance, instantiation, method, method call, object, polymorphism, specialization, subclass.

Fig. 27: The distribution of subgraphs of the four concepts attribute, object, state and value in the concept maps of MT1 (absolute frequency per subgraph number, 1 to 64).

Fig. 28: The different subgraphs with more than one occurrence in MT1 - except for the empty subgraph; panels (a) 17, (b) 20, (c) 24, (d) 28. The numbers are in reference to Fig. 27.

The concept maps were restricted to these concepts and propositions between them. Then, they were amalgamated vertically, forming a weighted graph from the maps by summing the transformed edge scores (-1, 0, and 1). Since the analysis is focusing on “common knowledge” and “misconceptions” in the data, this new scoring scheme is more appropriate for the aggregation. The generated graph has 15 · 16/2 = 120 undirected associations. Edges that represent associations that never appear in any of the maps receive a weight of 0.

The four resulting landscapes are then accumulated horizontally to include a temporal ordering. This accumulation was done in two ways. First, only edges with a “high” weight were kept and the rest were pruned, then only edges with a “low” weight were kept. “High” and “low” are defined as the 4th and 1st quartile of the range of edge weights, respectively. So, an edge with a “high” value means that at least 75% of the edges have a lower weight and conversely a “low” value means that 75% of the edges have a higher weight. Quartiles provide a relative measure: the “top” or “bottom” 25% of edges will always remain in the graph. A relative measure is useful, as otherwise the total number of maps would influence the result. Alternatively, an absolute criterion like “the proposition must have received a score of 1 in at least 50% of the maps” could be used. However, since it is unclear what such a criterion should look like, quartiles were used.
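
A sketch of this pruning step, assuming the amalgamated landscape is available as a weighted igraph object g with an edge attribute weight:

library(igraph)

# Keep only the edges in the 4th quartile ("high") of the weight distribution.
prune_high <- function(g) {
  thr <- quantile(E(g)$weight, 0.75)
  delete_edges(g, E(g)[E(g)$weight < thr])
}

# Keep only the edges in the 1st quartile ("low") of the weight distribution.
prune_low <- function(g) {
  thr <- quantile(E(g)$weight, 0.25)
  delete_edges(g, E(g)[E(g)$weight > thr])
}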

Fig. 29 and Fig. 30 show a visualization of the accumulated graphs over the four measurements for the high scoring and low scoring edges respectively. Again, a low scoring edge is taken as a misconception.

For every measurement of Fig. 29, the concepts attribute, class, method, method call, and object are present, even though the actual edges connecting them vary. Nevertheless, the knowledge structure around the core concepts of object orientation is constantly evolving throughout the lecture. Also, the concepts related to inheritance appear over time - they are introduced rather late in the lecture; the concept subclass, for example, was not present in the material until the last measurement, but already appears earlier in the student maps. Looking at Fig. 30 it can be seen that the development of the structural knowledge is only one part of the picture, though, as the misconceptions also show a considerable development.

Fig. 29: Development of the high scoring propositions concerning OO related concepts; panels (a) PRE, (b) M1, (c) M2, (d) POST.

Fig. 30: Development of the low scoring propositions (misconceptions) concerning OO related concepts; panels (a) PRE, (b) M1, (c) M2, (d) POST.

10.3.3 RQ3: The Process of Learning

To investigate how the learning process might reflect in the concept maps, the previous results of the development of knowledge and misconceptions will be used in the discussion. Additionally, the correlation between map scores and scores in the exercise was calculated. The map scores were formed by summing the scores of all propositions of a map. The correlations for the four points of measurement (PRE, M1, M2, POST), using Spearman’s rank correlation, are: 0.28, 0.42, 0.21, and 0.18. The value is distinctly higher for M1.

10.4 Discussion

Concerning the first research question (RQ1), there are several points worth mentioning. First, as has been alluded to before, the maps are rather sparse. In other words, adding a concept to a map is equivalent to connecting it with exactly one edge to the existing map. There are exceptions, of course, where the maps are denser graphs. Also, there seems to be a “natural” upper bound on the complexity of a concept map that students are willing to produce in the given setting (about 30 minutes of time and without further personal advantages for participating), as indicated by the stagnation between the last two tests. The increase in variation over time can be an indication of an increasing spread between students with a more developed knowledge and those with only poor conceptual knowledge. Clearly, motivational aspects will also influence this result. Some students will lose interest in the concept mapping task and produce only small maps, thus the overall variance will increase. The development of the edge scores in Fig. 26 also shows that misconceptions (meaning edges scored with 0) are not diminishing over time, in contrast to what one would hope for in a lecture. Taken together with the increasing ruggedness this indicates that the students are not seeing the connection between different parts of the topics but are creating new “islands” of connected concepts in their knowledge structure. This might be due to an approach more oriented towards rote-learning or it might provide support for the “knowledge-as-elements” theory of conceptual change as described in the second part.

The increasing ruggedness also leads to the visible correlation between concepts and the number of clusters. While the prototypical “good” concept map would consist of only very few clusters, many concepts and even more propositions, this result has not been attained in this study. Instead, the correlation of concepts and clusters shows that on average more concepts also incur more clusters in the map. Also, the score does correlate well with either measure, particularly with the number of concepts. Therefore, a bigger map does, on average, also contain more “right” propositions. Using the transformed scoring scheme ensures that only explicitly right or wrong propositions will influence the summed score value of a map. If it is assumed that the score indeed does reflect the actual knowledge contained in a map, then the correlations of Table 10.1 show that both the measure of concepts and of propositions seem to be an indicator that can be used to predict the overall quality of a concept map.

The subgraph frequencies reveal that the complex fact that was intended to be learned by the students takes on different forms in their mental models - at least when externalized with concept maps. It is interesting that the original subgraph does not occur anywhere in the maps. Instead, three different combinations of only three of the four concepts occur, as well as one combination that connects all four (and the empty subgraph). It is also interesting to note that the subgraph frequencies undergo a development over the different measurements. Concerning the subgraphs of object, class, and attribute, the majority is missing the connection between class and attribute - which might indicate that the objects-first approach of the lecture is showing indirectly: the values of attributes describe objects.

Concerning the second research question (RQ2), as shown in Fig. 29 and Fig. 30, it is very interesting to see how the knowledge of the students seems to develop. First, there is some relevant knowledge present before the start of the course. So, some of the students knew the main concepts of the lecture before the corresponding tests and they seem to have had a mental representation of their interconnections. However, as the material of object orientation is rather complex and completely new to many students, it is not surprising that, at the beginning (i.e. M1), when they were first presented the material, there are roughly as many “problem areas” as correct associations. However, as this trend continues throughout M2 and even into POST, it indicates problem areas of the covered material. For instance, the association between the concepts class and object remains a misconception throughout almost every test and it is never present in Fig. 29. This lends itself to the interpretation that a considerable number of students seemed to neither recall factual information about the concepts and their interrelation, nor did they seem to have gained a deeper understanding of those two very central concepts of object orientation. Another interesting example is the association between the concepts class and subclass. Basically, the understanding of this association also requires knowledge about the concept inheritance, which wasn’t present in the lecture until the very last test. Nevertheless, the concept of subclasses is present among the frequently right associations, even in the context of inheritance, in each test

except for PRE. It is also present in each of the corresponding graphs of the wrong associations, however, indicating that the mental representation of the concept is often faulty. For example, one of the incorrect edges between class and subclass in M2 was labeled “contains”. After the corresponding material was finally presented in the lecture (between M2 and POST) a large enough number of students still had misconceptions about the concept for it to remain in the corresponding graph of POST. Note how the association that is frequently right is between subclass and specialization while the association that is frequently wrong is between subclass and generalization. Interestingly, the “obvious” relation between class and subclass appears only once among the frequently correct associations and twice among the frequently wrong associations. For POST, it is in neither group, though.

Concerning the third research question (RQ3), first, there is the observed correlation between map scores and exercise scores. The result is highly interesting insofar as the concepts of M1, which shows the best correlation, are most closely related to the contents of the exercise (which was held closely after M1). So, for the contents covered both in the exercise and the concept mapping task, there is an indication that the abilities of the students as a result of their learning correspond to their scores in the concept maps. An approach oriented towards rote-learning will most probably show higher scores in the concept maps than in the exercises. This is especially true for CS when taking into account, for example, the reasoning behind the taxonomy of Fuller et al. (2007) as presented in the second part - that there is a supposed independence between the more cognitively oriented and the more application-oriented learning objectives.

The overall picture when taking both Fig. 29 and Fig. 30 into account clearly shows the process of learning. The knowledge increases while misconceptions remain present and, for the most part, keep developing as well. In other words: As the knowledge increases, the misconceptions shift. Using the learning model of Hay et al. (2008) as shown in Fig. 4, the overall learning pattern seems to follow the deep learning approach in which the students need a reconciliation of their knowledge structures because the prior knowledge is not robust enough. This is encouraging, as it indicates that the students were choosing to learn meaningfully even in the difficult context of non-major CS education. In total, the results show that there is a visible development in structural knowledge during the course of the lecture. However, the development is not as straightforward as a simple model of learning might suggest: The misconceptions that are present at the beginning are not simply decreasing as they are replaced with “correct” knowledge, nor is everything that is newly learned “correct”. Instead, the misconceptions are developing as part of the structural knowledge.

In conclusion, the results of this case study have shown that the learning of new CS related concepts can be made visible. While the development of misconceptions does reveal problem areas - and is therefore a valuable result in its own right - it also indicates that the students are trying to incorporate the concepts meaningfully into their mental models.

Concerning the analysis methods used, the simple graph measures are not able to grasp complex information in the data. They can serve as a quick overview, though, and at least the development of the measures over time is basically as one would expect it to be. The analysis of subgraph frequencies is well suited to take a closer look at a certain small set of concepts and the different structural combinations that are present in the maps of a landscape. Combining vertical landscapes with horizontal landscapes as a technique provides useful insights into the development of structural knowledge for the group of students as a whole - especially when visualized as a series of graphs.

11 CS2: Knowledge Structures of Beginning CS Students

The theories of conceptual change and meaningful learning as presented in chapter 3, as well as models of instruction planning like the “Berliner Modell” (cf. Riedl 2010, p. 103ff.), all stipulate that “good” learning and teaching must necessarily build upon and take into account the prior knowledge of the learner. “Prior knowledge [...] provides a framework through which new information may be organized and assimilated. This reduces the amount of information chunks to be recalled and provides association cues for accessing information from the long-term memory” (Gurlitt & Renkl 2010, p. 418). This becomes a difficult task in higher education, as typically the development of persons’ knowledge when entering such institutions will be much more diverse than that of, for example, children entering secondary education. For computer science education, this is particularly relevant. As has been noted in the first chapter, CS is not yet an established subject of school education, in contrast to e.g. mathematics or physics. Therefore the prior knowledge is expected to be highly diverse, even more so if a part of the students actually did receive formal CS education in school. So, the typical prior knowledge is important information for lecturers and curricular designers. It can also be used in the development of introductory courses that address possible misconceptions or underdeveloped parts of the knowledge structures in CS - such a course is investigated in the next case study.

11.1 Description of the Setting

The state of Bavaria, which is the second largest state of Germany, decided in the year 2000 to introduce a new compulsory subject “Informatics” at its 405 Gymnasiums, starting in autumn of 2004 (cf. Müller & Hubwieser 2000). This introduction came in parallel with a major restructuring of the school type, leading to a reduction of 1 year in length (from 9 years to 8 years). The two systems are called “G9” and “G8” respectively. Therefore, in the summer and winter terms of 2011, the last G9 students and the first G8 students entered university at the same time, making it a unique occasion for investigating the effects of the subject on their prior knowledge.

The research of this case study was done as part of a project called AVIUS that dealt with the influence that the compulsory school subject Informatics has on the

knowledge structures of beginning CS students. Additionally, the project investigated the impression that beginning students had of computer science as a subject taking into account their personal biographical history concerning CS, following the research of Knobelsdorf (2011). The participants of AVIUS were beginning CS students who enrolled at the TU München in the summer and winter term of 2011.

The next section will briefly describe the design of the school subject since its impact on structural knowledge is the main focus of the case study.

11.1.1 Description of the Subject “Informatics”

The text of this section is partly taken from (Mühling et al. 2010, p. 60ff.), see also (Hubwieser 2012). The Gymnasium is a specific type of secondary school in Germany. In Bavaria, it starts at grade 5 with children at the age of 11 and leads to a degree that allows enrolling in universities after 8 years (9 years before the restructuring) of studies. It offers four different directions of study: natural science/technology, foreign languages, economy, and music/arts. At the time of the survey it was attended by about 370,000 children. The first class of G8 and of the new subject entered grade 5 in autumn 2003 and completed the compulsory stage of both school and Informatics after grade 10 in the summer of 2009. The new subject comprises three stages:

• In grade 6 and 7 all students of the Gymnasium have to attend 1 compulsory lesson per week.

• In grade 9 and 10 there are 2 lessons per week, compulsory for all students that have chosen natural science/technology as their direction of study (typically about 50% of the students who have to choose a direction).

• In grade 11 and 12 the students that have attended Informatics in grade 9/10 can choose an eligible course of Informatics with 3 lessons per week.

The German school system is heavily federalized. Organization, types of schools, subjects and curricula vary from state to state. However, all states have a rather strict system of curricula. A typical curriculum will explicitly state the learning objectives (including suggested time frames) that students have to attain in each grade. Thus, teachers have almost no freedom in deciding what they’re teaching in their classes.

The curriculum follows to a great extent the information-oriented teaching approach as presented by Breier & Hubwieser (2002). Basically the students should acquire three different basic competencies within this new subject: CS2: KNOWLEDGE STRUCTURES OF BEGINNING CS STUDENTS 189

• Represent, structure, store, link, and exchange information using suitable hard- and software combinations.

• Master, describe, and communicate about complex systems using suitable - particularly object-oriented - modeling techniques.

• Implement models using suitable software systems or programming platforms - particularly object-oriented programming languages.

Concerning teaching methods, the subject Informatics is encouraging, more than any other subject, a very modern, student-oriented learning style: the intensive problem-oriented usage of computers forces the use of teamwork, group-teaching, project work and product-orientation in the lessons. Hubwieser (2007) points to an underlying didactic dilemma: Modern teaching approaches postulate to pose authentic problems to the students that have relevance in the “real world” outside the school. Thus, it seems advisable to start programming with interesting, sufficiently complex tasks that convince the students that the concepts they have to learn are really helpful for their later professional lives. Hence we have to start our programming course with quite complex programs that simulate processes that the students know from their everyday life. On the other hand, if we start with quite complex object-oriented programs, we might ask too much of the students, because they will have to learn an enormous amount of new, partly very difficult concepts at one time. As Hubwieser (2008) states, a solution to this problem is to follow a very strict “objects first” teaching strategy (cf. Gries 2008).

The basic idea is to start the course in grade 6 (where the students are 11 or 12 years old) with object modeling of standard software documents like vector and pixel graphics, texts or multimedia presentations or hypertext structures. This way, the students learn to use the object-oriented concepts object, attribute, class, method, association, aggregation and reference in order to manipulate documents, some years before they will have to apply them in the context of object-oriented programming. In grade 6 the students start working with objects of classes like circle, rectangle, symbol, paragraph etc. They find out that some of the objects are connected by aggregations, which can even be recursive, for example folder within file systems. In grade 7 they learn to apply the concept of references (implemented as links) in order to construct hypertext structures and to exchange information using e-mail systems. At the end of grade 7, they learn to activate objects by programming their own methods, using simple robot systems.

At the beginning of grade 9 the students apply the concept of functions by designing data flow diagrams, which are then implemented using spreadsheets. Following this, they construct object-oriented data models and implement them using relational

databases. In grade 10, they finally start “real” object-oriented programming and thereby have to apply all the concepts they have learned in the former grades.

In the eligible course of grades 11 and 12, the students work with recursive data structures (lists, trees, and graphs) and corresponding recursive algorithms. They apply the basic concepts of software engineering: software life cycle models and stages of software development. In grade 12, they design formal languages and learn how parallel processes can be synchronized. They simulate computer networks, analyze their topology, and learn how a computer principally works (hardware architecture, register machine, etc.). Finally, they have to accept that there are limits of computability, with all the consequences for e.g. data security or cryptology.

11.2 Data Collection & Research Questions

Fig. 31: Overview over the study, based on the schema of Fig. 1.

The study is a cross-sectional study of the beginning students of the TU München. The relevant collected data consists of personal data of the students - including the number of years they received CS education in school - and a concept map they were asked to draw. Additionally, they were asked to provide free text answers concerning their CS related biography, but the answers are not used in this case study. CS2: KNOWLEDGE STRUCTURES OF BEGINNING CS STUDENTS 191

The data collection itself happened on several occasions: All of the newly enrolled students were invited to take part in a preparatory course in mathematics and object-oriented programming (see next chapter). Also, their first two days at university are organized as an introductory event before the start of the lectures on the third day of the term. The participation in this study was integrated in all of these events as part of the regular schedule of the students, i.e. they weren’t asked to voluntarily participate in their spare time, but instead the whole group was given the survey and asked to complete it, even though there were no negative effects for students who chose to not take part.

There were 590 participants of about 700 new students, who were assessed in 4 batches according to the schedule of their first days at the university. After eliminating all unusable survey responses, 338 remained.

The participants were given a list of 40 concepts (see below). They were then first asked to mark all concepts they were familiar with and afterwards draw a concept map of those. Several different orderings for the concept list were used in order to measure whether this has an effect on the maps, but none was found. The focus question was to begin with connecting a pair of concepts that has been marked as familiar and then incorporating as many of the marked concepts as possible by named links. The students received a short written introduction on concept mapping including an example map. The participants were asked to label the propositions. The maps were produced with pen and paper and then later digitalized for analysis. For the whole survey, the students were given 45 minutes time. An example of one of the maps can be seen in Fig. 32. In each room there was a researcher present (at least for some time at the beginning) who told the students about the goals of the survey and about the anonymity of the process. They also addressed questions of the participants.

The concepts were extracted from the curriculum of the school subject semi-automatically by extracting nouns, counting their frequency and then manually filtering the list. The 40 concepts are: algorithm, array, assignment, association, attribute, automaton, class, conditional statement, data, data structure, edges, flow, grammar, graph, instruction cycle, list, loop, method, object, processor, program, programming language, record, recursion, register machine, semantics, sequence, state, state change, statement, subclass, superclass, syntax, table, tree, pointer, variable, value, vertex, working memory

When using ConEx - like in the last chapter - and extracting the first 40 concepts except for different word forms, there is an agreement of just 50%. However, the curriculum is not a very good input for a completely automated extraction, as there are many special words like “pupil” and “hour” that occur rather frequently and therefore interfere with the result.

Fig. 32: A digitalized example map originally produced by one of the students in German.

Fig. 31 shows how this particular study relates to the schema of Fig. 1. The research questions are:

RQ 1 What is the prior knowledge of a beginning CS student in general and are there general misconceptions?

RQ 2 Is it possible to observe an effect of the newly introduced subject Informatics in Bavarian secondary schools with regard to the knowledge structures of beginning CS students?

Concerning the second question, there are other ways in which the subject may have had an impact of course, for example by bringing students to enroll for computer science who otherwise wouldn’t have chosen this field of study. The research project AVIUS is concerned with the broader scope of this impact, but this case study is only concerned with the aspects relevant for this thesis.

11.3 Analysis and Results

11.3.1 RQ1: Prior Knowledge of Beginning CS Stu- dents

To arrive at a visual representation of the “common” prior knowledge, all of the concept maps were amalgamated vertically. Edge weights of the amalgamated graph were calculated by counting the original edges in order to use Pathfinder analysis. The maps were restricted to the 40 concepts of the list. Additionally, edges not occurring in at least 10% of the maps were removed manually before creating the Pathfinder network. To arrive at the sparsest representation, the parameter values q = 39 and r = ∞ were used. The Pathfinder network is shown in Fig. 33; the concepts assignment, association, subclass, and superclass were left unconnected and removed. The five most central concepts according to the measure of betweenness-centrality in descending order are: program, processor, working memory, data and class. Connectivity measures for sparse Pathfinder networks are not very informative, as they will naturally be low. The four nodes with the highest appearing node degree (4) are program and processor again as well as data structure and programming language.

Fig. 33: The Pathfinder network of all the maps.
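
The centrality values reported here can be obtained directly with igraph, assuming the Pathfinder network is available as an igraph object pfn:

library(igraph)

head(sort(betweenness(pfn), decreasing = TRUE), 5)  # five most central concepts
head(sort(degree(pfn), decreasing = TRUE), 4)       # nodes with the highest degree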

Next, a cluster analysis was used in order to find out whether or not there are structural differences inherent in the data. The concept maps were accumulated vertically in two different ways. First, the concept matrix was used together with the MBMM approach, then the graph similarity matrix and the PAM algorithm were used (see section 7.2). In the latter case, the Hopkins index (see section 5.3.1) of the resulting matrix was 0.83, indicating that a clustering of the data is possible.

The MBMM clustering applied to the concept matrix finds an optimal solution at three clusters. One of the clusters, containing 81 maps, simply encompasses all empty or nearly empty maps. The maximal relative frequency of occurrence over all concepts - i.e. the sum of each column of the concept matrix of this cluster divided by the size of the cluster - is a very small 0.03. The mean relative frequency over all concepts is just 0.004. Also, judging by the overall probabilities, the concept maps of the remaining two clusters, containing 100 and 151 maps, differ predominantly in complexity. This is confirmed by a t-test: the true difference in means (of the edge count) between the two groups is more than 7 (p = 0.0007). This amounts to an increase in edges of more than 70% for the maps of one cluster compared to the other.

Using the PAM algorithm applied to the distance matrix of graph similarities, the optimal number of clusters is two. Both clusters are of nearly identical size with 169 and 164 maps, respectively. To identify how the maps of both clusters differ, each set of maps was amalgamated vertically and the Pathfinder networks of both landscapes were created using q = 39 and r = ∞. Again, all edges that appeared in less than 10% of the maps were removed beforehand and unconnected concepts were removed. Fig. 34 and Fig. 35 show how both clusters organize their knowledge.
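
A sketch of this clustering step, assuming the pairwise graph dissimilarities are available as a "dist" object d:

library(cluster)

fit <- pam(d, k = 2, diss = TRUE)  # partitioning around medoids with two clusters
table(fit$clustering)              # sizes of the two clusters
fit$silinfo$avg.width              # average silhouette width as a quality indicator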

The structural configuration of many concepts is alike, as the connections between the concepts program, processor, register machine, working memory and data show. But there are also many differences. Cluster 1, in contrast to cluster 2, has data not connected to data structure, for example. Also, the concept recursion is missing completely in cluster 2 and in cluster 1 it is connected to the concepts algorithm and tree. Also, cluster 1 has the concepts conditional statement, loop, and algorithm all connected to method, whereas cluster 2 has conditional statement and loop connected to algorithm, which in turn is connected to program. The concept method appears only as an “appendix” to class for this cluster. The concept statement is, interestingly, connected to program for both clusters, but only cluster 1 shows an additional connection (flow).

Fig. 34: The Pathfinder network of the first cluster identified by the PAM algorithm.

Fig. 35: The Pathfinder network of the second cluster identified by the PAM algorithm.

Finally, all the available data of the students was used in order to find out whether or not the persons forming the two clusters differ significantly in any of the available attributes. There is no discernible difference concerning whether or not they attended the G8 or G9 school system, as a χ2 test reveals: 49.7% of the students of the first cluster attended the G8 system, compared to 45.1% of the second cluster. Also, there are no differences concerning whether or not computer science was one of the subjects they graduated in, or concerning the number of years of CS education they received in school. However, a χ2 test does show a significant difference (p < 0.05) concerning who they consider their biggest influence regarding their interest in CS. The students of cluster 1 have given their father as the biggest influence significantly more often than the students of cluster 2.
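The attribute comparisons are plain contingency-table tests; a minimal R sketch, where cluster, school_system, and influence are hypothetical factor vectors holding the cluster assignment and the corresponding survey answers per student:

chisq.test(table(cluster, school_system))   # G8/G9 vs. cluster: no significant difference
chisq.test(table(cluster, influence))       # biggest influence vs. cluster: p < 0.05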

11.3.2 RQ2: Effect of CS Education in Secondary Schools

For this analysis, the maps were split into two groups according to the students’ prior education. G8 denotes the group of students who attended the compulsory subject Informatics for at least four years, and G9 denotes the group who did not have a compulsory subject; some of them had a voluntary computer science subject as part of their education, though. Also, in the G8 group, there are some who took Informatics as one of the subjects they graduated in and others who did not. Note that the groups do not form a partition of the data, as there are 48 cases of students who either attended some other form of school, came from another country, or whose answer to the survey could not be clearly attributed to either G8 or G9. The G8 group consists of 163 maps and the G9 group of 127. To investigate the impact, it is reasonable to separately aggregate the concept maps of these “naturally” occurring clusters and check whether or not there are discernible differences. The maps of the G8 group are significantly denser: a t-test of the number of edges between the two groups shows that the hypothesis that the true difference in means is 0 can be rejected at a confidence level of 99% (p = 0.0001).
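The density comparison is a standard two-sample t-test on the edge counts; in the R sketch below, maps_g8 and maps_g9 are hypothetical lists of 0/1 adjacency matrices (zero diagonal) for the two groups.

edge_count <- function(m) sum(m) / 2               # undirected map: each edge is counted twice
t.test(sapply(maps_g8, edge_count),
       sapply(maps_g9, edge_count))                # H0: true difference in means is 0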

To identify the prevalent structural configuration, both sets of maps were amalgamated vertically by summing edges and, again, the Pathfinder networks with parameters q = 39 and r = ∞ were created. As before, edges appearing in less than 10% of the maps of each group were removed beforehand and unconnected concepts were removed. Looking first at the concepts that remained in the Pathfinder networks, there were 31 for G8 and 26 for G9. The networks are shown in Fig. 36 and Fig. 37 respectively. All concepts appearing in the G9 network also appear in the G8 network; however, the G8 network contains the additional concepts list, recursion, semantics, state, and register machine. When measuring the betweenness-centrality of the nodes, the three highest-scoring concepts for the G8 group are program, class, and data structure; for the G9 group they are program, processor, and data.

Community 1 - G8: Data structure, Graph, Edges, Nodes, List, Tree, Array, Recursion; G9: Data structure, Graph, Edges, Nodes, Tree, Array
Community 2 - G8: Data, Record, Table, Working memory, Processor, Register machine; G9: Data, Record, Table, Working memory, Processor
Community 3 - G8: Object, Attribute, Value, Variable; G9: Object, Attribute, Method, Algorithm
Community 4 - G8: Class, Method, Loop, Conditional Statement; G9: Class, Value, Variable, Loop, Conditional Statement
Community 5 - G8: Programming language, Syntax, Semantics, Grammar; G9: Programming language, Syntax, Grammar
Community 6 - G8: Statement, Program, Automaton, State, Algorithm; G9: Statement, Program, Automaton

Table 11.1: The communities identified within the Pathfinder networks from the maps of the G8 and G9 groups. The ordering of the communities for both groups is arbitrary, but was chosen to allow comparing the groups more easily.

When the communities in the Pathfinder networks are analyzed using a greedy algorithm, both networks are partitioned into six communities. The concepts are assigned to the communities as shown in Table 11.1.
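The community detection can be reproduced with a greedy modularity-based algorithm, for example the one implemented in igraph; the sketch below assumes the Pathfinder network is available as an undirected igraph object pfn (a hypothetical name).

library(igraph)

comm <- cluster_fast_greedy(pfn)        # greedy modularity optimization
length(comm)                            # number of communities (six for both groups)
split(V(pfn)$name, membership(comm))    # concepts grouped by community, cf. Table 11.1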

11.4 Discussion

For the first research question (RQ1), the visual inspection of the Pathfinder network shown in Fig. 33 reveals some interesting insights. First, it is peculiar that, on average, a beginning CS student does know something about the concept of a register machine or the theoretical construct of grammars, but not about the concepts of subclass and superclass. It can be assumed that the knowledge about register machines is most probably due to the large share of G8 students and therefore an artefact of prior formal CS education. This nevertheless points to a not fully developed prior knowledge regarding core OO concepts - even more so since subclass and superclass are also central concepts of the curriculum of the school subject. The fact that the structural configuration is centered around the concepts program and processor, together with the associations between data and working memory as well as between register machine and processor, indicates an understanding of CS that is centered around computers and a more practical, programming-oriented approach towards CS. An expert network would most probably differ in the connections regarding these concepts and show a more distinct separation between general and abstract concepts like register machine or data and specific, technical concepts like processor or working memory. This is valuable information for the design of introductory courses and lectures.

Fig. 36: The Pathfinder network of those students who attended the compulsory school subject.

Fig. 37: The Pathfinder network of students with no compulsory school subject.

The cluster analysis shows, however, that there are also hidden structural differences in the groups. Based on the structural configuration of the concepts, the persons of the two clusters can be characterized as follows:

Cluster 1: The persons of this cluster seem to be fluent with the object-oriented approach to programming, since algorithm, as well as the control structures loop and conditional statement, is connected to method. Also, they have an understanding of recursion and recursive data structures. The (recursive) data structures list, array, tree, and graph are forming a group of concepts that is far away from the group of database-oriented concepts (data, table, and record), which are placed near the technical concepts working memory and processor.

Cluster 2: The persons of this cluster are clearly more oriented towards procedural programming, as algorithm is connected to program and the control structures loop and conditional statement are connected to algorithm in turn. The concept method is seemingly only weakly integrated into the mental models of the persons of this cluster. Also, the concept of recursion is missing completely. The three database-oriented concepts and the data structures are forming a large group of concepts for this cluster, which is only connected to the rest via the concept working memory (i.e. this is the strongest link that remained in the Pathfinder network).

So, the central aspect that differs between these clusters seems to be their approach (or prior experience) towards programming - object-oriented versus procedural - and their grasp of the concept of recursion. Since recursion is a central concept of computer science and its presence or absence has a clear structural effect on the knowledge structures, this may be taken as a sign that it is a threshold concept of computer science. It is interesting to see that the school type did not show in this particular clustering - especially when comparing some obvious structural similarities to the Pathfinder network of the G8 group shown in Fig. 36. To what extent the difference regarding the father as an influence affects the results is unclear. It seems reasonable to assume that in these cases the father has either studied computer science or is at least working in this area. This may explain why these students know more about object orientation and recursion, as they possibly had better access to CS learning material or may have learned something directly from their father.

Concerning the second research question (RQ2), the results clearly show artifacts of the specific curriculum of the school subject with its strict objects-first, object-oriented approach. Taking the Pathfinder networks of Fig. 36 and Fig. 37 together with the communities of Table 11.1, the following can be observed:

• The G8 network is visibly more complex.

• While it seems common among all beginning students to value the concept program highly in their knowledge structure, the most central concepts of class and data structure as opposed to processor and data show a more object-oriented understanding of the G8 group, while the G9 group seems to be more focused on the technical aspects and computers themselves.

• The network of the G8 group has a connection between recursion and tree, which corresponds to the approach of introducing recursion based on object-oriented recursive data structures, like lists and trees, chosen in the curriculum.

• Another indicator for a more formal CS education in the G8 group: For their network, there is the path programming language, syntax, grammar, semantics, whereas the G9 group only has grammar and syntax connected to programming language directly (semantics is missing completely).

• For the G8 group, communities 3 and 4 seem to indicate that objects and classes are seen as somewhat unrelated, with objects being more identified by their attributes and classes more by their methods. Additionally, the control structures (loop and conditional statement) are grouped with class. This resembles, to some degree, the approach of the school subject, where objects are introduced as entities described by the values of their attributes right from the beginning and later on methods are implemented for classes.

• The G9 group, in contrast, has a less clear-cut grouping concerning the core concepts of object orientation. Even though method is grouped with object and algorithm, it is again class that resides in the same community as the control structures. Also, value is not in the same community as object.

• There are some commonalities between the two groups: First, communities 1, 2, and 5 are nearly identical except for concepts missing from the G9 network altogether. Next, there is a clear grouping of concepts related to data structures for both groups in the first community. Also, for both groups, the more database-oriented concepts (data, record, table) are grouped with processor and working memory, placing them in a more technical and less abstract corner. In the same vein, register machine is seemingly more related to a real processor than to an abstract notion for beginning students. Finally, for both groups, automaton, program and state are related; however, none of the object-oriented concepts are put into that group, indicating a lack of understanding of the semantics of object orientation.

In conclusion, the beginning students of this case study have a surprisingly complex prior knowledge of computer science that seems - without a formal education in secondary school - to exhibit a practical, programming-oriented view on computer science. The school subject has a clearly visible impact on the conceptual knowledge, however: Investigating the Pathfinder networks of both groups, there are several indicators that point to the success of the object-oriented, strictly objects-first approach of the school subject, in the sense that it has an identifiable impact on the knowledge structure of the students. The communities also support these findings.

The analysis methods used in this study were mostly centered around Pathfinder networks and their subsequent analysis using graph measures and manual, visual inspection. The methods were well suited to find the differences in the knowledge structures between the two groups of G8 and G9 students as well as to extract the common knowledge of a beginning CS student. Also, the clustering using graph similarities revealed some interesting insights beyond the separation of G8 and G9.

12 CS3: Conceptual Knowledge and Abilities

Most of the results of this chapter have been previously published in (Berges et al. 2012).

Students beginning their studies of computer science often have difficulties in learning programming, especially object-oriented programming (OOP), at the fast pace that the curriculum at university demands. Teaching and learning OOP is inherently difficult, as the research literature, e.g. (Hubwieser 2008), (Eckerdal 2006), (Ragonis & Ben-Ari 2005), shows.

Since being able to program is typically not required of students who enroll for studies of computer science, there are many with no prior experience in programming, making the beginning of their studies even more difficult. To improve their chances and alleviate the differences between beginning students, the Department of Computer Science at the TU München decided in 2008 to design and implement a two-and-a-half-day, voluntary introductory course that is held right before the beginning of the semester. This course introduces the basics of object-oriented programming to all students wishing to participate; every newly enrolled CS student receives an invitation. As part of this course, the knowledge of the students regarding concepts of object-oriented programming has been assessed with concept maps in a pretest and posttest, following the observation of Novak & Cañas (2010, p. 3): “When concept maps are used as pretests and then employed with new learning material, research has shown that meaningful learning can be much facilitated”.

12.1 Description of the Setting

The design of the course is described in detail by Hubwieser & Berges (2011). It is based on the premise of Constructivism and minimal instruction. The students are given a programming task to solve and are encouraged to try programming on their own right from the beginning, with only minimal input given to them in the form of four worksheets: The first sheet describes the task itself and the software that will be used, so the students receive a short overview of the course. The second sheet introduces the basic concepts of object orientation: objects, classes, attributes, and methods. It emphasizes the concept of data encapsulation/information hiding that the students should adhere to as early as possible. The third sheet presents the implementation of those concepts in Java. Java was chosen as the programming language, as this is the language the students will mainly be dealing with in the first semesters. The last sheet presents the concept of algorithms as well as the control structures of procedural programming (sequences, conditional statements, loops). In addition to the worksheets, a peer tutor is present to guide the students whenever necessary.

The setting was developed to encourage attaining basic programming abilities without requiring much conceptual knowledge to be learned beforehand (as the conceptual basis will be presented in the lectures starting right after the course anyway). Nevertheless, it seems reasonable to assume that the conceptual framework of the students concerning the concepts of programming will change throughout this course.

The students worked in small groups based on their prior programming experience. There were three different levels and the students were assigned based on their responses to a question in the survey (see below) concerning their experience:

1. “I have no experience at all”.

2. “I have already written programs”.

3. “I have already written object-oriented programs”.

The demands of the programs that the students should realize differed according to their respective level of programming experience. The students of the first level (with the least amount of prior experience) were asked to program the game “Mastermind”. The groups of the next level should create a tool for managing results from a sports tournament, for example a football league. The groups of the most advanced, third level were given the task to program a version of the dice game “Yahtzee”.

12.2 Data Collection & Research Questions

All data was collected right before the winter term of 2010/2011. It consists of a concept map drawn directly before the start of the course (pre map) and one drawn right after the course (post map). Additionally, the results of a survey asking for some personal information and prior programming experience were available for each participant. Finally, the source code that each student produced was collected and used for analysis. Drawing the concept map and answering the survey was part of the course; however, participation in the course was voluntary and dropping out of the course did not incur any negative consequences for the students. Also, all data was collected anonymously. A number was given to the participants in order to match the single items of each student.

Fig. 38: Overview of the study, based on the schema of Fig. 1.

There were 167 participants, which amounts to about 42% of all newly enrolled CS students at the department of computer science at the TU München. They were divided into 18 groups ranging in size from 6 to 15 people. Some students dropped out before the end of the course and some didn’t provide a complete data set of survey, source code and two concept maps. In cases where a post map was present, but not a pre map, it was assumed that the pre map was intentionally left blank due to no relevant prior knowledge. In the end, the data of 75 students could be used for analysis.

The concept mapping task consisted of a list of concepts given to the students and a written introduction on concept mapping. Additionally, a tutor was present whom the students could ask. The students were given 30 minutes for drawing the concept maps both times. Drawing was done using pen and paper. The focus question asked the students to try to create a concept map using those concepts from the given list that they were familiar with. It explicitly asked for finding labeled connections between pairs of concepts and stated that not every pair of concepts must be related. An example of one of the maps can be seen in Fig. 39. The list of 21 concepts (CL) that was given to the participants of the course was extracted systematically from the worksheets and encompasses all concepts that the course is dealing with. It is shown in Table 12.1; the abbreviations are used in the next section.

Fig. 39: A scanned example map (in German) that was produced by one of the students.

The source code was collected by asking the students to upload their code as an archive to a server. The tutors assisted in the process. The students started by using BlueJ (http://www.bluej.org) for the reasons given by Bergin, Bruce & Kölling (2005), but were then encouraged to switch to a more advanced IDE (Eclipse, http://www.eclipse.org, or NetBeans, http://netbeans.org).

Fig. 38 shows the schematic organization of this particular study. The two research questions concerning this thesis are:

RQ 1 How will the conceptual knowledge of concepts related to programming develop if only very little actual input is given? In other words, without enough material and time to rote-learn, but with opportunities to apply the concepts in programming, will there still be a visible development in knowledge?

RQ 2 Is there any identifiable relation between the conceptual knowledge and the actual programming abilities of the students?


AM  access modifier          constructor                 ME  method
AR  arrays                   DE  data encapsulation      OB  object
AG  assignment               data type                   object orientation
AC  association              IN  inheritance             OP  operators
AT  attribute                IZ  initialization          OV  overloading
class                        instance                    PA  parameter
CS  conditional statement    LO  loop statement          ST  state

Table 12.1: The 21 concepts that the students should use for drawing the concept maps. The abbreviations are used later for the code analysis and in the diagrams of the results.

12.3 Analysis and Results

For the analysis, the data was divided according to whether or not a person had prior programming experience. To keep the two groups roughly the same size, no differentiation was made between those who had experience with object orientation and those who had only written non-OO programs: From the 75 participants that remained for analysis, 42 students had prior programming experience while 33 students had never programmed before.

The concepts of the maps were restricted to the 21 from CL given to the participants. Missing concepts from CL were added as isolated nodes to the concept map before aggregating and additional concepts not on CL were removed. Also, edges with no label were excluded from analysis altogether. The remaining propositions were then scored manually according to the following scheme:

• If the proposition forms a correct statement it will be scored with 2.

• If the proposition forms a statement that is clearly wrong or if the meaning of the statement cannot be understood, it will be scored with 0.

• If neither of these two conditions applies, the proposition will be scored with 1.

To validate the reliability of the grading scheme, three experts were asked to grade a randomly chosen subset of the edges according to the scheme. The correlation of grades between any pair of the three persons was consistently at or above 0.8. To further improve the reliability of the grading, propositions with a score of 1 (i.e. the “unclear” cases) were excluded from subsequent analysis steps.


Fig. 40: Development of relevant structural knowledge over all participants. Shown is the fraction of maps with at least one “correct” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test).

12.3.1 RQ1: Development of Structural Knowledge

First, the concept maps were accumulated vertically - for pre and post test - both for all students combined and separately for students with and without prior programming experience. The accumulation used the concept matrix with the restriction that a concept is marked as present (value 1) in each concept vector only if at least one of the incident edges of this concept has a score value of 2, i.e. is a “correct” proposition representing some accepted “fact” about a particular concept. Then, the mean of each column was taken, resulting in a vector that contains, for each concept of CL, the relative frequency with which this concept appears in a map of the landscape with a correct incident edge. Clearly, this is a very basic measure and one might expect that the percentage of maps fulfilling this criterion for a given concept is very large. However, as the concept maps were rather small and sparse, it turned out to work well in practice.
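This accumulation step can be expressed compactly in R. The sketch below assumes each scored map is available as a data frame of propositions with columns from, to, and score (a hypothetical layout, here collected in a list scored_maps), and that CL holds the 21 concept names.

# Concept vector of one map: 1 if at least one incident edge is a "correct" proposition.
concept_vector <- function(map, CL) {
  ok <- map[map$score == 2, ]
  as.integer(CL %in% c(ok$from, ok$to))
}

concept_matrix <- t(sapply(scored_maps, concept_vector, CL = CL))
colnames(concept_matrix) <- CL
rel_freq <- colMeans(concept_matrix)    # relative frequency per concept, as plotted in Fig. 40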

The relative frequencies and especially the development from pre to post test can then be plotted. Fig. 40 shows the landscape of all maps, regardless of prior experience. Additionally, Fig. 50 in the appendix shows the “delta” (i.e. the difference) between the two lines for better readability. There is a clearly visible development of the structural knowledge over the course.

Aggregating over all participants neglects the fact that the prior experience is rather heterogeneous. Fig. 41 shows the development of the knowledge separately for both groups. As can be seen, both groups are gaining relevant conceptual knowledge during the course. Also, there is only very little relevant knowledge present before the course for the group without prior programming experience - as is to be expected, of course. In detail, students with prior programming experience show a better understanding of all concepts after the course except for conditional statement, inheritance, instance, operators, and parameter, where there is not much difference. Interestingly, they seem to “lose” some knowledge regarding instance, operators, and parameters. However, since the aggregation method is rather coarse, this might for example happen when the students had a very “simple” proposition in the pre map (with a score of 2) and then tried a more complex one in the post map which was incorrect or at least not scored with 2. Additionally, students without prior experience seemingly learned the most about “core” concepts of object orientation: the four concepts with the biggest increase are attribute, class, method, and object. The course material put much emphasis on these concepts. Additionally, the “difference” in knowledge between the two groups is decreasing for the post maps. Fig. 51 in the appendix shows the same data but separated between programming experience instead of test. The difference between both lines is plotted in Fig. 52, also in the appendix.

Using the same method as above, new concept landscapes were formed based on concepts that had at least one incident edge with a score of 0. Conversely, these are taken to mean that the students have a misconception regarding this concept. The mean of the columns then, again, is the relative frequency of a misconception occurring for this concept of CL. Fig. 54 and Fig. 56 in the appendix show the results when taking all maps into account and when separating according to test. Fig. 42 shows the results when separating according to programming experience. The difference between both lines is plotted in Fig. 53 in the appendix. As can be seen, the group with prior programming experience shows almost no development between pre and post test. For the group with no prior experience, the misconceptions are almost always increasing in frequency of occurrence, most notably for the concepts attribute, initialization, and parameter. However, even for these the absolute frequency of occurrence is rather low.

The previous aggregations, like in the last case study, were partly based on a manual clustering of the data regarding the attribute “prior programming experience”. It seems worthwhile to additionally use a clustering algorithm in order to find hidden structural aspects in the data that may be masked by the manual clustering based on an external criterion. For the pre maps, both PAM and MBMM clustering mostly identified the empty or nearly empty maps and grouped them together.

Fig. 41: Development of structural knowledge dependent on prior programming experience, shown separately for (a) the pre maps and (b) the post maps. Shown is the percentage of maps that showed at least one “correct” edge (i.e. with a score of 2) incident to a concept (0 = no prior experience, 1 = prior experience).

Fig. 42: Development of misconceptions dependent on prior programming experience, shown separately for (a) students with no experience and (b) students with programming experience. Shown is the percentage of maps that showed at least one “incorrect” edge (b(eginning) = pre test, e(nd) = post test).


Fig. 43: The probabilities of concept occurrence as identified by the MBMM clustering algorithm for the post maps, shown for both clusters.

When using the clustering based on multivariate Bernoulli mixture models for the post maps over all participants, the best model is one with two clusters. Both clusters have about 60% (57.8% and 57.1%) of the maps in common when compared to the manually formed clusters based on prior experience. Also, both clusters are of nearly equal size, with 37 and 38 members respectively. So there does seem to be additional structural information present in the data that allows for another way of forming clusters. Fig. 43 shows the probabilities for each Bernoulli component directly for both clusters. The plot can be interpreted in the same way as the aggregated plots above. The clustering that is found is apparently not sensitive to the (randomly chosen) start values, as the algorithm repeatedly found nearly the same probabilities in 50 runs.
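For illustration, a multivariate Bernoulli mixture can be fitted with a short, hand-rolled EM procedure operating on the binary concept matrix; this is only a sketch of the model class, not the implementation used in the thesis, and model selection over the number of components is omitted.

# EM for a mixture of K multivariate Bernoulli components; X is a binary matrix
# (rows = maps, columns = concepts).
mbmm <- function(X, K, iter = 100, eps = 1e-6) {
  d <- ncol(X)
  pi_k <- rep(1 / K, K)
  mu <- matrix(runif(K * d, 0.25, 0.75), K, d)     # component-wise Bernoulli probabilities
  for (it in seq_len(iter)) {
    # E-step: responsibilities from the log-likelihood of each row under each component
    logp <- sapply(seq_len(K), function(k)
      X %*% log(mu[k, ] + eps) + (1 - X) %*% log(1 - mu[k, ] + eps) + log(pi_k[k]))
    r <- exp(logp - apply(logp, 1, max))
    r <- r / rowSums(r)
    # M-step: update mixing weights and Bernoulli parameters
    pi_k <- colMeans(r)
    mu <- t(r) %*% X / colSums(r)
  }
  list(weights = pi_k, prob = mu, cluster = max.col(r))
}

fit <- mbmm(concept_matrix, K = 2)
round(fit$prob, 2)    # per-cluster concept probabilities, cf. Fig. 43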

To better identify the characteristics of the clusters, it is interesting to take a look at concepts that have a very high (or low) probability of appearing for each cluster. For a threshold of 0.8, the concepts for the first cluster are: attribute, class, method, object. For the second cluster, additionally the concepts constructor, data type, and parameter are above the threshold. Note that these are predominantly concepts that are associated with object orientation. For concepts with a probability less than 0.2, the second cluster only has overloading, while the first cluster additionally has: assignment, association, initialization, instance, and loop statement.

To further investigate the differences between the clusters, the post maps of each cluster were vertically amalgamated by summing edges and a Pathfinder network was created from each landscape separately. To arrive at a small network, a strict pruning was used: for each cluster, only concepts that were at least once “correctly” (score of 2) connected in more than half of the maps were kept; all others were removed. Fig. 44 shows the resulting graphs when using the parameters that produce the sparsest results (q set to one less than the number of remaining concepts and r set to ∞). Clearly, the network of the maps of the second cluster displays a much richer knowledge structure, forming an interconnected network of 19 concepts (only overloading was pruned). The network structure of the first cluster, on the other hand, displays only a very limited understanding of the basic concepts of object orientation, while nearly all concepts that are more oriented towards programming have been pruned.

Also, like in the last case study, the PAM algorithm was employed as an alternative to MBMM, using the graph similarity matrix. The Hopkins index of the data is 0.79, indicating its non-uniformity. The best result according to the G1 index is three clusters of similar size, with 21, 30, and 24 maps respectively. The results indicate structural differences, mostly focusing on the concepts attribute, assignment, and object orientation. The networks are shown in the appendix in Fig. 58 to Fig. 60 for completeness. It seems that the students of cluster 2 have a misconception regarding assignments, since the concept is only connected to data type.

12.3.2 RQ2: Connections Between Knowledge and Abilities

The code that the students produced is taken as an indicator of their programming abilities. Analyzing and scoring of object-oriented code has been a central topic in educational research ever since OOP has been taught in introductory courses in CS. See e.g. Börstler, Christensen, Bennedsen, Nordström, Kallin Westin, Moström & Caspersen (2008), Sanders & Thomas (2007), or Truong, Roe & Bancroft (2004) for possible approaches and classifications. However, while there are certain approaches to scoring that could be used “out of the box”, the underlying research question in this case needs an analysis method that is closely based on the concepts of CL, as the investigation deals with the connection between structural knowledge and programming abilities.

The methodology used in this case study was as follows: A hierarchy of observable applications (in program code) for each concept of the list was developed in a systematic way.

Fig. 44: Pathfinder networks built from the post maps of the members of (a) cluster 1 and (b) cluster 2 (as shown in Fig. 43). The parameters were set to q = n − 1 and r = ∞. Before the creation, all concepts were removed that weren’t connected in more than half of the maps.

For example, an application of the concept constructor can be observed in the form of using it (i.e. creating an object) or in the form of creating it (i.e. defining a constructor). These two applications cover everything that can be observed in program code concerning the concept constructor. Clearly, there are other possibilities of partitioning the “application space” as well, for example by further separating the creation of a single constructor and the creation of multiple constructors. That other possible partitions exist is not important for the scoring method, though. It is only important that there doesn’t exist an observable artifact concerning a concept that is not matched by any of the concept’s applications. So, for each concept of the list, two experts tried to conceive of all possible ways of applying it. The resulting set of applications is clearly subjective; however, care was taken that every possible application of a concept was indeed matched by one of the postulated observable applications. Since each of the possible observable applications can be present or absent in a given source code, each of these applications is a testable, dichotomous item.

The only concepts that were excluded right from the beginning were object orientation, class, data type, constructor, and instance. The first one was excluded because in Java there is object orientation by design. The next two were excluded because the use of a modern IDE makes it virtually impossible to distinguish between “implementation by the students” and “implementation by the IDE” due to code completion or auto-correct features. Finally, the last two were excluded since they cannot be separated clearly (calling a constructor will also create an instance).

In the end, this method led to 36 items. In the following, an item “being present” means that the observation belonging to that item was actually observed in a code. Some of the items turned out to be trivial, as they were present in all or nearly all programs. For example, for the concept method there are the applications of calling a method and defining a method. However, every student used a method, because they were all printing to the console. Nevertheless, as the code questions were derived systematically and qualitatively, the trivial questions were kept for the analysis as well, for the sake of completeness. Table 12.2 shows all the questions, from now on called CI (code items).

The scoring of the code was done analogously to the concept scoring, i.e. for each concept a binary value should indicate whether or not the concept was successfully applied in the code. In cases where there are several items belonging to a concept, a decision must be made how to integrate the single values into a value for the concept. In theory, there are three possibilities: either all of the items of a concept must be present, or a fraction (e.g. the majority), or any item must be present to score a 1 for the corresponding concept. In some cases, the items of a concept form a hierarchy in the sense that they are not independent from each other.

Concept  Level  Item
AM       1      Access modifiers private and protected are used
AR       1      An array with pre-initialization is defined in the code
AR       2      An array without pre-initialization is defined in the code
AR       3      An element of an array is accessed
AR       4      An array is created using new
AR       5      A method of the class Array is used
AG       1      An assignment is used
AC       1      There is an association between classes present
AC       2      An association between classes is used
AT       1      An attribute is defined
AT       2      An attribute of another class is accessed
AT       3      An attribute of a class is accessed within the class
CS       1      A conditional statement without else is used
CS       2      A conditional statement with else is used
CS       3      switch is used
DE       1      The visibility of attributes is private or protected
IN       1      A class is derived from an existing Java class
IN       2      A class is derived from a user created class
IZ       1      An attribute is initialized
LO       1      A loop is used
ME       1      A method is called
ME       2      A method is defined
OB       1      A variable or attribute has the type of a user defined class
OB       2      An object referenced by a variable or attribute is used
OB       3      An object uses itself with this
OP       1      The assignment operator is used
OP       2      An arithmetic operator is used
OP       3      A logical operator is used
OV       1      An overloaded method is called
OV       2      An overloaded method is defined
PA       1      A method with parameters is called
PA       2      A method with parameters is defined
PA       3      A parameter of a user defined method is used in the method
ST       1      The state of an object can be read
ST       2      The state of an object can be changed
ST       3      The state of an object impacts program flow

Table 12.2: The dichotomous items used to analyze the code with regard to the application of the concepts of CL. The concepts are given as the abbreviations found in Table 12.1.

For example, in order to use a parameter of a method (PA3), it has to be declared first (PA2). Or, creating an instance of an array with new (AR4) is only possible if either (AR1) or (AR2) is also present. For this analysis, the approach was chosen that a concept is seen as implemented as soon as any of its items is present. This follows most closely the way the conceptual knowledge was analyzed: as long as the student shows any relevant knowledge, the concept is included in the analysis of the concept maps. Correspondingly, as long as the student did produce any valid code concerning one of the items of a concept, it suffices as an indicator that there is some ability present regarding that concept. Clearly, this measure is a very basic one. However, the scoring of the code using these items is highly reliable and should provide an “overestimation” concerning the validity.
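The per-concept aggregation of the code items can then be sketched as follows; items is a hypothetical data frame with one row per project and one 0/1 column per item of CI (named "AM1", "AR1", ...).

# A concept counts as applied as soon as any of its items is present in the project.
concept_applied <- function(items) {
  concepts <- unique(sub("[0-9]+$", "", names(items)))          # "AR1" -> "AR"
  sapply(concepts, function(cc) {
    cols <- grep(paste0("^", cc, "[0-9]+$"), names(items))
    as.integer(rowSums(items[, cols, drop = FALSE]) > 0)
  })
}

ability <- concept_applied(items)
colMeans(ability)    # relative frequency of application per concept, cf. Fig. 45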

In the end, the analysis should give some insights into the interplay between the conceptual knowledge about a given programming concept and the ability to actually use this concept in programming. The conceptual knowledge about a concept is externalized in the form of concept maps - more precisely, by all the edges (and their respective scores) that are incident to this concept in the map. The abilities are “externalized” by the source code. Both are mapped into a dichotomous value for this analysis. There are more complex alternatives for both source code and knowledge - however, a common scale is needed in any case. Without a theory-driven analysis of the knowledge structures of the domain and an analysis of the actual abilities of programming, arguably none of the more complex methods leads to inherently better results. To get an overview of the results of the code scoring, the relative frequency of projects (i.e. the whole of the source code of a participant) in which an item was present can be used. While a complete listing doesn’t yield much insight, Table 12.3 does give a summary by condensing the information into percentiles.

Percentile    Code Questions
0% - 20%      IN1, IN2, AR1, CS3, OV2
21% - 40%     AC1, AC2, AR5
41% - 60%     IZ1, AT2, OB3
61% - 80%     DE1, OP2, AR2, AR4, PA2, PA3, OB1, OB2, AM1
81% - 100%    ME1, ME2, AG1, ST1, ST2, ST3, OP1, OP3, AR3, PA1, AT1, AT3, CS1, CS2, OV1, LO1

Table 12.3: The items of CI and the percentile of projects that showed the respective property.

Using the code analysis together with the landscapes of the concept maps presented in the last section, it is now possible to also aggregate the score values for each test (pre or post) and group (prior experience or not) in the same way as above. This way, the relative frequency of “correct” concept occurrences in the concept maps and corresponding observable application in the source code are shown simultaneously. Only the post test is relevant for this comparison. The results for all students are shown in Fig. 45, the results for separated groups are shown in the appendix in Fig. 57, however, there is only little difference between the two groups. Basically, each concept falls into one of three categories:

1. The two values are high and close together. This indicates concepts that were understood well and implemented well, on average. The core concepts of OOP, attribute (AT), method (ME), and object (OB) fall into this category.

2. The two values are low and close together. This is the opposite of the first category. Association (AC) and inheritance (IN) belong to this category, which are the more advanced OOP related concepts.

3. There is a (somewhat large) gap between the two values - this includes all the concepts that are related to procedural programming - assignment (AG), conditional statement (CS), loop (LO), operator (OP), parameter (PA), and state (ST) - as well as the more “technical” concepts like arrays (AR) and access modifiers (AM), but also overloading (OV).

The only concept that doesn’t really belong to any of the 3 categories is data encapsulation (DE).

When using the results of the clustering from Fig. 43 and comparing the abilities of the students of the two clusters, there are almost no differences to be found (the largest difference between the two groups is 0.14).

12.4 Discussion

Concerning the first research question (RQ1), there is a plethora of observations that can be made from the different diagrams. First, and this is positive to note, both the students with and without prior programming experience are actually gaining relevant structural knowledge. So, with only the minimal input that was given to the students, they were, on average, still able to develop their mental models of the concepts in question. This holds true for both the students with and without prior programming experience.


Fig. 45: Difference between conceptual knowledge and programming abilities concerning the concepts of CL (k = knowledge, a = abilities).

Also, most increase in knowledge is observed for the basic concepts of object orientation (e.g. class, object, method), which were a focus of the worksheets given to the students. The heterogeneity between both groups is visibly reduced, which was one of the goals of the course: alleviating the differences between students at the beginning of their CS studies. Looking closer at the differences between the pre maps and post maps, there are some concepts (inheritance, initialization, instance) for which students without prior experience show no increase in knowledge. This could be seen as an artifact of concept mapping, for example if these particular concepts aren’t easily incorporated into a map. However, as the other group (with prior experience) made use of the concepts in their maps, this case can rather safely be dismissed. An alternative explanation is this: it seems that those concepts need a certain level of understanding before students can correctly incorporate them into their mental model. Since the students without prior experience were acquiring this basic knowledge in the course, they were unable to additionally focus on the more advanced concepts. It may also indicate, however, that the course material is not suited for explaining advanced concepts to students without any prior programming experience.

Concerning the clusters shown in Fig. 43, the concepts with an inter-cluster difference of more than 0.5 in the corresponding probabilities are (in descending order of the difference): data type, loop statement, assignment, conditional statement, and initialization. This list is interesting insofar as there are no concepts present that are “purely” object-oriented (like class). Instead, the list contains the very basic concepts of procedural programming, namely assignment as well as the two control structures conditional and loop statement. Also, the members of the first cluster tend to focus mostly on the basic concepts of object orientation in their post maps. This may be due to the fact that these concepts were most easily integrated into a map, or it may be due to the fact that these persons didn’t gain enough programming experience in order to form relevant structural knowledge of the concepts more oriented towards programming. The fact that many students did integrate these concepts (e.g. conditional statement) into their maps supports the conclusion that there is a group with only very limited knowledge of the programming concepts. So there are viable mental models that incorporate the concepts and also people who are able (and willing) to externalize these models. However, since both clusters found by the algorithm are somewhat matching the manually formed clusters based on programming experience, it is unlikely that the first cluster just represents students not motivated or not able to externalize more of their structural knowledge in the concept mapping task. Instead, it seems plausible to assume that the clusters are forming a truer picture of the prior experience of the students - the 40% of students who said they had prior programming experience but didn’t end up in the second cluster may simply have overrated their prior knowledge.

Concerning the second research question (RQ2), first there are some observations regarding the percentiles of the code items. Obviously, there are some constructs that will appear in even the most basic Java programs, like the use of an assignment. So, not surprisingly, the corresponding items are present in nearly all projects. Also, even though, for example, overloading can be incorporated in virtually every program, not every programming task lends itself to, e.g., the creation of a hierarchy of classes. Since the programming tasks were rather small - in order to make it possible for the students to finish the tasks in the two and a half days - it must be assumed that these items are absent in most projects not necessarily because they are more difficult than the rest, but simply because there was no clear necessity to use the corresponding constructs. Therefore, most insight can be gained from items that are present at least in some, but not in all of the projects.

The results (Fig. 45) show that abilities do not strictly follow knowledge. Instead, three categories of the relationship can be identified: there is a group of concepts that is understood and applied well, a group that is only applied well, and a group that is neither understood nor applied well. It is interesting to note that there are concepts of object orientation in both of the first categories. This clearly shows that several of those concepts (like inheritance) are seemingly harder to grasp than the rest, while students actually were able to understand and apply the basic concepts of object orientation (e.g. object) after the course. Next is the huge difference between the representation of knowledge and the usage in the code for the third group of concepts, like loop. This particular concept might have been hard to integrate in the concept maps, since it is one of the few concepts dealing more with procedural programming than with object orientation. But it may just as well show that “understanding” those concepts is not a trivial task and takes considerably more time than learning how to apply them. This is especially true for overloading, where the high value for application mostly comes from calling an overloaded method. Together, this indicates the difficulty inherent in learning (object-oriented) programming: there are several groups of concepts that are all needed to create a “real” OO program, but those groups of concepts are showing radically different results even for those students that had prior programming experience! Finally, there is an interesting observation on two very similar concepts: state on the one hand and attribute on the other. The state of an object is defined by the value of its attributes. Looking at the values for the conceptual knowledge of both concepts, however, there is a big difference, which doesn’t hold true for the application, where the results for state and attribute are close together. So state is a concept that is clearly being used by the students but not understood. It is not surprising that the students don’t fully grasp the concept of state transitions of objects after this particular introductory course, though.

The fact that there is only little difference between the abilities of the two groups and between the persons of the two clusters identified above indicates that overall the scoring method used does not offer very much spread over the projects. However, it also indicates the success of the courses, as one of the goals was that every participant has a working project in the end - this alone indicates the presence of many items of CI.

In conclusion, the results show that there are seemingly different kinds of OOP-related concepts. Some are basic enough for students to learn on their own in a rather short time frame; others require a certain level of prior knowledge in order for students to be able to incorporate them into their mental model meaningfully. However, it is possible for students to learn about the application of certain concepts without fully understanding the underlying concepts. In a way, this can be seen as supporting the idea of the taxonomy given by Fuller et al. (2007). It suggests that, especially for computer science, the cognitive processes of learning objectives can be separated into two (independent) dimensions of “Producing” and “Interpreting”. Also, the results show that it is possible to have a rather heterogeneous group of students working on the same material and still have learning opportunities for both groups. The students that had already programmed focused on certain concepts in their learning and simply created more complex programs. The students without previous knowledge picked up basic elements of OOP while ignoring (or not fully understanding) the more advanced concepts.

Concerning the analysis methods used, concept landscapes and the way they were used in this study proved to be effective. Even though a lot of information was lost in the way the maps were aggregated, the approach still allowed an insight into the structural development of students’ knowledge. The clustering worked well in identifying two groups that differ from the manual selection and that possess a visibly differing richness of their knowledge structures. Pathfinder networks were helpful in making this distinction clear. However, especially concerning the analysis of source code, there is still a lot of room for improvement. While the system of using code items worked reasonably well concerning reliability, there is no guarantee concerning the validity of the measurements.

Part V

Conclusion

13 Summary

This thesis - in the last two parts - has presented the notion of concept landscapes both in theory and practice. As chapter 6, based on chapters 3 and 4, has established, concept mapping has proven to be an effective method for teaching, learning, and assessing conceptual knowledge - especially when focusing on the structural aspects of concept interdependence. Concept mapping is fundamentally based on the theories of meaningful learning and Constructivism, which have both been presented in chapter 3. Also, the psychological definition of a concept as a mental representation, together with the psychological function of integration, forms a basis on which concept mapping can be better understood. The validity and reliability of concept mapping assessment tasks has been established by the literature, albeit not beyond any doubt, especially when considering the plethora of different concept mapping tasks to be found in the literature. This thesis therefore has focused on the “gold standard” concept mapping task of creating a map from scratch, with the only restriction being a given, possibly restricting, list of concepts. The most severe limitation of concept mapping from an epistemological point of view is the restriction to propositions between two concepts at most, which is not enough for expressing arbitrary facts. From an educational point of view, it is possible to assess learning objectives that cover almost the entire “Interpreting” dimension of the taxonomy by Fuller et al. (2007), as reasoned in chapter 6. Even though concept mapping falls short in assessing skills or competencies of a person, both are typically based on a cognitive component that may be assessed by concept maps.

Based on these foundations, a novel view on the application of concept mapping for investigating the state and development of structural knowledge has been presented in chapter 7. Instead of focusing solely on the measurement of a single person, or on many measurements of single persons analyzed in isolation, the data of a group of persons is aggregated. Inspired by the ideas of data mining, this aggregated data can be used to gain new insights. While motivational factors, the fluency regarding concept mapping itself, the specific location and time of the concept mapping task, and other variables will, whether detectable or not, influence the results of each concept map, many of these influences can be expected to cancel out through aggregation. There are two fundamental ways in which concept maps of different persons can be aggregated, as shown in Fig. 11. A concept landscape can be horizontal - focusing on the development of knowledge over time - or vertical - focusing on the state of knowledge at a single point in time. In practical settings, a combination of both aspects will often be used. The method of aggregation, as shown in Fig. 12, can either be a “loose” accumulation of maps that remain identifiable in the aggregated data, or it can be a more transformative amalgamation that results in a new graph which is influenced by the constituent maps of the landscape.
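
To make this distinction concrete, the following minimal sketch (plain R, independent of the CoMaTo implementation; the concept names and the matrix-based data layout are illustrative assumptions) represents each map as an adjacency matrix over one shared concept list. An accumulation simply keeps the list of matrices, whereas an amalgamation sums them into a single edge-frequency graph.

# Minimal sketch, not the CoMaTo API: all maps share one fixed concept list.
concepts <- c("object", "class", "attribute", "method")   # illustrative concepts

empty_map <- matrix(0, nrow = length(concepts), ncol = length(concepts),
                    dimnames = list(concepts, concepts))

add_edge <- function(m, a, b) { m[a, b] <- 1; m[b, a] <- 1; m }

map1 <- add_edge(add_edge(empty_map, "object", "class"), "class", "attribute")
map2 <- add_edge(add_edge(empty_map, "object", "class"), "class", "method")

accumulation <- list(map1, map2)           # maps remain identifiable entities
amalgamation <- Reduce(`+`, accumulation)  # edge weight = number of maps containing the edge
amalgamation["object", "class"]            # 2: this proposition occurs in both maps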

Concept landscapes are only an abstract notion of a set of data items. In practical settings, it is the analysis and the actual method of aggregation that determine the usefulness of the approach. This thesis presented two basic analysis methods. Cluster analysis is best suited to identify inherent differences within an aggregation and typically works with vertical accumulations. While many clustering algorithms and similarity measures can be used, a latent-model-based approach using multivariate Bernoulli mixture models and a partitioning method using, for example, graph similarity as a measure of distance have been presented and shown to work well on actual data. Creating Pathfinder networks from concept landscapes is a way of pruning edges that allows the discovery of salient structural information. An amalgamation is therefore the chosen method of aggregation for Pathfinder analysis, in order to identify the common structural elements in the data. Pathfinder networks have been chosen over other methods of graph pruning or dimensionality reduction, since graphs are a natural model for concept landscapes and Pathfinder networks originated in the analysis of structural knowledge. These two basic approaches will usually be accompanied by further ways of analyzing the results: several graph measures were presented, as well as the analysis of frequently occurring subgraph structures; these focus on the micro-structure of graphs. Finally, visualizing concept landscapes often offers insights into the structure of the aggregated map data “at a glance” and also eases the interpretation of Pathfinder networks. The development of knowledge can be displayed very well using the visualization of a horizontal accumulation.
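
For the parameter choice used in several of the analyses of this thesis (q = n − 1, r = ∞), an edge of an amalgamation survives the pruning only if no other path connects its endpoints with a smaller maximum edge weight. The following sketch (plain R, not the CoMaTo implementation; the minimax recurrence is one standard way of realizing this criterion) illustrates the idea for a weighted adjacency matrix such as the one in Fig. 17, where lower weights mean stronger association.

# Minimal sketch of PFnet(q = n - 1, r = Inf), independent of CoMaTo.
# w: symmetric weighted adjacency matrix of an amalgamation, 0 = no edge,
#    lower weights = stronger association.
pathfinder_inf <- function(w) {
  n <- nrow(w)
  d <- ifelse(w > 0, w, Inf)     # minimax distances, initialized with the direct edges
  diag(d) <- 0
  for (k in 1:n)                 # Floyd-Warshall variant: a path is as weak as its
    for (i in 1:n)               # heaviest edge, distances take the best path
      for (j in 1:n)
        d[i, j] <- min(d[i, j], max(d[i, k], d[k, j]))
  (w > 0) & (w <= d)             # keep an edge iff its weight equals the minimax distance
}

This parameter choice yields the sparsest of the possible Pathfinder networks, which is why it is attractive when only the most salient structure of a landscape is of interest.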

Since data mining approaches focus on large amounts of data and all sub-tasks in the context of concept landscapes can be supported well by software solutions, chapter 8 presented a tool-chain for working with concept landscapes on computers, as shown in Fig. 20. At the beginning of a study, concept maps must be drawn or collected. The online editor CoMapEd allows researchers to conduct surveys with flexible settings while also offering functions for the participants, like automatic online saving of their current work and the possibility of exporting the concept maps. The techniques presented in chapter 7 are all well suited for computer-aided analysis. The R package CoMaTo provides this support by offering implementations of concept landscapes as data structures as well as all analysis techniques presented in this thesis. Additionally, the analysis of the (text) material used in an educational process is a valuable resource for the interpretation of results, by providing pointers to problem spots in the material or to particular artifacts in the concept landscapes of learners. ConEx provides some basic features that guide this analysis process by offering extraction of frequently occurring or salient nouns/concepts and the sentences/propositions that contain them.

Finally, the previous part presented several case studies in which concept landscapes and the associated analysis techniques were applied in real-world scenarios with actual research questions. The results concerning computer science education will be discussed in detail in the next chapter. From the perspective of concept landscapes, the studies have shown that, in particular, forming Pathfinder networks from sets of concept maps is a fruitful way of arriving at interpretable results. The development of knowledge along the course of a lecture has been visualized, and it has been shown that misconceptions develop alongside “correct” knowledge. Also, the average knowledge structure of beginning CS students has been analyzed and illustrated with the help of a vertical amalgamation of concept maps. The attendance of the newly introduced compulsory school subject Informatics in Bavaria has a measurable impact on the knowledge structures of beginning CS students. Finally, the interdependence of conceptual knowledge and programming abilities has been investigated. It has been shown that conceptual knowledge develops even if only minimal theoretical input is provided to students - as long as they are able to practically apply the relevant concepts in programming tasks.

14 Discussion

To relate back to the introduction in the very first part, this thesis has been centered around the research area “student understanding” that was identified for research in computer science education - in particular the “investigation of students’ mental and conceptual models, their and misconceptions” (Fincher & Petre 2004, p. 3). This in turn is relevant in order to “develop models that can explain how educational processes take place [...] and analyze strategies for intervention” (Klieme et al. 2008, p. 3), which has been defined as one of the goals of modern educational research. This thesis presented analysis methods based on educational data mining, related to the general goals of “clustering”, “relationship mining” and “distillation of data for human judgment” (Baker & Yacef 2009, p. 9) - as demonstrated in the three case studies. The approach can also be used to analyze interventions - as shown in the third case study - but more generally investigates the mental models of students. This chapter will discuss what has been presented in the third and the fourth part separately. In the same vein as Fig. 2 shows the interconnections of the several areas of research that provide the basis for this thesis, Fig. 46 shows the results, their interconnections, and how they affect the research areas that provided the basis. Aside from the case studies, the two specific research questions of this thesis have been presented in chapter 2:

1. How can methods of data mining be applied to sets of concept maps in order to identify common elements and differences between the individual maps?

2. How can software support the workflow of the research design presented in Fig. 1?

The first question has been answered by the findings of chapter 6 and chapter 7: concept landscapes can be used together with data mining approaches to identify both the common elements and the differences between individual concept maps. In particular, cluster analysis, Pathfinder networks, graph measures, and visualization techniques have been presented for this task, and the case studies have shown their successful application. The second question has been answered by chapter 8, where software for each of the pivotal points of the research design of Fig. 1 has been presented, as shown in Fig. 20. Specifically, the computer-based drawing and collecting of concept maps, the automatic analysis of textual input data, and the computer-based analysis using the methods of chapter 7 have been made possible by the three software projects CoMapEd, ConEx, and CoMaTo.

Fig. 46: The results of this thesis and how they affect the different research areas.

Concept maps have been chosen as the method for externalization because of their established validity and reliability, the existing psychological and educational theory behind them, and the additional benefit they can offer to participants. On the downside, however, the method of concept mapping has to be explained to participants and may even require some form of training until they become fluent enough in the process. Also, the relative freedom of expression, especially for pen and paper based maps, often leads to highly variable results concerning the actual syntax and semantics of concept maps. Other methods of externalization may therefore present viable alternatives for the specific task of research on knowledge structures. As has been noted in chapter 6, relatedness judgments may provide more accurate results concerning the structural organization of concepts. Among other factors, the lack of benefit for participants and the influence on motivation must then be considered, though. It remains questionable whether a large-scale study with useful results can be obtained from using relatedness judgments. This holds true especially since it is far easier to detect, for example, a nonsensical concept map of a student not interested in taking part in a study than it is to detect relatedness judgments that were simply guessed or randomly chosen. Concept mapping has proven to be a working and established choice - however, investigating different forms of eliciting structural knowledge may provide more insights both into the workings of concept landscapes and into the strengths and weaknesses of concept mapping.

The four ways of aggregating concept map data into concept landscapes do not cover all combinatorial possibilities of aggregating a set of maps. However, longitudinal and cross-sectional studies are possible, which cover typical research scenarios. Forming either an amalgamation or an accumulation does cover all possibilities, as the concept maps either remain identifiable as entities in the aggregation or they do not. The formal definitions that were given in chapter 7 serve the purpose of clarification and precise definition but are neither the only nor the single correct way of modeling the four types of landscapes. The two central methods of clustering and Pathfinder networks are able to achieve useful results. Nevertheless, they must always be seen as an exemplary selection from all the possible ways of analyzing the data of aggregated concept maps. There are other ways of pruning graphs, other similarity measures or ways of arriving at a distance matrix for clustering, and also other ways of defining latent class models for concept map data. There is no particular reason for using the centrality of a graph, for example, but not some other graph measure - except that there is an explainable, expected connection between the graph measure and properties of structural knowledge. The goal of this thesis is to present a working analysis method that allows investigations into knowledge structures on a thorough theoretical basis. Other ways of investigating concept landscapes are not dismissed by this choice, though, and should be considered.

What has been hinted at in chapter 7 and followed consistently in the case studies is that typically neither a purely quantitative nor a purely qualitative analysis yields optimal results in exploratory, fundamental educational research. Instead, both approaches have been used simultaneously and - one might say - non-dogmatically, each improving on the other. For instance, finding clusters and then using a t-test to establish that one cluster possesses denser maps is a quantitative approach. Continuing by searching for graph communities in these clusters and inspecting their intersections and differences, based on the given subject matter context, is a qualitative analysis of the data. By using the strengths of both methods, a much broader variety of insights into the data of concept landscapes can be gained.
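
As an illustration only - the data layout, the choice of PAM with a Jaccard distance on edge sets, and the fast-greedy community algorithm are assumptions made for this sketch, not a prescription taken from the thesis - such a combined workflow could look as follows in R:

library(cluster)   # pam()
library(igraph)    # graph_from_adjacency_matrix(), edge_density(), cluster_fast_greedy()

# 'maps' is assumed to be a list of symmetric 0/1 adjacency matrices
# over one shared concept list (a vertical accumulation).
jaccard_dist <- function(a, b) {           # distance between two maps' edge sets
  uni <- sum(a | b)
  if (uni == 0) 0 else 1 - sum(a & b) / uni
}
n <- length(maps)
d <- matrix(0, n, n)
for (i in 1:n) for (j in 1:n) d[i, j] <- jaccard_dist(maps[[i]], maps[[j]])

cl <- pam(as.dist(d), k = 2)               # quantitative step: partition the landscape

dens <- sapply(maps, function(m)
  edge_density(graph_from_adjacency_matrix(m, mode = "undirected")))
t.test(dens[cl$clustering == 1], dens[cl$clustering == 2])   # is one cluster denser?

# Qualitative follow-up: communities in the amalgamation of one cluster,
# to be interpreted against the subject matter.
amalg <- Reduce(`+`, maps[cl$clustering == 1])
g <- graph_from_adjacency_matrix(amalg, mode = "undirected", weighted = TRUE)
membership(cluster_fast_greedy(g))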

The software tool-chain that was developed in the course of this work has proven to work well in research settings. In each of the three scenarios, the choice was made to develop software instead of using an existing solution, due to the restrictions that the existing solutions presented and a judgment of how costly an implementation would be. Nevertheless, several design choices were made for each project, none of which was without alternatives. This is especially relevant for the drawing tool CoMapEd, since it is the one that most people interact with in actual research settings. This software must therefore provide the ease of use and functionality that participants need in order to keep up their motivation. The choice of a browser-based implementation is certainly valid given the variety of settings encountered in actual studies, and it has been shown to work well. Other choices, however, like the particular export format or the actual methods of interacting with the drawing area, may have to be reconsidered given actual user feedback and might change in the future. Basing the analysis tool CoMaTo on R makes sense, given the flexibility of this approach. The analyses of the case studies have shown that it is a valuable support for working with concept landscapes. Improvements are therefore mainly a matter of implementing new features, as briefly described in the next chapter. ConEx, so far, can only be seen as an experimental study. Many existing methods of extracting salient words from texts should be considered for the problem at hand.

14.1 Case Studies

The case studies provide research results in the form of pedagogical content knowledge for subject-matter didactics. The particular set of studies mostly presented insights into the effects of teaching and learning central, basic concepts of computer science - often related to object-oriented programming. Also, the participants were always students at the very beginning or in the first semesters of their studies at university. While these specific settings are very interesting, as they allow insights into both secondary education and education at universities, the scope of studies can and clearly should be broadened. This encompasses studies with more experienced students as well as different, more specialized fields of computer science. Nevertheless, the basic concepts investigated in this work are very important, as they form the foundation on which the rest of computer science education at the university builds. When considering the implications of meaningful learning, conceptual change, and Constructivism, it becomes very hard if not impossible to develop well-formed, densely connected conceptual knowledge and skills without a well-formed foundation.

Concerning the method of conducting the studies, the computer-based collection of concept maps is clearly preferable to the pen and paper methods. The number of participants in the second case study, for example, presented an obstacle to analysis that could only be overcome by paying students to digitalize the maps. Software-based collection would have made this step unnecessary and the study cheaper. Also, when using software, a restricting list of concepts is far more convenient to present to the participants, as it reduces ambiguities due to typing errors or due to using different forms (like the plural) of the concepts given on the list. Additionally, when collecting concept maps with pen and paper and then manually digitalizing them afterwards, a new source of error is introduced into the process, the severity of which is difficult to quantify.

The first case study has provided insight into the development of knowledge and misconceptions. It has shown that learning, as observed, is not a straightforward process - even when taking into account a whole group of learners. So far, it remains unclear whether this process is natural in the formation of personal mental models or a specific artifact of the non-majors, who might struggle more with the CS concepts than students who willingly chose computer science as their field of study. Nevertheless, it seems worthwhile to evaluate how teaching can be improved and how students can be made aware of their misconceptions more easily and, if possible, remedy them early on. This could, for example, make explicit use of the concept landscapes that were formed by the students’ concept maps. After all, concept maps are an established teaching aid as well, and making misconceptions explicit is one of the fundamental ideas of conceptual change.

The second case study has presented a method of analysis that can be used to investigate the prior knowledge of a group of learners. The structural information is especially valuable since it can be used to present topics in a way that fosters meaningful learning more easily, by taking into account which concepts might already be present in the knowledge structure and adapting teaching accordingly. Clearly, the information extracted from the concept landscape does not hold for each individual, but teaching a larger group of learners always requires certain compromises, and taking into account “general” prior knowledge is a reasonable approach, as long as the idiosyncrasies of learners are not neglected. The identification of different groups by clustering according to graph similarity has yielded interesting insights, especially when taking into account the other (personal) information provided in the survey.

Finally, the third case study has shown that it is possible to introduce the basics of object-oriented programming with only minimal input and practical applications and still have conceptual knowledge develop. Clearly, this approach also fosters the creation of misconceptions, but taking the first study into account, it seems doubtful whether this can be avoided at all, given a reasonably complex subject area. The interconnections between abilities and knowledge are more diverse than might be expected initially. This may, to some degree, be influenced by the actual analysis methods used in the case study, especially the scoring of the source code. However, it is also an indicator - to some degree - of an independence between “knowing” and “doing” in OOP, which has been postulated for computer science by the taxonomy of Fuller et al. (2007). It is certainly worthwhile to investigate this dependence further in other studies.

In conclusion, the application of concept landscapes has proven to be useful in practical settings, as the case studies of the last part have shown various interesting results concerning knowledge development in computer science education. Having a working software tool-chain available opens the door to an economic and flexible way of monitoring certain aspects of educational processes in exploratory settings with minimal prior work. Educators or researchers who are interested in the development of students’ knowledge can incorporate concept mapping into their teaching activities and collect and analyze the maps electronically with minimal additional overhead. The case studies, which have all been exploratory in nature, can serve as a blueprint for research questions, subsequent analysis methods, and interpretations. Since the established knowledge about learning in computer science is not yet very far developed, exploratory case studies are needed as the basis for further research. Using these results to develop models of learning that can then be tested is one of the next steps towards better (research in) computer science education.

“Psychologists have amassed a large amount of empirical research on various factors that impact the ease of learning and transferring conceptual knowledge. [...] Putting these suggestions to use in classrooms [...] could have a substantial positive impact on pedagogy” (Goldstone & Kersten 2003, p. 616).

15 Further research

The work presented in this thesis has been rather broad. Necessarily, single aspects have only been investigated in limited depth. This chapter outlines some of the aspects that deserve further inquiry beyond the work presented here.

First, concerning concept landscapes and their analysis methods, a more thorough investigation of the mathematical or statistical properties of the analysis methods and the results generated from them should be carried out. This includes, for example, tests that indicate the suitability of given methods or that measure to which degree the results are actually significant for the set of maps used. This could eliminate some of the guesswork that is currently still necessary when interpreting concept landscapes. Also, the analysis methods themselves can be refined further. Especially concerning the visualization of horizontal landscapes, there is a much broader variety of possibilities than alluded to in the preceding chapters. The results of data visualization techniques in other fields of research must be taken into account in order to judge what might be beneficial for concept landscapes.
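
One conceivable direction - sketched here purely as an assumption, not as a method proposed in this thesis - is to bootstrap over the maps of a landscape in order to see how stable the edge frequencies of an amalgamation are before interpreting them:

# Hypothetical stability check ('maps' as before: a list of 0/1 adjacency
# matrices over one shared concept list).
bootstrap_edge_freq <- function(maps, reps = 1000) {
  n <- length(maps)
  boot <- replicate(reps, {
    resampled <- maps[sample(n, n, replace = TRUE)]
    Reduce(`+`, resampled) / n                     # relative edge frequencies
  })
  # result: 2 x concepts x concepts array of lower/upper interval bounds per edge
  apply(boot, c(1, 2), quantile, probs = c(0.025, 0.975))
}

Edges whose lower bound stays clearly above zero could then be reported as robust features of the landscape, whereas edges whose interval reaches down to very small frequencies should be interpreted with care.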

Next, the software that was presented in chapter 8 offers several possibilities for improvement. For CoMapEd, the inclusion of some form of recommender system for the edge labels is an interesting aspect of further development. If users see possible suggestions for labels as they type and the list of labels is filled with meaningful examples, they might choose these suggestions. This would make the automated analysis of edge labels far easier, as the semantic meaning of the suggestions could be predefined. Also, it would not unnecessarily restrict users, as they could still choose to type their own label if they desire. Whether this leads to “better” concept maps or just caters to the laziness of the drawing persons, and therefore probably diminishes the validity of the externalization process, remains to be seen, of course. CoMaTo can be expanded by further aggregation methods, import and export formats, and analysis techniques. ConEx is so far only of experimental value and can clearly be improved much further. The inclusion of more languages, more elaborate methods of extracting terms, and better integration into the rest of the tool-chain are examples of further development steps.

Finally, concerning the case studies, there are several interesting opportunities for gaining deeper insight into knowledge development in computer science - as has been noted in the last chapter already. First, a group of experts should be investigated in order to have a more clearly defined baseline of what is considered “good” structural knowledge of computer science. Next, the development of knowledge should be related more closely to the development of actual abilities, as, clearly, the trend in assessments seems to favor the latter. Finally, areas of CS in which a rich framework of conceptual knowledge is necessary may provide a fruitful opportunity for research with concept landscapes. Examples of such areas are software engineering and algorithms, or areas like operating systems and networking, in which students are often required to “know” more than to “do”.

The next two sections present aspects of further research that go beyond the research of this thesis and instead provide directions that may be worth investigating with concept landscapes.

15.1 Identifying Threshold Concepts of CS

Threshold concepts, as presented in chapter 3, are an important aspect to consider for teaching in any subject. Concept landscapes might be employed to identify these concepts for computer science. To this end, a longitudinal study would have to be conducted in which a group of learners continually works on a concept map. These maps could then be accumulated horizontally. Following the characteristics given in section 3.2.3, it seems reasonable to expect the following observable structural attributes in horizontal aggregations for threshold concepts (a minimal sketch of how some of these attributes could be checked follows the list):

Transformative Once the concept appears, major restructuring of the map is expected to happen.

Irreversible Once the concept appears, it stays in the map.

Integrative The concept does not appear “at the fringe” of the map, but is connected more centrally right from the beginning.

Bounded The concept appears in a central position in the maps of experts.

(Potentially) troublesome There are a number of maps in which the concept does not appear, or appears much later than it was presented to the learners, for example.
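
For instance - with a hypothetical data layout in which one learner’s horizontal series is a list of adjacency matrices over a fixed concept list, and with helper names chosen only for illustration - the attributes “when does the concept first appear” and “once it appears, does it stay” could be checked roughly as follows:

# Hypothetical helpers, not part of CoMaTo: 'series' is one learner's maps
# ordered in time, 'concept' names the candidate threshold concept.
first_appearance <- function(series, concept) {
  present <- sapply(series, function(m) sum(m[concept, ]) > 0)
  which(present)[1]                                # NA if the concept never appears
}

is_irreversible <- function(series, concept) {
  present <- sapply(series, function(m) sum(m[concept, ]) > 0)
  t0 <- which(present)[1]
  !is.na(t0) && all(present[t0:length(present)])   # once connected, always connected
}

The transformative and integrative attributes would additionally require graph-level comparisons - for example, how much a map changes around the point of first appearance and how central the concept is when it appears - which is exactly where the graph measures of chapter 7 could be reused.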

In a previous attempt to identify such concepts in the context of computer science, Eckerdal, McCartney, Moström, Ratcliffe, Sanders & Zander (2006), in reference to the fundamental ideas of computer science (Schwill 1994), suggest abstraction and object orientation as threshold concepts. Additional candidates are, for example, recursion, as the second case study already indicates, or the theoretical concepts of computability and tractability. Edmondson (2005, p. 24) also describes the potential of concept maps for identifying (based on manual observation) “essential critical nodes” that resemble the idea of threshold concepts.

15.2 Combining Knowledge Space Theory and Concept Maps

The theory of knowledge spaces, originally described by Doignon & Falmagne (1985), is a mathematical model of the possible “knowledge states” of a given set of “facts” (or items that can be used in assessment) that can be learned (or solved). The knowledge space is a subset of the power set of these facts. The reasoning is that there may be dependencies between the facts, such that one fact must always be learned before another fact, for example. Therefore, not all possible combinations of facts will exist in the real world, and the knowledge space reflects that. Identifying how the knowledge space is actually structured is a difficult process, however, and is mostly done by querying experts. Albert & Steiner (2005) suggest using knowledge space theory for the validation of concept maps.
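
As a small illustration of the formal side - the representation of states as character vectors of mastered facts is chosen purely for this sketch - a family of states forms a knowledge space in the sense of Doignon & Falmagne if it contains the empty state and the full set of facts and is closed under union:

# Illustrative check, not taken from the thesis: does a family of states
# form a knowledge space over the given facts?
is_knowledge_space <- function(states, facts) {
  norm <- lapply(states, sort)
  contains <- function(s) any(sapply(norm, identical, sort(s)))
  pairs <- combn(length(norm), 2)
  closed <- all(apply(pairs, 2, function(idx)
    contains(union(norm[[idx[1]]], norm[[idx[2]]]))))
  contains(character(0)) && contains(facts) && closed
}

facts  <- c("A", "B", "C")
states <- list(character(0), "A", c("A", "B"), c("A", "C"), c("A", "B", "C"))
is_knowledge_space(states, facts)   # TRUE: e.g. {A, B} union {A, C} = {A, B, C} is a state

The following paragraph outlines how concept landscapes might be used to estimate such a structure empirically.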

Concept landscapes offer an alternative approach to the problem of identifying knowledge spaces and, additionally, knowledge spaces offer a way of modeling the structural knowledge externalized in concept maps. It may be possible to identify the dependencies of knowledge about concepts by analyzing how, if, and when they appear in a concept landscape, most probably by using vertical aggregations at several points in time. It might be possible to extract the relations between states of the knowledge space by statistically analyzing the concept landscape. Knowledge about the actual structure of the knowledge space in turn allows identifying problem areas in instruction - for example concerning the ordering of topics.

16 Acknowledgements

Writing this thesis over the last three and a half years has been an interesting and fun - but also at times very stressful - venture. To see the project develop over time and now, finally, its completion drawing near, is a unique emotional experience. While writing the thesis has been, first and foremost, a personal endeavor, as John Donne is quoted: “No man is an island”. Consequently, arriving at this point would not have been possible without the assistance and guidance of other persons.

The idea of investigating concept maps was originally suggested to me by my supervisor Prof. Dr. Peter Hubwieser. He has also been actively engaged in the analysis of the first case study and has been the driving force behind the AVIUS project that yielded the data for the second case study. Besides this very specific help, he has also guided the process of writing this thesis right from the beginning. For all of this - and the possibility to write this thesis in the first place - I remain thankful.

Also, I’d like to thank the professors Gudrun Klinker, Ph.D., Christopher Hundhausen, Ph.D., and Dr. Johannes Magenheim for agreeing to be part of the examination committee of this thesis.

It would not have been possible to collect the large amounts of data without the help of all the other people, students as well as researchers, at the working group “Didaktik der Informatik”. In particular, I thank Marc Berges, who collected the data of the third case study and was also actively engaged in the analysis of the corresponding original research study - especially concerning the analysis of the source code. He also helped with collecting the data of the AVIUS project.

The original manuscript was proof-read by Julia Kurth and the cover of this print was designed by Peter Wagner.

Finally, there have been many other persons who supported me on a personal level during the process of writing this thesis and whom I have to thank - in particular my parents.

A Example Maps for Pathfinder Analysis

The following ten concept maps have been used as the basis of the Pathfinder example given in section 7.2.2.1. Since only the structure of the maps is relevant for the result, no edge labels are given.


B Additional Diagrams for CS1


Fig. 47: The second expert map containing all the propositions that were presented between MT1 and MT2.

Fig. 48: The third expert map containing all the propositions that were presented between MT2 and the exam.

Fig. 49: The fourth expert map containing all the propositions that were presented between the exam and POST.

C Additional Diagrams for CS3

Fig. 50: Development of relevant structural knowledge over all participants. Shown is the difference between post and pre test of the diagram in Fig. 40.

Fig. 51: Development of structural knowledge dependent on prior programming experience. Shown is the percentage of maps that showed at least one “correct” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test). Panels: (a) No experience, (b) Programming experience.

Fig. 52: Development of structural knowledge dependent on prior programming experience. Shown is the difference between post and pre test of the diagram in Fig. 51. Panels: (a) No experience, (b) Programming experience.

Fig. 53: Development of misconceptions dependent on prior programming experience. Shown is the difference between post and pre test of the diagram in Fig. 42. Panels: (a) No experience, (b) Programming experience.

Fig. 54: Development of misconceptions over all participants. Shown is the percentage of maps that showed at least one “incorrect” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test).

Fig. 55: Development of misconceptions over all participants. Shown is the difference between post and pre test of the diagram in Fig. 54.

Fig. 56: Development of misconceptions dependent on prior programming experience. Shown is the percentage of maps that showed at least one “incorrect” edge (i.e. with a score of 2) incident to a concept (0 = no prior experience, 1 = prior experience). Panels: (a) Pre Maps, (b) Post Maps.

Fig. 57: Difference between conceptual knowledge and programming abilities concerning the concepts of CL (k = knowledge, a = abilities). Panels: (a) Students without prior programming experience, (b) Students with prior programming experience.

Fig. 58: The Pathfinder networks (q = n − 1, r = ∞) of the first cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned.

Fig. 59: The Pathfinder networks (q = n − 1, r = ∞) of the second cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned.

Fig. 60: The Pathfinder networks (q = n − 1, r = ∞) of the third cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned.

List of Figures

Fig. 1 A general model for monitoring the effects of a learning stimulus on the knowledge of a person. Even though one is generally interested in the development in the left half of the diagram (i.e. the real person), only the right half is accessible to research. The transfers in form of an externalization and the analysis of the stimulus are in general not loss-less and are subject to many influences...... 6

Fig. 2 The areas of research that provide the basis for this thesis. . . . 15

Fig. 3 Kolb’s learning cycle ...... 37

Fig. 4 A model of learning, adapted from Hay et al. (2008, Fig. 7). All boxes with no outgoing arrows represent final states of the model: The two green ones on the left represent an outcome in which the student has learned something. The red box on the right represents an outcome of non-learning...... 38

Fig. 5 A concept map illustrating the concept “proposition” and its relation to knowledge. Adapted from Novak (2010, p. 26) ...... 47

Fig. 6 The three different types of morphology of concept maps. Adapted from Kinchin (2000, p. 5-42) ...... 52

Fig. 7 Relational scoring method. Adapted from McClure et al. (1999, p. 482) ...... 64

Fig. 8 Schematic overview over the interconnections of the topics of chapter 3 and chapter 4 in the context of concept maps...... 86

Fig. 9 The time between consecutive actions in the concept mapping editor for 58 (rather large) concept maps of students of two runs of the same lecture. The maps were drawn with no time restrictions and, in general, over several drawing sessions...... 95

Fig. 10 A triadic proposition encoded in a concept map...... 99

Fig. 11 A visualization of the two types of aggregations. Each ‘x’ marks the concept map of one person at one point in time. Some are missing. The aggregations either combine all maps of one point of measurement or all maps of one person...... 114

Fig. 12 Accumulations still treat each map as an identifiable entity in the resulting data. Amalgamations result in a graph or concept map. In both cases, the newly formed data is the input for the subsequent analysis steps...... 115

Fig. 13 A typical combination of aggregations: First, the maps of several points of measurement are amalgamated vertically, the resulting, new set of maps is then aggregated horizontally, using the times of measurements for the temporal ordering...... 116

Fig. 14 Schematic overview over the different analysis methods and their relation to the different types of concept landscapes...... 120

Fig. 15 MBMM and k-medoid clustering for the dataset of the first experiment in comparison ...... 127

Fig. 16 MBMM and k-medoid clustering for the dataset of the second experiment in comparison ...... 129

Fig. 17 The amalgamated concept landscape of 10 concept maps. The edge weights are transformed by using 11 minus the number of maps in which the given edge is present. The dashed edges are non-existent in the original concept maps and thus have an assumed weight of ∞...... 135

Fig. 18 Pathfinder networks created from the input graph in Fig. 17 using four different sets of parameters...... 137

Fig. 19 The result of applying the layout algorithm to a concept landscape taken from the first case study. It is based on communities that are aligned on a circle, the largest community forms the center. Even though the node placement has been optimized, there are several improvements that could be made...... 147

Fig. 20 The three software projects that support the pivotal points of the schema presented in Fig. 1 ...... 149

Fig. 21 Screenshots of the front- and back end of CoMapEd...... 154

Fig. 22 Screenshot of the graphical user interface of ConEx...... 158

Fig. 23 Overview over the study, based on the schema of Fig. 1...... 169

Fig. 24 An example map of MT2 as originally produced by one of the students in German...... 171

Fig. 25 The first expert map containing all the propositions that were presented up to MT1...... 174

Fig. 26 Development of basic graph measures. Plotted is the mean of each measure over the four points of measurement. The top and bottom lines show the standard deviation from the mean. Where these would have exceeded the minimal or maximal actually occurring value, those are used instead, indicating an asymmetrical distribution...... 176

Fig. 27 The distribution of subgraphs of the four concepts attribute, object, state and value in the concept maps of MT1...... 178

Fig. 28 The different subgraphs with more than one occurrence in MT1 - except for the empty subgraph. The numbers are in reference to Fig. 27 ...... 178

Fig. 29 Development of the high scoring propositions concerning OO related concepts...... 180

Fig. 30 Development of the low scoring propositions (misconceptions) concerning OO related concepts...... 181

Fig. 31 Overview over the study, based on the schema of Fig. 1...... 190

Fig. 32 A digitalized example map originally produced by one of the students in German...... 192

Fig. 33 The Pathfinder network of all the maps...... 194

Fig. 34 The Pathfinder network of the first cluster identified by the PAM algorithm...... 196

Fig. 35 The Pathfinder network of the second cluster identified by the PAM algorithm...... 197

Fig. 36 The Pathfinder networks of those students who attended the compulsory school subject...... 200

Fig. 37 The Pathfinder networks of students with no compulsory school subject...... 201

Fig. 38 Overview over the study, based on the schema of Fig. 1...... 207

Fig. 39 A scanned example map that was produced by one of the students in German...... 208

Fig. 40 Development of relevant structural knowledge over all participants. Shown is the fraction of maps with at least one “correct” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test)...... 210

Fig. 41 Development of structural knowledge dependent on prior programming experience. Shown is the percentage of maps that showed at least one “correct” edge (i.e. with a score of 2) incident to a concept (0 = no prior experience, 1 = prior experience)...... 212

Fig. 42 Development of misconceptions dependent on prior programming experience. Shown is the percentage of maps that showed at least one “incorrect” edge (b(eginning) = pre test, e(nd) = post test). 213

Fig. 43 The probabilities of concept occurrence as identified by the MBMM clustering algorithm for the post maps, shown for both clusters. . 214

Fig. 44 Pathfinder networks built from the post maps of the members of each cluster (as shown in Fig. 43). The parameters were set to q = n − 1 and r = ∞. Before the creation, all concepts were removed that weren’t connected in more than half of the maps. . 216

Fig. 45 Difference between conceptual knowledge and programming abilities concerning the concepts of CL (k = knowledge, a = abilities). 221

Fig. 46 The results of this thesis and how they affect the different research areas...... 232

Fig. 47 The second expert map containing all the propositions that were presented between MT1 and MT2...... 247

Fig. 48 The third expert map containing all the propositions that were presented between MT2 and the exam...... 248

Fig. 49 The fourth expert map containing all the propositions that were presented between the exam and POST...... 248

Fig. 50 Development of relevant structural knowledge over all participants. Shown is the difference between post and pre test of the diagram in Fig. 40...... 249

Fig. 51 Development of structural knowledge dependent on prior programming experience. Shown is the percentage of maps that showed at least one “correct” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test)...... 250

Fig. 52 Development of structural knowledge dependent on prior programming experience. Shown is the difference between post and pre test of the diagram in Fig. 51...... 251

Fig. 53 Development of misconceptions dependent on prior programming experience. Shown is the difference between post and pre test of the diagram in Fig. 42...... 252

Fig. 54 Development of misconceptions over all participants. Shown is the percentage of maps that showed at least one “incorrect” edge (i.e. with a score of 2) incident to a concept (b(eginning) = pre test, e(nd) = post test)...... 253

Fig. 55 Development of misconceptions over all participants. Shown is the difference between post and pre test of the diagram in Fig. 54. . . 253

Fig. 56 Development of misconceptions dependent on prior programming experience. Shown is the percentage of maps that showed at least one “incorrect” edge (i.e. with a score of 2) incident to a concept (0 = no prior experience, 1 = prior experience)...... 254

Fig. 57 Difference between conceptual knowledge and programming abilities concerning the concepts of CL (k = knowledge, a = abilities). 255

Fig. 58 The Pathfinder networks (q = n − 1, r = ∞) of the first cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned. . . . 256

Fig. 59 The Pathfinder networks (q = n − 1, r = ∞) of the second cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned. . . . 257

Fig. 60 The Pathfinder networks (q = n − 1, r = ∞) of the third cluster identified by using PAM clustering and graph similarity. Edges appearing in less than 10% of the maps have been pruned. . . . 258

List of Tables

3.1 The levels of the SOLO taxonomy as described by Biggs & Collis (1982, Table 2.1) ...... 42

4.1 Differences between concept maps of experts and novices as identified by Kinchin (2000, Table 5-4) ...... 53

5.1 The possible semantic relations in WordNet (cf. Miller 1995, Table 1)...... 81

6.1 Correlation between different elements of the assessment. Spearman’s rank correlation was used (∗∗ = p < 0.01, ∗ = p < 0.05, () = p > 0.05)...... 108

10.1 Spearman’s rank correlation between the four basic graph measures. As score, the sum of all proposition scores of a map was used (∗∗ = p < 0.01, () = p > 0.05)...... 175

11.1 The communities identified within the Pathfinder network from the maps of the G8 and G9 groups. The ordering of the communities for both groups is arbitrary, but was chosen to allow comparing the groups more easily...... 199

12.1 The 21 concepts that the students should use for drawing the concept maps. The abbreviations are used later for the code analysis and in the diagrams of the results...... 209

12.2 The dichotomous items used to analyze the code with regard to the application of the concepts of CL. The concepts are given as the abbreviations found in Table 12.1...... 218

12.3 The items of CI and the percentile of projects that showed the respective property...... 219

References

Åhlberg, Mauri (2004): Varieties of concept mapping, in A. J. Cañas, J. D. Novak, F. M. Gonzáles, A. Cañas & F. M. González García (eds), Concept maps: Theory, methodology, technology: Proceedings of the First International Conference on Concept Mapping, Pamplona, Spain, Sept 14-17 2004, Vol. 2, pp. 25–29.

Akaike, H. (1974): A new look at the statistical model identification, IEEE Transactions on Automatic Control 19(6): 716–723.

Akinsanya, C. & Williams, M. (2004): Concept mapping for meaningful learning, Nurse Education Today 24(1): 41–46.

Al-Kunifed, Ali & Wandersee, James H. (1990): One hundred references related to concept mapping, Journal of Research in Science Teaching 27(10): 1069– 1075.

Albert, D. & Steiner, C.M (2005): Empirical validation of concept maps: preliminary methodological considerations, in P. Goodyear, D. G. Sampson, D. J.-T. Yang, Kinshuk, T. Okamoto, R. Hartley & N.-S. Chen (eds), Proceedings of the Fifth IEEE International Conference on Advanced Learning Technologies, Kaohsiung, Taiwan, July 5 - 8, 2005, IEEE Computer Society Press, Los Alamitos, pp. 952–953.

Altenmüller, E., Gruhn, W., Parlitz, D. & Liebert, G. (2000): The impact of music education on brain networks: evidence from eeg-studies, International Journal of Music Education 35(1): 47–53.

Anderson, Lorin W. & Krathwohl, David R. (2001): A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives, complete ed., Longman, New York.

Anohina-Naumeca, Alla (2012): Determining the set of concept map based tasks for computerized knowledge self-assessment, Procedia - Social and Behavioral Sciences 69(0): 143–152.

Anohina-Naumeca, Alla & Graudina, Vita (2012): Diversity of concept mapping tasks: Degree of difficulty, directedness, and task constraints, in A. J. Cañas, J. D. Novak & J. Vanhear (eds), Concept Maps: Theory, Methodology, Technology: Proceedings of the Fifth International Conference on Concept Mapping, Valletta, Malta, Sept 17-20 2012, Vol. 1, pp. 164–171.

Anohina-Naumeca, Alla, Graudina, Vita & Grundspenkis, Janis (2012): Curricula comparison using concept maps and ontologies, in E. Stalidzans (ed.), Proceedings of the 5th International Scientific Conference on Applied Information and Communication Technologies, Jelgava, Latvia, April 26-27 2012, Faculty of Information Technologies, Latvia University of Agriculture, Jelgava, pp. 177–183.

Anohina-Naumeca, Alla & Grundspenkis, Janis (2009): Scoring concept maps: an overview, in B. Rachev & A. Smrikarov (eds), Proceedings of the International Conference on Computer Systems and Technologies and Workshop for PhD Students in Computing, Ruse, Bulgaria, June 18 - 19, 2009, Vol. 433 of ACM International Conference Proceeding Series, ACM, New York, pp. 78:1–78:6.

Anohina-Naumeca, Alla, Grundspenkis, Janis & Strautmane, Maija (2011): The concept map-based assessment system: functional capabilities, evolution, and experimental results, Int. J. Cont. Engineering Education and Life-Long Learning 21(4): 308–327.

Armstrong, Deborah J. (2006): The quarks of object-oriented development, Commun. ACM 49(2): 123–128.

Ausubel, David Paul (1968): Educational psychology: A cognitive view, Holt, Rinehart et Winston, Montreal.

Ausubel, David Paul (2000): The acquisition and retention of knowledge: A cognitive view, Kluwer Academic Publishers, Dordrect and Boston.

Baker, Ryan S. J. D. (2010): Data mining, in P. Peterson (ed.), International encyclopedia of education, Elsevier, Oxford, pp. 112–118.

Baker, Ryan S. J. D. & Yacef, Kalina (2009): The state of educational data mining in 2009: A review and future visions, JEDM - Journal of Educational Data Mining 1(1): 3–17.

Balakrishnan, R. & Ranganathan, K. (2012): A Textbook of Graph Theory, Universitext, 2 ed., Springer, New York.

Bartholomew, David J., Steele, Fiona, Moustaki, Irini & Galbraith, Jane I. (2008): Analysis of Multivariate Social Science Data, Chapman & Hall/CRC, Boca Raton.

Berges, Marc, Mühling, Andreas & Hubwieser, Peter (2012): The gap between knowledge and ability, in M.-J. Laakso (ed.), Proceedings of the 12th Koli Calling International Conference on Computing Education Research, Koli, Finnland, November 17-15 2012, ACM, New York, pp. 126–134.

Berges, Marc, Mühling, Andreas, Hubwieser, Peter & Steuer, Horst (2013): Informatik für Nichtinformatiker: ein kontext- und praxisorientiertes Konzept, in P. Forbrig, D. Rick & A. Schmolitzky (eds), HDI 2012 – Informatik für eine nachhaltige Zukunft, 5. Fachtagung Hochschuldidaktik der Informatik, Hamburg, Germany, November 6-7 2012, Vol. 5 of Commentarii informaticae didacticae, Universitätsverlag Potsdam, Potsdam, pp. 105–110.

Bergin, Joe, Bruce, Kim & Kölling, Michael (2005): Objects-early tools: A demonstration, SIGCSE Bull 37(1): 390–391.

Biggs, John Burville & Collis, Kevin F. (1982): Evaluating the quality of learning: The SOLO taxonomy (structure of the observed learning outcome), Educational psychology, Academic Press, New York.

Biggs, John Burville & Tang, Catherine So-kum (2011): Teaching for quality learning at university, 4 ed., Open University Press, Maidenhead.

Black, P. & Wiliam, D. (1998): Assessment and classroom learning, Assessment in Education: Principles, Policy & Practice 5(1): 7–74.

Bloom, Benjamin Samuel (1956): Taxonomy of educational objectives: The classification of educational goals, David McKay, New York.

Börstler, Jürgen, Christensen, Henrik B., Bennedsen, Jens, Nordström, Marie, Kallin Westin, Lena, Moström, Jan Erik & Caspersen, Michael E. (2008): Evaluating oo example programs for cs1, in J. Amillo, C. Laxer, E. M. Ruiz & A. Young (eds), Proceedings of the 13th annual conference on Innovation and technology in computer science education, Madrid, Spain, June 30-July 2, 2008, ITiCSE ’08, ACM, New York, pp. 47–52.

Bortz, Jürgen (2005): Statistik: Für Human- und Sozialwissenschaftler, Springer, Heidelberg.

Breier, Norbert & Hubwieser, Peter (2002): An information–oriented approach to informatical education, Informatics in education 1(1): 31–42.

Cañas, Alberto J. (2008): A semantic scoring rubric for concept maps: Design and reliability, in A. J. Cañas, P. Reiska, M. Åhlberg & J. D. Novak (eds), Concept Mapping: Connecting Educators: Proceedings of the Third International Conference on Concept Mapping, Tallinn, Estonia & Helsinki, Finnland, September 22-25 2008, Vol. 1, Tallinn University, Estonia, pp. 60–67.

Cañas, Alberto J., Bunch, Larry & Reiska, P. (2010): Cmapanalysis: An extensible concept map analysis tool, in J. Sánchez, A. J. Cañas & J. D. Novak (eds), Concept Maps: Making Learning Meaningful: Proceedings of the Fourth International Conference on Concept Mapping, Viña del Mar, Chile, October 5-7 2010, Vol. 1, Universidad de Chile, Chile, pp. 75–83.

Cañas, Alberto J., Carff, Roger, Hill, Greg, Carvalho, Marco, Arguedas, Marco, Eskridge, Tom, Lott, James & Carvajal, Rodrigo (2005): Concept maps: Integrating knowledge and information visualization, in S.-O. Tergan & T. Keller (eds), Knowledge and Information Visualization, Vol. 3426 of Lecture notes in computer science, Springer Berlin Heidelberg, pp. 205–219.

Cañas, Alberto J., Hill, Greg, Carff, Roger, Suri, Niranjan, Lott, James & Gómez, Gloria (2004): Cmaptools: A knowledge modeling and sharing environment, in A. J. Cañas, J. D. Novak, F. M. Gonzáles, A. Cañas & F. M. González García (eds), Concept maps: Theory, methodology, technology: Proceedings of the First International Conference on Concept Mapping, Pamplona, Spain, Sept 14-17 2004, Vol. 1, pp. 125–134.

Cañas, Alberto J. & Novak, Joseph D. (2006): Re-examining the foundations for effective use of concept maps, in A. J. Cañas & J. D. Novak (eds), Concept Maps: Theory, Methodology, Technology: Proceedings of the Second International Conference on Concept Mapping, San José, Costa Rica, September 5-6 2006, Vol. 1, Universidad de Costa Rica, San José and Costa Rica, pp. 494–502.

Cañas, Alberto J. & Novak, Joseph D. (2012): Freedom vs. restriction of content and structure during concept mapping - possibilities and limitations for construction and assessment, in A. J. Cañas, J. D. Novak & J. Vanhear (eds), Concept Maps: Theory, Methodology, Technology: Proceedings of the Fifth International Conference on Concept Mapping, Valletta, Malta, Sept 17-20 2012, Vol. 2, pp. 247–257.

Cañas, Alberto J., Valerio, Alejandro, Lalinde-Pulido, Juan, Carvalho, Marco & Arguedas, Marco (2003): Using wordnet for word sense disambiguation to support concept map construction, in M. A. Nascimento, E. S. Moura & A. L. Oliveira (eds), String Processing and Information Retrieval, Vol. 2857 of Lecture notes in computer science, Springer Berlin Heidelberg, pp. 350–359.

Chen, Chaomei (2002): Information visualization, Information Visualization 1(1): 1– 4.

Chiu, Chiung-Hui, Huang, Chun-Chieh & Chang, Wen-Tsung (2000): The evaluation and influence of interaction in network supported collaborative concept mapping, Computers & Education 34(1): 17–25.

Clariana, Roy B. & Taricani, Ellen M. (2010): The consequences of increasing the number of terms used to score open ended concept maps, International Journal of Instructional Media 37(2): 163–172.

Clauset, Aaron, Moore, Cristopher & Newman, M. E. J. (2007): Structural inference of hierarchies in networks, in E. Airoldi (ed.), Statistical network analysis: models, issues, and new directions, Vol. 4503 of Lecture notes in computer science, Springer, Berlin, pp. 1–13.

Clauset, Aaron, Newman, M. E. J. & Moore, Cristopher (2004): Finding community structure in very large networks, Physical Review E 70(6): 066111.

Cooke, Nancy Jaworski (1983): Memory structures of expert and novice computer programmers: Recall order vs. similarity ratings, PhD thesis, New Mexico State University, New Mexico.

Cooke, Nancy Jaworski (1990): Using pathfinder as a knowledge elicitation tool: Link interpretation, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 227–239.

Cooke, Nancy Jaworski (1994): Varieties of knowledge elicitation techniques, International Journal of Human-Computer Studies 41(6): 801–849.

Cormen, Thomas H., Leiserson, Charles Eric, Rivest, Ronald L. & Stein, Clifford (2001): Introduction to algorithms, 2nd ed., MIT Press and McGraw-Hill, Cambridge and New York.

Daley, B. J., Conceicao, S. C. O., Mina, L., Altman, B. A., Baldor, M. & Brown, J. (2010): Integrative literature review: Concept mapping: A strategy to support the development of practice, research, and theory within human resource development, Human Resource Development Review 9(4): 357–384.

Dayton, Tom, Durso, Francis T. & Shepard, Jack D. (1990): A measure of the knowledge reorganization under insight, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 267–277.

de Jong, Ton & Ferguson-Hessler, Monica G.M. (1996): Types and qualities of knowledge, Educational Psychologist 31(2): 105–113.

Dearholt, Donald W. & Schvaneveldt, Roger W. (1990): Properties of pathfinder networks, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 1–30.

Dempster, A. P., Laird, N. M. & Rubin, D. B. (1977): Maximum likelihood from incomplete data via the em algorithm, Journal of the Royal Statistical Society. Series B (Methodological) 39(1): 1–38.

Derbentseva, Natalia, Safayeni, Frank & Cañas, Alberto J. (2007): Concept maps: Experiments on dynamic thinking, Journal of Research in Science Teaching 44(3): 448–465.

Dierkes, Meinolf (2001): Handbook of organizational learning and knowledge, Oxford University Press, Oxford.

Diethelm, Ira, Hubwieser, Peter & Klaus, Robert (2012): Students, teachers and phenomena: Educational reconstruction for computer science education, in M.- J. Laakso (ed.), Proceedings of the 12th Koli Calling International Conference on Computing Education Research, Koli, Finland, November 15-17 2012, ACM, New York, pp. 164–173.

Doignon, Jean-Paul & Falmagne, Jean-Claude (1985): Spaces for the assessment of knowledge, International Journal of Man-Machine Studies 23: 175–196.

Duit, Reinders & Treagust, David F. (2003): Conceptual change: A powerful framework for improving science teaching and learning, International Journal of Science Education 25(6): 671–688.

Durso, Francis T. & Coggins, Kathy A. (1990): Graphs in the social and psychological sciences: Empirical contributions of pathfinder, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 31–51.

Eckerdal, Anna (2006): Novice Students’ Learning of Object-Oriented Programming, PhD thesis, Uppsala Universitet, Uppsala.

Eckerdal, Anna, McCartney, Robert, Moström, Jan Erik, Ratcliffe, Mark, Sanders, Kate & Zander, Carol (2006): Putting threshold concepts into context in computer science education, SIGCSE Bull 38(3): 103–107.

Eckert, Andreas (2000): Die Netzwerk-Elaborierungs-Technik (NET) - Ein computerunterstütztes Verfahren zur Diagnose komplexer Wissensstrukturen, in H. Mandl & F. Fischer (eds), Wissen sichtbar machen, Hogrefe, Göttingen, pp. 137–157.

Edmondson, Katherine M. (2005): Assessing science understanding through concept maps, in J. J. Mintzes (ed.), Assessing science understanding, Elsevier, Amsterdam, pp. 15–40.

Ertl, Bernhard & Mok, Sog Yee (2010): Concept-Mapping im Informatikunterricht, LOG IN (166/167): 63–68.

Eynard, Davide, Marfia, Fabio & Matteucci, Matteo (2010): On the use of correspondence analysis to learn seed ontologies from text, in J. Filipe & J. L. G. Dietz (eds), Proceedings of the International Conference on Knowledge Engineering and Ontology Development, Valencia, Spain, October 25-28, 2010, SciTePress, pp. 430–439.

Fellbaum, Christiane (1999): WordNet: An electronic lexical database, Language, speech, and communication, 2nd ed., MIT Press, Cambridge.

Fincher, Sally & Petre, Marian (2004): Computer science education research, RoutledgeFalmer, London and New York.

Forte, Andrea & Guzdial, Mark (2004): Computers for communication, not calculation: Media as a motivation and context for learning, in R. H. Sprague (ed.), Proceedings of the 37th Hawaii International Conference on System Sciences (CD-ROM), 4 - 7 January 2004, IEEE, Piscataway.

Freeman, Linton C. (1978): Centrality in social networks: Conceptual clarification, Social Networks 1(1): 215–239.

Fuller, Ursula, Johnson, Colin G., Ahoniemi, Tuukka, Cukierman, Diana, Hernán-Losada, Isidoro, Jackova, Jana, Lahtinen, Essi, Lewis, Tracy L., Thompson, Donna McGee, Riedesel, Charles & Thompson, Errol (2007): Developing a computer science-specific learning taxonomy, Working group reports on Innovation and technology in computer science education, ITiCSE-WGR ’07, ACM, New York, pp. 152–170.

Gammack, John G. (1990): Expert conceptual structure: The stability of pathfinder representations, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 212–226.

Geber, Beryl A. (2006): Piaget and knowing: Studies in genetic epistemology, Routledge, London.

Glaser, Robert & Bassok, Miriam (1989): Learning theory and the study of instruction, Annual Review of Psychology 40: 631–666.

Glasersfeld, Ernst von (1983): Learning as a constructive activity, in J. C. Bergeron (ed.), Proceedings of the 5th Annual Meeting of the North American Group of Psychology in Mathematics Education, Montreal, pp. 41–101.

Glasersfeld, Ernst von (1989): Constructivism in education, in T. Husén & T. N. Postlethwaite (eds), The international encyclopedia of education, Pergamon Press, Oxford, pp. 162–163.

Glöggler, Karl (1997): Handlungsorientierter Unterricht im Berufsfeld Elektrotechnik: Untersuchung einer Konzeption in der Berufsschule und Ermittlung der Veränderung expliziten Handlungswissens, Vol. 16 of Beiträge zur Arbeits-, Berufs- und Wirtschaftspädagogik, P. Lang, Frankfurt am Main and New York.

Goldsmith, Timothy E. & Davenport, Daniel M. (1990): Assessing structural similarity of graphs, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 74–87.

Goldsmith, Timothy E. & Johnson, Peder J. (1990): A structural assessment of classroom learning, in R. W. Schvaneveldt (ed.), Pathfinder associative networks, Ablex Pub. Corp., Norwood, pp. 240–254.

Goldstein, Eugen Bruce & Vanhorn, Daniel (2011): Cognitive psychology, 3rd ed., Wadsworth Cengage Learning, Australia and Belmont.

Goldstone, Robert L. & Kersten, Alan (2003): Concepts and categorization, in I. B. Weiner (ed.), Handbook of psychology, Vol. 4, John Wiley, Hoboken, pp. 599–621.

Gordon, A. D. (1999): Classification, Vol. 82 of Monographs on statistics and applied probability, 2nd ed., Chapman & Hall/CRC, Boca Raton.

Gouli, Evangelia (2007): Concept Mapping in Didactics of Informatics. Assessment as a Tool for Learning in Web-based and Adaptive Educational Environments, PhD thesis, National and Kapodistrian University of Athens, Athens.

Gouli, Evangelia, Gogoulou, Agoritsa, Papanikolaou, Kyparisia A. & Grigoriadou, Maria (2005a): How to qualitatively + quantitatively assess concepts maps: the case of compass, in C.-K. Looi (ed.), Artificial intelligence in education, IOS Press, Amsterdam, pp. 804–806.

Gouli, Evangelia, Gogoulou, Agoritsa, Papanikolaou, Kyparisia & Grigoriadou, Maria (2005b): Evaluating learner’s knowledge level on concept mapping tasks, in P. Goodyear, D. G. Sampson, D. J.-T. Yang, Kinshuk, T. Okamoto, R. Hartley & N.-S. Chen (eds), Proceedings of the Fifth IEEE International Conference on Advanced Learning Technologies, Kaohsiung, Taiwan, July 5 - 8, 2005, IEEE Computer Society Press, Los Alamitos and California, pp. 424–428.

Gries, David (2008): A principled approach to teaching oo first, SIGCSE Bull 40(1): 31–35.

Grundspenkis, Janis & Strautmane, Maija (2010): Usage of graph patterns for knowledge assessment based on concept maps, Scientific Journal of Riga Technical University. Computer Sciences 38(39): 60–71.

Guerrero-Bote, Vicente P., Zapico-Alonso, Felipe, Espinosa-Calvo, María Eugenia, Gómez Crisóstomo, Rocío & Moya-Anegón, Félix de (2006): Binary pathfinder: An improvement to the pathfinder algorithm, Information Processing & Management 42(6): 1484–1490.

Gureckis, Todd M. & Goldstone, Robert L. (2011): Schema, in P. C. Hogan (ed.), The Cambridge encyclopedia of the language sciences, Cambridge Univ. Press, Cambridge, pp. 725–726.

Gurlitt, Johannes & Renkl, Alexander (2010): Prior knowledge activation: how different concept mapping tasks lead to substantial differences in cognitive processes, learning outcomes, and perceived self-efficacy, Instructional Science 38(4): 417–433.

Guzdial, Mark (2003): A media computation course for non-majors, in V. Dagdilelis, M. Satratzemi, R. Boyle, G. Evangelidis & D. Finkel (eds), Proceedings of the 8th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education, Thessaloniki, Greece, June 30-July 2, 2003, Association for Computing Machinery, New York, pp. 104–108.

Han, Jiawei & Kamber, Micheline (2010): Data mining: Concepts and techniques, The Morgan Kaufmann series in data management systems, 2nd ed., Elsevier/Morgan Kaufmann, Amsterdam.

Hay, David B. & Kinchin, Ian M. (2006): Using concept maps to reveal conceptual typologies, Education + Training 48(2/3): 127–142.

Hay, David B., Kinchin, Ian M. & Lygo-Baker, Simon (2008): Making learning visible: the role of concept mapping in higher education, Studies in Higher Education 33(3): 295–311.

Hoffman, Robert R., Shadbolt, Nigel R., Burton, A. Mike & Klein, Gary (1995): Eliciting knowledge from experts: A methodological analysis, Organizational Behavior and Human Decision Processes 62(2): 129–158.

Holme, P. (2003): Network dynamics of ongoing social relationships, EPL (Europhysics Letters) 64: 427–433.

Hubwieser, Peter (2007): A smooth way towards object oriented programming in secondary schools, in D. Benzie & M. Iding (eds), Informatics, Mathematics and ICT: a ’golden triangle’, College of Computer and Information Science, Northeastern University, Boston.

Hubwieser, Peter (2008): Analysis of learning objectives in object oriented programming, in R. T. Mittermeir & M. M. Sysło (eds), Informatics Education - Supporting Computational Thinking, Vol. 5090 of Lecture notes in computer science, Springer Berlin Heidelberg, pp. 142–150.

Hubwieser, Peter (2012): Computer science education in secondary schools – the introduction of a new compulsory subject, ACM Transactions on Computing Education 12(4): 16:1–16:41.

Hubwieser, Peter & Berges, Marc (2011): Minimally invasive programming courses: learning oop with(out) instruction, in T. J. Cortina, E. L. Walker, L. S. King, D. R. Musicant & L. I. McCann (eds), Proceedings of the 42nd ACM technical symposium on Computer science education, Dallas, USA, March 9-12 2011, ACM, New York, pp. 87–92.

Hubwieser, Peter, Magenheim, Johannes, Mühling, Andreas & Ruf, Alexander (2013): Towards a conceptualization of pedagogical content knowledge for computer science, in B. Simon, A. Clear & Q. I. Cutts (eds), Proceedings of the Ninth Annual International ACM Conference on International Computing Education Research, La Jolla, CA, USA, August 12-14, 2013, ACM, New York, pp. 1–8.

Hubwieser, Peter & Mühling, Andreas (2011a): Investigating cognitive structures of object oriented programming, Proceedings of the 16th annual joint conference on Innovation and technology in computer science education, Darmstadt, Germany, June 27-29 2011, ACM, New York, p. 377.

Hubwieser, Peter & Mühling, Andreas (2011b): Knowpats: Patterns of declarative knowledge - searching frequent knowledge patterns about object-orientation, in J. Filipe & Ana L. N. Fred (eds), KDIR 2011 - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, Paris, France, 26-29 October, 2011, SciTePress, pp. 358–364.

Hubwieser, Peter & Mühling, Andreas (2011c): What students (should) know about object oriented programming, in K. Sanders, M. E. Caspersen & A. Clear (eds), Proceedings of the Seventh International Workshop on Computing Education Research, Rhode Island, USA, August 8-9, 2011, ACM, New York, pp. 77–84.

Ifenthaler, Dirk (2006): Diagnose lernabhängiger Veränderung mentaler Modelle, PhD thesis, Albert-Ludwigs-Universität, Freiburg.

İngeç, Şebnem Kandil (2009): Analysing concept maps as an assessment tool in teaching physics and comparison with the achievement tests, International Journal of Science Education 31(14): 1897–1915.

Jarvis, Peter (1992): Paradoxes of learning: On becoming an individual in society, The Jossey-Bass social and behavioral science series, 1st ed., Jossey-Bass, San Francisco.

Jarvis, Peter (2012): Adult learning in the social context, Vol. 78 of Routledge library editions, Education, Routledge, London and New York.

Karagiorgi, Yiasemina & Symeou, Loizos (2005): Translating constructivism into instructional design: Potential and limitations, Educational Technology & Society 8(1): 17–27.

Karakuyu, Yunus (2010): The effect of concept mapping on attitude and achievement in a physics course, International Journal of the Physical Sciences 5(6): 724–737.

Kaufman, Leonard & Rousseeuw, Peter J. (2005): Finding groups in data: An introduction to cluster analysis, Wiley-Interscience paperback series, Wiley, Hoboken.

Kaufmann, Michael & Wagner, Dorothea (2001): Drawing graphs: Methods and models, Vol. 2025 of Lecture notes in computer science, Springer, Berlin and New York.

Keppens, Jeroen & Hay, David (2008): Concept map assessment for teaching computer programming, Computer Science Education 18(1): 31–42.

Kern, Cindy & Crippen, Kent J. (2008): Mapping for conceptual change, Science Teacher 75(6): 32–38.

Kinchin, Ian & Hay, David (2007): The myth of the research-led teacher, Teachers and Teaching 13(1): 43–61.

Kinchin, Ian M. (2000): The active use of concept mapping to promote meaningful learning in biological science, PhD thesis, University of Surrey, Guildford.

Kinchin, Ian M. (2011): Visualising knowledge structures in biology: discipline, curriculum and student understanding, Journal of Biological Education 45(4): 183–189.

Kinchin, Ian M. (2013): Concept mapping and the fundamental problem of moving between knowledge structures, Journal for Educators, Teachers and Trainers 4(1): 96–106.

Kinchin, Ian M. & Cabot, L. B. (2009): An introduction to concept mapping in dental education: the case of partial denture design, European Journal of Dental Education 13(1): 20–27.

Kinchin, Ian M., Cabot, L. B., Kobus, M. & Woolford, M. (2011): Threshold concepts in dental education, European Journal of Dental Education 15(4): 210–215.

Kinchin, Ian M., Hay, David B. & Adams, Alan (2000): How a qualitative approach to concept map analysis can be used to aid learning by illustrating patterns of conceptual development, Educational Research 42(1): 43–57.

Kirkpatrick, S., Gelatt, Jr. C. D. & Vecchi, M. P. (1983): Optimization by simulated annealing, Science 220(4598): 671–680.

Klieme, Eckhard, Hartig, Johannes & Rauch, Dominique (2008): The concept of competence in educational contexts, in E. Klieme, D. Leutner & J. Hartig (eds), Assessment of competencies in educational contexts, Hogrefe & Huber Publishers, Toronto.

Knobelsdorf, Maria (2011): Biographische Lern- und Bildungsprozesse im Handlungskontext der Computernutzung, PhD thesis, Freie Universität Berlin, Berlin.

Kolb, D. A. (1984): Experiential learning: experience as the source of learning and development, Prentice Hall, Englewood Cliffs.

Kolb, David Allen & Fry, Ronald E. (1975): Toward an applied theory of experiential learning, in C. L. Cooper (ed.), Theories of group processes, Wiley series on individuals, groups and organizations, Wiley, London, pp. 33–110.

Koponen, Ismo T. & Pehkonen, Maija (2010): Entropy and energy in characterizing the organization of concept maps in learning science, Entropy 12(7): 1653–1672.

Kornilakis, Harry, Grigoriadou, Maria, Papanikolaou, Kyparisia A. & Gouli, Evangelia (2004): Using wordnet to support interactive concept map construction, in Kinshuk, C.-K. Looi, E. Sutinen, D. G. Sampson, I. Aedo, L. Uden & E. Kähkönen (eds), Proceedings of the IEEE International Conference on Advanced Learning Technologies, Joensuu, Finland, 30 August - 1 September 2004, IEEE Computer Society, Los Alamitos, pp. 600–604.

Koul, Ravinder, Clariana, Roy B. & Salehi, Roya (2005): Comparing several human and computer-based methods for scoring concept maps and essays, Journal of Educational Computing Research 32(3): 227–239.

Kwon, So Young & Cifuentes, Lauren (2009): The comparative effect of individually- constructed vs. collaboratively-constructed computer-based concept maps, Computers & Education 52(2): 365–375.

Larraza-Mendiluze, Edurne & Garay-Vitoria, Nestor (2013): Use of concept maps to analyze students’ understanding of the i/o subsystem, in M.-J. Laakso & Simon (eds), Proceedings of the 13th Koli Calling International Conference on Computing Education Research, Koli, Finland, November 14-17 2013, Koli Calling ’13, ACM, New York, pp. 67–76.

Lawler, Eugene L. (2001): Combinatorial optimization: Networks and matroids, Dover Publications, Mineola.

Lawson, Richard G. & Jurs, Peter C. (1990): New index for clustering tendency and its application to chemical problems, Journal of Chemical Information and Computer Sciences 30(1): 36–41.

Leake, David B., Maguitman, Ana & Cañas, Alberto J. (2002): Assessing conceptual similarity to support concept mapping, in S. M. Haller & G. Simmons (eds), Proceedings of the Fifteenth International Florida Artificial Intelligence Research Society Conference, Pensacola Beach, USA, May 14-16 2002, AAAI Press, pp. 168–172.

Leake, David B., Maguitman, Ana & Reichherzer, Thomas (2005): Understanding knowledge models: Modeling assessment of concept importance in concept maps, in K. Forbus, D. Gentner & T. Regier (eds), Proceedings of the Twenty-Sixth Annual Conference of the Cognitive Science Society, Chicago, USA, August 4-7 2004, Lawrence Erlbaum Associates, Mahwah, pp. 785–800.

Leake, David B., Reichherzer, Thomas, Cañas, Alberto J., Carvalho, Marco & Eskridge, Tom (2004): “googling” from a concept map: Towards automatic concept-map-based query formation, in A. J. Cañas, J. D. Novak, F. M. Gonzáles, A. Cañas & F. M. González García (eds), Concept maps: Theory, methodology, technology: Proceedings of the First International Conference on Concept Mapping, Pamplona, Spain, Sept 14-17 2004, Vol. 1, pp. 409–416.

Linck, B., Ohrndorf, L., Schubert, S., Stechert, P., Magenheim, J., Nelles, W., Neugebauer, J. & Schaper, N. (2013): Competence model for informatics modelling and system comprehension, 2013 IEEE Global Engineering Education Conference, Berlin, Germany, March 13-15 2013, IEEE, Piscataway, pp. 85–93.

Ma, X. (2010): Longitudinal evaluation designs, in P. Peterson (ed.), International encyclopedia of education, Elsevier, Oxford, pp. 757–764.

Malmi, Lauri, Sheard, Judy, Simon, Bednarik, Roman, Helminen, Juha, Korhonen, Ari, Myller, Niko, Sorva, Juha & Taherkhani, Ahmad (2010): Characterizing research in computing education: a preliminary analysis of the literature, Proceedings of the Sixth international workshop on Computing education research, Aarhus, Denmark, August 9-10 2010, ACM, New York, pp. 3–12.

Mandl, Heinz & Fischer, Frank (2000): Mapping-Techniken und Begriffsnetze in Lern- und Kooperationsprozessen, in H. Mandl & F. Fischer (eds), Wissen sichtbar machen, Hogrefe, Göttingen, pp. 3–12.

Mazur, Eric (1996): Are science lectures a relic of the past?, Physics World 9: 13–14.

McClelland, David C. (1973): Testing for competence rather than for “intelligence”, American Psychologist 28(1): 1–14.

McClure, John R., Sonak, Brian & Suen, Hoi K. (1999): Concept map assessment of classroom learning: Reliability, validity, and logistical practicality, Journal of Research in Science Teaching 36(4): 475–492.

Mehta, Dinesh P. (2005): Handbook of data structures and applications, computer and information science series, Chapman & Hall/CRC, Boca Raton.

Meyer, Jan & Land, Ray (2006): Overcoming barriers to student understanding: Threshold concepts and troublesome knowledge, Routledge, London and New York.

Miller, George A. (1995): Wordnet: a lexical database for english, Communications of the ACM 38(11): 39–41.

Miller, Norma L. & Cañas, Alberto J. (2008): Effect of the nature of the focus question on presence of dynamic propositions in a concept map, in A. J. Cañas, P. Reiska, M. Åhlberg & J. D. Novak (eds), Concept Mapping: Connecting Educators: Proceedings of the Third International Conference on Concept Mapping: Tallinn, Estonia & Helsinki, Finland: September 22-25 2008, Vol. 1, Tallinn University, Estonia, pp. 365–372.

Mistades, Voltaire Mallari (1999): Concept mapping in introductory physics, Journal of Education and Human Development 3(1).

Marcus, Mitchell P., Santorini, Beatrice & Marcinkiewicz, Mary Ann (1993): Building a large annotated corpus of english: The penn treebank, Computational Linguistics 19(2): 313–330.

Montemurro, Marcelo A. & Zanette, Damián H. (2013): Keywords and co-occurrence patterns in the voynich manuscript: An information-theoretic analysis, PLoS ONE 8(6): e66344.

Mühling, Andreas, Hubwieser, Peter & Brinda, Torsten (2010): Exploring teachers’ attitudes towards object oriented modelling and programming in secondary schools, Proceedings of the Sixth international workshop on Computing education research, Aarhus, Denmark, August 9-10 2010, ACM, New York, pp. 59–68.

Müller, P. & Hubwieser, P. (2000): Informatics as a mandatory subject at secondary schools in bavaria, Open Classrooms in the Digital Age - Cyberschools, e-learning and the scope of (r)evolution, EDEN, Budapest, pp. 13–17.

Nesbit, John C. & Adesope, Olusola O. (2006): Learning with concept and knowledge maps: A meta analysis, Review of Educational Research 76(3): 413–448.

Nicoll, Gayle (2001): A three-tier system for assessing concept map links: a methodological study, International Journal of Science Education 23(8): 863–875.

Novak, Joseph D. (2002): Meaningful learning: The essential factor for conceptual change in limited or inappropriate propositional hierarchies leading to empowerment of learners, Science Education 86(4): 548–571.

Novak, Joseph D. (2010): Learning, Creating, and Using Knowledge: Concept Maps As Facilitative Tools in Schools and Corporations, 2nd ed., Routledge, London.

Novak, Joseph D. & Cañas, Alberto J. (2008): The theory underlying concept maps and how to construct and use them, Technical Report IHMC CmapTools 2006-01 Rev 01-2008, Institute for Human and Machine Cognition, Florida.

Novak, Joseph D. & Cañas, Alberto J. (2010): The universality and ubiquitousness of concept maps, in J. Sánchez, A. J. Cañas & J. D. Novak (eds), Concept Maps: Making Learning Meaningful: Proceedings of the Fourth International Conference on Concept Mapping, Viña del Mar, Chile, October 5-7, 2010, Vol. 1, Universidad de Chile, Chile, pp. 1–13.

Novak, Joseph D. & Gowin, Bob (1984): Learning how to learn, Cambridge University Press, Cambridge and New York.

Novak, Joseph D. & Musonda, Dismas (1991): A twelve-year longitudinal study of science concept learning, American Educational Research Journal 28(1): 117–153.

O’Connor, Debra L., Johnson, Tristan E. & Khalil, Mohammed (2004): Measuring team cognition: Concept mapping elicitation as a means of constructing team shared mental models in an applied setting, in A. J. Cañas, J. D. Novak, F. M. Gonzáles, A. Cañas & F. M. González García (eds), Concept maps: Theory, methodology, technology: Proceedings of the First International Conference on Concept Mapping, Pamplona, Spain, Sept 14-17 2004, Vol. 1, pp. 487–494.

Ozdemir, A. (2005): Analyzing concept maps as an assessment (evaluation) tool in teaching mathematics, Journal of Social Sciences 1(3): 141–149.

Özdemir, Gökhan & Clark, Douglas B. (2007): An overview of conceptual change theories, Eurasia Journal of Mathematics, Science & Technology Education 3(4): 351–361.

Passmore, Graham James (1999): Concept Maps and the Processes of Comprehension: Explicating Cognition and Metacognition, Structural Knowledge and Procedural Knowledge, PhD thesis, University of Toronto, Toronto.

Patil, Kaustubh & Brazdil, Pavel (2007): Sumgraph: Text summarization using centrality in the pathfinder network, IADIS International Journal on Computer Science and Information Systems 2(1): 18–32.

Piaget, Jean (1929): The child’s conception of the world, International library of psychology, philosophy, and scientific method, Routledge & K. Paul, London.

PISA 2009 Technical Report (2012): PISA, OECD Publishing, Paris.

Pons, Pascal & Latapy, Matthieu (2006): Computing communities in large networks using random walks, Journal of Graph Algorithms and Applications 10(2): 191–218.

Posner, George J., Strike, Kenneth A., Hewson, Peter W. & Gertzog, William A. (1982): Accommodation of a scientific conception: Toward a theory of conceptual change, Science Education 66(2): 211–227.

Quirin, A., Cordón, O., Santamaría, J., Vargas-Quesada, B. & Moya-Anegón, F. (2008): A new variant of the pathfinder algorithm to generate large visual science maps in cubic time, Information Processing & Management 44(4): 1611–1623.

Quirin, Arnaud, Cordón, Oscar, Guerrero-Bote, Vicente P., Vargas-Quesada, Benjamín & Moya-Anegón, Felix (2008): A quick mst-based algorithm to obtain pathfinder networks (∞, n − 1), Journal of the American Society for Information Science and Technology 59(12): 1912–1924.

Ragonis, Noa & Ben-Ari, Mordechai (2005): A long-term investigation of the comprehension of oop concepts by novices, Computer Science Education 15(3): 203–221.

Rajaraman, Anand & Ullman, Jeffrey David (2011): Mining of Massive Datasets, Cambridge University Press, Cambridge.

Reichardt, Jörg & Bornholdt, Stefan (2006): Statistical mechanics of community detection, Physical Review E 74(1): 016110.

Riedl, Alfred (2010): Grundlagen der Didaktik, Pädagogik, 2nd ed., Steiner, Stuttgart.

Rosas, Scott R. & Kane, Mary (2012): Quality and rigor of the concept mapping methodology: A pooled study analysis, Evaluation and Program Planning 35(2): 236–245.

Rousseeuw, Peter J. (1987): Silhouettes: a graphical aid to the interpretation and validation of cluster analysis, Journal of Computational and Applied Mathematics 20: 53–65.

Ruiz-Primo, Maria Araceli (2004): Examining concept maps as an assessment tool, in A. J. Cañas, J. D. Novak, F. M. Gonzáles, A. Cañas & F. M. González García (eds), Concept maps: Theory, methodology, technology: Proceedings of the First International Conference on Concept Mapping, Pamplona, Spain, Sept 14-17 2004, Vol. 1, pp. 555–562.

Ruiz-Primo, Maria Araceli, Schultz, Susan E., Li, Min & Shavelson, Richard J. (2001): Comparison of the reliability and validity of scores from two concept-mapping techniques, Journal of Research in Science Teaching 38(2): 260–278.

Ruiz-Primo, Maria Araceli & Shavelson, Richard J. (1996): Problems and issues in the use of concept maps in science assessment, Journal of Research in Science Teaching 33(6): 569–600.

Ruiz-Primo, Maria Araceli, Shavelson, Richard J. & Schultz, Susan Elise (1997): On the validity of concept map-based assessment interpretations: An experiment testing the assumption of hierarchical concept maps in science, Technical Report CSE Technical Report 455, Center for the Study of Evaluation (CSE), Graduate School of Education & Information Studies, University of California.

Rye, James A. & Rubba, Peter A. (1998): An exploration of the concept map as an interview tool to facilitate the externalization of students’ understandings about global atmospheric change, Journal of Research in Science Teaching 35(5): 521–546.

Sabitzer, Barbara (2011): Neurodidactics: Brain-based ideas for ICT and computer science education, The International Journal of Learning 18(2): 167–177.

Safayeni, Frank, Derbentseva, Natalia & Cañas, Alberto J. (2005): A theoretical note on concepts and the need for cyclic concept maps, Journal of Research in Science Teaching 42(7): 741–766.

Sanders, Kate, Boustedt, J., Eckerdal, A., McCartney, Robert, Moström, Jan Erik, Thomas, Lynda & Zander, C. (2008): Student understanding of object-oriented programming as expressed in concept maps, SIGCSE Bulletin inroads 40(1): 332–336.

Sanders, Kate & Thomas, Lynda (2007): Checklists for grading object-oriented cs1 programs: concepts and misconceptions, Proceedings of the 12th annual SIGCSE conference on Innovation and technology in computer science education, Dundee, UK, June 25-27, 2007, ACM, New York, pp. 166–170.

Schöneberg, Hendrik (2010): Context vector classification - term classification with context evaluation, in A. L. N. Fred & J. Filipe (eds), KDIR 2010 - Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, Valencia, Spain, October 25-28, 2010, SciTePress, pp. 387–391.

Schvaneveldt, Roger W., Dearholt, D. W. & Durso, F. T. (1988): Graph theoretic foundations of pathfinder networks, Computers & Mathematics with Applications 15(4): 337–345.

Schvaneveldt, Roger W., Durso, Francis T. & Dearholt, Donald W. (1989): Network structures in proximity data, The Psychology of Learning and Motivation 24: 249–284.

Schwarz, Gideon (1978): Estimating the dimension of a model, Annals of Statistics 6(2): 461–464.

Schwill, Andreas (1994): Fundamental ideas of computer science, Bull. European Assoc. for Theoretical Computer Science 53: 274–295.

Seidel, Tina & Prenzel, Manfred (2008): Assessment in large-scale studies, in E. Klieme, D. Leutner & J. Hartig (eds), Assessment of competencies in educational contexts, Hogrefe & Huber Publishers, Toronto, pp. 279–304.

Shaw, Mildred L. G. & Woodward, J. Brian (1990): Modeling expert knowledge, Knowledge Acquisition 2(3): 179–206.

Shulman, Lee S. (1986): Those who understand: Knowledge growth in teaching, Educational Researcher 15(2): 4–14.

Solomon, Karen O., Medin, Douglas L. & Lynch, Elizabeth (1999): Concepts do more than categorize, Trends in cognitive sciences 3(3): 99–105.

Sousa, David A. (2009): How the brain learns: A multimedia kit for professional development, 3rd ed., Corwin Press, Thousand Oaks.

Sowa, John F. (1984): Conceptual structures: Information processing in mind and machine, Addison-Wesley, Reading.

Squire, Larry R. (1987): Memory and brain, Oxford University Press, New York.

Stibor, Thomas (2008): Discriminating self from non-self with finite mixtures of multivariate bernoulli distributions, Proceedings of the 10th annual conference on Genetic and evolutionary computation (CD-ROM), Atlanta, USA, July 12-16 2008, ACM, New York.

Strautmane, Maija (2012): Concept map-based knowledge assessment tasks and their scoring criteria: An overview, in A. J. Cañas, J. D. Novak & J. Vanhear (eds), Concept Maps: Theory, Methodology, Technology: Proceedings of the Fifth International Conference on Concept Mapping, Valletta, Malta, Sept 17-20 2012, Vol. 1, pp. 80–88.

Strike, K. A. & Posner, G. J. (1992): A revisionist theory of conceptual change, in R. A. Duschl & R. J. Hamilton (eds), Philosophy of science, cognitive psychology, and educational theory and practice, State University of New York Press, Albany, pp. 147–174.

Taricani, Ellen M. & Clariana, Roy B. (2006): A technique for automatically scoring open-ended concept maps, Educational Technology Research and Development 54(1): 65–82.

Toutanova, Kristina, Klein, Dan, Manning, Christopher D. & Singer, Yoram (2003): Feature-rich part-of-speech tagging with a cyclic dependency network, Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, Canada, May 27 - June 1 2003, pp. 252–259.

Trumpower, David L. & Goldsmith, Timothy E. (2004): Structural enhancement of learning, Contemporary Educational Psychology 29(4): 426–446.

Trumpower, David L., Sharara, Harold & Goldsmith, Timothy E. (2010): Specificity of structural assessment of knowledge, The Journal of Technology, Learning, and Assessment 8(5).

Truong, Nghi, Roe, Paul & Bancroft, Peter (2004): Static analysis of students’ java programs, Proceedings of the Sixth Australasian Conference on Computing Education - Volume 30, ACE ’04, Australian Computer Society, Inc, Darlinghurst, Australia, pp. 317–325.

Tversky, Amos (1977): Features of similarity, Psychological Review 84(4): 327–352.

Valerio, Alejandro, Leake, David B. & Cañas, Alberto J. (2008): Automatic classification of concept maps based on a topological taxonomy and its application to studying features of human-built maps, in A. J. Cañas, P. Reiska, M. Åhlberg & J. D. Novak (eds), Concept Mapping: Connecting Educators: Proceedings of the Third International Conference on Concept Mapping: Tallinn, Estonia & Helsinki, Finland: September 22-25 2008, Vol. 1, Tallinn University, Estonia, pp. 122–129.

Weber, Susanne & Schumann, Matthias (2000): Das Concept Mapping Software Tool (COMASOTO) zur Diagnose strukturellen Wissens, in H. Mandl & F. Fischer (eds), Wissen sichtbar machen, Hogrefe, Göttingen, pp. 158–179.

Weinert, Franz E. (2001): Vergleichende Leistungsmessung in Schulen - eine umstrittene Selbstverständlichkeit, in F. E. Weinert (ed.), Leistungsmessungen in Schulen, Beltz Pädagogik, Beltz-Verl., Weinheim and Basel, pp. 17–31.

Wittrock, Merlin C. (1992): Generative learning processes of the brain, Educational Psychologist 27(4): 531–541.

Wolfe, John H. (1970): Pattern clustering by multivariate mixture analysis, Multivariate Behavioral Research 5: 329–350.

Yin, Yue, Vanides, Jim, Ruiz-Primo, Maria Araceli, Ayala, Carlos C. & Shavelson, Richard J. (2005): Comparison of two concept-mapping techniques: Implications for scoring, interpretation, and use, Journal of Research in Science Teaching 42(2): 166–184.

Yoo, Jin Soung & Cho, Moon-Heum (2012): Mining concept maps to understand university students’ learning, in K. Yacef, O. Zaïane, H. Hershkovitz, M. Yudelson & J. Stamper (eds), Proceedings of the 5th International Conference on Educational Data Mining, Chania, Greece, June 19-21 2012, pp. 184–187.

Zimmaro, Dawn, Zappe, Stephen M., Parkes, Jay T. & Suen, Hoi K. (1999): Validation of concept maps as a representation of structural knowledge, Annual Meeting of the American Educational Research Association, Montreal.
