Modeling Problem Solving Times in Tutoring Systems
MASARYKOVA UNIVERZITA
FAKULTA INFORMATIKY

PhD Thesis

Petr Jarušek

Brno, 2013

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Petr Jarušek

Advisor: prof. RNDr. Ivana Černá, CSc.

Acknowledgement

I would like to thank my advisor Ivana Černá for her guidance and support during my PhD studies. It was my pleasure to work with her.

I am deeply grateful to my consultant Radek Pelánek for many things. Firstly, since he has been my advisor since my master's thesis, I am really grateful for all the knowledge and the critical and analytical worldview he has shared with me. He has broadened my horizons, shaped many of my opinions and changed my mindset in certain areas. But not only that. I am grateful for all the PhD years when we struggled with our rather experimental research, many times failing, sometimes succeeding, but always experimenting. I remember that when we started, even I myself was not convinced that the topic we were dealing with would lead to a successful conclusion. But it did, and now I can see the interesting results that we have achieved. Among many other things, I deeply admire his work ethic and his ability to finish work long before any deadline even appears on the horizon. I also admire his sense of humanity and his modesty, and I am curious to see what he will work on in the future.

But it has not been only academia that has supported me and shaped my life during my studies. I am very grateful to my family, especially to my mother, for all the care and support she has given me throughout my life.
This may sound like a cliché, but as I grow older I see more clearly how much she has sacrificed for her children, and I am very thankful for that. I look forward to one day supporting my own children in the same way she has supported us. I am also deeply thankful to the other important members of my family: my father, my sister and my broader family.

Abstract

We study problem solving in the context of intelligent tutoring systems, with a particular focus on timing information as opposed to just the correctness of answers. This leads to different types of educational problems and requires new student models.

We describe a simple model which assumes a linear relationship between a latent problem solving skill and the logarithm of the time needed to solve a problem. We show that this model is related to models from two different areas: item response theory and collaborative filtering. We also propose model extensions for learning and for dealing with multidimensional skills. Using both synthesized data and real data from a widely used “Problem Solving Tutor”, we evaluate the model, analyze its parameter values and estimation techniques, and discuss the insight into problem difficulty which the model brings.

As a direct application of the model we developed the “Problem Solving Tutor” (tutor.fi.muni.cz), a web-based educational tool for learning through problem solving. The tool predicts problem solving times and is thus able to recommend to each student a problem of suitable difficulty. The tool contains 30 problem types and more than 2,000 problems, mainly programming problems, math problems and logic puzzles. All problems are interactive and the system gives students immediate feedback on their performance. The system is already widely used: it has more than 460,000 solved problems and 10,000 users. The system also supports “virtual classes” and is already used in more than 50 high schools in the Czech Republic.
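As a rough illustration of the modeling idea summarized above, the following Python sketch assumes that the logarithm of the solving time is a linear function of a student's skill, and picks the problem whose predicted time is closest to a target. It is only a sketch under stated assumptions, not the implementation used in the Tutor: the function names, the parameter names `difficulty` and `discrimination`, the example values and the target-time selection rule are all hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical problem parameters: name -> (difficulty, discrimination).
# "difficulty" is the assumed log-time of a student with skill 0;
# "discrimination" is the assumed slope of log-time with respect to skill.
EXAMPLE_PROBLEMS = {
    "easy": (4.0, -1.0),
    "medium": (5.5, -1.0),
    "hard": (7.0, -1.0),
}


def predicted_log_time(skill, difficulty, discrimination=-1.0):
    """Assumed linear relationship between skill and log solving time."""
    return difficulty + discrimination * skill


def predicted_time(skill, difficulty, discrimination=-1.0):
    """Predicted solving time in seconds (exponential of the log-time)."""
    return math.exp(predicted_log_time(skill, difficulty, discrimination))


def recommend(skill, problems, target_seconds):
    """Return the problem whose predicted log-time is closest to the target."""
    target = math.log(target_seconds)
    return min(
        problems,
        key=lambda name: abs(predicted_log_time(skill, *problems[name]) - target),
    )
```

For example, `recommend(0.0, EXAMPLE_PROBLEMS, 300)` selects the problem predicted to take an average-skill student about five minutes. With a negative slope, a higher skill shortens every predicted time, so a more skilled student is steered toward harder problems.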
Finally, we study six transport puzzles: Minotaurus, Number Maze, Replacement Puzzle, Rush Hour, Sokoban and Tilt Maze. Using the Tutor we collect large-scale data on human problem solving of these puzzles. The results show that there are large differences in difficulty among individual problem instances and that these differences are not explained by previous research. In order to explain the differences, we propose a computational model of human problem solving behavior based on state space navigation and provide an evaluation and discussion. We also derive the concept of a state space bottleneck and a problem decomposition for the Sokoban puzzle. We evaluate both methods and compare them to other metrics.

Contents

1 Introduction
  1.1 Contribution of the Thesis
    1.1.1 Model of Problem Solving Times
    1.1.2 Problem Solving Tutor
    1.1.3 Model of Human Problem Solving of Transport Puzzles
  1.2 Outline of the Thesis
2 Background
  2.1 Item Response Theory
    2.1.1 Basics
    2.1.2 Features
    2.1.3 Computerized Adaptive Testing
  2.2 Modeling Response Times
    2.2.1 Approaches
    2.2.2 Lognormal Model
    2.2.3 Application of Response Times in Adaptive Tests
    2.2.4 Maximum Information Criterion for Response Times
  2.3 Intelligent Tutoring Systems
    2.3.1 Outer Loop and Inner Loop
    2.3.2 Model Tracing
    2.3.3 Knowledge Tracing
  2.4 Educational Data Mining and Recommender Systems
    2.4.1 Educational Data Mining
    2.4.2 Recommender Systems
    2.4.3 Collaborative Filtering
  2.5 Human Problem Solving and Puzzles
    2.5.1 Difficulty and Puzzles
  2.6 Our Approach: Focus on Timing Information
    2.6.1 Correctness Versus Timing Approach
    2.6.2 Tutoring Based on Timing Information
    2.6.3 Examples of Problems
3 Model of Problem Solving Times
  3.1 Motivation
    3.1.1 Preliminaries
  3.2 Basic model
    3.2.1 Group Invariance
    3.2.2 Relations to Item Response Theory and Collaborative Filtering
  3.3 Model with Variability of Students’ Performance
  3.4 Basic Model with Learning
    3.4.1 Model with Multidimensional Skill
  3.5 Introduction to Maximum Likelihood and Estimation Methods
    3.5.1 Maximum Likelihood for Univariate Gaussian Linear Regression
    3.5.2 Analytical Estimation
    3.5.3 Gradient Descent Estimation
  3.6 Parameter Estimation Using Maximum Likelihood
  3.7 Parameter Estimation Using Iterative Joint Estimation
    3.7.1 Approach
    3.7.2 Estimating Skill
    3.7.3 Estimating Problem Parameters
    3.7.4 Joint Estimation
    3.7.5 Estimating Skill for Model with Learning
4 Evaluation of the Model
  4.1 Evaluation Using Synthesized Data
    4.1.1 Synthesized Data for Basic Model and Model with Students’ Variability
    4.1.2 Synthesized Data for Basic Model with Learning
    4.1.3 Evaluation of Parameter Estimation Techniques
  4.2 Evaluation Using Real Data
    4.2.1 Parameter Values for Real Data
    4.2.2 Evaluation of Predictions
    4.2.3 Reliability of Parameter Values
    4.2.4 Insight Gained from Parameter Values
    4.2.5 Detection of Multidimensional Skill
  4.3 Open Issues
    4.3.1 Problem Completion
    4.3.2 Detection of Cheating
    4.3.3 Application for Adaptive Testing
5 Problem Solving Tutor
  5.1 Main Approach
  5.2 Main Components
    5.2.1 Typical Usage
    5.2.2 Problem Simulators
    5.2.3 Data Collection
    5.2.4 Predictions
    5.2.5 Recommendations
    5.2.6 Class Mode
    5.2.7 Motivational Features
  5.3 Problems In the Tutor
    5.3.1 Robot Programming Problems
    5.3.2 Programming Problems
    5.3.3 Computer Science Problems
    5.3.4 Math Problems
    5.3.5 Logic Puzzles
  5.4 Implementation
    5.4.1 Technologies
    5.4.2 Main Entities
    5.4.3 Entity Relationship Model
    5.4.4 Logging Interface for Simulators
    5.4.5 Problem Locker
    5.4.6 Gradual Start
  5.5 Statistics of Usage
6 Difficulty of Transport Puzzles
  6.1 Motivation
  6.2 Studied Problems
    6.2.1 Sokoban
    6.2.2 Minotaurus Puzzle
    6.2.3 Number Maze
    6.2.4 Tilt Maze
    6.2.5 Rush Hour
    6.2.6 Replacement Puzzle
  6.3 Data Collection and Analysis
    6.3.1 Data Collection
    6.3.2 Data Analysis
    6.3.3 Problem Difficulty
    6.3.4 Analysis of Individual Moves in Sokoban Puzzle
  6.4 Model of Human Behaviour
    6.4.1 Basic Principle
    6.4.2 Model Formalization
    6.4.3 Model with Dead States
    6.4.4 Other Extensions
  6.5 Evaluation
    6.5.1 Difficulty Rating Metrics
    6.5.2 Value of the Parameter B
    6.5.3 Differences among Problems
    6.5.4 Relation to the Model of Problem Solving Times
  6.6 State Space Bottleneck
    6.6.1 Analysis of Bottleneck
    6.6.2 Network Flows
    6.6.3 Bottleneck Coefficient
    6.6.4 Possible Applications
  6.7 Problem Decomposition
    6.7.1 Approach
7 Conclusion
  7.1 Future Work
A First Appendix