Students' Language in Computer-Assisted Tutoring of Mathematical Proofs
Magdalena A. Wolska

Saarbrücken Dissertations in Language Science and Technology | Volume 40

Truth and proof are central to mathematics. Proving (or disproving) seemingly simple statements often turns out to be one of the hardest mathematical tasks. Yet, doing proofs is rarely taught in the classroom. Studies on cognitive difficulties in learning to do proofs have shown that pupils and students not only often do not understand or cannot apply basic formal reasoning techniques and do not know how to use formal mathematical language, but, at a far more fundamental level, they also do not understand what it means to prove a statement or even do not see the purpose of proof at all. Since insight into the importance of proof and doing proofs as such cannot be learnt other than by practice, learning support through individualised tutoring is in demand.

This volume presents a part of an interdisciplinary project, set at the intersection of pedagogical science, artificial intelligence, and (computational) linguistics, which investigated issues involved in provisioning computer-based tutoring of mathematical proofs through dialogue in natural language. The ultimate goal in this context, addressing the above-mentioned need for learning support, is to build intelligent automated tutoring systems for mathematical proofs. The research presented here has been focused on the language that students use while interacting with such a system: its linguistic properties and computational modelling. Contribution is made at three levels: first, an analysis of language phenomena found in students' input to a (simulated) proof tutoring system is conducted and the variety of students' verbalisations is quantitatively assessed; second, a general computational processing strategy for informal mathematical language and methods of modelling prominent language phenomena are proposed; and third, the prospects for natural language as an input modality for proof tutoring systems are evaluated based on collected corpora.

Deutsches Forschungszentrum für künstliche Intelligenz / German Research Center for Artificial Intelligence
universaar: Universitätsverlag des Saarlandes / Saarland University Press / Presses Universitaires de la Sarre

Saarbrücken Dissertations in Language Science and Technology
(Formerly: Saarbrücken Dissertations in Computational Linguistics and Language Technology)
http://www.dfki.de/lt/diss.php
Volume 40

Saarland University, Department of Computational Linguistics and Phonetics
German Research Center for Artificial Intelligence (Deutsches Forschungszentrum für künstliche Intelligenz, DFKI), Language Technology Lab

D 291

© 2015 universaar
Universitätsverlag des Saarlandes / Saarland University Press / Presses Universitaires de la Sarre
Postfach 151150, 66041 Saarbrücken

ISBN 978-3-86223-111-9 (print edition)
ISBN 978-3-86223-112-6 (online edition)
ISSN 2194-0398 (print edition)
ISSN 2198-5863 (online edition)
URN urn:nbn:de:bsz:291-universaar-1117

Also presented as:
Dissertation for the degree of Doctor of Philosophy at the Faculties of Philosophy (Philosophische Fakultäten) of Saarland University (Universität des Saarlandes)

Reviewers: Prof. Dr. Manfred Pinkal, Prof. Dr. Michael Kohlhase
Committee members: Prof. Dr. Dr. h.c. mult. Martin Kay, Dr. Stefan Thater
Dean: Prof. Dr. Dr. h.c. Roland Marti
Date of the final examination: 17.05.2013

Project supervision at universaar: Susanne Alt, Matthias Müller
Typesetting: Magdalena A. Wolska
Cover design: Julian Wichert
Printed on acid-free paper by Monsenstein & Vannerdat

Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available online at <http://dnb-d-nb.de>.

Abstract

Truth and proof are central to mathematics. Proving (or disproving) seemingly simple statements often turns out to be one of the hardest mathematical tasks. Yet, doing proofs is rarely taught in the classroom. Studies on cognitive difficulties in learning to do proofs have shown that pupils and students not only often do not understand or cannot apply basic formal reasoning techniques and do not know how to use formal mathematical language, but, at a far more fundamental level, they also do not understand what it means to prove a statement or even do not see the purpose of proof at all. Since insight into the importance of proof and doing proofs as such cannot be learnt other than by practice, learning support through individualised tutoring is in demand.

This thesis has been part of an interdisciplinary project, set at the intersection of pedagogical science, artificial intelligence, and (computational) linguistics, which investigated issues involved in provisioning automated tutoring of mathematical proofs through dialogue in natural language (see Chapter 1).
The ultimate goal in this context, addressing the above-mentioned need for learning support, is to build intelligent tutoring systems for mathematical proofs. The focus of this thesis is on the language that students use while interacting with such a system: its linguistic properties and computational modelling. Contribution is made at three levels: first, an analysis of language phenomena found in students' input to a (simulated) proof tutoring system is conducted and the variety of students' verbalisations is quantitatively assessed; second, a general computational processing strategy for informal mathematical language and methods of modelling prominent language phenomena are proposed; and third, the prospects for natural language as an input modality for proof tutoring systems are evaluated based on the collected corpora.

Proof tutoring corpora (Chapter 2)

In order to learn about the properties of students' language in naturalistic interactions with a tutoring system for proofs, two data collection experiments have been conducted. Both experiments were carried out in the so-called Wizard-of-Oz (WOz) paradigm, that is, subjects interacted with a system simulated by a human. The interaction with the simulated system was typewritten. The language of the experiments was German; no constraints on the students' language production were imposed. Naïve set theory and binary relations were selected as the mathematical domains. In the set theory experiment, students were tutored using one of three tutoring strategies differing in the granularity of pedagogical feedback. In the binary relations experiment, students were assigned to one of two experimental conditions: one group was shown study material formulated mainly in natural language (verbose), while the other group received mainly formalised content. The hypothesis was that the students' language would reflect the presentation format of the study material.
The key lesson learnt from the experiments is that mathematics is a difficult domain for the Wizard-of-Oz setup. While WOz is an established research methodology in interactive systems, mathematics as a domain is challenging for the wizards because a believable system setup puts time pressure on response generation. Certain interface features, in particular the copy–paste mechanism and the ease with which it enables text reuse (in our case, stringing mathematical expressions together), produced substantial cognitive load on the wizards. In future experiments, support for the wizard, for instance automated detection of errors in mathematical expressions, should be considered. The collected corpus comprises 59 dialogues with 1259 student turns and constitutes the source data for all the analyses.

Students' language in computer-based proof tutoring

Qualitative analysis (Chapter 3)

The language of informal proofs in textbook discourse has previously been modelled mainly on the basis of ad hoc analyses rather than systematic corpus studies. It has been described as precise, exhibiting no ambiguity and little linguistic variation, and consisting of stereotypical, formulaic phrasings in which natural language is used for the most part to express logical connectives. Contrary to these observations, our analysis of proof tutoring corpora shows that the language of students' proofs is rich in linguistic phenomena at all levels: lexical, syntactic, semantic, and discourse-pragmatic. The following utterances illustrate proof statements from our data:

x ∈ B ⟹ x ∉ A

B enthaelt kein x ∈ A
B contains no x ∈ A

A hat keine Elemente mit B gemeinsam.
A has no elements in common with B.

A enthaelt keinesfalls Elemente, die auch in B sind.
A contains no elements that are also in B

A ∩ B ist ∈ von C ∪ (A ∩ B)
A ∩ B is ∈ of C ∪ (A ∩ B)

Nach der Definition von ◦ folgt dann (a, b) ist in S⁻¹ ◦ R⁻¹
By definition of ◦ it follows then that (a, b) is in S⁻¹ ◦ R⁻¹

wenn A vereinigt C ein Durchschnitt von B vereinigt C ist, dann müssen alle A und B in C sein