
Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations

Chen Liang¹, Jianbo Ye¹, Han Zhao², Bart Pursel¹, C. Lee Giles¹
¹The Pennsylvania State University  ²Carnegie Mellon University
{cul226, jxy198, bkp10, clg20}@psu.edu, [email protected]

ABSTRACT

Strict partial order is a mathematical structure commonly seen in relational data. One obstacle to extracting such relations at scale is the lack of large-scale labels for building effective data-driven solutions. We develop an active learning framework for mining such relations subject to a strict order. Our approach incorporates relational reasoning not only in finding new unlabeled pairs whose labels can be deduced from an existing label set, but also in devising new query strategies that consider the relational structure of labels. Our experiments on concept prerequisite relations show that our proposed framework can substantially improve classification performance under the same query budget compared to baseline approaches.

1. INTRODUCTION

Pool-based active learning is a learning framework where the learning algorithm is allowed to access a set of unlabeled examples and ask for the labels of any of these examples [3]. Its goal is to learn a good classifier with significantly fewer labels by actively directing the queries to the most "valuable" examples. In a typical setup of active learning, the label dependency among labeled or unlabeled examples is not considered. But data and knowledge in the real world often embody prior relational structures, and taking those structures into account when building machine learning solutions can be necessary and crucial. The goal of this paper is to investigate query strategies in active learning of a strict partial order, namely, when the ground-truth labels of examples constitute an irreflexive and transitive relation.

In this paper, we develop efficient and effective algorithms extending popular query strategies used in active learning to work with such relational data. We study the following problem in the active learning context:

Problem. Given a finite set V, a strict order on V is a type of irreflexive and transitive (pairwise) relation. Such a strict order is represented by a set G ⊆ V × V. Given an unknown strict order G, an oracle W that returns W(u, v) = −1 + 2 · 1[(u, v) ∈ G] ∈ {−1, 1}, and a feature extractor F : V × V → ℝ^d, find h : ℝ^d → {−1, 1} from a hypothesis class H that predicts whether or not (u, v) ∈ G for each pair (u, v) ∈ V × V with u ≠ v (using h(F(u, v))), by querying W on a finite number of (u, v) pairs from V × V.

Our main focus is to develop reasonable query strategies for active learning of a strict order, exploiting both the knowledge from (non-consistent) classifiers trained on a limited number of labeled examples and the deductive structures among pairwise relations. Our work also has a particular focus on partial orders. If the strict order is total, a large school called "learning to rank" has studied this topic [10], some of it under the active learning setting [4]. Learning to rank relies on binary classifiers or probabilistic models that are consistent with the rule of a total order. Such approaches are, however, limited in principle when modeling a partial order: a classifier consistent with a total order will always have a non-zero lower bound on its error rate if the ground truth is a partial order but not a total order.

In our active learning problem, incorporating the deductive relations of a strict order when soliciting examples to be labeled is non-trivial and important. The challenges motivating us to pursue this direction are threefold. First, any example whose label can be deterministically reasoned from a labeled set by using the properties of strict orders needs no further manual labeling or statistical prediction. Second, probabilistic inference of labels based on the independence hypothesis, as is done in conventional classifier training, is no longer proper, because the deductive relations make the labels of examples dependent on each other. Third, in order to quantify how valuable an example is for querying, one has to combine uncertainty and logic to build proper representations. Sound and efficient heuristics with empirical success are to be explored.

One related active learning work that deals with a setting similar to ours is [13], although equivalence relations are considered there instead. In particular, they made several crude approximations in order to expedite the expected-error calculation to a computationally tractable level. We approach the design of query strategies from a different perspective while keeping efficiency as one of our central concerns.

To empirically study the proposed active learning algorithm, we apply it to the concept prerequisite learning problem [15, 8], where the goal is to predict whether a concept A is a prerequisite of a concept B given the pair (A, B). Although there have been some research efforts towards learning prerequisites [16, 15, 8, 17], the mathematical nature of the prerequisite relation as a strict partial order has not been investigated. In addition, one obstacle for effective learning-based solutions to this problem is the lack of large-scale prerequisite labels. Liang et al. [9] applied standard active learning to this problem without utilizing the relational properties of prerequisites. Active learning methods tailored for strict partial orders provide a good opportunity to tackle the current challenges of concept prerequisite learning.

Our main contributions are summarized as follows. First, we propose a new efficient reasoning module for monotonically calculating the deductive closure under the assumption of a strict order. This computational module can be useful for general AI solutions that need fast reasoning with regard to strict orders. Second, we apply our reasoning module to extend two popular active learning approaches to handle relational data and empirically achieve substantial improvements. This is the first attempt to design active learning query strategies tailored for strict partial orders. Third, under the proposed framework, we address concept prerequisite learning, and our approach proves successful on data from four educational domains, whereas previous work has not exploited the relational structure of prerequisites as strict partial orders in a principled way.
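To make the problem setup concrete, the following minimal sketch (our own illustration; the toy relation and all names are ours, not part of the paper) encodes a strict order as a set of ordered pairs, implements the oracle W, and checks the two defining properties:

    # A toy strict order on V = {a, b, c}: a < b < c, stored as the set G
    # of all ordered pairs in the relation (transitivity already applied).
    V = {"a", "b", "c"}
    G = {("a", "b"), ("b", "c"), ("a", "c")}

    def oracle(u, v):
        """The oracle W(u, v) = -1 + 2 * 1[(u, v) in G]: returns +1 for
        pairs in the strict order and -1 otherwise."""
        return -1 + 2 * int((u, v) in G)

    def is_strict_order(relation):
        """Check irreflexivity/asymmetry and transitivity of a relation."""
        for (a, b) in relation:
            if a == b or (b, a) in relation:  # irreflexivity / asymmetry
                return False
            for (c, d) in relation:
                if b == c and (a, d) not in relation:  # transitivity
                    return False
        return True

    assert is_strict_order(G)
    assert oracle("a", "c") == 1 and oracle("c", "a") == -1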

2. REASONING OF A STRICT ORDER

2.1 Preliminary

Definition 1 (Strict Order). Given a finite set V, a subset G of V × V is called a strict order if and only if it satisfies two conditions: (i) if (a, b) ∈ G and (b, c) ∈ G, then (a, c) ∈ G; (ii) if (a, b) ∈ G, then (b, a) ∉ G.

Definition 2 (G-Oracle). For two sets G, H ⊆ V × V, a function W_H(·, ·) : H → {−1, 1} is called a G-oracle on H iff for any (u, v) ∈ H, W_H(u, v) = −1 + 2 · 1[(u, v) ∈ G].

The G-oracle returns a label denoting whether a pair belongs to G.

Definition 3 (Completeness of an Oracle). A G-oracle of strict order W_H is called complete if and only if H satisfies: for any a, b, c ∈ V, (i) if (a, b) ∈ H ∩ G and (b, c) ∈ H ∩ G, then (a, c) ∈ H ∩ G; (ii) if (a, b) ∈ H ∩ G and (a, c) ∈ H ∩ G^c, then (b, c) ∈ H ∩ G^c; (iii) if (b, c) ∈ H ∩ G and (a, c) ∈ H ∩ G^c, then (a, b) ∈ H ∩ G^c; (iv) if (a, b) ∈ H ∩ G, then (b, a) ∈ H ∩ G^c, where G^c is the complement of G. Intuitively, W_H is complete if it is consistent under transitivity when restricted to pairs from H.

Definition 4 (Closure). Given a strict order G, for any H ⊆ V × V, its closure H̄ is defined to be the smallest set such that H ⊆ H̄ and the G-oracle W_H̄ is complete.

Definition 5 (Descendant and Ancestor). Given a strict order G of V and a ∈ V, its ancestor set subject to G is A_a^G := {b | (b, a) ∈ G} and its descendant set is D_a^G := {b | (a, b) ∈ G}.

2.2 Reasoning Module for Closure Calculation

With the definitions in the previous section, this section proposes a reasoning module designed to monotonically calculate the deductive closure for strict orders. Note that a key difference between the traditional transitive closure and our definition of closure (Definitions 3 and 4) is that the former only concerns G, whereas the latter requires calculation for both G and G^c. In the context of machine learning, relations in G and G^c correspond to positive examples and negative examples, respectively. Since both kinds of examples are crucial for training classifiers, existing algorithms for calculating the transitive closure, such as the Warshall algorithm, are not applicable. Thus we propose the following theorem for monotonically computing the closure. Please refer to the supplemental material for the proofs.

Theorem 1. Let G be a strict order of V and W_H a complete G-oracle on H ⊆ V × V. For any pair (a, b) ∈ V × V, define the notation C_(a,b) by:

(i) If (a, b) ∈ H̄, then C_(a,b) := H̄.

(ii) If (a, b) ∈ G^c ∩ H̄^c, then C_(a,b) := H̄ ∪ N'_(a,b), where

    N'_(a,b) := {(d, c) | c ∈ A_b^{G∩H̄} ∪ {b}, d ∈ D_a^{G∩H̄} ∪ {a}},

and in particular N'_(a,b) ⊆ G^c.

(iii) If (a, b) ∈ G ∩ H̄^c, then C_(a,b) := H̄ ∪ N_(a,b) ∪ R_(a,b) ∪ S_(a,b) ∪ T_(a,b) ∪ O_(a,b), where

    N_(a,b) := {(c, d) | c ∈ A_a^{G∩H̄} ∪ {a}, d ∈ D_b^{G∩H̄} ∪ {b}},
    R_(a,b) := {(d, c) | (c, d) ∈ N_(a,b)},
    S_(a,b) := {(d, e) | c ∈ A_a^{G∩H̄} ∪ {a}, d ∈ D_b^{G∩H̄} ∪ {b}, (c, e) ∈ G^c ∩ H̄},
    T_(a,b) := {(e, c) | c ∈ A_a^{G∩H̄} ∪ {a}, d ∈ D_b^{G∩H̄} ∪ {b}, (e, d) ∈ G^c ∩ H̄},
    O_(a,b) := ⋃_{(c,d) ∈ S_(a,b) ∪ T_(a,b)} N''_(c,d),
    N''_(c,d) := {(f, e) | e ∈ A_d^{G∩(H̄∪N_(a,b))} ∪ {d}, f ∈ D_c^{G∩(H̄∪N_(a,b))} ∪ {c}}.

In particular, N_(a,b) ⊆ G and R_(a,b) ∪ S_(a,b) ∪ T_(a,b) ∪ O_(a,b) ⊆ G^c. For any pair (x, y) ∈ V × V, the closure of H' = H ∪ {(x, y)} is C_(x,y).

Figure 1 provides an informal explanation of each necessary condition (except for R_(a,b)) mentioned in the theorem. If (a, b) is a positive example, i.e., (a, b) ∈ G, then (i) N_(a,b) is a set of positive examples inferred by transitivity; (ii) R_(a,b) is a set of negative examples inferred by irreflexivity; (iii) S_(a,b) and T_(a,b) are sets of negative examples inferred by transitivity; and (iv) O_(a,b) is a set of negative examples inferred from S_(a,b) and T_(a,b). If (a, b) is a negative example, i.e., (a, b) ∈ G^c, then N'_(a,b) is a set of negative examples inferred by transitivity.
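To illustrate the flavor of the deduction step, the sketch below (our own simplification, not the authors' module) propagates a single newly labeled positive pair through the two most basic rules of Theorem 1: transitivity for new positives (the set N_(a,b)) and asymmetry for the corresponding negatives (the set R_(a,b)); the S, T, and O rules are omitted:

    def deduce_from_positive(a, b, pos, neg):
        """Given known positive pairs `pos` and negative pairs `neg`, add the
        consequences of labeling (a, b) positive: every ancestor of a precedes
        every descendant of b (the set N_(a,b)), and each inferred positive
        (c, d) yields a negative reverse pair (d, c) (the set R_(a,b)).
        Covers only the N and R rules of Theorem 1; the S, T, O rules, which
        propagate negatives through known negative pairs, are omitted."""
        ancestors = {c for (c, x) in pos if x == a} | {a}
        descendants = {d for (x, d) in pos if x == b} | {b}
        inferred_pos = {(c, d) for c in ancestors for d in descendants}
        inferred_neg = {(d, c) for (c, d) in inferred_pos}
        return pos | inferred_pos, neg | inferred_neg

    # Example: knowing c < a and b < d, labeling (a, b) positive also yields
    # c < b, a < d, and c < d, plus the four reversed negative pairs.
    pos, neg = deduce_from_positive("a", "b", {("c", "a"), ("b", "d")}, set())
    assert ("c", "d") in pos and ("d", "c") in neg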

Figure 1: Following the notations in Theorem 1: (a) black lines are pairs in H̄, solid lines are pairs in G, and dashed lines are pairs in G^c; the pair (a, b) in cyan is the pair to be labeled or deduced. (b) If (a, b) ∈ G, {(a, b), (e, f), (a, f), (e, b)} ⊆ N_(a,b). (c) If (a, b) ∈ G, {(h, e), (h, a)} ⊆ T_(a,b) and {(b, g), (f, g)} ⊆ S_(a,b). (d) If (a, b) ∈ G^c, {(a, b), (a, d), (c, b), (c, d)} ⊆ N'_(a,b). Likewise, if ∃(x, y) ∈ G s.t. (a, b) ∈ S_(x,y) ∪ T_(x,y), then {(a, b), (a, d), (c, b)} ⊆ O_(x,y).

3. POOL-BASED ACTIVE LEARNING

Pool-based sampling [7] is a typical active learning scenario in which one maintains a labeled set D_l and an unlabeled set D_u. In particular, we let D_u ∪ D_l = D = {1, . . . , n} and D_u ∩ D_l = ∅. For i ∈ {1, . . . , n}, we use x_i ∈ ℝ^d to denote a feature vector representing the i-th instance, and y_i ∈ {−1, +1} to denote its ground-truth class label. At each round, one or more instances are selected from D_u, their label(s) are requested, and the labeled instance(s) are then moved to D_l. Typically, instances are queried in a prioritized way so that one can obtain good classifiers trained with a substantially smaller set D_l. We focus on the pool-based sampling setting where queries are selected serially, i.e., one at a time.

3.1 Query Strategies

The key component of active learning is the design of an effective criterion for selecting the most "valuable" instance to query, often referred to as a query strategy. We use s* to denote the instance selected by the strategy. In general, different strategies follow a greedy framework:

    s* = argmax_{s ∈ D_u} min_{y ∈ {−1,1}} f(s; y, D_l),    (1)

where f(s; y, D_l) ∈ ℝ is a scoring function measuring the risk of choosing y as the label for x_s ∈ D_u given an existing labeled set D_l.

We investigate two commonly used query strategies: uncertainty sampling [6] and query-by-committee [14]. We show that, under the binary classification setting, both can be reformulated as Eq. (1).

Uncertainty Sampling selects the instance whose label it is least certain about. We study one popular uncertainty-based variant, least confident. Subject to Eq. (1), the resulting approach is to let

    f(s; y, D_l) = 1 − P_{∆(D_l)}(y_s = y | x_s),    (2)

where P_{∆(D_l)}(y_s = y | x_s) is a conditional probability estimated from a probabilistic classification model ∆ trained on {(x_i, y_i) | ∀i ∈ D_l}.

Query-By-Committee maintains a committee of models trained on labeled data, C(D_l) = {g^(1), . . . , g^(C)}, and aims to reduce the size of the version space. Specifically, it selects the unlabeled instance about which committee members disagree the most in their predictions. Subject to Eq. (1), the resulting approach is to let

    f(s; y, D_l) = Σ_{k=1}^{C} 1[y ≠ g^(k)(x_s)],    (3)

where g^(k)(x_s) ∈ {−1, 1} is the predicted label of x_s using the classifier g^(k).

Our paper starts from generalizing Eq. (1) and shows that it is possible to extend these two popular query strategies to treat relational data as a strict order.
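For concreteness, the two scoring functions of Eqs. (2) and (3) and the greedy selection of Eq. (1) could be sketched as follows (a minimal sketch; the names are ours, and any probabilistic classifier exposing predict_proba, such as scikit-learn's LogisticRegression, can play the role of ∆):

    import numpy as np

    def least_confident_score(model, x, y):
        """Eq. (2): risk of assigning label y to instance x, where `model`
        is a fitted probabilistic classifier with classes_ = [-1, 1]."""
        proba = model.predict_proba(x.reshape(1, -1))[0]
        return 1.0 - proba[list(model.classes_).index(y)]

    def qbc_score(committee, x, y):
        """Eq. (3): number of committee members whose prediction
        disagrees with the candidate label y."""
        return sum(int(g.predict(x.reshape(1, -1))[0] != y) for g in committee)

    def select_query(score_fn, model, X_unlabeled):
        """Eq. (1): pick the unlabeled instance whose worst-case
        (min over y) score is largest."""
        worst = [min(score_fn(model, x, y) for y in (-1, 1)) for x in X_unlabeled]
        return int(np.argmax(worst))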

4. ACTIVE LEARNING OF A STRICT ORDER

Given a strict order G of V, consider a set of data D ⊆ V × V, where (a, a) ∉ D for all a ∈ V. As in pool-based active learning, one maintains a labeled set D_l and an unlabeled set D_u. We require that D ⊆ D_l ∪ D_u and D_l ∩ D_u = ∅. Given a feature extractor F : V × V → ℝ^d, we can build a vector dataset {x_(a,b) = F(a, b) ∈ ℝ^d | (a, b) ∈ D}. Let y_(a,b) = −1 + 2 · 1[(a, b) ∈ G] ∈ {−1, 1} be the ground-truth label for each (a, b) ∈ V × V. Active learning aims to query a subset Q of D under a limited budget and construct a label set D_l from the closure of Q, in order to train a classifier h on D_l ∩ D that accurately predicts whether or not an unlabeled pair (a, b) ∈ G via h(F(a, b)) ∈ {−1, 1}.

Active learning of strict orders differs from traditional active learning in two unique aspects: (i) by querying the label of a single unlabeled instance, one may obtain a set of labeled examples with the help of the strict order's properties; (ii) the relational information of strict orders can also be utilized by query strategies. We present our efforts towards incorporating both aspects into active learning of a strict order.

4.1 Basic Relational Reasoning in Active Learning

A basic extension from standard active learning to the strict order setting is to apply relational reasoning both when updating D_l and when predicting labels. Algorithm 1 shows the pseudocode for pool-based active learning of a strict order. When updating D_l with a new instance (a, b) ∈ D_u whose label y_(a,b) is acquired from querying, one first calculates D'_l, i.e., the closure of D_l ∪ {(a, b)}, using Theorem 1, and then sets D_l := D'_l and D_u := D \ D'_l. Therefore, it is possible to augment the labeled set D_l with more than one pair at each stage even though only a single instance is queried. Furthermore, the following corollary shows that, given a fixed set of samples to be queried, the querying order does not affect the final labeled set D_l.

Algorithm 1: Pseudocode for pool-based active learning of a strict order.

    Input: D ⊆ V × V                              % a data set
    Initialize:
        D_l ← {(a_{s_1}, b_{s_1}), (a_{s_2}, b_{s_2}), . . . , (a_{s_k}, b_{s_k})}
                                                  % initial labeled set with k seeds
        D_l ← closure(D_l)                        % initial closure
        D_u ← D \ D_l                             % initial unlabeled set
    while D_u ≠ ∅ do
        Select (a*, b*) from D_u                  % according to a query strategy
        Query the label y_(a*,b*) for the selected instance (a*, b*)
        D_l ← closure(D_l ∪ {(a*, b*)})
        D_u ← D \ D_l
    end while

Corollary 1.1. Given a list of pairs Q of size m whose elements are from V × V, let i_1, . . . , i_m and j_1, . . . , j_m be two different permutations of 1, . . . , m. Let I_0 = ∅ and J_0 = ∅, and I_k = closure(I_{k−1} ∪ {q_{i_k}}), J_k = closure(J_{k−1} ∪ {q_{j_k}}) for k = 1, . . . , m, where closure(·) denotes the closure under G as in Definition 4. Then I_m = J_m, which is the closure of {q_i ∈ V × V | i = 1, . . . , m}.

Corollary 1.1 is a straightforward consequence of the uniqueness of the closure, and is also verified by our experiments. The labeled set D_l contains two kinds of pairs based on where their labels come from: the first kind comes directly from queries, and the second kind comes from relational reasoning as explained by Theorem 1. Such an approach has a clear advantage over standard active learning at the same query budget, because the labels of some test pairs can be inferred deterministically, and as a result there is more labeled data for supervised training. In our setup of active learning, we train classifiers on D ∩ D_l and use them to predict the labels of the remaining pairs not in D_l.
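A compact rendering of Algorithm 1 might look as follows (a sketch under our own naming: closure stands for the reasoning module of Theorem 1, query_oracle for the annotator, and select for an arbitrary query strategy):

    def relational_active_learning(D, seeds, closure, query_oracle, select, budget):
        """Pool-based active learning of a strict order (Algorithm 1).
        `D` is the pool of candidate pairs, `seeds` the initially labeled
        pairs (a dict mapping pair -> label), `closure` expands a labeled
        set with all deducible labels, `select` picks the next query."""
        labeled = closure(seeds)                  # D_l <- closure of the seeds
        unlabeled = set(D) - set(labeled)         # D_u <- D \ D_l
        queries = 0
        while unlabeled and queries < budget:
            pair = select(unlabeled, labeled)     # query strategy
            labeled = closure({**labeled, pair: query_oracle(pair)})
            unlabeled = set(D) - set(labeled)     # one query may label many pairs
            queries += 1
        return labeled

Note that a single query can shrink D_u by more than one element, which is exactly the advantage discussed above.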

4.2 Query Strategies with Relational Reasoning

The relational active learning framework explained in the previous section does not, however, incorporate relational reasoning into the query strategy itself. We now develop a systematic approach to achieve this.

We start from the following formulation: at each stage, one chooses a pair (a*, b*) to query based on

    (a*, b*) = argmax_{(a,b) ∈ D_u} min_{y ∈ {−1,1}} F(S(y_(a,b) = y), D_l),    (4)

    S(y_(a,b) = y) = (closure(D_l ∪ {(a, b)}) \ D_l) ∩ D.    (5)

Again, F is the scoring function. S(y_(a,b) = y) is the set of pairs in D whose labels, originally unknown (∉ D_l), can now be inferred by assuming y_(a,b) = y, using Theorem 1. For each (u, v) ∈ S(y_(a,b) = y), its inferred label is denoted by ŷ_(u,v) in the sequel. One can see that this formulation is a generalization of Eq. (1). We now develop extensions of the two query strategies to model the dependencies between pairs imposed by the rule of a strict order. Following the same notation as before, with the only difference that the numbering index is replaced by the pairwise index, we propose two query strategies tailored to strict orders.

Uncertainty Sampling with Reasoning. With relational reasoning, one not only reduces the uncertainty of the queried pair (a, b) but may also reduce that of other pairs deduced by assuming y_(a,b) = y. The modified scoring function reads:

    F(S(y_(a,b) = y), D_l) = Σ_{(u,v) ∈ S(y_(a,b)=y)} [1 − P_{∆(D_l∩D)}(y_(u,v) = ŷ_(u,v) | x_(u,v))].    (6)

Query-by-Committee with Reasoning. Likewise, one has an extension for QBC, where {g^(k)}_{k=1}^{C} is a committee of classifiers trained on bagging samples of D_l ∩ D:

    F(S(y_(a,b) = y), D_l) = Σ_{k=1}^{C} Σ_{(u,v) ∈ S(y_(a,b)=y)} 1[ŷ_(u,v) ≠ g^(k)(x_(u,v))].    (7)
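As an illustration, Eq. (6) could be sketched as below (again a simplification with hypothetical names: closure_assuming applies Theorem 1 under the assumed label and returns the map from newly inferable pairs to their deduced labels, and features maps a pair to its feature vector x_(u,v)):

    def uncertainty_with_reasoning(pair, y, labeled, model, features, D,
                                   closure_assuming):
        """Eq. (6): sum of least-confident risks over S(y_(a,b) = y), the
        set of pairs in D whose labels become known once `pair` is assumed
        to carry label y. `closure_assuming` stands in for Theorem 1."""
        S = {p: lab for p, lab in closure_assuming(labeled, pair, y).items()
             if p not in labeled and p in D}
        score = 0.0
        for p, y_hat in S.items():
            proba = model.predict_proba(features[p].reshape(1, -1))[0]
            score += 1.0 - proba[list(model.classes_).index(y_hat)]
        return score

    # Eq. (4) then queries the pair maximizing the worst case over y in {-1, +1}.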

5. EXPERIMENTS

For evaluation, we apply the proposed active learning algorithms to the concept prerequisite learning problem [8]. Given a pair of concepts (A, B), we predict whether or not A is a prerequisite of B, which is a binary classification problem. Here, cases where B is a prerequisite of A and where no prerequisite relation exists are both considered negative.

5.1 Dataset

We use the Wiki concept map dataset from [17], which was collected from textbooks on different educational domains. Each concept corresponds to an English Wikipedia article. For each domain, the dataset consists of prerequisite pairs in the concept map. Table 1 summarizes the statistics of our final processed dataset.

Table 1: Dataset statistics.

    Domain       | # Concepts | # Pairs | # Prerequisites
    Data Mining  |        120 |     826 |             292
    Geometry     |         89 |    1681 |             524
    Physics      |        153 |    1962 |             487
    Precalculus  |        224 |    2060 |             699

5.2 Features

For each concept pair (A, B), we calculate two types of features, following popular practice in information retrieval and natural language processing: graph-based features and text-based features. Please refer to Table 2 for a detailed description. Note that we trained a topic model [1] on the Wiki corpus. We also trained a Word2Vec [12] model on the same corpus with each concept treated as an individual token.

Table 2: Feature description. Top: graph-based features. Bottom: text-based features.

    Feature           | Description
    In/Out Degree     | The in/out degree of A/B.
    Common Neighbors  | # common neighbors of A and B.
    # Links           | # times A/B links to B/A.
    Link Proportion   | The proportion of pages that link to A/B also link to B/A.
    NGD               | The Normalized Google Distance between A and B [18].
    PMI               | The Pointwise Mutual Information relatedness between the incoming links of A and B.
    RefD              | A metric to measure how differently A and B's related concepts refer to each other [8].
    HITS              | The difference between A and B's hub/authority scores [5].
    1st Sent          | Whether A/B is in the first sentence of B/A.
    In Title          | Whether A appears in B's title.
    Title Jaccard     | The Jaccard similarity between A and B's titles.
    Length            | # words of A/B's content.
    Mention           | # times A/B are mentioned in the content of B/A.
    NP                | # noun phrases in A/B's content; # common noun phrases.
    Tf-idf Sim        | The cosine similarity between Tf-idf vectors of A and B's first paragraphs.
    Word2Vec Sim      | The cosine similarity between vectors of A and B trained by Word2Vec.
    LDA Entropy       | The Shannon entropy of the LDA vector of A/B.
    LDA Cross Entropy | The cross entropy between the LDA vectors of A/B and B/A.
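To give a flavor of the graph-based features, a sketch computing the degree, common-neighbor, and link features of Table 2 from a Wikipedia link graph might read as follows (our own simplification; the paper specifies the features only at the level of Table 2):

    import networkx as nx

    def graph_features(link_graph: nx.DiGraph, a: str, b: str) -> dict:
        """A few of the Table 2 graph features for a concept pair (A, B),
        computed on a directed Wikipedia link graph."""
        nbrs_a = set(link_graph.successors(a)) | set(link_graph.predecessors(a))
        nbrs_b = set(link_graph.successors(b)) | set(link_graph.predecessors(b))
        return {
            "in_degree_a": link_graph.in_degree(a),
            "out_degree_a": link_graph.out_degree(a),
            "in_degree_b": link_graph.in_degree(b),
            "out_degree_b": link_graph.out_degree(b),
            "common_neighbors": len(nbrs_a & nbrs_b),
            "links_a_to_b": int(link_graph.has_edge(a, b)),
            "links_b_to_a": int(link_graph.has_edge(b, a)),
        }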

5.3 Experiment Settings

We follow the typical evaluation protocol of pool-based active learning. We first randomly split a dataset into a training set D and a test set D_test with a ratio of 2:1. Then we randomly select 20 samples from the training set as the initial query set Q and compute its closure D_l. Meanwhile, we set D_u = D \ D_l. In each iteration, we pick an unlabeled instance from D_u to query for its label, update the label set D_l, and re-train a classification model on the updated D_l ∩ D. The re-trained classification model is then evaluated on D_test. In all experiments, we use a random forest classifier [2] with 200 trees as the classification model and Area under the ROC Curve (AUC) as the evaluation metric. To account for the effects of randomness under different initializations, we repeat the above experimental process for each method with 300 pre-selected distinct random seeds; average scores and confidence intervals (α = 0.05) are reported.

We compare four query strategies: (i) Random: randomly select an instance to query; (ii) LC: least confident sampling, a widely used uncertainty sampling variant, using logistic regression to estimate posterior probabilities; (iii) QBC: the query-by-committee algorithm, applying query-by-bagging [11] with a committee of three decision trees; (iv) CNT: a simple baseline query strategy that greedily selects the instance whose label can potentially infer the largest number of unlabeled instances. Following the previous notation, the scoring function for CNT is F(S(y_(a,b) = y), D_l) = |S(y_(a,b) = y)|, which is based solely on logical reasoning.

We test each query strategy under three settings: (i) traditional active learning, where no relational information is considered; query strategies under this setting are denoted Random, LC, and QBC. (ii) Relational active learning, where relational reasoning is applied to updating D_l and predicting labels of D_test; query strategies under this setting are denoted Random-R, LC-R, and QBC-R. (iii) Besides being applied to updating D_l, relational reasoning is also incorporated into the query strategy itself; query strategies under this setting are the baseline method CNT and our proposed extensions of LC and QBC for strict partial orders, denoted LC-R+ and QBC-R+, respectively. Table 3 summarizes the query strategies studied in the experiments.

Table 3: Summary of compared query strategies.

    Method         | Reasoning when updating D_l | Reasoning to select the query | Learning to select the query
    Random         | ✗                           | ✗                             | ✗
    LC, QBC        | ✗                           | ✗                             | ✓
    Random-R       | ✓                           | ✗                             | ✗
    LC-R, QBC-R    | ✓                           | ✗                             | ✓
    CNT            | ✓                           | ✓                             | ✗
    LC-R+, QBC-R+  | ✓                           | ✓                             | ✓
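One evaluation round under this protocol could be sketched as follows (our own rendering; X and y are assumed to be the feature matrix and label vector of the pool, select_next is the query strategy, and closure_indices stands in for the reasoning module over index sets):

    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    def evaluate_round(X, y, select_next, closure_indices, n_queries, seed):
        """One active learning run: 2:1 train/test split, 20 seed labels
        expanded by closure, then query-retrain-evaluate with a 200-tree
        random forest, reporting AUC after each query."""
        idx = list(range(len(y)))
        idx_train, idx_test = train_test_split(idx, test_size=1/3,
                                               random_state=seed)
        labeled = closure_indices(set(idx_train[:20]))  # initial D_l
        aucs = []
        for _ in range(n_queries):
            pool = [i for i in idx_train if i not in labeled]
            if not pool:
                break
            labeled = closure_indices(labeled | {select_next(pool, labeled)})
            clf = RandomForestClassifier(n_estimators=200, random_state=seed)
            clf.fit(X[list(labeled)], y[list(labeled)])
            aucs.append(roc_auc_score(y[idx_test],
                                      clf.predict_proba(X[idx_test])[:, 1]))
        return aucs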
5.4 Experiment Results

Figure 2 shows the AUC results of the different query strategies. For each case, we present average values and 95% confidence intervals over 300 repeated trials with different train/test splits. In addition, Figure 3 compares the relation between the number of queries and the number of labeled instances across query strategies. Note that in the relational active learning setting, querying a single unlabeled instance can yield one or more labeled instances. From Figures 2 and 3, we make the following observations.

First, comparing query strategies under settings (ii) and (iii) with setting (i), we observe that incorporating relational reasoning into active learning substantially improves the AUC of each query strategy. In addition, we find that the query order, which generally differs across strategies, does not affect D_l at the end when D ⊆ D_l; this partly verifies Corollary 1.1. Second, our proposed LC-R+ and QBC-R+ significantly outperform the other compared query strategies. Specifically, comparing them with LC-R and QBC-R shows that incorporating relational reasoning into directing the queries helps train a better classifier. Figure 3 shows that LC-R+ and QBC-R+ yield more labeled instances than LC-R and QBC-R for the same number of queries, which partly contributes to the performance gain. Third, LC-R+ and QBC-R+ are more effective than the CNT baseline at both collecting a larger labeled set and training better classifiers. Moreover, comparing CNT with LC-R, QBC-R, and Random-R, we observe that a larger labeled set does not always lead to better performance. These observations demonstrate the necessity of combining deterministic relational reasoning and probabilistic machine learning in designing query strategies.

Figure 2: Comparison of different query strategies' AUC scores for concept prerequisite learning. [Panels (a)-(d) show LC-based strategies and (e)-(h) QBC-based strategies on Data Mining, Geometry, Physics, and Precalculus; each panel plots AUC (%) against # queries (% unlabeled data).]

Figure 3: Comparison of relations between the number of queries and the number of labeled instances when using different query strategies. [Same panel layout as Figure 2; each panel plots # labeled instances against # queries.]

In addition to effectiveness, we also conducted empirical studies on the runtime of the reasoning module; the results are included in the supplemental material.

6. CONCLUSION

We propose an active learning framework tailored to relational data in the form of strict partial orders. An efficient reasoning module is proposed to extend two commonly used query strategies, uncertainty sampling and query-by-committee. Experiments on concept prerequisite learning show that incorporating relational reasoning both in selecting valuable examples to label and in expanding the training set significantly improves standard active learning approaches. Future work could explore the following: (i) applying the reasoning module to extend other query strategies; (ii) active learning of strict partial orders from a noisy oracle.

7. REFERENCES

[1] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3(Jan):993-1022, 2003.
[2] L. Breiman. Random forests. Machine Learning, 45(1):5-32, 2001.
[3] D. A. Cohn, Z. Ghahramani, and M. I. Jordan. Active learning with statistical models. Journal of Artificial Intelligence Research, 4(1):129-145, 1996.
[4] P. Donmez and J. G. Carbonell. Optimizing estimated loss reduction for active sampling in rank learning. In Proc. ICML, pages 248-255. ACM, 2008.


[5] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5):604-632, 1999.
[6] D. D. Lewis and J. Catlett. Heterogeneous uncertainty sampling for supervised learning. In Proc. ICML, pages 148-156, 1994.
[7] D. D. Lewis and W. A. Gale. A sequential algorithm for training text classifiers. In Proc. SIGIR, pages 3-12, 1994.
[8] C. Liang, Z. Wu, W. Huang, and C. L. Giles. Measuring prerequisite relations among concepts. In Proc. EMNLP, pages 1668-1674, 2015.
[9] C. Liang, J. Ye, S. Wang, B. Pursel, and C. L. Giles. Investigating active learning for concept prerequisite learning. In Proc. EAAI, 2018.
[10] T.-Y. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3):225-331, 2009.
[11] N. Abe and H. Mamitsuka. Query learning strategies using boosting and bagging. In Proc. ICML, volume 1. Morgan Kaufmann, 1998.
[12] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean. Distributed representations of words and phrases and their compositionality. In Proc. NIPS, pages 3111-3119, 2013.
[13] S. Rendle and L. Schmidt-Thieme. Active learning of equivalence relations by minimizing the expected loss using constraint inference. In Proc. ICDM, pages 1001-1006. IEEE, 2008.
[14] H. S. Seung, M. Opper, and H. Sompolinsky. Query by committee. In Proc. COLT, pages 287-294. ACM, 1992.
[15] P. P. Talukdar and W. W. Cohen. Crowdsourced comprehension: predicting prerequisite structure in Wikipedia. In Proc. BEA, pages 307-315. ACL, 2012.
[16] A. Vuong, T. Nixon, and B. Towle. A method for finding prerequisites within a curriculum. In Proc. EDM, pages 211-216, 2011.
[17] S. Wang, A. Ororbia, Z. Wu, K. Williams, C. Liang, B. Pursel, and C. L. Giles. Using prerequisites to extract concept maps from textbooks. In Proc. CIKM, pages 317-326. ACM, 2016.
[18] I. Witten and D. Milne. An effective, low-cost measure of semantic relatedness obtained from Wikipedia links. In Proc. AAAI Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pages 25-30, 2008.
