Learning from Description Logics

Part 2 of the Tutorial on Semantic Data Mining
Agnieszka Lawrynowicz, Jedrzej Potoniec, Poznan University of Technology
Semantic Data Mining Tutorial (ECML/PKDD'11), Athens, 9 September 2011

Outline

1. Description logics in a nutshell
2. Learning in description logics: definition
3. DL learning methods and techniques: concept learning, refinement operators, pattern mining, similarity-based approaches
4. Tools
5. Applications
6. Presentation of a tool: RMonto

Learning in DLs

Definition: learning in description logics is a machine learning approach that adopts Inductive Logic Programming as the methodology and description logic as the language of data and hypotheses. Description logics theoretically underpin the state-of-the-art Web ontology representation language, OWL, so description logic learning approaches are well suited for semantic data mining.

Description logic

Definition: Description Logics (DLs) are a family of first-order-logic-based formalisms suitable for representing knowledge, especially terminologies and ontologies.

• a subset of first order logic, balancing expressivity against decidability and efficiency
• roots: semantic networks, frames

Basic building blocks of DLs

• concepts
• roles
• constructors
• individuals

Examples:
• atomic concepts: Artist, Movie
• role: creates
• constructors: ⊓, ∃
• concept definition: Director ≡ Artist ⊓ ∃creates.Movie
• axiom ("each director is an artist"): Director ⊑ Artist
• assertion: creates(sofiaCoppola, lostInTranslation)

DL knowledge base

A DL knowledge base is a pair K = (TBox, ABox), for example:

TBox = {
  CreteHolidaysOffer ≡ Offer ⊓ ∃in.Crete ⊓ ∀in.Crete,
  SantoriniHolidaysOffer ≡ Offer ⊓ ∃in.Santorini ⊓ ∀in.Santorini,
  TromsøyaHolidaysOffer ≡ Offer ⊓ ∃in.Tromsøya ⊓ ∀in.Tromsøya,
  Crete ⊑ ∃partOf.Greece,
  Santorini ⊑ ∃partOf.Greece,
  Tromsøya ⊑ ∃partOf.Norway
}

ABox = {
  Offer(o1), in(o1, Crete),
  SantoriniHolidaysOffer(o2),
  Offer(o3), in(o3, Santorini), hasPrice(o3, 300)
}

DL reasoning services

• satisfiability
• inconsistency
• subsumption
• instance checking

Concept learning

Given:
• a new target concept name C,
• a knowledge base K as background knowledge,
• a set E+ of positive examples, and
• a set E− of negative examples,
the goal is to learn a concept definition C ≡ D such that K ∪ {C ≡ D} ⊨ E+ and K ∪ {C ≡ D} ⊨ E−.
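To make the building blocks and the instance-checking service concrete, here is a minimal sketch (not taken from the tutorial) that evaluates DL-style concept expressions over the holiday-offers ABox under a naive closed-world reading. The Python representation and the island individuals crete and santorini are assumptions introduced for the sketch; a real DL system would instead ask an OWL reasoner, since the actual semantics is open-world and the TBox (ignored here) also contributes entailments such as Offer(o2).

```python
# Toy closed-world evaluation of DL-style concept expressions over a tiny ABox.
# This is only an illustration of the constructors and of instance checking;
# it is not how a DL reasoner works (no TBox, no open-world semantics).

from dataclasses import dataclass

# --- concept constructors ---------------------------------------------------
@dataclass(frozen=True)
class Atomic:          # e.g. Offer, Crete
    name: str

@dataclass(frozen=True)
class And:             # C ⊓ D
    left: object
    right: object

@dataclass(frozen=True)
class Exists:          # ∃r.C
    role: str
    filler: object

@dataclass(frozen=True)
class Forall:          # ∀r.C
    role: str
    filler: object

# --- a tiny ABox (crete and santorini are assumed island individuals) -------
concept_assertions = {
    "Offer": {"o1", "o3"},
    "Crete": {"crete"},
    "Santorini": {"santorini"},
}
role_assertions = {
    "in": {("o1", "crete"), ("o3", "santorini")},
}
individuals = {"o1", "o2", "o3", "crete", "santorini"}

def satisfies(ind: str, concept) -> bool:
    """Closed-world check: does `ind` belong to `concept`, given only the ABox?"""
    if isinstance(concept, Atomic):
        return ind in concept_assertions.get(concept.name, set())
    if isinstance(concept, And):
        return satisfies(ind, concept.left) and satisfies(ind, concept.right)
    if isinstance(concept, Exists):
        return any(s == ind and satisfies(o, concept.filler)
                   for (s, o) in role_assertions.get(concept.role, set()))
    if isinstance(concept, Forall):
        return all(satisfies(o, concept.filler)
                   for (s, o) in role_assertions.get(concept.role, set()) if s == ind)
    raise TypeError(concept)

# CreteHolidaysOffer ≡ Offer ⊓ ∃in.Crete ⊓ ∀in.Crete
crete_offer = And(Atomic("Offer"),
                  And(Exists("in", Atomic("Crete")),
                      Forall("in", Atomic("Crete"))))
print([i for i in sorted(individuals) if satisfies(i, crete_offer)])  # ['o1']
```

Running the sketch prints ['o1'], i.e. only o1 is (under this closed-world reading) an instance of the CreteHolidaysOffer definition; a concept learner searching for such a definition D would use exactly this kind of coverage check against E+ and E−.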
Negative examples and the Open World Assumption

But what are negative examples in the context of the Open World Assumption?

Semantics: "closed world" vs "open world"

Closed world (logic programming, databases):
• complete knowledge of instances
• lack of information is by default negative information (negation-as-failure)

Open world (description logics, Semantic Web):
• incomplete knowledge of instances
• the negation of a fact has to be explicitly asserted (monotonic negation)

"Closed world" vs "open world": an example

Let a database contain the following data:
  OscarMovie(lostInTranslation)
  Director(sofiaCoppola)
  creates(sofiaCoppola, lostInTranslation)

Are all of the movies of Sofia Coppola Oscar movies?
• YES under the closed-world assumption
• DON'T KNOW under the open-world assumption
Different conclusions!

OWA and machine learning

The OWA is problematic for machine learning, since an individual is rarely deduced to belong to the complement of a concept unless this is explicitly asserted.
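The two answers can be mimicked in a few lines. The sketch below (not from the tutorial) uses only the three assertions above; the "open-world" function is a crude three-valued approximation, since a real system would ask a DL reasoner whether (∀creates.OscarMovie)(sofiaCoppola) is entailed, which from these assertions alone it is not.

```python
# Closed-world vs open-world answers to "are all of Sofia Coppola's movies
# Oscar movies?", using only the three assertions from the example above.

created = {("sofiaCoppola", "lostInTranslation")}
oscar_movies = {"lostInTranslation"}       # OscarMovie(...) assertions
not_oscar_movies = set()                   # explicit ¬OscarMovie(...) assertions

def all_creations_are_oscar_cwa(person: str) -> bool:
    # Closed world: anything not asserted to be an OscarMovie is not one,
    # and the listed creations are all the creations there are.
    return all(m in oscar_movies for (p, m) in created if p == person)

def all_creations_are_oscar_owa(person: str) -> str:
    # Open world (rough approximation): the ABox lists only *some* of what is
    # true, so the universal claim can be refuted only by an explicit
    # counterexample; otherwise the honest answer is "don't know".
    creations = {m for (p, m) in created if p == person}
    if creations & not_oscar_movies:
        return "no"
    return "don't know"

print(all_creations_are_oscar_cwa("sofiaCoppola"))   # True  (closed world: YES)
print(all_creations_are_oscar_owa("sofiaCoppola"))   # don't know (open world)
```

The closed-world function answers True because the only asserted creation is an Oscar movie; the open-world function refuses to commit, which is exactly the "don't know" conclusion above.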
Dealing with the OWA in learning

• Solution 1: an alternative problem setting
• Solution 2: the K operator
• Solution 3: new performance measures

Dealing with the OWA in learning: alternative problem setting

"Closing" the knowledge base allows instance checks to be performed under the Closed World Assumption (CWA).

• By default: positive examples of the form C(a) and negative examples of the form ¬C(a), where a is an individual, requiring K ∪ {C ≡ D} ⊨ E+ and K ∪ {C ≡ D} ⊨ E−.
• Alternatively: examples of the form C(a), requiring K ∪ {C ≡ D} ⊨ E+ and K ∪ {C ≡ D} ⊭ E−.

Dealing with the OWA in learning: the K operator

• The epistemic K operator allows querying for known properties of known individuals with respect to the given knowledge base K.
• The K operator alters constructs such as ∀ so that they operate under a Closed World Assumption.
• Consider two queries:
    Q1: K ⊨ (∀creates.OscarMovie)(sofiaCoppola)
    Q2: K ⊨ (∀Kcreates.OscarMovie)(sofiaCoppola)
• Badea and Nienhuys-Cheng (ILP 2000) considered the K operator from a theoretical point of view.
• It is not easy to implement in reasoning systems and remains non-standard.

Dealing with the OWA in learning: new performance measures

d'Amato et al. (ESWC 2008) propose overcoming unknown answers from the reasoner (used as a reference system) by measuring the correspondence between the classification of instances with respect to the test concept C by the reasoner and by the definition induced by a learning system (a small computational sketch follows at the end of this section):
• match rate: number of individuals with exactly the same classification by both the inductive and the deductive classifier, w.r.t. the overall number of individuals;
• omission error rate: number of individuals not classified by the inductive method but relevant to the query according to the reasoner;
• commission error rate: number of individuals found relevant to C while they logically belong to its negation, or vice versa;
• induction rate: number of individuals found relevant to C or to its negation, while neither case is logically derivable from K.

Concept learning: algorithms

• supervised: YINYANG (Iannone et al., Applied Intelligence 2007), DL-Learner (Lehmann & Hitzler, ILP 2007), DL-FOIL (Fanizzi et al., ILP 2008), TERMITIS (Fanizzi et al., ECML/PKDD 2010)
• unsupervised: KLUSTER (Kietz & Morik, MLJ 1994)

DL learning as search

• Learning in DLs can be seen as search in a space of concepts.
• It is possible to impose an ordering on this search space, using subsumption as a natural quasi-order and generality measure between concepts: if D ⊑ C, then C covers all instances covered by D.
• Refinement operators may be applied to traverse the space by computing a set of specializations (respectively, generalizations) of a concept.

Properties of refinement operators

Consider a downward refinement operator ρ, and write C ⇝ρ D when D is obtained from C by ρ.
• complete: each point in the lattice is reachable, i.e. for every D ⊑ C there exists E such that E ≡ D and a refinement chain C ⇝ρ ... ⇝ρ E;
• weakly complete: for any concept C with C ⊑ ⊤, a concept E with E ≡ C can be reached from ⊤;
• finite: ρ(C) is finite for any concept C;
• redundant: there exist two different refinement chains from C to D;
• proper: C ⇝ρ D implies C ≢ D;
• ideal = complete + proper + finite.
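As a concrete illustration of a downward refinement operator, the following sketch (not any of the systems listed above; the vocabulary, the conjunction-only hypothesis language, and the operator itself are made-up assumptions) specializes a concept by adding either an atomic conjunct or an existential restriction, starting the search from ⊤.

```python
# A toy downward refinement operator over a restricted hypothesis language:
# a hypothesis is a conjunction, represented as a frozenset whose elements are
# atomic concept names or ("exists", role, atomic_concept) triples.
from itertools import product

ATOMIC_CONCEPTS = ["Artist", "Movie", "OscarMovie"]
ROLES = ["creates"]

TOP = frozenset()  # the empty conjunction plays the role of ⊤

def rho(concept: frozenset):
    """Return the one-step specializations of `concept`."""
    refinements = []
    for a in ATOMIC_CONCEPTS:                       # add an atomic conjunct
        if a not in concept:
            refinements.append(concept | {a})
    for r, a in product(ROLES, ATOMIC_CONCEPTS):    # add an ∃r.A conjunct
        if ("exists", r, a) not in concept:
            refinements.append(concept | {("exists", r, a)})
    return refinements

# One refinement chain:  ⊤  ⇝ρ  Artist  ⇝ρ  Artist ⊓ ∃creates.Movie
step1 = TOP | {"Artist"}
step2 = step1 | {("exists", "creates", "Movie")}
assert step1 in rho(TOP) and step2 in rho(step1)
print(step2)
```

On this toy operator the properties above are easy to check: it is finite (each call returns finitely many refinements) and, with respect to an empty TBox, proper (every step adds a conjunct that strictly specializes the concept), but it is not complete for a full DL, since it can never produce disjunctions, negations, or nested restrictions.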
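Returning to Solution 3, here is the promised sketch of how the four rates of d'Amato et al. can be computed, assuming both the reasoner and the induced classifier label each individual as an instance of C (+1), an instance of ¬C (-1), or unknown (0). The label vectors below are invented purely for illustration.

```python
# Evaluation rates comparing an inductive classifier against a reasoner,
# following the match / omission / commission / induction scheme above.

def evaluation_rates(deductive, inductive):
    """Return match, omission, commission and induction rates for paired labels."""
    assert len(deductive) == len(inductive) and deductive
    n = len(deductive)
    match = sum(d == i for d, i in zip(deductive, inductive))
    omission = sum(d != 0 and i == 0 for d, i in zip(deductive, inductive))
    commission = sum(d != 0 and i != 0 and d != i for d, i in zip(deductive, inductive))
    induction = sum(d == 0 and i != 0 for d, i in zip(deductive, inductive))
    return {name: count / n for name, count in
            {"match": match, "omission": omission,
             "commission": commission, "induction": induction}.items()}

# reasoner says:       member, non-member, unknown, member, unknown
reasoner_labels  = [+1, -1, 0, +1, 0]
# learned model says:  member, member,     member,  unknown, unknown
inductive_labels = [+1, +1, +1, 0, 0]
print(evaluation_rates(reasoner_labels, inductive_labels))
# {'match': 0.4, 'omission': 0.2, 'commission': 0.2, 'induction': 0.2}
```

The four rates always sum to one, since every individual falls into exactly one of the four cases.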
