Using a Lexical Semantic Network for the Ontology Building
Nadia Bebeshina-Clairet, Sylvie Despres, Mathieu Lafourcade

To cite this version:

Nadia Bebeshina-Clairet, Sylvie Despres, Mathieu Lafourcade. Using a Lexical Semantic Network for the Ontology Building. RANLP: Recent Advances in Natural Language Processing, Sep 2019, Varna, Bulgaria. 12th International Conference Recent Advances in Natural Language Processing, 2019. hal-02269128

HAL Id: hal-02269128 https://hal.archives-ouvertes.fr/hal-02269128 Submitted on 22 Aug 2019

Using a Lexical Semantic Network for the Ontology Building

Nadia Bebeshina-Clairet (LIRMM / LIMICS), Sylvie Despres (LIMICS, 74 rue Marcel Cachin, 93017 Bobigny), Mathieu Lafourcade (LIRMM, 860 rue de St Priest, 34095 Montpellier)

Abstract

In the present article, we focus on mining lexical semantic knowledge resources for ontology building and enhancement. Ontology building is often a descending and manual process where the high level concepts are defined, then detailed by human experts. We explore a way of using a multilingual or monolingual lexical semantic network to evolve or localize an existing ontology with a limited human effort.

1 Introduction

Today's termino-ontological resources are increasingly rich in terms of the data they rely upon. The scientific community works intensively on data acquisition for ontology building. For instance, the NeOn project¹ has been set up to provide a methodology for ontology engineering by integrating preexisting knowledge resources into an ontology building process. The NeOn methodology contains a consistent framework for modular ontology building as well as for setting up ontology networks. Here we focus on exploiting Lexical Semantic Networks (LSNs) to enrich an ontology or accompany the ontology building process. We assume that LSNs represent knowledge as it is expressed through human language whereas ontologies provide a formal description (specification) of a conceptualization (concepts and relationships between those concepts) shared by a community of agents. A concept corresponds to a set of individuals sharing similar characteristics and may or may not be lexicalized. Thus, the ontology labels cannot be polysemous. The strength of ontologies is in their formal consistency. Their weaknesses are linked to their coverage, size (as stated in (Raad and Cruz, 2015), "large ontologies usually cause serious scalability problems"), and the human effort needed for their building. The potential of the LSNs is linked to the large amount of explicit semantic information they contain. However, a filtering process is needed to discriminate irrelevant information (polysemy, noise).

¹ http://neon-project.org/nw/About_NeOn.html

2 State of the art

The opportunity of ontology construction empowered by the use of Natural Language Processing (NLP) techniques and tools has been explored for more than 20 years. Among the achievements, one can distinguish the tools which take into account the difference between the lexical term and the ontology concept (differentiated tools) and those that do not make such a distinction. Differentiated tools and methods suggest extracting the terminological units from texts and organizing them as a network using a set of hierarchical and equivalence relation types. Such a network guides the ontology expert through the conceptualisation and ontology building process. Such a process relies on an intermediary structure, a termino-ontology (for instance, the Terminae suite (Szulman, 2012)). Undifferentiated tools use statistical information to suggest the candidate concepts. They exploit such methods as formal concept analysis (Mondary, 2011) or knowledge based methods (for instance, TextToOnto²). "Lexical ontologies" (Abu Helou et al., 2014) are successfully used for ontology building. Numerous approaches targeted at high level ontology or information retrieval ontology based on general knowledge (such as (Marciniak, 2013)) rely on the Princeton WordNet (PWN). Others use PWN and domain specific semantic lexicons for forming the concepts (Turcato et al., 2000). Many other techniques use distributional semantics to learn lightweight ontologies, for example, (Wong, 2009).

In the framework of corpora based approaches to ontology building such as described in (Kietz et al., 2000), the idea of a notable (salient, relevant) element or relevant piece of knowledge (RPK) has been introduced. It corresponds either to the frequent terms appearing in a corpus or to the tacit knowledge contained in texts. Such tacit knowledge corresponds to the semantic relationships (subsumption relationship and other specialized relationships). Their presence in texts may take the form of "indices". In contrast, the explicit elements may reveal the presence of concepts. The main drawback of such a definition of RPKs is that it relies on the features defined or recorded for a particular language. In addition, statistical criteria are often preferred and it is difficult to qualify such RPKs from the semantic point of view and in a language independent manner. In this paper, we detail the experiments we conducted to provide a new definition of the RPKs based on structured lexical semantic information and describe the way the RPKs so defined can be used to help the top down ontology building process.

3 Resources

In the present section, we describe the resources we used in our experiments.

The RezoJDM (Lafourcade, 2007) is a lexical semantic network (LSN) for French built using crowdsourcing methods and, in particular, games with a purpose (GWAPS) such as JeuxDeMots³ and additional games. This common sense network has been built since 2007. It is a directed, typed and weighted graph. At the time of our writing, RezoJDM contains 2.7 million terms that are modeled as nodes of the graph and 240 million relations (arcs).

The MLSN (Bebeshina-Clairet, 2019) is a multilingual LSN (it covers French, English, Spanish, and Russian) with an interlingual pivot built for the cuisine and nutrition domain. This network is inspired by the RezoJDM in terms of its model. Structurally speaking, the MLSN is a directed, typed, and valuated graph. It contains k sub-graphs corresponding to each of the k languages it covers and a specific sub-graph which fulfills the role of the interlingual pivot. Similar to the RezoJDM, we call terms the nodes of the MLSN and relations its typed, weighted, and directed arcs. The MLSN nodes may correspond to one of the following types: lexical items (garlic), interlingual items (pertaining to the interlingual pivot and also called covering terms), relational items (i.e. relationship reifications such as salad[r_has_part]garlic), and category items, which model parts of speech or other morpho-syntactic features (e.g. Noun:AccusativeCase).

As it has been difficult to set up the pivot using a multilingual embedding (joining multiple spaces, one per language) as well as to avoid pairwise alignment based on combinatorial criteria, the pivot has been started as a natural one using the English edition of DBNary (Sérasset, 2014). It incrementally evolves to become interlingual. The pivot evolution relies on sense-based alignments between the languages of the MLSN and aims at taking into account the difference of sense "granularity" in different languages. For example, stew in English can be translated into French as pot-au-feu and ragoût.

² https://sourceforge.net/p/texttoonto/wiki/
³ http://www.jeuxdemots.org

It reflects the conceptualization discrepancy as ragoût refers to sautéing the ingredients and then adding water whereas pot-au-feu refers to boiling them together in a larger amount of water than used for the ragoût. In the MLSN pivot we have the interlingual term corresponding to stew (which covers the English term) with two hyponyms corresponding to the French terms. The alignments are progressively obtained through external resources or by inference. Thus the pivot can be considered as a union of word senses lexicalized or identified in the languages covered by the MLSN. Even though we assume the pivot as being interlingual, it is still close to a natural one.

A relation r ∈ R is a sextuplet r = <s, t, type, v, l_s, l_t> where s and t correspond respectively to the source and the target term of the relation. The relation type is a typical relation type. It may model different features such as taxonomic and part-whole relations (r_isa, r_hypo, r_has_part, r_matter, r_holo), possible predicate-argument relations (typical object r_object, location r_location, instrument r_instr of an action), "modifier" relations (typical characteristic r_carac, typical manner r_manner) and more⁴. The relationship valuation v corresponds to the characteristics of the relation which are its weight, confidence score, and annotation. The relation weight may be negative in order to model noise and keep the information about erroneous relations easy to access programmatically so that they cannot affect the inference processes. The confidence score is a score attributed to a particular data origin (external resource, inference process). In practice, this feature is an array as different origins may provide the same relation. The confidence information is provided as an argument to the function that maps from some external knowledge resource to the MLSN. In the case of a relation calculated by an inference process, it corresponds to the precision evaluated on a sample of candidate relations returned by this process. To annotate a relation we add complementary information that allows qualifying this relation. Figure 1 details and exemplifies the annotation scheme.

Figure 1: MLSN: relationship annotation scheme.

The labels l_s and l_t correspond to the language (sub-graph) labels. At the time of our writing, the MLSN contains 821 781 nodes and 2 231 197 arcs. It covers 4 languages: English, French, Russian, and Spanish.

MIAM (Després, 2016)⁵ is a modular termino-ontology for digital cooking. It provides the knowledge necessary for the elaboration of general nutritional suggestions. The knowledge model of this ontology gathers expert knowledge on food, food transformation, cooking actions, relevant dishes that reflect French culinary tradition, and the recipes necessary to cook such dishes. MIAM contains about 7 000 nodes and 30 000 semantic (non subsumption/ontological is-a) relations.

4 Method: immersion - projection

4.1 Summing up the Method

Our method is built upon the idea of projecting a model (the MIAM model) onto a multilingual or monolingual LSN (respectively MLSN and RezoJDM) in order to extract an intermediary resource that can be used by ontology or domain experts in the scope of information retrieval or validation of the automatically suggested pieces of knowledge. Our method differs from others by the definition of the RPK and by the use of a non ontological semantically structured resource for ontology building.

⁴ We also introduced more specific relation types such as r_entailment, r_cause, r_telic_role, r_incompatible, r_before, r_after etc.
⁵ http://www-limics.smbh.univ-paris13.fr/ontoMIAM/

We define RPK as follows: "a relevant piece of knowledge is either a term or a relation or a semantic structure which is known as qualified and qualifying". Qualified refers to the possibility to describe the RPK in a discrete way (i.e. by enumerating the typed relations). If the RPK is a term, it needs to have a high in-degree (which reveals its conceptual role as it is used to define other terms of the network). If the RPK is a relation, it needs to be contextualized (through the annotation mechanism represented in figure 1 or through the constraints put on source and/or target terms of the relation). If the presumed RPK is a graph structure (path, sub-graph), it needs to possess a certain number of occurrences in the network. Qualifying refers to the possibility to use the candidate RPK for an endogenous inference process. If the RPK is a term, it needs to have hypernyms and hyponyms among its neighbours. It has to be aligned with other terms pertaining to the other languages of the LSN (if such LSN is multilingual). If the RPK is a relation, it must not be unique (other real or potential⁶ relations must exist in the network). If the candidate RPK is a structure, its terms and relations must be qualifying.

Here we detail the experiments that have been conducted on the basis of the MLSN in order to propose "pseudo-class" and "pseudo-property" candidate RPKs to enhance the MIAM ontology and those concerning the enrichment of an ontology draft using the monolingual LSN, RezoJDM (Lafourcade, 2007). These experiments rely on lexical knowledge. Therefore, the resulting RPKs have no pretension to ontological validity. The decision pertains to the human expert.

4.2 Immersion

The projection of a given ontology model onto a LSN starts by the immersion of such a model. The immersion mechanism uses a set of manually defined mapping rules. It is possible to generate them automatically for the ontological resources that exploit standard vocabularies (such as RDFS, SKOS and other machine readable formats). The input of the immersion algorithm is the reference ontology (MIAM) and the set of mapping rules whereas its output is the action of inferring terms and relations in the target LSN (MLSN, RezoJDM).

In their general form, the mapping rules state: "If x and y are respectively domain and range of an Object Property p of the ontology to be immersed and y is a subclass of C, then x has a relation R with y and y has a relation is-a with C in the receiving (target) LSN". Such rules have been defined for the multilingual experiment in two steps. First, for each of the 93 MIAM properties, we determined relevant MLSN semantic relation types (or set of types). Thus, the Object Property aPourProduitInitial (hasInitialProduct) corresponds to the substance and part-whole meronymy (MLSN relations typed r_has_part and r_matter). Second, we mapped the ontology labels to the MLSN terms by coincidence (3 930 terms; e.g. poulet basquaise formally denotes a MIAM concept, and a lexical item with the same label already exists in the MLSN) or by composition (4 135 terms; e.g. unité mesure capacité doesn't correspond to any existing MLSN term because it doesn't correspond to any commonly used collocation in French; this label is split and integrated into the MLSN with the semantic relations that link its parts).

As part of the monolingual experiment, 115 descriptors have been automatically expressed in French on the basis of their Uniform Resource Identifier (URI) strings. All the terms except one were already present in the RezoJDM network. We exploited the relations typed r_carac (typical characteristic) for this experiment. This relation has been annotated using the URIs of the ontology properties. aPourDescripteurBruit (hasSoundDescriptor) is immersed as follows: croûte r_carac :: bruit croustillant ("croûte has typical characteristic linked to the noise croustillant"⁷).

⁶ Relations that can be calculated using inference.
⁷ crust has typical characteristic linked to the sound crusty

The premises of the mapping rules rely on the contextualization of the LSN relations. Such contextualization is possible when using sets of hypernyms and neighborhood semantic relations of the source and target terms of a relation. Meta-information attached to the LSN relations (annotations, weight) may also be used. For instance: pétrir r_object pâte ∧ pétrir r_isa technique de base ∧ pâte r_isa préparation⁸. As part of the immersion process the ontology labels become LSN terms that can be polysemous.

⁸ knead r_object dough ∧ knead r_isa basic technique ∧ dough r_isa mixture

4.3 Projection

4.3.1 Inference in the LSN Context

In the MLSN context we set up several algorithms to discover relevant pieces of knowledge (RPK) of the types "class/individual" (ci) and "ontology property" (op). To discover the RPK(ci) we compare the neighborhood terms inside a hierarchical chain which goes up to a high level MIAM concept immersed into the LSN. For the RPK(op) we look for (real or possible) MLSN relations similar to the immersed MIAM properties. The inference scheme we use is the abduction scheme. When we have two similar terms (such as cohyponyms) the relations held by one of them can be proposed for the other. For a term T, the abduction implies selecting a set of similar terms (according to some criteria) in order to propose the relations held by those similar terms to T.

4.3.2 Discovering the RPK(ci)

In MIAM, the general axioms concern the disjunction between the MIAM classes which is the basis of the ontology consistency. To translate them in terms of the MLSN, we considered the labels of the classes listed in the axioms in order to identify the criteria that could have determined the disjunction. We manually analyzed a subset of the MIAM axioms and came up with the following criteria: affiliation (r_has_part, e.g. a specific label (organic)), transformation (r_carac, e.g. boiled mixture, cubed vegetable), composition (r_matter, e.g. produit à base de poisson "fish based product"), category based distinctions (r_hypo, e.g. volaille type dinde "turkey type poultry"). This analysis allowed selecting a subset of relation types to consider during the experiment. The RPK(ci) inference includes two steps: validation of the hierarchical chain (figure 2) and RPK(ci) candidate suggestion.

Figure 2: Hierarchy chain validation. Ti are the terms of the hierarchy chain. We check by triangulation their semantic relatedness and use a subset of relation types (such relations are noted R) for that.

We calculated and validated hierarchical chains corresponding to 1 322 top MIAM concepts pertaining to the Aliment module. First we obtained 132 213 chains. After filtering them by weight, the set has been reduced to 53 749 chains (40% of the initial set). Still, a certain number of redundancies may exist inside this statistically pre-filtered set since a long chain may include several shorter ones. The logical filtering by triangulation left us with a set containing 9 600 hierarchical chains (18% of the number of statistically filtered chains and 7% of the initial number of candidates).

Hierarchical chain examples after filtering:
(1) "baguette complète → pain complet → pain → ingrédient de recette de cuisine → aliment"
(2) "angélique → confiserie → bonbon".

RPK(ci) examples:
truffe chocolat subClassOf chocolat
pomme à cidre subClassOf pomme
sucre de pomme subClassOf confiserie

The analysis and validation of the hierarchical chains entails a significant memory load. The complexity of the algorithm depends on the importance of the MIAM concept being processed as well as on the length of the hierarchical chains that are considered. The use of a subset of the semantic relation types available in the MLSN reduces the number of combinations to process. Thus, given that the highest in-degree typed r_isa in the French sub-graph of the MLSN is 5 264 (for the term aliment) and that the maximum length is l = 9, the complexity in the worst case would be O(d_isa^l) or O(5 264^9) = 3.103436942 × 10^33.

Table 1 introduces the results obtained for the discovery of RPK(ci) related to the top level concept Aliment in the French sub-graph of the MLSN.

#candidates  #valid  %valid  #new   %new
11 520       11 289  98%     4 741  42%

Table 1: The RPK(ci) discovery.

The automatic evaluation of the proposed RPK(ci) would mostly rely on similarity measures. However, the projection step implicitly relies on relatedness and similarity between the LSN terms. Thus, in our future work, the RPK(ci) evaluation will need human expert decisions.

4.3.3 The RPK(op) Discovery

The RPK(op) discovery seems to be particularly useful in the context of multilingual ontology building or localization of an existing ontology. In our experience, each module of the MIAM ontology has its own hierarchy of properties. While immersing them into the MLSN, these properties have been expressed in terms of semantic relations contextualized using annotations. The choice of the MLSN semantic relation type made for these properties allows us to distinguish the following cases for the MIAM Object Properties (OPs): composition based (aPourProduitInitial, "hasInitialProduct"); related to processes (aPourMethodeDeConservation, "hasConservationMethod"); temporal and spatial relation based (aPourMoisPrimeur, "hasEarlyMonth"); characteristic based (aPourEtat, "hasState"); OPs with a specific sub-graph⁹ (aPourAlimentAmi, "hasFriendlyFoodItem").

⁹ A subset of MLSN terms connected through semantic relations.

The MIAM ontology we try to enrich counts 21 565 instances of Object Properties. Once they are immersed into the MLSN, one could consider that we have the same number of inference rule instances that can be used for the cross-lingual RPK(op) discovery. A naive approach would be setting up a cross-lingual inference mechanism. However, such an approach would be error-prone due to the potential alignment and polysemy issues. In addition, as MIAM has been built according to the top-down methodology by a community of domain experts, it contains a variable number of instances per property. The naive approach would reiterate this imbalance.

To refine the RPK(op) discovery, we experimented with a rule based approach. First, the validity of the rule for the source language is calculated. Second, structures similar to those specified by the rule are discovered in the MLSN (in other languages).
The rule has the following form:
property=aPourEtatPhysique (property name)
src=?s (domain, set of terms)
reltype=r_carac (relation type)
tgt=?o (range, set of terms)
source_isa=aliment (src hypernyms)
target_isa=etat physique (tgt hypernyms)
annotation=int:physical state (meta-information)
src_feat=OUT/r_pos/int:Noun (in and out relations that characterize terms in the source set)
tgt_feat=OUT/r_pos/int:Adj (in and out relations that characterize terms in the target set)
If a given rule allowed detecting enough structures in the source language (at least 2 structures), it is considered as a valid one and can generate the qualifying object. Thanks to this object, candidate structures are detected in the other language sub-graphs of the MLSN. The mechanism of RPK(op) discovery reveals the following elements that allowed discovering new pieces of knowledge: possibly annotated semantic relation (case of the properties such as aPresenceLactose, aPresenceGluten); specific pattern (defined by rules); complex structure for properties related to processes. The results obtained using possibly annotated relations (in particular, Data Properties) are presented in table 2. The potential improvement is estimated as an increase compared to the initial number of property instances (impr.%).

#DP               #MIAM  #RPK(op)  filt.  +%
aTeneurLipide     0      4 741     3 271  -
aPresenceLactose  2 593  530       408    +16%
aPresenceGluten   289    820       762    +263%

Table 2: The RPK(op) discovery on the basis of simple semantic relations.

Example of the output of the rule-based algorithm: ru:jarkoje aPourProduitDiscriminant podlivka ("stew hasDiscriminatingProduct sauce") and en:stew aPourProduitInitial en:vegetable (produce) ("stew hasInitialProduct vegetable"). Our rule-based RPK(op) experiment (given the actual state of the MLSN) yielded the results listed in table 3.

m  prop                 #in    en    fr    es   ru     #out   %aug
A  aPourProdInit.       2031   292   1208  203  2 245  3 039  +149%
A  aPourEtatPhys.       543    30    29    10   53     85     +16%
A  aPourForme           39     77    78    5    37     132    +338%
A  aPourLabel           114    15    11    3    1      29     26%
A  aPourMethodC.        115    94    101   13   156    309    +269%
A  aPourMois            116    117   221   23   28     116    +288%
A  aPourRegion          289    98    71    2    57     216    +75%
A  aPourProdCon.        98     256   302   143  103    570    +582%
A  aPourProdInitialA.   41     94    147   12   567    259    +633%
P  aPourTypeDeCuis.     23     155   124   80   285    686    +2 986%
P  aPourDomCul.         82     112   92    120  1313   1276   +1 557%
P  aPourDecoupe         82     82    78    56   77     272    +332%
S  aPourSaveur          752    51    78    47   98     232    +31%
S  aPourDescripteurBr.  119    67    80    10   6      159    +134%
S  aPourCouleur         233    192   451   59   423    911    +391%
S  aPourAspectSurf.     176    40    35    12   52     101    +58%
S  aPourSensationT.     54     84    77    21   12     155    +287%
-  Total                5388   2384  3960  937  4953   9531   +177%

Table 3: Rule-based approach. m - name of module (Aliment (A), Preparation (P), Sensory (S)), prop - property, #in - MIAM triples, en, fr, es, ru - MLSN sub-graphs, #out - overall number of suggested RPK(op) after filtering, %aug - potential improvement.

Fully automated structure-based evaluation such as described in (Fernández et al., 2009) may be chosen to compare to other resources available on the Web such as (Dooley et al., 2018). To address the ontology accuracy, completeness, conciseness, efficiency, consistency, and other features (Raad and Cruz, 2015), a combination of methods is needed. In particular, a gold standard ontology, specific tasks and corpora may be used for evaluation. Task-based evaluations such as semantic analysis (Bebeshina-Clairet, 2019) and dietary conflict detection from dish titles (Clairet, 2017) have been used for the MLSN. To evaluate the output of the immersion-projection method, we need to organize our triples into a fully structured ontology. This will be one of the priorities of our future work.

4.3.4 Towards the Automatic Suggestion of the RPKs(op)

To extend the RPK discovery experiment, we tried to automatically suggest pseudo ontology properties to be submitted to the domain and ontology human experts. We considered the ontology SensoMIAM¹⁰ for this experiment. This ontology is a MIAM module but we considered it as a "draft" ontology as the sensory aspect modeling is a flourishing research and development area and the SensoMIAM could be improved. We used the monolingual LSN (RezoJDM). The SensoMIAM contains sensory descriptors such as DescripteurTact ("TactileDescriptor") = {astringent, filandreux, ..., nerveux}; DescripteurSubstance ("SubstanceDescriptor") = {aéré, dense, ..., épais}.

¹⁰ www-limics.smbh.univ-paris13.fr/sensoMIAM/

To calculate the RPK descriptors RPK(desc), we explored the semantics of the source terms of the relations typed r_carac. If the set of outgoing relations of such terms connects them to a food item and if they have a set of typical characteristics shared with other terms with a hypernym ≈ "food", the target term of their outgoing relations typed r_carac that is not present in the SensoMIAM can be suggested as a potential RPK(desc). The relation typed r_carac is annotated. The process is represented in figure 3. The experiment allowed us to suggest RPK(desc) such as: DescripteurArome = {sucré-salé, miellé, ..., vinaigré} or DescripteurTact = {écailleux, spongieux, ..., floconneux}.

We automatically suggested and semi-automatically validated 342 RPKs(desc). We explored the possibility of suggesting relevant RPKs to human experts. We defined 3 pseudo-properties for testing: aPourComposantFlaveur ("hasFlavourComponent"), aPourComposantToucher ("hasTouchComponent"), and aPourComposantAspect ("hasAspectComponent").
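The RPK(desc) suggestion step of Section 4.3.4 can be sketched in a few lines. This is a simplified reading of the described process, not the code used in the experiment: the function name, the `min_shared` threshold, and all toy data are assumptions, and the annotation of the r_carac relations is left out.

```python
from typing import Dict, Set


def suggest_descriptors(carac: Dict[str, Set[str]],
                        hypernyms: Dict[str, Set[str]],
                        known_descriptors: Set[str],
                        food_label: str = "aliment",
                        min_shared: int = 1) -> Set[str]:
    """Suggest r_carac targets of food terms that the ontology lacks.

    carac maps a source term to the targets of its r_carac relations.
    A term contributes suggestions only if it is a food item and shares
    at least `min_shared` characteristics with other food terms.
    """
    food_terms = [t for t in carac if food_label in hypernyms.get(t, set())]
    suggestions: Set[str] = set()
    for term in food_terms:
        # Characteristics held by every other food term.
        shared_pool: Set[str] = set()
        for other in food_terms:
            if other != term:
                shared_pool |= carac[other]
        if len(carac[term] & shared_pool) >= min_shared:
            # Keep only characteristics absent from the draft ontology.
            suggestions |= carac[term] - known_descriptors
    return suggestions


# Toy network fragment (invented for the example).
carac = {
    "génoise": {"spongieux", "aéré"},
    "baba": {"spongieux", "dense"},
    "citron": {"acide"},
    "éponge": {"spongieux"},
}
hypernyms = {
    "génoise": {"aliment"},
    "baba": {"aliment"},
    "citron": {"aliment"},
    "éponge": {"objet"},
}
known = {"aéré", "dense", "épais"}  # descriptors already in the draft ontology

suggested = suggest_descriptors(carac, hypernyms, known)
print(sorted(suggested))
```

In this toy run, "acide" is not suggested because no other food term shares it, and the non-food term is ignored altogether; the suggestions would still go to a human expert for validation, as in the semi-automatic validation described above.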

Figure 3: (1) relation annotation. (2) RPK(op) suggestion.

To populate them, we explored the RezoJDM relations typed r_has_part and r_matter and considered the characteristics that can be shared by a "whole" and its "parts". We tried to generalize to the "whole" some of the characteristics of its parts. Automatically suggested toy triples: veau Orloff aPourComposantFlaveur lard from {gras, viande} "veal Prince Orloff has a component that influences its flavour lard because they share fat, meat"; gratin aPourComposantAspect fromage from {gratiné, gras, brûlé} "gratin has a component that influences its aspect cheese as they share characteristics grilled, fat, burned". We automatically suggested 1 709 RPKs(op) for the pseudo-properties we explored. They have been automatically validated by constraining the range of the pseudo-property (it must be related to the terms "flavour", "touch", and "aspect") and by checking the sufficient intersection size between the relation sets typed r_carac of the "whole" and its "part".

Conclusion and perspectives

We described a method that attaches an intermediary resource containing relevant pieces of knowledge harvested from semantically structured resources in different languages to the main building process. The method suits other domains of knowledge and the amount of work necessary for the immersion process is proportional to the size and to the type of the ontological resource to be enhanced. It is also possible to use such resources as wordnets¹¹ as the basis for the intermediary resource. Among the difficulties linked to our method appear the differences between formal representation paradigms as well as the availability of well structured and semantically rich resources.

¹¹ http://globalwordnet.org/

References

Mamoun Abu Helou, Mustafa Jarrar, Matteo Palmonari, and Ch Fellbaum. 2014. Towards building lexical ontology via cross-language matching. pages 346–354.

Keith Allan. 2001. Natural Language Semantics. Blackwell.

Nadia Bebeshina-Clairet. 2019. Construction d'une ressource termino-ontologique multilingue pour les domaines de la cuisine et de la nutrition. Theses, Université Paris 13.

Christian Biemann. 2005. Ontology learning from text: A survey of methods. LDV Forum 20:75–93.

Nadia Clairet. 2017. Dish classification using knowledge based dietary conflict detection. In RANLP 2017.

Sylvie Després. 2016. Construction d'une ontologie modulaire. Application au domaine de la cuisine numérique. Revue d'Intelligence Artificielle 30(5):509–532. https://doi.org/10.3166/ria.30.509-532.

Zhendong Dong, Qiang Dong, and Changling Hao. 2010. Hownet and its computation of meaning. In Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations. Association for Computational Linguistics, Stroudsburg, PA, USA, COLING '10, pages 53–56. http://dl.acm.org/citation.cfm?id=1944284.1944298.

Damion Dooley, Emma Griffiths, Gurinder S. Gosal, Pier Luigi Buttigieg, Robert Hoehndorf, Matthew C. Lange, Lynn M. Schriml, Fiona S. L. Brinkman, and William W. L. Hsiao. 2018. Foodon: a harmonized food ontology to increase global food traceability, quality control and data integration. npj Science of Food 2. https://doi.org/10.1038/s41538-018-0032-6.

Miriam Fernández, Chwhynny Overbeeke, Marta Sabou, and Enrico Motta. 2009. What makes a good ontology? A case-study in fine-grained knowledge reuse. In Asunción Gómez-Pérez, Yong Yu, and Ying Ding, editors, The Semantic Web. Springer Berlin Heidelberg, Berlin, Heidelberg, pages 61–75.

E. Gaillard, J. Lieber, and E. Nauer. 2015. Improving ingredient substitution using formal concept analysis and adaptation of ingredient quantities with mixed linear optimization. In Computer Cooking Contest Workshop. Frankfort, Germany. https://hal.inria.fr/hal-01240383.

J.U. Kietz, A. Maedche, and R. Volz. 2000. A method for semi-automatic ontology acquisition from a corporate intranet. EKAW-2000 Workshop Ontologies and Text, Juan-Les-Pins, France, October 2000.

Mathieu Lafourcade. 2007. Making people play for Lexical Acquisition with the JeuxDeMots prototype. In SNLP'07: 7th International Symposium on Natural Language Processing. Pattaya, Chonburi, Thailand, page 7. https://hal-lirmm.ccsd.cnrs.fr/lirmm-00200883.

Mathieu Lafourcade. 2011. Lexique et analyse sémantique de textes - structures, acquisitions, calculs, et jeux de mots. (Lexicon and semantic analysis of texts - structures, acquisition, computation and games with words). https://tel.archives-ouvertes.fr/tel-00649851.

Ora Lassila and Deborah McGuinness. 2001. The role of frame-based representation on the semantic web. Technical report, Knowledge Systems Laboratory Report KSL-01-02, Stanford University, Stanford (USA).

Jacek Marciniak. 2013. Building wordnet based ontologies with expert knowledge. In LTC.

Thibault Mondary. 2011. Construction d'ontologies à partir de textes. L'apport de l'analyse de concepts formels. Theses, Université Paris-Nord - Paris XIII. Equipe RCLN. https://tel.archives-ouvertes.fr/tel-00596825.

Joe Raad and Christophe Cruz. 2015. A Survey on Ontology Evaluation Methods. In Proceedings of the International Conference on Knowledge Engineering and Ontology Development, part of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management. Lisbonne, Portugal. https://doi.org/10.5220/0005591001790186.

Lionel Ramadier. 2016. Indexation and learning of terms and relations from reports of radiology. Theses, Université de Montpellier. https://hal-lirmm.ccsd.cnrs.fr/tel-01479769.

Christophe Roche. 2007. Le terme et le concept : fondements d'une ontoterminologie. In TOTh 2007 : Terminologie et Ontologie : Théories et Applications. Annecy, France, pages 1–22. https://hal.archives-ouvertes.fr/hal-00202645.

Gilles Sérasset. 2014. DBnary: Wiktionary as a Lemon-Based Multilingual Lexical Resource in RDF. Semantic Web - Interoperability, Usability, Applicability. To appear. https://hal.archives-ouvertes.fr/hal-00953638.

Robyn Speer and Catherine Havasi. 2012. Representing general relational knowledge in ConceptNet 5. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012). European Languages Resources Association (ELRA), Istanbul, Turkey, pages 3679–3686.

Sylvie Szulman. 2012. Logiciel Terminae - Version 2012. TERMINAE est une plateforme d'aide à la construction de ressources termino-ontologiques à partir de ressources textuelles. https://hal.archives-ouvertes.fr/hal-00719453.

Andon Tchechmedjiev. 2016. Semantic Interoperability of Multilingual Lexical Resources in Lexical Linked Data. Theses, Université Grenoble Alpes. https://tel.archives-ouvertes.fr/tel-01681358.

Davide Turcato, Fred Popowich, Janine Toole, Dan Fass, Devlan Nicholson, and Gordon Tisher. 2000. Adapting a synonym database to specific domains. In ACL-2000 Workshop on Recent Advances in Natural Language Processing and Information Retrieval. Association for Computational Linguistics, Hong Kong, China, pages 1–11. https://doi.org/10.3115/1117755.1117757.

Piek Vossen. 2012. Ontologies. The Oxford Handbook of Computational Linguistics. https://doi.org/10.1093/oxfordhb/9780199276349.013.0025.

Wilson Wong. 2009. Learning lightweight ontologies from text across different domains using the web as background knowledge. Ph.D. thesis.

Manel Zarrouk. 2015. Endogeneous Consolidation of Lexical Semantic Networks. Theses, Université de Montpellier. https://hal-lirmm.ccsd.cnrs.fr/tel-01300285.