2010 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology Learning a Large Scale of Ontology from Japanese Wikipedia Susumu Tamagawa∗, Shinya Sakurai∗, Takuya Tejima∗, Takeshi Morita∗, Noriaki Izumiy and Takahira Yamaguchi∗ ∗Keio University 3-14-1 Hiyoshi, Kohoku-ku, Yokohama-shi, Kanagawa 223-8522, Japan Email: fs tamagawa,
[email protected] yNational Institute of Advanced Industrial Science and Technology 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan Abstract— Here is discussed how to learn a large scale II. RELATED WORK of ontology from Japanese Wikipedia. The learned on- tology includes the following properties: rdfs:subClassOf Auer et al.’s DBpedia [4] constructed a large information (IS-A relationships), rdf:type (class-instance relationships), database to extract RDF from semi-structured information owl:Object/DatatypeProperty (Infobox triples), rdfs:domain resources of Wikipedia. They used information resource (property domains), and skos:altLabel (synonyms). Experimen- tal case studies show us that the learned Japanese Wikipedia such as infobox, external link, categories the article belongs Ontology goes better than already existing general linguistic to. However, properties and classes are built manually and ontologies, such as EDR and Japanese WordNet, from the there are only 170 classes in the DBpedia. points of building costs and structure information richness. Fabian et al. [5] proposed YAGO which enhanced Word- Net by using the Conceptual Category. Conceptual Category is a category of Wikipedia English whose head part has the form of plural noun like “ American singers of German I. INTRODUCTION origin ”. They defined a Conceptual Category as a class, and defined the articles which belong to the Conceptual It is useful to build up large-scale ontologies for informa- Category as instances of the class.