TaxoPhrase: Exploring Knowledge Base via Joint Learning of Taxonomy and Topical Phrases Weijing Huang Wei Chen∗ Key Laboratory of High Confidence Software Technologies Key Laboratory of High Confidence Software Technologies (Ministry of Education), EECS, Peking University (Ministry of Education), EECS, Peking University
[email protected] [email protected] Tengjiao Wang Shibo Tao Key Laboratory of High Confidence Software Technologies SPCCTA, School of Electronics and Computer (Ministry of Education), EECS, Peking University Engineering, Peking University
[email protected] [email protected] ABSTRACT 106 Politics_of_Morelos Knowledge bases restore many facts about the world. But due to the 105 big size of knowledge bases, it is not easy to take a quick overview 104 Mathematics onto their restored knowledge. In favor of the taxonomy structure 3 10 Musicians_by_band and the phrases in the content of entities, this paper proposes an Freq 2 Albums_by_artist exploratory tool TaxoPhrase on the knowledge base. TaxoPhrase 10 (1) is a novel Markov Random Field based topic model to learn the 101 taxonomy structure and topical phrases jointly; (2) extracts the 0 10 0 1 2 3 4 5 topics over subcategories, entities, and phrases, and represents the 10 10 10 10 10 10 extracted topics as the overview information for a given category Outdegree in the knowledge base. The experiments on the example categories Mathematics, Chemistry, and Argentina in the English Wikipedia Figure 1: Distribution of categories’ out degrees to subcate- demonstrate that our proposed TaxoPhrase provides an effective gories, according to the study on the English Wikipedia. tool to explore the knowledge base. KEYWORDS the taxonomy, it seems straightforward to utilize the taxonomy to Knowledge Base, Exploratory Tool, Topical Phrases, Taxonomy answer Q1 and Q2.