AM03 Procv3.Pdf

Total Page:16

File Type:pdf, Size:1020Kb

AM03 Procv3.Pdf Contents A Fuzzy Set Approach to Query Syntax Analysis in Information Retrieval Systems ……. 1 Dariusz Josef Kogut Acquisition of Hypernyms and Hyponyms from the WWW …………………………….… 7 Ratanachai Sombatsrisomboon, Yutaka Matsuo, Mitsuru Ishizuka Micro View and Macro View Approaches to Discovered Rule Filtering ………………….. 13 Yasuhiko Kitamura, Akira Iida, Keunsik Park, Shoji Tatsumi Relevance Feedback Document Retrieval using Support Vector Machines ……………… 22 Takashi Onoda, Hiroshi Murata, Seiji Yamada Using Sectioning Information for Text Retrieval: a Case Study with the MEDLINE Abstracts …...………………………………… 32 Masashi Shimbo, Takahiro Yamasaki, Yuji Matsumoto Rule-Based Chase Algorithm for Partially Incomplete Information Systems ………….… 42 Agnieszka Dardzinska-Glebocka, Zbigniew W Ras Data Mining Oriented CRM Systems Based on MUSASHI: C-MUSASHI …………….… 52 Katsutoshi Yada, Yukinobu Hamuro, Naoki Katoh, Takashi Washio, Issey Fusamoto, Daisuke Fujishima, Takaya Ikeda Integrated Mining for Cancer Incidence Factors from Healthcare Data ……………….… 62 Xiaolong Zhang, Tetsuo Narita Multi-Aspect Mining for Hepatitis Data Analysis …………………………………………... 74 Muneaki Ohshima, Tomohiro Okuno, Ning Zhong, Hideto Yokoi Investigation of Rule Interestingness in Medical Data Mining ……………………………. 85 Miho Ohsaki, Yoshinori Sato, Shinya Kitaguchi, Hideto Yokoi, Takahira Yamaguchi Experimental Evaluation of Time-series Decision Tree …………………………………… 98 Yuu Yamada, Einoshin Suzuki, Hideto Yokoi, Katsuhiko Takabayashi Extracting Diagnostic Knowledge from Hepatitis Dataset by Decision Tree Graph-Based Induction ……………………………………….… 106 Warodom Geamsakul, Tetsuya Yoshida, Kouzou Ohara, Hiroshi Motoda, Takashi Washio Discovery of Temporal Relationships using Graph Structures …………………………… 118 Ryutaro Ichise, Masayuki Numao A Scenario Development on Hepatitis B and C ……………………………………………… 130 Yukio Ohsawa, Naoaki Okazaki, Naohiro Matsumura, Akio Saiura, Hajime Fujie Empirical Comparison of Clustering Methods for Long Time-Series Databases ………… 141 Shoji Hirano, Shusaku Tsumoto Classification of Pharmacological Activity of Drugs Using Support Vector Machine …….. 152 Yoshimasa Takahashi, Katsumi Nishikoori, Satoshi Fujishima Mining Chemical Compound Structure Data Using Inductive Logic Programming …… 159 Cholwich Nattee, Sukree Sinthupinyo, Masayuki Numao, Takashi Okada Development of a 3D Motif Dictionary System for Protein Structure Mining …………..… 169 Hiroaki Kato, Hiroyuki Miyata, Naohiro Uchimura, Yoshimasa Takahashi, Hidetsugu Abe Spiral Mining using Attributes from 3D Molecular Structures …………………………… 175 Takashi Okada, Masumi Yamakawa, Hirotaka Niitsuma Architecture of Spatial Data Warehouse for Traffic Management ………………………… 183 Hiroyuki Kawano, Eiji Hato Title Index …………………………………………………………...……………………...…. iii Author Index ……………..…………………………………………………………………… v A Fuzzy Set Approach to Query Syntax Analysis in Information Retrieval Systems Dariusz J. Kogut Institute of Computer Science, Warsaw University of Technology, Nowowiejska 15/19, 00-665 Warsaw, Poland [email protected] Abstract. This article investigates whether a fuzzy set approach to the natural language syntactic analysis can support information retrieval sys- tems. It concentrates on a web search since the internet becomes a vast resource of information. In addition, this article presents a module of syntax analysis of TORCH project where the fuzzy set disambiguation has been implemented and tested. 1 Introduction Some recent developments on information technology have concured to acceler- ate a research in the field of artificial intelligence [5]. The appeal of fantasizing about intelligent computers that understand human communication is practi- cally unavoidable. However, the natural language research seems to be one of the hardest problems of artificial intelligence due to the complexity, irregularity and diversity of human languages [8]. This article investigates whether a fuzzy set approach to natural language processing can support the search engines in web universe. An overview of syn- tactic analysis based on fuzzy set disambiguation will be presented and may provide some insight for further inquiry. 2 Syntax Analysis in TORCH - a Fuzzy Set Approach TORCH is an information retrieval system with a natural language interface. It has been designed and implemented by author in order to facilitate searching of physical data in CERN (European Organization for Nuclear Research, Geneva). TORCH relies on a textual database composed of web documents published in the internet and intranet networks. It became an add-on module in Infoseek Search Engine environment [6]. 2.1 TORCH Architecture The architecture of TORCH has been shown in figure 1. To begin with, a natural language query is captured by the TORCH user interface module and transmit- ted to the syntactic analysis section. After the analysis is completed, the query Fig. 1. TORCH Architecture is being conveyed to the semantic analysis module [6], then interpreted and translated into a formal Infoseek Search Engine query. Let us focus on the syntax analysis module of TORCH which is shown in figure 2. The syntax analysis unit has been based on a stepping-up parsing al- gorithm and equipped with a fuzzy logic engine that supports syntactic disam- biguation. The syntactic dissection goes through several autonomous phases [6]. Once the natural language query is transformed into a surface structure [4], the preprocessor unit does the introductory work and query filtering. Fig. 2. Syntax Analysis Module Next, the morphological transformation unit turns each word from a non- basic form into its canonical one [7]. WordNet lexicon and CERN Public Pages Dictionary provide the basic linguistic data and therefore play an important role in both syntactic and semantic analysis. WordNet Dictionary has been built on Princeton University [7], while CERN Public Pages Dictionary extends TORCH knowledge on advanced particle physics. At the end of dissection process, a fuzzy logic engine formulates the part of speech membership functions [6] which char- acterize each word of the sentence. This innovatory approach allows to handle the most of syntax ambiguity cases which happen in English (tests proved that approx. 90% of word atoms are properly identified). 2.2 Fuzzy Set Approach The fuzzy logic provides a rich and meaningful addition to standard logic and creates the opportunity for expressing those conditions which are inherently imprecisely defined [11]. The architecture of fuzzy logic engine has been shown in figure 3. Fig. 3. Fuzzy Logic Engine The engine solves several cases where syntactic ambiguation may occur. Therefore, it constructs and updates four descrete membership functions des- ignated as ΨN , ΨV , ΨAdj and ΨAdv. At first stage, the linguistic data retrieved from WordNet lexicon are used to form the draft membership functions, as it is described below: Let us assume that : LK - refers to the amount of categories that may be assigned to the word; Nx - refers to the amount of meanings within the selected category; Note that x ∈M,whereM = {N(oun),V(erb),Adj,Adv}; NZ - refers to the amount of meanings within the all categories; w(n) - refers to the n-vector (n-word) of the sentence; ξK - is a heuristic factor that describes a category weight (e.g., ξK =0.5) The membership functions have been defined as follows: NZ [w(n)]= Nx[w(n)] x∈M 1 − ξK ξK ΨN [w(n)]= · NN [w(n)]+ NZ [w(n)] LK Similarly, 1 − ξK ξK ΨV [w(n)]= · NV [w(n)]+ NZ [w(n)] LK 1 − ξK ξK ΨAdj[w(n)]= · NAdj[w(n)]+ NZ [w(n)] LK 1 − ξK ξK ΨAdv[w(n)]= · NAdv[w(n)]+ NZ [w(n)] LK In order to illustrate the algorithm, let us take an example query: Why do the people drive on the right side of the road? Figures 4 and 5 describe the draft ΨN , ΨV , ΨADJ and ΨADV functions. 1 1 VERB NOUN Ψ Ψ 0.75 0.75 0.5 0.5 0.25 0.25 n n 0 0 0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 Fig. 4. ΨN and ΨV membership functions 1 1 ADJ ADV Ψ Ψ 0.75 0.75 0.5 0.5 0.25 0.25 n n 0 0 0 1 2 3 4 5 6 7 8 9 10 11 12 0 1 2 3 4 5 6 7 8 9 10 11 12 Fig. 5. ΨADJ and ΨADV membership functions Certainly, each word must be considered as a specific part of speech and may belong to only one of the lingual categories in the context of a given sentence. Thus, the proper category must be assigned in a process of defuzzyfication, which may be described by the following steps and formulas (KN , KV , KAdj and KAdv represent the part of speech categories): Step 1.: w(n) ∈KN ⇔ ΨN [w(n)] ≥ max(ΨAdj[w(n)],ΨV [w(n)],ΨAdv[w(n)]) Step 2.: w(n) ∈KAdj ⇔ ΨAdj[w(n)] ≥ max(ΨV [w(n)],ΨAdv[w(n)]) ∧ (ΨAdj[w(n)] >ΨN [w(n)]) Step 3.: w(n) ∈KV ⇔ ΨV [w(n)] > max(ΨN [w(n)],ΨAdj[w(n)]) ∧ (ΨV [w(n)] ≥ ΨAdv[w(n)]) Step 4.: w(n) ∈KAdv ⇔ ΨAdv[w(n)] > max(ΨN [w(n)],ΨAdj[w(n)],ΨV [w(n)]) Unfortunately, the syntactic disambiguation based on lexicon data exclusively cannot handle all the cases. Therefore, a second stage of processing - based on any grammar principles - seems to be necessary [2] [3]. For that reason, TORCH employs its own English grammar engine [6] (shown in figure 6) with a set of simple grammar rules that can verify and approve the membership functions. Fig. 6. English Grammar Engine It utilizes a procedural parsing, so the rules are stored in a sort of static library [10] and may be exploited on demand. When the engine exploits English grammar rules upon the query and verifies the fuzzy logic membership functions, the deep structure of the question is constructed [1] [9] and the semantic analysis initialized [6]. The query syntax analysis of TORCH seems to be simple and efficient. A set of tests based on 20000-word samples of e-text books has been done, and the accuracy results (Acc-A when the English grammar engine was disable, and Acc-B with the grammar processing switched on) are shown in table 1. Natural language clearly offers advantages in convenience and flexibility, but also involves challenges in query interpretation.
Recommended publications
  • IBS Annual Summer School Preface Editorial Board
    IBS Annual Summer School Foundation IBS supports the establishment, growth and preservation of scientific oriented interdisciplinary networks. National and international scientists and experts are brought together for research and technology transfer, where the foundation IBS provides the suitable environment for conferences, workshops and seminars. Especially young researchers are encouraged to join all IBS events. The series IBS Scientific Workshop Proceedings (IBS-SWP) publishes peer reviewed papers contributed to IBS - Workshops. IBS-SWP are open access publications, i.e. all volumes are online available at IBS website www.ibs-laubuch.de/ibs-swp and free of charge. IBS Scientific Workshop Proceedings aim at the ensuring of permanent visibility and access to the research results presented during the events staged by Foundation IBS. Preface Annual summer school was organized successful in this year from 29th June to 1th July in Laubusch, Germany. In this summer school attended bachelor, master and doctors students, researchers from Mongolia, Germany, Russia, India and Turkey. Topics of SS2015 were: e-learning, embedded systems and international cooperation between universities of partner countries. One of the highlighted points of this year was attendance of bachelor students from Mongolia and video session with Russian delegation. Master students from India and Turkey presented own ongoing research and got live feedback from researchers and professors. Next highlight of summer school was formal agreement with extension of Summer School in next year by Harbin Technical University, China. By idea of Russian university summer school will be change places from year to year by partner university countries. One of the main goals of this international event is to give opportunity to students and researchers of partner universities to work together during school time actively and freely in frame if science and international cooperation.
    [Show full text]
  • Anvisningar Och Mall
    Master's thesis Two years International Master’s Degree in Computer Engineering Cross-Platform Solution for Mobile Application Development MA level, 30credits Aditya Polisetti Cross-Platform Solution for Mobile Application Development Abstract Aditya Polisetti 2014-11-10 Abstract With the possibility to increase the smart phone market, the mobile application development is experiencing a rapid improvement in terms of both revenue and innovation. Considering the demand in the applica- tion development market, there is an urge to develop applications faster and smoother with the growing number of different platforms and technologies. Many companies and brands are making their marketing strategies more efficient by using mobile applications. The constant growth in mobile devices and operating systems raise several problems and challenges in mobile application development. With this growth, a developer finds it difficult to build applications keeping the native technologies in mind with respect to the device platform. Application development for a mobile device mainly deals with native development platforms such as C, Objective-C, Java, C#, J2ME, C++ etc., and is incompatible with the cross-platform support. HTML5 is a solution that incorporates cross-platform capability in all these devices. The purpose of this thesis is to research the different approaches of mobile application development and to find a favourable environment for a cross mobile platform development. The smart phone application development can built in three environments and the applications can differ such as native applications, cross-platform applications and hy- brid applications. The research work includes the study of several cross-platform frame- works, responsive web designing and location-based web services.
    [Show full text]
  • System Configuration Guide for Cisco Unified Communications Manager, Release 11.5(1)SU7 and SU8
    System Configuration Guide for Cisco Unified Communications Manager, Release 11.5(1)SU7 and SU8 First Published: 2020-08-25 Last Modified: 2021-04-26 Americas Headquarters Cisco Systems, Inc. 170 West Tasman Drive San Jose, CA 95134-1706 USA http://www.cisco.com Tel: 408 526-4000 800 553-NETS (6387) Fax: 408 527-0883 THE SPECIFICATIONS AND INFORMATION REGARDING THE PRODUCTS IN THIS MANUAL ARE SUBJECT TO CHANGE WITHOUT NOTICE. ALL STATEMENTS, INFORMATION, AND RECOMMENDATIONS IN THIS MANUAL ARE BELIEVED TO BE ACCURATE BUT ARE PRESENTED WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. USERS MUST TAKE FULL RESPONSIBILITY FOR THEIR APPLICATION OF ANY PRODUCTS. THE SOFTWARE LICENSE AND LIMITED WARRANTY FOR THE ACCOMPANYING PRODUCT ARE SET FORTH IN THE INFORMATION PACKET THAT SHIPPED WITH THE PRODUCT AND ARE INCORPORATED HEREIN BY THIS REFERENCE. IF YOU ARE UNABLE TO LOCATE THE SOFTWARE LICENSE OR LIMITED WARRANTY, CONTACT YOUR CISCO REPRESENTATIVE FOR A COPY. The Cisco implementation of TCP header compression is an adaptation of a program developed by the University of California, Berkeley (UCB) as part of UCB's public domain version of the UNIX operating system. All rights reserved. Copyright © 1981, Regents of the University of California. NOTWITHSTANDING ANY OTHER WARRANTY HEREIN, ALL DOCUMENT FILES AND SOFTWARE OF THESE SUPPLIERS ARE PROVIDED “AS IS" WITH ALL FAULTS. CISCO AND THE ABOVE-NAMED SUPPLIERS DISCLAIM ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING, WITHOUT LIMITATION, THOSE OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THIS MANUAL, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
    [Show full text]
  • Acquisition of Hypernyms and Hyponyms from the WWW
    Acquisition of Hypernyms and Hyponyms from the WWW Ratanachai Sombatsrisomboon *1 Yutaka Matsuo *2 Mitsuru Ishizuka *1 *1 Department of Information and Communication Engineering *2 National Institute of Advanced Industrial Science School of Information Science and Technology and Technology (AIST) University of Tokyo [email protected] Abstract. Recently research in automatic ontology construction has become a hot topic, because of the vision that ontology will be the core component to realize the semantic web. This paper presents a method to automatically construct ontology by mining the web. We introduce an algorithm to automatically acquire hypernyms and hyponyms for any given lexical term using search engine and natural language processing techniques. First, query phrase is constructed using the frame “X is a/an Y”. Then corpora sentences is obtained from the result from search engine. Natural language processing techniques is then employed to discover hypernym/hyponym lexical terms. The methodologies proposed here in this paper can be used to automatically augment natural language ontology, such as WordNet, domain specific knowledge. 1. Introduction Recently research in ontology has been given a lot of attention, because of the vision that ontology will be the core component to realize the next generation of the web, Semantic Web [1]. For example, natural language ontology (NLO), like WordNet [2], will be used to as background knowledge for a machine; domain ontology will be needed as specific knowledge about a domain when particular domain comes into discussion, and so on. However, the task of ontology engineering has been very troublesome and time -consuming as it needs domain expert to manually define the domain’s conceptualization, left alone maintaining and updating ontologies.
    [Show full text]
  • Viscos Mobile: Learning Computer Science on the Road Teemu H
    ViSCoS Mobile: Learning Computer Science on the Road Teemu H. Laine, Niko Myller and Jarkko Suhonen Department of Computer Science, University of Joensuu P.O. Box 111 FI-80101 Joensuu, FINLAND +358 13 251 7929 [email protected] ABSTRACT The aim in the ViSCoS Mobile is to allow more flexible ways to We have been running for several years an online study program study the first year computer science studies. Mobile technologies called ViSCoS where one can study the first year Computer have also potential to introduce totally new ways to support Science studies through the Internet. Currently, we are working learning, such as deepening the community awareness among the on transferring the program into mobile devices. The project for students. The purpose of this paper is not to evaluate or test expanding the current PC-based solution towards mobile devices already existing solutions, but to elaborate and discuss the has been given a name ViSCoS Mobile. We discuss the possibilities and challenges of the ViSCoS Mobile concept. We possibilities and problems related to the design of the ViSCoS also analyze the first potential solutions for implementing the Mobile and propose preliminary solutions to them. Three ViSCoS Mobile. development “threads” for the ViSCoS Mobile are identified: adaptation of the course materials and digital learning tools used 2. VISCOS ONLINE STUDY PROGRAM in the ViSCoS studies, a possibility to program with a mobile In ViSCoS (Virtual Studies of Computer Science) online learning device while learning Java programming and enhancement of program, students study the first year university-level computer student support via mobile learning solutions, such as science courses via the web.
    [Show full text]
  • Qooxdoo Documentation Release 5.0.2
    qooxdoo Documentation Release 5.0.2 qooxdoo developers January 12, 2017 Copyright ©2011–2017 1&1 Internet AG CONTENTS 1 Introduction 1 1.1 About...................................................1 1.1.1 Framework...........................................1 1.1.2 GUI Toolkit...........................................1 1.1.3 Communication.........................................1 1.1.4 More Information (online)...................................2 1.2 Feature Overview............................................2 1.2.1 Runtimes............................................2 1.2.2 Object-orientation........................................2 1.2.3 Programming..........................................2 1.2.4 Internationalization.......................................3 1.2.5 API reference..........................................3 1.2.6 Testing.............................................3 1.2.7 Deployment...........................................3 1.2.8 Migration............................................4 1.3 Architectural Overview.........................................4 1.4 Getting Started..............................................4 1.4.1 qx.Website...........................................4 1.4.2 qx.Desktop...........................................6 1.4.3 qx.Mobile............................................7 1.4.4 qx.Server............................................8 1.4.5 Others.............................................. 10 2 Core 11 2.1 Object Orientation............................................ 11 2.1.1 Introduction to Object
    [Show full text]