Creating and Using Semantics for Information Retrieval and Filtering
Creating and Using Semantics for Information Retrieval and Filtering State of the Art and Future Research Several experiments have been carried out in the last 15 years investigating the use of various resources and techniques (e.g., thesauri, synonyms, word sense disambiguation, etc.) to help refine or enhance queries. However, the conclusions drawn on the basis of these experiments vary widely. Results of some studies have led to the conclusion that semantic information serves no purpose and even degrades results, while others have concluded that the use of semantic information drawn from external resources significantly increases the performance of retrieval software. At this point, several question arise: • Why do these conclusions vary so widely? • Is the divergence a result of differences in methodology? • Is the divergence a result of a difference in resources? What are the most suitable resources? • Do results using manually constructed resources differ in significant ways from results using automatically extracted information? • From corpus building to terminology structuring, to which methodological requirements resources acquisition has to comply with in order to be relevant to a given application? • What is the contribution of specialized resources? • Are present frameworks for evaluation (e.g., TREC) appropriate • for evaluation of results?. These questions are fundamental not only to research in document retrieval, but also for information searching, question answering, filtering, etc. Their importance is even more acute for multilingual applications, where, for instance, the question of whether to disambiguate before translating is fundamental. Moreover, the increasing diversity of monolingual as well as multilingual documents on the Web invite to focus attention on lexical variability in connection with textual genre and with questioning the resources reusability stance.
[Show full text]