DirectoryRank: Ordering Pages in Web Directories

Vlassis Krikos, Computer Engineering Dept., Patras University, Greece
Sofia Stamou, Computer Engineering Dept., Patras University, Greece
Pavlos Kokosis, Computer Engineering Dept., Patras University, Greece
Alexandros Ntoulas, Computer Science Department, UCLA, USA
Dimitris Christodoulakis, Computer Engineering Department, Patras University, Greece

ABSTRACT
Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages within a given topic according to how informative they are about the topic. Our method works in three steps: first, it processes Web pages within a topic in order to extract structures that are called lexical chains, which are then used for measuring how informative a page is for a particular topic. Then, it measures the relative semantic similarity of the pages within a topic. Finally, the two metrics are combined for ranking all the pages within a topic before presenting them to the users.

lists the pages within a category alphabetically, while the Google Directory [1] orders the pages within a category according to their PageRank [11] value on the Web. While these rankings can work well in some cases, they do not directly capture the closeness of the pages to the topic that they belong to.

In this paper, we present DirectoryRank, a new ranking framework that we have developed in order to alleviate the problem of ranking the pages within a topic based on how "informative" these pages are to the topic. DirectoryRank is based on the intuition that the quality (or informativeness) of a Web page with respect to a particular topic is determined by the amount of information that the