General Chairman: Veljko Milutinovic, School of Electrical Engineering, University of Belgrade

Deputy General Chairman: Frédéric Patricelli, Telecom Italia Learning Services (Head of International Education)

Conference organization staff:

Conference Managers:

Miodrag Stefanovic Cesira Verticchio

Conference Staff

Renato Ciampa Veronica Ferrucci Maria Rosaria Fiori Maria Grazia Guidone Natasa Kukulj Bratislav Milic Zaharije Radivojevic Milan Savic

SSGRR-2002s - Papers

1. .NET All New?

Jürgen Sellentin, Jochen Rütschlin

2. A center for Knowledge Factory Network Services (KoFNet) as a support to e-business

Giuseppe Visaggio, Piernicola Fiore

3. A concept-oriented math teaching and diagnosis system

Wei-Chang Shann, Peng-Chang Chen

4. A contradiction-free proof procedure with visualization for extended logic programs

Susumu Yamasaki, Mariko Sasakura

5. A Framework For Developing Emerging Information Technologies Strategic Plan

Amran Rasli

6. A Generic Approach to the Design of Linear Output Feedback Controllers

Yazdan Bavafa-Toosi, Ali Khaki-Sedigh

7. A Knowledge Management Framework for Integrated design

Niek du Preez, Bernard Katz

8. A Method Component Programming Tool with Object Databases

Masayoshi Aritsugi, Hidehisa Takamizawa, Yusuke Yoshida and Yoshinari Kanamori

9. A Model for Business Process Supporting Web Applications

Niko Kleiner, Joachim Herbst

10. A Natural Language Processor for Querying Cindi

Niculae Stratica, Leila Kosseim, Bipin C. Desai

11. A New Approach to the Construction of Parallel File Systems for Clusters

Felix Garcia, Alejandro Calderón, Jesús Carretero, Javier Fernández, Jose M. Perez

12. A New model of On-Line Learning

file:///F¦/papers.html (1/15)2004/03/22 13:16:33 SSGRR-2002s - Papers

Marjan Gusev, Ljupco N. Antovski, Vangel V. Ajanovski

13. A New Paradigm for Network Management: Business Driven Device Management

John Strassner

14. A Prototype of a Retail Internet Banking for Thai Customers

Rawin Raviwongse, Pornpriya Koedrabruen

15. A Reuse-Oriented Approach for the Construction of Hypermedia Applications

Naoufel Kraiem

16. A Scientific Paradigm On Image Processing's Lecture

Sar Sardy

17. A Theory of Programming for e-Science and Software Engineering

Juris Reinfelds

18. A video based laboratory on the Internet, and the experiences obtained with high-school teachers

Fernando Gamboa Rodríguez, J.L. Pérez Silva, F. Lara Rosano, A. Miranda Vitela, F. Cabiedes Contreras

19. Web Engineering: Methods and Tools for Education

George E. Cormack, G. Griffiths, B. D. Hebbron, M. A. Lockyer, B. J. Oates

20. Adding Security to Quality of Service Architectures

Stefan Lindskog, Erland Jonsson

21. Advanced Mobile Multipoint Real-Time Military Conferencing System (AMMCS)

R. Sureswaran, A. Osman, M. S. Mushardin, M. Yusof, B. Husain

22. Advanced Optical Infrastructure for the Emerging Optical Internet Services

Marian Marciniak, Marian Kowalewski, Miroslaw Klinkowski

23. Agent-based Intelligent Clinical Information System

Il Kon Kim, Ji Hyun Yun, Sang Wook Lee, Hang Chan Kim

24. An approach for implementing Object Persistence in C++ using Broker

Kulathu Sarma

25. Digital Learning: Infrastructure and Web Culture

Alexei L. Semenov

26. An Efficient and Adaptive Method for Reservation of Multiple Multicast Trees

Shiniji Inoue, Makoto Amamiya, Kasuga-shi, Yoshiaki Kakuda

27. Distributed Information Systems Building Techniques

Petr Smolík, Tomáš Hruška

28. An eye-gaze input device for people with severe motor disabilities

Laura Farinetti, Fulvio Corno

29. Asia-Pacific and Australian Grid developments, the coming information Grid and the iSpace

Bernard A. Pailthorpe, Nicole S. Bordes

30. Askemos - a distributed settlement

Jörg F. Wittenberger

31. Automatic Determination of Cluster Size Using Machine Learning Algorithm

Jihoon Yang, Sung-Hae Jun, Kyung-Whan Oh

32. B2E Business to Employee and Business to Everything

Prashant Killipara

33. Business Methods: Patent Practice at the European Patent Office

Jörg Machek

34. CAL-Visual, an E-Education Tool for the Management of Digital Resources

Dino Bouchlaghem

35. Casual databases

Ken Roger Riggs

36. CLINICAL ANTHROPOLOGY - A NEW EDUCATIONAL METHOD FOR ETHICS AND HUMANITY USING INTERNET

Shinichi Shoji

37. Communication Behavior and Collaboration in Virtual Seminars - Experiences

Birgit Feldmann, G. Schlageter

38. DEC (Decision Making in Business Environment) Computer Age of business (Admiration or Admission?)

Nanayaa Owusu Prempeh

39. Symbolic Computation in Research

Qi Zheng

40. Computer-Mediated Communication (CMC): A Shift Towards E-Education Systems In Malaysia

Rozhan M Idrus

41. Computer Simulation Of Combustion in Particulate 2-Phase Flow

Aleksandar Saljnikov, Simeon Oka, Elmira Karbozova, Miroslav Sijercic

42. COMPUTING OUR WAY TO THE ULTIMATE TOY

Kristinn R. Thorisson

43. Cooperative Learning: Multi-user Support for Advanced Distance Learning Services

Andrea B•

44. Coordinating Representations for Collaborative Systems

Richard Alterman, Alex Feinman, Seth Landsman, Josh Introne

45. Coordinatized Graphs: Interplay Between Graphical Properties and Adjacency Systems

Andrew Woldar

46. Cryptographic Schemes in Secure e-Course eXchange (eCX) for e-Course Workflow

Lucas C. K. Hui, Joe C.K. Yau

47. Data Mining from a Web Browser

David J. Haglin, Richard J. Roiger

48. Design and implementation of an adaptive learning management system

Giorgio Casadei, Matteo Magnani

49. E-learning, Metacognition and Visual design

David Kirsh

50. Development of Computer-Based Activities for Peer-Led Team Learning in University-Level General Chemistry

John Goodwin

51. Differentiated Multilayer Resilience in IP over Optical Networks

Achim Autenrieth

52. Distributed Medical Intelligence via Broadband Communication Networks

Constantinos Makropoulos

53. Multimedia based Learning and Working: a Cooperation of University with Industry

Peter Deussen, Hartmut Ehrich, Tim Young Weissschädel, Christian Zorn

54. Document Ontology: A Statistical Approach

Sadanand Srivastava, James Gil de Lamadrid, Chakravarthi S. Velvadapu

55. Does Attentional Load Affect Discourse Management in On-Line Communication?

Claude G. Cech, Sherri L. Condon

56. E-Book, an e-learning tool for Engineering Undergraduates

Eduardo Gomez-Ramirez

57. E-Business Management and Workflow Technologies

Zeljko Djuricic, Natasa Ilic, Zeljko Djuricic, Veljko Milutinovic

58. Economic Decision-making in a Technological Age

James R. Forcier

59. Complexity and the Emergent Web

Sorin Solomon, Eran Shir

60. E-Diagnosis Using GeneChip Technologies

Zhao Lue-Ping, S. Gilbert, C. Defty

61. e-DOCPROS: An e-Business Document Processing System

Zhenfu Cheng, Xuhong Li

62. Effects of Changing the Pedagogical Concept of a Part-time Bachelor of Science in Accounting from Traditional Lectures into an IT-supported Asynchronous and Flexible Teaching & Learning Concept

Lars Kiertzner, Maya Dole, Tage Rasmussen

63. e-Infrastructure in a complex environment

Julian Smith

64. E-learning at ENSAIT: a case study

Pierre Douillet, S. Pessé, A. M. Jolly

65. E-Learning Content Creation with MPEG-4

Michael Stepping

66. E-Learning of Spanish with Interactive Video and Blackboard Technologies for Elementary School Children

Julia Coll

67. EMERGENCY! Medicine and Modern Education Technology

Dag K.J. E. von Lubitz, Benjamin Carrasco, Francesco Gabbrielli, Frederic Patricelli, Tymoty Pletcher, Caleb Poirier, Simon Richir

68. e-Medicine Utilization: Socio-cultural issues

Robert Doktor, David Bangert

69. Emerging market mechanisms in Business-to-Business E Commerce A framework

B. Mahadevan

70. Enhanced Security Watermarking and Authentication based on Watermark Semantics

Dimitrios Koukopoulos, Y. C. Stamatiou

71. Environment for Teaching Support in the Medical Area

Rosa Maria Vicari, Cecilia Dias Flores, Louise Seixas, André Silvestre

72. Epidemic Communication Mechanisms in Distributed Computing

Oznur Ozkasap

73. Evaluating Java Applets for Teaching on the Internet

Michael R. Healy, Dale E. Berger, Victoria L. Romero, Amanda Saw

74. Evaluating network intrusion detection algorithm performance as attack complexity increases

Dirk Ourston, Bryan Hopkins, Sara Matzner, William Stump

75. Evaluating the Quality of Service for a Satellite Based

Helmut Hlavacs, Guido Aschenbrenner, Ewald Hotop, Aadarsh Baijal, Ashish Garg

76. Evaluation and perspectives of innovative Tunisian e-learning experimentation

Mohamed Jemni, Henda Chorfi

77. Evaluation of Minimal Deterministic Routing in Irregular Networks

Tor Skeie, Ingebjørg Theiss, Olav Lysne

78. Evolution and Convergence in Telecommunications

Gennady G. Yanovsky

79. Extending SOAP for handling lightweight transactional information

Mario Jeckle

80. Extending the Personal Response System (PRS) to Further Enhance Student Learning

Joan Wines, Julius Bianchi

81. Federated Profile Information Architecture

Guoping Jia

82. Fusion of Multiple Images with Robust Random Field Models

Kie B. Eom

83. e-Business MUSIC: New Ways to Perform Introspection Within the Corporation

Enrique Espinosa

84. Generating color palettes for compressed video sequences

Yuk-Hee Chan, Wan-Fung Cheung

85. Genetic Algorithms for Internet Search: Examining the Sensitivity of Internet Search by Varying the Relevant Components of Genetic Algorithm

Vesna Šešum, Dragana Cvetkovic

86. GroupIntelligence: Automated Support for Capitalising On Group Knowledge

Jules de Waart, Michiel van Genuchten

87. From Innovators to Laggards: Computer Scientists and E-learning

Roger Boyle, Martyn Clark

88. How to find similar web sites by using only link information

Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, Osamu Akashi, Toshiharu Sugawara

89. Hardware RAID - 5 versus Non-RAID solution under UNIX Operating System

Borislav Djordjevic, Stanislav Miskovic, Nemanja Jovanovic, Veljko Milutinovic

90. Identity Management: a Key e-Business Enabler

Marco Casassa Mont, Pete Bramhall, Mickey Gittler, Joe Pato, Owen Rees

91. Impacts of the Global Information Society on the Banking Industry

Ondrej Slapak

92. Implementation of a remote-assistant application via Web over IP networks: CIMA Project

Francisco Sandoval, Francisco Javier González Cañete, Francisco Miguel García Palomo, Eduardo Casilari Pérez

93. Implementation of Feedback TM an Application for Quality Assurance, Learning and e-Communication of Diagnosis of Medical Images

M. Bergquist, H. Gater, O. Flodmark, J. Hedin, S. Hedin, M. Hellström, B. Jacobson, B. Johansson, N. Lundberg, K. Måre, J. Wallberg, P. Wenngren

94. The social infrastructure of E-Education

Peter Lyman

95. Information Publishing on FRIENDS

Alfons H. Salden, Ronald J. van Eijk, Mortaza S. Bargh, Johan de Heer

96. Infrastructure For E-Business, E-Education, E-Science, and E-Medicine; Challenges For Developing economies. The Nigerian Experience

Babatunde O.R. Ogundele

97. Observations and prescriptions for Web Standards

David Bodoff, Mordechai Ben-Mehachem

98. Infrastructure in Education time to learn lessons from elsewhere?

Tony Shaw

99. Infrastructure, requirements and applications for eScience: a European perspective

Ron Perrott

100. Infrastructures for Mobile Services in e-Medicine

Heinz Thielmann

101. SEMPEL: A Software Engineering Milieu for PEer-Learning

V.Lakshmi Narasimhan

102. Integrating Emerging E-Technologies into Traditional Classroom Settings

Jay M. Lightfoot

103. Integrating the Teaching of Psychology with Web-Based Distant Learning: Practicum and Internship

V. Wayne Leaver

104. Interconnecting Networks and the Performance of Multithreaded Multiprocessors

Wlodek Zuberek

105. INTERNET PRIVACY CONCERNS AND TRADE-OFF FACTORS EMPIRICAL STUDY AND BUSINESS IMPLICATIONS

Tamara Dinev, Paul Hart

106. Internet: A Powerful Tool in Disseminating Medical Knowledge in Urban and Rural India

Deena Suresh, Dr. CB Sridhar

107. Introduction of Information Infrastructure for Medical Academic Activities in Japan - UMIN and MINCS-UH

Takahiro Kiuchi

108. Learning and Networking Concepts and Components of the Global Seminar

Dean Sutphin

109. Learning Objects -Pedagogy Based Structuring of Course Materials

Paul Juell, Elizabeth Smith, Lisa Daniels, Vijayakumar Shanmugasundaram

110. Living Book: an Interactive and Personalized Book

Margret Gross-Hardt, Peter Baumgartner, Anna B. Simon

111. Local telematics services for higher education

Joze Rugelj

112. Maestro: A Middleware for Distributed applications based in components software

Jorge Risco Becerra

113. Magenta Multi-Agent Engines For Decision-Making Support

Peter Skobelev, V. Andrejev, S. Batishchev, K. Ivkushkin, I. Minakov, G. Rzevski, A. Safronov

114. Mapping Object Oriented Models into Relational Models: a formal approach

Pedro Ramos, Luís Rio

115. Marketing and Engineering Criteria for the Implementation of a top level Internet Infrastructure

Enrique S. Draier

116. Marrying Sanskrit to Java - an e-tutor for Sanskrit

Sudhir Kaicker, Jayant Shekhar

117. m-commerce: why it does not fly (yet?)

Peter Langendoerfer

118. Measurement Technique for Object Oriented Systems

Sallie Henry, Cary Long

119. Measuring the Effectiveness of Internal Electronic Communication Channels in Achieving Business Goals

Angela Sinickas

120. Medical eLearning, eTraining and Interactive Telemedicine via Satellite in the operating room of the future

G. Graschew, T. A. Roelofs, S. Rakowsky, P. M. Schlag

121. Meta-Learning Functionality in eLearning Systems

Ulrik Schroeder

122. Mobile Commerce: Some Extensions of Core Concepts and Key Issues

Christer Carlsson, Pirkko Walden

123. Models for E-Learning environment evaluation: a proposal

Francesco Colace

124. Multidrop Generic Framing Procedure (GFP-MD)

Kari Seppänen

125. Multi-grid Parallel Algorithm with Virtual Boundary Forecast for Solving 2D Transient Equation

Guo Qingping, Yakup Paker, Dennis Parkinson, Wei Jialin, Zhang Sheng

126. Multilingual Multimedia Electronic Dictionary for Children

Valentin E. Brimkov, Reneta P. Barneva, Peter L. Stanchev

127. Mutual indexing of video and bulletin board for lecture video

Hirohide Haga

128. Navigation Support System for Live e-CRM

Hideto Ikeda, Nikolaos Vogiatzis, Aki Shibuya

129. New media - traditional universities: Success factors and obstacles for e-learning technologies

Georg J. Anker, Yuka Sasaki

130. Object Oriented Communication Design Tool usable for Everyone

Hajime Nonogaki

131. On the Viability on E-learning

Sunil Choenni

132. Optimal Link Allocation and Charging Model

Jyrki Joutsensalo, Timo Hamalainen

133. Oral Metaphor Construct. New direction in cognitive linguistics

Asa Stepak

134. Our Progress Into E-Business Education: How we have Incorporated Higher-Order Thinking Skills into our Web-based class

Rexford H. Draman, Robin Eanes

135. Overview of the role of ATM/AAL2 Aggregator in UMTS Access Network

Aleksandar D. Petrovic

136. Parallel Solutions of Coupled Problems

Felicja Okulicka Dluzewska

137. Practical Traffic Grooming Formulation for SONET/WDM Rings

Paul Ghobril

138. Privacy Issues Arising from a Smart_ID Application in eHealth

John Fulcher

139. The Brave New World of the Cyber Speech and Hearing Clinic: Treatment Possibilities

William R. Culbertson, Dennis C. Tanner

140. GIS and DGPS via Web: the GIS on line of the Everest National Park

Giorgio Vassena, Roberto Cantoni, Carlo Lanzi, Giuseppe Stefini

141. Holarchies on The Internet: Enabling Global Collaboration

Mihaela Ulieru

142. Purdue Center for Technology Roadmapping: A Resource for Research and Education in Technology Roadmapping

Edward J. Coyle

143. QUALITY MANAGEMENT SYSTEM BASED ON THE NATIONAL TRAUMA REGISTRY

Drago Brilej, Radko Komadina

144. Rationales for Consumer Adoption or Rejection of E-Commerce: Exploring the Impact of Product Characteristics

Bill Anckar

145. Some remarks on time modelling in interactive computing systems

Merik Meriste, Leo Motus

146. Schema Validation Applied to Native XML Databases

Gongzhu Hu, Qinglan Li

147. Search and Discover on the Web

Bipin C. Desai

148. Server Load Balancing in the Next Generation Internet

Jamalul-lail Abdul Manan, Habibah Hashim

149. Service Oriented Community Systems for Mobile Commerce

Kinji Mori

150. SIP and the Internet

Gianni Scandroglio

151. Socrates Meets The Web: Incorporating the Internet Into U.S. Law Classes

Anna Williams Shavers

152. Software Issues for Applying Conversation Theory For Effective Collaboration Via the Internet

William Klemm

153. Solving scaling problems with the modern GUI

Peter M. Bagnall

154. Some like it soft

Ole Lauridsen

155. A Monitoring System for Manufacturing Machines Based on SNMP

Sangyong Lee, Joongsoon Jang, Gihyun Jung, Kyunghee Choi

156. Enhancing IDS performance through dropping hacking-free packets

Jongwook Moon, Jongsu Kim, Gihyun Jung, Kangbin Yim, Kyunghee Choi, Haiyoung Yoo

157. Analysis on Utilization and Delay of Memory in a Lossless Packet Processing System

Jongsu Kim, Jongwook Moon, Gihyun Jung, Kangbin Yim, Kyunghee Choi, Joongsoon Jang

158. Protecting Mail Server using the CBT algorithm

Hyun-Suk Lee, Soo-Juong Lee, Hui-Sug Jung, Gihyun Jung, Kyunghee Choi

159. Storage Technologies for an Efficient e-Infrastructure

Satish Rege

160. Structured Metadata Analysis

Steve Probets

161. Superscalar in City-1: An Educational Guide to the next step beyond Pipelining

Ryuichi Takahashi, Noriyoshi Yoshida

162. Supervision of Electrical Utility Works Based on Internet

Felipe Alaniz, Pablo R. de Buen

163. Teaching Novices Programming Skills Efficiently: What, When and How?

Yuh-Huei Shyu

164. Teaching, Technology and Teamwork

Elaine Carbone, Shaun Stemmler, Jon Beal

165. Software solutions for Science e-Education: A case study from the VISIT Project

Yichun Xie

166. Technologies for Student-Generated Work in a Peer-Led, Peer-Review Instructional Environment

Brian P. Coppola, Ian C. Stewart

167. TEN WAYS TO IMPACT THE WEB WITHOUT A WEB MEISTER

Ken McNaughton

168. The Architecture of Knowledge: Representation and Theorization of Violence on the Internet

Lily Alexander

169. The emergence of web-mediated genres: the home page

Anne Ellerup Nielsen

170. The Emerging Autosophy Internet

Klaus Holtz, Eric Holtz

171. The Future of Education

Lalita Rajasingham

172. The impact of internet technologies on the financial markets

Ross A. Lumley

173. The Mathematical Structure model of a Word-unit-based Program

Hamid Fujita, Osamu Arai

174. The Role of XML in E-Business

Betty Harvey

175. Think before you click: customers' challenges in the e-commerce

Zita Zoltay Paprika

176. Topological Design of Multiple VPNs over MPLS Network

Anotai Srikitja, David Tipper

177. Toward logical-probabilistic modeling of complex systems

Taisuke Sato, Yoichi Motomura

178. Towards to e-transport

Miroslav Svítek, Mirko Novák

179. Emergence and Evolution of Microturbine Generators [MTGs] to Provide Infrastructure for E-Related Applications

Stephanie L. Hamilton

180. Using Building Blocks to Implement a Business-to-Supplier Portal

Shannon Fowler

181. Using CORBA Interceptors to Implement a Security Wrapper

Luigi Romano, D. Cotroneo, A. Mazzeo, S. Russo

182. Using Internet and Database Technology to Enable Collaboration between Researchers and Teachers Developing Educational Featuring Endangered Species Research and Conservation

Mary A. Overby, Mark MacAllister, Jeffrey Hoffman, Chris Bulla

183. Using the Quick Look Methodology to Plan and Implement Complex Information Technology Transformations

Richard C. Staats

184. Verifying and Leveraging Software Frameworks

Trent Larson

185. Virtual Communities for Service Delivery: Transferring the Notion of Pro-Social Behavior from "Place" to "Space"

Ko de Ruyter, Caroline Wiertz, Sandra Streukens

186. Visualizing Molecules Helps Students 'See' Chemistry in a New Light

Harry Ungar, Albion Baucom

187. Wavelet-based Blind Watermark Embedding Technique

Sanghyun Joo, Yongseok Seo, Youngho Suh

188. Web-based Tools for Supporting Health Education

William B. Hansen

189. What can you do with a frozen leg of lamb? - Connecting products and information services in a web-based environment

Benkt Wangler, Ingi Jonasson, Eva Söderström

190. What is Virtual about the Web?

Murat Karamuftuoglu

191. Wireless Control of the Virtual Kiosk

Charles A. Milligan, Steven H. McCown

192. World-wide interaction with 3D-data

Gerd Kaupp, Svetlana Stepanenko, Andreas Herrmann

193. XML Technologies, Value-based Marketing, Franchising and the New Paradigm in Business

Dino Karabeg

194. Electronic Public Transmission Act of 2002 to cope with the Convergence and as the Minimum Regulations on the Internet

Koichiro Hayashi

195. Domain-Specific Language Agents

Merik Meriste, Jüri Helekivi, Tõnis Kelder, Leo Motus

196. Development of Distributed Package of Finite Element Method

F. Okulicka-Dluzewska, J.M. Dluzewski

197. Feasibility Study and Strategic Business Analysis of Fuel Ethanol Production in Indiana (May 2002)

Dusan V. Milutinovic

198. Virtual Marketplace on the Internet (May 2002)

Zaharije R. Radivojevic, Živoslav Adamovic, Veljko M. Milutinovic

199. Dissemination Of World Health Organisation Reproductive Health Library (WHO-RHL) Information To Doctors In India RECON HEALTHCARE, BANGALORE MODEL

CB Sridhar, Deena Suresh

200. Technological High School Education through Internet

L. Pérez Silva, F. Cabiedes Contreras, F. Gamboa Rodríguez, F. Lara Rosano, A. Viniegra Hernández

201. Agent-based brokerage of personalised B2B mobile services

Alfons H. Salden, Ronald J. van Eijk, Mortaza S. Bargh, Johan de Heer

202. Denial of Service Attacks: methods, tools, defenses

Fred Darnell, Bratislav Milic, Milan Savic, Veljko Milutinovic

203. Scalability and Knowledge Reusability in Ontology Modeling

Mustafa Jarrar, Robert Meersman

204. (Web) Self Service

Tanja MILOŠEVIC

How to find similar web sites by using only link information

Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, Osamu Akashi, and Toshiharu Sugawara
NTT Network Innovation Labs., 3-9-11, Midori-cho, Musashino-shi, Tokyo, 180-8585 JAPAN

Abstract—We are studying techniques that allow even ordinary end users to make efficient use of the Internet. We previously proposed an algorithm for determining the degree of similarity between web sites by using link information to find web sites that are mirrors of each other and ones that are not mirrors but have similar content and can be used as substitutes for each other. As a result of verifying the basic effectiveness of that algorithm, we found that when trying to find similar web sites to site-A, in addition to ones found to have almost 100% similarity to site-A, there were also ones that were thoroughly adequate for use as substitutes for site-A, even though they had a low degree of similarity of 50% or less. Therefore, for practical use of that algorithm, it is essential to be able to automatically judge whether web sites that can be inferred to have some kind of similarity are actually mirror sites or similar sites that can be used as substitutes. To solve this problem, in this paper, we propose and evaluate the basic effectiveness of an automatic judgment methodology, and we focus on its operation and propose a methodology for effectively finding candidates for a similar site by using a user's Internet access history.

Index Terms—Internet, Mirror site, Access history, Link information

I. INTRODUCTION

Due to the rapid expansion of the Internet, it has become possible for ordinary end users to obtain many kinds of information easily. However, it is still difficult for them to use the network effectively. For example, although mirror servers and servers have been provided in order to improve scalability and response times, it is difficult for users to identify the optimal server.

To solve this problem we have already proposed a "URL Resolver" framework, which allows users to select the optimal server from multiple servers that provide various kinds of services via data storage facilities such as caches or mirror servers [1]. To enable users to select one of the servers, it is first necessary to gather information such as a list of servers that might be useful to the user. Initially, we focused on information related to mirror sites or similar sites, and have already proposed a basic algorithm for detecting similar web sites by focusing on the link information embedded in web pages [2]. As a result of verifying the basic effectiveness of that algorithm, we then found that there are some sites that are thoroughly adequate for use as substitutes yet have a degree of similarity of no more than 50%. But to use this detection method in practice, it is necessary to employ a mechanism for automatically judging whether sites for which a low degree of similarity has been detected are mirrors or similar sites that can actually be used instead of mirrors. So, in this paper, we propose an automatic determination algorithm, in which web pages are divided into hub-type and content-type and the judgment is done based on the results of judgment algorithms specifically tailored to each type of web page. Initial trials of this approach have yielded favorable detection results. We also examine the operation of this detection methodology and propose an algorithm for effectively finding candidates for similar web sites by using the user's access history to the Internet.

Section 2 reviews our previously proposed algorithm for detecting similar web sites based on link information and describes our new automatic similar web site detection method. Section 3 discusses a similar-web-site candidate finding method.

II. USING LINK INFORMATION TO FIND SIMILAR WEB SITES

A. HOW TO FIND A SIMILAR WEB SITE?

We define a mirror site as follows:

If the link structure of site-A is very similar to that of site-B, then site-A and site-B are mirrors of each other.

This definition is based on the observation that mirror sites, or sites that hold information so similar that they closely resemble mirror sites, should more or less match in terms of the number and types of links embedded within them, even if there are slight differences such as inserted advertising banners (the definition follows [3]).

The detection method we proposed in [2] is as follows. Take a starting site, site-A, and a mirror candidate, site-B. The degree of similarity between these two sites is defined as follows. The total number of embedded inward links that can be gathered when tracing the links of web pages to a depth of N levels from the top web page of site-A is referred to as $\mathrm{url}(A_{in})^N$, and the total number of embedded outward links is referred to as $\mathrm{url}(A_{out})^N$.
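The inward/outward split behind these two counts depends only on the host part of each link's destination URL: a link whose host equals the host of the current page is inward, and any other link is outward. A minimal sketch of that split, assuming the links have already been extracted as absolute URLs; the function name `classify_links` and the example URLs are illustrative and do not come from the paper:

```python
from urllib.parse import urlsplit


def classify_links(page_url, links):
    """Split the links found on page_url into (inward, outward) lists by host."""
    host = urlsplit(page_url).hostname
    inward, outward = [], []
    for link in links:
        # Same host as the current page -> inward; any other host -> outward.
        if urlsplit(link).hostname == host:
            inward.append(link)
        else:
            outward.append(link)
    return inward, outward


# Hypothetical page with one link to its own host and one to another host.
inward, outward = classify_links(
    "http://site-a.example/index.html",
    ["http://site-a.example/news.html", "http://other.example/"],
)
```

Repeating this for every page reached within N levels of the top page, and counting the collected links, would give the two totals for a site. How the crawl itself is performed is not specified here.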
An inward link is one in which the host part of the link's destination URL is the same as that of the current host, and an outward link is one in which the host part of the link destination is different. The corresponding properties of site-B are expressed as $\mathrm{url}(B_{in})^N$ and $\mathrm{url}(B_{out})^N$. The total number of inward links in $\mathrm{url}(B_{in})^N$ that are also included in $\mathrm{url}(A_{in})^N$ is expressed as $\mathrm{url}(A_{in} \cap B_{in})^N$, while the corresponding property of the outward links is expressed as $\mathrm{url}(A_{out} \cap B_{out})^N$. When determining the value of $\mathrm{url}(A_{in} \cap B_{in})^N$, the comparisons are made after replacing the host parts of site-A and site-B with the same arbitrary text string. The degree of similarity between site-A and site-B when links are followed to a depth of N levels is denoted by $\alpha$, which is given by

$$\alpha = \frac{\mathrm{url}(A_{in} \cap B_{in})^N + \mathrm{url}(A_{out} \cap B_{out})^N}{\mathrm{url}(A_{in})^N + \mathrm{url}(A_{out})^N} \times 100\ (\%).$$

Since this procedure compares only the link structures, it does not perform a text-level comparison of every character in every word on the web pages. The reasons for adopting this approach are as follows:

(1) As mentioned above, we think it is possible to judge the similarity of hypertext documents such as web pages by comparing only their link structures. The practicality of focusing on the link structure in web pages is also highlighted in other studies [4] and [5].

(2) In this procedure, although a lot of processing time is taken up by gathering web pages, the amount of text to be compared also increases substantially when links are followed to a depth of several levels, and so much more time is required for comparing text than for comparing just the links. Besides, the gathering work can be sped up by increasing the network bandwidth, so we decided not to compare the processing times required for a text-level comparison.

(3) We are planning to use this similar-web-site finding method even in environments with limited processing resources, such as users' notebook PCs. Therefore, considering the storage of information obtained when calculating the degree of similarity, performing a text-level comparison would require all the text information to be stored, which would be a waste of resources. The link information takes up considerably less space than the text information, which is another reason why we decided not to perform a text-level comparison.

Next, we discuss how to find candidates for mirror sites. Using a web robot to recursively access sites indiscriminately and compare them with site-A to find a candidate for site-B would be far too inefficient, so instead we adopted the following strategy:

Since web sites that employ mirror servers do so with the aim of dispersing the load, they will probably want to provide users accessing the site with information about these mirrors. In other words, it is reasonable to assume that the site will make some mention of where its mirrors can be found. Accordingly, it is highly likely that site-B can be found by gathering and analyzing the content accessible within a certain number of link levels from the top page of site-A. It is also highly likely that the web site will contain links to web sites of a similar nature, so there should be a high likelihood of being able to find similar sites by checking the link structure.

For the actual trials, we extracted 1000 entries from the access log stored in a proxy server used by our organization of about 200 people, and applied our similar-site detection program to each one (see $\alpha$ in Fig. 1). Similar site candidates were detected for 65% of these. It is interesting to note that many sites can be used as substitutes even though their degrees of similarity were less than 50% (see Table 1). So, if a way can be found to automatically judge whether or not they are actually capable of being used as substitutes, then it should be possible to present a greater number of sites to the users in addition to the sites having a high degree of similarity, for which judgment is unnecessary.

Fig. 1: Detection results

Degree of similarity    Strength of relationship between two sites
0%–10%                  Probably unrelated
10%–60%                 May include some similar sites
60%–90%                 Either a mirror site or a site that is highly similar
90%–100%                Almost certainly a mirror site

Table 1: Results of classifying detected sites

B. SITES CAPABLE OF BEING USED AS SUBSTITUTES

Thus, we propose the following detection method. First, we divide web pages into the following two broad categories according to the style of user access.

• Web pages that are accessed as a starting point for net surfing are called hub-type sites.
• Web pages that are accessed in order to view the content on the page itself are called content-type sites.

Then, by considering the conditions under which sites can be regarded as substitutes for hub-type and content-type sites, respectively, we propose the following degree-of-similarity calculation methods.

1) METHODOLOGY FOR JUDGING A HUB-TYPE SITE

For example, consider hub-type sites site-A and site-B. If site-A has many embedded external links that are the same as those in site-B, then it is highly likely that the user will be able to use both sites equally.

That is, site-A and the possible substitute candidate site-B are deemed to have a greater degree of similarity with respect to their outward links if they satisfy the condition

$$\alpha < \beta, \qquad (1)$$

where $\alpha$ is as defined in Section II.A and

$$\beta = \frac{\mathrm{url}(A_{out} \cap B_{out})^N}{\mathrm{url}(A_{out})^N} \times 100\ (\%)$$

is the degree of similarity related to outward links only, from which it can be inferred that site-B is highly likely to be suitable for use as a substitute hub-type site.

[…] as the way of embedding links, even if a site is capable of being used as a substitute. To deal with this, in addition to the links, we also use the label strings of links as important elements expressing the attributes of the links, and add them to the calculation of the degree of similarity as follows: we use the labels corresponding to the text enclosed within the links; e.g., the text string "XXXXXX" in the link "XXXXXX", and the text string "YYYYY" in the link "YYYYY". We decided to rate these links by scoring them according to the length of the text strings "XXXXXX" and "YYYYY" embedded in their labels when a match is found between a pair of labels. Note that the calculation is performed using only the text string "XXXXXX" for links where the text strings "XXXXXX" and "YYYYY" match.

In content-type sites like a news site, the headline of the article is usually used as a link label, and the string length of the headline is usually longer than that of a link label whose reference address is another web site.

An example of a link in which the alt option is set is as follows:

    <img src="http://a772.g.ak···/2.gif" width="84" height="42" alt="The Apple Store." border="0">

In the algorithm in [2], the degree of similarity was calculated using only the text string of the URL part "http://www.apple.com/store", but here the link label "The Apple Store." is also included in the calculation. The labels are scored according to the following rules:

(1) When the URL parts and label parts both match, the link is awarded a score corresponding to the number of characters in the label.
(2) When the URL parts match but the label parts are different, the link is awarded a score of 70% of the number of characters in the starting link label.
(3) When the label parts match but the URL parts are different, the link is awarded a score of 50% of the number of characters in the starting link label.
(4) When both parts are different, the link is awarded no score.

These scoring settings are based on experience, and further study is required to investigate their validity. Also, for rule (3), since we focus on the similarity of the link structure, it could conceivably be wrong to consider cases where the URL parts are different.
However, in the case of content-type sites, since we concentrate on the label parts, 2) METHODOLOGY FOR JUDGING A CONTENT- we decided to include cases where the URL parts are TYPE SITE different in the detection by reducing the score awarded. In the above example of “The Apple Store.”, if a link is On the other hand, in a content-type site we think that detected whose URL parts and label parts are identical there may be some differences in the page structure such when matching the link with the mirror candidate site, 3 then this link is awarded 16 points (the number of the most suitable URL is selected on the basis of characters in “The Apple Store.”). In calculating the information from the mirror information managing agent. number of characters in a label, all single-byte and On the other hand, there may be a small amount of time double-byte characters (including English and Japanese available up until the user clicks an anchor within that characters, spaces, and so on) are each counted as one URL, and if such is the case, it might be possible to select character. the most suitable URL according to new transfer rate information. If this can be accomplished, access will be The value of url (A )N for the starting site is given by label in forcibly changed to the most suitable URL when the user the sum of the number of characters in the labels added to clicks the anchor. each link in url(A )N, and is awarded the maximum in possible score when matching is performed with an N C. INITIAL TRIALS identical mirror site. The value of urllabel (Bin) for a mirror candidate site-B is defined in the same way. Furthermore, As mentioned in Section 2.1, Fig. 1 shows the results of N the value of urllabel (Ain ∩ Bin) is given by the sum of the calculating β and γ for 1000 URLs. 
About 18% of the scores awarded for matching combinations of the URLs were classified as hub-type sites with a degree of abovementioned URL parts and label parts in each similarity of 30% or more, and about 6% of them were respective label. If the degree of similarity between site-A classified as content-type sites. Both these include URLs and -B in terms of inward links is given by that were detected as hub-type and content-type pairs respectively. We then manually checked 50 sites for

which either β or γ was 30% or more from among the ∩ N urllabel (Ain Bin) sites classified as hub- or content-type sites, and found γ = N x 100(%) , then if urllabel (Ain) that all of them were indeed suitable for use as substitutes for these hub- and content-type sites. In future we plan to α <γ , (2) perform verifications with a greater number of access logs and to investigate and examine the reliability of the it is judged to be likely that the mirror candidate can be degree of similarity calculations. used as a substitute for a content-type site.

When the number of links used as the denominator of α or β or γ, when judging Equations (1) and (2), is small (in III. USING USER’S ACCESS HISTORY the current version, less than 10), the degree of similarity With the procedure in Section 2.1, we were able to detect is recalculated by following links to a greater depth. similar sites by using an access log as a starting point. Unfortunately, this procedure is unable to detect sites that However, it will be more effective if it is possible to have a mirror relationship but have differences between detect similar sites that are significant for each individual both the link structure and the labels added to the links. end user. Of course, facilities such as proxy servers However, it is doubtful whether many sites of this sort contain access logs that reflect the character of the actually exist. community that uses them, and can themselves be thought At present, the following simple techniques are used to of as candidates for similar web sites that may match the select a URL from the list of URLs that are possibly users’ preferences. However, in the current version, we mirrors: (1) select the site with the highest likelihood and can only find mirror or similar sites that are limited to the (2) select the candidate with the highest transfer rate at the range of sites traced from the starting host. That is, we do time of retrieval in the case of multiple candidates by not evaluate the degree of similarity between different phase (1). In the future, we also plan to make use of the hosts in the access log. This is because it would lead to a transfer rate at the time of user access and feedback data combinatorial explosion and we judge it to be inefficient. from users, etc. 
Here we note that in terms of user access, However, it is clear that it is highly effective to detect the process of indicating the most suitable URL requires similar sites including hosts that cannot be reached from real-time properties unlike the mirror search process. In the detection origin host, not just the hosts recorded in the relation to the above, we are investigating using a access log. Therefore, we propose a method for retrieving technique that can flexibly select optimal strategies for mirror or similar sites by using the users’ access history to selecting a URL at any time through an algorithm that filter similar site candidates. executes multiple strategies in parallel [9]. For example, when a user is the first to access a certain URL, there is no time available that for measuring the transfer rate, and
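The hub-type and content-type degree-of-similarity calculations described above (β with condition (1), and γ with scoring rules (1) to (4) and condition (2)) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the representation of a link as a (URL, label) pair, and the handling of empty link sets are assumptions made for the example, and the link sets are assumed to have already been gathered to the chosen depth.

```python
from typing import List, Set, Tuple

Link = Tuple[str, str]  # (URL part, label part); this representation is an assumption


def beta(a_out: Set[str], b_out: Set[str]) -> float:
    """Hub-type similarity: share of site-A's outward URLs also found in site-B."""
    if not a_out:
        return 0.0  # assumption: an empty link set yields zero similarity
    return 100.0 * len(a_out & b_out) / len(a_out)


def gamma(a_links: List[Link], b_links: List[Link]) -> float:
    """Content-type similarity using label-length scores (rules (1) to (4))."""
    # Denominator: total label length of the starting site, i.e. the score
    # that an identical mirror would obtain.
    total = sum(len(label) for _, label in a_links)
    if total == 0:
        return 0.0
    b_pairs = set(b_links)
    b_urls = {url for url, _ in b_links}
    b_labels = {label for _, label in b_links}
    matched = 0.0
    for url, label in a_links:
        n = len(label)  # each character counts as one, single- or double-byte
        if (url, label) in b_pairs:   # rule (1): URL and label both match
            matched += n
        elif url in b_urls:           # rule (2): URL matches, label differs
            matched += 0.7 * n
        elif label in b_labels:       # rule (3): label matches, URL differs
            matched += 0.5 * n
        # rule (4): nothing matches, no score
    return 100.0 * matched / total


def is_substitute(alpha: float, score: float) -> bool:
    """Conditions (1) and (2): the candidate qualifies when alpha < beta (or gamma)."""
    return alpha < score
```

For the “The Apple Store.” example, a link whose URL and label both match contributes `len("The Apple Store.")`, that is 16 points, to the numerator of γ. The recalculation at greater depth when fewer than 10 links are available is left out of the sketch.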

Fig. 2: Examples of the same content in different web sites.

Figure 2 shows excerpts from web pages related to the same content (new digital camera products) in four web sites. On finding an article about a new product on one web site, many people (certainly the authors do) habitually browse through other related web sites and look for articles related to the same content. This is because the content of the articles changes slightly from one site to the next. In the example shown in Fig. 2, an article related to resolution and the number of pictures that can be taken was only mentioned at www.nikkeibp.co.jp, while an article relating to the manufacturer’s business strategy was only mentioned at www.watch.impress.co.jp, and detailed specifications were only mentioned at www.zdnet.co.jp. Here, when a user browses through some content zzz at a certain site-A, if sites-B, -C, and -D, which are highly likely to contain similar articles related to content zzz (i.e., sites that have a high degree of similarity to site-A), have been detected, then the user will be more likely to obtain a greater amount of information if these sites are recommended to him/her. Next, we discuss how to efficiently extract sites-A, -B, -C, and -D.

1. First, we acquire the following user access history: every time the user clicks on a link, we extract site-R in which this link is embedded, site-T which is the destination site of the link, and label-L which is the text string of the link’s label. Moreover, label-L is subjected to morphological analysis (using widely used general-purpose morphological analysis software, like [7]) to extract several noun parts N1···n (see footnote 1), and {R, T, N1···n} triplets are recorded. In the example shown in Fig. 2, the following lists are recorded:

{biztech.nikkeibp.co.jp, www.kodak.co.jp, “Kodak”}
{www.watch.impress.co.jp/pc, www.kodak.co.jp, “Kodak”}
{www.watch.impress.co.jp/av, www.kodak.co.jp, “Kodak”}
{www.zdnet.co.jp/news, www.kodak.co.jp, “Kodak”}

2. When there are pages carrying the same content in different sites, it is highly likely that the destinations of the outward links embedded within these pages will be the same, so there is a higher possibility that access histories such as {Ra, T, N} and {Rb, T, N}, where only site-R is different, can be extracted to evaluate similarity. In Fig. 2, a link to the manufacturer’s site “Kodak” is embedded in all the sites.

3. Then, by extracting from the access history the sites-R1...n for which the site-T and N terms are the same and only the R terms are different, we can obtain a list of sites where the same content appears, and the degree of similarity between these sites is calculated. Of course, a different site list could be accumulated from all the sites where only the noun parts N are the same, but in practice the content will have a lower likelihood of being related.

Footnote 1: By the morphological analysis, a noun is classified as a proper noun, a general noun, or an unknown word, and we use proper nouns and unknown words to express the character of a link.

This procedure is very general-purpose because it can learn {R, T, N} triplets as soon as the user first accesses the sites, even if they have not yet been registered in the access log.

As for ways to recommend detected sites to the users, several methodologies are considered. When the user has accessed any one of these sites, he/she is recommended to browse other sites, with priority given to those having a higher degree of similarity. Moreover, when the user has accessed site-T which has already been recorded in {R1...n, T, N}, he/she is recommended to browse other sites, with priority given to pages in the individual Rm of R1...n whose content has been updated recently. Another effective measure is to pre-fetch the contents of similar sites related to sites accessed by the user and to display a compiled list of these sites.

If there is a similarity relationship between site-A and site-B, but site-B is the competitor of site-A, it may be difficult to find a similar web site-B from site-A by using the strategy proposed in Section 2.1. But, by using the user’s access history, it may be possible to find a similarity relationship between site-A and site-B.

To verify the basic efficiency of this methodology, we investigated how many {R, T, N}s having the same noun part N and the same URL of the destination site T could be found from two similar sites. Figure 3 shows the results of this investigation. First, we extracted the link information from www.watch.impress.co.jp and www.zdnet.co.jp, which are hub-type sites concerning new products in the computer or office automation fields, and from www.asahi.com and www.yomiuri.co.jp, which are web sites of newspapers. Specifically, we extracted noun parts and their destination sites of the outward links to a depth of 3 levels (N=3). Second, we searched for the following types of {R, T, N}s from this link information:

Case-I: {R, T, N} in which noun part N and destination site T are both the same.

Case-II: {R, T, N} in which only noun part N is the same.

Case-III: {R, T, N} in which only destination site T is the same.

Finally, we checked whether each of the {R, T, N}s found in Case-I did indeed express the character of the site T or not.

Site pair            All      Case-I   Case-II   Case-III   Case-I where N indeed expressed the character of T
watch vs. zdnet      24482    22       23966     494        20
asahi vs. yomiuri    2288     15       2198      76         14
watch vs. asahi      4389     8        4281      100        1
zdnet vs. asahi      15879    1        14789     89         0
watch vs. yomiuri    1869     1        1845      23         0
zdnet vs. yomiuri    6350     3        6289      58         0

(Case-I: both N and T are the same; Case-II: only N is the same; Case-III: only T is the same. watch: www.watch.impress.co.jp, zdnet: www.zdnet.co.jp, asahi: www.asahi.com, yomiuri: www.yomiuri.co.jp)

Fig. 3: Number of extracted {R, T, N}s

The results show that even though many {R, T, N}s were found for Case-II and Case-III from every combination of the sites, in Case-I, {R, T, N}s were mainly extracted only from combinations of the similar sites (“watch vs. zdnet” and “asahi vs. yomiuri”). And although several {R, T, N}s were also extracted from the combination of “watch vs. asahi”, which have no relationship between them, we could find only one {R, T, N} which indeed expressed the character of both sites (see the following partial lists of extracted {T, N}s).

watch vs. zdnet
{http://www.minolta.co.jp/, MINOLTA} (The company name)
{http://www.newtech.co.jp/, NEWTECH} (The company name)
{http://www.melcoinc.co.jp/, MELCO} (The company name)
{http://www.tsutaya.co.jp/, TSUTAYA} (The company name)
{http://www.sony.co.jp/sd/, SONY} (The company name)
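The access-history procedure (steps 1 to 3) and the Case-I/II/III distinction used in the verification can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `sites_sharing_content` and `classify_pair` are hypothetical, triplets are assumed to have already been reduced to one noun part N per record, and the morphological analysis step is not shown.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

Triplet = Tuple[str, str, str]  # (R: referring site, T: destination site, N: noun part)


def sites_sharing_content(history: Iterable[Triplet]) -> Dict[Tuple[str, str], Set[str]]:
    """Group referring sites R by (T, N); keep groups with at least two distinct R.

    Each surviving group is a list of sites where the same content is likely
    to appear (step 3), i.e. candidates for the similarity calculation.
    """
    groups: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for r, t, n in history:
        groups[(t, n)].add(r)
    return {key: sites for key, sites in groups.items() if len(sites) > 1}


def classify_pair(a: Tuple[str, str], b: Tuple[str, str]) -> Optional[str]:
    """Classify two (T, N) records taken from two different sites."""
    (t_a, n_a), (t_b, n_b) = a, b
    if t_a == t_b and n_a == n_b:
        return "Case-I"    # noun part and destination site both the same
    if n_a == n_b:
        return "Case-II"   # only the noun part is the same
    if t_a == t_b:
        return "Case-III"  # only the destination site is the same
    return None            # unrelated records


# Triplets recorded for the Fig. 2 example
history = [
    ("biztech.nikkeibp.co.jp", "www.kodak.co.jp", "Kodak"),
    ("www.watch.impress.co.jp/pc", "www.kodak.co.jp", "Kodak"),
    ("www.watch.impress.co.jp/av", "www.kodak.co.jp", "Kodak"),
    ("www.zdnet.co.jp/news", "www.kodak.co.jp", "Kodak"),
]
candidates = sites_sharing_content(history)
```

With the Fig. 2 triplets, the four referring sites end up in a single group keyed by the destination `www.kodak.co.jp` and the noun part `Kodak`. Grouping by the pair (T, N) rather than by N alone reflects the observation in step 3 that matches on the noun part only are less likely to be related.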

watch vs. asahi
{http://www.microsoft.com/japan/misc/cpyright.htm, Microsoft} (The company name)
{http://www.microsoft.com/japan/misc/cpyright.htm, Corporation.}
{http://www.microsoft.com/japan/misc/cpyright.htm, ALL}
{http://www.microsoft.com/japan/misc/cpyright.htm, rights}
{http://www.microsoft.com/japan/misc/cpyright.htm, reserved}

Therefore, from this initial investigation, we can infer that if two sites have links with the same noun parts and the same destination sites, these two sites can be regarded as strong candidates for having a similarity relationship. Of course, this procedure is still at the stage of initial trials, and we are planning to verify its effectiveness by conducting full-scale verification trials.

IV. CONCLUDING REMARKS

In this study, only link information was used to detect similarity, on the grounds that in hypertext environments such as the WWW, links express the most information regarding the characteristics of content. On the other hand, a considerable amount of research is being done in the field of natural language processing on procedures that determine the degree of similarity by analyzing the text content. Reference [8] describes one example of a study in which this procedure is applied to the WWW. However, it has been concluded that this sort of conventional text-based procedure does not function effectively in hypertext environments such as the WWW [5],[6].

Studies of ways to detect mirror sites by focusing on the link structure include references [3] and [4]. However, the aim of those methods is to detect only complete mirror sites, and in these procedures, other information (such as the link connection relationships and the information from a DNS) is used besides the calculated degree of similarity corresponding to α in this study. Our procedure is different in that it regards some sites as being capable of being used as substitutes even though they have a low α value, and aims to detect these sites as well. To do this, we broadly divide web sites into hub-type and content-type sites, and the degree of similarity is calculated using methods tailored to each type. By comparing the degrees of similarity thereby obtained, it is possible to automatically judge whether or not web pages can be used as substitutes for each other.

In this paper, we focused on the operation of this similar web site detection method and proposed an effective procedure for finding candidates for similar web sites that match the user’s preferences. This involves storing the connection relationships and label parts of links in sites accessed by the user, and extracting similar site candidates by starting from sites where the noun parts of the labels are the same.

ACKNOWLEDGEMENTS

We thank our executive manager, Dr. Keiichi Koyanagi of NTT Network Innovation Labs, and the researchers of the Computer Networking Principles Research Group.

REFERENCES

[1] Toshio Hirotsu, Satoshi Kurihara, Toshihiro Takada, and Toshiharu Sugawara: ARESAIN - Alternative Resource Access Information Navigator, Thirteenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2001), 2001.
[2] Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, and Toshiharu Sugawara: Mirror Site Navigator using Link Information, Proceedings of World Multiconference on Systemics, Cybernetics and Informatics (SCI2000), pp. 283–290, 2000.
[3] Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, and Monika Rauch Henzinger: A Comparison of Techniques to Find Mirrored Hosts on the WWW, Journal of the American Society for Information Science (JASIS), Vol. 51, No. 12, pp. 1114–1122, Nov. 2000.
[4] Narayanan Shivakumar and Hector Garcia-Molina: Finding near-replicas of documents on the web, International Workshop on the World Wide Web and Databases (WebDB ’98), 1998.
[5] O. Zamir and O. Etzioni: Grouper: A Dynamic Clustering Interface to Web Search Results, The Eighth International WWW Conference, 1999.
[6] L. Page, S. Brin, R. Motwani, and T. Winograd: The PageRank Citation Ranking: Bringing Order to the Web, Work in progress. http://google.stanford.edu/~backrub/pageranksub.ps.
[7] http://chasen.aist-nara.ac.jp/
[8] S. Chakrabarti, B. Dom, P. Raghavan, S. Rajagopalan, D. Gibson, and J. Kleinberg: Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text, The Seventh International WWW Conference, pp. 65–74, 1998.
[9] S. Kurihara, S. Aoyagi, R. Onai, and T. Sugawara: Adaptive Selection of Reactive/Deliberate Planning for the Dynamic Environment, Robotics and Autonomous Systems, Vol. 24, No. 3–4, pp. 183–195, 1998.
