General Chairman: Veljko Milutinovic, School of Electrical Engineering, University of Belgrade
Deputy General Chairman: Frédéric Patricelli, Telecom Italia Learning Services (Head of International Education)
Conference organization staff:
Conference Managers:
Miodrag Stefanovic, Cesira Verticchio
Conference Staff:
Renato Ciampa, Veronica Ferrucci, Maria Rosaria Fiori, Maria Grazia Guidone, Natasa Kukulj, Bratislav Milic, Zaharije Radivojevic, Milan Savic
SSGRR-2002s - Papers
1. .NET All New?
Jürgen Sellentin, Jochen Rütschlin
2. A center for Knowledge Factory Network Services (KoFNet) as a support to e-business
Giuseppe Visaggio, Piernicola Fiore
3. A concept-oriented math teaching and diagnosis system
Wei-Chang Shann, Peng-Chang Chen
4. A contradiction-free proof procedure with visualization for extended logic programs
Susumu Yamasaki, Mariko Sasakura
5. A Framework For Developing Emerging Information Technologies Strategic Plan
Amran Rasli
6. A Generic Approach to the Design of Linear Output Feedback Controllers
Yazdan Bavafa-Toosi, Ali Khaki-Sedigh
7. A Knowledge Management Framework for Integrated design
Niek du Preez, Bernard Katz
8. A Method Component Programming Tool with Object Databases
Masayoshi Aritsugi, Hidehisa Takamizawa, Yusuke Yoshida and Yoshinari Kanamori
9. A Model for Business Process Supporting Web Applications
Niko Kleiner, Joachim Herbst
10. A Natural Language Processor for Querying Cindi
Niculae Stratica, Leila Kosseim, Bipin C. Desai
11. A New Approach to the Construction of Parallel File Systems for Clusters
Felix Garcia, Alejandro Calderón, Jesús Carretero, Javier Fernández, Jose M. Perez
12. A New model of On-Line Learning
file:///F¦/papers.html (1/15)2004/03/22 13:16:33 SSGRR-2002s - Papers
Marjan Gusev, Ljupco N. Antovski, Vangel V. Ajanovski
13. A New Paradigm for Network Management: Business Driven Device Management
John Strassner
14. A Prototype of a Retail Internet Banking for Thai Customers
Rawin Raviwongse, Pornpriya Koedrabruen
15. A Reuse-Oriented Approach for the Construction of Hypermedia Applications
Naoufel Kraiem
16. A Scientific Paradigm On Image Processing's Lecture
Sar Sardy
17. A Theory of Programming for e-Science and Software Engineering
Juris Reinfelds
18. A video based laboratory on the Internet, and the experiences obtained with high-school teachers
Fernando Gamboa Rodríguez, J.L. Pérez Silva, F. Lara Rosano, A. Miranda Vitela, F. Cabiedes Contreras
19. Web Engineering: Methods and Tools for Education
George E. Cormack, G. Griffiths, B. D. Hebbron, M. A. Lockyer, B. J. Oates
20. Adding Security to Quality of Service Architectures
Stefan Lindskog, Erland Jonsson
21. Advanced Mobile Multipoint Real-Time Military Conferencing System (AMMCS)
R. Sureswaran, A. Osman, M. S. Mushardin, M. Yusof, B. Husain
22. Advanced Optical Infrastructure for the Emerging Optical Internet Services
Marian Marciniak, Marian Kowalewski, Miroslaw Klinkowski
23. Agent-based Intelligent Clinical Information System
Il Kon Kim, Ji Hyun Yun, Sang Wook Lee, Hang Chan Kim
24. An approach for implementing Object Persistence in C++ using Broker
Kulathu Sarma
25. Digital Learning: Infrastructure and Web Culture
Alexei L. Semenov
26. An Efficient and Adaptive Method for Reservation of Multiple Multicast Trees
Shinji Inoue, Makoto Amamiya, Yoshiaki Kakuda
27. Distributed Information Systems Building Techniques
Petr Smolík, Tomáš Hruška
28. An eye-gaze input device for people with severe motor disabilities
Laura Farinetti, Fulvio Corno
29. Asia-Pacific and Australian Grid developments, the coming information Grid and the iSpace
Bernard A. Pailthorpe, Nicole S. Bordes
30. Askemos - a distributed settlement
Jörg F. Wittenberger
31. Automatic Determination of Cluster Size Using Machine Learning Algorithm
Jihoon Yang, Sung-Hae Jun, Kyung-Whan Oh
32. B2E Business to Employee and Business to Everything
Prashant Killipara
33. Business Methods: Patent Practice at the European Patent Office
Jörg Machek
34. CAL-Visual, an E-Education Tool for the Management of Digital Resources
Dino Bouchlaghem
35. Casual databases
Ken Roger Riggs
36. CLINICAL ANTHROPOLOGY - A NEW EDUCATIONAL METHOD FOR ETHICS AND HUMANITY USING INTERNET
Shinichi Shoji
37. Communication Behavior and Collaboration in Virtual Seminars - Experiences
Birgit Feldmann, G. Schlageter
38. DEC (Decision Making in Business Environment) Computer Age of business (Admiration or Admission?)
Nanayaa Owusu Prempeh
39. Symbolic Computation in Research
Qi Zheng
40. Computer-Mediated Communication (CMC): A Shift Towards E-Education Systems In Malaysia
Rozhan M Idrus
41. Computer Simulation Of Combustion in Particulate 2-Phase Flow
Aleksandar Saljnikov, Simeon Oka, Elmira Karbozova, Miroslav Sijercic
42. COMPUTING OUR WAY TO THE ULTIMATE TOY
Kristinn R. Thorisson
43. Cooperative Learning: Multi-user Support for Advanced Distance Learning Services
Andrea Bör
44. Coordinating Representations for Collaborative Systems
Richard Alterman, Alex Feinman, Seth Landsman, Josh Introne
45. Coordinatized Graphs: Interplay Between Graphical Properties and Adjacency Systems
Andrew Woldar
46. Cryptographic Schemes in Secure e-Course eXchange (eCX) for e-Course Workflow
Lucas C. K. Hui, Joe C.K. Yau
47. Data Mining from a Web Browser
David J. Haglin, Richard J. Roiger
48. Design and implementation of an adaptive learning management system
Giorgio Casadei, Matteo Magnani
49. E-learning, Metacognition and Visual design
David Kirsh
50. Development of Computer-Based Activities for Peer-Led Team Learning in University-Level General Chemistry
John Goodwin
51. Differentiated Multilayer Resilience in IP over Optical Networks
Achim Autenrieth
52. Distributed Medical Intelligence via Broadband Communication Networks
Constantinos Makropoulos
53. Multimedia based Learning and Working: a Cooperation of University with Industry
Peter Deussen, Hartmut Ehrich, Tim Young Weissschädel, Christian Zorn
54. Document Ontology: A Statistical Approach
Sadanand Srivastava, James Gil de Lamadrid, Chakravarthi S. Velvadapu
55. Does Attentional Load Affect Discourse Management in On-Line Communication?
Claude G. Cech, Sherri L. Condon
56. E-Book, an e-learning tool for Engineering Undergraduates
Eduardo Gomez-Ramirez
57. E-Business Management and Workflow Technologies
Zeljko Djuricic, Natasa Ilic, Zeljko Djuricic, Veljko Milutinovic
58. Economic Decision-making in a Technological Age
James R. Forcier
59. Complexity and the Emergent Web
Sorin Solomon, Eran Shir
60. E-Diagnosis Using GeneChip Technologies
Zhao Lue-Ping, S. Gilbert, C. Defty
61. e-DOCPROS: An e-Business Document Processing System
Zhenfu Cheng, Xuhong Li
62. Effects of Changing the Pedagogical Concept of a Part-time Bachelor of Science in Accounting from Traditional Lectures into an IT-supported Asynchronous and Flexible Teaching & Learning Concept
Lars Kiertzner, Maya Dole, Tage Rasmussen
63. e-Infrastructure in a complex environment
Julian Smith
64. E-learning at ENSAIT: a case study
Pierre Douillet, S. Pessé, A. M. Jolly
65. E-Learning Content Creation with MPEG-4
Michael Stepping
66. E-Learning of Spanish with Interactive Video and Blackboard Technologies for Elementary School Children
Julia Coll
67. EMERGENCY! Medicine and Modern Education Technology
Dag K.J. E. von Lubitz, Benjamin Carrasco, Francesco Gabbrielli, Frederic Patricelli, Tymoty Pletcher, Caleb Poirier, Simon Richir
68. e-Medicine Utilization: Socio-cultural issues
Robert Doktor, David Bangert
69. Emerging market mechanisms in Business-to-Business E Commerce A framework
B. Mahadevan
70. Enhanced Security Watermarking and Authentication based on Watermark Semantics
Dimitrios Koukopoulos, Y. C. Stamatiou
71. Environment for Teaching Support in the Medical Area
Rosa Maria Vicari, Cecilia Dias Flores, Louise Seixas, André Silvestre
72. Epidemic Communication Mechanisms in Distributed Computing
Oznur Ozkasap
73. Evaluating Java Applets for Teaching on the Internet
Michael R. Healy, Dale E. Berger, Victoria L. Romero, Amanda Saw
74. Evaluating network intrusion detection algorithm performance as attack complexity increases
Dirk Ourston, Bryan Hopkins, Sara Matzner, William Stump
75. Evaluating the Quality of Service for a Satellite Based Content Delivery Network
Helmut Hlavacs, Guido Aschenbrenner, Ewald Hotop, Aadarsh Baijal, Ashish Garg
76. Evaluation and perspectives of innovative Tunisian e-learning experimentation
Mohamed Jemni, Henda Chorfi
77. Evaluation of Minimal Deterministic Routing in Irregular Networks
Tor Skeie, Ingebjørg Theiss, Olav Lysne
78. Evolution and Convergence in Telecommunications
Gennady G. Yanovsky
79. Extending SOAP for handling lightweight transactional information
Mario Jeckle
80. Extending the Personal Response System (PRS) to Further Enhance Student Learning
Joan Wines, Julius Bianchi
81. Federated Profile Information Architecture
Guoping Jia
82. Fusion of Multiple Images with Robust Random Field Models
Kie B. Eom
83. e-Business MUSIC: New Ways to Perform Introspection Within the Corporation
Enrique Espinosa
84. Generating color palettes for compressed video sequences
Yuk-Hee Chan, Wan-Fung Cheung
85. Genetic Algorithms for Internet Search: Examining the Sensitivity of Internet Search by Varying the Relevant Components of Genetic Algorithm
Vesna Šešum, Dragana Cvetkovic
86. GroupIntelligence: Automated Support for Capitalising On Group Knowledge
Jules de Waart, Michiel van Genuchten
87. From Innovators to Laggards: Computer Scientists and E-learning
Roger Boyle, Martyn Clark
88. How to find similar web sites by using only link information
Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, Osamu Akashi, Toshiharu Sugawara
89. Hardware RAID - 5 versus Non-RAID solution under UNIX Operating System
Borislav Djordjevic, Stanislav Miskovic, Nemanja Jovanovic, Veljko Milutinovic
90. Identity Management: a Key e-Business Enabler
Marco Casassa Mont, Pete Bramhall, Mickey Gittler, Joe Pato, Owen Rees
91. Impacts of the Global Information Society on the Banking Industry
Ondrej Slapak
92. Implementation of a remote-assistant application via Web over IP networks: CIMA Project
Francisco Sandoval, Francisco Javier González Cañete, Francisco Miguel García Palomo, Eduardo Casilari Pérez
93. Implementation of Feedback TM an Application for Quality Assurance, Learning and e-Communication of Diagnosis of Medical Images
M. Bergquist, H. Gater, O. Flodmark, J. Hedin, S. Hedin, M. Hellström, B. Jacobson, B. Johansson, N. Lundberg, K. Måre, J. Wallberg, P. Wenngren
94. The social infrastructure of E-Education
Peter Lyman
95. Information Publishing on FRIENDS
Alfons H. Salden, Ronald J. van Eijk, Mortaza S. Bargh, Johan de Heer
96. Infrastructure For E-Business, E-Education, E-Science, and E-Medicine; Challenges For Developing economies. The Nigerian Experience
Babatunde O.R. Ogundele
97. Observations and prescriptions for Web Standards
David Bodoff, Mordechai Ben-Mehachem
98. Infrastructure in Education time to learn lessons from elsewhere?
Tony Shaw
99. Infrastructure, requirements and applications for eScience: a European perspective
Ron Perrott
100. Infrastructures for Mobile Services in e-Medicine
Heinz Thielmann
101. SEMPEL: A Software Engineering Milieu for PEer-Learning
V.Lakshmi Narasimhan
102. Integrating Emerging E-Technologies into Traditional Classroom Settings
Jay M. Lightfoot
103. Integrating the Teaching of Psychology with Web-Based Distant Learning: Practicum and Internship
V. Wayne Leaver
104. Interconnecting Networks and the Performance of Multithreaded Multiprocessors
Wlodek Zuberek
105. INTERNET PRIVACY CONCERNS AND TRADE-OFF FACTORS EMPIRICAL STUDY AND BUSINESS IMPLICATIONS
Tamara Dinev, Paul Hart
106. Internet: A Powerful Tool in Disseminating Medical Knowledge in Urban and Rural India
Deena Suresh, Dr. CB Sridhar
107. Introduction of Information Infrastructure for Medical Academic Activities in Japan - UMIN and MINCS-UH
Takahiro Kiuchi
108. Learning and Networking Concepts and Components of the Global Seminar
Dean Sutphin
109. Learning Objects -Pedagogy Based Structuring of Course Materials
Paul Juell, Elizabeth Smith, Lisa Daniels, Vijayakumar Shanmugasundaram
110. Living Book: an Interactive and Personalized Book
Margret Gross-Hardt, Peter Baumgartner, Anna B. Simon
111. Local telematics services for higher education
Joze Rugelj
112. Maestro: A Middleware for Distributed applications based in components software
Jorge Risco Becerra
113. Magenta Multi-Agent Engines For Decision-Making Support
Peter Skobelev, V. Andrejev, S. Batishchev, K. Ivkushkin, I. Minakov, G. Rzevski, A. Safronov
114. Mapping Object Oriented Models into Relational Models: a formal approach
Pedro Ramos, Luís Rio
115. Marketing and Engineering Criteria for the Implementation of a top level Internet Infrastructure
Enrique S. Draier
116. Marrying Sanskrit to Java - an e-tutor for Sanskrit
Sudhir Kaicker, Jayant Shekhar
117. m-commerce: why it does not fly (yet?)
Peter Langendoerfer
118. Measurement Technique for Object Oriented Systems
Sallie Henry, Cary Long
119. Measuring the Effectiveness of Internal Electronic Communication Channels in Achieving Business Goals
Angela Sinickas
120. Medical eLearning, eTraining and Interactive Telemedicine via Satellite in the operating room of the future
G. Graschew, T. A. Roelofs, S. Rakowsky, P. M. Schlag
121. Meta-Learning Functionality in eLearning Systems
Ulrik Schroeder
122. Mobile Commerce: Some Extensions of Core Concepts and Key Issues
Christer Carlsson, Pirkko Walden
123. Models for E-Learning environment evaluation: a proposal
Francesco Colace
124. Multidrop Generic Framing Procedure (GFP-MD)
Kari Seppänen
125. Multi-grid Parallel Algorithm with Virtual Boundary Forecast for Solving 2D Transient Equation
Guo Qingping, Yakup Paker, Dennis Parkinson, Wei Jialin, Zhang Sheng
126. Multilingual Multimedia Electronic Dictionary for Children
Valentin E. Brimkov, Reneta P. Barneva, Peter L. Stanchev
127. Mutual indexing of video and bulletin board for lecture video
Hirohide Haga
128. Navigation Support System for Live e-CRM
Hideto Ikeda, Nikolaos Vogiatzis, Aki Shibuya
129. New media - traditional universities: Success factors and obstacles for e-learning technologies
Georg J. Anker, Yuka Sasaki
130. Object Oriented Communication Design Tool usable for Everyone
Hajime Nonogaki
131. On the Viability on E-learning
Sunil Choenni
132. Optimal Link Allocation and Charging Model
Jyrki Joutsensalo, Timo Hamalainen
133. Oral Metaphor Construct. New direction in cognitive linguistics
Asa Stepak
134. Our Progress Into E-Business Education: How we have Incorporated Higher-Order Thinking Skills into our Web-based class
Rexford H. Draman, Robin Eanes
135. Overview of the role of ATM/AAL2 Aggregator in UMTS Access Network
Aleksandar D. Petrovic
136. Parallel Solutions of Coupled Problems
Felicja Okulicka Dluzewska
137. Practical Traffic Grooming Formulation for SONET/WDM Rings
Paul Ghobril
138. Privacy Issues Arising from a Smart_ID Application in eHealth
John Fulcher
139. The Brave New World of the Cyber Speech and Hearing Clinic: Treatment Possibilities
William R. Culbertson, Dennis C. Tanner
140. GIS and DGPS via Web: the GIS on line of the Everest National Park
Giorgio Vassena, Roberto Cantoni, Carlo Lanzi, Giuseppe Stefini
141. Holarchies on The Internet: Enabling Global Collaboration
Mihaela Ulieru
142. Purdue Center for Technology Roadmapping: A Resource for Research and Education in Technology Roadmapping
Edward J. Coyle
143. QUALITY MANAGEMENT SYSTEM BASED ON THE NATIONAL TRAUMA REGISTRY
Drago Brilej, Radko Komadina
144. Rationales for Consumer Adoption or Rejection of E-Commerce: Exploring the Impact of Product Characteristics
Bill Anckar
145. Some remarks on time modelling in interactive computing systems
Merik Meriste, Leo Motus
146. Schema Validation Applied to Native XML Databases
Gongzhu Hu, Qinglan Li
147. Search and Discover on the Web
Bipin C. Desai
148. Server Load Balancing in the Next Generation Internet
Jamalul-lail Abdul Manan, Habibah Hashim
149. Service Oriented Community Systems for Mobile Commerce
Kinji Mori
150. SIP and the Internet
Gianni Scandroglio
151. Socrates Meets The Web: Incorporating the Internet Into U.S. Law Classes
Anna Williams Shavers
152. Software Issues for Applying Conversation Theory For Effective Collaboration Via the Internet
William Klemm
153. Solving scaling problems with the modern GUI
Peter M. Bagnall
154. Some like it soft
Ole Lauridsen
155. A Monitoring System for Manufacturing Machines Based on SNMP
Sangyong Lee, Joongsoon Jang, Gihyun Jung, Kyunghee Choi
156. Enhancing IDS performance through dropping hacking-free packets
Jongwook Moon, Jongsu Kim, Gihyun Jung, Kangbin Yim, Kyunghee Choi, Haiyoung Yoo
157. Analysis on Utilization and Delay of Memory in a Lossless Packet Processing System
Jongsu Kim, Jongwook Moon, Gihyun Jung, Kangbin Yim, Kyunghee Choi, Joongsoon Jang
158. Protecting Mail Server using the CBT algorithm
Hyun-Suk Lee, Soo-Juong Lee, Hui-Sug Jung, Gihyun Jung, Kyunghee Choi
159. Storage Technologies for an Efficient e-Infrastructure
Satish Rege
160. Structured Metadata Analysis
Steve Probets
161. Superscalar in City-1: An Educational Guide to the next step beyond Pipelining
Ryuichi Takahashi, Noriyoshi Yoshida
162. Supervision of Electrical Utility Works Based on Internet
Felipe Alaniz, Pablo R. de Buen
163. Teaching Novices Programming Skills Efficiently: What, When and How?
Yuh-Huei Shyu
164. Teaching, Technology and Teamwork
Elaine Carbone, Shaun Stemmler, Jon Beal
165. Software solutions for Science e-Education: A case study from the VISIT Project
Yichun Xie
166. Technologies for Student-Generated Work in a Peer-Led, Peer-Review Instructional Environment
Brian P. Coppola, Ian C. Stewart
167. TEN WAYS TO IMPACT THE WEB WITHOUT A WEB MEISTER
Ken McNaughton
168. The Architecture of Knowledge: Representation and Theorization of Violence on the Internet
Lily Alexander
169. The emergence of web-mediated genres: the home page
Anne Ellerup Nielsen
170. The Emerging Autosophy Internet
Klaus Holtz, Eric Holtz
171. The Future of Education
Lalita Rajasingham
172. The impact of internet technologies on the financial markets
Ross A. Lumley
173. The Mathematical Structure model of a Word-unit-based Program
Hamid Fujita, Osamu Arai
174. The Role of XML in E-Business
Betty Harvey
175. Think before you click: customers' challenges in the e-commerce
Zita Zoltay Paprika
176. Topological Design of Multiple VPNs over MPLS Network
Anotai Srikitja, David Tipper
177. Toward logical-probabilistic modeling of complex systems
Taisuke Sato, Yoichi Motomura
178. Towards to e-transport
Miroslav Svítek, Mirko Novák
179. Emergence and Evolution of Microturbine Generators [MTGs] to Provide Infrastructure for E-Related Applications
Stephanie L. Hamilton
180. Using Building Blocks to Implement a Business-to-Supplier Portal
Shannon Fowler
181. Using CORBA Interceptors to Implement a Security Wrapper
Luigi Romano, D. Cotroneo, A. Mazzeo, S. Russo
182. Using Internet and Database Technology to Enable Collaboration between Researchers and Teachers Developing Educational Websites Featuring Endangered Species Research and Conservation
Mary A. Overby, Mark MacAllister, Jeffrey Hoffman, Chris Bulla
183. Using the Quick Look Methodology to Plan and Implement Complex Information Technology Transformations
Richard C. Staats
184. Verifying and Leveraging Software Frameworks
Trent Larson
185. Virtual Communities for Service Delivery: Transferring the Notion of Pro-Social Behavior from "Place" to "Space"
Ko de Ruyter, Caroline Wiertz, Sandra Streukens
186. Visualizing Molecules Helps Students 'See' Chemistry in a New Light
Harry Ungar, Albion Baucom
187. Wavelet-based Blind Watermark Embedding Technique
Sanghyun Joo, Yongseok Seo, Youngho Suh
188. Web-based Tools for Supporting Health Education
William B. Hansen
189. What can you do with a frozen leg of lamb? - Connecting products and information services in a web-based environment
Benkt Wangler, Ingi Jonasson, Eva Söderström
190. What is Virtual about the Web?
Murat Karamuftuoglu
191. Wireless Control of the Virtual Kiosk
Charles A. Milligan, Steven H. McCown
192. World-wide interaction with 3D-data
Gerd Kaupp, Svetlana Stepanenko, Andreas Herrmann
193. XML Technologies, Value-based Marketing, Franchising and the New Paradigm in Business
Dino Karabeg
194. Electronic Public Transmission Act of 2002 to cope with the Convergence and as the Minimum Regulations on the Internet
Koichiro Hayashi
195. Domain-Specific Language Agents
Merik Meriste, Jüri Helekivi, Tõnis Kelder, Leo Motus
196. Development of Distributed Package of Finite Element Method
F. Okulicka-Dluzewska, J.M. Dluzewski
197. Feasibility Study and Strategic Business Analysis of Fuel Ethanol Production in Indiana (May 2002)
Dusan V. Milutinovic
198. Virtual Marketplace on the Internet (May 2002)
Zaharije R. Radivojevic, Živoslav Adamovic, Veljko M. Milutinovic
199. Dissemination Of World Health Organisation Reproductive Health Library (WHO-RHL) Information To Doctors In India RECON HEALTHCARE, BANGALORE MODEL
CB Sridhar, Deena Suresh
200. Technological High School Education through Internet
L. Pérez Silva, F. Cabiedes Contreras, F. Gamboa Rodríguez, F. Lara Rosano, A. Viniegra Hernández
201. Agent-based brokerage of personalised B2B mobile services
Alfons H. Salden, Ronald J. van Eijk, Mortaza S. Bargh, Johan de Heer
202. Denial of Service Attacks: methods, tools, defenses
Fred Darnell, Bratislav Milic, Milan Savic, Veljko Milutinovic
203. Scalability and Knowledge Reusability in Ontology Modeling
Mustafa Jarrar, Robert Meersman
204. (Web) Self Service
Tanja MILOŠEVIC
How to find similar web sites by using only link information
Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, Osamu Akashi, and Toshiharu Sugawara
NTT Network Innovation Labs.
3-9-11, Midori-cho, Musashino-shi, Tokyo, 180-8585 JAPAN
[email protected]
Abstract—We are studying techniques that allow even ordinary end users to make efficient use of the Internet. We previously proposed an algorithm for determining the degree of similarity between web sites by using link information to find web sites that are mirrors of each other and ones that are not mirrors but have similar content and can be used as substitutes for each other. As a result of verifying the basic effectiveness of that algorithm, we found that when trying to find similar web sites to site-A, in addition to ones found to have almost 100% similarity to site-A, there were also ones that were thoroughly adequate for use as substitutes for site-A, even though they had a low degree of similarity of 50% or less. Therefore, for practical use of that algorithm, it is essential to be able to automatically judge whether web sites that can be inferred to have some kind of similarity are actually mirror sites or similar sites that can be used as substitutes. To solve this problem, in this paper, we propose and evaluate the basic effectiveness of an automatic judgment methodology, and we focus on its operation and propose a methodology for effectively finding candidates for a similar site by using a user’s Internet access history.

Index Terms—Internet, Mirror site, Access history, Link information

I. INTRODUCTION

Due to the rapid expansion of the Internet, it has become possible for ordinary end users to obtain many kinds of information easily. However, it is still difficult for them to use the network effectively. For example, although mirror servers and cache servers have been provided in order to improve scalability and response times, it is difficult for users to identify the optimal server.

To solve this problem we have already proposed a “URL Resolver” framework, which allows users to select the optimal server from multiple servers that provide various kinds of services via data storage facilities such as caches or mirror servers [1]. To enable users to select one of the servers, it is first necessary to gather information such as a list of servers that might be useful to the user. Initially, we focused on information related to mirror sites or similar sites, and have already proposed a basic algorithm for detecting similar web sites by focusing on the link information embedded in web pages [2]. As a result of verifying the basic effectiveness of that algorithm, we then found that there are some sites that are thoroughly adequate for use as substitutes yet have a degree of similarity of no more than 50%. But to use this detection method in practice, it is necessary to employ a mechanism for automatically judging whether sites for which a low degree of similarity has been detected are mirrors or similar sites that can actually be used instead of mirrors.

So, in this paper, we propose an automatic determination algorithm, in which web pages are divided into hub-type and content-type and the judgment is done based on the results of judgment algorithms specifically tailored to each type of web page. Initial trials of this approach have yielded favorable detection results. We also examine the operation of this detection methodology and propose an algorithm for effectively finding candidates for similar web sites by using the user’s access history to the Internet.

Section 2 reviews our previously proposed algorithm for detecting similar web sites based on link information and describes our new automatic similar web site detection method. Section 3 discusses a similar-web-site candidate finding method.

II. USING LINK INFORMATION TO FIND SIMILAR WEB SITES

A. HOW TO FIND A SIMILAR WEB SITE?

We define a mirror site as follows:

If the link structure of site-A is very similar to that of site-B, then sites-A and -B are mirrors of each other.

This definition is based on the observation that mirror sites or sites that hold information that is so similar that they closely resemble mirror sites should more or less match in terms of the number and types of links embedded within them, even if there are slight differences such as inserted advertising banners (This definition is based on [3]).

The detection method we proposed in reference [2] is as follows. Assume starting site-A and mirror candidate site-B. The degree of similarity between these two sites is as follows. The total number of embedded inward links that can be gathered when tracing the links of web pages to a depth of N levels from the top web page of site-A is referred to as $\mathrm{url}(A_{in})^N$, and the total number of embedded outward links is referred to as $\mathrm{url}(A_{out})^N$.
Here, an inward link is one in which the host part of the link’s destination URL is the same as that of the current host, and an outward link is one in which the host part of the link destination is different. The corresponding properties of site-B are similarly expressed as $\mathrm{url}(B_{in})^N$ and $\mathrm{url}(B_{out})^N$. Then, the total number of inward links in $\mathrm{url}(B_{in})^N$ that are also included in $\mathrm{url}(A_{in})^N$ is expressed as $\mathrm{url}(A_{in} \cap B_{in})^N$, while the corresponding property of the outward links is expressed as $\mathrm{url}(A_{out} \cap B_{out})^N$. At this point, when determining the value of $\mathrm{url}(A_{in} \cap B_{in})^N$, the comparisons are made after replacing the host parts of site-A and -B with the same arbitrary text strings. So, the degree of similarity between site-A and -B when links are followed to a depth of N levels is denoted by the symbol α, which is given by

$$\alpha = \frac{\mathrm{url}(A_{in} \cap B_{in})^N + \mathrm{url}(A_{out} \cap B_{out})^N}{\mathrm{url}(A_{in})^N + \mathrm{url}(A_{out})^N} \times 100\ (\%).$$

Since this procedure only compares the link structures, it does not perform a text-level comparison of every character in every word on the web pages. The reasons for adopting this approach are as follows:

(1) As mentioned above, we think it is possible to judge the similarity of hypertext documents such as web pages by comparing only their link structures. The practicality of focusing on the link structure in web pages is also highlighted in other studies [4] and [5].
(2) In this procedure, although a lot of processing time is taken up by gathering web pages, the amount of text to be compared also increases substantially when links are followed to a depth of several levels, and much more time is required for comparing text than for comparing just the links. Besides, the gathering work can be sped up by increasing the network bandwidth, so we decided not to compare the processing times required for a text-level comparison.
(3) We are planning to use this similar-web-site finding method even in environments with limited processing resources, such as users’ notebook PCs. Therefore, considering the storage of information obtained when calculating the degree of similarity, performing a text-level comparison would require all the text information to be stored, which would be a waste of resources. The link information takes up considerably less space than the text information, which is another reason why we decided not to perform a text-level comparison.

Fig. 1: Detection results

Next, we discuss the way to find candidates for mirror sites. Using a web robot to recursively access suitable sites indiscriminately and compare them with site-A to find a candidate for site-B would be far too inefficient, so instead we adopted the following strategy:

Since web sites that employ mirror servers do so with the aim of dispersing the load, they will probably want to provide users accessing the site with information about these mirrors. In other words, it is reasonable to assume that the site will make some mention of where its mirrors can be found. Accordingly, it is highly likely that site-B can be found by gathering and analyzing the content accessible within a certain number of link levels from the top page of site-A. It is also highly likely that the web site will contain links to sites of a similar nature, so there should be a high likelihood of being able to find similar sites by checking the link structure.

For the actual trials, we extracted 1000 URL entries from the access log stored in a proxy server used by our organization of about 200 people, and applied our similar-site detection program to each one (see α in Fig. 1). Similar site candidates were detected for 65% of these. It is interesting to note that we found that many sites can be used as substitutes, even though their degrees of similarity were less than 50% (see Table 1). So, if a way can be found to automatically judge whether or not they are actually capable of being used as substitutes, then it should be possible to present a greater number of sites to the users in addition to the sites having a high degree of similarity for which judgment is unnecessary.

Degree of similarity    Strength of relationship between two sites
0%–10%                  Probably unrelated
10%–60%                 May include some sites of a similar
60%–90%                 Either a mirror site or a site that is highly similar
90%–100%                Almost certainly a mirror site

Table 1: Results of classifying detected sites

B. SITES CAPABLE OF BEING USED AS SUBSTITUTES

Thus, we propose the following detection method. First, we divide web pages into the following two broad categories according to the style of user access.

• Web pages that are accessed as a starting point for net surfing are called hub-type sites.
• Web pages that are accessed in order to view the content on the page itself are called content-type sites.

Then, by considering the conditions of sites that can be considered as substitutes for hub- and content-type sites, respectively, we propose the following degree-of-similarity calculation methods.

1) METHODOLOGY FOR JUDGING A HUB-TYPE SITE

For example, consider hub-type sites-A and -B. If site-A has many embedded external links that are the same as those in site-B, then it is highly likely that the user will be able to use both sites equally.

That is, site-A and the possible substitute candidate site-B are deemed to have a greater degree of similarity with respect to their outward links if they satisfy the condition

$$\alpha < \beta, \qquad (1)$$

where α is as defined in Section 2.1 and

$$\beta = \frac{\mathrm{url}(A_{out} \cap B_{out})^N}{\mathrm{url}(A_{out})^N} \times 100\ (\%)$$

is the degree of similarity related to outward links only, from which it can be inferred that site-B is highly likely to be suitable for use as a substitute hub-type site.

2) METHODOLOGY FOR JUDGING A CONTENT-TYPE SITE

On the other hand, in a content-type site we think that there may be some differences in the page structure such as the way of embedding links, even if a site is capable of being used as a substitute. To deal with this, in addition to the links, we also use the label strings of links as important elements expressing the attributes of the links, and add them to the calculation of the degree of similarity as follows: we use these labels corresponding to the text enclosed within the links; e.g., the text string “XXXXXX” in the link “XXXXXX”, and the text string “YYYYY” in the link “ ”. We decided to rate these links by scoring them according to the length of the text strings “XXXXXX” and “YYYYY” embedded in their labels when a match is found between a pair of labels. Note that the calculation is performed using only the text string “XXXXXX” for links where the text strings “XXXXXX” and “YYYYY” match.

In content-type sites like a news site, the headline of the article is usually used as a link label, and the string length of the headline is usually longer than that of a link label whose reference address is another Web site.

An example of a link in which the alt option is set is as follows:

<img src="http://a772.g.ak···/2.gif" width="84" height="42" alt="The Apple Store." border="0">

In the algorithm in [2], the degree of similarity was calculated using only the text string of the URL part “http://www.apple.com/store”, but here in addition to this, the link label “The Apple Store.” is also included in the calculation. The labels are scored according to the following rules:

(1) When the URL parts and label parts both match, the link is awarded a score corresponding to the number of characters in the label.
(2) When the URL parts match but the label parts are different, the link is awarded a score of 70% of the number of characters in the starting link label.
(3) When the label parts match but the URL parts are different, the link is awarded a score of 50% of the number of characters in the starting link label.
(4) When both parts are different, the link is awarded no score.

These scoring settings are made based on experience, and further study is required to investigate their validity. Also, for rule (3), since we focus on the similarity of the link structure, it could conceivably be wrong to consider cases where the URL parts are different. However, in the case of content-type sites, since we concentrate on the label parts, we decided to include cases where the URL parts are different in the detection by reducing the score awarded.

In the above example of “The Apple Store.”, if a link is detected whose URL parts and label parts are identical when matching the link with the mirror candidate site, then this link is awarded 16 points (the number of characters in “The Apple Store.”). In calculating the number of characters in a label, all single-byte and double-byte characters (including English and Japanese characters, spaces, and so on) are each counted as one character.

The value of $\mathrm{url}_{label}(A_{in})^N$ for the starting site is given by the sum of the number of characters in the labels added to each link in $\mathrm{url}(A_{in})^N$, and is awarded the maximum possible score when matching is performed with an identical mirror site. The value of $\mathrm{url}_{label}(B_{in})^N$ for a mirror candidate site-B is defined in the same way. Furthermore, the value of $\mathrm{url}_{label}(A_{in} \cap B_{in})^N$ is given by the sum of the

the most suitable URL is selected on the basis of information from the mirror information managing agent. On the other hand, there may be a small amount of time available up until the user clicks an anchor within that URL, and if such is the case, it might be possible to select the most suitable URL according to new transfer rate information. If this can be accomplished, access will be forcibly changed to the most suitable URL when the user clicks the anchor.

C. INITIAL TRIALS

As mentioned in Section 2.1, Fig. 1 shows the results of calculating β and γ for 1000 URLs.
About 18% of the scores awarded for matching combinations of the URLs were classified as hub-type sites with a degree of abovementioned URL parts and label parts in each similarity of 30% or more, and about 6% of them were respective label. If the degree of similarity between site-A classified as content-type sites. Both these include URLs and -B in terms of inward links is given by that were detected as hub-type and content-type pairs respectively. We then manually checked 50 sites for
which either β or γ was 30% or more from among the ∩ N urllabel (Ain Bin) sites classified as hub- or content-type sites, and found γ = N x 100(%) , then if urllabel (Ain) that all of them were indeed suitable for use as substitutes for these hub- and content-type sites. In future we plan to α <γ , (2) perform verifications with a greater number of access logs and to investigate and examine the reliability of the it is judged to be likely that the mirror candidate can be degree of similarity calculations. used as a substitute for a content-type site.
When the number of links used as the denominator of α or β or γ, when judging Equations (1) and (2), is small (in III. USING USER’S ACCESS HISTORY the current version, less than 10), the degree of similarity With the procedure in Section 2.1, we were able to detect is recalculated by following links to a greater depth. similar sites by using an access log as a starting point. Unfortunately, this procedure is unable to detect sites that However, it will be more effective if it is possible to have a mirror relationship but have differences between detect similar sites that are significant for each individual both the link structure and the labels added to the links. end user. Of course, facilities such as proxy servers However, it is doubtful whether many sites of this sort contain access logs that reflect the character of the actually exist. community that uses them, and can themselves be thought At present, the following simple techniques are used to of as candidates for similar web sites that may match the select a URL from the list of URLs that are possibly users’ preferences. However, in the current version, we mirrors: (1) select the site with the highest likelihood and can only find mirror or similar sites that are limited to the (2) select the candidate with the highest transfer rate at the range of sites traced from the starting host. That is, we do time of retrieval in the case of multiple candidates by not evaluate the degree of similarity between different phase (1). In the future, we also plan to make use of the hosts in the access log. This is because it would lead to a transfer rate at the time of user access and feedback data combinatorial explosion and we judge it to be inefficient. from users, etc. 
Here we note that in terms of user access, However, it is clear that it is highly effective to detect the process of indicating the most suitable URL requires similar sites including hosts that cannot be reached from real-time properties unlike the mirror search process. In the detection origin host, not just the hosts recorded in the relation to the above, we are investigating using a access log. Therefore, we propose a method for retrieving technique that can flexibly select optimal strategies for mirror or similar sites by using the users’ access history to selecting a URL at any time through an algorithm that filter similar site candidates. executes multiple strategies in parallel [9]. For example, when a user is the first to access a certain URL, there is no time available that for measuring the transfer rate, and
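The hub-type and content-type calculations behind Equations (1) and (2) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Link`, `label_score`, `beta`, and `gamma` are hypothetical, url^N(·) is assumed to be a count of distinct link URLs already collected to depth N, and each link of site-A is assumed to be scored against its best-matching link in site-B, a pairing rule the text does not spell out.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Link:
    url: str    # URL part of the link
    label: str  # label part (anchor text or alt text)


def label_score(a: Link, b: Link) -> float:
    """Score starting-site link `a` against candidate link `b` using rules (1)-(4)."""
    n = len(a.label)  # each character counts as one, single- or double-byte
    same_url = a.url == b.url
    same_label = a.label == b.label
    if same_url and same_label:
        return float(n)   # rule (1): full label length
    if same_url:
        return 0.7 * n    # rule (2): URL matches, label differs
    if same_label:
        return 0.5 * n    # rule (3): label matches, URL differs
    return 0.0            # rule (4): nothing matches


def beta(a_out: Set[str], b_out: Set[str]) -> float:
    """Hub-type similarity (%): share of A's outward URLs also found in B."""
    if not a_out:
        return 0.0
    return 100.0 * len(a_out & b_out) / len(a_out)


def gamma(a_links: List[Link], b_links: List[Link]) -> float:
    """Content-type similarity (%): matched label score over the maximum score."""
    max_score = sum(len(link.label) for link in a_links)
    if max_score == 0:
        return 0.0
    # Assumption: each link of A is paired with its best-scoring link of B.
    matched = sum(
        max((label_score(a, b) for b in b_links), default=0.0) for a in a_links
    )
    return 100.0 * matched / max_score


store = Link("http://www.apple.com/store", "The Apple Store.")
print(label_score(store, store))  # 16.0, as in the text's example
```

A candidate would then be judged a likely substitute when `beta(...)` or `gamma(...)` exceeds the α value of Section 2.1; the 70% and 50% weights are the empirical settings stated in the rules above.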
Fig. 2: Examples of the same content in different web sites (excerpts from www.watch.impress.co.jp, www.nikkeibp.co.jp, and www.zdnet.co.jp).
Figure 2 shows excerpts from web pages related to the same content (new digital camera products) in four web sites. On finding an article about a new product on one web site, many people (certainly the authors do) habitually browse through other related web sites and look for articles related to the same content. This is because the content of the articles changes slightly from one site to the next. In the example shown in Fig. 2, an article related to resolution and the number of pictures that can be taken was only mentioned at www.nikkeibp.co.jp, while an article relating to the manufacturer’s business strategy was only mentioned at www.watch.impress.co.jp, and detailed specifications were only mentioned at www.zdnet.co.jp.

Here, when a user browses through some content zzz at a certain site-A, if sites-B, -C, and -D that are highly likely to contain similar articles related to content zzz (i.e., sites that have a high degree of similarity to site-A) have been detected, then the user will be more likely to obtain a greater amount of information if these sites are recommended to him/her. Next, we discuss how to efficiently extract sites-A, -B, -C, and -D.

1. First, we acquire the user’s access history as follows: every time the user clicks on a link, we extract site-R in which this link is embedded, site-T which is the destination site of the link, and label-L which is the text string of the link’s label. Moreover, label-L is subjected to morphological analysis (using widely used general-purpose morphological analysis software, like [7]) to extract several noun parts N1···n (see Note 1), and {R, T, N1···n} triplets are recorded. In the example shown in Fig. 2, the following lists are recorded:

   {biztech.nikkeibp.co.jp, www.kodak.co.jp, “Kodak”}
   {www.watch.impress.co.jp/pc, www.kodak.co.jp, “Kodak”}
   {www.watch.impress.co.jp/av, www.kodak.co.jp, “Kodak”}
   {www.zdnet.co.jp/news, www.kodak.co.jp, “Kodak”}

2. When there are pages carrying the same content in different sites, it is highly likely that the destinations of the outward links embedded within these pages will be the same, so there is a higher possibility that access histories such as {Ra, T, N} and {Rb, T, N}, where only site-R is different, can be extracted to evaluate similarity. In Fig. 2, a link to the manufacturer’s site “Kodak” is embedded in all the sites.

3. Then, by extracting from the access history the sites-R1...n for which the site-T and N terms are the same and only the R terms are different, we can obtain a list of sites where the same content appears, and the degree of similarity between these sites is calculated. Of course, a different site list could be accumulated from all the sites where only the noun parts N are the same, but in practice the content will have a lower likelihood of being related.

Note 1: By the morphological analysis, a noun is classified as a proper noun, a general noun, or an unknown word; we use proper nouns and unknown words to express the character of a link.

This procedure is very general-purpose because it can learn {R, T, N} triplets as soon as the user first accesses the sites, even if they have not yet been registered in the access log.

As for ways to recommend detected sites to the users, several methodologies can be considered. When the user has accessed any one of these sites, he/she is recommended to browse other sites with priority given to those having a higher degree of similarity. Moreover, when the user has accessed site-T which has already been recorded in {R1...n, T, N}, he/she is recommended to browse other sites with priority given to pages in the individual Rm of R1...n whose content has been updated recently. Another effective measure is to pre-fetch the contents of similar sites related to sites accessed by the user and to display a compiled list of these sites.

If there is a similarity relationship between site-A and site-B, but site-B is a competitor of site-A, it may be difficult to find the similar web site-B from site-A by using the strategy proposed in Section 2.1. However, by using the user’s access history, it may be possible to find a similarity relationship between site-A and site-B.

To verify the basic efficiency of this methodology, we investigated how many {R, T, N}s having the same noun part N and the same URL of the destination site T could be found from two similar sites. Figure 3 shows the results of this investigation. First, we extracted the link information from www.watch.impress.co.jp and www.zdnet.co.jp, which are hub-type sites concerning new products in the computer or office automation fields, and from www.asahi.com and www.yomiuri.co.jp, which are web sites of newspapers. Specifically, we extracted noun parts and their destination sites of the outward links to a depth of 3 levels (N=3). Second, we searched for the following types of {R, T, N}s from this link information:

Case-I: {R, T, N} in which noun part N and destination site T are both the same.

Case-II: {R, T, N} in which only noun part N is the same.

Case-III: {R, T, N} in which only destination site T is the same.

Finally, we checked whether each {R, T, N} found in Case-I did indeed express the character of the site T.

Site pair            All      Case-I   Case-II   Case-III   Case-I, N expresses T
watch vs. zdnet      24482    22       23966     494        20
asahi vs. yomiuri    2288     15       2198      76         14
watch vs. asahi      4389     8        4281      100        1
zdnet vs. asahi      15879    1        14789     89         0
watch vs. yomiuri    1869     1        1845      23         0
zdnet vs. yomiuri    6350     3        6289      58         0

(Case-I: both N and T are the same; Case-II: only N is the same; Case-III: only T is the same. Last column: Case-I entries in which N indeed expressed the character of T. watch: www.watch.impress.co.jp, zdnet: www.zdnet.co.jp, asahi: www.asahi.com, yomiuri: www.yomiuri.co.jp)

Fig. 3: Number of extracted {R, T, N}s

The results show that even though many {R, T, N}s were found for Case-II and Case-III from every combination of the sites, in Case-I, {R, T, N}s were mainly extracted only from combinations of the similar sites (“watch vs. zdnet” and “asahi vs. yomiuri”). Although several {R, T, N}s were also extracted from the combination “watch vs. asahi”, two sites that have no relationship between them, we could find only one {R, T, N} that indeed expressed the character of both sites (see the following partial lists of extracted {T, N}s).

watch vs. zdnet
{http://www.minolta.co.jp/, MINOLTA} (The company name)
{http://www.newtech.co.jp/, NEWTECH} (The company name)
{http://www.melcoinc.co.jp/, MELCO} (The company name)
{http://www.tsutaya.co.jp/, TSUTAYA} (The company name)
{http://www.sony.co.jp/sd/, SONY} (The company name)
watch vs. asahi
{http://www.microsoft.com/japan/misc/cpyright.htm, Microsoft} (The company name)
{http://www.microsoft.com/japan/misc/cpyright.htm, Corporation.}
{http://www.microsoft.com/japan/misc/cpyright.htm, ALL}
{http://www.microsoft.com/japan/misc/cpyright.htm, rights}
{http://www.microsoft.com/japan/misc/cpyright.htm, reserved}

Therefore, from this initial investigation, we can infer that if two sites have links with the same noun parts and the same destination sites, these two sites can be regarded as strong candidates for having a similarity relationship. Of course, this procedure is still at the stage of initial trials, and we are planning to verify its effectiveness by conducting full-scale verification trials.

IV. CONCLUDING REMARKS

In this study, only link information was used to detect similarity on the grounds that in hypertext environments such as the WWW, links express the most information regarding the characteristics of content. On the other hand, a considerable amount of research is being done in the field of natural language processing on procedures that determine the degree of similarity by analyzing the text content. Reference [8] describes one example of a study in which this procedure is applied to the WWW. However, it has been concluded that this sort of conventional text-based procedure does not function effectively in hypertext environments such as the WWW [5],[6].

Studies of ways to detect mirror sites by focusing on the link structure include references [3] and [4]. However, the aim of those methods is to detect only complete mirror sites, and in these procedures, other information (such as the link connection relationships and information from a DNS) is used besides the calculated degree of similarity corresponding to α in this study. Our procedure is different in that it regards some sites as being capable of being used as substitutes even though they have a low α value, and aims to detect these sites as well. To do this, we broadly divide web sites into hub-type and content-type sites, and the degree of similarity is calculated using methods tailored to each type. By comparing the degrees of similarity thereby obtained, it is possible to automatically judge whether or not web pages can be used as substitutes for each other.

In this paper, we focused on the operation of this similar web site detection method and proposed an effective procedure for finding candidates for similar web sites that match the user’s preferences. This involves storing the connection relationships and label parts of links in sites accessed by the user, and extracting similar site candidates by starting from sites where the noun parts of the labels are the same.

ACKNOWLEDGEMENTS

We thank our executive manager, Dr. Keiichi Koyanagi of NTT Network Innovation Labs, and the researchers of the Computer Networking Principles Research Group.

REFERENCES

[1] Toshio Hirotsu, Satoshi Kurihara, Toshihiro Takada, and Toshiharu Sugawara: ARESAIN - Alternative Resource Access Information Navigator, Thirteenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2001), 2001.
[2] Satoshi Kurihara, Toshio Hirotsu, Toshihiro Takada, and Toshiharu Sugawara: Mirror Site Navigator using Link Information, Proceedings of World Multiconference on Systemics, Cybernetics and Informatics (SCI2000), pp. 283–290, 2000.
[3] Krishna Bharat, Andrei Z. Broder, Jeffrey Dean, and Monika Rauch Henzinger: A Comparison of Techniques to Find Mirrored Hosts on the WWW, Journal of the American Society for Information Science (JASIS), Vol. 51, No. 12, pp. 1114–1122, Nov. 2000.
[4] Narayanan Shivakumar and Hector Garcia-Molina: Finding near-replicas of documents on the web, International Workshop on the World Wide Web and Databases (WebDB ’98), 1998.
[5] O. Zamir and O. Etzioni: Grouper: A Dynamic Clustering Interface to Web Search Results, The Eighth International WWW Conference, 1999.
[6] L. Page, S. Brin, R. Motwani, and T. Winograd: The PageRank Citation Ranking: Bringing Order to the Web, Work in progress. http://google.stanford.edu/~backrub/pageranksub.ps.
[7] http://chasen.aist-nara.ac.jp/
[8] S. Chakrabarti, B. Dom, P. Raghavan, S. Rajagopalan, D. Gibson, and J. Kleinberg: Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text, The Seventh International WWW Conference, pp. 65–74, 1998.
[9] S. Kurihara, S. Aoyagi, R. Onai, and T. Sugawara: Adaptive Selection of Reactive/Deliberate Planning for the Dynamic Environment, Robotics and Autonomous Systems, Vol. 24, No. 3–4, pp. 183–195, 1998.
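As an illustration of the access-history procedure of Section III, the following minimal sketch groups recorded {R, T, N} triplets by destination and noun so that referring sites sharing both become similar-site candidates, and classifies a pair of triplets into Case-I, Case-II, or Case-III. It is a sketch under stated assumptions, not the authors' implementation: the names `Triplet`, `group_referrers`, and `classify` are hypothetical, and the noun part N is assumed to have been extracted already (no morphological analysis is performed here).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (R, T, N): referring site, destination site, noun part of the link label.
Triplet = Tuple[str, str, str]


def group_referrers(history: Iterable[Triplet]) -> Dict[Tuple[str, str], Set[str]]:
    """Collect the referring sites R that share the same destination T and noun N."""
    groups: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for r, t, n in history:
        groups[(t, n)].add(r)
    # Only groups with at least two different referrers yield candidates.
    return {key: refs for key, refs in groups.items() if len(refs) > 1}


def classify(a: Triplet, b: Triplet) -> str:
    """Classify two triplets as in the Fig. 3 investigation."""
    same_t = a[1] == b[1]
    same_n = a[2] == b[2]
    if same_t and same_n:
        return "Case-I"    # noun part and destination both the same
    if same_n:
        return "Case-II"   # only the noun part is the same
    if same_t:
        return "Case-III"  # only the destination is the same
    return "unrelated"


# The triplets recorded for the Fig. 2 example in the text.
history = [
    ("biztech.nikkeibp.co.jp", "www.kodak.co.jp", "Kodak"),
    ("www.watch.impress.co.jp/pc", "www.kodak.co.jp", "Kodak"),
    ("www.watch.impress.co.jp/av", "www.kodak.co.jp", "Kodak"),
    ("www.zdnet.co.jp/news", "www.kodak.co.jp", "Kodak"),
]
candidates = group_referrers(history)
for (t, n), referrers in candidates.items():
    print(t, n, sorted(referrers))
```

Keying the groups on (T, N) rather than on N alone mirrors the observation that triplets sharing only the noun part are less likely to concern related content.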