
Web Navigation Modeling with ad lib Statecharts

Karl R.P.H. Leung (1), Ricky W.M. Tang (2), Lucas C.K. Hui (2)

(1) Department of Computing and Mathematics, Hong Kong Institute of Vocational Education (Tsing Yi), Hong Kong. email: kleung@.org

(2) Department of Computer Science and Information Systems, The University of Hong Kong, Hong Kong. emails: {wmtang2, hui}@csis.hku.hk

Abstract

Web navigation is the act of users moving from web page to web page. In recent years, web sites have tended to contain more pages, and web pages have become more complex because of ad lib contents, such as those that result from search engines. The difficulties include modeling ad lib web navigation and the wide variety of browser navigation facilities the user can employ. Most modeling methods require all components to be defined before use and hence do not support ad lib features. We overcome these difficulties by using ad lib Statechart, an extension of Harel’s Statechart, to model the navigation of web sites that may contain ad lib contents, as well as the navigation caused by browsers. Multiple-window and multi-thread web navigation are also supported by ad lib Statechart. Ad lib Statechart preserves the semantics of Statechart, so Statechart semantics can be used to analyze the navigation of complex and dynamic web sites.

1 Introduction

The exponential growth of the World Wide Web (WWW) and electronic commerce in recent years has given rise to a need to model and analyze the browsing semantics of web sites. Contemporary web sites have more pages and greater complexity because of ad lib content such as that resulting from search engines. Traditional data modeling methods, such as data flow

* This project is partially supported by a research grant from the Research Grant Council of the Hong Kong SAR Government.

diagrams, entity-relationship diagrams, and object-oriented hierarchies, cannot provide the necessary features to model web navigation [2]. The lack of a clear, precise and formal modeling method makes it difficult to document web navigation and to validate the design of linkages between web pages. Maintenance and update of the web site are difficult as well. Hence, a clear, precise and formal modeling method is needed to solve these problems. It can help to identify and eliminate undesirable situations and improve the quality of web content development.

Modeling web navigation may not be as easy as it seems. The WWW can be viewed as a collection of web servers that communicate using the HTTP protocol in request-reply pairs and stream out the requested content written in the Hypertext Markup Language (HTML). HTML creates some navigation features that are not available in hypertext alone. Examples include dynamic content, embedded executable programs and dynamic linkages. The WWW can be used for information processing in addition to the traditional information retrieval role performed by hypertext systems. Web pages and links replace the information fragments and relationships, respectively, of hypertext systems. Users navigate from one web page to another by clicking on a hyperlink. Besides this basic navigation method, various client side scripts and programs, such as JavaScript [24], VBScript [5], Java [6] and ActiveX [1], provide other ways to control navigation between web pages without clicking a link. For the purpose of information processing, dynamic HTML and ActiveX components are also widely employed to change the content and hyperlinks of the currently displayed web page. Concurrent display of web pages is possible through frames and windows. Synchronization between concurrently displayed web pages can be achieved using functions of the client side scripts and control programs.
Previous works on modeling web navigation [23, 21] treat the web as a kind of hyperdocument and can only support modeling of simple and static web pages. Complex, dynamic contemporary web pages, which are the most in need of modeling, cannot be modeled. Furthermore, the web navigation initiated by browsers cannot be modeled either.

A web navigation modeling method needs to have the following features:

Abstraction: Abstraction is almost the only feasible technique for human beings to handle a large volume of information. Just as when reading an atlas, web navigators and designers need different levels of detail of their navigation, such as an overview, at different times.

Support ad lib Features: In web navigation, users are free to go to any web page. Very often, these web pages are not known beforehand. Most modeling methods, like DFD and Petri nets, follow the principle of defined before use. They do not support ad lib features, i.e. they cannot add components to the model during run time. These modeling methods are therefore unable to model web navigation properly.

Consistent Mapping: Web navigation features should be modeled consistently. There must be no ambiguity in interpreting the model. Different readers should interpret the model in the same way.

Comprehensive: All kinds of web navigation should be modeled. Omitting any kind of navigation leads to incompleteness and weakens the usefulness of the model.

Analytical Ability: The modeling method should support formal and systematic analysis of the web navigation model.

Fitness: The modeling method should fit web navigation well. It should not have many features that are left unused when modeling web navigation.

User Friendliness: The web navigation model should be easy to use and easy to understand, even for non-technical readers.

Statechart [10] is a formal model that contains rich features for modeling web navigation. We studied web navigation modeling with Statechart, and some preliminary results have been discussed in [17]. In this paper, we employ ad lib Statechart [16], an extension of Statechart with dynamic features, to solve the more complex problems of modeling web navigation.

In the rest of this paper, related works are discussed in Section 2. A brief overview of Statechart and ad lib Statechart is given in Section 3. Some basic definitions concerning web pages are given in Section 4. Modeling of web navigation by hyperlinks and by browsers is discussed in Section 5 and Section 6, respectively. We then discuss and conclude our work in Section 7.

2 Related Works

There are many research works on modeling hypertext [22, 25]. Most of these works claim that their results can also be applied to web navigation. Unfortunately, these techniques can only be applied to old style static web pages which have no ad lib contents. None of these hypertext modeling techniques is applicable to web navigation with dynamic features.

Web navigation is modelled by web page content together with its intra-linkages in [3, 19]. These models give a general shape of the information space when browsing the web pages in an “overview” mode. They are not comprehensive, since many web navigation features are not included in these studies.

EFSM [14] is an extended finite state machine for modeling temporal synchronization of temporal interactive web content. The model lacks an abstraction facility, so it would blow up to an unmanageable size for complex web sites.

HMData [12] is an object oriented modeling method for storage and retrieval of hypermedia. This modular approach discourages linkages between individual elements of different modules, and is not suitable for modeling extensive spaghetti-like web linkages.

In another object oriented approach, Conallen [4] discussed using the common behavior package of the Unified Modeling Language (UML) [20] to model the business logic in web applications. Many components of UML are not used in the modeling, which makes UML an overkill method for this purpose. Furthermore, some attributes affecting the presentation and browsing semantics of web navigation are intentionally left out. Hence this modeling method can be seen as a complement to our model, supplying information about business logic.

On the same track as us, HMBS [23] (Hypertext Model Based on Statecharts) uses Statechart [7] as a modeling tool. HMBS aims to support behavioral modeling in the design of hypermedia. However, there are no well-defined mappings between hypertext and Statechart; users are allowed to define them on their own. This causes difficulties in obtaining a consistent interpretation of the model. Ad lib navigation features are not supported by HMBS either.

3 Statechart and ad lib Statechart

3.1 Statechart

Statechart [10] was first proposed by Harel in 1987. It is a visual formalism that extends state diagrams for modeling complex systems involving a large number of concurrent states, synchronizations and action triggers. Detailed semantics of statechart can be found in Harel [10, 9].

The primitive notion activity is what all the other components are based on. Activity refers to something which is happening or being done in the system under study. State represents a collection of activities which may or may not be running concurrently. Transition captures the change from one state to another and is presented as an event-condition-action (ECA) rule. A transition occurs only if the specified event takes place while the condition is true; the transition then activates the specified action. The components of the rule are explained next.

Event can be perceived as the preparation activities necessary to determine whether a transition is to take place. It is a compulsory component of an ECA rule.

Condition determines whether a transition is to take place and is optional in an ECA rule.

Action corresponds to the activities to be carried out when a transition is taken and is optional in an ECA rule.

Statechart closely resembles the Mealy machines [13] studied in automata theory. For this reason the statechart definition given below is based on that of Mealy machines. A statechart is defined as the tuple

$$(Q, E, C, A, \delta, \lambda, q_0)$$

where

Q: a finite set of states
E: a finite set of events
C: a finite set of conditions
A: a finite set of actions
δ: the transition function, $\delta : Q \times E \times C \rightarrow Q$
λ: the action function, $\lambda : Q \times E \times C \rightarrow A$
q0: the initial state, $q_0 \in Q$

Taking the example given in Figure 1, A and B are states drawn from Q, α is an event drawn from E, γ is a condition drawn from C, and κ is an action drawn from A. The transition is determined by the transition function δ. In this case, the arguments A, α and γ return state B, i.e. δ(A, α, γ) = B.

[Figure 1: State-Transition. A transition labeled α[γ]/κ leads from state A to state B.]

The action is modelled by the action function λ taking A, α and γ as arguments, i.e. λ(A, α, γ) = κ. The [γ] part is sometimes omitted if there is no condition attached to the transition, and its absence should be interpreted as having the condition true. The /κ part can similarly be omitted to represent no action.

Broadcasting is the mode of communication between states. It is assumed that all the states in a statechart are aware of the occurrence of an event. Whether a transition takes place depends on whether the state needs to respond to the event and whether the state is active at that instant.

Statecharts extend state transition diagrams with hierarchy and concurrency, which only makes state transition diagrams more readable and manageable; there is no change in expressive power [10]. A summary of these extensions can be found in Appendix A.

Problems of undetermined state arise when events and activities are defined as taking zero time, as described in [7, 9, 15]. We take the micro-step approach [15] to solve these problems.

For clarity, state names are shown in bold and event labels in sans serif in this paper.
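As an illustration of the tuple definition, the following minimal Python sketch encodes δ and λ as dictionaries keyed by (state, event, condition). The class name `Statechart`, the method names and the string encoding of conditions and actions are assumptions made for this example only; they are not part of the formalism.

```python
from typing import Callable, Dict, Optional, Tuple

Key = Tuple[str, str, str]  # (state, event, condition name)


class Statechart:
    """Dictionary encoding of delta and lambda (hypothetical, for illustration)."""

    def __init__(self, q0: str) -> None:
        self.current = q0
        self.delta: Dict[Key, str] = {}           # transition function
        self.lam: Dict[Key, Optional[str]] = {}   # action function
        # An omitted condition is read as the condition "true".
        self.conds: Dict[str, Callable[[], bool]] = {"true": lambda: True}

    def add_transition(self, src: str, event: str, dst: str,
                       cond: str = "true", action: Optional[str] = None) -> None:
        self.delta[(src, event, cond)] = dst
        self.lam[(src, event, cond)] = action     # None stands for "no action"

    def fire(self, event: str) -> Tuple[bool, Optional[str]]:
        """Return (transition taken?, action of that transition)."""
        for (src, ev, cond), dst in self.delta.items():
            if src == self.current and ev == event and self.conds[cond]():
                self.current = dst
                return True, self.lam[(src, ev, cond)]
        return False, None


# Figure 1: A --alpha[gamma]/kappa--> B, with gamma assumed to hold.
chart = Statechart("A")
chart.conds["gamma"] = lambda: True
chart.add_transition("A", "alpha", "B", cond="gamma", action="kappa")
```

Calling `chart.fire("alpha")` moves the chart from A to B and reports the action kappa; a second call finds no matching transition from B.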

3.2 ad lib Statechart

Ad lib Statechart [16] is an extension of Harel’s Statechart in which states and transitions are allowed to be added during run time. It also introduces the concept of a Statechart Vector, which supports forking of statecharts during run time and multi-thread processing of the same statechart. Ad lib Statechart follows the principle that it should be possible to interpret the resulting statechart in the same way as Harel’s Statechart.

3.2.1 Adding States and Transitions

States and transitions are allowed to be added to a statechart during run time. When adding states and transitions, the Laissez-faire principle has to be observed: states and transitions are added to a statechart only just before they are going to be activated and triggered, respectively. This principle avoids keeping unnecessary states and transitions. There are also conditions that the states and transitions must satisfy before they can be added to the statechart. These conditions ensure that the resulting statechart is still correct after the states and transitions are added.
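The Laissez-faire principle can be sketched as follows. The class `AdLibStatechart` and its methods are hypothetical names, and the single well-formedness check shown here (no conflicting transition for the same source and event) is an assumed stand-in for the conditions defined in [16], which are not reproduced in this paper.

```python
class AdLibStatechart:
    """Sketch of run-time extension; names and checks are assumptions."""

    def __init__(self, initial):
        self.states = {initial}
        self.transitions = {}   # (source, event) -> target
        self.current = initial

    def add_just_in_time(self, event, target):
        """Add `target` and the transition to it only when about to be used."""
        key = (self.current, event)
        # Assumed condition: the same (source, event) pair must not already
        # lead to a different target.
        if self.transitions.get(key, target) != target:
            raise ValueError("conflicting transition")
        self.states.add(target)
        self.transitions[key] = target

    def fire(self, event, target=None):
        """Trigger `event`; an unknown transition is added first if a target is given."""
        if (self.current, event) not in self.transitions and target is not None:
            self.add_just_in_time(event, target)
        self.current = self.transitions[(self.current, event)]
        return self.current


sc = AdLibStatechart("search.html")
states_before = set(sc.states)            # only the initial state is known
sc.fire("jp(result1.html)", "result1.html")
```

Before the event is triggered the model holds only search.html; the state result1.html and its incoming transition exist only from the moment they are needed.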

3.2.2 Statechart Vector

In order to support processing of multiple statecharts, a Statechart Vector is introduced to statechart. The Statechart Vector is an extensible array of statecharts which is used to keep track of the active statecharts. Similar to the states in orthogonal AND-states, the statecharts of the Statechart Vector run concurrently. Communication between these statecharts is through broadcasting. Unlike the sub-states of an AND-state, statecharts in the Statechart Vector can terminate. Termination of a statechart does not cause termination of the rest of the statecharts in the Statechart Vector.

When a state forks a statechart, the action openWin(sc) is executed in the transition. An instance of the statechart named by the variable sc is instantiated and added to the Statechart Vector. The two instances of statechart then run concurrently. With the Statechart Vector and the action openWin, multiple statecharts and multi-thread statechart processing are supported.
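A minimal sketch of a Statechart Vector is given below. The classes `Chart` and `StatechartVector`, the factory-based templates and the optional initial-state argument of `open_win` are assumptions for illustration; the window names mainWin and adsWin and the pages index.html, next.html and ads.html are taken from the multiple-window example used later in this paper.

```python
class Chart:
    def __init__(self, name, initial, transitions):
        self.name = name
        self.current = initial
        # (state, event) -> (target, action); action is a callable or None
        self.transitions = transitions


class StatechartVector:
    def __init__(self, templates):
        self.templates = templates   # name -> factory returning a fresh Chart
        self.charts = []             # extensible list of active statecharts

    def open_win(self, sc, initial=None):
        """Instantiate statechart `sc` and add the instance to the vector."""
        chart = self.templates[sc]()
        if initial is not None:
            chart.current = initial
        self.charts.append(chart)
        return chart

    def close(self, chart):
        # Terminating one statechart leaves the others running.
        self.charts.remove(chart)

    def broadcast(self, event):
        # Every active chart sees the event; only charts with a matching
        # transition from their current state react. Charts opened by an
        # action during this broadcast do not react to the same event.
        for chart in list(self.charts):
            hit = chart.transitions.get((chart.current, event))
            if hit is None:
                continue
            target, action = hit
            chart.current = target
            if action is not None:
                action(self)


def main_win():
    return Chart("mainWin", "index.html", {
        ("index.html", "jp(next.html)"):
            ("next.html", lambda v: v.open_win("adsWin", "ads.html")),
    })


def ads_win():
    return Chart("adsWin", "ads.html", {})


vector = StatechartVector({"mainWin": main_win, "adsWin": ads_win})
vector.open_win("mainWin")
vector.broadcast("jp(next.html)")
```

After the broadcast the vector holds two concurrently active charts, mainWin in next.html and adsWin in ads.html; closing either one does not affect the other.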

4 Some Basic Definitions

4.1 Web Pages and Hyperlinks

Web pages and hyperlinks are the intrinsics of web navigation. We classify web pages into static and dynamic. A static web page is a web page that returns the same HTML for all client requests of the same URL. It must also contain no reactive or executable components. A dynamic web page is a web page that returns different HTML for client requests of the same URL (server side dynamics), or contains reactive or executable components (client side dynamics).

Navigation in the WWW is done by activating hyperlinks. Hyperlinks are directional links from a source web page to a target web page, which can be the same page. Linking to different sections of the same web page is possible. When a hyperlink is activated, the current web page is replaced by the target web page. The sequence of hyperlink activations is the navigation path. Our study only includes those components of web pages which cause navigation.

4.2 Modeling Scope

A web navigation model should capture only the web pages and hyperlinks that are of interest and necessary. Both the web pages and the hyperlinks can therefore be divided into two sets: the body and the boundary. The set of body web pages contains those web pages which are subjects of study. The set of body hyperlinks contains those hyperlinks which link up web pages in the set of body web pages. The set of boundary web pages contains web pages which are not subjects of study. The set of boundary hyperlinks contains those hyperlinks that link web pages in the set of body web pages to web pages in the set of boundary web pages, and those that link web pages within the set of boundary web pages.

In Figure 2, X and Y are web pages which are subjects of study, linked by t1. Y contains a hyperlink t2 which links to web page Z. X and Y are then web pages in the set of body web pages, and Z is an element of the set of boundary web pages. The set of body hyperlinks contains the hyperlink t1 and the set of boundary hyperlinks contains the hyperlink t2.

[Figure 2: Modeling Scope. Web page X links to Y by t1 and Y links to Z by t2; X and Y lie in the web page body set, Z lies in the web page boundary set.]
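The partition into body and boundary sets can be sketched in Python as below. The function name `classify` and the representation of a hyperlink as a (source, target) pair are assumptions for this example. Links leading from a boundary page back into the body are not covered by the definition above; this sketch simply files them under the boundary set.

```python
def classify(body_pages, links):
    """Split hyperlinks (source, target) into body and boundary sets."""
    body_pages = set(body_pages)
    body_links, boundary_links, boundary_pages = set(), set(), set()
    for source, target in links:
        if source in body_pages and target in body_pages:
            body_links.add((source, target))
        else:
            # Assumption: any link touching a non-body page is a boundary link.
            boundary_links.add((source, target))
            boundary_pages.update({source, target} - body_pages)
    return body_links, boundary_links, boundary_pages


# Figure 2: t1 = (X, Y) and t2 = (Y, Z), with X and Y as subjects of study.
body_links, boundary_links, boundary_pages = classify(
    {"X", "Y"}, [("X", "Y"), ("Y", "Z")]
)
```

For the pages of Figure 2 this places t1 in the body hyperlinks, t2 in the boundary hyperlinks and Z in the boundary web pages.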

5 Modeling Web Navigation by Hyperlinks

A web page is modeled as a state in statechart. The event of clicking or processing which leads to navigation to another web page is modeled as a transition in statechart. Since a hyperlink can lead to a specific section of the target web page, we define an event jp(target, section) in statechart to model a hyperlink. target and section are primitive variables in the statechart which hold the target web page of the hyperlink and the destination section in that web page, respectively. section can hold the special value default, which means that the hyperlink leads to the default section of the target web page. We define jp(target) as shorthand for jp(target, default). In the example in Figure 2, the web pages X, Y and Z are modeled as states. The hyperlinks t1 and t2 are modeled as the events jp(Y) and jp(Z), respectively.
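The jp event and its shorthand can be written down directly. In this sketch the class name `Jp` and the string constant `DEFAULT` are illustrative assumptions; only the event name jp and the value default come from the text.

```python
from typing import NamedTuple

DEFAULT = "default"   # the special section value


class Jp(NamedTuple):
    """The hyperlink event jp(target, section)."""
    target: str
    section: str = DEFAULT   # jp(target) is shorthand for jp(target, default)

    def __str__(self) -> str:
        if self.section == DEFAULT:
            return f"jp({self.target})"
        return f"jp({self.target}, {self.section})"


# Figure 2 as a transition table: (state, event) -> state
transitions = {
    ("X", Jp("Y")): "Y",   # hyperlink t1
    ("Y", Jp("Z")): "Z",   # hyperlink t2
}
```

Because the section defaults to `DEFAULT`, `Jp("Y")` and `Jp("Y", "default")` are the same event and index the same transition.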

5.1 Intra-page Navigation

An intra-page hyperlink is a link that allows jumping to different defined sections of the same page. For instance, a web page productA.html (see footnote 1) has hyperlinks to the sections description, size, price and availability in the same page. Users can jump to these sections by activating their respective hyperlinks. On the other hand, viewing these sections by scrolling up and down in the browser does not involve hyperlinks and hence is out of the scope of our study.

The web page productA.html is modeled as a composite state as in Figure 3. The sub-states description, size, price and availability represent different sections of the web page. A sub-state can contain hyperlinks linking to other sub-states. These hyperlinks can be modeled by jp(target, section). For instance, if the event jp(productA.html, price) in Figure 3 is triggered, the sub-state price is activated. Note that the target in jp(productA.html, price) is productA.html itself, and the section is the destination sub-state of the event.

A select connective can be used to abstract this kind of composite state. Let jps(productA) be the disjunction of the events jp(productA.html, description), jp(productA.html, size), jp(productA.html, price) and jp(productA.html, availability). The select connective selects the corresponding sub-state accordingly.

Footnote 1: The web pages used as examples in the discussion of this paper can be found at

http://www.csis.hku.hk/  swflow/www/.

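[Figure 3: Composite State. The composite state productA.html contains the sub-states description, size, price and availability. The event jps(productA) enters the composite state through a select connective (S); one transition inside the composite state is labeled jp(productA.html, price).]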

[Figure 4: Inter-page Hyperlinks. Panels: (a) productA.html with an outgoing transition jp(next.html); (b) productA.html with an outgoing transition jp(menu.html); (c) menu.html, productA.html and productB.html linked by jp(productA.html), jp(productB.html) and two jp(menu.html) transitions; (d) productA.html and productB.html grouped into a composite state with a single outgoing jp(menu.html); (e) as (d), with jps(products) entering the composite state through a select connective (S).]

5.2 Inter-page Navigation

Inter-page navigation is navigation between web pages. This is a more straightforward kind of mapping from hyperlinks to state transitions. Every hyperlink from one page to another is represented by a transition arrow. The hyperlink activation is the event that triggers the transition.

Intra-page navigation inside a web page is abstracted by a composite state. If all the sub-states of a composite state, e.g. productA.html, have hyperlinks to a web page, a single transition arrow leading out of productA.html is sufficient (Figure 4a). Stubbed transition arrows are used when the hyperlinks are only available from some of the sub-states (Figure 4b).

Some facilities of statechart help to simplify the diagrams. For example, in Figure 4c, there are hyperlinks between the states menu.html, productA.html and productB.html. Web pages whose contents belong to the same category of the application domain, e.g. product in our example, can have their states grouped into a composite state. Some arrows between these states can then be eliminated. In Figure 4d, only one arrow leads out from the product state.

The select connective of statechart can help to further reduce the number of arrows in the diagrams. In Figure 4e, the transition jps(products) is the disjunction of the transitions jp(productA.html) and jp(productB.html). The select connective is used in the product state to select the state for activation according to jps(products).

Note that the three models are equivalent, i.e. Figure 4c, Figure 4d and Figure 4e are equivalent.
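The claimed equivalence of Figures 4c, 4d and 4e can be checked mechanically by expanding the abbreviations. The sketch below is an illustration under assumptions: transitions are written as (source, event, target) triples, and the helper names `flatten` and `expand_select` as well as the transition sets, read off the residue of Figure 4, are ours.

```python
def flatten(transitions, groups):
    """Expand transitions whose source is a composite state to every member."""
    flat = set()
    for source, event, target in transitions:
        for member in groups.get(source, [source]):
            flat.add((member, event, target))
    return flat


def expand_select(transitions, selects):
    """Replace a disjunction event entering a select connective by its disjuncts."""
    flat = set()
    for source, event, target in transitions:
        if event in selects:
            flat.update((source, ev, sub) for ev, sub in selects[event].items())
        else:
            flat.add((source, event, target))
    return flat


MENU, A, B = "menu.html", "productA.html", "productB.html"
JP_MENU, JP_A, JP_B = "jp(menu.html)", "jp(productA.html)", "jp(productB.html)"

fig_4c = {(MENU, JP_A, A), (MENU, JP_B, B), (A, JP_MENU, MENU), (B, JP_MENU, MENU)}
fig_4d = {(MENU, JP_A, A), (MENU, JP_B, B), ("product", JP_MENU, MENU)}
fig_4e = {(MENU, "jps(products)", "product"), ("product", JP_MENU, MENU)}

groups = {"product": [A, B]}                       # composite state members
selects = {"jps(products)": {JP_A: A, JP_B: B}}    # disjuncts of jps(products)

same_d = flatten(fig_4d, groups) == fig_4c
same_e = flatten(expand_select(fig_4e, selects), groups) == fig_4c
```

Both comparisons hold: once the composite state and the select connective are expanded, the diagrams of Figures 4d and 4e yield exactly the four transitions of Figure 4c.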

5.2.1 Frames

Frames inside a browser window provide concurrent viewing of multiple web pages. These web pages can be changed through inter-page or intra-page navigation, and the frames can be synchronized with each other.

A web page with multiple frames is modeled by an orthogonal AND composite state. Each frame is a sub-state of the AND-state. The web pages navigated within each frame are the refinements of the frame. Consequently, each of the sub-states of the AND-state can have further levels of sub-states.

In our example, next.html has two frames displayed concurrently. A menu frame is used for displaying pages of information about product A and product B. A photo frame is used for displaying the photo gallery of the currently displayed product. next.html is modeled as the orthogonal AND of the menu and photo sub-states in Figure 5. Intra-page and inter-page navigation can happen independently in each frame. A default state in a frame indicates its default web page. The sub-states can be simple web pages or other framed pages.

Synchronization between frames is done by the broadcasting mechanism of statechart. For example, when menu.html is active, triggering the event jp(productB.html) navigates to productB.html in menu. This event is broadcast to all states. In reaction to this event, photoB in photo is activated. This is because both general.html and photoA react to the event jp(productB.html): whichever of them is active triggers the transition which leads to photoB. Triggering the event jp(B2.html) when B1.html is active only activates B2.html in photo. Although this event is also broadcast to all states, no other state listens to it.
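The broadcasting behaviour of the two frames can be sketched as two transition tables driven by the same event. The tables below are a simplified, partly assumed reading of the example: the composite states photoA and photoB are flattened to their pages, and B1.html is assumed to be the default page of photoB.

```python
# One transition table per orthogonal region (frame): (state, event) -> state
REGIONS = {
    "menu": {
        ("menu.html", "jp(productA.html)"): "productA.html",
        ("menu.html", "jp(productB.html)"): "productB.html",
        ("productA.html", "jp(menu.html)"): "menu.html",
        ("productB.html", "jp(menu.html)"): "menu.html",
    },
    "photo": {
        ("general.html", "jp(productB.html)"): "B1.html",
        ("A1.html", "jp(productB.html)"): "B1.html",
        ("B1.html", "jp(B2.html)"): "B2.html",
    },
}


def broadcast(config, event):
    """Deliver `event` to every region; regions without a match keep their state."""
    return {
        region: REGIONS[region].get((state, event), state)
        for region, state in config.items()
    }


start = {"menu": "menu.html", "photo": "general.html"}
step1 = broadcast(start, "jp(productB.html)")   # both frames react
step2 = broadcast(step1, "jp(B2.html)")         # only the photo frame reacts
```

The first event changes both frames because both regions listen to it; the second changes only the photo frame, since the menu region has no transition for jp(B2.html).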

[Figure 5: Modeling framed page. The state next.html is the orthogonal AND of the sub-states menu and photo. menu contains menu.html, productA.html and productB.html; photo contains general.html, photoA (with A1.html and A2.html) and photoB (with B1.html and B2.html). Transitions are labeled jp(general.html), jp(menu.html), jp(productA.html), jp(productB.html), jp(A2.html) and jp(B2.html).]

5.2.2 Multiple Windows

Concurrent viewing of web pages in multiple windows is modeled by ad lib Statechart. For example, let the currently active web page be index.html of mainWin. When the hyperlink jp(next.html) is triggered, a new window adsWin pops up. This is modeled in Figure 6. The transition between index.html and next.html is jp(next.html) / openWin(adsWin, ads.html). When this transition is triggered, a new instance of the statechart adsWin is created, and this new instance of adsWin is added to the Statechart Vector. Furthermore, the state in adsWin is initialized to ads.html. Meanwhile, the active state of mainWin becomes next.html.

[Figure 6: Modelling Multiple Windows. In statechart mainWin, the transition jp(next.html) / openWin(adsWin, ads.html) leads from index.html to next.html; statechart adsWin contains the state ads.html.]

5.2.3 Dynamic Content

Dynamic web pages can change their contents and hyperlinks while being viewed. They give rise to difficulties in web navigation modeling because not all web pages are known in advance. Even if they are known, the set of possible web pages can be very large. The set of possible navigation paths through these dynamic pages can also be dynamic. The following sections discuss how these dynamic relationships are modeled.

5.2.3.1 Server Side

The server can create new web pages for the users to navigate. The number of these web pages may be very large. Using a search engine to find information is an example. The search engine may return a large number of web pages to the user in response to a keyword. The user can navigate to these web pages through the hyperlinks.

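[Figure 7: Client Side Dynamic Pages. The state agree.html leads to main.html by the transition jp(main.html)[v="Agree"]. main.html is the orthogonal AND of menu and display. menu contains highlightA and highlightB, entered on mouseOverA and mouseOverB, with transitions mouseClick / display.jp(A) and mouseClick / display.jp(B); display contains A.html and B.html, entered on display.jp(A) and display.jp(B).]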

These ad lib web pages are modeled by ad lib Statechart. Since the number of web pages in the resulting page can be very large and the user may not navigate to any of them, these web pages are not added to the navigation model until they are accessed. That is, when the hyperlink of a web page is clicked, the web page and the hyperlink are added to the set of body web pages and the set of body hyperlinks, respectively. This avoids both the problem of activating an undefined state in the statechart and the problem of keeping unnecessary states and transitions in the statechart model.

Another observable effect of server side programming is the return of web pages which are not known beforehand. These web pages may be generated on the server side during run time. They can be handled easily by ad lib statecharts. These ad lib web pages are added to the set of body web pages right on their arrival at the client side. The transitions leading to these states are added to the set of body hyperlinks as well. All the web pages and transitions are then defined in the corresponding sets.

5.2.3.2 Client Side

Client side dynamic web pages can alter their content and hyperlinks while being viewed, and are commonly implemented by client side scripts and programs. Although these scripts and programs can be modeled by statechart as executable objects [8], we do not take them into account unless they have implications for navigation. Our web navigation model only covers the navigation events generated by executable objects, such as a hyperlink activation event.

In reaction to some events, such as time-outs or mouse clicks, client side scripts and programs can cause navigation between web pages without explicit hyperlinks. These events are modeled as events in the transitions of statechart. Scripts and programs can also impose constraints on hyperlinks. These constraints are naturally modeled as conditions of the transitions. Note that some scripts and programs can impose constraints on hyperlinks of pages other than the hosting page.

Let us illustrate with an example. A web site starts with the page agree.html. Users cannot proceed to the page main.html unless they agree to some conditions by typing the word "Agree" in the agree.html page. In the main.html page, there are two frames. The menu frame contains two buttons A and B. When the cursor is placed on one of these buttons, that button is highlighted. When the button is clicked, the corresponding information page is displayed in the display frame.

This example is modeled in Figure 7. The navigation from agree.html to main.html is modeled by the constrained transition jp(main.html)[v = "Agree"]. mouseOverA and mouseOverB are the events of selecting buttons A and B. The event mouseClick causes the action of creating the event display.jp(A) or display.jp(B), depending on the active state. Depending on which of the events display.jp(A) or display.jp(B) occurs, the corresponding state A.html or B.html is activated in display. These actions in menu show how the Java program controls the navigation in another frame.

[Figure 8: Web Browser Navigation Facilities]
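The agree.html example of Figure 7 can be sketched as a small Python class. The class `Site` and its method names are hypothetical, and the initial states of the menu and display frames (here `None`) are an assumption, since Figure 7 does not survive in enough detail to recover them.

```python
class Site:
    """Sketch of the guarded transition and the cross-frame action of Figure 7."""

    def __init__(self):
        self.page = "agree.html"
        self.menu = None      # active state of the menu frame (assumed initially empty)
        self.display = None   # page shown in the display frame (assumed initially empty)

    def jp_main(self, v):
        # Constrained transition jp(main.html)[v = "Agree"].
        if self.page == "agree.html" and v == "Agree":
            self.page = "main.html"

    def mouse_over(self, button):
        # mouseOverA / mouseOverB activate highlightA / highlightB in menu.
        if self.page == "main.html" and button in ("A", "B"):
            self.menu = "highlight" + button

    def mouse_click(self):
        # mouseClick / display.jp(A) or display.jp(B), depending on the active state.
        if self.menu is not None:
            self.display_jp(self.menu[-1])

    def display_jp(self, button):
        # The generated event activates A.html or B.html in the display frame.
        self.display = button + ".html"


site = Site()
site.jp_main("no")              # condition false: stays on agree.html
page_after_refusal = site.page
site.jp_main("Agree")
site.mouse_over("B")
site.mouse_click()
```

The click itself carries no target; the page shown in the display frame is determined by which highlight state is active in the menu frame, which is how the script in one frame controls navigation in another.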

6 Modelling Web Browser Navigation by Browsers

Web browsers provide additional navigation functions that are beyond the control of the web pages (Figure 8). We have analyzed the functions provided by two popular web browsers, Netscape/Communicator (NS) and Internet Explorer (IE) (see footnote 2). Because different browsers use different terminologies, we put the browser abbreviations after the facility names in our discussion.

The modeling of web browser navigation facilities is applicable to every web page, in addition to the modeling method discussed in the previous section. The scope of modeling also applies here, so irrelevant states and transitions are still not modeled.

Footnote 2: They are chosen for their popularity [18].

6.1 Go To URL

The user uses this facility to go to any web page by directly entering its URL. This also includes going to some arbitrary predefined web pages, including the User Home Page (NS, IE), the listed web pages in Bookmark (NS) / Favorites (IE), and the History Tool (NS, IE).

We can model this Go To URL navigation easily with ad lib Statechart. If the destination web page is in neither the set of body web pages nor the set of boundary web pages, this new web page is added to the set of boundary web pages before it is visited. The hyperlink, which in this case is the URL, is added to the set of boundary hyperlinks. Navigation in this new set of web pages is then modeled as usual.

The purpose of adding the new web pages and hyperlinks to the two boundary sets is to show that navigation can reach web pages and hyperlinks that are not in the scope of study. This information is important for web designers, who should understand the consequences of their design if users navigate to some unexpected web pages, or enter the web pages they design from unexpected origins.
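A minimal sketch of this rule is shown below. The function name `go_to_url` and the representation of the added hyperlink as a (current page, URL) pair are assumptions for illustration; the text only states that the URL is added to the set of boundary hyperlinks.

```python
def go_to_url(url, current, body_pages, boundary_pages, boundary_links):
    """Record an ad lib Go To URL navigation just before the page is visited."""
    if url not in body_pages and url not in boundary_pages:
        boundary_pages.add(url)
        # Assumed representation of the hyperlink: (source page, target URL).
        boundary_links.add((current, url))
    return url   # the page that becomes active


body = {"X", "Y"}
boundary_pages, boundary_links = {"Z"}, {("Y", "Z")}
active = go_to_url("W", "Y", body, boundary_pages, boundary_links)
```

A URL that is already in the body or boundary set leaves both sets unchanged; only a page unknown to the model extends the boundary.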

6.2 Navigation By History

Navigation history (NS, IE) exists for every browser window. The visited web pages are kept in this navigation history. With reference to the navigation history, users can navigate the web pages by going back or forward sequentially in the same order they visited previously. Users can also go to any of the web pages in the history. When navigating in the navigation history, if the user chooses to go to any web page by any method except back and forward, all web pages visited later than the currently displaying page will be deleted from the navigation history, and the new web page which is chosen for view by the user will be added to the navigation history as the latest entry. We use variables and actions of events to model the navigation history.

The navigation history is viewed as a list of hyperlink activation events. Let H be a navigation history of web pages $\omega_1, \ldots, \omega_n$, denoted by $\langle \omega_1, \ldots, \omega_n \rangle$. Let c be the cursor, a variable holding the index in H of the currently active web page $\omega_i$. All items in the list are referred to by their identifiers and positions in the list; $\omega_i^j$ denotes the web page $\omega_i$ at index j.

We first define the following auxiliary functions to support our study.

concatenation: $cat(\langle 1, 2 \rangle, \langle 2, 3 \rangle) = \langle 1, 2, 2, 3 \rangle$. This function concatenates two lists into a single list.

element: $elem(\langle a, c, b, c \rangle) = \{a, b, c\}$. This function returns the elements of a list as a set.

index: $index(\omega_i^j) = j$. This function returns the index component of the item $\omega_i^j$ in a list.

length: $len(\langle a, b, c \rangle) = 3$. This function returns the length, i.e. the number of elements, of a list.

maximum & minimum: $max(3, 4) = 4$, $min(3, 4) = 3$. These two functions return the maximum and minimum of two values, respectively.

item: $item(2, \langle a, b, c \rangle) = b$. This function returns the item at the requested index.

An action HistOp is defined to manipulate the navigation history and the current index. Let H be a navigation history, $\omega_i^j$ a web page with index j in the list H, $\omega_{new}$ the destination web page of a transition, and c the cursor which holds the index of the currently active page. Navigation caused by the Go To function of the browser, by selecting a web page in the navigation history, and by hyperlinks inside a web page can all be modeled by the event jp(target). HistOp handles the three cases defined below differently.

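\[
\mathit{HistOp} =
\begin{cases}
H := \mathit{manip}(H, \omega_{new}, c);\ c := \mathit{index}(\omega_{new}) & \text{if the event is } jp(\omega_{new}) \\
c := \max(c - 1, 1) & \text{if the event is } back \\
c := \min(c + 1, \mathit{len}(H)) & \text{if the event is } forward
\end{cases}
\]

where

\[
\mathit{manip}(H, \omega_{new}, c) = \mathit{cat}(\langle \omega_i^j : \omega_i^j \in \mathit{elem}(H) \wedge \mathit{index}(\omega_i^j) \le c \rangle, \langle \omega_{new} \rangle)
\]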

In the case jp($\omega_{new}$), HistOp first manipulates the navigation history and then updates the current index. In the case of back and forward, only the current index is updated.

This HistOp action is included in the actions of all transitions. The transition then leads to the web page pointed to by the current index, i.e. the active page is $\mathit{item}(c, H)$. Figure 9a illustrates this modeling. To keep the diagrams uncluttered, HistOp is omitted from our diagrams whenever it is not the subject of discussion.
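The behaviour of HistOp can be sketched in Python as follows. This is only an illustration under stated assumptions, not the notation of the paper: the class name `NavigationHistory` and its method names are hypothetical, pages are represented by plain strings, the cursor is 1-based as in the text, and a cursor value of 0 for an empty history is our own convention. Following the description of the navigation history above, jp keeps the entries up to the cursor and appends the new page.

```python
from dataclasses import dataclass, field


@dataclass
class NavigationHistory:
    """History H of one browser window with a 1-based cursor c."""

    pages: list[str] = field(default_factory=list)  # H, oldest entry first
    cursor: int = 0  # c; 0 is an assumed convention for an empty history

    def jp(self, target: str) -> None:
        # manip: keep the entries with index <= c, then append the new page.
        self.pages = self.pages[: self.cursor] + [target]
        # The new page is the latest entry, so c becomes its index.
        self.cursor = len(self.pages)

    def back(self) -> None:
        # c := max(c - 1, 1); an empty history is left untouched.
        if self.pages:
            self.cursor = max(self.cursor - 1, 1)

    def forward(self) -> None:
        # c := min(c + 1, len(H))
        self.cursor = min(self.cursor + 1, len(self.pages))

    def active_page(self) -> str:
        # item(c, H): the page the cursor points to.
        return self.pages[self.cursor - 1]


history = NavigationHistory()
for page in ["A", "B", "C"]:
    history.jp(page)
history.back()
history.back()   # the cursor is now on "A"
history.jp("D")  # "B" and "C" are dropped, "D" becomes the latest entry
```

After the last call the history holds only "A" and "D", which mirrors the rule that pages visited later than the current page are deleted when the user jumps to a new page.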

6.3 Navigation Constraints

Security Constraints and Content Constraints restrict which web pages the browser will load, according to the security and content requirements of its user.

6.3.1 Security Constraints (NS, IE)

Users can select types of scripts and programs that are not allowed to execute arbitrarily. The web page hosting these banned scripts and programs is still displayed, but without any of the banned components. This type of constraint prevents the web page from containing undesired content.

Security Constraints can be modeled by providing alternate states for the different banned contents, together with the condition connective. Each alternate state has only the sub-states allowed by the security constraint it addresses. The condition connective selects the alternate state to transit to. For example, in Figure 9(b), the event jp(A) leads to state A, and the condition connective selects A.html or A'.html depending on whether JAVA is restricted. A.html and A'.html are alternative versions; their only difference is that A'.html does not contain the restricted JAVA content. We model security constraints with alternate states rather than with constraints on transitions because this approach makes the diagrams more elegant.
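The selection made by the condition connective can be illustrated with a short Python sketch. The table `VARIANTS` and the function `condition_connective` are hypothetical names introduced only for this illustration, and representing the banned component types as a set of strings is an assumption.

```python
# Hypothetical table: every alternate version of a page together with the
# component types it embeds. A'.html is A.html without the JAVA content.
VARIANTS = {
    "A": [("A.html", {"JAVA"}), ("A'.html", set())],
}


def condition_connective(state: str, banned: set[str]) -> str:
    """Return the first alternate version that embeds no banned component."""
    for name, components in VARIANTS[state]:
        if not (components & banned):
            return name
    raise LookupError(f"no version of {state} satisfies the constraint")


unrestricted = condition_connective("A", set())    # selects A.html
restricted = condition_connective("A", {"JAVA"})   # selects A'.html
```

The event jp(A) itself is unchanged in both cases; only the alternate state entered inside A differs.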

[Figure 9: Modeling Web Browser Facilities. Panels: (a) Navigation History, (b) Security Constraint, (c) Content Constraint, (d) Reload / Refresh, (e) New Window.]

6.3.2 Content Constraints (IE)

Web pages can be rated by their content, such as sex, nudity, violence and language. The browser can prevent web pages with inappropriate content from being viewed by applying a set of user-defined rules. Web pages that violate these rules are not displayed at all, and the browser stays at the page from which the hyperlink was activated.

This type of constraint can be naturally modeled as constraints on navigation events. Variables describing the content need to be defined for every web page state, so that they can be compared in the constraint. For instance, in Figure 9c, the constraint violence = false is applied to the transition. B can be accessed if and only if the condition violence = false is satisfied.
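A guarded transition of this kind can be sketched in Python as follows. The rating table `RATINGS`, the page names and the function `guarded_jp` are assumptions made for the illustration; the sketch only shows that a failed guard leaves the browser on the current page.

```python
# Hypothetical content variables attached to each web page state.
RATINGS = {
    "A": {"violence": False},
    "B": {"violence": True},
    "C": {"violence": False},
}


def guarded_jp(current: str, target: str, rules: dict[str, bool]) -> str:
    """Take the transition to target only if its rating satisfies every rule."""
    rating = RATINGS[target]
    if all(rating.get(key) == required for key, required in rules.items()):
        return target
    return current  # guard failed: the browser stays on the current page


rules = {"violence": False}            # the guard [violence = false]
blocked = guarded_jp("A", "B", rules)  # B violates the rule
allowed = guarded_jp("A", "C", rules)  # C satisfies the rule
```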

6.4 Reload / Refresh

Reload (NS) / Refresh (IE) of a web page causes the web page to be visited again. This is modeled by a self-target transition. For example, in Figure 9d, the transition jp(A) is a self-target transition from A leading to A. Note that if the web page is a server-side dynamic page that returns different web pages at different times, the web page returned by Reload / Refresh may differ from the original one. The modeling of server-side dynamic pages is discussed in Section 5.2.3 and is not repeated here.

6.5 New Window

Users can manually create concurrent views of web pages by opening new windows (NS, IE). They can open a new window and then use it to go to a web page. Users can also select any hyperlink from the history of the current window and open the target web page in a new window.

As described in Section 5.2.2, multiple windows are modeled by ad lib Statecharts. The event newWindow denotes the creation of a new window by the browser. Creating a new window does not cause any change in the old window. Usually, the new window starts with the home web page defined by the user. This is illustrated in Figure 9e. The event newWindow is a self-target transition, and openWin(home) is in the action part of this transition. When a new window is created by the browser, i.e. the event newWindow is triggered, an instance of the statechart new is created and added to the Statechart Vector. The statecharts new and old then run concurrently. If the user selects a hyperlink from the history of the current window, the target web page is used in openWin instead of home.
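As a rough illustration of this mechanism, the following Python sketch treats the Statechart Vector as a plain list of window instances. The names `WindowChart` and `open_win` are hypothetical, and a real ad lib Statechart instance would carry a full statechart rather than a single page name.

```python
from dataclasses import dataclass


@dataclass
class WindowChart:
    """Stand-in for one statechart instance: a window showing one page."""

    name: str
    page: str


def open_win(vector: list[WindowChart], name: str, page: str = "home") -> WindowChart:
    """Action of the newWindow event: add a new instance to the vector."""
    chart = WindowChart(name, page)
    vector.append(chart)  # the existing instances are left unchanged
    return chart


statechart_vector = [WindowChart("old", "A")]
open_win(statechart_vector, "new")  # the new window starts at the home page
```

Passing a page taken from the history as the third argument corresponds to opening a hyperlink from the history in a new window.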

7 Discussion

We identified the requirements for a web navigation modeling tool in Section 1. Our web navigation modeling by ad lib Statechart satisfies these requirements.

Abstraction. Facilities for abstraction are inherited from statechart. For example, in studying inter-page navigation (Section 5.2), details of intra-page navigation can be hidden, so that each web page appears as a single state. When intra-page navigation is to be studied, the details can be shown again, as described in Section 5.1. Composite states can be abstracted at any level to provide a clear picture with an arbitrary amount of detail. Different parts of the model can also be shown at different levels of detail at the same time.

Support ad lib Features. We use ad lib Statechart to model the ad lib features of web navigation. Most modeling methods do not provide this.

Consistent Mapping. Each web navigation method discussed is mapped consistently to statechart constructs. Consequently, different readers will interpret the same model in the same way.

Comprehensive. Web navigation methods, including those general to the web and browser-specific ones, are identified and modeled. Instances of navigation by hyperlinks, frames, windows, server dynamic content, client side scripts and programs, and various web browser features are captured. Furthermore, multiple windows and multi-thread navigation are also modeled by means of ad lib Statechart. Thus our model can provide a complete picture of all kinds of navigation of the web.

Analytical Ability. The web navigation we modeled is represented in the statechart visual formalism. Further analysis can be carried out on the captured information using statechart semantics and tools [11]. Examples include finding unreachable web pages or undesirable alternate navigation paths between web pages.

Fitness. Most of the features of ad lib Statechart are used in the modeling of web navigation. Conversely, all the features of web navigation can be modeled by ad lib Statechart. Hence ad lib Statechart fits the modeling of web navigation well.

User Friendliness. The symbols of ad lib Statechart are very simple. Rounded rectangles represent web pages and arrows represent hyperlinks. Furthermore, the operational concepts are simple and easy to understand, even for non-computing users.

By fulfilling the above requirements, our web navigation model by statechart makes it possible to document a web site precisely. The design of a web site can now be validated by analysis of the model. Maintenance and updates to the web site can be tested for correctness, and errors can be corrected before implementation.
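As a simple illustration of the kind of analysis mentioned under analytical ability, the sketch below finds unreachable web pages by a breadth-first search. It is a deliberate simplification: the model is flattened to a plain graph of page names, the function name `unreachable_pages` and the example site are hypothetical, and statechart features such as hierarchy, concurrency and guards are ignored.

```python
from collections import deque


def unreachable_pages(transitions: dict[str, list[str]], start: str) -> set[str]:
    """Return the pages that cannot be reached from start by any path."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in transitions.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # Every page that occurs as a source or as a target of a transition.
    all_pages = set(transitions) | {t for ts in transitions.values() for t in ts}
    return all_pages - seen


# Hypothetical site: "orphan" links into the site, but nothing links to it.
site = {"home": ["A"], "A": ["B", "home"], "B": [], "orphan": ["A"]}
missing = unreachable_pages(site, "home")
```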

References

[1] Microsoft Visual Basic 5.0: ActiveX Controls Reference (Visual Basic 5.0 Documentation Library). Microsoft Press, 1997.
[2] M. Bieber and T. Isakowitz. Designing hypermedia applications. Communications of the ACM, 38(8):26–29, August 1995.
[3] C. Chen. Structuring and visualising the WWW by generalised similarity analysis. In Proceedings of the 8th ACM Conference on Hypertext (Hypertext’97), pages 177–186, Southampton, UK, 1997.
[4] Jim Conallen. Modeling web application architectures with UML. Communications of the ACM, 42(10):63–70, 1999.
[5] Adrian Kingsley-Hughes et al. VBScript Programmer’s Reference. Wrox Press Inc, 1999.
[6] Lisa Friendly, editor. The Java Language Specification, Second Edition. Addison-Wesley, 2000.
[7] D. Harel. Statecharts: a visual formalism for computer system. Science of Computer Programming, 8(3):231–274, 1987.
[8] D. Harel and E. Gery. Executable object modeling with statecharts. In Proceedings of the 18th Int. Conf. Soft. Eng., pages 246–257. IEEE Press, March 1996.
[9] D. Harel, A. Pnueli, J.P. Schmidt, and R. Sherman. On the formal semantics of statecharts. In Proceedings of the 2nd IEEE Symp. on Logic in Computer Science, pages 54–64. IEEE Press, 1987.
[10] David Harel. Statecharts: A visual formalism for complex systems. Science of Computer Programming, 8:231–274, 1987.
[11] David Harel and Michal Politi. Modeling Reactive Systems with Statecharts: the Statemate Approach. McGraw-Hill, New York, 1998. ISBN 0-07-026205-5.
[12] Denis Helic, Hermann Maurer, and Nick Scherbakov. Introducing hypermedia composites to WWW. Journal of Network and Computer Applications, 22(1):19–32, 1999.
[13] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, 1979.
[14] Chung-Ming Huang and Ming-Yuhe Jang. Interactive temporal behaviours and modelling for multimedia presentations in the WWW environment. The Computer Journal, 42(2):112–128, 1999.
[15] C. Huizing. Semantics of Reactive Systems: Comparison and Full Abstraction. PhD thesis, Eindhoven University of Technology, March 1991. Chap. 1.
[16] Karl R.P.H. Leung. ad lib Statechart. 2000. Manuscript.
[17] Karl R.P.H. Leung, Ricky W.M. Tang, Lucas C.K. Hui, and S.M. Yiu. Modeling Web Navigation by Statechart. In Proc. of the 24th Computer Application and Systems Conference (COMPSAC 2000), Taipei, Taiwan, 2000. IEEE Computer Society Press.
[18] C. Kehoe, J. Pitkow, K. Sutton, G. Aggarwal, and J. Rogers. Results of GVU’s tenth world wide web user survey. Technical report, Graphics Visualization and Usability Center, College of Computing, Georgia Institute of Technology, Atlanta, GA, USA, May 1999. http://www.cc.gatech.edu/gvu/user_surveys/survey-1998-10/tenthreport.html.
[19] S. Mukherjea and Y. Hara. Focus+context views of world-wide web nodes. In Proceedings of the 8th ACM Conference on Hypertext (Hypertext’97), pages 187–196, Southampton, UK, 1997.
[20] OMG. OMG Unified Modeling Language Specification, 1.3 edition, June 1999.
[21] F. B. Paulo, P. C. Masiero, and M. C. F. Oliveira. Hypercharts: Extended statecharts to support hypermedia specification. IEEE Transactions on Software Engineering, 25(1):33–49, January/February 1999.
[22] P. Stotts and R. Furuta. Petri-net-based hypertext: Document structure with browsing semantics. ACM Trans. on Information Systems, 7(1):3–29, 1989.
[23] M. Turine, M. Oliveira, and P. Masiero. A navigation-oriented hypertext model based on statecharts. In Proceedings of the 8th ACM Conference on Hypertext (Hypertext’97), pages 102–111, 1997.
[24] Janice Winsor and Brian Freeman. Jumping Javascript (Sunsoft Press Java Series). Prentice Hall Computer Books, 1997.
[25] Yi Zheng and Man-Chi Pong. Using statecharts to model hypertext. In Proceedings of the ACM Conference on Hypertext, pages 242–250, 1992.

Appendix

A Some Statechart Features

States can be grouped together by XOR for efficient use of transition arrows. Figure 10 shows the savings from the expanded form (a) to the more concise form (b). The system is exclusively in either state A or state B.

[Figure 10: XOR grouping of states. Panels (a) and (b).]

Concurrent states, like independent modules in a system, can be modeled by the orthogonal AND of sub-states that exist concurrently (Figure 11). In the figure, A1, A2 and A3 are sub-states of the composite state A, and each maintains its own state.

[Figure 11: AND grouping of states.]

Abstraction of a statechart can be done by hiding details of sub-states. Figure 12a is abstracted to Figure 12b by hiding the details of the XOR sub-states. The stubbed transition arrow of e2 indicates that the transition is available from only some of the sub-states of A.

[Figure 12: Abstraction of states. Panels (a) and (b).]

The select connective can be used to reduce the number of transition arrows. Event e in Figure 13b is defined as the disjunction of events e1 and e2 in Figure 13a. e replaces e1 and e2 by pointing to the select connective, meaning that A transits to B1 or B2 depending on whether e1 or e2 occurred.

[Figure 13: Select connective. Panels (a) to (d).]

Events can affect states globally. This global effect of every occurring event is described as broadcasting of events to all parts of the system. All activities and events are considered instantaneous (zero time), so synchronization can be supported by the triggering of some limiting events on all concurrent partitions.
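The select connective described above can be sketched in Python as a single dispatch table. The names `SELECT` and `fire` are hypothetical, and the sketch covers only the situation of Figure 13a and 13b, where event e stands for the disjunction of e1 and e2.

```python
# Hypothetical select table: which target the connective picks for each
# concrete event covered by the disjunction e = e1 or e2.
SELECT = {"e1": "B1", "e2": "B2"}


def fire(state: str, event: str) -> str:
    """Return the next state of the chart after the given event."""
    if state == "A" and event in SELECT:  # event e occurred
        return SELECT[event]              # the connective chooses B1 or B2
    return state                          # any other event is ignored


after_e1 = fire("A", "e1")
after_e2 = fire("A", "e2")
```

A single arrow labelled e leaves A in the diagram, while the table records the choice that the connective makes.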
