Heuristics for the Collection and Use of Web Data and the Design of Web Interactivity


Guidelines: Web Data Collection for Understanding and Interacting with Your Users

Judith Ramey, University of Washington

SUMMARY

The global growth of the World Wide Web challenges technical communicators to reconsider the methods we use to create designs that meet the goals and needs of our users. This article focuses on taking advantage of the Web’s potential for interactivity between designers and users. It offers strategies for getting data from users of Web sites and using it for two main purposes: (1) analyzing audience and patterns of use to support continuous redesign, and (2) building a relationship or sense of community on a Web site.

WEB DATA COLLECTION: AN INTRODUCTION

From the earliest days of technical communication, audience analysis has been regarded as essential to ensuring that an information design will meet the needs of its intended users or readers. Historically, audience analysis has been completed early in the design process, so as to guide basic design choices, and has typically yielded a document enumerating audience characteristics and the design strategies selected to respond to them. This document is then kept on hand to be consulted during the rest of the design process.

But in designing for the World Wide Web, the pace has become so accelerated, the audiences so diverse, and audience needs so mutable that we need to reconsider how we do audience analysis. And in fact, the medium itself offers us new answers. When designing for the World Wide Web, technical communicators can be in direct contact with their users/readers on an ongoing basis. Thus they have the opportunity to monitor their audiences continuously and adjust their designs based on what they learn. Further, they have the opportunity to go beyond traditional audience analysis to actually engage their users/readers in direct communication of various kinds, thus creating new kinds of immediate, interactive relationships with them.

These guidelines aim to guide technical communicators in the process of understanding the power of direct, ongoing contact with users/readers and the possibilities for design that can result from it. This area is new enough that much of the background information appears online rather than in print. Also, most of it focuses on marketing or technical issues rather than rhetorical issues like audience analysis. In the following discussion the references to supporting literature often take the form of a uniform resource locator (URL), although where printed sources exist they are cited, and more no doubt will have appeared by the time of publication.

The design of these guidelines

These guidelines focus on strategies for getting data from users of existing informational Web sites and using it for two main purposes: (1) analyzing audience and patterns of use to support continuous redesign, and (2) building a relationship or sense of community, either between you and your users or among the users themselves.

Although both of these purposes require you as designer to consider real user behavior, the designer task is quite different in the two cases. The first case is the more elementary and the closer to the goals and practices of traditional audience analysis. It calls for the analysis of basic data available on any Web site, and of questionnaire responses, email, and other direct user input, to draw conclusions about audience needs and patterns of use.

Meeting the second purpose goes beyond simple data analysis. It requires creative rhetorical exploitation of the direct link to your users/readers and transforms data exchange with users/readers into something much richer (Amkreutz 2000). But as you work with even simple data from users, they become increasingly vivid and present to you, and you can begin to see how much more you might do with the link you have to them. The interactive nature of the Web in fact gives you the power to radically re-imagine your relationship with your user or reader. Thus these guidelines first offer (in Part One) a primer on Web-based audience analysis to support continuous redesign, and then (in Part Two) go on to treat the more sophisticated relationships with users that you can create.

Technical requirements for using these guidelines

Most of the guidelines in this set require that you work with the system administrator responsible for your Web server (to extract the necessary data) and possibly with a programmer (to implement data capture techniques). In most cases the guidelines require you to use software that collects, manipulates, and/or graphically displays specific kinds of data from the server log file (the record of activity on a site). Especially at the beginning, while you are getting your tools and processes in place, these data analyses can be very time-intensive. Thus your organization must endorse this user-centered approach to Web site design and make a significant commitment of resources to implement it.

PART ONE: ANALYZING AUDIENCE AND PATTERNS OF USE TO SUPPORT CONTINUOUS REDESIGN

You can use two main sources of data to analyze your audiences and their patterns of use: data from server logs and data collected directly from your site visitors, for instance answers to a questionnaire that you have posted on your site. You can use what you learn to continually improve the fit between your site, its goals and purposes, and its users.

Important points to remember

Analysis of Web data does not substitute for doing initial audience analysis. Web site design (like the design of other forms of communication) begins with a careful analysis of the intended or expected audience(s), purposes, and uses. After the initial release, however, Web statistics and Web survey data can provide the designer with a dynamically emerging picture of site visitors and their patterns of use.

Web data from logs must be used cautiously. Server log data reports on machines and transactions (individual requests for files), not people and sessions (see below for more details).

Web data from user informants must also be used cautiously. User informants (respondents to online questionnaires, visitors sending email, and other voluntary providers of data) are self-selected and possibly not representative of your broader audience(s).

Guidelines for analyzing audience and patterns of use by means of server log data

The first three guidelines focus on the use of server log data for audience analysis and analysis of patterns of use. These guidelines primarily support the detection and diagnosis of problems. For those who are new to the idea of server log data, the next section provides an introductory overview, followed by a brief list of products for analyzing Web statistics.

Server log data: an introductory overview

In using the guidelines having to do with using server log data, it is important to remember the problems and limitations of the data. First, the Web is “sessionless;” each transaction (file request) is reported separately. When a user types in a url for your Web site, or clicks on a link, each request for the file or files associated with the link is treated as a single event not associated with any other requests issued for files on the site. That is, requests for files are tracked, not users.

Second, the log recognizes visits by specific machine addresses, not specific people. The number of visitors reported is affected (increased or decreased) by the use of dynamic Internet Protocol (IP) addresses, proxy servers, and caching. In the case of dynamic IP addresses, a single user at a single machine might actually be using more than one IP address. For instance, in a lab with numerous workstations, the lab manager might figure that not everybody will want to be on the Web at the same time, and thus might set up a small pool of IP addresses to be assigned as needed. A user’s machine might thus be assigned a different IP address for each Web transaction. Alternatively, in the case of proxy servers, all internet traffic in an organization might be channeled through a server used as a “stand-in” IP address (often the case with “firewalls” and other company security arrangements). In this case, all the hits to your Web site from all the people in that organization would show the same IP address. Also, when your user requests a file from your server, it is transmitted to your user’s computer and typically is stored in a cache, a temporary storage file. If the user returns to that file (page), his or her computer might retrieve it from the local cache rather than from your site, in which case your server log file would not record the transaction.

Third, the numbers of “hits” generally report the number of files requested, which may or may not correspond to pages (a “page” on your site might for instance contain several graphic images stored in separate files, each of which is logged separately).

Thus, reaching conclusions about your users and what they are doing requires you to make complicated logical links and inferences that can take you far from the actual data at hand. The greater the distance between the actual data and the conclusions that you draw from them, the less reliable and certain your conclusions are and the more caution you need to exercise in making design decisions based on them.

Keeping in mind these challenges to interpreting server log data, let’s look in more detail at the data reported on a server log. Log files follow one of two formats: common log file (CLF) format and extended log file (ELF) format.

The common log file (CLF) format basically records the date and time of the transaction, the IP address of the remote host, the file that was requested, the size of the file in bytes, and the status of the request (e.g. “404, file not found”). Table 1 shows a small sample of server log data from a University of Washington informational Web site about arthritis (Macklin, Turns, and Shelton 1999). Each row in the table corresponds to a hit. The first two columns indicate the date and time of the hit; the third column contains the IP address of the requester. The last column indicates the resource that was requested. (In this analysis they did not track the number of bytes transferred.)

Table 1: Sample common log file format

1/1/96  0:09:30  dial18.chemek.cc.or.us.  :bonejoint:kkakkkkk2_1.html
1/1/96  0:09:32  dial18.chemek.cc.or.us.  :bonejoint:gif:Clip.GIF
1/1/96  0:10:03  dial18.chemek.cc.or.us.  :bonejoint:mov:ScopeACLTear.mov
1/1/96  0:10:47  dial18.chemek.cc.or.us.  :bonejoint:mov:ACLgraft.mov
1/1/96  0:10:56  dial18.chemek.cc.or.us.  :bonejoint:Arthritis.idx.html
1/1/96  0:13:01  pm5-00.magicnet.net      :bonejoint:nzzzzzzz1_2.html
1/1/96  0:15:00  pm5-00.magicnet.net.     :bonejoint:xzzzzyzz1_1.html

The size of the actual log file is suggested by the very small amount of time that elapsed between hits on this site; imagine the possible size of a file covering for instance a full day. Note that these seven entries report transactions with only two different IP addresses; given the small amounts of time involved, you might decide to interpret these as two visitors. Also note that the requests are for different kinds of files: formatted pages (“html”), graphics (“GIF”), and animations (“mov”).
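To make the mechanics concrete, here is a minimal Python sketch (not part of the original study) for turning entries in the whitespace-separated layout of Table 1 into date/time, host, and resource fields. The field layout and the two sample lines come from the table; the function and variable names are illustrative, and a production tool would of course handle malformed lines and other log formats.

```python
from datetime import datetime

def parse_log_line(line):
    """Split one Table 1-style entry into timestamp, remote host, and requested resource."""
    date, time, host, resource = line.split()
    timestamp = datetime.strptime(f"{date} {time}", "%m/%d/%y %H:%M:%S")
    return {"timestamp": timestamp, "host": host.rstrip("."), "resource": resource}

sample_lines = [
    "1/1/96 0:09:30 dial18.chemek.cc.or.us. :bonejoint:kkakkkkk2_1.html",
    "1/1/96 0:13:01 pm5-00.magicnet.net :bonejoint:nzzzzzzz1_2.html",
]

hits = [parse_log_line(line) for line in sample_lines]
```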

By analyzing these log entries, you can examine the pattern and consistency of use over time (monthly or daily statistics, day of week statistics, or even hourly statistics), the origins of hits to your site, the resources most often consulted (say, the top five files by number of hits or the top five most frequently requested “404” files), number of apparent repeat visitors, etc.
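As a rough illustration of the kinds of tallies involved, the following sketch continues the parsing example above (it assumes the `hits` list built there); a dedicated log-analysis tool would offer far more, but the underlying counting is no more mysterious than this.

```python
from collections import Counter

def summarize_hits(hits):
    """Tally hits by hour of day and list the most frequently requested resources."""
    hits_per_hour = Counter(hit["timestamp"].hour for hit in hits)
    top_resources = Counter(hit["resource"] for hit in hits).most_common(5)
    return hits_per_hour, top_resources

# Usage, with `hits` parsed as in the earlier sketch:
#   hits_per_hour, top_resources = summarize_hits(hits)
```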

The extended log file (ELF) format includes other data points about visits and visitors, such as the visitor’s browser and platform and the Web site that the visitor is coming from, called the “referring page” (plus the search term the visitor used, if it was entered from an engine or directory). The ELF format can include “cookie” information as well, if available. A “cookie” is a small file that records a visiting computer’s activity on a site. When you visit a site (and you have not set a preference in your browser to refuse “cookies”), the site server can transmit a cookie file that records the files that you have requested. The cookie is placed in a folder on your computer; if you visit the site again, the server requests the cookie file that it sent to your machine before and updates it with data about your current visit. In this way the server can build up a historical record of your computer’s actions on the site. (There are privacy concerns with the use of cookies that will be discussed in more detail later.)

Neither the CLF nor ELF format records the search terms that visitors use within your site using your internal search engine. To get that data, you need to have your system administrator set up your server so that the search engine reports all search terms to a file that you can then analyze.

Several case studies of groups that have used server log data for analysis of audience and patterns of use have appeared in print and on the Web (to mention only a few, see Drott 1998; Nielsen 1999; Sullivan 1998; Yu et al. 1999; and Kantner 2000).

Software products for analyzing Web statistics

There are a number of products on the market that help you manipulate and visualize Web server log data. These products differ in the features that they offer; if your organization needs to acquire a product to implement analysis of Web server log data, work with your technical staff to understand your requirements so as to choose a tool that is right for your situation. These tools change so rapidly that it is not possible to summarize their features accurately. Here is a brief list of some tools (Macklin 2000); no endorsement of any product is implied:

- Free Ware Stats Analysis, http://awsd.com/scripts/weblog/index.html
- Bazaar Analyzer, http://www.bazaarsuite.com/
- FastStats, http://www.mach5.com/fast/
- FunnelWeb, http://www.activeconcepts.com/
- Gwstat, http://www.ccs.cs.umass.edu/stats/gwstat/html/
- Summary, http://www.summary.net/
- Webalizer, http://www.webalizer.org/
- WebTrends, http://www.webtrends.com/products/log/def
- wwwstat, http://www.ics.uci.edu/pub/websoft/wwwstat/

For more information about currently available log analysis tools, see http://www.uu.se/Software/Analyzers/Access-analyzers.html.

There are a number of resources, both print and online, that you can consult to learn more about server log data and how it can be used (to mention only a few, see Buchanan and Lukaszewski 1997; Stout 1997; Aviram 1998; Burke 1997; Goldberg 1999; Linder 1999; Marketwave.com 1999; and Stehle 1999).

Armed with this understanding of the nature and limitations of Web statistics, we can now turn to the three guidelines for analyzing audience and patterns of use based on them.

1. USING SERVER LOG DATA TO MONITOR YOUR AUDIENCE DEMOGRAPHICS

Use server log data to monitor your audience demographics, keeping in mind that drawing conclusions about your audience demographics requires interpretation of the data.

1.1 Analyze the IP addresses, translated into domain names or countries of origin, for computers sending requests to your Web site server.

Determine what percentage of visits come from each of the various domains (indicated by the extensions at the end of the names: .com for a business, .edu for an educational institution, .gov for a government agency, etc.).

Determine what percentage of visits come from each country (.nl for The Netherlands, .jp for Japan, etc.), and thus get a view of the international composition of your audience.

Compare to the initial assumptions that you made in your audience analysis. If you discover a difference, consider how much and in what way your actual audience differs from the audience you expected to get. If there is a difference, do you still want or need to reach the audience that you originally targeted? If so, consider what you can do to raise their awareness of your site or increase your site’s attractiveness to them. Or is your current actual audience acceptable and productive for you? If so, identify any design changes to your site that are required by their characteristics.
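One simple way to approximate this breakdown, assuming the log’s IP addresses have already been resolved to hostnames (raw numeric addresses would first need a reverse DNS lookup), is to tally the final label of each hostname. The sketch below is illustrative only and reuses the `hits` list from the earlier parsing example.

```python
from collections import Counter

def domain_breakdown(hits):
    """Percentage of hits per top-level domain (.com, .edu, .nl, ...), from resolved hostnames."""
    tlds = [hit["host"].split(".")[-1].lower() for hit in hits if "." in hit["host"]]
    counts = Counter(tlds)
    total = sum(counts.values()) or 1
    return {tld: round(100 * n / total, 1) for tld, n in counts.most_common()}

# For the full Table 1 sample this would return roughly {'us': 71.4, 'net': 28.6}.
```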

1.2 Analyze the browsers or platforms being used by computers sending requests to your Web site server.

Determine the technical composition and level of sophistication of your audience. Compare to your initial assumptions.

If this is not the audience that you need to reach, consider how to make your site more visible and attractive to the audience you want. Or, if your current actual audience is acceptable even if unexpected, identify any design changes to your site that are required by the actual browsers and platforms that they are using.

1.3 Analyze the number of unique IP addresses that visited your site and the number of visits each made.

By (cautiously) assuming that each IP address is a single user, you can determine where your audience falls on a continuum from heavy users to one-time visitors. Compare to your initial assumptions.

Identify any design changes to your site that are called for by the pattern of visits of your current actual audience. For instance, if you get mostly one-time visitors, do you clearly announce the audience, purpose, and use of your site on each page?
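A crude way to place visitors on that continuum, again cautiously treating each address as one user, is to count hits per host, as in the illustrative sketch below (function names invented, `hits` as in the earlier parsing example). A better estimate would first group hits into sessions, as sketched under Guideline 2 below.

```python
from collections import Counter

def visitor_frequency(hits):
    """Hits per host, cautiously read as visits per user: how many hosts appear once vs. repeatedly."""
    per_host = Counter(hit["host"] for hit in hits)
    one_time = sum(1 for count in per_host.values() if count == 1)
    return {"unique_hosts": len(per_host),
            "one_time": one_time,
            "repeat": len(per_host) - one_time,
            "heaviest": per_host.most_common(5)}
```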

2. USING SERVER LOG DATA TO GET A GROSS VIEW OF PATTERNS OF USE ON YOUR SITE

Use server log data to monitor the patterns of use on your site. (Drawing conclusions about patterns of use requires extensive interpretation because the actions of a visitor that together make up that visitor's session on your site are each reported as a separate transaction. Thus you can reason that two or more requests within a very short time by the same IP address constitute a sequence of requests by a single user, but in fact you can't be sure.)
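One common heuristic, shown in the hedged sketch below, is to treat consecutive requests from the same address as one session whenever the gap between them stays under some threshold (30 minutes is a conventional but arbitrary choice). The inference can of course be wrong for shared or dynamic addresses; the result is an approximation, not certain knowledge.

```python
from datetime import timedelta
from itertools import groupby

def infer_sessions(hits, gap_minutes=30):
    """Group hits from the same host into inferred sessions.

    A new session starts whenever the gap since that host's previous hit
    exceeds `gap_minutes`. This is an inference, not a record of real sessions.
    """
    ordered = sorted(hits, key=lambda h: (h["host"], h["timestamp"]))
    sessions = []
    for _host, group in groupby(ordered, key=lambda h: h["host"]):
        current, previous = [], None
        for hit in group:
            if previous is not None and hit["timestamp"] - previous > timedelta(minutes=gap_minutes):
                sessions.append(current)
                current = []
            current.append(hit)
            previous = hit["timestamp"]
        if current:
            sessions.append(current)
    return sessions
```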

2.1 Analyze the patterns in the dates and times of transactions.

Determine how even or uneven your level of use is, and what your periods of heaviest use are.

Identify any design changes to your site that are called for by the pattern of visits of your current actual audience.

2.2 Analyze the number of hits (files requested) and the number of page views (which can be derived from your site structure by most of the software tools for server log analysis).

Use this data to determine the amount of traffic on your site and the level of demand for the various topics and types of content that you offer.

Compare these patterns to your initial assumptions. Do you see differences that call for changes to the site’s design or content?
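Commercial tools derive page views from your actual site structure; as a rough stand-in, the sketch below simply treats requests for HTML files as page views and everything else (images, movies) as supporting hits. The extension list is an assumption, and the counts are only as good as that assumption.

```python
PAGE_EXTENSIONS = (".html", ".htm")

def hits_and_page_views(hits):
    """Separate page views (HTML requests) from the raw hit total, which also counts images, movies, etc."""
    pages = [hit for hit in hits if hit["resource"].lower().endswith(PAGE_EXTENSIONS)]
    return len(hits), len(pages)
```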

2.3 Analyze the referring pages from which visitors come to your site.

Identify what sites your visitors are coming from. Are there patterns that you did not expect to see? Are there design changes that you can think of that would better serve visitors with these apparent interests or affiliations?

2.4 Analyze the amount of time spent on each page.

Using averages over long blocks of time to minimize the effects of disrupted user attention (for instance, users leaving your page open while answering the phone), determine which of your pages users appear to spend the most time on.

Use care in responding to this statistic. Although time spent on a page might indicate interest, it might also indicate confusion or difficulty in understanding your content. This statistic can be combined with other statistics (and results of user questionnaires and other direct user queries) to clarify the user experience.
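Because the log records only request times, time on a page can only be estimated as the gap before the next request in the same inferred session; the last page of a session has no following request and must be skipped. A hedged sketch, building on the session grouping above (the cap on long gaps is an arbitrary way to discard likely interruptions):

```python
from collections import defaultdict

def average_time_on_page(sessions, cap_seconds=1800):
    """Average seconds spent on each page, estimated from gaps between consecutive
    requests in an inferred session. Gaps above `cap_seconds` are discarded as likely interruptions."""
    totals, counts = defaultdict(float), defaultdict(int)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            gap = (nxt["timestamp"] - current["timestamp"]).total_seconds()
            if gap <= cap_seconds:
                totals[current["resource"]] += gap
                counts[current["resource"]] += 1
    return {page: totals[page] / counts[page] for page in totals}
```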

2.5 Analyze your most and least frequently visited pages.

Identify the pages that appear, based on the number of hits, to be relevant or interesting to the greatest number of visitors, and those that appear to be of least interest.

Are there patterns that you did not expect to see? Are your visitors overlooking content that your initial analysis suggested would be important to them or that you particularly want them to see? Can you think of design changes that could redirect their attention?

2.6 Analyze the search terms used to hit your pages.

Identify the vocabulary that your site matches in searches. Compare to your initial assumptions and to the whole set of terms used in your design.

Are terms that are important descriptors of some of your content not showing up in the search terms that bring visitors to your site? You can work with your Web master and system administrator to add keywords to your site that will be picked up by search engines.

Are there other problems with the search terms that lead people to your site? By looking at the search terms and referring pages, you may be able to identify ambiguous terms or other terms that are leading you to get unproductive hits.

2.7 Analyze the search terms used to search within your Web site.

Identify the vocabulary used by your visitors to look for the content they are trying to locate on your site. (Remember, to analyze the search terms used on your site, you need to work with your Web master or other technical staff to set up the search engine so that the terms are also reported to a file.) Compare the terms actually being used to your original assumptions (labels and titles that you use).

Are users apparently seeking content that you offer, but simply using different names for it? Are users seeking content that you don’t offer but could? Are users thinking about your content in ways that differ from your terminology or organization?

Consider whether you can modify your terminology so that it fits better with the way your actual users are thinking, so as to reduce the number of unproductive searches and improve access to your content.
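As a simple starting point, you can compare the terms visitors type into your internal search engine against the labels and titles you actually use. The sketch below does a crude whole-term comparison; the function name and the example inputs in the comment are invented for illustration.

```python
from collections import Counter

def vocabulary_gap(search_terms, site_labels, top=10):
    """Most frequent visitor search terms that do not occur anywhere in your labels and titles."""
    label_text = " ".join(label.lower() for label in site_labels)
    missing = Counter(term.lower() for term in search_terms if term.lower() not in label_text)
    return missing.most_common(top)

# Hypothetical inputs: terms logged by your internal search engine vs. your page titles, e.g.
#   vocabulary_gap(["osteoarthritis", "knee pain", "acl tear"], ["Arthritis Basics", "Knee Injuries"])
```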

2.8 Analyze the most frequent paths through your site.

Using larger blocks of data to minimize problems with interpretation, identify the pathways that users most often follow through your site.

Remember, doing so requires software that traces a given IP address through a string of link choices, clusters the results, and graphically displays the results as pathways with frequencies. One such product, Link trakker, provides information on what search engines and terms visitors use, paths they take through the site, and other sites that are linking to your site (http://www.radiation.com/cgi-bin/trakker/secure/demoreport.cgi). Keep in mind that this kind of tracking depends on making a number of inferences rather than on certain knowledge. Be careful not to put more faith in the results than is warranted.
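If you do not have such a product, a rough approximation is to count short fixed-length sequences of page requests within the inferred sessions, as in the sketch below (reusing `infer_sessions` from the earlier sketch). It inherits all the uncertainty of the session inference itself, so treat the resulting "paths" as hypotheses to investigate rather than facts.

```python
from collections import Counter

def frequent_paths(sessions, length=3, top=10):
    """Count the most common length-N sequences of HTML page requests within inferred sessions."""
    paths = Counter()
    for session in sessions:
        pages = [hit["resource"] for hit in session if hit["resource"].lower().endswith(".html")]
        for i in range(len(pages) - length + 1):
            paths[tuple(pages[i:i + length])] += 1
    return paths.most_common(top)
```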

3. COMBINING TYPES OF SERVER LOG DATA

By juxtaposing your insights from different parts of your server log data, you can draw additional conclusions about your users and their needs. These additional conclusions can point you to specific ways to improve the usefulness of your Web site.

3.1 Combine types of server log data to draw conclusions about users and user needs to support strategic decisions about revision and redesign.

You can get more power out of your analyses if you combine or juxtapose the findings related to different types of server log data. For example, compare the search terms used on your site to the terminology you use; combine that data with data about the number of hits on pages containing related content.

3.2 Track the effects of your changes by monitoring new server log data in the same categories.

After you have redesigned your pages based on what the server log data told you, monitor new server log data in the same categories to see the effects of your changes.

Examples:

As suggested above, you might observe that a significant block of users is using the same set of terms to search within your Web site, and you might learn by inspecting your site’s terminology that your users appear to have a vocabulary for your content that differs from yours. After changing your vocabulary, you may learn by monitoring the search terms used by visitors coming in to your site that you are now getting more hits on your new terms.

You might compare users’ search terms to your most and least popular pages and pathways. You might then change your terms and monitor patterns of use to see if you have raised the visibility of your “hidden” content.

You might study the data about your most and least popular pages compared to the ordering and grouping of topics in your menu hierarchy. You might then reorganize your menu structure and monitor the server log data to see if the pattern of use changes.

If you are located in the U.S.A. and you find that your site is heavily visited from Asian countries, and (perhaps as a result) the site has its heaviest period of use late at night, you might reschedule routine maintenance that otherwise could reduce access.

Guidelines for analyzing audience and patterns of use by means of data from user informants

Beyond using server log data, the second major way to analyze audience and patterns of use on your Web site is to collect data directly from the visitors to your site–“user informants.” The next three guidelines focus on collecting and using this kind of data.

Notes

Data from user informants is often used for marketing purposes, for instance to build demographic profiles, conduct e-commerce, and document levels of traffic in order to sell advertising. But data from user informants can also be collected and analyzed to provide direction for improvements to a site's design.

Remember that user informants (respondents to online questionnaires, visitors sending email, and other voluntary providers of data) are self-selected and possibly not representative of your broader audience(s). Also remember that people may not have a clear understanding of the question you are asking, may not be highly articulate, and may not necessarily report factually or fully.

4. USING DATA PROVIDED BY USER INFORMANTS TO IMPROVE YOUR PICTURE OF AUDIENCE DEMOGRAPHICS AND PATTERNS OF USE

Visitors to your site can cooperate more directly in producing data to help you understand your audiences and their needs. They can give you many kinds of data, varying greatly in the effort required on their part. Each approach also imposes different technical requirements on your site design and requires a different level of trust and willingness to participate from your users.

A Word about Privacy

The information in cookies and other user data is often used to create profiles of typical users and patterns of use. Profiling generally aggregates information about users, so that no individual is named or tracked. But (by using cookies as well as passwords, for instance) some profilers do link individual users and their names and addresses to their specific purchases, so as to sell targeted ads. In the aggregate form, you can compare the behavior of a new visitor to patterns of behavior that you have seen over large numbers of visitors. Where you see a match, you can tell the new visitor about the behavior of people “like” him or her (for instance, you can say that people who liked this book or movie–the one the visitor is looking at–also liked a second one, which you can then offer to sell them).

A second form of profiling, called collaborative filtering, goes farther. Rather than simply inferring visitors’ preferences by observing what they do, collaborative filtering asks the visitor to overtly rate or rank choices. For instance, on the site http://www.moviecritic.com, the visitor fills out an attitude questionnaire about a handful of movies, and on the basis of that indication of his or her taste, the site recommends other movies.

The power of profiling, collaborative filtering, and other user data analysis is just beginning to be understood (e.g., Gladwell 1999), but in any case it raises serious issues about users’ privacy. It is important to have a clear policy about privacy rights and adhere to it strictly.

4.1 If you collect user data, put a privacy policy statement on your Web site that clearly informs your users about the data you are collecting and the use you are making of it.

TRUSTe is an independent, non-profit initiative whose mission is to build users’ trust and confidence in the Internet by promoting the principles of disclosure and informed consent. In the privacy policy statement that they recommend, the visitor can expect to be notified of what information is gathered/tracked, how the information is used, and whom the information is shared with. The TRUSTe site (http://www.truste.org/) provides extensive information about privacy statements. TRUSTe has come to be a sort of watchdog for the internet; AOL and other major internet service providers post a privacy policy on their Web site and submit to auditing by TRUSTe.

4.2 Analyze data from cookies (files stored on the visitor’s computer that accumulate a history of the computer’s activity on your site).

This is the least complicated cooperation that you can ask of your visitor; the only user efforts involved in the use of cookies are leaving the computer set to accept them and not throwing them away. By updating the cookie each time that computer requests a file from your site, your server can add to a record of the computer's activity on your site: sequence of pages visited, participation in interactivity, transactions, etc.

4.3 Ask your users to use a user i.d. and password on your site.

By asking (or requiring) your users to create and use a user i.d. and password, you can track the activities of individuals as opposed to the computers that they are using. Thus you can raise your confidence that the patterns of behavior that you are seeing on your Web site do reflect the real behavior of users.

By aggregating these patterns of use, you can identify the main types of activity on your site. For instance, if you have a cooking site, do most users go first to the recipe pages? Do they search or browse? Do they check the seasonal availability of fresh ingredients as they look at recipes? Do they consult the bulletin board for comments from other user/cooks? By analyzing such patterns, you can draw conclusions about your users’ goals and interests (Dervin 1989).

4.4 Give users the opportunity to refuse dialog with you.

You can also use what you learn from user i.d.s and passwords to present individual users with messages tailored to their apparent characteristics, tastes, and interests. Many users will find such messages helpful and interesting; others will not. If you decide to display targeted messages, give users the option to reject any one message and to turn off the entire messaging activity.

4.5 Ask your users to fill out online questionnaires.

You can invite users to respond to online questionnaires about any issues on which you want feedback. The shorter the questionnaire, the more often users complete it. The same rules of design used for paper questionnaires govern online ones.

4.6 On every page, offer your users the opportunity to send you email.

You can provide an email link to allow users to send the Web master questions, comments, and other feedback.

If you do, you must be prepared to respond to the emails in a reasonable period of time. It is possible to first send an automatic reply that tells the user when and how to expect an answer. One somewhat less labor-intensive way to respond to emails is to aggregate the answers and post them on a bulletin board or Frequently Asked Questions page on your Web site.

Email from your users can be very powerful at getting you outside the confines of your own thinking. Often unstructured input from users can help you discover creative or radical solutions that would never have occurred to you otherwise. The tradeoff is that this kind of input is time-consuming to respond to and challenging to control (for instance, you don’t want to get locked into an extended email discussion about a problem beyond your control).

4.7 Conduct “remote usability tests” over the Web.

A record of an individual’s session on your Web site is essentially a remote usability test, but without the thinking out loud that can tell you what the user wanted to do, what assumptions he or she was making while working, and other insights into his or her thinking. You can get that important dimension of user behavior using technology as simple as the telephone to have users talk you through a session as you follow along on your own computer.

5. COMBINING TYPES OF USER INFORMANT DATA

Again, by juxtaposing your insights from different kinds of user-supplied data, you can draw additional conclusions about your users and their needs. These additional conclusions can point you to specific ways to improve the usefulness of your Web site.

5.1 Combine types of data provided by user informants to draw additional conclusions about users and user needs.

Examples

You can compare the questions and requests for information contained in email from your users to the pathways they follow (provided by tracking the actions of users identified by user i.d.'s) and their self-descriptions (provided by their answers to pop-up questionnaires).

You can combine data from cookies, user i.d.'s and passwords, and pop-up questionnaires about the users' level of familiarity with your content to build a profile of the behavior typical of beginners (e.g., search terms used, if any; pages visited).

6. COMBINING DATA FROM WEB SOURCES WITH OTHER DATA AVAILABLE IN YOUR ORGANIZATION

Continue to explore ways to combine the data available to you to support design improvements. Don't overlook other data available in your organization derived from sources other than the Web site itself.

6.1 Combine data from server logs and user informants with other data available in your organization to support strategic decisions about revision and redesign. Track the effect of your changes by monitoring new data in all categories.

Examples

Compare problem reports from your customer support organization to your record of search terms used on your site to identify new content that should be added to your site.

Examine the search terms that triggered hits to your site. Construct a pop-up questionnaire to ask visitors about their level of interest in related topics, appropriate to your site's scope and purpose, which a search engine would not find on your site. If the feedback warrants it, add the new content.

PART TWO: BUILDING A RELATIONSHIP OR SENSE OF COMMUNITY

We started by saying that these guidelines would focus on strategies for getting data from users and using it for two main purposes: (1) analyzing audience and patterns of use to support continuous redesign, and (2) building a relationship or sense of community, either between you and your users or among the users themselves. We now turn to focus on this second main purpose, building community, often described as one of the most powerful effects of Web communication.

Meeting this second purpose goes beyond analysis of audience and patterns of use to examine how to manage the link between you and your users so as to build a relationship or sense of community. We have in fact already talked about using this link for data collection in which the user gives data to the designer. These methods actually already imply a certain kind of relationship between the designer and user.

In the case that we first considered, in which the user is passive or even unaware that data is being collected (use of server logs), we can say that the user experiences the Web site as an inanimate product. The designer sees himself or herself as an analyst examining data from a subject (the user). In the case of the user who actively takes part in the data exchange (for instance, by sending email), the roles and relationship of designer and user are very different. The user may still perceive the Web site as an inanimate product, but now he or she has an awareness of the designer as a person, the producer of the site, and of him- or herself as at least at some level a contributor to producing the site. Thus there is a much richer human relationship involved in this exchange.

Type of interaction: user to designer

Examples: Data from both server logs and user informants.

Relationship established:
1. If the user is passive or unaware (server log data, cookies): the Web site is perceived as an inanimate product, the designer is "analyst," the user is "subject."
2. If the user is active (questionnaire respondent, emailer): the Web site is perceived as an inanimate product, the (human) designer is acknowledged as its producer, the user is a "contributor" to the site.

But providing the designer with data is only one form of user communication (user-to-designer) that your site can offer. You can also offer numerous other forms: designer-to-user, designer-to-group, user-to-user, user-to-group. And these different types of communication with users have different rules and create different relationships.

When the designer responds directly to communications from users, more complex relationships become possible. Let’s consider the simplest case, where the designer presents the user with plain reportage, for instance a list of his or her recent actions. This response, even though still quite impersonal, at least involves the closure of the feedback loop–the user took actions, the designer analyzed the actions, and then the designer reflected the record of the actions back to the user. The user may continue to perceive the Web site as just a product, but the “breadcrumb trail” information offered by the designer may lead the user to think of the designer as a more active partner in a communication (“producer/communicator”). In the next more complex situation, in which the designer responds directly and personally to the user (for instance, in a personal email), the beginnings of a peer relationship emerge: the designer and user are consulting, are collaborating, on the design of the site.

Let’s take this process one step further and look at the case where the designer communicates back to the entire group of users. Here, the feedback loop is closed not by providing feedback to just a single user but by showing the whole group the aggregated results of the communication or data collection/analysis. For instance, the designer might present a chart of the distribution of user responses to an online questionnaire. The designer is still in the role of producer/collaborator, but now the user becomes aware of himself or herself as a member of a community, and the Web site begins to feel more like a setting for interactions than like an inanimate product.

Type of interaction: designer to user

Examples: Display of former user actions (for instance, topics consulted earlier); replies from designer to user email.

Relationship established:
1. Impersonal response (e.g., topic list): the Web site is a product, the designer is "producer/communicator," the user is "participant."
2. Personal response (e.g., email reply): the Web site is a product, the designer is "producer/collaborator," the user is "collaborator."

Type of interaction: designer to group

Examples: Designer reflects data back to users, displaying charts or other reports of the results of online questionnaires, posing new questions based on feedback from/dialog with users, suggesting actions or options based on analysis of the behavior of other similar users ("if you liked this movie, you'll probably like these others"; selecting ads to show based on a user profile).

Relationship established: The designer is "producer/collaborator"; the user is "collaborator," possibly with some feeling of "community member"; the Web site is a product and a setting.

Once the user has become aware of the other users in the community, it is a short step to go on to offer him or her the option of communicating directly user-to-user. You may choose to offer a bulletin board where users can leave postings, or a chatroom where users can interact with each other more casually. Now the relationship between designer and user has changed radically; the designer now has become the enabler of communication among users, the Web site is the setting, and the users are themselves creators of content and even community.

Type of interaction: user to user

Examples: User replies directly to another user's posting on a bulletin board; user replies to another user's query or ad.

Relationship established: The Web site is a setting, the designer is "enabler," and the focus on the designer is displaced by a focus on the user as "co-creator" of use and content.

Type of interaction: user to group, group to user

Examples: User selects an audience or topic (for instance, on a bulletin board) and addresses all other users associated with it (e.g., by posting a message). Users post messages for or against positions taken or attitudes expressed by one or more other users. Users confirm group identity.

Relationship established: The Web site is a setting, the designer is "enabler," and the focus on the designer is displaced by a focus on the user as "creator" of use, content, and community.

Note that these radically different imaginings of the purpose and use of a Web site derive directly from the designer’s choices about communication. By choosing communication roles and relationships for themselves and for the users, Web site designers define the scope of the human dimension of the Web.

Guidelines for building a relationship or sense of community

The final three guidelines focus on your selection and management of the roles and relationships set up by your choice of modes of interaction and communication.

7. OFFERING FORMS OF INTERACTING WITH USERS THAT ARE APPROPRIATE FOR AND CONSISTENT WITH YOUR SITE'S INTENDED AUDIENCES, PURPOSES AND USES

We have said that different choices about how you communicate with your users create different roles and relationships. Not all roles and relationships are appropriate for every audience, purpose, or use of every Web site.

If a site claims that it is a forum for a particular interest group, does it offer the members of that group a mechanism for posting content? If the site represents itself as an advocate of a group (members of a social group, for instance), it should offer the members of that group an egalitarian setting to encourage wide participation.

If a site, on the other hand, claims that it presents authoritative information, does it nevertheless allow users to post unverified information? Generally, a site that wants to maintain an authoritative persona would welcome questions but not enable unscreened postings of information. It might however enhance community feeling by posting human-interest stories or case studies (often done, for instance, on health or education sites).

A site can have areas that differ as to what forms of interaction are appropriate. For instance, a health site might have a “news” area that is quite authoritative and a “chat” area in which visitors can exchange feelings and stories. If a site has areas that differ in this way, the site design should clearly indicate moves from one area to another.

8. CONSIDERING WHETHER YOUR INTERACTION DESIGN IS APPLIED CONSISTENTLY, AND WHETHER YOU MAINTAIN THE RESULTING DESIGNER/USER ROLES CONSISTENTLY ACROSS OTHER DIMENSIONS OF YOUR DESIGN (FOR INSTANCE, TONE)

Roles and relationships are fragile and can be undermined or undone by abrupt departures from the overall pattern or user expectation.

8.1 Make the communicative tasks and opportunities of the reader as clear and explicit as possible.

Do you use one or more forms of interacting with your users? Is each type designed in the same way everywhere that it is used? If not, can you justify the difference? If you use more than one type, do the types work together without creating conflicting roles and relationships for users? Are the relationships implied by your choice of forms of interaction with users maintained consistently across your site?

8.2 Inspect your site to confirm that all of your design choices are working together.

The forms of interactivity with users that you employ create roles and relationships between and among designers and users, but other dimensions of your design also contribute to the creation of roles and relationships (see Coney and Steehouder, Guidelines: Reader Roles). Inspect your site to confirm that all your design choices are working together.

Does your choice of tone (for instance, authority speaking in a formal tone versus peer speaking in a familiar tone) work with the roles and relationships created by the form of interactivity you are using? Do the other dimensions of your Web site maintain relationships between designer and user (and among users themselves) that are consistent with those created by the forms of interactivity that you provide to the user? Or do you change the roles allowed users with respect to designers and other users? If you do change the roles allowed to users, are the changes appropriate and justifiable, and on what grounds?

9. DECIDING HOW EXPLICIT YOU WILL MAKE THE DESIGNER/USER ROLES AND RELATIONSHIPS CREATED BY YOUR CHOICE OF INTERACTIVITY WITH THE USER, AND IDENTIFYING THE DESIGN CHOICES AND MOTIFS THAT YOU WILL USE TO REINFORCE THEM

Choose the extent to which you want to reveal and explicitly reinforce the roles and relationships created by your choice of forms of interactivity.

If you do not want to emphasize the relationship of designer to user or user to user on your site, consider whether you have chosen forms of interactivity with your users that create expectations that you do not intend to or cannot meet. Consider choosing forms of data collection and interactivity more consistent with the relationship with your user that you want to maintain.

If you do want to emphasize this aspect of your site, consider using labels, design layouts, or other design motifs to draw attention to the roles and relationships. If you use terms or motifs that draw attention to or showcase user participation (user profiles, summaries of responses, chatrooms, etc.), use them consistently across the site.

Quicklist: Web Data Collection for Understanding and Interacting with Your Users

This guideline discusses Web data collection for understanding and interacting with your users in two main parts: (1) analyzing audience and patterns of use to support continuous redesign and (2) building relationship and community on your Web site.

Four Considerations to Keep in Mind About Web Data Collection

Analysis of Web data does not substitute for doing initial audience analysis. Web site design (like the design of other forms of communication) begins with a careful analysis of the intended or expected audience(s), purposes, and uses. After the initial release, however, Web statistics and Web survey data can provide the designer with a dynamically emerging picture of site visitors and their patterns of use.

Web data from logs must be used cautiously. Server log data reports on machines and transactions (individual requests for files), not people and sessions (see below for more details).

Web data from user informants must also be used cautiously. User informants (respondents to online questionnaires, visitors sending email, and other voluntary providers of data) are self-selected and possibly not representative of your broader audience(s).

Collecting and interpreting Web data requires technical support. Most of the guidelines in this set require that you work with the system administrator responsible for your Web server (to extract the necessary data) and possibly with a programmer (to implement data capture techniques). In most cases the guidelines require you to use software that collects, manipulates, and/or graphically displays specific kinds of data from the server log file (the record of activity on a site).

PART ONE: ANALYZING AUDIENCE AND PATTERNS OF USE TO SUPPORT CONTINUOUS REDESIGN

Part One focuses on guidelines for analyzing audience and patterns of use to support continuous redesign. It covers the two approaches to collecting data for this purpose: collecting data from Web server logs and collecting data directly from user informants.

Guidelines for Analyzing Audience and Patterns of Use by means of Server Log Data

1. USING SERVER LOG DATA TO MONITOR YOUR AUDIENCE DEMOGRAPHICS

Use server log data to monitor your audience demographics, keeping in mind that drawing conclusions about your audience demographics requires interpretation of the data.

1.1 Analyze the IP addresses, translated into domain names or countries of origin, for computers sending requests to your Web site server.

1.2 Analyze the browsers or platforms being used by computers sending requests to your Web site server.

1.3 Analyze the number of unique IP addresses that visited your site and the number of visits each made.

2. USING SERVER LOG DATA TO GET A GROSS VIEW OF PATTERNS OF USE ON YOUR SITE

Use server log data to get a gross view of patterns of use on your site.

2.1 Analyze the patterns in the dates and times of transactions.

2.2 Analyze the number of hits (files requested) and the number of page views (which can be derived from your site structure by most of the software tools for server log analysis).

2.3 Analyze the referring pages from which visitors come to your site.

2.4 Analyze the amount of time spent on each page.

2.5 Analyze your most and least frequently visited pages.

2.6 Analyze the search terms used to hit your pages.

2.7 Analyze the search terms used to search within your Web site.

2.8 Analyze the most frequent paths through your site.

3. COMBINING TYPES OF SERVER LOG DATA

Combine types of server log data to draw conclusions about users and user needs to support strategic decisions about revision and redesign. Track the effects of your changes by monitoring new server log data.

Guidelines for Analyzing Audience and Patterns of Use by means of Data from User Informants

4. USING DATA PROVIDED BY USER INFORMANTS

Use data provided by user informants to improve the accuracy and detail of your picture of audience demographics and patterns of use.

Note: It is important to have a clear policy about privacy rights and adhere to it strictly.

4.1 If you collect user data, put a privacy policy statement on your Web site that clearly informs your users about the data you are collecting and the use you are making of it.

4.2 Analyze data from cookies (files stored on the visitor’s computer that accumulate a history of the computer’s activity on your site).

4.3 Ask your users to use a user i.d. and password on your site.

4.4 Give users the opportunity to refuse dialog with you.

4.5 Ask your users to fill out online questionnaires.

4.6 On every page, offer your users the opportunity to send you email.

4.7 Conduct “remote usability tests” over the Web.

5. COMBINING TYPES OF USER INFORMANT DATA

5.1 Combine types of data provided by user informants to draw additional conclusions about users and user needs.

6. COMBINING DATA FROM WEB SOURCES WITH OTHER DATA AVAILABLE IN YOUR ORGANIZATION

6.1 Combine data from server logs and user informants with other data available in your organization to support strategic decisions about revision and redesign. Track the effect of your changes by monitoring new data in all categories.

PART TWO: BUILDING A RELATIONSHIP OR SENSE OF COMMUNITY

Part Two focuses on guidelines for building a relationship or sense of community on your Web site.

Guidelines for Building a Relationship or Sense of Community

7. OFFERING FORMS OF INTERACTING WITH USERS THAT ARE APPROPRIATE FOR AND CONSISTENT WITH YOUR SITE'S INTENDED AUDIENCES, PURPOSES, AND USES

Offer forms of interacting with users that are appropriate for and consistent with your site's intended audiences, purposes, and uses.

8. CONSIDERING WHETHER YOUR INTERACTION DESIGN IS APPLIED CONSISTENTLY, AND WHETHER YOU MAINTAIN THE RESULTING DESIGNER/USER ROLES CONSISTENTLY ACROSS OTHER DIMENSIONS OF YOUR DESIGN (FOR INSTANCE, TONE)

Consider whether your interaction design is applied consistently, and whether you maintain the resulting designer/user roles consistently across other dimensions of your design (for instance, tone).

8.1 Make the communicative tasks and opportunities of the reader as clear and explicit as far as possible.

8.2 Inspect your site to confirm that all of your design choices are working together.

9. DECIDING HOW EXPLICIT YOU WILL MAKE THE DESIGNER/USER ROLES AND RELATIONSHIPS

Decide how explicit you will make the designer/user roles and relationships created by your choice of interactivity with the user, and, if you want to emphasize them, identify the design choices and motifs that you will use to reinforce them.

REFERENCES

Amkreutz Boyd, Suzanne (2000). Practitioners' review of Web guidelines. Master's thesis, Department of Communication, University of Washington, Seattle, Washington. (I would also like to thank Suzanne for the extensive support, over more than a year’s duration, that she provided me and my colleagues during the development of these guidelines.)

Aviram, Mariva H. (2/3/98). “Analyze Your Web site Traffic,” Builder.com, http://www.builder.com/Servers/Traffic/

Buchanan, R.W. & Lukaszewski, C. (1997). Measuring the Impact of Your Web Site. Proven Yardsticks for Evaluating. New York: John Wiley.

Burke, Raymond R. (1997). “The Future of Market Research on the Web: Who is visiting your site?” Continuous Learning Project: Problems with Traditional Measurement Techniques, Indiana University, http://universe.indiana.edu/clp/or/future.htm

Dervin, Brenda (1989). “Users as research inventions: how research categories perpetuate inequalities.” Journal of Communication, 39, 3, pp. 216-232.

Drott, M. Carl (1998). “Using Web Server Logs to Improve Site Design,” SIGDOC ’98 Conference Proceedings, pp. 43-50.

Esler, Mike, Katherine Puckering, Ryan Knutsen, Josh Cohen, Dorothy Lin, and Tristan Robinson, “Privacy on the Web: Pro and Con,” seminar report for TC505, Computer-Assisted Communication, Autumn 1999. I would like to thank these students for identifying the sources cited concerning privacy.

Gladwell, Malcolm (1999). “Annals of Marketing: The Science of the Sleeper (How the Information Age could blow away the blockbuster).” New Yorker: October 4, 1999.

Goldberg, Jeff (1999). “Why Web usage statistics are (worse than) meaningless,” Cranfield Computer Centre, Cranfield University, http://www.cranfield.ac.uk/docs/stats/

Kantner, Laurie (2000). "Assessing Web Site Usability from Server Logs," Common Ground, newsletter of the Usability Professionals' Association, vol. 10, no. 1 (March 2000), pp. 1, 5-11. Also published as Tec-Ed, Inc. (1999), “Assessing Web site Usability from Server Log Files,” white paper prepared by Tec-Ed, Inc., PO Box 1905, Ann Arbor, MI 48106, December 1999.

Linder, Doug (1999). “Interpreting WWW statistics,” National Archives and Records Administration Web site, http://gopher.nara.gov:70/Oh/what/stats/webanal.html

Macklin, Scott, Jennifer Turns, and Brett Shelton (2000). Personal communication. I am grateful to Scott Macklin, director of the University of Washington PETTT (Program for Educational Transformation through Technology), and Jennifer Turns and Brett Shelton of the same project, for allowing me to use a sample of their server log file.

Macklin, Scott (2000). Personal communication.

Marketwave.com (1999). “Web Mining: Going Beyond Web Traffic Analysis,” White Paper–Web Statistics and Traffic Analysis Software, Tuesday, June 1, 1999. http://www.marketwave.com/press/whitepaper.htm

Nielsen, Jakob (1999). “Collecting Feedback From Users of an Archive (Reader Challenge),” Useit.com Alertbox, January 10, 1999. http://www.useit.com/alertbox/990110.html

Stehle, Tim (1999). “Getting Real About Usage Statistics,” http://www.wprc.com/wpl/stats.html

Stout, R. (1997). Web Site Stats. Tracking Hits and Analyzing Traffic. Berkeley: Osborne/McGraw-Hill.

Sullivan, Terry (1998). “Reading Reader Reaction: A Proposal for Inferential Analysis of Web Server Log Files,” U.S. West Web Conference: http://www.uswest.com/web-conference/proceedings/rrr.html

Yu, Jack J., Prasad V. Prabhu, and Wayne C. Neale (1999). “A User-Centered Approach to Designing a New Top-Level Structure for a Large and Diverse Corporate Web Site,” Our Global Community Conference Proceedings, http://www.research.att.com/conf/hfweb/proceedings/yu/index.html

BIOSKETCH

Judith Ramey, PhD, is professor and chair of technical communication at the University of Washington. She edited a special issue of IEEE Transactions on Professional Communication on usability testing in 1989. With Ginny Redish, she conducted a research study on the value added by technical communicators to a product or process, the results of which were published in a special section of Technical Communication in 1995. With Dennis Wixon, she co-edited a collection of essays entitled Field Methods Casebook for Software Design, published by John Wiley and Sons in 1996. She is a Fellow of STC. [email protected], (206) 543-2588
