RUST: A Retargetable Usability Testbed for Website Authentication Technologies

Maritza L. Johnson, Chaitanya Atreya∗, Adam Aviv†, Mariana Raykova, Steven M. Bellovin, and Gail Kaiser
Columbia University

∗ C. Atreya graduated from Columbia University in Fall 2007.
† A. Aviv is currently a student at the University of Pennsylvania.

Abstract

Website authentication technologies attempt to make the identity of a website clear to the user by supplying information about the identity of the website. In practice, however, usability issues can prevent users from correctly identifying the websites they are interacting with. To help identify usability issues we present RUST, a Retargetable USability Testbed for website authentication technologies. RUST consists of a test harness, which provides the ability to easily configure the environment for running usability study sessions, and a usability study design that evaluates usability based on spoofability, learnability, and acceptability. We present data collected by RUST and discuss preliminary results for two authentication technologies, Windows CardSpace and Verisign Secure Letterhead. Based on the data collected, we conclude that the testbed is useful for gathering data on a variety of technologies.

1 Introduction

The heightened interest in website authentication technologies is fueled by a rise in cybercrimes, such as phishing, and US federal regulations that require financial websites to use two-factor authentication [2, 7]. Website authentication technologies attempt to solve one direction of the mutual authentication problem on the Internet by either altering the login process or providing the user with supplemental information. The primary usability questions in website authentication are: how does a website communicate to the user that it is the real site, and how does a user identify a malicious website? Usability plays a major role in the effectiveness of the technology but receives little attention during development.

To facilitate usability evaluations, we present RUST, a Retargetable USability Testbed composed of a usability study design and a test harness for the test environment. First we discuss prior work in the area, and then we describe the design process for the usability study. Next, we describe the test harness. Finally, we present results from two usability studies conducted at Columbia to illustrate how we validated RUST.

2 Background

Prior to website authentication tools gaining popularity, Whalen and Inkpen conducted a study to evaluate web browser security indicators [15]. They collected data on which indicators users considered when evaluating a webpage's security by asking participants to perform a set of tasks while focusing on each website's security. Most participants checked for either the lock icon or https in the URL bar, but few checked for or understood certificates. Similarly, in an effort to understand why phishing attacks are successful, Dhamija et al. measured how users evaluate possible phishing websites [6]. Participants were presented with a series of websites and asked to determine if each site was real or fake. 23% of the participants based their decision solely on indicators they found in the webpage content. Some participants looked for the lock icon but mistook lock images in the webpage content for trusted security indicators.

Web Wallet is an anti-phishing tool that alters how passwords are entered [17] by providing an interface for entering sensitive information other than the web form provided by the website. It helps the user by removing the guesswork of which websites have been visited in the past. A usability study showed Web Wallet was effective at helping the participants identify the real website, but participants were easily tricked by spoofs of the interface. Wu et al. also evaluated the usability of toolbars to assess if they assisted users in identifying phishing websites [16]. The results of the usability study indicate the toolbars are ineffective in assisting users on well-designed spoofs.

Another study evaluating website authentication was Jackson et al.'s evaluation of whether browser indicators of extended validation certificates assisted users in identifying phishing attacks [11]. The results showed new indicators, like a green URL bar for an EV certificate, did not offer an advantage over the existing indicators.

Schechter et al. [13] conducted an in-lab study to evaluate a website authentication technology where each user has a personalized image. Participants were asked to perform a series of online banking tasks while security indicators were gradually removed. Their results show participants fail to recognize the absence of security indicators, like the SSL lock and HTTPS, and will enter their password in the absence of their personalized image.

Usability study design is a well-studied area [10]; however, designing security usability studies creates additional challenges. One issue is how to design a study where the test administrators attack the participants [3]. More recently, usability studies have been designed to evaluate methods of conducting security usability studies. For example, Schechter et al. conducted a between-subject usability study to measure the effect of asking a participant to play a role and use fake credentials rather than their personal information [13]. They found participants who used their real data acted more securely during the tasks. To help usability study designers, SOUPS made kits available from the papers in its proceedings [14]. The kits provide usability study material but are fairly specific, and reusing the material would require a number of changes.

2.1 User Study Design

In RUST, usability is measured by the technology's spoofability, learnability, and acceptability. Spoofability is an attacker's ability to trick the participant into entering personal information on an illegitimate website. Learnability is the user's ability to correctly use the technology with and without instruction. Acceptability is the user's reaction to the technology; if users do not understand why a security process is necessary, they will find ways to break the process [1].

We chose an in-lab study as the method of evaluation so we could attack our participants and measure spoofability without raising ethical concerns [12]. Because we wanted to see how participants would behave under conditions of attack, we did not disclose the purpose of the study beforehand, since doing so would place an unrealistic focus on security. We supplied participants with credentials to eliminate privacy concerns and because users do not already have the necessary credentials to use with the novel technologies we were testing. We asked participants to play a role during the session to justify the use of fake credentials and to motivate the participant to act securely.

To evaluate both spoofability and usability, we asked participants to complete tasks at real and spoofed banking websites. Spoofability is measured by the number of successful attacks in a session. To evaluate learnability, four tasks are given before the participant is provided with instructions in the fifth task. This provides the chance to gather data on the technology's ease of use prior to the participant reading documentation. The acceptability of the technology is based purely on a participant's subjective opinion and cannot be measured through direct observation. Instead, we collected feedback through questions, using Likert scales to classify their reactions and open-ended questions for them to comment on their thoughts during the session.

We designed the study as a within-subject study, where each participant is given the same set of tasks under the same conditions. We collected session data through the test harness and through self-reported feedback. Before beginning the study, we gave each participant a demographic survey with questions to gauge their experience with web browsing. We gave them copies of the study instructions, the role they are asked to play, and personal information to accompany the role. The instructions state that the goal of the study is to improve online banking. We asked participants to imagine that they have an uncle who is in the hospital for an unexpected extended period of time and needs someone to assist in managing his finances. In addition, we asked the participant to act normally and to treat their uncle's information as they would their own.

During a session, we sent the participant eight emails, each of which contains a task and a link to the website where they should complete the task. Four of the emails are phishing attacks and direct the participant to an illegitimate site. The other four emails direct the participant to a real financial institution's website. Some of them are requests from Uncle John for a specific action to be taken, and one is an email from the bank introducing the new technology and providing basic instructions. The first task directs the participant to the real site and allows them to experience the technology working properly before an attempted spoofing attack. Between each of the tasks, we asked the participants to comment on their experience if they completed the task, or why they decided not to complete the task for any reason. After the tasks are completed, we asked participants to express their opinion of the technology in a post-study questionnaire.
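
The paper does not show how a session's task schedule is represented inside the testbed, so the sketch below is only an illustration of one way such a schedule might be written down: eight task emails, four pointing at the real site and four at spoofs, with the instructional email as the fifth task. All field names, subjects, and URLs are hypothetical.

```python
# Hypothetical schedule for one RUST session; the actual representation
# used by the harness is not given in this paper (see [4] for the design).
SESSION_TASKS = [
    {"order": 1, "site": "real",  "instructional": False,
     "subject": "Check Uncle John's account balance",
     "url": "https://bank.example.com/login"},
    {"order": 2, "site": "spoof", "instructional": False,
     "subject": "Register for your new identity card",
     "url": "http://bank-secure.example.net/register"},
    {"order": 3, "site": "real",  "instructional": False,
     "subject": "Pay Uncle John's electricity bill",
     "url": "https://bank.example.com/billpay"},
    {"order": 4, "site": "spoof", "instructional": False,
     "subject": "Temporary sign-in while the system is down",
     "url": "http://bank-login.example.net/temporary"},
    {"order": 5, "site": "real",  "instructional": True,
     "subject": "Introducing our new sign-in technology",
     "url": "https://bank.example.com/login"},
    # Tasks 6-8 repeat the spoof/real/spoof pattern of tasks 2-4.
]

def spoof_task_numbers(tasks):
    """Task numbers that are phishing attacks; comparing these against the
    pages a participant actually reached gives the spoofability count."""
    return [t["order"] for t in tasks if t["site"] == "spoof"]
```

Keeping the schedule in plain data is what would let the same eight-task design be reused for a different technology by swapping only the subjects and URLs.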

2.2 Test Harness Implementation

The test harness component of the RUST testbed creates a transparent testing environment that can be easily configured for different technologies, thus allowing for a simplified process for conducting usability evaluations. The use of proxies and logging tools allows data to be collected on the participant's actions while providing the environment for serving the necessary study webpages.

RUST collects data by monitoring the URLs requested while keeping track of the order in which pages are accessed, which indicates whether a phishing attack was successful. Additionally, the test harness monitors the time spent on a webpage. Time monitoring is important for a technology with multiple steps or pages because the time spent on a page may indicate possible problems in the login sequence. To monitor the order in which the webpages are visited and the time spent on each page, a JavaScript beacon is inserted on each test webpage the participant may visit. The test harness also includes scripts to convert the logging output into a more manageable format. A Python-based script is used to send MIME-based emails during the test session. Also, a simple shell script is included in the testbed to send specific emails to participants at fixed time intervals during the session. The complete details of the design are contained in [4].
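
Neither the log format nor the scripts themselves are reproduced in the paper, so the two sketches below are only rough illustrations in their spirit; every file name, column, address, and interval is an assumption. The first shows how timestamped beacon hits might be folded into the page order and per-page dwell times that RUST reports:

```python
import csv
from collections import defaultdict
from datetime import datetime

def page_order_and_dwell(log_path):
    """Convert raw beacon hits into per-participant visit order and
    time-on-page.  Assumes a CSV log with columns: participant,
    timestamp (ISO 8601), url -- a hypothetical format."""
    visits = defaultdict(list)
    with open(log_path, newline="") as f:
        for row in csv.DictReader(f):
            ts = datetime.fromisoformat(row["timestamp"])
            visits[row["participant"]].append((ts, row["url"]))

    report = {}
    for participant, hits in visits.items():
        hits.sort()  # chronological order = order the pages were visited
        entries = []
        for current, nxt in zip(hits, hits[1:] + [None]):
            dwell = (nxt[0] - current[0]).total_seconds() if nxt else None
            entries.append({"url": current[1], "seconds_on_page": dwell})
        report[participant] = entries
    return report
```

The second sketches the task mailer: one MIME message per task, sent over SMTP with a fixed pause between messages (the paper pairs a Python mailer with a shell script for the timing; here both roles are folded into one function for brevity):

```python
import smtplib
import time
from email.mime.text import MIMEText

SMTP_HOST = "mail.testbed.example"        # placeholder harness mail server
FROM_ADDR = "uncle.john@testbed.example"  # placeholder sender address
INTERVAL_SECONDS = 300                    # placeholder pause between tasks

def send_task_emails(to_addr, tasks):
    """Send one MIME task email per schedule entry, pausing a fixed
    interval between messages."""
    with smtplib.SMTP(SMTP_HOST) as server:
        for task in tasks:
            msg = MIMEText(f"{task['subject']}\n\n{task['url']}")
            msg["Subject"] = task["subject"]
            msg["From"] = FROM_ADDR
            msg["To"] = to_addr
            server.send_message(msg)
            time.sleep(INTERVAL_SECONDS)
```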

3 Evaluating RUST

We used RUST to conduct usability study sessions with Windows CardSpace and Verisign Secure Letterhead. Each of the technologies approaches the website authentication problem differently. CardSpace changes the login procedure, while Secure Letterhead provides the user with additional information to help them determine the website's identity.

For recruitment we solicited participants on Craigslist, placed a listing in the Facebook Marketplace, and posted fliers at Columbia University. The only requirement we placed on participants was that they had experience with web browsing.

3.1 Windows CardSpace

CardSpace is an identity metasystem that manages a user's online identities [5], replacing usernames and passwords. When a user needs to log in to a website, the user clicks on a button in the page content that begins the login process. At this point, CardSpace is launched and the website's credentials are sent. When the CardSpace interface appears, the user selects the appropriate "card" that represents their credentials. The user does not submit any personal information to the website. Instead, the exchange of credentials is handled completely within the CardSpace protocol once the "card" representing the user's identity is chosen by the user. Since the protocol is designed to reveal information only to the verified parties, tricking the user into giving away their credentials is less feasible and different attacks are required. As a result, we designed spoofs that focus on the user entering sensitive information directly on the website. One email stated CardSpace was down for scheduled maintenance and directed the user to a spoofed page to sign up for temporary access. The user was then sent to a webpage requesting sensitive information. The other spoof redirected the user to a form when they attempted to log in with CardSpace, stating the participant must register for an identity card by entering personal information. Since the user had previously logged into the same institution's website with an identity card, this should be alarming.

3.1.1 Data Collected

We recruited a total of 13 participants to evaluate CardSpace, 4 female and 9 male. Their ages ranged from 18 to 60, and each of them spends 20 or more hours per week on the Internet.

Of the eight tasks given in a session, the four tasks prior to the CardSpace instructional email directed the participant to: a real website with CardSpace, a spoofed site that asked the participant to register for a new identity card, a real website with CardSpace, and a spoofed website that stated CardSpace was down for maintenance. The four remaining tasks were in the same order, and the first of the second set of four was also the instructional email.

In the first task, on the real website, 12 of the 13 participants reported some level of confusion. However, 6 participants stated in the post-study questionnaire that CardSpace was intuitive, despite commenting they were confused after the first task. Their comments after the first task include: "It took me a little while to know where I was suppose to click", "I wasn't asked to enter account info which was suspicious", "I understood the task, but the first thing I should have been asked is my account id and password", "when I clicked on the log-in tab I was a little confused, I am accustomed to seeing a user ID box and then some type of password, but what popped up were cards, there seems to be some serious security lapse", and "when I clicked login something irrelevant came up but when I hit ok I was logged in".

In the first spoof, when participants were told to register for a new card, 11 participants completed the task without noticing anything suspicious. One person realized they had been tricked when they were redirected to the real site, but this was after entering personal information. In the fourth task, when the spoofed website stated CardSpace was "down for maintenance", the same 11 participants fell for the spoof. The participant who realized they were spoofed previously recognized the spoof without entering any information.

After reading the instructional email, two participants reported they were still confused when completing the task that followed on the real site. By the fourth task, the second task on a real website, no one commented they were confused.

After reading the instructional email, 3 of the 13 participants did not fall for the spoof that occurred immediately after (a request to register for a new card). One participant was previously an identity theft victim, and the other two participants cited the instructional email as their reason for not completing the task. The final spoof stated CardSpace was down for maintenance, and 3 people were not tricked by it. The identity theft victim was not tricked, and one person cited the instructions. However, one person who cited the instructions in the previous spoof fell for the second spoof. The other person remarked the site didn't look right in general, so they refused to complete the task. In the post-study questionnaire, four people said the CardSpace login procedure required slightly too much time and six reported the amount of time required by CardSpace was just right.

Overall, the participants were mostly confused that CardSpace took the place of a username and password. The spoof that stated CardSpace was down for maintenance tricked 11 of the 13 participants, and after the instructional email the same spoof tricked 10 of the 13. The spoof that asked the participant to register for another card was successful on 11 of the 13 participants, and after the instructional email it successfully tricked 10 of the 13.

3.2 Verisign Secure Letterhead

Secure Letterhead assists the user by displaying security context information interactively and more prominently in the web browser's primary interface. The implementation we tested was a Firefox extension that displayed the logotype and certificate authority fields from an extended validation X.509 certificate [8, 9]. When the user navigates to a website that has an extended validation certificate, the logotype is displayed in the upper right hand corner of the browser next to the location bar. The extension extracts relevant fields from the SSL certificate, which can be difficult for users to find and understand, and displays them as an interactive visual indicator. More certificate information is shown when the user clicks on the logotype.
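
Secure Letterhead does its work inside the browser, on the extended validation certificate itself; purely as an illustration of the kind of certificate fields involved, the sketch below pulls the subject organization and issuer from a live TLS connection using Python's standard library. It does not parse the certificate's logotype extension (RFC 3709), and the host name in the usage comment is a placeholder.

```python
import socket
import ssl

def certificate_identity(host, port=443):
    """Fetch the subject and issuer fields that an indicator like
    Secure Letterhead surfaces for the user.  Illustrative only: the
    real extension works inside the browser on the EV certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port)) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    subject = {k: v for rdn in cert["subject"] for (k, v) in rdn}
    issuer = {k: v for rdn in cert["issuer"] for (k, v) in rdn}
    return {
        "organization": subject.get("organizationName"),
        "common_name": subject.get("commonName"),
        "issued_by": issuer.get("organizationName"),
    }

# Example (hypothetical host):
# print(certificate_identity("bank.example.com"))
```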

3.2.1 Data Collected

We conducted five sessions with Secure Letterhead, two female and three male. Their ages ranged from 18 to 50, and all reported spending 20 or more hours per week on the Internet. Only one of the users demonstrated prior knowledge of phishing.

The four tasks prior to the instructional email that detailed how to use Secure Letterhead directed the user to a real site, a fake site where no additional information was displayed, a real site, and a fake site spoofing Secure Letterhead with CSS and HTML. The four remaining task emails were in the same order, except the first of the second set was also the instructional email.

Since Secure Letterhead does not alter the login process, the first task went smoothly for the participants; however, no one noticed the presence of the logotype or tried to click on it. The first spoof, where no additional information was presented, tricked all 5 participants. One participant realized they had been tricked and wasn't tricked again during the study. However, Secure Letterhead did not play a role in their realization; instead, they realized they were tricked after logging in and being forwarded to the real login page. Again, the task on the real website went smoothly, but no one noticed the logotype. The second spoof, where the interface was recreated using CSS and HTML, was successful in tricking 4 of the 5 participants.

Every participant attempted to access the information in the logotype after reading the instructional email. One commented the instructions were incorrect and the logo was on the left, indicating they clearly misunderstood which logo the instructions referred to and mistook an image in the webpage content for the logotype. Another did not complete the task and wrote they didn't have a reason to trust the information it displayed. Another commented the instructions were too wordy, then during the task attempted to access the information by clicking on the logo in the page content. This behavior suggests users are unable to distinguish between page content and trusted indicators, which was also observed by Whalen et al. [15] and Dhamija et al. [6].

The first spoof after the instructional email, where no additional information was displayed, tricked 4 of the 5 participants. In the remaining three tasks, only two attempted to access the information. One of them attempted to access the information after being redirected to the real site, because they had already entered their credentials on the illegitimate site. In other words, they checked the credentials at the wrong point of interaction and had already lost their password. The other one checked the logotype on the real website. The final spoof, with the recreation of Secure Letterhead in content, successfully tricked 4 of the 5 participants.

In the post-study questionnaire, 3 of the 5 participants reported they would remember to check for the logotype if they were about to do something important. 2 of the 5 participants stated they would not have figured out how to use Secure Letterhead without instructions. One person commented the information displayed did not seem useful, and two commented the information might be useful but may not be necessary.

Overall, it appeared as though the participants were unsure of how to interpret the information presented by Secure Letterhead, why it was important, or why they should trust it. These comments were made in the post-task questionnaires after receiving the instructional email and in the post-study questionnaire.

4 Conclusions

This paper presents RUST, a testbed for evaluating the usability of website authentication technologies. The results we present demonstrate the versatility of RUST, its ability to test different types of technologies, and the detailed feedback it collects about why participants are tricked, which would not be possible in an in-the-wild study. Though RUST is not intended to measure how users' performance is affected by time, minor changes can be made to account for this new goal. As an additional benefit, RUST can be used to compare technologies, since the same usability study design can easily be used to evaluate different technologies.

5 Acknowledgments

This work was supported by the Financial Services Technology Consortium. The authors would like to thank the AFIC project participants for their feedback and the UPSEC reviewers for their comments.

References

[1] ADAMS, A., AND SASSE, M. A. Users are not the enemy. Commun. ACM 42, 12 (1999), 40–46.

[2] APWG. Anti-Phishing Working Group. http://www.antiphishing.org.

[3] ARSHAD, F., AND REEDER, R. When user studies attack: Evaluating security by intentionally attacking users. SOUPS Conference Report. http://www.ieee-security.org/Cipher/ConfReports/2005/CR2005-SOUPS.#_Toc110303259.

[4] ATREYA, C., AVIV, A., JOHNSON, M., RAYKOVA, M., BELLOVIN, S. M., AND KAISER, G. RUST: The reusable security toolkit. In submission to SOUPS '08: Proceedings of the Symposium on Usable Privacy and Security (2008).

[5] BROWN, K. Step by step guide to InfoCard. http://msdn.microsoft.com/msdnmag/issues/06/05/securitybriefs/default.aspx.

[6] DHAMIJA, R., TYGAR, J. D., AND HEARST, M. Why phishing works. In CHI '06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (New York, NY, USA, 2006).

[7] FFIEC. Authentication in an internet banking environment. http://www.ffiec.gov/pdf/authentication_guidance.pdf, 2005.

[8] CA/BROWSER FORUM. Extended validation SSL certificates. http://cabforum.org/.

[9] HALLAM-BAKER, P. Secure internet letterhead. In W3C Workshop on Transparency and Usability of Web Authentication (2006).

[10] ISO/IEC. ISO 9241-11: Guidance on usability.

[11] JACKSON, C., SIMON, D. R., TAN, D. S., AND BARTH, A. An evaluation of extended validation and picture-in-picture attacks. In Proceedings of the Usable Security (USEC '07) Workshop (2007).

[12] JAKOBSSON, M., AND RATKIEWICZ, J. Designing ethical phishing experiments: A study of (ROT13) rOnl query features. In WWW '06: Proceedings of the 15th International Conference on World Wide Web (New York, NY, USA, 2006), ACM.

[13] SCHECHTER, S. E., DHAMIJA, R., OZMENT, A., AND FISCHER, I. The emperor's new security indicators. In Proceedings of the 2007 IEEE Symposium on Security and Privacy (Washington, DC, USA, 2007), IEEE Computer Society.

[14] SOUPS 2006. Security user study toolkits. http://cups.cs.cmu.edu/soups/2006/workshop-kits/kits.html.

[15] WHALEN, T., AND INKPEN, K. M. Gathering evidence: Use of visual security cues in web browsers. In GI '05: Proceedings of the 2005 Conference on Graphics Interface (Waterloo, Ontario, Canada, 2005), Canadian Human-Computer Communications Society.

[16] WU, M., MILLER, R. C., AND GARFINKEL, S. L. Do security toolbars actually prevent phishing attacks? In CHI '06: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (New York, NY, USA, 2006), ACM Press.

[17] WU, M., MILLER, R. C., AND LITTLE, G. Web Wallet: Preventing phishing attacks by revealing user intentions. In SOUPS '06: Proceedings of the Second Symposium on Usable Privacy and Security (New York, NY, USA, 2006), ACM Press.

A Windows CardSpace

Figure 1: The CardSpace interface for selecting an identity card.

B Verisign Secure Letterhead

Figure 2: The Extended Validation certificate fields displayed by the Firefox extension when the logotype is moused over.

C Verisign Secure Letterhead Spoof

Figure 3: Verisign Secure Letterhead recreated in the webpage content using CSS and HTML.

D User Study Material

This section provides the questionnaires from RUST's usability study.

D.1 Post-task Questionnaire

The participant is given a post-task questionnaire for each task in the session. Each questionnaire has a set of questions to answer if the participant completed the task and a set of questions to answer if they chose not to complete the task.

D.1.1 Questions for those who completed the task

If you decided to complete the task, please answer the following questions:

1. Please mark the most accurate description of your experience with this task:
   • I knew what to do right away.
   • After a moment I knew what to do.
   • After a few minutes I knew what to do.
   • I completed the task, but I'm not sure I did it correctly.
   • Something was unclear about the task and I was unable to figure out what to do.

2. Additional comments: (e.g., about the task, email, website, an explanation of your previous answer)

D.1.2 Questions for those who did not complete the task

If you decided NOT to complete the task, please answer the following questions:

1. What contributed to your decision to not complete the task?

2. Additional comments: (e.g., about the task, email, website, an explanation of your previous answer)

D.2 Post-study Questionnaire

1. When a site asks you for sensitive personal information (e.g., account information, passwords, social security number), which of the following do you check to ensure you're giving your information to a valid website? (When you think you're on your bank's website, how do you know that you aren't on another site?) Check all that apply:
   • Look for the lock icon
   • Check the URL in the address bar
   • Check the certificate of the page
   • Look at the information on the page (the images, the text, or the brand name)
   • Look for the security or privacy policy
   • Wait for a personalized image or text to be displayed (not available on all websites)
   • Other, please specify.

2. What do you typically do when you receive an email that provides you with a link and asks you to go to the website and sign into your account, to either update your account or verify a transaction? Check all that apply:
   • Click on the link to follow the instructions
   • Type the URL into the address bar to follow the instructions
   • Go to the site using a bookmark to follow the instructions
   • Delete the email
   • I don't receive emails like this at all.
   • Other, please explain.

The goal of the technology evaluated in this study is to give the user a better way of knowing they are on the correct site. During the study you should have also received an email with instructions on how to correctly use [Name of technology].

[Technology]: [Brief description of the technology and how it should work, similar to the instructional email.]

Please refer back to the above description as needed for the remaining questions. If anything is unclear, please ask questions.

1. Prior to reading the instructional email:
   • Was it noticeable that the technology was in place? If so, what was noticeable?
   • Did the technology make it clear how it should be used? If so, please describe.

2. Imagine you were required to use the technology each time you signed in to a financial institution's website. Choose one of the following to describe your feeling about the amount of time it requires:
   • Not enough
   • Slightly too little
   • The perfect amount
   • Slightly too much, but reasonable
   • Way too much
   • Comments:

3. Choose one of the following to describe how often you would remember to follow the process closely:
   • Never
   • When I was about to do something important
   • Most of the time
   • Every time
   • Comments:

4. Choose one of the following to describe your overall impression of the technology:
   • The information it provides is useful, and it seems necessary.
   • The information it provides appears useful, but I'm not sure it's necessary.
   • The information it provides does not appear useful.
   • The information it provides is not useful to me.
   • Other, please specify
   • Comments