A Large-Scale Study of Web Password Habits

WWW 2007 / Track: Security, Privacy, Reliability, and Ethics Session: Passwords and Phishing A Large-Scale Study of Web Password Habits Dinei Florencioˆ and Cormac Herley Microsoft Research One Microsoft Way Redmond, WA, USA [email protected], [email protected] ABSTRACT to gain the secret. However, challenge response systems are We report the results of a large scale study of password use generally regarded as being more time consuming than pass- and password re-use habits. The study involved half a mil- word entry and do not seem widely deployed. One time lion users over a three month period. A client component passwords have also not seen broad acceptance. The dif- on users' machines recorded a variety of password strength, ¯culty for users of remembering many passwords is obvi- usage and frequency metrics. This allows us to measure ous. Various Password Management systems o®er to assist or estimate such quantities as the average number of pass- users by having a single sign-on using a master password words and average number of accounts each user has, how [7, 13]. Again, the use of these systems does not appear many passwords she types per day, how often passwords are widespread. For a majority of users, it appears that their shared among sites, and how often they are forgotten. We growing herd of password accounts is maintained using a get extremely detailed data on password strength, the types small collection of passwords. For a user with, e.g. 30 pass- and lengths of passwords chosen, and how they vary by site. word accounts, the problem becomes not remembering 30 The data is the ¯rst large scale study of its kind, and yields distinct passwords, but rather remembering which of 5 or 6 numerous other insights into the rôlethe passwords play in passwords was used. This appears to be done using a com- users' online experience. bination of memory, pieces of paper, trial and error (trying each of the passwords in turn), and password resets. Since passwords protect accounts with valuable assets Categories and Subject Descriptors they have increasingly been subjected to harvesting attacks. K.6.5 [Management Of Computing And Information Phishing attacks, where a victim is lured into submitting her Systems]: Security and Protection|Authentication password to a malicious site masquerading as a trusted insti- tution have increased enormously in the last few years [11]. General Terms Incidences of keylogging malware, which record keystrokes on a PC have also been rising rapidly. Unlike brute force Security attacks on passwords, both phishing and keylogging harvest strong passwords as easily as weak ones. Thus the nature Keywords of the risk surrounding password authentication has altered greatly. The longstanding problem of users choosing pass- password, authentication, measurements words that are too easily brute forced [12, 6, 3] has been joined by the new problem of users unwittingly revealing 1. INTRODUCTION their passwords in the clear. Passwords play a large part of the typical web user's expe- The convenience of web access to accounts is extremely rience. The are the near universal means for gaining access compelling, and thus the rôlethey play in the average web to accounts of all kinds. Email, banks, portals, dating and users life seems likely to increase. However we ¯nd that ¯rm social networking sites all require passwords. So important data on users' actual password habits is hard to come by. It are they that HTML has a special form ¯eld to allow for is conventional wisdom that users choose weak passwords, the special treatment they require, and an important rôleof frequently re-use passwords across multiple sites, and often SSL is protecting the secrecy of passwords from observers of forget them. In this paper we report on a large scale study of the connection. web users habits where we measured and report these and Alternative to passwords certainly exist. Hardware au- other patterns for the ¯rst time. We obtained data from thentication, e.g. [1], is sometimes used for access to corpo- over half a million users over a period of three months. This rate networks. However, this requires an issuing authority is more than 100 times more participants than any previous and seems to be limited to environments that justify the study we are aware of. cost, such as in the employer-employee relationship. Chal- Among our interesting ¯ndings is how large a rôleweb lenge response authentication has the advantage that ob- passwords play in users lives. The average user has 6.5 serving a single successful sign in does not allow an attacker passwords, each of which is shared across 3.9 di®erent sites. Copyright is held by the International World Wide Web Conference Com- Each user has about 25 accounts that require passwords, and mittee (IW3C2). Distribution of these papers is limited to classroom use, types an average of 8 passwords per day. That users choose and personal use by others. weak passwords has been known informally for some time; WWW 2007, May 8–12, 2007, Banff, Alberta, Canada. ACM 978-1-59593-654-7/07/0005. 657 WWW 2007 / Track: Security, Privacy, Reliability, and Ethics Session: Passwords and Phishing we are able to measure exactly how weak. Users choose PRE Report: this contains: passwords with an average bitstrength 40.54 bits. The over- ² the current (primary) URL whelming majority of users choose passwords that contain ² all of the URLs previously associated with the pass- lower case letters only (i.e. no uppercase, digits, or special word (secondary URLs) characters) unless forced to do otherwise. We were able to ² time since last login at each URL previously associated measure that 0.4% of users type passwords (on an annual- with the password ized basis) at veri¯ed phishing sites, and at least 0.2% of ² time since ¯rst login at each URL previously associated users actively maintain their own router. Finally users for- with the password get passwords a lot: we estimate that at least 1.5% of Yahoo ² the password strength users forget their passwords each month. ² number of entries in the PPL, and number of PREs In the next section we cover details of the client and the ¯led by client data gathered. In Section 3 we present our results, broken ² number of unique passwords used by this client into logical sections. In Section 4 we discuss related work. ² the age of the client. 2. EXPERIMENTAL METHOD The format of the report is Our client software shipped as a component of one skew [Up; fsU0; sU1; ¢ ¢ ¢ ; sUN¡1g; ft0; t1; ¢ ¢ ¢ ; tN¡1g; of Windows Live Toolbar. Not all toolbar users received the f¿ ;¿ ; ¢ ¢ ¢ ;¿ g; PwdStr, PPLSz, NPREs, NPwds, CAge]: component. The component was optional, and users were 0 1 N¡1 presented with an opt-in agreement. The toolbar was ¯rst Suppose for example that a user has a password that is used available for download on the Microsoft web on 7/24/2006, at PayPal, Yahoo, eBay and YouTube. The ¯rst time the and a total of 544960 clients received, opted in and activated password is typed (say at eBay) it will be added to the PPL, by 10/1/2006. and no report made to the server. This password can then be typed at eBay over and over and will generate no PRE 2.1 Client Implementation reports and no additions to the PPL. The next time it is The client consists of a module within the toolbar that typed at a site other than eBay (say Yahoo) a PRE report monitors and records Password Re-use Events (PRE's). It will be sent listing www.yahoo.com as the primary URL Up contains the following main components. and www.ebay.com/login as the secondary sU0: Now typing HTML password locator: this component scans the it at PayPal will cause a PRE report listing www.paypal. document object model in search of ¯lled-out password com as Up and www.yahoo.com and www.ebay.com/login as ¯elds, and extracts the passwords. The ¯rst task merely sU0; sU1: Observe that neither the password, nor its hash involves searching the HTML for ¯elds declared are sent in the report. There is no personally identifying inputtype="password" information in the report. and extracting the value ¯eld. This search is initiated every Note: The reason that we perform the realtime password time the browser BeforeNavigate2 event occurs. Thus we check is that we wish to be sure that we catch every Pass- ¯nd completed password ¯elds before they are sent to the word Re-use Event. If a user enters a password at URL A it server. Once the password is found it is hashed and added will be entered in the PPL by the HTML password locator. to the Protected Password List (PPL). However it is possible that the password could be typed at Protected Password List: This list contains the pass- another site that does not use a HTML password ¯eld. We word hash, the full URL of the receiving server, the bit- wish to capture and report any case where a previously used strength of the password, the current time, and minutes password is typed at another site. since both the ¯rst and last time (if any) that password was sent to that server. All of the information in the PPL is 2.1.1 Privacy stored using the Data Protection API (DPAPI) provided by A number of measures were taken to protect the privacy Windows [14] (the same API that is used to protect pass- of those who opted in. No Personally Identifying Informa- words that Windows stores).

A Large-Scale Study of Web Password Habits

Online Visualization of Geospatial Stream Data Using the Worldwide Telescope

The Fourth Paradigm

WWW 2013 22Nd International World Wide Web Conference

Denying Extremists a Powerful Tool Hany Farid ’88 Has Developed a Means to Root out Terrorist Propaganda Online

Microsoft .NET Gadgeteer a New Way to Create Electronic Devices

The World-Wide Telescope1 MSR-TR-2001-77

Web Search As Decision Support for Cancer

Microsoft Research Lee Dirks Dr

Transcript of Rick Rashid's Keynote Address

Microsoft Research Worldwide Telescope Multi-Channel Dome/Frustum Setup Guide

Time-Travel Debugging for Javascript/Node.Js

Machine Learning for Kinect