
Improving Usability and Testing Resilience to Spoofing of Liveness Testing Software for Fingerprint Authentication

Joshua Smith, with advisor Dr. S. Schuckers
Honors Program Thesis Proposal
March 7, 2005

Objective: To improve the speed and usability of existing software that assesses liveness as an anti-spoofing measure in fingerprint biometric devices, and to use the improved software to experimentally test the resilience of liveness algorithms to spoofing attempts. The current implementation of this software is relatively slow (15-20 seconds per fingerprint processed) and is configured to process a large quantity of fingerprints at once. The improved software will reduce the total time from capture of a fingerprint to feedback to the user, which enables faster experimentation and greater practicality for commercial applications.

Introduction: In today's high-speed, internet-enabled world, millions of transactions occur every minute, from bidding on an item in an online auction to a company negotiating prices with a contractor. For these transactions, data needs to be readily available to the people who are meant to have access and kept secure from those who are not. Data, the intangible product of the networked world, has become a near equivalent of currency; it holds company trade secrets, consumer credit card numbers, and confidential military information. Keeping data from the wrong people is in everyone's best interest.

A common method used to keep data from falling into the wrong hands is the password. Passwords are typically used in a challenge-and-response context: a user is prompted to identify himself to the system he is trying to access (typically with a login name) and to supply the password associated with that identity. This process is based on knowledge or possession; if one knows or has the password, one is granted access. With this structure, anyone can gain access to data if they are given (or can guess) the right information. In addition, it places a heavy burden on users of the system: they must create good (long, "un-guessable") passwords for each system they access, and remember them all.

A second method of securing data and systems is the use of biometrics. Biometrics is defined as the "automated use of physiological or behavioral characteristics to determine or verify identity."1 For example, fingerprints, iris, face, or hand geometry can be used to authenticate a person. Biometrics shifts the burden of knowledge or possession off the user and places it on the person's physical or behavioral characteristics: access becomes a matter of "something you are" rather than "something you possess."2 This ties access to data directly to a person's identity rather than to what that person knows.

The difference between the two approaches is illustrated by the following example. Say John Smith is a user of his company's online paycheck system, which allows users to access financial information after supplying a password. When John picked a password he wrote it on a sticky note and stuck it underneath his desk, since he already had too many passwords to remember. When one of the night-shift workers accidentally discovered the note, that worker could easily pose as John and gain access to his online paycheck information. As far as the authentication system is concerned, a user claimed to be John Smith and presented his password, therefore it must be him. If the paycheck system instead used biometric data for verification, there would be no need for a password. After enrollment in the system, John would simply need to present the required biometric data (a fingerprint, iris, or signature) to gain access. Since biometric data is unique among individuals (discussed later), he would not have to worry about other users accessing his account. With a biometric system, people attempting to gain access cannot guess (or learn) something that will give them access; only users who have been enrolled in the system are granted access, after presenting their biometric data and being verified.
This raises an obvious question: can one "spoof" a biometric system into allowing access? Methods such as presenting play-doh or gelatin fingers bearing the same features as an enrolled identity have been shown to allow access for unauthorized users. To reduce the chance of unauthorized access, authentication methods need to ensure that the biometric data is being presented by a live, authorized person who wishes to be authenticated. To address the first of these issues, methods have been proposed to detect the liveness of a person in a non-invasive manner.3,4 These algorithms have been shown to accurately distinguish live, cadaver, and spoof fingerprints. However, a question has been raised as to whether the liveness detection, itself an anti-spoofing mechanism, can be spoofed.

In summary, using biometrics in systems provides a high degree of certainty about a person's identity. Additionally, since biometric data cannot be shared, there is greater accountability for the state of protected data. This confidence and accountability leads to more security, resulting in cost savings and reduced risk of financial loss for individuals and companies. However, the potential exists to exploit biometric systems through "spoofing." The purpose of this research is to investigate the resilience of an existing liveness detection algorithm, itself an anti-spoofing measure, to spoofing attempts.

Background: Biometric Systems – Types and Process

All biometric systems fall under one of two categories familiar to those involved in security systems: identification and verification. The process, applications, and challenges are unique to each category because of the system-level differences between them. An identification system is sometimes referred to as "1:N matching" because a user presents biometric data and the system attempts to determine whether the user is enrolled and, if so, who the user is. A verification system is referred to as "1:1 matching" because a person makes a claim to an identity, presents biometric data, and the system compares the presented data only against the data on file for the claimed identity. A helpful way to distinguish the two is by the question each answers: in identification, "Who am I?"; in verification, "Am I who I claim to be?"

The process of using a biometric system is designed to be as transparent as possible. To understand what occurs during verification or identification, a few sub-processes need to be defined. The overall procedure is shown in Figure 1.

Figure 1 - Biometric system data flow: presentation and enrollment, feature extraction, template matching and vitality detection against the stored template, and output generation

Presentation is the step in which the user physically presents the biometric data required for capture, such as a fingerprint, iris, or hand, to the system.

Enrollment is the process by which a user is initially registered for access to a system. It requires the user to present his or her biometric data (fingerprint, iris, hand, etc.) so that a template can be formed in the system. This template serves as the basis for comparison when the user attempts to gain access at later times. Since the template will be used many times in the future, the quality of the biometric data acquired during this stage is critical. This stage of using a biometric system can be the most tedious.

Feature Extraction is an automated process of locating and encoding distinctive characteristics from biometric data to generate a template. For fingerprints, a common method of feature extraction is based on minutiae: ridge endings and ridge "Y's" (bifurcations) are located and recorded with their location and direction, relative to other distinctive features on the print. An example fingerprint with marked ridge endings and bifurcations is shown below in Figure 2.

Figure 2 - Ridge endings and bifurcations (http://www.east-shore.com/tech.html)

Templates are the output of feature extraction and are the saved identity of a person on a system. The original biometric data cannot be reproduced from the template; the template contains only the information relevant for differentiating between individuals, such as the location and direction of the ridge endings and bifurcations mentioned previously. The size of a typical template (this varies from implementation to implementation) is very small, on the order of a kilobyte or less, whereas the original image can be a few hundred kilobytes, depending on the capturing technology and resolution.
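As a rough illustration of the idea (not the template format used by the actual research software, which is not described here), a minutiae-based template can be pictured as a short list of feature records; the field names and values below are invented for the sketch:

```matlab
% Hypothetical minutiae template: a struct array of feature records.
% Field names and values are illustrative only.
template(1) = struct('x', 112, 'y', 87,  'angle', 45,  'type', 'ending');
template(2) = struct('x', 140, 'y', 153, 'angle', 210, 'type', 'bifurcation');
template(3) = struct('x', 66,  'y', 201, 'angle', 120, 'type', 'ending');

% A few dozen such records occupy well under a kilobyte, while the raw
% image they were extracted from can run to hundreds of kilobytes.
```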

Matching is the process in which the template generated from the presented biometric data is "matched" against either the user's template in the system (in the case of verification) or the templates of all users of the system (in the case of identification). It is important to note that after a user enrolls in a system, later presentations of biometric data will rarely, if ever, produce exactly the same template. This requires that a system threshold be set for the greatest difference that can exist between the enrollment template and a newly generated template for the two to be considered a match. This tolerance is critical for seamless system operation (a low false non-match rate), but the threshold should not be set too high, which would reduce the system's security and increase the false match rate.
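A minimal sketch of the two matching modes and the role of the threshold, assuming a hypothetical templateDifference function and a cell array of enrolled templates (none of these names come from the proposal's software):

```matlab
% Hypothetical matching sketch. templateDifference is assumed to return
% a dissimilarity score (0 = identical); db is a cell array of enrolled
% templates. Each function would live in its own .m file.

% Verification (1:1): compare against the claimed identity's template only.
function accepted = verifyClaim(presented, claimedID, db, threshold)
    accepted = templateDifference(presented, db{claimedID}) <= threshold;
end

% Identification (1:N): search every enrolled template for the closest match.
function bestID = identify(presented, db, threshold)
    diffs = cellfun(@(t) templateDifference(presented, t), db);
    [smallest, bestID] = min(diffs);
    if smallest > threshold
        bestID = 0;   % no enrolled identity is close enough
    end
end
```

Raising the difference threshold tolerates more variation between captures (fewer false non-matches) but also admits more impostors (more false matches), so the operating point balances convenience against security.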

Biometric Data: Biometric data differs from a password, which can be guessed or changed, because it relies on a physical or behavioral characteristic of a person. For a biometric system to function well, the characteristic used must be such that all users of the system can be uniquely identified. Fundamental and secondary qualities are listed below.

Fundamental Qualities
Universality – must be a trait that can be taken from many people
Uniqueness – unique per person; the quality must not occur in two different individuals
Permanence – the quality must be constant over time (eliminates the need for re-enrollment)
Collectability – the characteristic must be able to be measured quantitatively

Secondary Qualities
Performance – how well the biometric balances the various requirements of the system
Acceptability – how willing users are to present the biometric data
Circumvention – how easy it is to fool the system

Fingerprints: There are numerous possible candidates for biometric data, each with strengths and weaknesses in the qualities outlined above. Common biometrics include fingerprint, eye, hand, face, voice, and signature. Since fingerprints are proposed for use in this research, their qualities are explored in detail. The fingerprint is the oldest and most widely used biometric for identity verification. This is because fingerprints have strong fundamental qualities: nearly everyone has distinguishable fingerprints (except for those without fingers, or those with certain skin diseases), and fingerprints are unique from person to person and from finger to finger, offering up to ten unique prints per person. Fingerprints are formed during embryonic development and, once formed, have a high degree of permanence over the course of a person's life. Fingerprints are easily captured using various non-invasive techniques, including capacitive AC, capacitive DC, optical, and opto-electric sensors.3 Due to the high degree of uniqueness among fingerprints and the accuracy and ease with which they can be measured, fingerprints are a good choice of biometric data. Equipment being used by the researcher includes the Enthentica, Secugen, and Precise fingerprint scanners.

Current issues with the use of fingerprints: While biometric systems can offer greater levels of security, various attacks exist to gain unauthorized access to a system protected by biometric authentication. One such attack occurs at the sensor level: the presentation of an artificial biometric sample.3 For a system that uses a fingerprint as its biometric data, multiple groups have found that "gummy fingers" (artificial fingers made from gelatin) can spoof a biometric system. One study found that it was possible to create a gummy finger from a latent fingerprint, enroll it in the system, and then verify with the same gummy finger against a live enrolled template.5

Methods have been proposed to make spoofing biometric systems more difficult. The method considered here is the determination of liveness. Determining whether a person is live when presenting biometric data to a system can be difficult to automate in a fashion that is acceptable to users and feasible to implement. Many methods exist, such as temperature sensing, detection of a pulse in the fingertip, pulse oximetry, electrocardiogram, dielectric response, and impedance.3 Each of these methods has its own challenges in being automated and integrated into systems as transparently as possible. For example, the extra equipment required to perform some of these tests, such as an electrocardiogram, can be expensive and inconvenient for the user. A method that requires no extra equipment has been proposed by researchers at West Virginia University and Clarkson University. The foundation of this technique is detecting active perspiration while biometric data is being presented: a live finger will show a temporal change in its reading, whereas an artificial (non-live) finger will show no such change. Liveness detection then acts as a secondary filter to authentication: even if the fingerprint is verified, the finger must additionally be detected as "live."

Liveness Detection: This is a summary of the current methods used by the research team at Clarkson University for liveness detection. The full description of the process can be found elsewhere.

The method for determining whether a sample presented for authentication is live is based on three assumptions. First, for live fingers, perspiration starts from the pores on the fingertip; a pore therefore appears either completely covered with perspiration or as a dry spot surrounded by a sweaty area. Second, sweat diffuses along the ridges over time, so the pore region remains saturated while moisture spreads to drier parts. Third, perspiration does not occur in cadaver or spoofed (gummy) fingers. The first assumption provides the rationale for the Static Measures (SMs) made by the liveness algorithm; the second is the basis for the Dynamic Measures (DMs). The measures are described briefly below.

SM: For live fingerprints there is roughly a 10-pixel peak-to-peak distance along a ridge, corresponding to the pore-to-pore spacing; cadaver and spoof prints do not show this. To quantify the difference, the average Fourier transform of the capture at t = 0 s is taken and the energy related to the typical pore spacing is used. The energy reading of a cadaver or spoof print is therefore low compared to a live print.

DM1: Total swing ratio of the first to the last fingerprint signal. This compares how much "fluctuation" there is in the first capture relative to the last capture. The first capture should have greater variations in grey levels, because sweat has not yet had time to diffuse and there are more distinct moist and dry areas.

DM2: Min/max growth ratio of the first and last fingerprint signals. This compares the max/min signal level ratio. For live fingerprints the maximums should not increase, because the pores are already saturated, so this ratio should be greater for a live finger than for a cadaver or spoof sample.

DM3: Mean of the last-minus-first fingerprint signal difference. This subtracts the first ridge signal from the last. The difference will be greater for a live finger (corresponding to the perspiration pattern) than for a non-live finger.

DM4: Percentage change of the standard deviations of the first and last fingerprint signals. This is similar to the other measures: if the ridge signal fluctuation is decreasing with respect to the mean, this measure will increase.

DM5: Rate of low cut-off region disappearance. The higher this measure, the faster dry saturation is disappearing.6

DM6: Rate of high cut-off region disappearance. The higher this measure, the faster wet saturation is appearing.
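As a hedged illustration of how two of these measures might be computed from a one-dimensional ridge signal, the sketch below shows an SM-style pore-spacing energy and DM3; the function name, band limits, and normalisation are assumptions for the sketch and are not taken from the research software:

```matlab
% Illustrative sketch only: the real measures operate on ridge signals
% produced by the feature extractor, and the exact band limits and
% normalisation used by the research software are not reproduced here.
function [sm, dm3] = livenessMeasuresSketch(ridgeFirst, ridgeLast)
    % ridgeFirst, ridgeLast: grey-level signals sampled along the same
    % ridges in the first (t = 0 s) and last captures.

    % SM-style measure: energy in the spatial-frequency band that
    % corresponds to a roughly 10-pixel pore-to-pore spacing. A live
    % finger shows a peak here; cadaver and spoof prints do not.
    N = numel(ridgeFirst);
    spectrum = abs(fft(ridgeFirst - mean(ridgeFirst))).^2;
    poreBand = round(N/12):round(N/8);   % assumed band around a 10-pixel period
    sm = sum(spectrum(poreBand)) / sum(spectrum(2:floor(N/2)));

    % DM3: mean of (last - first) ridge signal. Perspiration spreading
    % along the ridges makes this difference larger for a live finger.
    dm3 = mean(ridgeLast - ridgeFirst);
end
```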

Research Definition: This research proposes work in two related areas of liveness detection: first, to improve the current software implementation so that it has a more streamlined interface and can process data and produce results quickly; second, to use the improved software to determine whether the liveness detection algorithm can itself be "spoofed," that is, whether the algorithm developed to defeat spoofing attempts with "gummy fingers" can be spoofed with additional measures. These two areas are described in more detail below.

Software Improvement: The current implementation of liveness detection is slow in multiple respects. For the user to get a response on the liveness of a presented sample, the user must capture the data, run the feature extractor, and then run the liveness algorithm, which uses the output of the feature extractor to produce a result. Each of these steps requires a separate program, and data must be manually moved from folder to folder so that the next program can process it. Additionally, the time it takes to process the data (not including the time to move data between folders) is long, up to 20 seconds per fingerprint. It has been suggested that in commercial applications there should be a response for users within 5 seconds. To decrease the overall time it takes to process data, multiple approaches will be taken:

- The algorithms are currently implemented in MATLAB 7; translating them into C/C++ code could offer speed-ups in some areas.
- Determine which areas of the code are slowest using the MATLAB profiler (a sketch of this step follows the list), and whether algorithmic speed-ups can be made in MATLAB or other languages.
- Integrate capture, feature extraction, and vitality detection into one interface with a common data area, so that the user is not required to move data among folders manually. This includes developing a graphical user interface for ease of use.
- Diversify the data processing modes. The current implementation only allows batch processing of multiple fingerprints; being able to capture and immediately receive feedback, or to capture many fingerprints and then run a batch process, would both be useful.
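A minimal sketch of the profiling step, assuming a placeholder entry point runLivenessOnCapture that stands in for the existing processing scripts (the real scripts are FPVitality and Outcomes, whose calling conventions are not shown here):

```matlab
% Profile one liveness run to find the slowest functions. The call
% below is a placeholder; only the profiler commands are standard MATLAB.
profile on
runLivenessOnCapture('sample_fingerprint.bmp');   % hypothetical entry point
profile viewer                                    % open the interactive report

% Or inspect the results programmatically:
p = profile('info');
[~, order] = sort([p.FunctionTable.TotalTime], 'descend');
top = order(1:min(5, numel(order)));
disp({p.FunctionTable(top).FunctionName}');       % most expensive functions
```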

In order to achieve these goals, the steps of the software product life cycle will be followed to clearly develop a plan for the software. The steps of the life cycle are briefly outlined below:

1. Requirements Phase: "What is needed?" The needs of the end product are explored and refined in this phase. What is needed by the biometrics research team for this and future projects will be determined, and ways that the current software product can be improved to better meet these needs will be considered.
2. Specification Phase: "What will the product do?" With the needs of the final product determined, these needs will be analyzed and formed into a specification document, which outlines what the product will do in a specific manner. This is the developer's chance to outline what he thinks the final product features will be and to receive feedback from the research team.
3. Design Phase: "How will the product do what it needs to?" This step requires the developer to break the product down into components, determine how they interact, and then develop plans for each of the components (modules) needed to make the final product. The resulting documents provide a detailed set of plans outlining how the product will do what it is supposed to do. During this phase, test plans will also be developed for the components, to determine whether they work correctly once the design is implemented.
4. Implementation Phase: Coding and testing of the modules planned in the design phase. Each module will be debugged and tested against the test plan formulated in the design phase.
5. Integration Phase: "Putting it together." Tested modules are merged into a whole product and tested as a whole. The final software will be presented to the biometrics research team for acceptance testing.

For this research, a product already exists that supports processing batches of fingerprint data, so the fundamental requirements for the improved program are already demonstrated by the current implementation. However, the current product will be assessed and reviewed beginning at those original requirements to determine how it might be improved. This process "front-loads" much of the development work, putting significant time into determining what is required and how best to implement it.

Spoofing the Liveness Detection Algorithm: As noted previously, methods such as "gummy fingers" have been found to "spoof" biometric systems. Knowing the methods and algorithms that the liveness detection software uses, techniques could be developed that exploit the way liveness is determined. For example, since the algorithm looks for a temporal change in the fingerprint, varying the pressure of a gummy finger during capture, or sprinkling the finger with water prior to capture, could possibly cause the algorithm to produce a "live" response. Whether these techniques can fool the liveness algorithm is unknown and is the subject of this task. Techniques will be investigated by capturing "spoofed" data and processing it to generate a response. After various spoofing techniques are tested, successful attempts will be documented so that the corresponding weaknesses in the algorithm can be studied and corrected, eliminating these false accepts. That is, once possible spoof methods are discovered, they will be documented and then tested using ten previously collected casts; each documented method will be tested across all ten casts.
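A sketch of how these spoofing experiments could be organized, with placeholder functions captureSpoof and runLiveness standing in for the actual capture and processing software; the technique labels come from the examples above:

```matlab
% Hypothetical outline of the spoofing experiment: each documented spoof
% technique is applied to each of the ten previously collected casts and
% the liveness output (-1 = not live ... +1 = live) is recorded.
% captureSpoof and runLiveness are placeholders, not the names of the
% actual research scripts.
techniques = {'varied_pressure', 'water_sprinkled'};
numCasts   = 10;
results    = zeros(numel(techniques), numCasts);

for t = 1:numel(techniques)
    for c = 1:numCasts
        img = captureSpoof(techniques{t}, c);  % capture cast c with technique t
        results(t, c) = runLiveness(img);      % liveness score for the capture
    end
end

% A score near +1 for a spoof sample would indicate a successful spoof and
% would be documented so the algorithm's weakness can be studied.
save('spoof_results.mat', 'results', 'techniques');
```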

Preliminary Progress: Research thus far has consisted of cleaning up the MATLAB code and investigating methods for speeding up the existing code, such as compiling the MATLAB code or using C/C++ functions within MATLAB. All work has been conducted on a copy of the original program so that quantifiable comparisons can be made at various points in development. Three separate programs are currently used for research in liveness detection. First, a capture program is used to obtain a raw fingerprint image from one of three capture devices; depending on the device being used, different capture software is needed. After a fingerprint is captured, the data must be placed in the correct directory so that feature extraction and the measures (SM and DM1-6) can be computed. The current script that does this, FPVitality, is best suited to batch processing, where multiple capture devices may be used for a single fingerprint. After the BatchProcess is run in FPVitality, measures are generated and stored in *.mat files. These must then be manually moved to another working directory so that a separate MATLAB script, Outcomes, can interpret the output and produce a result from -1 to 1 (not live to live). Figure 3 below shows the data processing and user interaction in the current implementation of the software.

Biometric data presented → capture → captured data moved to FPVitality → measures generated → measures data moved to Outcomes → final result (live / not live)

Figure 3 - Current data flow and user interactions

Preliminary work on this process has consisted of understanding how it works, what input is required, and what output is obtained. As previously mentioned, the current software implementation is slow, taking on the order of a minute to complete one fingerprint from input to output, including processing time and data movement. It is believed that this is due partly to the nature of the code (inefficiency) and partly to the required user interaction. As outlined in the section on software improvement, the current implementation will be revisited from the beginning, in consultation with the researchers, to determine how best to meet the needs of the team. Any additional desired features will need to be specified, and the non-essential portions of the code eliminated.

Preliminary list of software requirements:
1. Various modes of operation, including "single shot" (one capture with on-screen results) and "batch" (multiple captures with displayed and saved results); a sketch of the single-shot mode follows this list.
2. One interface that allows the user to capture, process, and view results.
3. Software written so that it can be easily integrated with other software (such as authentication software).
4. Minimal action required by the user; data flow should be handled automatically by the software.
5. Options available to the user to see and save the intermediate measures data in addition to the final output.
6. Results presented in a responsive manner, conducive to performing repeated trials to test the algorithm's resilience to spoofing.
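A minimal sketch, under assumptions, of what the single-shot mode could look like once capture, feature extraction, and liveness interpretation share one interface; every function name here is a placeholder standing in for the existing components (the capture software, FPVitality, and Outcomes), not their actual interfaces:

```matlab
% Hypothetical single-shot pipeline: one call from capture to result,
% with no manual movement of files between directories. All called
% functions are placeholders for the existing capture, FPVitality,
% and Outcomes components.
function [decision, measures] = singleShot(deviceName, saveIntermediate)
    img      = captureFingerprint(deviceName);   % capture from the scanner
    measures = extractMeasures(img);             % SM and DM1-DM6
    score    = interpretMeasures(measures);      % -1 (not live) to +1 (live)

    if score > 0
        decision = 'Live';
    else
        decision = 'Not live';
    end

    if saveIntermediate
        % Requirement 5: optionally save intermediate measures data.
        save(fullfile(tempdir, 'last_measures.mat'), 'measures', 'score');
    end
end
```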

Timeline:

End March 2005 – Receive/develop a set of requirements for the improved software
End April 2005 – From the requirements, develop specifications; verify with the team
Summer 2005 – Design the software based on the specified requirements
Early Fall 2005 – Module implementation and testing; integration testing
Fall 2005 – Develop possible spoofing techniques
Winter 2005 – Use the implemented software to test the proposed spoofing techniques
January-February 2006 – Time for software revision and additional spoofing techniques
March 2006 – Thesis draft done; develop presentation
April 2006 – Presentation; thesis final draft

References:

1. S. Nanavati, M. Thieme, R. Nanavati. Biometrics: Identity Verification in a Networked World. John Wiley and Sons, Inc., 2002.
2. A. Jain, R. Bolle, S. Pankanti. Biometrics: Personal Identification in Networked Society. Kluwer Academic Publishers, 1999.
3. R. Derakhshani, S.A.C. Schuckers, L. A. Hornak, and L. O'Gorman. "Determination of vitality from a non-invasive biomedical measurement for use in fingerprint scanners." The Journal of the Pattern Recognition Society, No. 36, 2003, pp. 383-396.
4. S.A.C. Schuckers. "Spoofing and Anti-Spoofing Measures." Information Security Technical Report, Vol. 7, No. 4, 2002, pp. 56-62.
5. T. Matsumoto, H. Matsumoto, K. Yamada, S. Hoshino. "Impact of Artificial 'Gummy' Fingers on Fingerprint Systems." Proceedings of SPIE, Vol. 4677, January 2002.
6. R. Derakhshani. "Perspiration Detection Program's Quick Guide for the New Enhanced Feature Extractor." Center for Identification Technology (CITeR), Lane Department of Computer Science and Electrical Engineering, West Virginia University.
