Future of Device Fingerprinting
WHITE PAPER: Future of Device Fingerprinting

Copyright © Simility 2017

Contents
Introduction
Limitations of Traditional Device Fingerprinting
Device Fingerprinting Evolution
How Simility’s Device Recon Works
Individual Use Cases

Today’s device fingerprinting technology has been made significantly more effective at fighting online fraud with advanced techniques such as fuzzy matching, clustering and predictive modeling.

Introduction to Device Fingerprinting

For over a decade, a crucial part of fraud detection in the virtual world has been assigning an identity to every laptop, tablet and mobile device that accesses a website or app. Such an identity, often referred to as a device fingerprint or device ID, is a representation of hundreds of different device-specific values taken from an end user’s device. As in the real world, a fingerprint aids in the identification and tracking of bad actors.

As a first step in any fraud attempt, fraudsters try to hide their identity or pretend to be many different people when they are, in fact, just one person. For example, a person who owns a local barber shop may want to publish many fake positive reviews of his own business, and negative reviews of his competitors’ businesses, on a review app. He will obviously use a fake name, because his customers and competitors know his business. The review site probably requires a different email address for each account, so he might enter fake email addresses when setting up multiple fake accounts. Perhaps the review app even requires each account to come from a different IP address and to register a new phone number. These hurdles make it a little more difficult for a fraudster to set up fake accounts, but they can all be easily circumvented.

This is where device fingerprinting comes in.
The one thing fraudsters cannot spoof is the fact that they are accessing the app from the same mobile device. It would be prohibitively expensive to fill a basement with enough different devices to look like a group of legitimate users leaving reviews. So if the app can detect a fingerprint from every device that opens it, it can assign a unique identity to each of its users, even when fraudsters are trying to hide their true identity.

Luckily, when a computer, tablet or mobile device accesses a website or app, it passes hundreds of signals about itself in order to support the customer experience. For example, the browser language tells the app in which language to display text, the horizontal or vertical orientation of the phone tells it how to lay out the user interface, and the timezone setting allows the app to show the user’s local time. The number of possible combinations of these signals is nearly infinite, so each device has an effectively unique set of them. To the app, a unique user is the unique set of signals from that particular device.

Limitations of Traditional Device Fingerprinting

Traditional device fingerprinting was an impressive technology, but it stopped being effective several years ago because of the following key limitations:

1. Good users inadvertently change their signals regularly. A user might change the signals on his phone multiple times per month, for example when he updates his operating system or changes a setting in his browser. With traditional technology, his device fingerprint becomes completely new every time he does this, and in the eyes of the reviews app he is a brand new, unknown user each time. In fact, the typical half-life of a device fingerprint is shorter than one month for some technologies.

2. Fraudsters can actively change their signals to give themselves a new fingerprint. In fact, there is an inexpensive, widely available program called FraudFox that does it for them.
With FraudFox running on their laptop, they can emulate thousands of different device fingerprints on a single app in one day.

3. It is completely reactive. Knowing the identity of a device is only useful for stopping fraud if you know that the device has committed fraud on your app in the past. So every fraudster can defraud an app at least once before his device is blacklisted. You can share blacklist data with a global repository of billions of blacklisted devices from other companies, but you have no idea why those devices were blacklisted. Just because someone sent an unwanted message on a dating app yesterday does not mean they are going to use a stolen credit card to buy a big screen TV on an e-commerce site tomorrow.

4. You are not privy to the rich information behind the device fingerprint. In the simplest form, two device fingerprints either match or they are different. However, there is a lot of gray area, and this binary paradigm does not let you see devices that are likely to be the same. Furthermore, a fingerprint does not provide visually rich information, such as a graphical view of the connections between devices, accounts, and orders.

Device Fingerprinting Evolution

Fuzzy Matching: Since the fingerprint of a good user’s device will change over time, the task is to figure out which signal changes are OK to ignore. If two distinct fingerprints differ by only one signal that often changes on organic users, such as modified alarm settings, a good fraud model should assign them the same device ID. It is easy to see that there will be a lot of gray area, so it will be up to smart fraud detection companies to build robust algorithms that correctly draw the line between device fingerprints.

Reverse Engineering Fraud Tools: FraudFox is just a deterministic program that spoofs the signals of its user according to rigid rules.
Fraud detection data scientists should be able to detect patterns in how FraudFox alters signals. The good ones can effectively reverse engineer its algorithms to tell when a device’s signals have been artificially changed by a fraudster and when they have been organically changed by a good user. Ultimately this will turn into an arms race, with FraudFox tuning its algorithms to mimic good users and fraud detection data scientists revising their detection models to differentiate between artificial and organic changes, but fraud detection software has greater resources on its side.

Predictive Modeling: A device ID on its own does not help stop the first fraud a device commits, because the device has not yet been added to any blacklist. But a device ID empowered with machine learning can predict whether a device will be used to commit fraud even if it has never committed fraud before. Fraudsters’ devices often share patterns in their sets of signals. With the help of machine learning, a model built on device signal data produces a fraud score. This score tells a story about the device and the user behind it. For example, fraudsters are 5X more likely to have flushed their browser referrer history or to have null values in browser settings [source: simility.com/device-recon-results]. As fraudsters change their tactics, the most advanced device fingerprinting technologies will recognize the pattern shift, detect any fraudulent activity and automatically adjust the fraud model.

Customized for Each App: Fraud used to be confined to using a stolen credit card to purchase something from a store or website. Now fraud can be defined as any undesirable behavior, so companies can police their users to make sure none of them are providing a bad experience to others, in order to create a safe environment for everyone. As a result, a bad user on one app might not deserve to be banned from every other app too. Similarly, a device fingerprinting model for one app might not work as well for another app.
A good device ID should allow fraud analysts to write rules on the individual signals of their device ID model to set each signal’s relative importance in detecting fraudsters’ devices. In this way, each device ID is customized for its application.

For example, some websites allow you to post a project and have other people review it. To give their own project an initial boost, many posters will create a few fake reviewer accounts and positively review their own projects. Some websites might not explicitly outlaw this as long as it is done in moderation. Many people also know how to use an IP proxy service as a rudimentary way to hide their device identity, so these posters may create the accounts through IP proxies. In this scenario, the websites may want to customize their device ID model so that it does not penalize devices that use IP proxies to create a handful of reviewer accounts, because this is merely a little harmless self-promotion.

How Simility’s Device Recon Works

If a fraudster returns to your site or application after making minor changes to device characteristics and behaviors, traditional fingerprinting technology will not detect that it is the same device, and the fraudster will slip through your blacklist defense. Simility’s Device Recon technology fingerprints devices by analyzing hundreds of mobile and desktop device characteristics and behaviors, including browser, language, location, operating system, and even mobile emulation and battery level. Fraudsters can mask identifying properties like their username, email, and IP address, but with Device Recon you can determine whether a device is associated with fraud. Similar to the notion of matching a criminal’s fingerprint, Simility’s Device Recon matches past device behaviors and produces an assessment of the device in the form of a fraud score.
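
The fuzzy matching and per-app customization ideas described in this paper can be illustrated with a minimal Python sketch. This is not Simility’s Device Recon implementation. The signal names, the weights, the 0.9 threshold, and the functions similarity and assign_device_id are all hypothetical, chosen only to show how signals that organic users change often (OS version, alarm settings) can be given little weight, so that a small change does not produce a new device ID.

```python
from typing import Dict, Optional

Fingerprint = Dict[str, Optional[str]]

# Hypothetical weights (assumption, not from the paper): stable signals
# count more than signals that organic users change often. A fraud
# analyst would tune these per application.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "screen_resolution": 3.0,
    "timezone": 2.0,
    "browser_language": 2.0,
    "os_version": 0.5,      # changes with every OS update
    "alarm_settings": 0.1,  # changes frequently on organic users
}


def similarity(a: Fingerprint, b: Fingerprint,
               weights: Dict[str, float]) -> float:
    """Return the weighted share of signals on which a and b agree (0.0 to 1.0)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        w for name, w in weights.items()
        if a.get(name) is not None and a.get(name) == b.get(name)
    )
    return matched / total


def assign_device_id(fp: Fingerprint, known: Dict[str, Fingerprint],
                     weights: Dict[str, float] = DEFAULT_WEIGHTS,
                     threshold: float = 0.9) -> str:
    """Reuse the ID of the closest known fingerprint if it is similar
    enough; otherwise register the fingerprint under a new ID."""
    best_id, best_score = None, 0.0
    for device_id, known_fp in known.items():
        score = similarity(fp, known_fp, weights)
        if score > best_score:
            best_id, best_score = device_id, score
    if best_id is not None and best_score >= threshold:
        return best_id
    new_id = f"device-{len(known) + 1}"
    known[new_id] = fp
    return new_id


# Usage: the same phone before and after an OS update and an alarm change.
known_devices: Dict[str, Fingerprint] = {}
phone = {
    "screen_resolution": "1080x1920",
    "timezone": "UTC-8",
    "browser_language": "en-US",
    "os_version": "10.3.1",
    "alarm_settings": "07:00",
}
updated_phone = dict(phone, os_version="10.3.2", alarm_settings="06:30")

print(assign_device_id(phone, known_devices))          # device-1
print(assign_device_id(updated_phone, known_devices))  # device-1 (fuzzy match)
```

An exact-match scheme would treat updated_phone as a brand new device. Here it scores 7.0 out of 7.6 (about 0.92) against the stored fingerprint, which clears the threshold. Lowering a signal’s weight for one application, for example a signal tied to IP proxies on a site that tolerates them, is one simple way to express the per-app customization discussed earlier.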