Digital Finance and Data

How Private and Secure Is Data Used in Digital Finance?

September 2018

AUTHOR Patrick Traynor Acknowledgements Introduction 1 We gratefully acknowledge the Data Privacy and Security Issues in Online Lending 1 generous support provided by the Digital Finance Providers Evaluated 3 Center for Financial Inclusion at Accion, without which this work 1. Privacy Analysis 5 would not have been possible. We Methodology 5 would particularly like to thank Sonja Results 7 Kelly, Director of Research, and Pablo Antón Díaz, Research Manager, for not Conclusions 10 only helping us to work productively with security stakeholders around 2. Security Analysis 11 the world, but also for their tireless Methodology 11 efforts to ensure that these issues are Results 17 prioritized and addressed. Conclusions 25 We also wish to thank Jasmine Bowers, Kevin Butler, and Imani 26 Sherman of the University of Florida, 3. Terms of Service Analysis all of whom made significant contributions to the successful 4. Conclusions and Recommendations 28 completion of this work. Annex A Word Count vs. Average Reading Grade Level of Privacy Policies 30

Annex B Digital Lenders Evaluated and Analyses Performed 32

Notes 33 Introduction

Data Privacy and Security Issues Amounts and loan maturities vary from very in Online Lending short-term “nano” loans of a few dollars to Mobile phones and networks are transforming medium-term small business loans of a few the world of finance, creating opportunities hundred or some thousands of dollars. Some for widespread financial inclusion, especially companies have grown to substantial — even among neglected regions and groups. Digital massive — scale. Others are just starting out. credit offers a particularly important lifeline. All these companies have at least two things in Individuals and small businesses in these common: they use online and mobile tools to settings often suffer from the inability to connect with customers, and they use a range acquire financial assistance. As a consequence, of customer data, obtained electronically, in a farmer may not be able to repair the vehicle making their credit decisions. needed to bring goods to market; a merchant Security and privacy should be among the may not be able to fully stock shelves; or most important considerations when building an individual may not be able to pay for an digital finance systems. Credit decisions are unexpected medical procedure. often based on sensitive information and There are a number of challenges to online finance offerings are no exception. delivering credit in these scenarios. Efforts to Examples of data used in credit determinations simply reapply existing lending mechanisms include a person’s history of purchases, have failed as individuals and businesses movements through time and space (enabled in these regions and groups often lack the by GPS data), and online activity (such as data that lenders look for in more formal downloaded applications). The sensitivity of settings to make credit decisions, e.g., audited this information gives rise to a series of critical tax forms, pay stubs, property ownership questions for customers: documents, etc. A rapidly-increasing number of companies have created alternative means ••To whom am I giving my data? And who of measuring creditworthiness based on else do they allow to access it? For what observing transactions made through mobile purposes? applications and other data. Such systems hold the potential to dramatically expand financial ••How do the companies protect data so that inclusion, but if they are poorly designed and people who do not have legitimate access executed, they may discourage millions from cannot use or steal it? participating in the modern digital economy. Around the world, digital lending companies ••What rights and options do I have if are emerging and growing. They offer digital something does go wrong? loans through mobile phone apps or via , including those that are optimized The goal of this research is to examine how for mobile. These digital loan products are well digital lenders are responding to these varied: many are aimed at consumers; others questions, and the main sections of the paper focus on small enterprises. Many of these are organized accordingly. To address the companies are reaching out to customers first question regarding policies for data at the base of the economic pyramid. usage, we performed a thorough analysis

DIGITAL FINANCE AND : HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 1 ensuring that the connections between customers’ mobile devices and the lender’s FIGURE 1 servers are secure. Finally, we evaluated the Example of a Permission Screen from a Mobile Money App terms of service for each of the online finance in Android offerors to characterize the impact on a customer, should fraud occur. It is critical to characterize the state of security and privacy in this burgeoning space and to identify potential weaknesses and areas for improvement, with the overall goal of protecting system users and strengthening the defense barriers that will enable this new range of online finance providers to become trustworthy, secure, and reliable. Security claims do little to protect real users unless they are backed by strong mechanisms and adequate privacy policies. Independent evaluations — such as the one detailed in this paper — not only help consumers identify the firms currently meeting the bar for best practices, they also push other firms toward the implementation of better industry standards for consumer and asset protection in the digital financial sector. Thoroughly identifying the security flaws and vulnerabilities of an online system is no easy task, and no single analysis can measure all relevant aspects of security and privacy. Accordingly, this work should be viewed only as a first step. We also recognize that these companies vary widely in scale and maturity: while we hope to see good security practices everywhere, we note that early stage Note Cash App was not part of the list of applications that were reviewed in this study. companies may be juggling many issues. We would expect more robust security from larger, more established companies. of the privacy policies of a representative It is our hope that companies, consumers cross-section of 52 lenders in the digital and regulators can use this work to plot the finance industry. We evaluated whether path toward improved security standards for such policies covered important issues such online finance. Together, we can ensure the as the conditions under which personal transformative power of these systems while consumer data is shared with third parties, minimizing risks to all parties. With the help the timeframes for data retention and of Accion and its investment fund team at minimization, and the information on these Accion Venture Lab, we reached out to all of policies provided to clients. To examine data the companies included in this report prior to security, we performed a thorough analysis of its publication to share with them our findings, the characteristics of the connection between provide useful guidelines and tips on how to client mobile devices and the digital lender’s correct the most common vulnerabilities that own servers for 27 selected companies. While were uncovered by this study, and offer them a it is possible to measure the security of many chance to comment on these findings. The next parts of an online finance system, the most section of the paper contains more details on critical piece — and the one evaluated here — is the results of this consultative process.

2 CENTER FOR FINANCIAL INCLUSION Digital Finance Providers Evaluated Digital finance companies exist in a wide array Security claims do little to protect real of markets across the world. While many users unless they are backed by strong focus on small businesses, others work with mechanisms and adequate privacy policies. individuals. Our analysis characterizes the security and privacy practices of a wide swath of this industry, including both leading names in the industry (such as Tencent in China and Lending Club in the United States) and smaller efforts (such as Coins in the Philippines and enterprise lending, person-to-person lending, Branch in Kenya). Table 1 lists the 52 digital payroll lending, etc. We attempted to balance finance companies (14 US and 37 from other the above considerations with input from the countries) that we analyzed for this project. Center for Financial Inclusion at Accion (CFI), We considered multiple factors in our as well as members of the larger financial selection. First, all the analyzed companies inclusion community, to pick the most have a mobile application or version of their representative sample. optimized for mobile devices. Second, This study and its main findings represent these companies offer a broad range of a first step to examine the data security geographic coverage, spanning 14 countries risks of digital finance providers, and we across five continents (Figure 2). look forward to further case studies into This study was not expected to cover any one of the companies featured here or every existing digital finance provider, but others not analyzed in this effort that can only a representative sample, both by region provide additional insights as well as concrete and type of product; i.e., small and medium recommendations.

FIGURE 2

Countries Represented by Our Selection of Digital Finance Companies

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 3 TABLE 1

Digital Lenders Evaluated in This Study

COMPANY COUNTRY COMPANY COUNTRY

Airtel Democratic Republic Kubo Financiero Mexico of Congo Lending Club United States Atom United Kingdom Lulalend South Africa Azimo United Kingdom M-Pawa Kenya BlueVine United States MCo-op Cash Kenya Branch Kenya Micromobile Kenya C2FO United States MoneyTap India Coins Philippines OnDeck United States CommonBond United States Pay Your Tuition Funds United States Creditas Brazil PaySense India CrowdEstates United Kingdom PesaPata Kenya EcoCash Loans Zimbabwe Prosper United States Equitel Kenya Puddle United States Equity Direct Kenya RoadLoans United States Farm Drive Kenya Saida Kenya FastPay United States Salud Fácil Mexico GetBucks South Africa SMECorner India IndiaMART India Social Lender Nigeria Insikt United States Suregifts Redemption Nigeria InstaPaisa India Tala Kenya InvoiNet Argentina Taplend United Kingdom Jimubox China Tencent China Kabbage United States Upstart United States KCB Kenya WeFinance United States Koopkrag South Africa Yoco South Africa Kopo Kopo Kenya Zidisha Kenya

4 CENTER FOR FINANCIAL INCLUSION Privacy Analysis

Many observers point to the transition of the Methodology information economy as one of the hallmarks Our study of privacy policies addressed two of the twenty-first century. In the financial high-level questions: sector — as well as in other industries — “big data” promises to deliver customized services ••Do the privacy policies published by digital at an accelerated speed, to improve customer lenders address the appropriate issues? satisfaction through personalized product design, and to increase profits through targeted ••How readable are these policies for their advertisements. However, access to the customer target audiences? data needed to facilitate these improvements may also have unintended consequences. We collected the English-language privacy Personal information that some individuals may policies of the 52 selected companies listed prefer to keep private (e.g., pregnancy, sexual above.1 The policies were available on the orientation, behavior patterns) has become webpages or within the applications associated discoverable through the use of modern data with each company. Instead of inventing our analytics techniques. Financial information, own criteria, we combined guidance by the affiliations (political, religious, etc.) and other US Federal Deposit Insurance Corporation data may also become involved. As such, many (FDIC)2 and the GSM Association (GSMA)3 for consumers look to the acquisition and use of mobile applications. This approach was valid their data with apprehension. for many reasons. First, it combined industry Privacy policies can help alleviate some of this and regulatory perspectives on what privacy fear. They offer a public explanation of how a policies should cover. Second, using these company intends to handle user data, the uses to long-published guidelines prevented us from which it may be put, with whom they may share creating arbitrary requirements about which such data, the conclusions that may be drawn, the industry might be unaware. Finally, we and what rights customers have to correct previously applied this combination to evaluate such information. If a company follows its own the privacy policies of mobile money providers.4 privacy policy, the actual risks of improper data The use of FDIC guidelines is not use may be reduced. Unfortunately, no two without some debate. Many consider FDIC privacy policies are the same and many are recommendations as best practices and they vague, unclear, or incomplete. are followed by many financial institutions The first exercise in this study was to throughout the world (especially if those characterize privacy policies for digital lenders. institutions interact with US-based financial We sought to determine whether such policies institutions). However, the 38 digital finance discuss critical issues, how they compared providers based elsewhere that were evaluated against traditional financial offerings, and the in this work did not necessarily have any accessibility or readability of such policies for requirement to follow US agency rules. customers from different backgrounds. An Unfortunately, in this study it was impractical interesting further study question, while not for us to perform an analysis considering addressed here, is how and when users are each of the 14 possible different national exposed to these policies. standards. Further insight can be gained into

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 5 any of the studied countries through follow-on TABLE 2 analysis; however, our goal was to develop a Privacy Policy Specifications Considered for This Analysis relative understanding of performance across the industry.

BOTH FDIC AND GSMA Coding Process We evaluated privacy policies manually. We User choice and control: Notices should disclose the user’s right to opt-out created a codebook of words correlated to each and how users can control the use of their personal information. of the above policy principles, which is shown in Table 3. Security: Notices should disclose how personal information will be protected Mention of any of the words contained in the and safeguarded. codebook was sufficient to credit the privacy policy with addressing a given principle. This Data to be shared: Notices should list the personal information of users that should not be construed as an indication that may be disclosed. such policy comprehensively covers a topic; rather, our goal was to determine if a principle Data minimization/retention: Information sharing practices of personal was even considered. We took this approach information of former customers should be disclosed. Only the minimum because we sought empirical evidence of amount of user information should be collected, accessed, used, and retained compliance with the guidelines mentioned; at all times. arguing for or against the quality of coverage beyond this is an important task, but one more GSMA appropriately undertaken by more specialized researchers such as policy specialists and Purpose of data collection: Policies should disclose the purpose of collecting, consumer rights advocates. Accordingly, the accessing, and sharing user data and ensure that each purpose is for results herein should be viewed as a starting legitimate business operations. point for discussion and by no means an endorsement of any of these policies. Children and adolescents: If applicable to children, the service should We followed standard methodology used guarantee that the child’s personal information is properly collected and throughout the social sciences for evaluating should abide by all laws related to children’s privacy. text. Two individuals or “coders” were tasked with independently evaluating all of the Accountability and enforcement: Employees are held accountable for proper collected documents. If the coder did not find use and protection of user data. exact keywords in the codebook but did find similar text, the coder was instructed to use FDIC best judgment when scoring that principle. If neither the keywords nor similar text Collection process: Policies should list the types of personal information that were found for a specific principle, the policy are collected. received a score of zero. The results of the independent trials were Definitions: Policies should define terms concerning collection process, then compared and mutually reconciled information disclosure, etc. to arrive at the reported result. During the reconciliation process, if the results of the Examples: Policies should include examples of the collection process, coders diverged (e.g., when judgments about information disclosure, etc. ambiguous text differed), we discussed the instructions and thought process with both Third parties: Policies should disclose affiliates with whom the bank shares coders to determine the final score for each non-public personal information. policy and principle. Completion of all the steps in this process was critical to capturing the policy content fully and accurately.

6 CENTER FOR FINANCIAL INCLUSION Analysis of Readability The applied linguistics community has TABLE 3 developed several tests to automatically Codebook of Words Used for Each Policy Principle measure the readability of text. These tests produce a “grade level” that is meant to reflect the suggested level of education needed to fully POLICY SPECIFICATION KEY WORDS AND PHRASES comprehend the body of text. For example, although a gold-standard technique is still Purpose of Data Collection Reasoning, Enhance User Experience, debated, it has become common in certain User Experience fields to use several tests for one study. Our analysis calculated scores based on the Children and Adolescents Children, Children’s Privacy following readability scoring mechanisms: Flesch-Kincaid Grade Level, Gunning Fog Accountability and Enforcement Employee, Accountable, Accountability Index, Coleman-Liau Index, SMOG Index, and Automated Readability Index. Individual scores Collection Process Collect were tallied and we calculated an overall average score. In addition, we also calculated Definitions Means, Is, Are the estimated time-to-read and a word count. Examples Types of Personal Information, Results Types Of, For Example, Includes

Mention of Principles Third Parties Third Party, Third Parties Figure 3 is a comprehensive representation of our privacy policy content findings. The User Choice and Control Disable, Edit, User Can, Change graph shows three groups: the first two are the selected lenders we reviewed — international Security Security and US digital lenders. The third group is US traditional banks, used as a benchmark Sharing Process Share, Sharing Process because they are required to comply with FDIC guidelines. Data Minimization and Retention Minimization, Termination, It is clear from Figure 3 that the policies Continue To Share, Retention, Retain of the US digital lenders are quite similar in content coverage to those of the traditional US banks. Over three-fourths of the US lenders’ policies covered seven of the 11 categories, while over one-third covered all 11. These results are similar to those of the US traditional Therefore, to be understood by users, policies banks, where over three-fourths cover eight of need to define terms that relate to user the categories and at least one-fifth cover all 11. data and privacy. By comparison, the international policies Another topic frequently omitted was covered less content across the board. This employee accountability. Defining employee may be a result of legal requirements for credit accountability gives the user an understanding providers in the United States. Among the of what should happen in the event of specific gaps, 65 percent of the international data mishandling. policies failed to include definitions and Although we in no way assume US bank examples of privacy-related terms. Policies policies to be the ideal standard, we include tended to include jargon (e.g., browser the results to indicate the impact of regulation cookies) that may not be common knowledge. on practice.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 7 FIGURE 3

Content of Privacy Policies (by percentage of total number of companies)

International Digital Lenders US Digital Lenders US Traditional Banks 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0 98.0 98.0 100 98.0 92.9 92.9

90 85.7 82.0 80

70 65.8 65.8 64.0 63.2 60.5 57.9 57.9 57.1 57.1

60 56.0 50.0 50 42.9

40 34.2 34.2 34.2

30 21.5

20 18.4 16.0

10

0 SECURITY CONTROL EXAMPLES RETENTION DEFINITIONS ADOLESCENTS THIRD PARTIES ENFORCEMENT CHILDREN AND AND CHILDREN USER CHOICE AND AND CHOICE USER PURPOSE AND USE SHARING PROCESS COLLECTION PROCESS COLLECTION DATA MINIMIZATION/ ACCOUNTABILITY AND AND ACCOUNTABILITY

Reading Grade Level Scores reading grade level of the international Privacy policy reading grade level statistics policies was higher than the median of both represent the average reading grade levels sets of US policies. With a median of 13.1 years, of our three sets of policies (see Figure 4). customers would need at least one year of As a point of reference, it is widely assumed higher education to fully grasp the meaning that the average American citizen reads at of the policies. In contrast, average school approximately an eighth grade level. Our attainment in Nigeria and the Democratic measurement showed that the grade levels Republic of the Congo, for example, is nine of the digital lenders’ policies were higher years.5 Hence, it is important that providers than the US traditional bank policies, which and policy makers aim for content that we included as a baseline to show impact of users of varying educational backgrounds regulation. As shown in Table 4, the median can read and understand.

8 CENTER FOR FINANCIAL INCLUSION FIGURE 4

Average Reading Grade Level of Privacy Policies

International Digital Lenders US Digital Lenders US Traditional Banks Average Reading Grade Level 20

18

16

14

12

10

8

6

4

2

0 LENDER (EACH DOT REPRESENTS A SINGLE LENDER INCLUDED IN THIS STUDY.)

Figure 5 shows the word counts of all TABLE 4 three sets of privacy policies. Aside from one outlier policy of over 11,000 words, the word Privacy Policy Reading Grade Level counts ranged between zero and 6,000. As shown in Table 5, the median word counts of AVERAGE READING GRADE LEVEL each data set were relatively close. However, the minimums were far less similar, with INTERNATIONAL DIGITAL LENDERS US DIGITAL LENDERS US BANKS InvoiNet having a word count of 151. In addition, InvoiNet only covered two of the 11 Minimum 7.5 (Atom) 9.4 (RoadLoans) 8.2 categories of policy content. We found that policies of lower word count are more inclined Maximum 15.5 (InvoiNet) 14.6 (Fast Pay) 14.6 to cover less user privacy content than those Median 12.9 12.6 10.6 with higher counts. In contrast, we cannot conclusively say that longer policies are fully comprehensive. However, longer policies can reflect effort put forth to fully cover critical policy topics.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 9 FIGURE 5

Word Count of Privacy Policies

International Digital Lenders US Digital Lenders US Traditional Banks Word Count 12,500

10,000

7, 5 0 0

5,000

2,500

0 LENDER (EACH DOT REPRESENTS A SINGLE LENDER INCLUDED IN THIS STUDY.)

TABLE 5 Word Count vs. Readability We also analyzed the correlation between Privacy Policy Word Count Breakdown word count and reading grade level for each set of policies (Annex A). We found a direct correlation between the word counts and WORD COUNT reading grade levels in the policies of the INTERNATIONAL international lenders: the longer the policies, DIGITAL LENDERS US DIGITAL LENDERS US BANKS the higher the reading grade required. In Minimum 151 (InvoiNet) 788 (RoadLoans) 557 comparison, we did not find a trend in the US lenders’ policies. With these findings, we Maximum 11,210 (KCB) 4,847 (Prosper) 3,494 recommend that policy writers assure that Median 1,436 1,470 1,488 their policies, regardless of length, cover the critical best practice topics and attempt to do so in language accessible to the target market.

Conclusions The international companies’ policies had higher average reading grade levels, although they covered less content for nearly almost every principle, as shown in Figure 3. We provide specific policy recommendations at the end of this document.

10 CENTER FOR FINANCIAL INCLUSION Security Analysis

Security is a critical requirement any time key? Research tells us that nearly all our money is involved. Digital financial services systems are constantly under an automated offer consumers greater physical security than barrage of attack traffic and it would be borrowing, saving, or transacting in cash. naïve to think that emerging online finance This security is in fact one of the important infrastructure would somehow be spared contributions digital financial services can from this onslaught. We do, however, note make to people on lower incomes. However, that larger companies are likely to attract security risks do not disappear when lending more attention from bad actors. moves online; the nature and location of Second, security by obscurity is entirely risk moves, too. Adversaries do not simply unnecessary given that strong solutions to disappear: as money flows online, adversaries many of these problems exist and at relatively arise that focus their efforts on whatever low cost. As discussed below, we are not vulnerabilities providers leave open. Theft necessarily recommending that companies and fraud are enabled when data falls into invest in expensive anomaly detection adversarial hands. Moreover, breaches to systems (although they may wish to consider online lending services may further endanger it); rather, they should deploy and properly customers because of the wider range of configure the strong connection security personal data they may collect (e.g., social mechanisms that are already available networks, GPS information). As such, it is to them through features of the majority critical that such services provide strong, of their infrastructure. best-practice protections for their customers. Finally, before this report was published, we Before we go on, we will address an reached out to each of the analyzed companies important misconception that our analysis to give them a chance to improve on the simply informs potential adversaries as to discovered problems. Our goal is to ensure where they should attack. We reject this that all organizations protect customer data assertion for a number of reasons. First, in as well as possible, and we are willing to the field of , this thought assist them in making the necessary changes. process is known as “security by obscurity”. The thinking goes that if adversaries do not Methodology already know of the existence of a weakness, Our study sought to measure the security of they never will. Basing the security of a system the connection between customers’ mobile on the hope that an adversary will never devices and the server within the digital find significant vulnerabilities is problematic lender’s network that processes and stores because it assumes that an adversary would data. While it is possible to measure the never bother to look. This is akin to the “key security of many parts of an online finance under the doormat” approach to security. system, the most critical is ensuring that Can anyone really pretend to be surprised connections between mobile devices and when the first thing a burglar does when the service provider’s servers are secure. approaching a property is check for a hidden Accordingly, we focused our attention here.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 11 communications use by digital finance FIGURE 6 providers (Figure 6). They are: Mapping Secure Communications 1. Do mobile devices properly use strong algorithms to protect the confidentiality and integrity of all Security Questions Addressed in the Study communications? (Devices)

DOES THE WHAT DATA IS IS THE SERVER MOBILE DEVICE 2. Do mobile devices properly verify that they FLOWING? SECURE? AUTHENTICATE? are communicating with the correct server? ()

3. Are the servers configured to use strong encryption algorithms to protect User’s mobile Digital lender’s the confidentiality and integrity of all device server TWO-WAY INFORMATION FLOW communications? (Servers)

4. What data are being collected by digital finance providers over these links? IS COMMUNICATION (Permissions) ENCRYPTED? A failure at any one of these steps can lead to a successful attack on online finance systems. For Question 1, if a mobile device is programmed to use weak or no encryption algorithms, an attacker may be able to recover Measuring the security of connections the data. For Question 2, if the mobile device between users and digital finance providers is not certain it is communicating with the is the first step in assessing the practices of correct server, it may simply be sending its this industry. If a digital finance provider fails sensitive data to an attacker. For Question to adequately protect data in this space, an 3, even if the other two issues are properly adversary could recover potentially sensitive addressed, a poorly-configured server may consumer information with very little effort. undo all of these protections and expose user Such a breach can obviously entail financial data. Finally, for Question 4, identifying the loss. However, it is critical to note that the wide data a digital finance provider has permission array of data collected by these services could to take allows us to reason about what data further harm a customer. For instance, GPS may be exposed should any of the above data could be used to track specific individuals security weaknesses be confirmed. and target them for extortion or other harm. We break down our methodology with Although correctly protecting respect to each of these questions below. communications, as discussed in this paper, is a good first step, a positive evaluation here Use of Encryption by Apps should not be viewed as an endorsement of all on Mobile Devices security practices of the studied companies; rather, it merely represents that this one aspect 1. Do mobile devices properly use is done well. Multiple additional analyses of strong encryption algorithms to protect internal policies and controls, each of which the confidentiality and integrity of all would entail significant additional efforts, communications? (Devices) remain projects for future research. Our experiments sought to answer four Of the 52 digital finance providers in this specific questions about the state of secure study, 27 offer a mobile application for Android

12 CENTER FOR FINANCIAL INCLUSION FIGURE 7

Android application bytecode in the JEB tool. Such code is extremely difficult for an analyst to read and assess.

FIGURE 8

Source code recovered from the previous figure, which is more easily understood by an analyst.

phones.6 We downloaded a copy of each of We use the JEB Decompilation tool by these applications to computers in our lab. PNF Software. JEB takes an application Android applications are written in the Java as input and outputs Java source code. programming language and the code for the Figure 7 shows application code before applications themselves is not included in a this de-compilation process. download. However, a wide range of tools exist We manually read this source code to take such mobile application code and turn it (Figure 8) to identify security-sensitive areas back into the original source code written by its and then determined whether or not the authors. We performed such reverse engineering application implemented best practices as on each application to learn more about precisely commonly understood in the how its creators attempted to provide security. community. Reverse engineering tools such

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 13 as JEB recover code in applications even if it and a broad community of policy experts is never used by the application. Our analysis attempting to provide applications with a carefully checked to make sure that such “dead straightforward means of protecting their code” was not considered. This process was communications. As such, everything from intensive and time-consuming and drew upon online banking in browsers to email generally our expertise and many years of experience in relies on TLS. this space; many hundreds of hours were spent TLS relies on certificates for establishing recovering, understanding and documenting identity. The certificate itself has a) information this code. about the website/server, b) a public key We were able to recover well over 1 million that can be used by the client to begin lines of code from the 26 mobile apps. Our in- communications, and c) a digital signature depth analyses searched for specific categories from a trusted third party (e.g., DigiCert, of operations, including data encryption and Verisign). Such certificates form much of the password handling. In the case of encryption, basis of security on the because they we specifically searched for the use of two can be used to positively identify well-known outdated and now deprecated cryptographic entities. However, certificates can easily be standards, the “Data Encryption Standard” used in incorrect and dangerous ways. For (DES)7 and “Triple DES” (3DES).8 The use instance, a client receiving a certificate may fail of either of these ciphers would void any to check to whom that certificate belongs. An protections application designers attempted to attacker could easily send their own certificate, build to protect the confidentiality of data, as and such clients would accept it as valid these standards can now be routinely breached. (thereby sending all information directly to the For password handling, we were particularly attacker). Similarly, a server could fail to have interested in the use of “,” which is random its certificate signed by a trusted third party, data used to make stored passwords stronger. opening the door for an adversary to forge a Salts should be random and unique to each similar certificate (with no way for the client to user; if they are not, any protection they may determine that such a forgery occurred). provide is defeated. We measured the strength of authentication via certificate handling in mobile applications. Authentication by Mobile Apps Unfortunately, apps do not have that helpful “lock” icon anywhere inside them, so we had 2. Do mobile devices properly verify that they to dig deeper. We accomplished this through are communicating with the correct server? the use of the Mallodroid tool. Mallodroid (Authentication) takes an application as an input and, like the JEB tool, reverse engineers it to produce Java Authentication is critical because it ensures code. Mallodroid then searches that code for that both parties know with whom they are certificate handling routines and determines if talking. Without strong authentication, anyone they are written in potentially dangerous ways. can pose as a lender or customer without the When the tool indicated a possible problem other side of the transaction knowing. during our study, we manually analyzed Correct certificate handling is critical to the routine in question and determined if authentication on the Internet. If you have a problem indeed existed and whether that ever seen a small lock icon in your browser, it routine was called by the application (and was means that the Transport Layer Security (TLS)9 not dead code). While the use of the Mallodroid protocol was protecting your communications. tool is relatively fast, it provides limited insight TLS is the result of collaboration between into the code and still requires a significant cryptographers, systems security experts, amount of evaluation by a security engineer.

14 CENTER FOR FINANCIAL INCLUSION Server Security known to be insecure. Such insecurity can lead to an adversary being able to observe, modify, 3. Are the servers configured to use and inject their own traffic between a user and strong encryption algorithms to protect the server. As such, performing this test helped the confidentiality and integrity of all us concretely characterize the security standing communications? (Servers) of an organization’s communications. The tool’s scoring rubric ranged from A+ Even with close attention to detail for security to T.11 We denoted all T scores as F* in our on the mobile device, failure to provide similar results to make it clear to readers that it is a attention in the backend servers can similarly failing grade. The scores were calculated based expose sensitive user data. Given that the on three categories: protocol support, key software engineers writing the code for mobile exchange, and cipher strength. phones are often different than the individuals configuring the servers, it was critical to check Supported Protocols both sides of the connection. TLS and SSL are cryptographic protocols that We measured the configuration of the secure communication on networks. The server via the Qualys Secure Sockets Layer original version of SSL (SSL 2.0) was created in (SSL) Test.10 Qualys provides a free service that 1994, and TLS 1.0 was created in 1999. Since attempts to connect to a target server using all then, the protocols have been updated due to possible configurations that have historically serious security flaws that allowed adversaries been approved for TLS/SSL connections. The to intercept or amend secure communication output of the Qualys SSL Test is a grade level, between two parties — a man-in-the-middle similar to those used in traditional report attack. Support for old versions of these cards. Many previously-allowed versions of protocols is therefore dangerous. The Qualys these protocols and their parameters, while scoring mechanism (Table 6) weighed each URL’s believed to be secure in the past, are now protocol based on the scale in Table 7 below.

TABLE 6 TABLE 7

Qualys Score Chart (Derived from Impact of SSL/TLS Version on “SSL Server Rating Guide”) Qualys Grade

SCORE GRADE PROTOCOL SCORE

Exceptional configurations A+ SSL 2.0 0%

Score ≥ 80 A SSL 3.0 80%

Score ≥ 65 B TLS 1.0 90%

Score ≥ 50 C TLS 1.1 95%

Score ≥ 35 D TLS 1.2 100%

Score ≥ 20 E

Score < 20 F

Untrusted certificates T (F*)

Mismatched certificate names M

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 15 Key Exchange and Cipher Strength TABLE 8 Key exchange occurs when encrypted keys are Key Exchange Characteristics and Their Impact on Qualys Score exchanged between parties, providing a secure means to transmit messages. Legitimate key exchange requires two critical factors: KEY EXCHANGE CHARACTERISTIC SCORE authentication and secure key generation. Authentication is required in order to prevent Weak key (Debian OpenSSL flaw) 0% an unauthorized party from accessing or key exchange (no authentication) 0% transmitting messages. In addition, the more random the encryption key, the harder it is for Key or DH parameter strength < 512 bits 20% an attacker to crack and steal it. Diffie-Hellman key exchange is an example key exchange Exportable key exchange (limited to 512 bits) 40% mechanism. The Qualys scoring mechanism Key or DH parameter strength < 1024 bits (e.g., 512) 40% weighed each URL’s key exchange based on the scale detailed in Table 8. Key or DH parameter strength < 2048 bits (e.g., 1024) 80% Ciphers are algorithms designed to encrypt Key or DH parameter strength < 4096 bits (e.g., 2048) 90% and decrypt data. As a general principle, ciphers that use longer keys are a greater Key or DH parameter strength ≥ 4096 bits (e.g., 4096) 100% barrier to an adversary.12 The Qualys scoring mechanism weighed each URL’s cipher strength based on the scale in Table 9. Both the protocol support and cipher strength scores were calculated by averaging

TABLE 9 the score of the strongest and weakest cipher. While we were able to measure the security Cipher Strength and Its Impact of an organization’s communications, we on Qualys Score make no claims as to whether any specific company has leaked data in the past. We are simply pointing out that the “key is under CIPHER STRENGTH SCORE the doormat” and any attacker capable of 0 bits (no encryption) 0% seeing the communications could potentially compromise them. < 128 bits (e.g., 40, 56) 20%

< 256 bits (e.g., 128, 168) 80% Permissions for Use of Data

≥ 256 bits (e.g., 256) 100% 4. What data are being collected by digital finance providers over these links? (Permissions)

Mobile applications have access to a significant amount of potentially sensitive data (see Figure 9). Such data includes a mobile user’s contact list (and therefore insight into their social network), current location (via GPS data), and potentially much more. It is understood that digital finance providers potentially rely on these kinds of data in order to make their credit decisions; however, they rarely reveal (not even

16 CENTER FOR FINANCIAL INCLUSION FIGURE 9

An Example of Permissions Collected by One Digital Finance Provider Mobile App

in privacy policies) those data they actually use Results or how such data factors into credit decisions. Our analysis looked at the Android manifest Mobile Application Security (Encryption file, which is used at the time an application is and Authentication) installed to request access to specific types of sensitive data. This mechanism enables users 1. Do mobile devices properly use to be presented with a data list at the time of strong encryption algorithms to protect installation before an application is installed the confidentiality and integrity of all so that they can determine whether or not communications? (Devices) they wish to grant such access. Accordingly, scanning this file allowed us to characterize the In order to find weak uses of cryptography, our kinds of data these applications seek to collect. reverse engineering started by first searching As such data is sent from a mobile device for instances of DES/3DES or static encryption to the server for processing, our analysis could strings. In Table 10 we show our findings in the not uncover how each piece of data is used. 27 US and international apps. Most critically, 17 Such insight would require each company to out of 27 apps offered demonstrably dangerous disclose its algorithms and many claim that ciphering options. Over half of the apps use these constitute their competitive advantage DES, an algorithm that has been deprecated for and intellectual property and are therefore over a decade — well beyond the creation date unwilling to share them. In the interest of of these apps. In many of these apps, we also transparency, credit decision-making data found disastrously incorrect cryptographic use, and algorithms must be made available in leading to substantial risk to consumers. In traditional lending in the United States; this section, we explore our findings through regulation, however, has not yet reached examples of the problems we found. the digital lending industry. Regardless, we As an example of static parameters causing believe that further study into such algorithms, cryptographic problems, in Figure 10 we including their strengths, weaknesses and show where one international company’s biases, is a worthwhile task. app generated a secret DES key with a static

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 17 TABLE 10

Applications with weak cryptographic parameters. Disturbingly, 17 of 27 applications offered demonstrably bad ciphering options.

STATIC VALUES STATIC VALUES USED FOR USED FOR APPLICATION DES 3DES ENCRYPTION APPLICATION DES 3DES ENCRYPTION

Company 1 ✓ ✓ ✓ Company 14 – – –

Company 2 ✓ ✓ – Company 15 ✓ – –

Company 3 ✓ ✓ – Company 16 ✓ – ✓

Company 4 ✓ ✓ – Company 17 – – –

Company 5 – – – Company 18 ✓ ✓ –

Company 6 ✓ ✓ ✓ Company 19 ✓ ✓ –

Company 7 – – – Company 20 – – –

Company 8 – – – Company 21 ✓ ✓ ✓

Company 9 – – – Company 22 – – –

Company 10 – – ✓ Company 23 ✓ ✓ –

Company 11 – – – Company 24 ✓ ✓ –

Company 12 ✓ ✓ – Company 25 ✓ ✓ –

Company 26 ✓ ✓ – Company 13 Obfuscated – – DES code Company 27 – – –

Note Company names have been anonymized from this table.

FIGURE 10

An Example of a Hardcoded “Salt” Value

18 CENTER FOR FINANCIAL INCLUSION FI G U R E 11

An Example of an App That Used a Static String to Generate Keys

salt — that is, one that is always the same. As key and found them in an unrelated, open- mentioned previously, salts were designed to source project.14 The secrecy of this key (and, add randomness; an adversary who knows as a result, the security of this cryptosystem) this value will have a substantially easier time is completely non-existent. Not only has the recovering secret data. company failed to produce a random value This is not the only place where this for key generation, the key itself is both static weakness occurred in the app, however. and hardcoded. No additional work is required Figure 11 is another snapshot of the code for an adversary to use this key. For other uses, found in the app. As denoted by the red arrows, Figure 14 shows the name of the company the same string was used as a passphrase to being used to encrypt and decrypt. This create a secret key. Due to this, all installations application represents the most egregious of this application would generate the same misuse of cryptography we examined among key, and any adversary with this string all applications. could likewise generate the key. Accordingly, In a different international example, the communications from this app could be application specifically allowed both DES intercepted and decrypted. and 3DES, as shown in Figure 15. Allowing Likewise, Figure 12 is a snapshot of a similar these deprecated algorithms could allow an use of cryptography in a US-based lender’s adversary to recover sensitive communications app. As shown in the figure, the static string in what is known as a “downgrade attack.” Such “i-cry-when-angles-deserve-to-die”13 was used attacks are simple to configure and execute to generate a secret key, causing the same while the victim is unaware that their data is vulnerability as in the previous example. being compromised. More critically, all of the Another example, an MNO app, contained above options were demonstrably weak — they the same vulnerability, as shown in Figure 13. allowed both null MD5 and null SHA1,15 neither The arrow points to the actual key used in of which provide encryption. All eight of the the application. Worse still, we searched the ciphers have been deemed broken and hence web for the list of numbers comprising this should not be used for encryption.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 19 FIGURE 12

This Company Also Used a Static String to Generate Its Cryptographic Key

FIGURE 13

An MNO App Used a Static Cryptographic Key Across All Users

20 CENTER FOR FINANCIAL INCLUSION FIGURE 14

This Company Used a Static Key to Encrypt and Decrypt

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 21 FIGURE 15

An App That Allowed for the Use of Export DES, DES, and 3DES, All of Which Are Considered Weak Ciphers

the security of a TLS session but, when

TA B L E 11 broken, often defeat security in subtle but catastrophic ways. Mallodroid found four applications that Trust managers accept or reject presented failed to properly handle certificates. credentials and manage trust material in order to make trust decisions. Host Name Verifiers determine if a URL’s hostname BROKEN TRUST BROKEN HOST APPLICATION MANAGER NAME VERIFIER matches its respective server’s hostname. If they do not, the verifier takes necessary Company 1 ✓ ✓ steps to determine if the connection should be allowed. As shown in Table 11, three of Company 2 ✓ ✓ the apps had broken trust managers and four Company 3 ✓ ✓ had broken host name verifiers. This means that 3 of the reviewed companies had apps Company 4 No Error Reported ✓ that failed to make trustworthy decisions, and 4 of them failed to verify if connections should actually be allowed. Note Company names have been anonymized from this table. The consequences of these broken functions are severe; adversaries that can impersonate Authentication the provider’s service to the app (e.g., on a coffee shop Wi-Fi network) can intercept 2. Do mobile devices properly verify that they the sensitive communications and record are communicating with the correct server? and tamper with messages between the app (Authentication) and the provider’s legitimate service. These apps perform no or weak verification of the We ran Mallodroid on each of the 27 Android service and in many cases will blindly accept a apps. Mallodroid detects broken trust connection to any server that answers. Without managers and host name verifiers. These this verification, the user is never notified of a functions are critically important to ensuring problem and the app appears to work normally.

22 CENTER FOR FINANCIAL INCLUSION FIGURE 16

Best and Worst Qualys Scores

COMPANY 1 COMPANY 2 COMPANY 3 COMPANY 4 COMPANY 5 COMPANY 6 COMPANY 7 COMPANY 8 COMPANY 9 COMPANY 10 C O M PA N Y 11

COMPANY 12 COMPANY 13 COMPANY 14 COMPANY 15 COMPANY 16 COMPANY 17 COMPANY 18 COMPANY 19 COMPANY 20

A+ A A– B C F F*

Note All valid links found in the apps. Companies in this figure have been anonymized.

Server Configuration Security particularly concerning as it signaled that the provider might not have had a configuration 3. Are the servers configured to use management process, which may prevent strong encryption algorithms to protect (or alert engineers to) errors such as weak the confidentiality and integrity of all Diffie-Hellman key exchange parameters, communications? (Servers) acceptance of deprecated algorithms, and certificate-not-trusted errors. Apps often contact many different servers. We also evaluated the international Each server may be configured differently, so company web sites from our original list of we first extracted all of the URLs and other 52 companies for TLS/SSL configuration server information from each app. We then errors. As shown in Figure 17, a much higher characterized the range of these configurations percentage of company websites had properly using the Qualys SSL scanner, as described above. configured TLS/SSL connections. However, Figure 16 shows the best and worst there were also a number of important Qualys score of all URLs found in the assessed negative observations. First, 12 of the websites apps. Coins showed the widest range of had demonstrably vulnerable configurations configuration failures, from A+ to F*. The (e.g., used broken encryption ciphers, were lack of consistency among providers was susceptible to interception, etc.).

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 23 FIGURE 17

Qualys Test Results for Digital Finance Provider Websites; 11 of 53 Websites Received a Failing Grade

COMPANY 1 COMPANY 2 COMPANY 3 COMPANY 4 COMPANY 5 COMPANY 6 COMPANY 7 COMPANY 8 COMPANY 9 COMPANY 10 C O M PA N Y 11 COMPANY 12 COMPANY 13 COMPANY 14 COMPANY 15 COMPANY 16 COMPANY 17 COMPANY 18 COMPANY 19 COMPANY 20 COMPANY 21 COMPANY 22 COMPANY 23 COMPANY 24 COMPANY 25 COMPANY 26 COMPANY 27 COMPANY 28 COMPANY 29 COMPANY 30 COMPANY 31 COMPANY 32 COMPANY 33 COMPANY 34 COMPANY 35 COMPANY 36 COMPANY 37 COMPANY 38 COMPANY 39 COMPANY 40 COMPANY 41 COMPANY 42 COMPANY 43 COMPANY 44 COMPANY 45 COMPANY 46 COMPANY 47 COMPANY 48 COMPANY 49 COMPANY 50 COMPANY 51 Note Companies in this figure have been A+ A A– B C F F* N/A anonymized.

24 CENTER FOR FINANCIAL INCLUSION FIGURE 18

Permissions Requested by Digital Finance Applications

Percentage of Apps 80 75.0 71.9 71.9 70 62.5 62.5 59.4 60 56.3 53.1 50.0 50

40 37. 5

30

20 15.6 9.4 10 3.1 0 SMS PHONE CAMERA HISTORY IDENTITY STORAGE LOCATION CONTACTS CALENDAR DEVICE & APP MICROPHONE INFORMATION INFORMATION DEVICE ID & CALL WI-FI CONNECTION CONNECTION WI-FI PHONE/MEDIA/FILES

Permissions the apps’ use of this access. Further study of the apps’ code may reveal why this access 4. What data are being collected by is required and how the data is stored and digital finance providers over these links? processed. Additionally, the reasons for access (Permissions) to “microphone” (9.4 percent), “device and app history”16 (15.6 percent) and the camera (56.3 We examined the permissions required by percent) were neither discussed in the privacy apps. Figure 18 shows the frequency of each policy nor obvious in their intentions. permission type. Most apps required access to the device’s identity, location, phone call Conclusions capability, photos/media/files, storage, and Overall, we found numerous egregious security device ID and call information. Though errors in over half of the apps we examined, such permissions may be requested for including misuse of cryptography, use of legitimate reasons (e.g., phone call capability weak cryptography, and excessive permission for automatically dialing customer service), requirements. Many of these issues have been all of these could be misused to leak private known by both the academic community and information about the customer. industry for years, and these problems put Requests for a number of other permissions both consumers and providers at severe risk were more concerning. For instance, 72 percent of compromise. As we will show in the next of the apps requested access to a user’s section, these problems can become even contacts. It was unclear how contacts access worse for the consumer when the provider would be used on the device; we found absolves itself of risk, forcing the consumer nothing in their privacy policies outlining to accept the risk of the apps’ flaws.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 25 Terms of Service Analysis

We reviewed each app's terms of service Only eight terms of service agreements document to examine how companies were addressed fraud by the user and only eight distributing liability (for the consequences addressed fraud against the user. Those that of kinds of security breaches described here) did address fraud against the user noted that between themselves and users. Our intention the provider is not liable. This places users in was to identify how the providers' terms of an untenable position. To use an app — many of service described the user liability in case of which we have shown to be vulnerable — they compromise. Our codebook for this analysis must accept these terms. If a fraud occurs, the was: "liable, liability, harmless, responsible, user must bear the consequences. Such terms fraud," and we followed the same coding can harm consumer trust of these apps and process discussed previously in our privacy may ultimately constrain the expansion of analysis. Our results are shown in Table 12. these critical services.

26 CENTER FOR FINANCIAL INCLUSION TABLE 12

Terms of Service: Fraud

COMPANIES MENTIONS COMPANIES MENTIONS WITH APPS FRAUD BY USER MENTIONS FRAUD AGAINST USER WITH APPS FRAUD BY USER MENTIONS FRAUD AGAINST USER

Company 1 – – Company 16 ✓ –

Company 2 – – Company 17 – –

Company 3 – – Company 18 – –

Company 4 ✓ “exclusion of liability” Company 19 – –

Company 5 ✓ – Company 20 – –

Company 6 ✓ – Company 21 – –

Company 7 ✓ “keep the Bank and Finserve Company 22 – – free and harmless” Company 23 ✓ “exclusion of liability” Company 8 – “Limitation of Liability exception” Company 24 – –

Company 25 – – Company 9 ✓ “will not be liable for” – – Company 10 – – Company 26 – – Company 11 – – Company 27 Company 28 – – Company 12 ✓ “fully assume the risks of liability” Company 29 – – – “no manner responsible Company 13 Company 30 – – for any claim” Company 31 – – Company 14 – – Company 32 – – Company 15 – “exclusion of liability” Company 33 – –

Note Company names in this table have been anonymized.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 27 Conclusions and Recommendations

based digital finance providers were longer and more difficult to read. Virtually all were We believe in the potentially transformative less understandable than the privacy policies power of digital finance. These powerful of traditional US-based financial institutions. financial instruments, their providers, and their Moreover, as our technical analysis confirmed, none of the policies provide specifics regarding users deserve the best possible protections. the types of data that will be collected or how it will be handled. Given that many of these applications request access to information such as GPS location, contact lists, cameras, microphones, and calendars, Credit can be a tremendous tool to empower we believe that these companies must consumers or help build businesses. For many significantly improve their actions in regards people, digital finance systems make it possible to sensitive user data, starting with improving to access credit for the first time, while for the quality of privacy policies. many others they increase the convenience and All players in the industry would do flexibility of the credit they can use. As with well to adopt either the FDIC or GSMA any powerful tool — a chainsaw, an automobile, recommendations — or indeed, both. Policies a medicine — it is essential for users to be should also explicitly identify the sensitive confident that it will perform safely when user data they intend to collect so that users used wisely. When the security of a credit tool can make informed decisions when selecting can be compromised, however, consumers are a lender. Beyond this, the industry should exposed to potentially very serious risks; and strive for further clarity in the creation of these these risks could also affect the viability of the policies, including writing policies that match company providing the service. the reading level and language of their intended Digital finance often relies on access to markets. Conformance to this suggestion could non-traditional data sources, especially data easily be tested using the tools we applied in our held on mobile devices. As such, it is critical to evaluation. Adoption of this recommendation determine how data is treated and protected would not only provide greater clarity for users, in such systems, and to measure the policy but would also make digital lenders better at and infrastructure that is deployed to minimize communicating privacy policies than traditional potential harm. In a world where even large banks, furthering their value proposition as a and seemingly well-protected financial more convenient and beneficial alternative. companies are being breached, such protections have never been more critical. Technical Security Recommendations Our security analysis focused on security of Privacy Policy Recommendations applications running on mobile devices, proper Our analysis of the privacy and security of authentication, and proper configuration digital finance providers characterized the areas of servers. We found widespread misuse of of coverage offered by each policy, followed cryptographic algorithms, including the use by an analysis of readability. In general, we of algorithms that have long been publicly discovered that the policies of internationally- deprecated (i.e., listed as beyond their useful

28 CENTER FOR FINANCIAL INCLUSION lifetime and therefore inherently dangerous). We discovered that 17 of the 27 apps offered It is critical to determine how data is treated such dangerous options. Our results for and protected in digital finance models, authentication were better, but four of the 27 and to measure the policy and infrastructure apps still exhibited significant vulnerabilities that would allow an adversary to impersonate that is deployed to minimize potential harm. the server and potentially trick clients into exposing their data. Finally, we measured server TLS/SSL configuration and demonstrated a wide range of security (even among different servers involved in a single app). At least eight of the should be enacted for the digital finance space. apps had entirely vulnerable configurations that While challenging, due to the wide range of represent a significant threat to their users. Poor jurisdictions involved, such strong consumer security configuration also impacted digital protection laws have the positive side-effect of finance provider websites, with approximately encouraging financial companies to invest in one-quarter also receiving a failing grade. protecting consumers so as to reduce their own At the time of this report’s publication, the losses to fraud. strongest possible relevant standard for secure In an effort to promote greater transparency communications is TLS 1.2. Digital finance and collaboration within the sector, CFI (with providers should eliminate the use of all other the collaboration of the Venture Lab team within versions as soon as possible. Factors that Accion) carried out a consultative process for may slow down such changes include legacy all the companies included in the sample for devices; however, a measurement study of TLS this research project. An effort to reach all 53 versions available to users would easily allow companies via email was made, and only one for a reasonable timeline to be developed. The company could not be contacted because no use of TLS alone is not enough — it must be contact information for that company was ever correctly parameterized. Applications should found. The remaining 52 institutions were all use strong encryption ciphers and hashing sent a brief report containing their individual algorithms (e.g., “AES_256_CBC_SHA256”). Such scores for each test, together with a document algorithms can be efficiently implemented in that briefly described what each test entailed both client and server with negligible impact and a letter of invitation to participate in a to performance. Support for weak ciphers private discussion with the research team (e.g., DES, 3DES) and weak hashing algorithms in order to answer any questions they might (e.g., MD5) must be eliminated immediately have about the findings of the study or any next from the servers of all digital finance providers, steps. We also shared with them a guidebook as they provide a false sense of security. Such that was developed by the research team to configuration should be uniformly applied help start-up fintech companies solve many of across servers and mobile devices. Finally, the most common data security vulnerabilities digital financial applications should not rely uncovered by this study. Out of the 52 companies on third-party advertising or metrics libraries, that were contacted, only seven companies some of which — our analysis shows — are responded and participated in this discussion. guilty of poor information security practices. A private link with the recording of this session All such functionality should be developed and was later shared with all companies via email. protected by the digital lenders themselves. We believe in the potentially transformative power of digital finance. However, we also believe Terms of Service Recommendations that these powerful financial instruments, Finally, we explored the terms of service their providers, and their users deserve the best associated with each app. Unfortunately, only a possible protections. Our measurement study small number of these providers had available demonstrated that this industry has significant terms of service. Financial institutions in the room for improvement in the availability, US have a legally-mandated maximum value accessibility and clarity of privacy policies of fraud for which consumers are liable (e.g., and in the correct use of security mechanisms US $50 for credit cards). Similar standards in both mobile devices and backend servers.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 29 ANNEX A Word Count vs. Average Reading Grade Level of Privacy Policies

International Digital Lenders’ Privacy Policies

12,500

10,000

7, 5 0 0

5,000 WORD COUNT

2,500

0 7 9 11 13 15 17

AVERAGE READING GRADE LEVEL

30 CENTER FOR FINANCIAL INCLUSION US Digital Lenders’ Privacy Policies

12,500

10,000

7, 5 0 0

5,000 WORD COUNT

2,500

0 7 9 11 13 15 17

AVERAGE READING GRADE LEVEL

US Traditional Banks

12,500

10,000

7, 5 0 0

5,000 WORD COUNT

2,500

0 7 9 11 13 15 17

AVERAGE READING GRADE LEVEL

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 31 ANNEX B

Digital Lenders Evaluated and Analyses Performed

PRIVACY POLICY APP SECURITY QUALYS PRIVACY POLICY APP SECURITY QUALYS ANALYSIS ANALYSIS ANALYSIS ANALYSIS ANALYSIS ANALYSIS

Airtel ✓ ✓ ✓ Kubo Financiero ✓ – ✓

Atom ✓ – ✓ Lending Club ✓ – ✓

Azimo ✓ ✓ ✓ Lulalend ✓ – ✓

BlueVine ✓ – ✓ M-Pawa – ✓ ✓

Branch ✓ ✓ ✓ MCo-op Cash – ✓ ✓

C2FO ✓ – ✓ Micromobile – – ✓

Coins ✓ ✓ ✓ MoneyTap ✓ – ✓

CommonBond ✓ – ✓ OnDeck ✓ – ✓

Creditas ✓ – ✓ Pay Your Tuition Funds ✓ – ✓

CrowdEstates ✓ – ✓ PaySense ✓ ✓ ✓

EcoCash Loans – ✓ ✓ PesaPata – ✓ ✓

Equitel – ✓ ✓ Prosper ✓ ✓ ✓

Equity Direct Mobile – ✓ ✓ Puddle ✓ – ✓

Farm Drive – ✓ ✓ RoadLoans ✓ ✓ ✓

FastPay ✓ – ✓ Saida – – ✓

GetBucks ✓ ✓ ✓ Salud Fácil – – –

IndiaMART ✓ ✓ ✓ SMECorner ✓ – ✓

Insikt ✓ – ✓ Social Lender ✓ – ✓

InstaPaisa ✓ – ✓ Suregifts Redemption – ✓ ✓

InvoiNet ✓ ✓ ✓ Tala – – ✓

Jimubox ✓ – ✓ Taplend – ✓ ✓

Kabbage ✓ ✓ ✓ Tencent ✓ ✓ ✓

KCB ✓ ✓ ✓ Upstart ✓ – ✓

Kiva ✓ ✓ ✓ WeFinance ✓ – ✓

Koopkrag – ✓ ✓ Yoco ✓ ✓ ✓

Kopo Kopo ✓ ✓ ✓ Zidisha ✓ ✓ ✓

32 CENTER FOR FINANCIAL INCLUSION Notes

1 We excluded Salud Fácil of Mexico from 6 The companies not analyzed in 12 This is a nuanced issue. Comparing this portion of the study because its site this section operate through webpages, two unrelated ciphers is difficult and is solely in Spanish. To be relevant for rather than mobile apps. Therefore, the protection provided by a specific key users, local language versions would be we were unable to examine their length in one may not be equal to the needed but, overall, we found few policies operations without explicit access protection provided by the same key available in such languages. to codes from the company. length in the other. However, general 2 Federal Deposit Insurance Corporation. 7 National Institute of Standards and practice for publicly-vetted symmetric Privacy Rule Handbook. Accessed Technology (NIST). US Department encryption ciphers is to use keys of at through the FDIC web site at https:// of Commerce. June 2, 2005. “NIST least 128 bits, with 256 bits being the www.fdic.gov/regulations/examinations/ withdraws outdated data encryption currently recommended standard. financialprivacy/handbook/index.html. standard.” Accessed online at: 13 Misspelled lyrics to the song “Chop 3 GSMA (The GSM Association). Mobile https://www.nist.gov/news-events/ Suey!” by the rock band System of a Down. Privacy Principles: Promoting Consumer news/2005/06/nist-withdraws-outdated- 14 GitHub, Inc. 2018. “juspay ec-android- Privacy in the Mobile Ecosystem. Accessed data-encryption-standard. demo.” Accessed online at: https://github. through the GSMA web site at: https:// 8 National Institute of Standards and com/juspay/ec-android-demo/blob/ www.gsma.com/publicpolicy/wp-content/ Technology (NIST). US Department of master/expresscheckout/build.gradle uploads/2016/02/GSMA2016_Guidelines_ Commerce. July 11, 2017. “Update to 15 MD5 and SHA1 are standardized Mobile_Privacy_Principles.pdf. current use and deprecation of TDEA.” hashing algorithms, not standardized 4 Bowers, Jasmine, Bradley Reaves, Accessed online at: https://csrc.nist.gov/ encryption algorithms. Hashing Imani N. Sherman, Patrick Traynor, news/2017/update-to-current-use-and- algorithms are used to make concise and Kevin Butler. 2017. Regulators, Mount deprecation-of-tdea. representations of content and are crucial Up! Analysis of Privacy Policies for Mobile 9 An alternative standard, the Secure to ensuring the integrity (as opposed Money Applications, In “Proceedings Sockets Layer (SSL), was used prior to confidentiality, which is provided by of the USENIX Symposium on Usable to TLS. SSL/TLS are often used encryption) of communications. Privacy and Security (SOUPS).” Accessible interchangeably when discussing 16 This permission can allow an online at: https://www.usenix.org/ secure communications. We will use application to gain access to sensitive system/files/conference/soups2017/ the correct term where appropriate. logs, learn the identity of the other soups2017-bowers.pdf. 10 Qualys, Inc. “SSL Server Test” applications running on the phone 5 Central Intelligence Agency. “Field accessed online at https://www.ssllabs. and read web bookmarks. listing: School life expectancy (primary com/ssltest/. to tertiary education).” Accessed at 11 Qualys, Inc. May 8, 2017. “SSL Server https://www.cia.gov/library/publications/ Rating Guide.” Accessed online at: https:// the-world-factbook/fields/2205.html#uk. github.com/ssllabs/research/wiki/SSL- Server-Rating-Guide.

DIGITAL FINANCE AND DATA SECURITY: HOW PRIVATE AND SECURE IS DATA USED IN DIGITAL FINANCE? 33 The Center for Financial Inclusion at Accion (CFI) is an action-oriented think tank that engages and challenges the industry to better serve, protect, and empower clients. We develop insights, advocate on behalf of clients, and collaborate with stakeholders to achieve a comprehensive vision for financial inclusion. We are dedicated to enabling 3 billion people who are left out of — or poorly served by — the financial sector to improve their lives. www.centerforfinancialinclusion.org www.cfi-blog.org @CFI_Accion