BLOCKCHAIN AND DIGITAL SIGNATURES FOR DIGITAL SELF-SOVEREIGNTY

______

A Thesis Presented to the Faculty of the Department of Computer Science University of Houston ______

In Partial Fulfillment of the Requirements for the Degree Masters of Science ______

By Brijesh B. Patel December 2018

BLOCKCHAIN AND DIGITAL SIGNATURES FOR DIGITAL SELF-SOVEREIGNTY

______

Brijesh B. Patel

APPROVED:

______Dr. Weidong Shi, Chairman Dept. of Computer Science

______

Dr. Nikolaos V. Tsekos Dept. of Computer Science

______Dr. Chris Bronk Dept. of Information System Security

______Dan Wells, Dean College of Natural Sciences and Mathematics

Abstract

Principles of self-sovereignty have been integrated into the solution to achieve a mechanism where the user is in control of their digital identity attributes.

Through the use of attribute-based credentials, the solution presented here allows the user to control access to their digital identity attributes, so they only have to release the required attributes to the business entities. Selective disclosure proofs, enabled by cryptographically signed containers, allow for minimization of the identity attributes transferred to execute a transaction. The user can consent to access to their identity attributes by granting access licenses to business entities through a blockchain application running on their mobile device.

Also, the user can modify the access license to restrict the access based on time or revoke access to any identity attribute. Privacy of identity attributes and access licenses stored on mobile devices is ensured by integration of transparent data encryption. Dependency on any middleman entity required by several other identity management solutions is eliminated through the use of digital signatures.

The communication between actors involved in each transaction is encrypted through a public key infrastructure (PKI), ensuring the security of the claims packages transferred.

The solution enables portability through the use of digital signatures to verify the validation of identity attributes performed by the identity guarantor. The user is able to determine the lifespan of any identity attribute through the mobile application and remove it from any future digital transaction. The solution presented here allows for the application of theoretical principles of self-sovereign identity in the everyday life of the user.

Contents

1. INTRODUCTION
1.1 Digital Identity
1.2 Authentication
1.3 Centralized Vs. Decentralized
1.4 Blockchain
1.5 Research

2. IDENTITY MANAGEMENT
2.1 Concepts
2.1.1 Know Your Customers (KYC)
2.1.2 Federation
2.1.3 Claims
2.1.4 Self-sovereign Identity
2.1.5 Attribute-based credentials
2.1.6 Identity Governance Framework
2.2 Real-world examples
2.2.1 BanQu App
2.2.2 Bitnation
2.2.3 BlockAuth
2.2.4 Civic
2.3 Consumer expectations

3. BLOCKCHAIN
3.1 History
3.2 Cryptocurrency
3.2.1 Bitcoin
3.2.2 Ethereum
3.3 Blockchain components
3.3.1 Nodes
3.3.2 Blocks
3.3.3 Transaction
3.3.4 Hash Functions
3.3.5 Consensus
3.4 Fields of applications

4. SOLUTION
4.1 Necessities
4.1.1 Shortcomings
4.2 High-level Architecture
4.2.1 Actors
4.2.2 Use cases
4.3 User
4.3.1 Attribute creation
4.3.2 User granting access license
4.3.3 Tracking usage by Relying party
4.3.4 User revoking or editing license
4.3.5 Effect on User
4.4 Other actors
4.4.1 Identity guarantor: Validation of identity attributes
4.4.2 Effect on identity guarantor
4.4.3 Effect on relying party
4.5 Communication and storage
4.5.1 Secure communication
4.5.2 Storage of attributes
4.5.3 Limitation

5. Future and Conclusion
5.1 Online shopping
5.2 Single-sign on
5.3 Conclusion

REFERENCES

List of Figures

Figure 1.1 Single Sign-on Sequence Diagram
Figure 1.2 Centralized vs Decentralized
Figure 1.3 Structure of blockchain
Figure 2.1 Federated identity management
Figure 2.2 Web service with STS
Figure 2.3 BanQu App
Figure 4.1 High-level application architecture
Figure 4.2 Use case diagram
Figure 4.3 Attribute creation
Figure 4.5 User granting access license
Figure 4.6 Accessing attributes with license
Figure 4.7 Certification and keys
Figure 4.8 Digital certification

1. INTRODUCTION

1.1 Digital Identity

Digital identity is the network or Internet equivalent to the real identity of a person or entity (like a business or government agency) when used for identification in transactions from PCs, cell phones, or other personal or commercial devices [1].

This identity is based on a person’s real identity and requires the person to validate their real identity before their digital identity is established. After the establishment of digital identity, it can be used to authenticate and prove one’s identity in any digital transaction [1].

The process of authentication becomes increasingly complicated when verifying sensitive information about a person’s identity. Authentication is based on attributes that are divided into the following categories:

Inherent: An entity inherits these attributes at the time of its conception. For an individual, these attributes include age, height, birthdate, and fingerprints.

Accumulated: These attributes are something an individual has acquired over their lifetime [1]. These include health records, bank accounts, social media profiles, and employment history. These attributes may change throughout an entity’s lifespan.

Assigned: These attributes have been attached to the entity over their lifetime, but are not intrinsic to their lifespan [1]. Assigned attributes are based on the entity’s relationship with other entities. These include government IDs such as social security number, driver’s license number as well as utility related IDs such as phone number.

The above-mentioned attributes allow the individual holding the identity to conduct transactions with other individuals, business entities, and government entities. All parties involved in these transactions belong to one of the following categories:

Users: Entities or individuals who need to engage in a transaction.

Identity Provider: Entities that contain the identity attributes and execute the transaction on behalf of the user.

Relying parties: Entities that rely on the identity provider to provide the user’s identity attributes to finish the transaction. These entities are mostly businesses or government entities that are meant to provide a particular form of service.

Governance body: Organizations that keep oversight and regulate the existing standards in the governance of digital identity systems.

The authentication for transactions is done through a Digital Identity System (DIS). It has the same structure and process as a physical identity system, but the storage and processing of identity attributes are done digitally, eliminating the need for manual processing [2]. Increasing transaction volumes, increasing transaction complexity, rising customer expectations, more stringent regulatory requirements, and increasing speed of financial and reputational damage are some of the key trends driving the need for a DIS. In its abstract form, the structure of a DIS consists of multiple layers with the purposes of attribute collection, authentication, attribute exchange, and authorization.

1.2 Authentication

Among these layers, authentication is the key to determining the efficiency and convenience of an identity management system. Authentication allows a person to prove their identity credentials in order to execute transactions with other parties. Methods of authentication fall into three categories.

What you know: This information can be a secret key or any password, ranging from personal information to government or financial information such as an ATM PIN or birthdate. This kind of authentication can be tricky, as the user might have to remember multiple passwords or numbers and may be easily coaxed into providing them.

What you have: This could be some form of access key or door key. It is often used in office settings. The digital version of this kind of authentication could be a key fob or an app that displays a number that users can enter into a portal to gain access. These are forms of digital tokens.

What you are: This is related to the inherent aspect of digital identity. It could be any form of biometric, from fingerprints to retina scans. An example is a smartphone’s login system, where the user has to use their fingerprint to unlock the phone.
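The "what you have" factor can be illustrated with a one-time-password token. The sketch below is not part of the solution presented in this thesis; it assumes the key fob or app follows the HOTP and TOTP schemes standardized in RFC 4226 and RFC 6238, and the function names hotp and totp are illustrative.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Counter-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, step: int = 30, now=None) -> str:
    """Time-based variant (RFC 6238): the counter is the current time step."""
    now = time.time() if now is None else now
    return hotp(secret, int(now // step))
```

The token and the verifying portal share the secret. The portal recomputes the code for the current time step and compares it with the number the user typed in.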

Figure 1.1 Single Sign-on Sequence Diagram

Authentication enables several forms of Identity & Access Management (IAM) systems such as Single Sign-On (SSO). SSO is a form of authentication that allows the user to access multiple platforms and service providers through a single identity. This process can be used across multiple domains and systems within a single organization or across multiple organizations. It is usually managed by a service provider, identity provider, or some form of directory server [7]. The principal goal of SSO is to provide technical interoperability for the actors involved through a single login across multiple systems.

Another example of an IAM system is Federated Identity Management (FIM). Federated identity links a user’s identity across multiple security domains, each with its own IAM system [7]. This form is very similar to SSO. Current examples of SSO are Google Sign-In and Facebook Login. Federated identity, on the other hand, is more concrete: the identity has been established with a designated identity provider.

An individual can have multiple identities across multiple platforms, such as various social media profiles. The security of these credentials has also become an issue. There are systems such as OpenID Connect that aim to provide a more convenient and secure SSO mechanism for users. One of the early examples of an IAM system is Microsoft Passport. Established in 1997, it was the first initiative to allow users to use the same identity on multiple platforms. Microsoft used identity federation for this approach.

Unable to maintain user preferences and address users’ negative experiences, Microsoft became the centralized authority in the Passport federated identity system, acting as the central identity provider shown in the SSO sequence diagram of Figure 1.1. Since then, several identity management systems have been created to fit the needs of users better. In 2001, an organization called the Liberty Alliance was established to set standards and guidelines for federated identity, which became an alternative to Microsoft Passport. These became the basis for what later came to be known as Security Assertion Markup Language (SAML).

1.3 Centralized Vs. Decentralized

Attribute-based Credentials (AbC) are an authentication mechanism that allows flexible and selective authentication of different attributes about a user without revealing any additional information about them (the zero-knowledge property).

Figure 1.2 Centralized vs Decentralized

Although implementation of AbC requires substantial infrastructure and optimization, several mechanisms and solutions have been implemented to realize AbC-based authentication. One such example is the IRMA (I Release My Attributes) project of Radboud University Nijmegen, which aims to prove that even devices with limited resources such as smart cards can be used to implement an efficient AbC [8]. For example, a person can use a smart card to prove that they are 21 years or older and can buy alcohol. This approach is an example of a Decentralized Identity Management System (DIMS). A DIMS provides an alternative to systems like Microsoft Passport that used centralized mechanisms for identity management. Figure 1.2 shows the difference between centralized and decentralized systems.

These claims-based DIMS can be problematic because financial institutions, as well as several other kinds of institutions, cannot participate. Financial institutions are required to keep track of all account holders. Tracing the origin of claims has to be possible so that these institutions can establish the validity of the claims through their origin. The US government has issued record-breaking fines to financial institutions that failed to sufficiently comply with Anti-money Laundering (AmL) requirements [12]. Lately, several different kinds of systems have been created to address these issues. One such example, Sovrin, which IBM supports, implements a global trust framework that provides the legal and policy foundation for identity management globally. Another such project in progress is Microsoft’s identity management system, which makes use of decentralized identifiers, identity hubs, and universal resolvers for resolving identity attributes.

1.4 Blockchain

“The blockchain is an incorruptible digital ledger of economic transactions that can be programmed to record not just financial transactions but virtually everything of value.”

- Don & Alex Tapscott, authors of Blockchain Revolution (2016)

The original blockchain is the decentralized ledger behind the digital currency Bitcoin. The ledger consists of transactions grouped into blocks that are linked together to form a chain. A copy of this ledger is stored on each of the 200,000 computers that make up the Bitcoin network. Every time the ledger is changed, the record is cryptographically signed to prove that the person involved in the transaction is the actual owner of the coins. Also, no one can run a duplicate transaction to spend the same coin twice, as every node in the network has a record of that coin being spent.

Figure 1.3 Structure of blockchain [15]

These protocols ensure that an incorruptible record is preserved for all changes. The result is a blockchain ledger that the individuals involved in a transaction can trust without needing to trust any third party [13].

Blockchain guarantees immutability of the ledger. Once data has been written to the blockchain, no one, not even a system administrator, can change it [14]. In a blockchain, each block is referenced by its hash, and each block explicitly specifies the preceding block using a reference to that block’s hash. Thus, the data in the blockchain is always consistent, and the hashes can be recomputed to check that the data matches up. If it does not, the integrity of the data has likely been compromised. Also, since the ledger is shared, there are multiple copies of the blockchain. If one copy has been compromised, the hashes of copies held by other nodes can be checked to see which parts of the data have been altered. Therefore, the only way to change the blockchain is to make an actual change (or transaction) and add blocks to the blockchain.

Immutability at this point is guaranteed by the longest chain rule in the consensus algorithm. This rule states that when there are multiple competing valid chains, the one with more blocks is accepted, ensuring that all the changes have been recorded with hashes matching up.
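The hash linking and the longest chain rule described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not Bitcoin's actual data format: blocks are plain dictionaries, the field names are hypothetical, and signatures and proof of work are omitted.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Add a block that stores the hash of its predecessor."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "transactions": transactions})


def is_valid(chain: list) -> bool:
    """A chain is valid when every block references the hash of the block before it."""
    for prev, current in zip(chain, chain[1:]):
        if current["prev_hash"] != block_hash(prev):
            return False
    return True


def choose_chain(candidates: list) -> list:
    """Longest chain rule: among the valid candidates, keep the one with the most blocks."""
    valid = [chain for chain in candidates if is_valid(chain)]
    return max(valid, key=len)
```

Changing the transactions of an earlier block changes that block's hash, so the next block's prev_hash no longer matches and is_valid rejects the copy. In this sketch a change to the newest block goes unnoticed until another block is appended on top of it.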

1.5 Research

Identity management is one of the broadest issues in the world of IT security. Very few other concepts single-handedly impact the management of the enterprise as well as consumer business services as much as identity and access management. From enterprises managing access control for employees to consumers seeking to receive any service, all are heavily dependent on secure and reliable identity and access management [16]. Digital identity management has more than its share of challenges. There are no established standards for identity records or authentication processes. Approaches such as Security Assertion Markup Language (SAML) and Extensible Markup Language (XML) frameworks are gaining momentum in standards and guidelines organizations such as OASIS and Liberty Alliance, but they still await formal integration and standardization.

Digital transactions make up a substantial part of the economy nowadays. One of the significant challenges of digital identity is authentication, especially in e-commerce and service registration. According to a 2007 survey by Gartner, the number of identity theft victims in the US increased by more than 50 percent since 2003 [9]. Also, recent investigations into Facebook’s selling of user data have raised public awareness and led people to question who is holding their identity. They do not feel in control of whom they share their identity with.

The question raised is: how can we design a mechanism where a person can be the sole owner of their identity? Such an architecture would allow its users to share their identity with automatic validation while preserving their privacy and avoiding reliance on a centralized authority.

In this research, I will look at what a Decentralized Identity Management System (DIMS) consists of and at what an average user (consumer) will expect from a DIMS. This research will prove that there is a need for such a DIMS. I will take a more in-depth look at the design and algorithms that make blockchain possible and at how blockchain can be applied to the issue of decentralized identity management. I will then look at the architecture required to make such a solution possible. Finally, I will look at how this new solution will impact both consumers and businesses, including business and service models.

2. IDENTITY MANAGEMENT

First, the ideas and concepts for identity management systems are explored. Second, real-world examples put into application by commercial as well as government entities are presented. Finally, an analysis of what users expect from a DIMS is presented.

2.1 Concepts

This section talks about various forms of organization and procedures that are applied in the creation of a DIMS.

2.1.1 Know Your Customers (KYC)

Know Your Customer (KYC) is a procedure that a business follows in order to identify and verify the identity of its clients [17]. KYC can be described by the following steps undertaken by a financial institution or any business entity:

• Establish the identity of the customer.

• Find out about the client’s financial activity (the primary goal here is to verify that the customer’s income comes from a legitimate source).

• Calculate the risk associated with the client in order to set up monitoring of the client’s activity and manage risk. This includes the steps taken for Customer Due Diligence (CDD), which is divided into simplified, basic, and enhanced Customer Due Diligence, each with an increasing level of risk management [18].

These procedures serve a critical function in assessing the risk associated with the customer as well as in monitoring the client’s financial activity, since many institutions are required to follow regulations for fraud prevention and Anti-money Laundering (AmL).

Nowadays, financial institutions and cryptocurrency exchanges are required to abide by KYC requirements. To be fully compliant with regulations, banks are required to maintain the highest possible standard of KYC process and yet remain easily accessible for services. KYC is designed to prevent fraud as well as to conform to international Anti-money Laundering (AmL) compliance requirements [17].

2.1.2 Federation

Consumers nowadays have a very active presence in the digital world and often need to access a range of services. Direct authentication requires the user to remember authentication credentials for every service. Federation of identity solves this problem by creating an enterprise resource that handles authentication on behalf of the rest of the services. Thus, the relying businesses are relieved from the task of identifying and authenticating the user [19].

Conceptually, federated identity management is a set of agreements, standards, and technologies that allow for porting of identities, identifying attributes, and privileges across multiple enterprise platforms. In general, it includes identity providers and service providers (relying parties).

Figure 2.1 Federated identity management [20]

Federated identity enables identification and authentication mechanisms such as single sign-on (SSO), which allows multiple organizations to provide interoperability and ease of access to the user. An employee from one organization can use the same credentials throughout multiple platforms that are connected through a federation in a trust relationship [20].

Federation also allows for a standardized representation of identity attributes (attributes belonging to an individual’s identity such as name and date of birth) that may include data such as biometric information and other sensitive information such as account numbers, roles, and geolocation as well as file access permissions.

Federated identity allows these attributes to be used as identifiers associated with multiple roles and permissions. Federated identity also provides identity mapping. Identity mapping is needed because different domains represent identities and attributes differently, and one domain may contain more information about a user than other domains. In these scenarios, federated identity-management protocols map the identity of a user from one domain to match the requirements of the other domains [20].
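As a small illustration of identity mapping, the sketch below renames the attributes of a record from one domain into the names another domain expects and drops attributes the target domain has no use for. The domain names, attribute names, and the map_identity helper are all hypothetical and not taken from any federation standard.

```python
# Hypothetical mapping from an HR domain's attribute names to a payroll domain's.
HR_TO_PAYROLL = {
    "full_name": "name",
    "dob": "date_of_birth",
    "emp_role": "role",
}


def map_identity(record: dict, mapping: dict) -> dict:
    """Rename the attributes the target domain understands and drop the rest."""
    return {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }
```

Here the source domain may hold more attributes than the mapping names. Only the mapped ones cross the domain boundary.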

2.1.3 Claims

Claims are often pieces of identity such as name, e-mail, age, and role. In this method of authentication, the user delivers claims to a service provider and the service provider examines the claims. A trusted issuer issues the claim, and the service provider approves the claim based on its trust in the issuer and the authenticity of the claim. Claims-based identity and access management allows for abstraction of specific data while providing enough data to authenticate the user. Claims are meant to increase federation between organizations but aren’t limited to federation [21].

Claims depend on trust between the issuing party and service providers (relying parties). A Security Token Service (STS) packages one or more claims into a cryptographically signed security token and sends it to the relying party in one of several formats, such as XML-based SAML or Simple Web Token (SWT).

Once the relying party receives the token, it verifies the token signature. If it is successfully verified, the service provider application uses the packaged claims for any other authorization checks or personalization [23]. Figure 2.2 shows the process of claims-based IAM. The process starts with the user requesting access to a service through the application and the STS taking the user’s credentials (step 1), packaging claims in the token (step 2), and returning the cryptographically signed token to the application (step 3). The following steps describe the token being sent to the web server and the authorization processes performed on the claims packaged in the token [22].

Figure 2.2 Web service with STS [22]
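The issue-and-verify cycle of a claims token can be sketched as follows. This is a simplified illustration, not the SAML or SWT wire format: the token layout, the STS_KEY constant, and the function names are assumptions, and an HMAC with a key shared between the STS and the relying party stands in for the signature. A deployed STS would typically sign with a private key so that relying parties need only its public key.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical key known to both the STS and the relying party (assumption).
STS_KEY = b"demo-key-shared-by-sts-and-relying-party"


def issue_token(claims: dict, key: bytes = STS_KEY) -> str:
    """STS side: package the claims and attach a keyed signature."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def verify_token(token: str, key: bytes = STS_KEY):
    """Relying party side: return the claims if the signature checks out, else None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

The relying party uses the returned claims for its authorization checks only after verification succeeds. A token whose payload was altered, or one signed with a different key, yields None.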

2.1.4 Self-sovereign Identity

Self-sovereign digital identity is the idea that people must be able to store their identifying information on their own devices and provide it to any service provider for validation without any centralized authority [25]. However, there is no consensus on what precisely self-sovereign identity comprises. Self-sovereign identity is destined to be the next step away from centralized identity. The idea of self-sovereignty in identity management not only provides interoperability across multiple platforms but also gives users greater control over their own identity.

The following principles form the basis of self-sovereign identity [26]:

1. Existence: Users must have a physical existence beyond their digital identity. The digital self-sovereign identity must only make some parts of the identity available in digital form.

2. Control: Users must be able to exercise control over their identity. The user should always be able to refer to it and update it. They must be able to set their own rules of privacy. The user should also have the “right to be forgotten” when they wish.

3. Access: Users must be able to access their data and retrieve claims and other data belonging to their identity. There must be no hidden data. However, the user must only be able to access claims and not modify them.

4. Transparency: The mechanism of identity management should be transparent and open for observation and cross-examination. The algorithms, platform, and maintenance must be open-source and available for everyone to understand it and evaluate it.

5. Persistence: Existence of identity must be either forever, as long as the user wants or at least as long as the user exists. The claims, validation of claims, and keys might change, but the identity itself must persist.

6. Portability: The identity stored on the user’s devices must be transportable. Even when regimes change or trusted entities change, the identity must be portable. So identity cannot be solely held by a third party. The user must remain in control of their identity. Portability also improves an identity’s persistence.

7. Interoperability: Identity must be available across multiple platforms. Identity should be available across international boundaries, thus creating a global identity. Interoperability can be efficiently delivered through persistence and portability.

8. Consent: Users are in control of their identity and thus must be able to provide consent to the use of it. Interoperability and portability facilitate faster and greater sharing of identity. Even when an entity other than the user presents claims to identity, consent must be provided by the user. Consent must be well-understood and explicit.

9. Minimization: Sharing and disclosure of claims and data must involve the minimum amount of data. The attributes shared must be shared only when required. A layer of abstraction must be introduced in order to hide the specific data when sharing claims. This principle can be implemented through selective disclosure, range proofs and other zero-knowledge techniques [26].

10. Protection: Right to privacy and other individual rights must be protected. In any conflict between the requirement of identity and the right of an individual, the network must uphold the rights of the individual.

2.1.5 Attribute-based credentials

Attribute-based credentials (AbC) are a form of authentication that lets the user choose the attributes to release on a strictly need-to-know basis. An attribute-based credential is a cryptographic container of attributes associated with an individual’s identity. It is signed with a unique digital signature by a trusted, authoritative issuer that attests to the validity of these attributes. The signature here serves two purposes. First, it guarantees the integrity of the data, so the relying party can know that no unauthorized change has occurred in the container. Second, it guarantees the privacy of the user, because the trusted issuer does not have to be involved in the verification process. Thus, the user cannot be linked to an issuer. The user can use these attributes across multiple platforms individually and independently of each other. This process is called a selective disclosure proof, which uses zero-knowledge proofs as the underlying privacy-enhancing technology (PET). Thus, AbC provides unlinkability, security, and confidentiality. The user can even collect attributes from multiple issuers and carry them on a trusted device such as a smart card in order to reveal them to a service provider for verification, authentication, and identification purposes [28].
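To make selective disclosure concrete, the sketch below uses a simpler technique than the zero-knowledge proofs named above: the issuer commits to each attribute with a salted hash and signs only the list of commitments, and the user later reveals the value and salt of just the attributes they choose. All names are hypothetical, and an HMAC with a demo key stands in for the issuer's digital signature; with a real public-key signature the relying party would not need any issuer secret. Unlike true AbC schemes, this sketch does not provide unlinkability, because the same container is shown on every presentation.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical demo key standing in for the issuer's signing key (assumption).
ISSUER_KEY = b"demo-issuer-key"


def _digest(name: str, value, salt: str) -> str:
    """Salted hash commitment to a single attribute."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def issue_credential(attributes: dict, key: bytes = ISSUER_KEY):
    """Issuer: commit to each attribute, then sign the list of commitments."""
    salts = {name: secrets.token_hex(16) for name in attributes}
    digests = sorted(_digest(n, v, salts[n]) for n, v in attributes.items())
    signature = hmac.new(key, json.dumps(digests).encode(), hashlib.sha256).hexdigest()
    return {"digests": digests, "signature": signature}, salts  # salts stay with the user


def present(container: dict, attributes: dict, salts: dict, reveal: list) -> dict:
    """User: disclose only the chosen attributes together with their salts."""
    disclosed = {n: {"value": attributes[n], "salt": salts[n]} for n in reveal}
    return {"container": container, "disclosed": disclosed}


def verify(presentation: dict, key: bytes = ISSUER_KEY) -> bool:
    """Relying party: check the signature, then each disclosed attribute."""
    container = presentation["container"]
    expected = hmac.new(
        key, json.dumps(container["digests"]).encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(container["signature"], expected):
        return False
    return all(
        _digest(n, d["value"], d["salt"]) in container["digests"]
        for n, d in presentation["disclosed"].items()
    )
```

A user holding name, age, and social security number attributes could call present with reveal=["over_21"] style choices so that the relying party learns one attribute and sees only opaque hashes for the rest. The issuer takes no part in the verification step.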

2.1.6 Identity Governance Framework

Organizations need to maintain and secure identity attributes of their customers and employees. Governments are increasingly creating stricter regulations concerning access and storage of sensitive information such as social security numbers, credit card numbers, and medical history. Several organizations have responded with stricter identity and access management systems in place to follow these regulations. However, there are organizations on the opposite end that fail to apply the required controls and risk theft or leak of identity-related data [26]. The Identity Governance Framework (IGF) mitigates these risks by allowing standardization of mechanisms for secured sharing of sensitive data through the establishment of contracts between their applications. These contracts create confidence that the data will not be abused or compromised. IGF provides organizations with a complete visible map of transfer and storage of information throughout all the components of an application [30].

IGF was founded in 2006 by a group including HP, Oracle, Sun Microsystems, and other organizations. The goal of IGF is to simplify the development of identity and access management systems and governance of identity-related data. IGF has implemented several standards to achieve these goals. One such project is the CARML protocol, which stands for "Client Attribute Request Markup Language" and defines the type of identity information that an application needs. An XML-based language, CARML defines the types of attributes an application needs to consume and the privacy rules for them. These include the persistence, purpose, and transfer of data through various parts of the application. Another such project is AAPML, the "Attribute Authority Policy Markup Language", which describes the constraints on the use of identity information. These include the conditions under which the data is made available and the ways the data can be used and modified [31].

Although initially founded by a group of independent organizations, IGF has become part of Liberty Alliance’s projects for addressing the issues about identity governance.

2.2 Real-world examples

This section explores some real-world examples of identity management systems.

2.2.1 BanQu App

By allowing people with no bank accounts to create a digital identity profile, the BanQu app enables tracing of fund transfers by people who do not have a digital identity set up. By supporting a variety of funds transfers such as online shopping, BanQu creates a traceable financial history for effective identity verification in the future. Thus, individuals without a digital identity can participate in the global economy.

BanQu provides other financial institutions with preventative services so that institutions such as banks can comply with Anti-money Laundering (AmL) laws. It does so by providing a shared KYC process. BanQu also benefits businesses selling valuable goods by providing traceability throughout their supply chain.

Figure 2.3 BanQu App [33]

2.2.2 Bitnation

Calling itself a “borderless voluntary nation,” Bitnation is essentially a network of people. Because this network comprises people across several nations, Bitnation allows for recording of dispute information through smart contract technology. It provides a ledger for the recording of any legal event through the use of Ethereum smart contracts. By containing records of legal procedures, Bitnation provides the same services as governments traditionally do. The Bitnation blockchain is called Governance 2.0 and provides people with an easy choice of government service that is an end-to-end solution instead of having multiple accounts from multiple service providers. This idea of governance is based on the code of holacracy, which comprises a small group of citizens making fast decisions.

2.2.3 BlockAuth

Enabling users to own and operate their identity registrar, BlockAuth creates a network of OpenID/OAuth login providers for identity verification. The verification is performed through combinations of machine learning algorithms and human validation [54], which permits outsourcing of the KYC process. In BlockAuth’s verification process, users can pick an identity registrar. Registrars ensure that each piece of information that the users have asserted about their identity is true [54].

BlockAuth aids in verification of identity attributes such as name, age, and citizenship. BlockAuth also allows for verification for multiple websites so users can easily access multiple services from just one verification platform. As a benefit, businesses do not have to conduct their own verification process.

2.2.4 Civic

Creating an Identity Protection Network for consumers to avoid identity theft, Civic builds trust. Working as a digital wallet, the Civic Secure Identity Platform (SIP) securely shares validated data. The identity requester sets the requirements for the verification process. Once provided by the user, data is verified against multiple public and social media records. A fraud detection algorithm combines multiple reputable sources, an internal decision engine, and manual auditing, so that Civic efficiently mitigates the risk of false identity verification. Allowing identity requesters to define the requirements for the format of identity attributes, Civic provides a range of capabilities such as multi-factor authentication for multiple identity requesters [54].

2.3 Consumer expectations

As mentioned in the previous chapter, consumers feel like they are not the sole owners of their own identity. This concern is further amplified by the fact that current mainstream identity management systems do not allow the user to track the usage of their identity attributes by businesses. Also, these businesses do not offer a decentralized identity. Identity attributes are scattered across several platforms controlled by different businesses. Thus, the creation of a false identity from acquired attributes has become quite easy. “Identity theft is a serious problem. Identity fraud and theft is a problem that over 15 million people in the USA alone encounter every year, resulting in annual financial losses up to $50 billion. [32]”

Due to scattered identity and identity theft, identity verification has also become troublesome. Some businesses require merely a driver’s license or any photo ID, while financial institutions such as banks require a rather detailed customer identity verification, also known as the KYC process. The verification process is hard to secure, as it has become easier to forge identity attributes as well as identity cards and similar documents. For example, the price of a US passport on the dark web starts at $938 [32]. Also, centralized identity may result in theft of data by hackers or an insider.

News about data breaches and theft of credit card numbers as well as other sensitive information has become frequent. Consumers, being aware of this situation, are concerned about the security of their identification data held by businesses. Although there are safety standards in place, consumers cannot track the usage of their identification data. Because consumers neither hold their own identity nor control how it is shared, there is a need for a solution that helps consumers be the owners of their own identities. In other words, consumers want to have a self-sovereign identity.

3. BLOCKCHAIN

In this chapter, we explore the Blockchain technology in detail. First, the history of the Blockchain, starting from its initial use in bitcoin, is presented. Second, blockchain cryptocurrencies that have gained prominence are presented. Third, details about the architecture of Blockchain are explained, along with the protocols involved in validation, through a description of Bitcoin’s blockchain network. Finally, applications of Blockchain not concerning identity management are explored.

3.1 History

• The first application of Blockchain was in the cryptocurrency bitcoin. The trustless transaction environment offered by the blockchain has helped it reach a market capitalization of $10-$15 billion. The value of a bitcoin has risen to over $6000 from a starting point of a few dollars [33].

• The second major innovation was the realization that the blockchain technology that powered bitcoin could be separated from the currency and applied in other fields and systems where parties interested in any transaction cannot fully trust all involved parties. Several financial as well as government institutions have started researching potential applications for blockchain. In 2017, approximately 15% of banks in the United States are believed to be using blockchain to various extents [33].

• The third innovation that revolutionized the application of blockchain was smart contracts, which allowed transactional logic to be programmed into a blockchain. This feature was introduced with a blockchain-based cryptocurrency called Ethereum. These pieces of transaction logic allowed different organizations such as banks to represent assets such as loans and investments and to perform transactions such as buying and selling, with transaction records being stored on the blockchain. To date, the Ethereum smart contract platform has reached over a billion dollars in value, with hundreds of new projects being introduced every month.

• The next concept was proof of stake. Blockchains until then were secured by “proof of work,” where consensus followed the majority and the group possessing the largest amount of computing power made the decision [33]. These groups were made up of vast data centers meant to provide security of transactions in exchange for cryptocurrency or bitcoin payments. Proof of work was replaced by proof of stake, which made use of deterministic algorithms. In this way of validating transactions, the currency was already created, and there was no block reward. There was no need to have thousands of machines working on a proof of work to mine the cryptocurrency and spending vast sums of money on electricity. Thus, it largely reduced the downward pressure on the value of the cryptocurrency [36].

• In a cryptocurrency blockchain, every computer participated in every transaction, so the process of validation was slow. Once the scaling of blockchain was introduced, the validation process became faster and more cost-effective. Scaling of blockchain introduced a mechanism to figure out the number of computers required to validate each transaction without compromising security. Scaled blockchains were fast, which enabled blockchain networks to compete with enterprises in several specialized industries such as payment processing and the internet of things.

3.2 Cryptocurrency

3.2.1 Bitcoin

Bitcoin can be interpreted as a network that runs on the blockchain. In other words, it is a cryptocurrency whose transactions are recorded on a public ledger. A distributed ledger is a database shared and synced across the network. Having no physical existence, bitcoin exists only on the internet. It can be traded and has a monetary value like any other good. It can also be used for payments. Bitcoin is a decentralized network and is not controlled by a single entity. Therefore, the cryptocurrency is not controlled by any government and does not require a bank account to hold it. The bitcoin ledger is maintained by a group of participants called miners [36]. These miners keep track of all the transactions that occur in the blockchain.

Every transaction is included in a block and added to the blockchain. Each block in the blockchain is linked to the previous one through cryptographic hashes. Changing the ledger requires a large amount of processing power. Also, there are thousands of nodes (devices used by miners and other users) running on the network, and any attempt to make a change is visible to every device on the network. Every user is effectively policing the ledger when a change is proposed. Thus, tampering with the bitcoin ledger is considered computationally impractical [36].

The bitcoin blockchain also prevents double spending by a user through the consensus algorithm. When two transactions attempt to spend the same output, at most one of them can be confirmed, and the network rejects the conflicting transaction. Transactions are similarly rejected when two or more blocks in the bitcoin blockchain receive input from the same bitcoin source [39]. More information about the blockchain and the hashes that play an integral part in the workings of bitcoin is presented in later sections.
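The double-spending check can be illustrated with a minimal Python sketch. It is not Bitcoin's implementation: the `OutPoint`, `Transaction`, and `UtxoSet` names are hypothetical, signatures and values are left out, and the only rule modeled is that an output can be consumed once. In this simplified model, whichever conflicting transaction is processed first is applied and the later one is rejected; how a real network resolves the conflict is governed by its consensus rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutPoint:
    """Reference to one output of an earlier transaction (hypothetical model)."""
    tx_id: str
    index: int


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    inputs: tuple   # OutPoints being spent
    n_outputs: int  # number of new outputs this transaction creates


class UtxoSet:
    """Tracks which transaction outputs are still unspent."""

    def __init__(self, initial):
        self.unspent = set(initial)

    def apply(self, tx):
        """Apply tx if all of its inputs are unspent; return True on success."""
        if len(set(tx.inputs)) != len(tx.inputs):
            return False  # the same output is referenced twice in one transaction
        if any(op not in self.unspent for op in tx.inputs):
            return False  # an input is unknown or has already been spent
        self.unspent.difference_update(tx.inputs)
        self.unspent.update(OutPoint(tx.tx_id, i) for i in range(tx.n_outputs))
        return True


coin = OutPoint("genesis", 0)
utxos = UtxoSet([coin])
pay_alice = Transaction("tx-a", (coin,), 1)
pay_bob = Transaction("tx-b", (coin,), 1)  # tries to spend the same coin again

first_ok = utxos.apply(pay_alice)   # accepted: the coin was unspent
second_ok = utxos.apply(pay_bob)    # rejected: the coin is no longer unspent
```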

3.2.2 Ethereum

Although it is a public ledger like bitcoin, Ethereum differs from bitcoin in key technical aspects. While being a currency, Ethereum is also a blockchain-based open-source platform that allows for the building of decentralized applications. While the bitcoin blockchain enables an electronic cash and payment system, the Ethereum blockchain allows code to be programmed to run in a decentralized environment. Ether is the currency of the Ethereum blockchain and is used for paying transaction fees and for services on the blockchain [41].

The principal feature of the Ethereum blockchain is the ability to create a smart contract. A smart contract is code that runs on top of the blockchain and executes when the required conditions are met. Using smart contracts, a developer can write programs to facilitate the transfer of money, digital issuing of contracts, exchange of assets, and several other functions. Executing on a blockchain, smart contract-based applications can take advantage of the immutability and security that the blockchain provides. These applications are protected from fraud, censorship, or any interference from unauthorized parties [41]. A smart contract enables programming of operations to accomplish a wide range of tasks, in comparison to the limited set of operations that a blockchain provides in its basic form.
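The idea of code that "executes when the required conditions are met" can be sketched in plain Python. This is only an illustration of the control flow, not EVM code: the `EscrowContract` class, its methods, and the `balances` dictionary are all hypothetical, and a real contract would be deployed on the blockchain rather than run as a local object.

```python
class EscrowContract:
    """Toy escrow: funds are released only once delivery is confirmed."""

    def __init__(self, buyer, seller, amount):
        self.buyer = buyer
        self.seller = seller
        self.amount = amount
        self.delivered = False
        self.paid_out = False

    def confirm_delivery(self, caller):
        # Only the buyer may confirm that the goods arrived.
        if caller != self.buyer:
            raise PermissionError("only the buyer can confirm delivery")
        self.delivered = True

    def release(self, balances):
        """Move the escrowed amount to the seller if the condition is met."""
        if not self.delivered or self.paid_out:
            return False
        balances[self.seller] = balances.get(self.seller, 0) + self.amount
        self.paid_out = True
        return True


balances = {"seller": 0}
contract = EscrowContract("buyer", "seller", 10)
released_early = contract.release(balances)  # condition not met yet
contract.confirm_delivery("buyer")
released_late = contract.release(balances)   # condition met, funds move once
```

The second call to `release` succeeds only because the buyer confirmed delivery first, and the `paid_out` flag keeps the payment from being repeated.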

Ethereum’s core feature is the Ethereum Virtual Machine (EVM). The EVM allows for the execution of any program, regardless of the programming language it was written in. The EVM is Turing-complete software that runs on the Ethereum network and streamlines the process of building several different applications on the same blockchain platform [41].

3.3 Blockchain components

As a platform, blockchain is an application that functions as a transactional database running on a network of servers (or any computational machines) as a shared ledger. The metadata for the ledger is stored using databases such as Google’s LevelDB [45]. The data is synced across all nodes in the application, and each transaction has all validator nodes participating in it. The components that are part of this process are described below.

3.3.1 Nodes

A node is simply a computing unit, which could be a server or just a regular computer, that runs the blockchain application. Since there is no hierarchy in the topology of the blockchain, each node consumes services equally via consensus rules [45]. Every node keeps a copy of the ledger with ownership and payment information, and each transaction is confirmed through strong consensus. On boot, each node begins a network discovery to find and connect with peer nodes. This discovery process, which involves messaging and handshake logic, allows for abstraction of blockchain management through a higher degree of schematics.

3.3.2 Blocks

Blocks are data structures that hold the transaction information. Since the ledger is distributed (shared), a copy of each block is stored with each node. Each block contains a block header. This header contains the metadata identifying the block which is also helpful in verifying the validity of the block. Following are the attributes contained by the block header:

• Version: the version of the block.

• Previous block header hash: a hash that references the header of the previous block.

• Merkle root hash: the root hash of the Merkle tree built from all transactions packaged in the block.

• Time: the time at which the block was created.

• nBits: the current level of difficulty that went into the creation of the block.

• Nonce: an arbitrary value that the creator of the block can set any way they please; miners vary it when searching for a valid block hash.
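The header fields listed above can be sketched as a small Python data structure. This is a simplified illustration and not Bitcoin's wire format: the `BlockHeader` class and the `merkle_root` helper are hypothetical names, and the header is serialized as JSON for readability, whereas Bitcoin serializes it as 80 bytes before applying SHA-256 twice.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BlockHeader:
    version: int
    prev_header_hash: str
    merkle_root: str
    time: int
    n_bits: int
    nonce: int

    def header_hash(self) -> str:
        # Assumption: JSON serialization, chosen only for readability.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


def merkle_root(tx_ids):
    """Pairwise-hash transaction ids up to a single root (simplified)."""
    if not tx_ids:
        raise ValueError("a block needs at least one transaction")
    level = [hashlib.sha256(t.encode()).digest() for t in tx_ids]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


genesis = BlockHeader(1, "00" * 32, merkle_root(["tx0"]), 0, 0x1D00FFFF, 0)
child = BlockHeader(1, genesis.header_hash(),
                    merkle_root(["tx1", "tx2", "tx3"]), 600, 0x1D00FFFF, 0)
```

Because `child` stores the hash of `genesis`, changing any field of `genesis` would change that hash and break the link between the two headers.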

It is noteworthy that a user of the blockchain can try to submit transactions any way they please, but the consensus rules make sure that only valid transactions get accepted and inserted into the blockchain.

Following is the description of different kinds of the block that exist in the blockchain [42]:

• The majority of blocks are added to the main blockchain. These are called “main branch blocks.”

• Some blocks reference a parent block that is part of another blockchain. These blocks are called “side chain blocks.”

• Some blocks refer to a parent block that is not known to the node processing the block. These are called “orphan blocks.”

3.3.3 Transaction

Transactions are requests to input data into the blockchain. One or more transactions can be packaged into one block. Transactions consist of fields that specify all the parties involved and other information about that transaction. In a cryptocurrency blockchain, these fields would be the sender’s address, receiver’s address and the amount transferred. Thus, the transactions move the value of cryptocurrency from one address to another.

Each transaction changes the agreed-upon state of the blockchain [41]. Since each node holds its own copy of the shared ledger, the current state of the blockchain is calculated by processing each transaction in linear order. Each node independently verifies each transaction. Transactions are packaged and distributed to each node in the form of a block, and this movement of blocks represents the flow of data within the blockchain.

Each transaction can have multiple inputs and multiple outputs. The input to each transaction is the reference to the output of the previous transaction, and the output specifies a value and an address. In bitcoin, input data object also contains a cryptographic signature called ScriptSig which proves that the creator of this transaction has enough permissions to create it [41]. ScriptSig does it by referencing an address inside the referenced transaction’s output. This address proves that the creator of this transaction was the owner of the referenced transaction’s output.
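The input and output structure described above can be sketched as follows. This is a deliberately reduced model under stated assumptions: an address is taken to be the SHA-256 hash of a public key, the `public_key` field stands in for the contents of ScriptSig, and the actual signature check is omitted. All class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


def address_of(public_key: bytes) -> str:
    # Assumption: an address is simply the SHA-256 hash of the public key.
    return hashlib.sha256(public_key).hexdigest()


@dataclass(frozen=True)
class TxOutput:
    value: int
    address: str


@dataclass(frozen=True)
class TxInput:
    prev_tx_id: str
    prev_index: int
    public_key: bytes  # stands in for the ScriptSig contents


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    inputs: tuple
    outputs: tuple


def is_valid(tx, known_txs):
    """Check ownership of every referenced output and that no value is created."""
    total_in = 0
    for tx_in in tx.inputs:
        prev = known_txs.get(tx_in.prev_tx_id)
        if prev is None or not 0 <= tx_in.prev_index < len(prev.outputs):
            return False  # reference to an unknown output
        spent = prev.outputs[tx_in.prev_index]
        if address_of(tx_in.public_key) != spent.address:
            return False  # the spender does not own the referenced output
        total_in += spent.value
    return total_in >= sum(out.value for out in tx.outputs)


alice_key, bob_key = b"alice-public-key", b"bob-public-key"
funding = Transaction("tx0", (), (TxOutput(50, address_of(alice_key)),))
known = {"tx0": funding}

# Alice spends her 50 units: 30 to Bob and 20 back to herself.
spend = Transaction(
    "tx1",
    (TxInput("tx0", 0, alice_key),),
    (TxOutput(30, address_of(bob_key)), TxOutput(20, address_of(alice_key))),
)
# Bob tries to spend an output that was addressed to Alice.
theft = Transaction(
    "tx2",
    (TxInput("tx0", 0, bob_key),),
    (TxOutput(50, address_of(bob_key)),),
)
```

Here `is_valid(spend, known)` holds because the key in the input hashes to the address stored in the referenced output, while `is_valid(theft, known)` fails the ownership check.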

3.3.4 Hash Functions

Hash functions play a vital role in proof of work, which is a critical concept in the bitcoin network. A hash function creates a fingerprint (a smaller version) of a piece of data that is of a designated length and can be used to reference the data. Hash functions are deterministic: a given input always maps to the same output. Because the output has a fixed length, two different inputs can in principle produce the same hash, but for a secure hash function finding such a pair is computationally infeasible. Also, the hash cannot be reverse engineered to recover the larger piece of data. Bitcoin uses SHA-256 for proof of work. Editing even a single bit in the block header changes the value of the hash. This property allows a network of computers to search for a valid hash by changing the nonce in the candidate block’s header until a valid hash is found. Once it is found, the miner node broadcasts the hash to other nodes, and the other nodes can easily verify its validity by executing the same hash function. Once the nodes have agreed on the validity of the hash, the block is added to the blockchain.
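The fingerprint properties described above can be observed directly with Python's standard `hashlib` module. The helper names `fingerprint` and `differing_bits` are introduced here only for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


original = b"block header contents"
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # flip a single bit

h1 = fingerprint(original)
h2 = fingerprint(flipped)
changed = differing_bits(h1, h2)  # number of output bits that changed
```

Hashing the same input again always returns `h1`, the digest length stays fixed regardless of the input size, and the one-bit change in the input yields a different digest `h2`.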

3.3.5 Consensus

Consensus is a group decision-making process in which group members create and agree to support a decision in the best interest of the whole system. Consensus algorithms are meant to achieve as much agreement as possible through collaboration, inclusiveness, and egalitarianism.

Proof-of-Work (PoW): Satoshi Nakamoto, Bitcoin’s creator, introduced proof of work as a blockchain consensus protocol [45]. The first consensus algorithm to be used in a blockchain, it is largely responsible for the massive-scale mining operations that consume substantial quantities of power. Each time a block is created, its block header contains a nonce value. Each miner node attempts to solve a cryptographic puzzle by changing the value of the nonce field in the current block’s header and hashing the header with the new nonce value. If the hash meets a specific condition, the node sends the hash out to other nodes to be validated. Since even the slightest change to the input of a hash function yields a different hash, the only way to find a valid hash is to try many nonce values. Each node competes to come up with the valid hash first. Once the valid hash is found and the majority of nodes agree on its validity, the block is written to the blockchain. This process is extremely inefficient due to the sheer amount of power required in the creation of new blocks. Also, large corporations have greater access to computing power and have a better chance of mining, preventing bitcoin from being truly decentralized.
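The nonce search can be sketched in a few lines of Python. This is a simplified model rather than Bitcoin's mining code: the header is a plain string, a single SHA-256 is used, and the "specific condition" is assumed to be that the hash, read as an integer, falls below a target determined by a hypothetical `difficulty_bits` parameter.

```python
import hashlib


def header_hash(header: str, nonce: int) -> int:
    """Hash the header together with a nonce and return it as an integer."""
    digest = hashlib.sha256(f"{header}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def mine(header: str, difficulty_bits: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    for nonce in range(max_nonce):
        if header_hash(header, nonce) < target:
            return nonce
    return None  # no valid nonce found within the search range


def verify(header: str, nonce: int, difficulty_bits: int) -> bool:
    """Verification needs a single hash, however long the search took."""
    return header_hash(header, nonce) < (1 << (256 - difficulty_bits))


header = "prev-hash|merkle-root|time"
nonce = mine(header, difficulty_bits=12)  # about 4096 attempts on average
```

The asymmetry is the point of the scheme: `mine` may need thousands of hash evaluations, while any other node can confirm the result with one call to `verify`. Each additional difficulty bit doubles the expected search effort.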

POW is slowly being replaced and is becoming a legacy protocol, largely because several more efficient alternatives exist. For example, Ethereum is migrating from POW to the more energy-efficient proof of stake (POS).

Proof-of-Stake (POS): POS makes the process of creating new blocks virtual, and the miners turn into validators [45]. POS replaces the work required to create another block with a stake. Each validator node locks up some of its coins as a stake. Once validators find a block that could be the next block, they place a bet on it. If that block is appended to the blockchain, the validator gets a reward proportionate to the bet. The probability of being chosen increases with the size of the stake locked up.

This consensus protocol raises the problem of “Nothing At Stake.” Since no computation power is used to reach consensus, a validator could put a stake on all competing blocks, resulting in an excessive number of forks from the main blockchain which, in turn, could harm the currency. The first cryptocurrency to implement POS was PeerCoin, followed by BlackCoin and NXT.
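Stake-weighted selection can be sketched with Python's standard `random` module. This is a minimal illustration and not the selection rule of any particular POS blockchain: the `choose_validator` function and the validator names are hypothetical, and a seeded pseudo-random generator stands in for whatever source of randomness a real protocol uses.

```python
import random
from collections import Counter


def choose_validator(stakes, rng):
    """Pick a validator with probability proportional to its locked stake."""
    validators = sorted(stakes)
    weights = [stakes[v] for v in validators]
    return rng.choices(validators, weights=weights, k=1)[0]


rng = random.Random(7)  # fixed seed so the run is repeatable
stakes = {"alice": 60, "bob": 30, "carol": 10}

# Over many rounds, selection frequency tracks the share of total stake.
counts = Counter(choose_validator(stakes, rng) for _ in range(10_000))
```

With these stakes, `alice` is expected to be chosen in roughly 60% of the rounds, `bob` in roughly 30%, and `carol` in roughly 10%.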

Proof of elapsed time: A protocol created by Intel, this protocol works similarly to proof of work without consuming a large amount of energy [46]. It does so through random generation of blocks, where the algorithm uses a trusted environment such as SGX. This proof works on the guaranteed wait time provided by SGX, and it can scale to thousands of nodes through processors that support SGX. One shortcoming of this protocol is the required trust in Intel as the provider of the consensus mechanism for the blockchain. This assumption of trust goes against the fundamental principle of public blockchains that there should be no trusted third party needed to achieve consensus.
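The wait-time lottery can be simulated in Python under a strong simplification: here an ordinary pseudo-random generator draws each node's wait time, whereas the protocol described above relies on the trusted environment (SGX) to generate the wait and guarantee that it was honored. The `elect_leader` function, the exponential distribution, and the `mean_wait` parameter are assumptions made for this sketch.

```python
import random


def elect_leader(node_ids, rng, mean_wait=10.0):
    """Each node draws a random wait time; the shortest wait wins the round."""
    # Assumption: waits are drawn from an exponential distribution.
    waits = {node: rng.expovariate(1.0 / mean_wait) for node in node_ids}
    leader = min(waits, key=waits.get)
    return leader, waits


rng = random.Random(42)  # fixed seed so the run is repeatable
nodes = ["node-a", "node-b", "node-c", "node-d"]
leader, waits = elect_leader(nodes, rng)
```

No node spends energy on hashing; each one simply waits, and the node whose timer expires first proposes the next block.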

3.4 Fields of applications

Blockchain provides an environment where the parties involved do not have to trust each other, so in addition to identity management, blockchain can be applied to several other industries where such an environment is useful. Some of these solutions are described below [49].

• Insurance claims processing: Claim adjusters have to go through several hurdles to avoid fraudulent claims because of fragmented data sources and abandoned policies. Also, these claims have to be processed manually. Thanks to its shared ledger architecture, blockchain can eliminate errors and help detect fraudulent claims. For example, an insurer can independently authenticate the customer, policies, and claims through comprehensive historical records. Blockchain will also enable the establishment of a customer claims model with a higher degree of trust by complementing big data and other technologies. For claims prevention, this new data stream can improve the risk selection process by combining factors such as location, external risk, and other analytics [48]. Also, when seeking reinsurance, companies can gain greater visibility into risk exposure. When offloading risks to reinsurers, companies can balance capital exposure to specific risks by easily finding out if the reinsurer is offloading risk to a subsidiary of another reinsurer [48]. In addition to claims prevention, blockchain can also provide a secure platform for claims payment [48].

• Supply chain: Consumers can have greater visibility into the origin of any product through a blockchain-backed supply chain [49]. End consumers can find out if the goods they are buying are authentic. Businesses such as jewelry shops can verify that a diamond has come from an authorized miner. A good example of a blockchain-backed supply chain is the origin and transportation of medical marijuana. Through sensors, government authorities as well as relying parties can track the consignments. People who are prescribed medical marijuana can verify that the source entity is not affiliated with any criminal entity. Similarly, sensors can be used along with blockchain to create a traceability chain for a seafood supply chain. Once a fish is caught, a temperature sensor is attached along with a GPS tracker. The temperature sensor sends temperature data about the fish periodically at short intervals. When the fish reaches a retail store or a restaurant, the quality of the fish can be proven by verifying that it was stored in an environment with the right temperature, along with its origin and transportation.

• Healthcare: Blockchain can provide people with privacy for their medical history. Records can be encoded and stored in a blockchain. The patient can use a private key to grant access to only specific individuals or businesses. Healthcare insurance claims can also be processed in a way that avoids fraudulent claims [49]. Blockchain could also make it easier to ensure that research is conducted in compliance with HIPAA laws. Receipts of surgery could be used as proof of delivery and automatically sent to the insurance provider. Finally, blockchain could provide easier management of drug supervision, prescriptions, regulatory compliance, test results, and healthcare supplies [49].

• Unconventional money lenders/hard money lending: Several apps already exist that allow for the transfer of cash. However, none exist for cash loans. Smart contracts can revolutionize the traditional lending system [49]. People with poor credit can easily borrow money from cash loan lenders. Also, several people borrow money against collateral such as cars and end up losing these properties after failing to pay back. Blockchain can allow for mechanisms where a stranger can lend money and an individual can put up their smart property as collateral through a smart contract [49]. Blockchain can remove the need for manual processing and any need to show work or credit history.

• Cross-Border Payments: Money moving across international lines needs to be tracked, and its use must be verified to ensure that it is not being used for criminal intent. Existing systems are susceptible to errors as well as hackers. Blockchain can provide a robust and secure channel for the transfer of funds across international borders. Companies such as Abra and Bitspark are at the forefront of cross-border platform solutions [3]. Blockchain has been integrated with payment systems by multiple financial institutions such as Santander and Align Commerce [49]. These integrations have resulted in the facilitation of cross-border payment services that can run twenty-four hours a day.

• Smart binding contract: Smart contracts are not yet legally enforceable and therefore cannot be enforced like current binding contracts. If legalized, smart contracts will open doors to a new era of contractual procedures. For example, if an individual seeks to rent an apartment, they can sign smart contracts for the application. Once all the background checks are completed, they can sign the lease for the apartment through a smart contract. The smart contract can also help govern the payment of rent and the fixing of broken appliances or windows by the property owner. In addition to leasing an apartment, smart contracts can also automate contractual procedures in several other fields that require a trustless environment.

• Intellectual property patenting: Ideas are intangible assets. Transfer of ideas can be challenging to track, and rewarding the rightful owners of the ideas can involve several legal hurdles. Patent offices can effectively regulate the intellectual property created by major organizations as well as by an individual with a unique idea and a product. However, there is no office to regulate intellectual property that is purely intangible and does not have significant utilities. Intellectual property can be anything from a song written and composed by a struggling artist to a simple yet creative idea that an entrepreneur seeking to establish an enterprise needs to reveal to potential investors. Blockchain can prevent theft of these intellectual properties by verifying that these individuals are the rightful owners.

4. SOLUTION

In this chapter, design for a new solution is presented.

After discussing the necessities for the solution presented here, this paper presents the features and use cases that this solution accomplishes. The section that follows presents the design. The design is first described as a high-level architecture. After the high-level architecture, the design is further explained, with every level of the workflow discussed in detail and described with activity diagrams.

After the design, a section explains the impact this solution will have on all actors involved in this scenario. Subsequent sections explore additional future use cases for the solution.

4.1 Necessities

The introduction chapter discusses the current shortcomings of the existing digital identity systems. In this section, I will explore these shortcomings further.

4.1.1 Shortcomings

Several entities, government as well as commercial, have tried to identify people by creating a digital identity for them and bringing them into the digital world. In this process, the identity management systems that have been created are centralized and do not allow the user (the individual that the digital identity belongs to) to own their own digital identity. Several kinds of digital identity systems have been created for authentication and access management. These efforts result in an identity that is anything but sovereign. These identities are owned by these entities and shared according to their business needs without any consent of the identity holder.

Thanks to the emergence of technologies such as blockchain, the idea of a truly self-sovereign identity is gaining momentum. Self-sovereign identity is simply the concept that we are all creators of our own identities and should have ownership of them [1]. However, implementation of such a concept without any oversight and regulation leads to individuals asserting claims about their own identities and thus results in fraudulent transactions. This is the issue that this paper aims to address. The solution presented here describes a design for a self-sovereign identity sharing system that preserves the user’s privacy as well as tracks the usage of the user’s identity attributes.

4.2. High-level Architecture

This section presents the workflow of the solution. The actors involved in this workflow are described below.

4.2.1 Actors

User: The individual who holds the identity. This identity may cover any number of identifying attributes, from a simple first and last name to confidential information including a social security number. The user can be anyone who wishes to use the self-sovereign identity preserving system.

Identity Guarantor: This term is defined here as the entity that will attest to the user’s claims concerning their identifying attributes. The identity guarantor can be a government or any commercial entity. The type of identity guarantor depends on the type of identity attribute that is being claimed. If the attribute is a social security number or a date of birth, the guarantor will be a government entity. If the attribute concerns a registration token for commercial activity, the guarantor will be the entity that issued the token.

Relying parties: This entity is the same as the entity with the same name introduced and discussed in previous chapters.

The figure below describes the high-level architecture flow-chart of the solution. Detailed activity diagrams and use case diagrams are displayed in the following sections. The solution has at its core the process that was used in the past to verify someone’s identity or to get background information on people. An individual’s reputation was dependent on how others viewed them. The identity of a person was only as valid as the opinion that others held about them. This solution uses this principle to create a system where the user can have their identity endorsed and guaranteed by relevant parties. These guarantees can be used to establish trust between relying parties and the consumers.

Figure 4.1 High-level application architecture

4.2.2 Use cases

This section presents the use case diagram for the solution described in this paper.

Figure 4.2 Use case diagram

4.3 User

This section presents the details about the high-level architecture and the use case diagram presented in previous sections. The details are presented in the form of activity diagrams. Each box is highlighted with a dotted line or different colors to identify the entity or application that will be performing that activity. The legend for the actors and processes is as follows:

• General Blockchain network

• Relying party

• Blockchain application/User’s blockchain, Attributes Event Blockchain (AEB)

• User

• Identity Guarantor

• Package in a cryptographic key

4.3.1 Attribute creation

The user will have to submit their identification information to an entity (the identity guarantor) that will ensure that no one forges a non-existent identity made up of parts of other people’s information. This identity information includes not just basic information such as a combination of first name, last name, and date of birth, but also more private information such as social security number, driver’s license number, and even biometrics. The user may choose to include only the information they want to be certified, from any level of confidentiality. The identity guarantor will certify the validity of the identity attributes submitted by the user.

In this solution, a portion of this dependence on the identity guarantor is replaced with a blockchain that contains all the users that exist in the blockchain application. Each node in this blockchain represents a unique user. Each time a user inserts a new identity attribute in their account, all attributes belonging to the user will be checked by all the existing nodes in the blockchain. Each node will check the user’s identity attributes against the attributes of the user account that the node represents. Since a unique user account represents each node, any identity containing forged confidential and sensitive identity attributes can be efficiently detected. For less-sensitive identity attributes such as first and last names, eye color, and hair color, this check will be less useful, as these attributes are not unique to a specific user. However, for sensitive information such as biometric data, social security number, and passport number, this check will prove useful in detecting forged unique identity attributes. This blockchain will be referred to as the User-Network Blockchain (UNB). UNB is the first of the two blockchains that make up the blockchain solution application presented in this paper.
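The uniqueness check performed by the UNB nodes can be sketched as follows. This is an illustration under stated assumptions and not a specification of the solution: the `UserNode` class, the `check_new_attributes` function, and the `SENSITIVE` attribute set are hypothetical, and comparing hashed values instead of raw values is a choice made for this sketch so that nodes do not have to exchange the sensitive data itself.

```python
import hashlib

# Assumption: which attribute names are treated as unique to one person.
SENSITIVE = {"ssn", "passport_number", "biometric_hash"}


def digest(name, value):
    """Hash an attribute so nodes can compare it without seeing the raw value."""
    return hashlib.sha256(f"{name}={value}".encode()).hexdigest()


class UserNode:
    """One UNB node, representing exactly one user account."""

    def __init__(self, user_id, attributes):
        self.user_id = user_id
        self.digests = {digest(k, v) for k, v in attributes.items()
                        if k in SENSITIVE}

    def conflicts_with(self, candidate_digests):
        return bool(self.digests & candidate_digests)


def check_new_attributes(user_id, attributes, network):
    """Return ids of other nodes that already hold one of the sensitive values."""
    candidate = {digest(k, v) for k, v in attributes.items() if k in SENSITIVE}
    return [node.user_id for node in network
            if node.user_id != user_id and node.conflicts_with(candidate)]


network = [
    UserNode("alice", {"first_name": "Alice", "ssn": "123-45-6789"}),
    UserNode("bob", {"first_name": "Alice", "ssn": "987-65-4321"}),
]

# A new account reuses Alice's social security number: the check flags it.
forged = check_new_attributes(
    "mallory", {"first_name": "Mallory", "ssn": "123-45-6789"}, network)
# Sharing only a first name with existing users raises no conflict.
honest = check_new_attributes(
    "carol", {"first_name": "Alice", "ssn": "555-00-1111"}, network)
```

As the text notes, the check is only meaningful for attributes that are unique to a person, which is why the non-sensitive `first_name` is ignored.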

The second check is done by the identity guarantor, which can be any relevant party. Since the user can insert any identity attribute in the application, the identity guarantor has to be an organization that can verify the provided attributes. This check makes sure that the identity attributes provided by the user are genuine and belong to the individual’s real identity. Every time the identity guarantor validates an identity attribute, it returns a package signed with its digital signature. This digital signature lets the relying party know that the attributes have been validated and the user’s claims have been verified.
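The guarantor's signing step can be sketched as below. The signature scheme is not specified in this solution, so the example assumes textbook RSA over a SHA-256 digest, built from two well-known Mersenne primes so that it runs with the standard library alone. The key is deliberately a toy and is insecure; `make_toy_key`, `certify_attribute`, and the claim layout are hypothetical names used only for illustration.

```python
import hashlib
import json

E = 65537  # public exponent (assumed)


def make_toy_key(p, q):
    """Return (n, d): public modulus and private exponent.

    INSECURE: the primes are public knowledge and no padding is used.
    """
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(payload, n):
    """Map a JSON-serializable payload to an integer smaller than n."""
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(payload, n, d):
    return pow(digest(payload, n), d, n)


def verify(payload, signature, n):
    return pow(signature, E, n) == digest(payload, n)


# Toy guarantor key pair; a real guarantor would use a vetted crypto library.
GUARANTOR_N, GUARANTOR_D = make_toy_key(2**127 - 1, 2**89 - 1)


def certify_attribute(name, value, guarantor_id):
    """Return the attribute together with the guarantor's signature over it."""
    claim = {"name": name, "value": value, "guarantor": guarantor_id}
    return {"claim": claim, "signature": sign(claim, GUARANTOR_N, GUARANTOR_D)}


signed = certify_attribute("DateOfBirth", "1990-01-01", "example guarantor")
print(verify(signed["claim"], signed["signature"], GUARANTOR_N))  # True

# Changing the value after certification invalidates the signature.
forged = dict(signed["claim"], value="1985-01-01")
print(verify(forged, signed["signature"], GUARANTOR_N))  # False
```

Anyone holding the guarantor's public modulus can run `verify`, which is what lets a relying party check an attribute without contacting the guarantor again.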

Both of these checks are combined to ensure that no identity can be forged from duplicated attributes. Thus, the possibility of fraud through identities built from fake or forged attributes is significantly reduced, providing a secure environment for users to share their identity and for relying parties to import identity attributes of customers. Once the identity guarantor returns the signed attributes, they are stored on the user’s mobile device or any other device of choice.

The activity diagram below describes the process of the user creating or updating identity attributes and the identity guarantor verifying them. Here, we take the example of Attribute-based Credentials (AbC). Consumers only share the attributes that are needed, and only when they are needed.

Figure 4.3 Attribute creation

4.3.2 User granting access license

The user has to grant a business a license to use their identity attributes through smart contracts that run on top of the user’s UNB. The blockchain that records all events occurring on the user’s identity attributes is denoted the Attribute Event Blockchain (AEB). Each event recorded in the AEB falls into one of the following three categories:

1. Granting of license to access the information.

2. Revocation of license to access any or every level of identity attributes.

3. Modification of license to change access to any or every level of identity attributes.

Temporary token: The user can generate a temporary token that can be used to identify the user’s account and device. These are soft tokens of the kind usually used in two-factor authentication. Authentication factors are based on something a user has, something a user knows, or something the user is. The primary delivery methods for two-factor authentication are SMS, phone call, email, hardware token, and software token. For this solution, a software token integrated with the blockchain application is used. The user can generate a token code any time they want to share their identity attributes with a business or government entity. This code can be used to identify the user’s account. The code is temporary and expires after a certain amount of time.
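One common way to produce such an expiring software token is a time-based one-time code in the style of RFC 6238 (TOTP). The token construction is not specified in this solution, so the sketch below is an assumption: the function names, the 30-second validity window, and the 6-digit length are illustrative choices.

```python
import hashlib
import hmac
import struct
import time


def token_code(secret, at_time=None, step=30, digits=6):
    """Derive a short numeric code from a shared secret and the current time window."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step  # index of the current validity window
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as defined in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


def token_is_valid(secret, code, at_time=None, step=30):
    """Accept the code only inside the window in which it was generated."""
    return hmac.compare_digest(token_code(secret, at_time, step), code)


SECRET = b"12345678901234567890"  # RFC test secret, used here only for illustration

code = token_code(SECRET, at_time=59)
print(code)  # 287082 (RFC 6238 SHA-1 test vector, truncated to 6 digits)
print(token_is_valid(SECRET, code, at_time=59))  # True: same 30-second window
print(token_is_valid(SECRET, code, at_time=60))  # False: the window has rolled over
```

Because the code depends on the time window, a relying party that receives it after the window closes can no longer use it to locate the user's account.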

The activity diagram below describes the process of relying parties requesting access to attributes and the user approving the request and issuing an access license to the relying party.

Figure 4.5 User granting access license

License Registry: The license registry is simply a data table that contains the terms of the contract and the public keys of businesses. It is implemented as a non-relational database, and its data files are secured by Transparent Data Encryption (TDE). Transparent data encryption encrypts sensitive table data stored in data files [51]. TDE prevents unauthorized decryption by storing the encryption keys in external security modules. For each user, the data from tables is decrypted separately. There is no need to create triggers or views that decrypt data, and there is no need to modify the application to handle encrypted data. TDE is a key-based access control which ensures that even if the data has been stolen, its content cannot be understood until decryption is performed. Thus, TDE ensures that the terms of the contract cannot be altered without authorization from the user. The license registry is stored only on the user’s device of choice.

For the current functionality, each license registry entry has the following format:

{
  "entity": "example entity",
  "contractStart": "2019-06-25T00:00:00.000Z",
  "contractEnd": "2021-06-25T00:00:00.000Z",
  "Attributes": [
    "First name",
    "Last name",
    "Social security number",
    "DateOfBirth"
  ]
}

4.3.3 Tracking usage by relying party

The core of the system described here is the oversight that the user has over when their information is accessed. The user will be able to query their AEB for any access to their identity attributes by any business entity, and to see and keep a record of every such instance of access.

Attribute Event Blockchain: The user and the relying party will both have a copy of a blockchain that contains all events concerning access to the user’s identity attributes. The user’s copy will only hold the events concerning their own identity attributes, while the relying party’s copy will hold the access events for all users’ identity attributes. This is to ensure that the methodology laid out in this solution does not defeat the purpose of security that blockchain offers. Thus, the AEB can only be altered when both the relying party and the user, or the user-granted license, validate the event.
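The tamper-evidence that the AEB relies on can be illustrated with a hash-linked event log. The sketch below is an assumption-laden simplification: it omits consensus and the two-party validation described above, and the `AttributeEventChain` class, its field names, and the extra `access` event kind (alongside grant, revoke, and modify) are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64
EVENT_KINDS = {"grant", "revoke", "modify", "access"}


def block_hash(body):
    """Hash the canonical JSON form of a block body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AttributeEventChain:
    """Append-only log in which every block commits to the previous block's hash."""

    def __init__(self):
        self.blocks = []

    def append(self, kind, user, relying_party, attributes, timestamp):
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind}")
        body = {
            "kind": kind,
            "user": user,
            "relying_party": relying_party,
            "attributes": sorted(attributes),
            "timestamp": timestamp,
            "prev_hash": self.blocks[-1]["hash"] if self.blocks else GENESIS_HASH,
        }
        self.blocks.append({**body, "hash": block_hash(body)})

    def is_valid(self):
        """Recompute every hash and check that the links are unbroken."""
        prev = GENESIS_HASH
        for block in self.blocks:
            body = {k: v for k, v in block.items() if k != "hash"}
            if block["prev_hash"] != prev or block["hash"] != block_hash(body):
                return False
            prev = block["hash"]
        return True

    def events_for(self, user):
        """The view a single user would query: only events on their own attributes."""
        return [b for b in self.blocks if b["user"] == user]


chain = AttributeEventChain()
chain.append("grant", "alice", "example entity", ["First name"], "2019-06-25T00:00:00Z")
chain.append("access", "alice", "example entity", ["First name"], "2019-07-01T09:30:00Z")
print(chain.is_valid())  # True

# Rewriting a recorded event breaks the hash of that block, so the edit is detectable.
chain.blocks[0]["attributes"] = ["First name", "Social security number"]
print(chain.is_valid())  # False
```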

Business signature verification: Each request from a business carries the business’s digital signature. It will be verified with the public key stored in the license registry.

Attribute request validation: A request for attributes will simply contain the attributes that the relying party needs to access. This request will be checked against the terms of access described in the license registry. This validation will verify that the relying party does not get access to any more attributes than it has been approved for, and that access is not granted for any time longer than approved by the user.
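The two checks described above (attribute scope and time window) can be sketched against a registry entry in the format shown in Section 4.3.2. The function names and the returned `(allowed, reason)` tuple are assumptions made for this illustration.

```python
import json
from datetime import datetime, timezone

# Registry entry in the format given in Section 4.3.2.
LICENSE_ENTRY = json.loads("""
{
  "entity": "example entity",
  "contractStart": "2019-06-25T00:00:00.000Z",
  "contractEnd": "2021-06-25T00:00:00.000Z",
  "Attributes": ["First name", "Last name", "Social security number", "DateOfBirth"]
}
""")


def parse_timestamp(text):
    """Parse the UTC timestamps used in the registry entry."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def validate_request(entry, requested_attributes, now):
    """Return (allowed, reason) for a request made at time `now`."""
    start = parse_timestamp(entry["contractStart"])
    end = parse_timestamp(entry["contractEnd"])
    if not (start <= now < end):
        return False, "license is not active at the requested time"
    unlicensed = sorted(set(requested_attributes) - set(entry["Attributes"]))
    if unlicensed:
        return False, "attributes not covered by license: " + ", ".join(unlicensed)
    return True, "ok"


during = datetime(2020, 1, 1, tzinfo=timezone.utc)
after = datetime(2021, 7, 1, tzinfo=timezone.utc)

print(validate_request(LICENSE_ENTRY, ["First name", "DateOfBirth"], during))
print(validate_request(LICENSE_ENTRY, ["First name", "Passport number"], during))
print(validate_request(LICENSE_ENTRY, ["First name"], after))
```

The first request is allowed; the second is rejected because it asks for an attribute outside the license, and the third because the contract period has ended.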

Attribute claims package: The claims package contains all the required signed attributes. The signature on each attribute proves that an identity guarantor has validated it. The signature on the claims package proves that the package originated from the right user. Together, these signatures let the relying party verify the authenticity of the attributes and of the package as a whole.
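The two layers of signatures on a claims package can be sketched as follows. As the signature scheme is not fixed in this solution, the example assumes textbook RSA over SHA-256 with toy keys derived from known Mersenne primes; it is insecure and only meant to show the structure. All function names, the `wallet` dictionary, and the package layout are hypothetical.

```python
import copy
import hashlib
import json

E = 65537  # public exponent shared by both toy keys (assumed)


def make_toy_key(p, q):
    """Return (n, d). INSECURE: known primes, no padding; illustration only."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def digest(payload, n):
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(payload, n, d):
    return pow(digest(payload, n), d, n)


def verify(payload, signature, n):
    return pow(signature, E, n) == digest(payload, n)


GUARANTOR_N, GUARANTOR_D = make_toy_key(2**127 - 1, 2**89 - 1)
USER_N, USER_D = make_toy_key(2**107 - 1, 2**61 - 1)


def certify(name, value):
    """Inner layer: the guarantor signs one attribute."""
    claim = {"name": name, "value": value}
    return {"claim": claim, "signature": sign(claim, GUARANTOR_N, GUARANTOR_D)}


def build_claims_package(wallet, requested):
    """Outer layer: the user bundles only the requested attributes and signs the bundle."""
    attributes = [wallet[name] for name in requested]
    return {"attributes": attributes, "signature": sign(attributes, USER_N, USER_D)}


def verify_claims_package(package):
    """Relying party: check the user's signature first, then each guarantor signature."""
    if not verify(package["attributes"], package["signature"], USER_N):
        return False
    return all(verify(a["claim"], a["signature"], GUARANTOR_N)
               for a in package["attributes"])


wallet = {name: certify(name, value) for name, value in [
    ("First name", "Alice"), ("Last name", "Example"), ("DateOfBirth", "1990-01-01")]}

package = build_claims_package(wallet, ["First name", "DateOfBirth"])
print(len(package["attributes"]), verify_claims_package(package))  # 2 True

tampered = copy.deepcopy(package)
tampered["attributes"][1]["claim"]["value"] = "1985-01-01"
print(verify_claims_package(tampered))  # False
```

Only two of the three stored attributes leave the wallet, which is the data-minimization property the claims package is meant to provide.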

The activity diagram below specifies the process for access of identity attributes by the relying party.

Figure 4.6 Accessing attributes with license

4.3.4 User revoking or editing license

The user can revoke the license for access to identity attributes whenever needed. The revocation process only involves a change in the license registry. Since the license registry contains the duration of the contract and the attributes the license covers, the license for the contract can simply be deleted. Because the access license is verified during every access event, once the relying party requests access and the license is not found, the request will be rejected by the application. Once the user revokes a license, the event will be entered in the AEB. Thus, the user will be able to manage the access license through the license registry and revoke the license whenever they wish.

If there is a need to update the terms of the contract, the duration for a contract or the attributes in the license can be modified by the user.
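Revocation and modification as described in this section can be sketched with an in-memory registry. The `LicenseRegistry` class and its method names are hypothetical, and the `events` list is only a stand-in for the entries that would be written to the AEB.

```python
from datetime import date


class LicenseRegistry:
    """Minimal in-memory registry: one license per relying party (assumed model)."""

    def __init__(self):
        self.licenses = {}  # entity name -> license terms
        self.events = []    # stand-in for events recorded in the AEB

    def grant(self, entity, attributes, start, end):
        self.licenses[entity] = {"attributes": set(attributes), "start": start, "end": end}
        self.events.append(("grant", entity))

    def modify(self, entity, attributes=None, end=None):
        """Change the licensed attributes and/or the contract end date."""
        terms = self.licenses[entity]
        if attributes is not None:
            terms["attributes"] = set(attributes)
        if end is not None:
            terms["end"] = end
        self.events.append(("modify", entity))

    def revoke(self, entity):
        """Revocation is simply deletion of the registry entry."""
        del self.licenses[entity]
        self.events.append(("revoke", entity))

    def is_allowed(self, entity, attribute, today):
        """Checked on every access: a missing license means the request is rejected."""
        terms = self.licenses.get(entity)
        if terms is None:
            return False
        return terms["start"] <= today < terms["end"] and attribute in terms["attributes"]


registry = LicenseRegistry()
registry.grant("example entity", ["First name", "DateOfBirth"],
               date(2019, 6, 25), date(2021, 6, 25))
today = date(2020, 1, 1)
print(registry.is_allowed("example entity", "DateOfBirth", today))  # True

registry.modify("example entity", attributes=["First name"])
print(registry.is_allowed("example entity", "DateOfBirth", today))  # False

registry.revoke("example entity")
print(registry.is_allowed("example entity", "First name", today))  # False
print(registry.events)
```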

4.3.5 Effect on User

As mentioned in previous chapters, consumers expect to have a system where they are the sovereign owners of their identities. This solution provides consumers with a way to control the sharing of their identity and to track its usage. Since the identity attributes are not stored on the relying party’s server, the relying party will need to access the attributes using its public key. Every time the relying party makes an access request, the event is inserted in the AEB. Thus, consumers know every time their information is accessed. The consumer can update their information any time they want. The consumer can also revoke or update the access license, which gives them sovereignty over their digital identity. Also, this solution removes any middleman in access to attributes.

Reliance on third parties has been minimized to identity verification. Once the identity guarantor verifies the identity, there is no other third party that the solution relies on for other use cases. This solution can also be extended to include reputation endorsement for application in the social life of the consumer.

Thus, this solution provides the user with decentralized and legitimate sovereignty over their digital identity by addressing the persisting issues with current day identity management systems.

4.4 Other actors

The identity guarantor plays a vital role in the verification and sharing of user’s identity attributes. Relying parties use the signature of identity guarantor to verify the validity of attributes.

4.4.1 Identity guarantor: Validation of identity attributes

The validation of attributes by the identity guarantor will be the only time any third party is involved in this process. The identity guarantor will be a party that holds authority over the type of attribute that requires validation. Once the attributes are validated, the identity guarantor will sign the attributes to denote their validity. These signed attributes will be returned to the user and stored on their local device.

4.4.2 Effect on identity guarantor

The identity guarantor will be the least affected party in this solution. There will be a need to set up the infrastructure required to support this solution. Since several government entities already have some form of PKI, these changes will not be substantial. The identity guarantor can also integrate its existing KYC processes and infrastructure for validation of attributes.

4.4.3 Effect on relying party

Relying parties will be able to access a user’s identity attributes once the user has approved their access request. A relying party will be required to make changes in terms of infrastructure and business processes regarding KYC operations.

Since the identity attributes have already been validated, the relying parties will not have to perform their own identity verification and KYC processes. The cost of creating the infrastructure for the solution presented above will be balanced by the cost saved on identity verification and background checks. This solution will simplify the registration process for the relying party, since the information is automatically imported from the user’s blockchain application. There will be no need for manual processing, and there will be no intermediary that the relying party has to rely on. Thus, this solution will save administrative costs as well as provide relying parties with independence regarding information processing infrastructure.

4.5 Communication and storage

This section presents technical details along with the limitations of the solution.

4.5.1 Secure communication

Communication between actors is secured through Public Key Infrastructure (PKI). PKI allows for encrypting and signing of data. It is built on public-key cryptography, in which each actor holds a pair of keys: a public key and a private key. These keys are used to encrypt and decrypt various kinds of information. The keys are mathematically related, and therefore information encrypted with one key can only be decrypted with the other key. The public key is distributed to other entities while the private key is kept secret. This process is known as asymmetric encryption. All actors in this solution will use PKI to encrypt messages and establish a secure communication channel.

All boxes with dotted lines in the activity diagrams in previous sections involve the use of PKI. PKI is the basis for digital certification. The use of digital signatures lets actors verify the origin of the information. Each time an identity attribute is sent from a user’s blockchain application to the relying party, or to the User-Network Blockchain (UNB), the attributes are signed using digital signatures. These signatures allow the receiving parties to authenticate that the data package originated from a trusted source. When the relying party receives a claims package signed with the user’s key, the origin of the attributes can be easily verified by validating the signature on the claims package.

Once the claims package has been authenticated as coming from a valid source, each signed attribute in the package is validated against the signature of the identity guarantor. Thus, the relying party can verify that an authorized identity guarantor has validated the attributes.

Figure 4.7 Certification and keys [52]

Figure 4.8 Digital certification [53]

4.5.2 Storage of attributes

Attributes can be stored on the user’s mobile device. Availability of the attributes is affected when the mobile device is not available. This can be solved by keeping a cloud backup of the attributes and the license registry. Once a request is received, the cloud application can validate the request and respond with the requested attributes in a signed claims package.

4.5.3 Limitation

This solution explores the approach of self-sovereignty for sharing of digital identity. In this approach, attributes are shared with relying parties on a need-to-know basis. The consumer can provide consent as well as control the licensing of access to attributes. However, the communication between the relying party and the user’s application is highly dependent on the device used to store the attributes. The solution in its current form does not support device portability. It will have to be further extended to enable communication between the user’s mobile device and the relying party. The initial communication between the relying party and the user’s blockchain application for an access request needs to be designed so that the user can port the attributes and license registry from one platform to another.

5. Future and Conclusion

This section presents additional scenarios that can be satisfied by extending the solution above.

5.1 Online shopping

The user can store their credit card information in the blockchain application on their device. Since credit card information can be transferred in the same manner as the other attributes, the risks of online shopping can be greatly reduced by using the blockchain application presented here. While shopping online, the user can just provide the software token number to the business’s website application. The business can simply request access to the credit card information, and the user can approve it. The record of this attribute access is also inserted in the AEB. This will require better portability for attribute and license registry storage, as the current solution only supports a user’s mobile device.

5.2 Single sign-on

The blockchain application presented above can provide the user with an option to access multiple services across various platforms through a single point of login. The user only has to provide the software token number. The service provider can request access to the attributes, and the user can approve the access. These attributes can be an e-mail address, first name, last name, and any other attribute required by the service provider. The service provider can then access the attributes, and the access event is recorded in the AEB. This functionality can also be used to sign up for services across multiple platforms.

5.3 Conclusion

The literature review of the state of the digital identity environment led to an analysis of the strengths and weaknesses of various identity management systems. In this analysis, it was discovered that consumers generally do not trust the current approaches. Multiple concepts such as federation, attribute-based credentials, identity governance, and KYC were explored in the literature review. After exploring self-sovereign identity, the metrics and principles for a truly self-sovereign identity management system were discussed. The solution above aims to satisfy the principles mentioned in previous chapters. It allows for control and sharing of identity data by the consumer, and the user does not have to share any more data than needed. Along with the functionality provided by the solution, its limitations are also analyzed. The solution has to be extended further to facilitate portability and interoperability. Through the license registry, the solution enables a consumer to issue, modify, and revoke access licenses. Thus, this solution attempts to address the issues faced by consumers today and pushes the idea of digital identity management into the era of digital self-sovereignty.

REFERENCES

[1] Wladawsky-Berger, Irving. MIT Initiative on the Digital Economy, 7 Sept. 2016, http://ide.mit.edu/news-blog/blog/digital-identity-key-privacy-and-security-digital-world.

[2] “The Role of Authentication in Identity Management.” Identity Services at Penn State, 26 Apr. 2016, http://www.identity.psu.edu/resources/documentation/current/the-role-of-authentication/.

[3] McElroy, Craig. “The 3 Forms of Authentication & How We Use Them.” Contegix, 17 July 2015, http://www.contegix.com/the-3-forms-of-authentication-how-we-use-them/.

[4] Clark, Sarah. “The Benefits of Moving to Digital for AML and KYC Compliance.” Mitek, Mitek, 2 Aug. 2018, https://www.miteksystems.com/blog/benefits-of-moving-to-digital-identity-verification-for-aml-kyc-compliance.

[5] “Thomson Reuters Legal Solutions.” Banks Need Next-Generation KYC for Digital Identity | Legal Solutions, https://store.legal.thomsonreuters.com/law-products/solutions/clear-investigation-software/articles/banks-need-next-generation-kyc-for-digital-identity-crisis.

[6] Broeckelmann, Robert. “Authentication vs. Federation vs. SSO – Robert Broeckelmann – Medium.” Medium, Augmenting Humanity, 24 Sept. 2017, https://medium.com/@robert.broeckelmann/authentication-vs-federation-vs-sso-9586b06b1380.

[7] Rouse, Margaret. “What Is Federated Identity Management? - Definition from WhatIs.com.” SearchSecurity, TechTarget, June 2018, https://searchsecurity.techtarget.com/definition/federated-identity-management.

[8] “Attribute Based Credentials.” Patterns - Privacy Patterns, UC Berkeley, https://privacypatterns.org/patterns/attribute-based-credentials.

[9] Pettey, Christy. “Gartner Says Number of Identity Theft Victims Has Increased More Than 50 Percent Since 2003.” Gartner IT Glossary, Gartner, Inc., 6 Mar. 2007, https://www.gartner.com/newsroom/id/501912.

[10] Avram, Abel. “A Guide to Claim-Based Identity.” InfoQ, InfoQ, 6 Oct. 2009, https://www.infoq.com/news/2009/10/Guide-Claim-Based-Identity.

[11] “Know Your Customer: Quick Reference Guide.” PWC, Jan. 2016. https://www.pwc.com/gx/en/financial-services/publications/assets/pwc-anti-money-laundering-2016.pdf

[12] Finley, Klint. “Blockchain: The Complete Guide.” Wired, Conde Nast, 1 Mar. 2018, https://www.wired.com/story/guide-blockchain/.

[13] Lewis, Antony. “A Gentle Introduction to Immutability of Blockchains – Bits on Blocks.” Bits on Blocks, 23 Mar. 2017, https://bitsonblocks.net/2016/02/29/a-gentle-introduction-to-immutability-of-blockchains/.

[14] “What Is Blockchain Technology? A Step-by-Step Guide For Beginners.” Blockgeeks, 1 Jan. 1968, https://blockgeeks.com/guides/what-is-blockchain-technology/.

[15] Worthen, Ben. “The Pros and Cons of Identity Management Projects.” CIO, CIO, 15 Oct. 2003, https://www.cio.com/article/2441905/it-strategy/the-pros-and-cons-of-identity-management-projects.html.

[16] Koren, Shani. “‘Know Your Customer (KYC)" – Neufund.” Neufund, Neufund, 5 Mar. 2018, https://blog.neufund.org/know-your-customer-kyc-3c5d32897983.

[17] “KYC: 3 Steps to Know Your Customer.” Trulioo: Global Identity Verification, 10 Aug. 2018, https://www.trulioo.com/blog/kyc/.

[18] “SOAPatterns.org.” SOA Patterns, http://soapatterns.org/candidate_patterns/federated_identity.

[19] Stallings, William. “Understanding Federated Identity.” Network World, Network World, 31 Aug. 2007, https://www.networkworld.com/article/2285444/tech-primers/understanding-federated-identity.html.

[20] “The Claims-Based Identity Model.” About Processes and Threads (Windows), https://msdn.microsoft.com/en-us/library/ee517291.aspx.

[21] “What Is Claims-Based Identity? - Definition from WhatIs.com.” SearchSecurity, TechTarget, https://searchsecurity.techtarget.com/definition/claims-based-identity.

[22] “Introduction to Claims-Based Authentication and Authorization in .NET.” Technical Blog - Future Processing, 4 Mar. 2015, https://www.future-processing.pl/blog/introduction-to-claims-based-authentication-and-authorization-in-net/.

[23] Bennage, Christopher, et al. “Federated Identity.” Microsoft Docs, Microsoft.com, 22 June 2017, https://docs.microsoft.com/en-us/azure/architecture/patterns/federated-identity.

[24] Lewis, Antony. “A Gentle Introduction to Self-Sovereign Identity – Bits on Blocks.” Bits on Blocks, 3 Aug. 2018, https://bitsonblocks.net/2017/05/17/gentle-introduction-self-sovereign-identity/.

[25] Allen, Christopher. “The Path to Self-Sovereign Identity.” CoinDesk, CoinDesk, 1 May 2016, https://www.coindesk.com/path-self-sovereign-identity/.

[26] Kettunen, Antti. “Self-Sovereign Identity Delivers MyData in Practice.” Tieto, 25 Aug. 2017, https://perspectives.tieto.com/blog/2017/08/self-sovereign-identity-delivers-mydata-in-practice/

[27] Koning, Merel, Paulan Korenhof, Gergely Alpár, and Jaap-Henk Hoepman. The ABC of ABC. N.p.: Privacy Enhancing Technologies Symposium, n.d. PDF.

[28] “ABC4Trust - Attribute-Based Credentials for Trust (ABC4Trust).” Edited by EPractice Editorial Team, Migration to Open Source Software – Beaumont Hospital Dublin, Ireland | Joinup, Joinup.ec.europa.eu, 28 Mar. 2012, https://joinup.ec.europa.eu/document/abc4trust-attribute-based-credentials-trust-abc4trust.

[29] "Fusion Middleware Developer's Guide for Identity Governance Framework." Lesson: All About Sockets (The Java™ Tutorials Custom Networking). N.p., 27 Feb. 2013. Web. 23 Aug. 2018.

[30] Gaehtgens, Felix. “Why Liberty's Identity Governance Framework Is so Important.” KuppingerCole, KuppingerCole, 25 Feb. 2015, https://www.kuppingercole.com/blog/gaehtgens/why-libertys-identity-governance-framework-is-so-important.

[31] Rijmenam, Mark. “How to Solve a Global Identity Crisis Using Blockchain and Propel Identity into the 21st Century.” Datafloq - Connecting Data and People, https://datafloq.com/read/author/francisco-maroto/384.

[32] “Banquapp - Banquapp.com.” Banquapp - Banquapp.com, Banquapp, https://www.banquapp.com/.

[33] Gupta, Vinay. “A Brief History of Blockchain.” Harvard Business Review, Harvard Business Review, 5 Apr. 2017, https://hbr.org/2017/02/a-brief-history-of-blockchain.

[34] “Proof of Work vs Proof of Stake: Basic Mining Guide.” Blockgeeks, Blockgeeks, https://blockgeeks.com/guides/proof-of-work-vs-proof-of-stake/.

[35] Kirkland, Justin. “Okay, Here's What You Actually Need to Know About Bitcoin.” Esquire, Esquire, 27 Dec. 2017, https://www.esquire.com/lifestyle/money/a14500146/beginners-guide-bitcoin-what-is-bitcoin/.

[36] Ross, Sean. “How Does a Block Chain Prevent Double-Spending of Bitcoins?” Investopedia, Investopedia, 19 June 2015, https://www.investopedia.com/ask/answers/061915/how-does-block-chain-prevent-doublespending-bitcoins.asp.

[37] Floyd, David. “How Bitcoin Works.” Investopedia, Investopedia, 13 July 2018, https://www.investopedia.com/news/how-bitcoin-works/.

[38] Sutardja Center for Entrepreneurship & Technology. BlockChain Technology Beyond Bitcoin. Berkeley, California: Sutardja Center for Entrepreneurship & Technology, n.d. PDF.

[39] "What Is Ethereum? A Step-by-Step Beginners Guide [Ultimate Guide]." Blockgeeks. N.p., 01 Jan. 1968. Web. 23 Aug. 2018. https://blockgeeks.com/guides/ethereum/.

[40] "Blockchain Architecture." Pluralsight. Pluralsight, 11 Oct. 2017. Web. 23 Aug. 2018. https://www.pluralsight.com/guides/blockchain-architecture.

[41] Botjes, Edzo. "Pulling the Blockchain Apart.. The Transaction Life-cycle." Medium. Augmenting Humanity, 11 Aug. 2017. Web. 23 Aug. 2018. https://medium.com/ignation/pulling-the-blockchain-apart-the-transaction-life-cycle-7a1465d75fa3.

[42] Samman, George. “How Transactions Are Validated On A Distributed Ledger.” SAMMANTICS, 8 Mar. 2016, https://www.sammantics.com/blog/2016/3/6/how-transactions-are-validated-on-a-shared-ledger.

[43] Scutz, John. “A Guide to Understanding the Blockchain Architecture.” The Market Mogul, 1 Feb. 2018, https://themarketmogul.com/blockchain-architecture-guide/.

[44] “Basic Primer: Blockchain Consensus Protocol.” Blockgeeks, Blockgeeks, 1 June 1969, https://blockgeeks.com/guides/blockchain-consensus/.

[45] Castor, Amy. “A (Short) Guide to Blockchain Consensus Protocols.” CoinDesk, CoinDesk, 4 Mar. 2017, https://www.coindesk.com/short-guide-blockchain-consensus-protocols/.

[46] Levin, Moe. 7 Weird, Wacky and Potentially Wonderful Applications for Blockchain. 15 May 2018, https://cryptopotato.com/7-weird-wacky-potentially-wonderful-applications-blockchain/.

[47] “Blockchain in Insurance: Applications and Pursuing a Path to Adoption.” EY, 2017. https://www.ey.com/Publication/vwLUAssets/EY-blockhain-in-insurance/$FILE/EY-blockhain-in-insurance.pdf

[48] “17 Blockchain Applications That Are Transforming Society.” Blockgeeks, Blockgeeks, https://blockgeeks.com/guides/blockchain-applications/.

[49] Windley, Phillip. “How Blockchain Makes Self-Sovereign Identities Possible.” Computerworld, Computerworld, 10 Jan. 2018, https://www.computerworld.com/article/3244128/security/how-blockchain-makes-self-sovereign-identities-possible.html.

[50] Bukarica, Leto. “Two-Factor Authentication | Blog | Deploy Inc.” Deploy, Deploy, 21 Aug. 2017, https://www.deployinc.com/tips-tricks/two-factor-authentication-google-/.

[51] “MySQL Enterprise Transparent Data Encryption (TDE).” MySQL, https://www.mysql.com/products/enterprise/tde.html.

[52] “Intro to Digital Signatures.” Digital Signatures Introduction, https://www.securedsigning.com/resources/intro-to-digital-signatures.

[53] “What Is Digital Signature? - Definition from WhatIs.com.” SearchSecurity, TechTarget, https://searchsecurity.techtarget.com/definition/digital-signature.

[54] Nabi, Atif Ghulam. “Comparative Study on Identity Management Methods Using Blockchain.” University of Zurich, Department of Informatics (IFI). https://files.ifi.uzh.ch/CSG/staff/Rafati/ID%20Management%20using%20BC-Atif-VA.pdf.
