User Access Privacy in Oauth 2.0 and Openid Connect
Total Page:16
File Type:pdf, Size:1020Kb
User Access Privacy in OAuth 2.0 and OpenID Connect 1st Wanpeng Li 2nd Chris J Mitchell Department of Computer Science Information Security Group University of Aberdeen Royal Holloway, University of London Aberdeen, UK Egham, UK [email protected] [email protected] Abstract—Currently widely used federated login (single sign- are vulnerable to attack. A range of mitigations have been on) systems, notably those based on OAuth 2.0, offer very proposed for RP developers [2], [13], [15], [25], designed little privacy for the user, and as a result the identity to help secure OAuth 2.0 and OpenID connect systems. provider (e.g. Google or Facebook) can learn a great deal However, apart from the work of Hammann et al. [7], about user web behaviour, in particular which sites they discussed in greater detail in Section 4.3, relatively little access. This is clearly not desirable for privacy reasons, attention has been paid to the degree to which OAuth and in particular for privacy-conscious users who wish to 2.0/OpenID Connect protects (or, fails to protect) user pri- minimise the information about web access behaviour that vacy. The purpose of this paper is to address this issue, i.e. they reveal to third party organisations. In this paper we give to thoroughly investigate the user access privacy properties a systematic analysis of the user access privacy properties of both OAuth 2.0 and OpenID Connect, covering all the of OAuth 2.0 and OpenID Connect systems, and in doing so relevant protocol flows. describe how simple it is for an identity provider to track The remainder of this paper is structured as follows. user accesses. We also propose possible ways in which these Section2 provides an introduction to the operation of privacy issues could to some extent be mitigated, although we OAuth 2.0 and OpenID Connect. In Section3, we provide conclude that to make the protocols truly privacy-respecting a systematic analysis of the user access privacy properties requires significant changes to the way in which they operate. of OAuth 2.0 and OpenID Connect. We discuss the causes In particular, it seems impossible to develop simple browser- of the identified privacy issues and propose possible mit- igations in Section4. Section5 concludes the paper. based mitigations. Index Terms—OAuth 2.0, OpenID Connect, Authentication, 2. Background Authorization, Privacy 2.1. OAuth 2.0 1. Introduction The OAuth 2.0 specification [8] describes a system Since the OAuth 2.0 authorisation framework was that allows an application to access resources (typically published at the end of 2012 [8], it has been adopted personal information) protected by a resource server on by many websites worldwide as a means of providing behalf of the resource owner, through the consumption federated identity services (what we refer to here as single of an access token issued by an authorization server. In sign-on (SSO)). By using OAuth 2.0, websites can reduce support of this system, the OAuth 2.0 architecture involves the burden of password management for their users, as the following four roles (see Fig.1). well as saving users the inconvenience of re-entering attributes that are instead stored by identity providers 1) The Resource Owner is typically an end user. (IdPs) and provided to relying party (RP) websites as 2) The Client is a server which makes requests on required. There is a correspondingly rich infrastructure behalf of the resource owner (the Client is the of IdPs providing identity services using OAuth 2.0; for RP when OAuth 2.0 is used for SSO). example, Ghasemisharif et al. [6] identified 65 IdPs among 3) The Authorization Server generates access tokens the Alexa top one million websites in 2018. Indeed, some for the client, after authenticating the resource RPs, such as the website USATODAY1, support as many owner and obtaining its authorization. as six different IdPs. 4) The Resource Server stores the protected re- The security of OAuth 2.0 and OpenID Connect is sources and consumes access tokens provided therefore of critical importance, and it has been widely by an authorization server (this entity and the examined both in theory and in practice [1], [3]–[5], Authorization Server jointly constitute the IdP [10]–[12], [14]–[17], [19]–[25]. Previous studies (see, for when OAuth 2.0 is used for SSO). example, [10]–[12], [21]–[25]) show that, in practice, RPs do not always implement OAuth 2.0 correctly; as a result, Fig.1 summarises the OAuth 2.0 protocol. The client many real-world OAuth 2.0 and OpenID Connect systems (1) sends an authorization request to the resource owner. In response, the resource owner generates an authoriza- 1. https://login.usatoday.com/USAT-GUP/authenticate/? tion grant (or authorization response) in the form of a code, and (2) sends it to the client. After receiving the authorization grant, the client initiates an access token request by authenticating itself to the authorization server and presenting the authorization grant, i.e. the code issued by the resource owner (3). The authorization server issues (4) an access token to the client after successfully authen- ticating the client and validating the authorization grant. The client makes a protected source request by presenting the access token to the resource server (5). Finally, the resource server sends (6) the protected resources to the client after validating the access token. Figure 2. OpenId Connect — Protocol Overview which browser redirection is required. Note that, in the descriptions below, protocol parameters given in bold font are defined as mandatory in the OAuth 2.0 Authorization Framework [8]. 2.2. OpenID Connect OpenID Connect 1.0 [18] builds an identity layer on top of the OAuth 2.0 protocol. The added functionality Figure 1. OAuth 2.0 — Protocol Flow enables RPs to verify an end user identity by relying on an authentication process performed by an OpenID Provider (OP) (for consistency, we refer to an IdP instead of an OP 2.1.1. OAuth 2.0 used for SSO. As noted above, in in the remainder of the paper). In order to enable an RP to order to use OAuth 2.0 as the basis of an SSO system, verify the identity of an end user, OpenID Connect adds a the resource server and authorization server together play new type of token to OAuth 2.0, namely the id token. This the IdP role; the client plays the role of the RP, and the complements the access token and code, which are already resource owner corresponds to the user. OAuth 2.0 and part of OAuth 2.0. An id token contains claims about the OpenID Connect SSO systems build on user agent (UA) authentication of an end user by an OP, together with any redirections, where a user (U) wishes to access services other claims requested by the RP. OpenID Connect (see protected by the RP which consumes the access token Fig.2) supports three authentication flows [18], i.e. ways generated by the IdP. The UA is typically a web browser. in which the system can operate, namely Hybrid Flow, The IdP provides ways to authenticate the user, asks the Authorization Code Flow and Implicit Flow. user to grant permission for the RP to access the user’s attributes, and generates an access token on behalf of the 2.3. Tokens user. After receiving the access token, the RP can access the user’s attributes using the API provided by the IdP. The tokens used by OAuth 2.0 and OpenID Connect, notably the code, access token and id token, are key to the 2.1.2. OAuth 2.0 protocol flows. The OAuth 2.0 frame- operation of these protocols and also have major privacy work defines four ways in which an RP can obtain an implications. We therefore next describe their structure access token, namely Authorization Code Grant, Implicit and content. Grant, Resource Owner Password, and Client Credentials Grant. Each of these methods involves a distinct set of • An authorization code is an opaque value which protocol flows. is typically bound to an identifier and a URL of Two of the flows, i.e. Resource Owner Password and an RP. Its main purpose is as a means of giving Client Credentials Grant, are not relevant in the context the RP authorisation to retrieve other tokens from of SSO. The Resource Owner Password flow requires the the IdP. In order to help minimise threats arising user to reveal his or her IdP account password credentials from its possible exposure, it has a limited validity to the resource owner, i.e. it requires a high level of trust period and is typically set to expire shortly after in the resource owner; this severely limits the adoption issue to the RP. of this flow. The Client Credentials Grant flow only • An access token is a credential used to authorise involves the client and the authorization server, i.e./ no access to protected resources stored at a third party user interactions are needed in this flow. Also, no browser (e.g. the Resource Owner). Its value is an opaque redirection is performed in these two flows; the client string representing an authorization issued to the directly talks to authorization server. RP. It encodes the right for the RP to access data As a result, and given our focus on SSO privacy issues, held by a specified third party with a specific scope in this paper we restrict our attention to the Authoriza- and duration, granted by the end user and enforced tion Code Grant and Implicit Grant protocol flows, in by the RP and the IdP.