with PAWS: Injecting RESTful Privacy Web Services

Peter Bodorik1, Dawn Jutla2, Ajith Bryn1

1 Faculty of Computer Science, Dalhousie University
2 Sobey School of Business, Saint Mary’s University
Halifax, NS, Canada

[email protected], (Bodorik, Bryn)@cs.dal.ca

Abstract

We address an existing problem of complying with legal requirements when collecting online users' private data. We design tools for a developer, or privacy engineer, to inject privacy web services in customer-facing web pages. The tool is aimed at increasing a software engineer’s productivity when addressing privacy requirements. It also reduces the time for organizations to comply with best practices, standards, and/or regulations for privacy in software.

1 Introduction

In spite of existing legal requirements, guidelines, and organizations’ policies around the collection of private data, the reality is that software engineers are unaware of, or outright ignore, common privacy requirements [1, 2, 3, 4, 5]. All too often privacy-untrained software developers are the creators of web applications that collect personally identifying and profiling data from users.

Governments are increasing the number of privacy requirements in their regulations. For instance, the European Union’s forthcoming Data Protection law contains Data Protection by Design (including [2, 3]) principles, which map to privacy requirements. Due to global trade, such European legal leadership in upholding citizen privacy is expected to positively impact businesses in other markets.

Ann Cavoukian uses a positive sum approach in her widely translated Privacy by Design (PbD) principles to reframe negative discourses around privacy. Interweaving PbD principles and NIST’s controls [1] in international privacy standards [7] for software systems enables stakeholders to open up and share data [6] in controlled ways for the advancement of societies everywhere. To support innovation, the OASIS Privacy by Design Documentation for Software Engineers Technical Committee is currently developing a privacy standard to help software engineers embed privacy functionality into their software [7, 8]. Taken together, regulations, standards, and a desire for open innovation can motivate retrofit of existing web applications to insert privacy preserving actions when collecting, using, and distributing private information. Our method to inject privacy web services tackles a technical privacy injection problem, and addresses privacy engineers’ productivity.

We focus our privacy solutions on a common means of collection of private data from online users, or subjects, using forms-based web pages. In most countries privacy regulations exist (e.g., PIPEDA in Canada [15]) that stipulate that before an online user/subject is asked by a web page for private information, the subject should be informed by a privacy notice about the collection of private data and its intended use and distribution and, furthermore, the subject’s consent should be obtained. We concentrate on notice and consent mechanisms as they are deceptively simple yet powerful privacy constructs. Our scope and objectives are:
• To provide a web page developer with privacy web services, which he/she can use to inject functionality in web applications to (a) enable the showing of privacy notice(s) to a data subject and (b) obtain data subject consent.
• To develop a prototype privacy web services injection tool and to examine initial metrics.

2 PAWS: Privacy Architecture for Web Services

We first describe the Privacy Architecture for Web Services (PAWS), as the injection of privacy services described here relies on information that is collected and stored within the context of PAWS. The main goal of PAWS is to provide a holistic approach to collection and management of private data in a SOA environment, in which software integration and access to private data in databases are facilitated by web services. PAWS exploits key features of the Web Services Architecture to develop a privacy architecture with desirable properties that were described in [10].

Figure 1 shows an architectural abstraction for how web services may be monitored/managed for collection and use of personal data. A Request Monitor intercepts web service requests, and extracts information on the web service and its operations’ parameters. The Request Monitor consults the Privacy Knowledge Base (PKB) to determine whether the request may proceed or is rejected. This decision depends on the personal data contained in the request parameters, semantics of the operation, i.e., how the personal data in question is used, the privacy policy, and the consent under which the personal data has been collected.
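The Request Monitor decision described above can be illustrated with a minimal sketch. It is not the PAWS implementation: the `PrivacyKnowledgeBase` class, its fields, the `may_proceed` function, and the example service and purpose names are all hypothetical, and the PKB is reduced to in-memory dictionaries. The sketch only shows how the four inputs named in the text (personal data in the parameters, operation semantics, privacy policy, and consent) could combine into a proceed/reject decision.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyKnowledgeBase:
    """Hypothetical in-memory stand-in for the PKB (assumption, not PAWS code)."""
    # (service, operation) -> names of parameters that carry personal data
    private_params: dict = field(default_factory=dict)
    # (service, operation) -> purpose for which the operation uses the data
    operation_purpose: dict = field(default_factory=dict)
    # purposes that the privacy policy permits at all
    policy_purposes: set = field(default_factory=set)
    # (subject_id, parameter) -> purposes the subject consented to
    consents: dict = field(default_factory=dict)


def may_proceed(pkb, service, operation, subject_id, params):
    """Return True if the request may proceed, False if it is rejected."""
    key = (service, operation)
    personal = [p for p in params if p in pkb.private_params.get(key, set())]
    if not personal:
        return True  # no personal data in the request parameters
    purpose = pkb.operation_purpose.get(key)
    if purpose not in pkb.policy_purposes:
        return False  # the privacy policy does not allow this use
    # every personal data item must have been collected under matching consent
    return all(purpose in pkb.consents.get((subject_id, p), set())
               for p in personal)


# Example data (hypothetical): an e-mail address collected for billing only.
pkb = PrivacyKnowledgeBase(
    private_params={("crm-ws", "sendOffer"): {"email"}},
    operation_purpose={("crm-ws", "sendOffer"): "marketing"},
    policy_purposes={"marketing", "billing"},
    consents={("s1", "email"): {"billing"}},
)
decision = may_proceed(pkb, "crm-ws", "sendOffer", "s1", {"email": "a@b.c"})
```

In this example the request is rejected, because the subject consented to billing use only while the operation uses the e-mail address for marketing. A real monitor would presumably derive these facts from the PKB's web service, policy, and consent records rather than from literals.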

978-1-943436-05-7 / copyright ISCA, SEDE 2016 September 26-28, 2016, Denver, Colorado, USA

The monitor ensures that the request, and also the decision whether or not to proceed, are recorded in the web services log (ws-log). If the web service request is allowed to proceed, then the web service is invoked and its business logic is executed. The business logic may access a database (DB) storing personal data.

Figure 1. Privacy Services Injection in PAWS

The Reply Monitor intercepts the response of the web service execution, and examines the content of this reply message for personal data and its usage in a manner similar to the handling of the web service request by the Request Monitor. The monitor records the web service reply in the ws-log together with the decision whether the reply (containing private data) is to be communicated to the web service requestor. Central to PAWS is the content of its PKB. The knowledge base contains information on:
• web services, including their operations and semantics, and input and output parameters including information on which are private data;
• applications that invoke web services and input/output parameters that are exchanged with web services;
• which web services invoke other web services with input and output parameters that are exchanged and are deemed to be private;
• which web services access DBs, including which private data is retrieved or stored;
• web pages, including
  o which private data is retrieved from an organization’s DB and displayed to the subject;
  o which private data is collected from the subject and which web services are used to store collected private data in an organization’s DB;
  o which privacy services are invoked, including services to (i) show notice about the collection of private data, (ii) obtain consent to collect private data, and (iii) other privacy services, such as those for de-identification and secure data transfer.

Another architectural element that PAWS draws on is the privacy web ontology [9]. In PAWS, the Privacy Engineer guides a semi-automatic software Privacy Information Agent to mine web pages, the ws-log that contains log records of web services requests and replies, and the DB log, to ensure that the information in the PKB is correct and to discover new information to be stored in the PKB.

SOA is based either on (i) the Web Service Architecture (WSA) stack of protocols, which uses the Simple Object Access Protocol (SOAP) for client applications to invoke web services, or more recently (ii) the simpler Resource Oriented Architecture (ROA) in which RESTful web services are invoked using HTTP. As adoption of ROA currently outstrips WSA, we implement PAWS and our tool for the ROA environment.

Other privacy architectures exist [e.g. 9, 11, 12] but not for privacy web service injection. PAWS is not only ROA-based; it provides details dealing with intelligent agents and knowledge bases. The model of abstraction for “context of use” that the PAWS architecture’s agents use is found in [12]. Audit-based compliance control [13] and dynamic inference [14] influence the PAWS design.

3 Injection of Privacy Web Services

A business problem arises from non-compliance with a privacy regulation when private data is collected online, but (a) the subject is not provided with notice about the collection and use of the private data, and (b) the subject’s consent for the use of the collected data is not obtained from the subject.

The injection of privacy services into the web page to include notice and consent consists of the following workflow: (i) identify the web page; (ii) modify the web page to show notice; (iii) modify the page to obtain consent; (iv) allow entry of personal data (subject action); (v) modify the page to store consent; and (vi) store the collected personal data. A design constraint is that the web services associated with the web pages cannot be changed. Modifying such web services would entail examining and modifying the code generating the web service, which is time consuming and error prone in unforgiving production environments.

We further design our implementation solution using categorization. Web pages may be broadly classified into static and dynamic types, depending on how they are generated. When a static page, e.g., an .html page, is requested by a browser, the web server simply fetches it and returns it to the browser without any modifications. Dynamic pages, such as .xhtml, .php, and .aspx, contain server-side scripts. When a browser requests a dynamic page, the server executes a script that actually generates an .html page and returns it to the browser.

We additionally classify web pages according to whether an engineer is able, using the tool, to modify the source pages. For instance, for .html, .xhtml, and .aspx web pages we can modify the source page to inject privacy services. For other types of pages generated by server-side scripting, such as .php pages, we utilize another approach. We use the PAWS monitor to intercept the .html page generated by a server-side script and inject privacy services into the page on-the-fly, before the page is returned to the browser. Injecting privacy services by modifying the source page is preferred as it leads to less overhead, as will be discussed later. We describe both approaches in the following sections.

Figure 2. Privacy Services Injection in PAWS

The injection architecture is shown in Figure 2. It shows a Privacy Engineer interacting with a Controller via a User Interface (UI). In the discussion of the architecture below, we illustrate the situation where the privacy engineer selects one page, containing one form that collects personal data, in order to inject privacy services to show notice to and obtain consent from the consumer or citizen (referred to as subject). However, we note that PAWS supports privacy service injections in multiple forms across multiple web pages.

The PAWS Knowledge Base (KB) is shown in Figures 1 and 2 as a collection of related KBs. Figure 2 shows that the Web-Page module is used to obtain pertinent information about the web page selected by the Privacy Engineer, including which personal data the page collects. Assisted by the Notice-Consent module, the Privacy Engineer specifies which notice is to be displayed and which consent is to be sought from the subject. Finally, the Inject-Services module is used to inject the privacy services, which display the selected notice and consent user interfaces, into the web page. As we mentioned before, the privacy injection prototype can inject other privacy services, such as de-identification, secure data transfer between the page and the organization’s application server, and authentication, but for simplicity we focus here only on the notice and consent privacy services.

We refer to our privacy-service injected web page as a privacy aware page. The controller uses the Notice-Consent module to obtain, and provide to the Privacy Engineer, information on (i) notice and consent forms that are available, and (ii) relevant web services and applications. Relevant web services include the web service that is used to store personal data, and those web services that will consume collected personal data. Consuming applications are likewise relevant.

The controller also uses, and shows to the privacy engineer, the page’s server-side code and the page’s rendered image. Finally, the controller provides suggestions to the engineer on which notice and consent are to be used for the page under consideration. The Notice-Consent module displays available (previously created or canned) notices and consents to the engineer. He/she also has the ability to create new notices and consent forms. The notices and consent forms are stored in the Notice-Consent-KB.

Once the engineer decides on the notice and consent to be injected, the controller modifies the source .xhtml web page to invoke web services that show notice and obtain consent, and shows it to the engineer together with the rendered image of the page. After examining the .xhtml source page, and also its image as it would be rendered by a browser, the engineer is able to backtrack and modify the page by injecting it with another notice and consent functionality. The engineer can also inject other privacy services, such as de-identification of particular input data. Once the engineer is satisfied with the result, (s)he commits the injection and the web page is permanently modified. Furthermore, the KB is updated with information about the modified web page, that is, that the web page is now privacy-aware as it displays a privacy notice and also obtains consent from the user for the collection of personal data.

As an aside, PAWS supports rapid compilation of a data model of an organization’s forms-based interfaces to personal data. The Web-Page-KB records the status of all web pages, regardless of whether a page was examined previously and found to already show notice and obtain consent, or was modified through privacy services injection. Thus the organization has a record of its privacy-aware pages, and can use PAWS to generate rich private-data models.

3.1 Injecting services into a Source Page

Injecting privacy services for notice and consent in a simple .xhtml source page is achieved in the following way:

1. We modify the form’s header to invoke the Notice-Consent-ws privacy web service. This web service has parameters Notice-ID and Consent-ID. The service queries the Notice-Consent-KB knowledge base to retrieve the notice and consent templates based on their identifiers. The service will pop up a new browser window showing a scrolled-bound privacy notice and a user consent sub-form area. The consent form may include checkboxes and other input constructs to obtain details on the use and distribution of collected private data for specific purposes. The user is provided with an option to give consent via clicking on the Agree/Accept or Disagree buttons on the consent form. The user can also separately confirm that (s)he has read the privacy notice.

2. When the user provides consent to data collection and use by clicking on the Accept button, the script associated with the Accept button stores the notice and consent IDs, and any input in the consent sub-form area, in the hidden fields of the personal data collection form. The subject is thereafter permitted to enter her/his private data in the data collection form’s fields.

3. After the user provides her private data, she clicks on the Submit button, which has been modified with a script to invoke a web service called Store-Notice-Consent-ws (see Fig. 3). This web service stores the user consent in the organization’s Consents DB. Finally, as part of the original script of the Submit button, the Store-personal_data-ws web service is invoked, which stores the data collected by the form in the organization’s DB. Note that the Store-personal_data-ws web service has not been modified.

Figure 3. Actions in Script of the Submit Button

3.2 Injecting Privacy Services On-the-Fly

For organizations with numerous existing web applications where software engineers choose not to modify the source web pages using our automated approach to privacy service injection described above, or when dealing with non-(x)html or non-aspx pages, we use an alternative automated approach consisting of two phases. The first phase prepares information for injection of a privacy service, while the second phase injects the privacy services into the page each time the page is fetched. Instead of injecting web services directly into the server-side script of the source page, information is saved in the knowledge bases and the web page is flagged for on-the-fly injection of privacy services.

The second phase involves the Reply Monitor in PAWS, which intercepts the web page and web services replies, as shown in Figure 1. When the browser requests a web page, such as a .php page containing a server-side script, it is fetched and the page’s server-side script is executed to generate an .html web page. The Reply Monitor intercepts the generated .html page before it is sent to the browser. The monitor contains a fast access internal directory that identifies pages requiring privacy services. If an intercepted page does not require a privacy service, it is sent to the browser. However, if it does, privacy services are injected into the generated .html page in the same way as described in the previous section:
- The Notice-Consent-KB is queried to retrieve the notice and consent to be inserted.
- The form’s header is modified to invoke the web service Notice-Consent-ws with parameters that are fetched from the Notice-Consent-KB. The web service has the same functionality as in the previous case above.
- The form’s Submit button is also modified to invoke the Store-Notice-Consent-ws with the same parameters as in the previous case.

4 PAWS Performance

We developed a prototype to demonstrate our two approaches to privacy web services injection using typical web application development tools (Jersey 2.5.1, JQuery, REST, Eclipse JUNO IDE, Tomcat 7.0, and MySQL Server 5.0). To understand whether there may be a performance barrier to privacy injection adoption, we quantify the delays due to injected privacy services. For the purposes of the experiments, we use an isolated test environment in which there are no other activities but those associated with running the experiment. We logged timestamps at the beginning and end of activities and show them in Table 1. We performed experiments with and without privacy services’ injections and with and without network delays. Although it is obvious that network delays will dominate, the exercise to quantify the privacy web services delay is useful as it has not been done before.

Table 1. Delays due to Privacy Web Services Injection in Source Pages in ms

Column 1: measured activity of web services delays
Column 2: w/o injection of Notice-Consent, one hardware platform, no network delays
Column 3: with injection of Notice-Consent, one hardware platform, no network delays
Column 4: w/o injection of Notice-Consent, distributed system with network delays
Column 5: with injection of Notice-Consent, distributed system with network delays

Column 1                                              Col. 2   Col. 3   Col. 4   Col. 5
Store-personal_data-ws invocation                        1.6      1.6      1.6      1.6
Store-personal_data-ws invocation message transit        2.8      2.8     62.3     62.3
Store-personal_data-ws execution                         3.2      3.2     10.3     10.3
Store-personal_data-ws reply delivery and display        2.3      2.3     49.8     49.8
Notice-Consent-ws invocation                               -      1.6        -      1.6
Notice-Consent-ws message transit                          -      2.4        -     89.6
Notice-Consent-ws execution                                -      2.6        -     26.8
Notice-Consent-ws-reply message transit                    -      2.8        -    110.6
Storage-in-hidden-fields                                   -      1.3        -      1.3
Store-Notice-Consent-ws invocation                         -      1.6        -      1.6
Store-Notice-Consent-ws message transit                    -      2.9        -    130.1
Store-Notice-Consent-ws execution                          -      3.2        -     12.2
Store-Notice-Consent-ws-reply message                      -      2.3        -     50.3
Total Delay                                              9.9     30.6      124    548.1

4.1 Delays without network data transfer

The last 13 rows of column 3 of Table 1 show delays with injected privacy web services in source pages. The total delay, shown in the last row of column 3 (30.6 ms), is almost three times higher than the delay without privacy services (9.9 ms, shown in the last row of column 2). However, both delays are so low that they are considered to be negligible.

4.2 Servers connected by networks

The overall measured delays show clearly that network delays are a dominant consideration. It should be noted that the total delay of 548 ms, or about half a second, reported in the last row of the last column, includes two user interactions: one to retrieve and display notice and consent, and a second to store the consent and then to store the private data entered on the form. The delay to retrieve the web page over the Internet from the web server (107 ms) is not included in Table 1.

Since PAWS supports notice and consent contextually on a per page basis, notices and consent data are very short, and the user can quickly give permission or deny access. As societies access higher network bandwidths we see from our quantification that the performance of applications with injected privacy web services can improve greatly.

4.3 Increasing Engineers’ Productivity

PAWS can be used in various use case scenarios for improving privacy engineers’ productivity, including the following two variants. In one scenario, there is a separation of duty between the web page developer and the privacy engineer. The developer creates a web page without privacy services and hands it over to the privacy engineer, who then uses a tool, based on our approach/prototype, to inject privacy services, such as the ones discussed here for notice and consent. The tool provides the privacy engineer with relevant privacy-related information to make an informed decision as to which notice and consent templates are to be utilized.

The privacy engineer has information available on (1) which personal data elements are being obtained, (2) in which database relations or other representation this information is stored, (3) which other web services and applications access the relations, and (4) how that information is being used. The privacy engineer then uses our tool to inject privacy services without explicit coding.

In the second scenario, the software developer creates a web page and then uses our tool to inject the privacy services. In this scenario, standard basic generic notices and requests for consent may be prepared, together with guidance on their modification for simpler situations. Furthermore, the tool saves the developer’s time by inserting appropriate code to inject privacy services. We designed a first version of the privacy engineer’s interface, shown in Figure 4 at the end of the paper, and we intend to conduct user studies to measure the increase in productivity for selected scenarios.

5 Conclusions

Private information about subjects may be collected through various means, ranging from using web pages, asking the user for specific information while performing online transactions or when providing some targeted service(s), collecting data from the subject’s various communicating electronic devices, or amassing big data from sensors observing the subjects. The larger notion of injecting privacy web services to preserve privacy can apply to all these environments, inclusive of sensor environments where software privacy agents may act on the user’s behalf to forbid or permit personal data collection. PAWS support for splitting notices and consent forms in small contextual chunks across individual web pages is user-friendly.

References

[1] NIST Security and Privacy Controls for Federal Information Systems and Organizations – Appendix J, Revision 4: Privacy Controls Catalog.

[2] Cavoukian, A. Privacy by Design: The 7 Foundational Principles Implementation and Mapping of Fair Information Practices at www.ipc.on.ca/images/Resources/pbd-implement-7found-principles.pdf, January 2011.

[3] Cavoukian, A. Privacy by Design: Leadership, Methods, and Results, Chapter 8: http://link.springer.com/chapter/10.1007%2F978-94-007-5170-5_8, 5th Int. Conference on Computers, Privacy & Data Protection, European Data Protection: Coming of Age, Springer, Brussels, Belgium, 2013.

[4] Michelle Finneran Dennedy, Jonathan Fox, Thomas Finneran (2014). The Privacy Engineer’s Manifesto: Getting from Policy to Code to QA and Value, Apress, Jan 2014.

[5] Dawn N. Jutla, Peter Bodorik, Deyun Gao, Management of Private Data: Addressing User Privacy and Economic, Social, and Ethical Concerns. Secure Data Management, VLDB 2004 Workshop: 100-117.

[6] Shadbolt N., Berners-Lee T., Hall, W., The Semantic Web Revisited. IEEE Intelligent Systems, Volume 21:3 (2006): 96 – 101.

[7] A. Cavoukian, F. Carter, D. N. Jutla, J. Sabo, Privacy by Design Documentation for Software Engineers, Version 1.0 Committee Specification Draft 01, 25 June 2014, http://docs.oasis-open.org/pbd-se/pbd-se/v1.0/csd01/pbd-se-v1.0-csd01.pdf

[8] A. Cavoukian, F. Carter, D. N. Jutla, J. Sabo, F. Dawson, S. Fieten, J. Fox and T. Finneran, Annex Guide to Privacy by Design Documentation for Software Engineers Version 1.0 Committee Note Draft 01, 25 June 2014. Available: http://docs.oasis-open.org/pbd-se/pbd-se-annex/v1.0/cnd01/pbd-se-annex-v1.0-cnd01.pdf.

[9] D.N Jutla, P. Bodorik, Y. Zhang, PeCAN: An architecture for users’ privacy-aware electronic commerce contexts on the semantic web, Information Systems, 31: 4–5, June–July 2006, 295–320.

[10] Garcia A. and Lucena C. Taming Heterogeneous Agent Architectures. CACM, May 2008, 51(5):75-81.

[11] Karjoth G., Schunter M. and Waidner M. Privacy-enabled services for enterprises. Technical Report, IBM Research, Zurich Research Laboratory, 2002.

[12] Bodorik P., Jutla D., Wang X. Consistent Privacy Preferences (CPP): Model, Semantics, and Properties. ACM Symposium on Applied Computing, 2008, 2368-2375.

[13] Cederquist J.G., Corin R., Dekker M. A. C., Etalle S., den Hartog J.I., Lenzini G. Audit-based compliance control. International Journal of , 6:2 (2007):133 – 151.

[14] An X., Jutla D.N, Cercone N., Auditing and Inference Control for Privacy Preservation in Uncertain Environments, First European Conference on Smart Sensing and Context, Enschede, Netherlands, Oct. 2006, 159-173.

[15] The Personal Information Protection and Electronic Documents Act (PIPEDA). Available at https://www.priv.gc.ca/leg_c/leg_c_p_e.asp, last viewed March 18, 2016.

Figure 4. The Privacy Engineer’s PAWS Interface