Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and Rt Acking in Healthcare Enterprise Environments

Journal of Digital Forensics, Security and Law

Volume 10 Number 1 Article 3

2015

Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and rT acking in Healthcare Enterprise Environments

Manghui Tu Purdue University Calumet

Kimberly Spoa-Harty Purdue University Calumet

Liangliang Xiao Frostburg State University

Follow this and additional works at: https://commons.erau.edu/jdfsl

Part of the Computer Engineering Commons, Computer Law Commons, Electrical and Computer Engineering Commons, Forensic Science and Technology Commons, and the Information Security Commons

Recommended Citation Tu, Manghui; Spoa-Harty, Kimberly; and Xiao, Liangliang (2015) "Data Loss Prevention Management and Control: Inside Activity Incident Monitoring, Identification, and rT acking in Healthcare Enterprise Environments," Journal of Digital Forensics, Security and Law: Vol. 10 : No. 1 , Article 3. DOI: https://doi.org/10.15394/jdfsl.2015.1196 Available at: https://commons.erau.edu/jdfsl/vol10/iss1/3

This Article is brought to you for free and open access by the Journals at Scholarly Commons. It has been accepted for inclusion in Journal of Digital Forensics, Security and Law by an authorized administrator of (c)ADFSL Scholarly Commons. For more information, please contact [email protected]. Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License.

DATA LOSS PREVENTION AND CONTROL: INSIDE ACTIVITY INCIDENT MONITORING, IDENTIFICATION, AND TRACKING IN HEALTHCARE ENTERPRISE ENVIRONMENTS Manghui Tu Kimberly Spoa-Harty Liangliang Xiao Department of Computer Department of Computer Department of Computer Information Technology Information Technology Science and Information & Graphics & Graphics Technologies Purdue University Calumet Purdue University Calumet Frostburg State University

ABSTRACT As healthcare data are pushed online, consumers have raised big concerns on the breach of their personal information. Law and regulations have placed businesses and organizations under obligations to take actions to prevent data breach. Among various threats, insider threats have been identified as a major threat on data loss. Thus, effective mechanisms to control insider threats on data loss are urgently needed. The objective of this research is to address data loss prevention challenges in healthcare enterprise environment. First, a novel approach is provided to model internal threat, specifically inside activities. With inside activities modeling, data loss paths and threat vectors are formally described and identified. Then, threat vectors and potential data loss paths have been investigated in a healthcare enterprise environment. Threat vectors have been enumerated and data loss statistics data for some threat vectors have been collected. After that, issues on data loss prevention and inside activity incident identification, tracking, and reconstruction are discussed. Finally, evidences of inside activities are modeled as evidence trees to provide guidance for inside activity identification, tracking, and reconstruction.

1. INTRODUCTION environment. First, a novel approach is provided to model internal threat, specifically As healthcare data are pushed online, inside activities. With inside activities consumers have raised big concerns on the modeling, data loss paths and threat vectors breach of their personal information. Law and are formally described and identified. Then, regulations have placed businesses and threat vectors and potential data loss paths organizations under obligations to take actions have been investigated in a healthcare to prevent data breach. Among various enterprise environment. Threat vectors have threats, insider threats have been identified as been enumerated and data loss statistics data a major threat on data loss. Thus, effective for some threat vectors have been collected. mechanisms to control insider threats on data After that, issues on data loss prevention and loss are urgently needed. The objective of this inside activity incident identification, tracking, research is to address data loss prevention and reconstruction are discussed. Finally, challenges in healthcare enterprise

© 2015 ADFSL Page 27 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. evidences of inside activities are modeled as essentially the interactions between the evidence trees to provide guidance for inside healthcare business entity and other entities activity identification, tracking, and such as other healthcare service providers, reconstruction government agencies, insurance companies, healthcare equipment/pharmacy providers. The data management party is essentially the 2. SYSTEM MODEL IT teams including database/web/network/computer system An abstraction of classical healthcare administers, and the security/compliance team. enterprise environment is modeled as a multi- The client party is essentially the patients and tier system that consists of multiple data their legal guardians. access or management parties, including a data module, service providers, business users, data In this research, the data items to be management team, and client party. The protected by using data loss prevention overview of the system is shown in Figure 1. mechanisms include personal health The data module is a central and critical part information (PHI) such as social security num- of this architecture and is composed of a data bers (SSNs), data of birth, payment data, storage system, a data process module, and a insurance policy information in digital format, data access module. The data storage module and personal electronic health records (EHRs), is essentially databases and files that contain as well as business data such as business client the information to be protected. The data information. In this architecture, the data process module is composed of a healthcare module is interacted with other modules, for information system which process the example, databases and file systems are information stored in the data storage module managed by the database administers, for business clients, patients, government protected by system and network agencies, and healthcare service providers. The administrators as well as the data access module is essentially web based security/compliance team. The service module interfaces for users. The service provider party and the business module will not only read the is composed of different services that are information stored in the data module, it will provided by the healthcare business, including also create and update the records in the data hospital services, lab test services, health module. In most cases, the client module will prevention care services, disease diagnosis and only read the information stored in the data treatment services, nursing, etc. These services module such as patient’s health record and involve many human users such as doctors, payment information. technicians, and nurses. The business party is

Figure.1. An architecture overview of the healthcare enterprise system.

3. RESEARCH information system, denoted as Ω., the inside METHODOLOGY activity model WDOA can be defined as below.

Definition 1.1 (user): Auserui U = 3.1 Inside Activity Identification {u1, u2, …, uL}, where L 0 and 0 i L,is Methods a specific subject to access and consume In order to develop an effective method to recourses in Ω to perform tasks defined by the identify inside activities from regular business work role, where L denote the number of users activities, in this paper, two methods will be in Ω. introduced to formally describe the relationship A user in Ω is a specific subject who can between inside activities and regular business be a doctor, a nurse, a business user, an activities. 1). The Work Role-Data Asset-User information technology staff, or a non-empty Operation-Access Preference (WDOA) model set of software processes in Ω. and, 2). The User-Operation-Data Asset- Access Path (UODP)model. Definition 1.2 (work role): A work role

wi W ={w1, w2, …, wM}, where M 0 and 0 3.1.1 The WDOA (Work Role- i M, is a group of users who have the Data Asset-User Operation- same type of tasks and the same set of Access Preference) Model privileges to access and consume recourses in Ω In a well-managed healthcare enterprise to perform tasks, where M denote the number environment, appropriate security policy and of work roles in Ω. acceptable user policy should be in place and A work role in Ω can be doctor, nurse, enforced by policy based access control (Chen, business users, information technology staff, or Laih, Pouget, & Dacier, 2005; Ellard & a non-empty set of software processes in Ω. Megquier, 2004; Johnson & Willey, 2011; Each user ui should have a well-defined work Murphey, 2007). For such a healthcare

© 2015 ADFSL Page 29 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. role wi and been assigned with data access to copy and move sensitive data objects privilege based on the need-to-know principle. around but should not delete or modify sensitive data objects, and should not copy the Definition 1.3 (sensitive asset): An data to personal devices; an application asset d D = {d , d ,…,d }, where N 0 and i 1 2 N developer will need to query sensitive data 0 i N, is a category of resource that is objects a lot but should not modify sensitive Ω owned by owner of and can be consumed by data. Therefore, a user’s accesses to sensitive user, where N denote the number of resource data objects in the healthcare enterprise Ω categories in . environment can be modeled into certain The healthcare business should identify its pattern based on the work role of different own set of assets D. For example, user users. accounts, computer and network resources, and Definition 1 (WDOA Model): The protected healthcare information. Also, the WDOA Model is a 4-tuple {W, D, O, A} data healthcare data should be classified into access preference model, where the first field different sensitive levels, and the set of represents a work role in W, the second field sensitive levels is denoted as S, where S = {s1, represents an asset in D, the third field s2, …, sm}. Each data object in D, di, is represents an operation in O, and the last field assigned a sensitive label sj. A data item that represents an access preference in A. is labeled as si has higher sensitive level than a The WDOA Model can give a hint on data item that is labeled as sj if i > j. To fulfill tasks defined by the work role, a user accesses whether a data access activity is normal or sensitive assets with certain preference. Let A not. However, the WDOA model cannot denote the set of preference level where A = precisely determine whether an access activity is an inside activity or not. For example, {a1, a2, …, an}, then the sensitive access preference can be defined by a set of 2-tuples, application developer should have access preference a to user account information, and (di, aj). A data item with access preference ai is 1 accessed with higher frequency than a data any access to such data would be suspicious. An IT administrator has low access preference item with access preference aj if i > j, and a1 is defined as the lowest access preference, e.g., ai (i > 1) to personal healthcare records for zero access preference. data management purpose, and a copy access to those data items cannot determine whether Definition 1.4 (operation): An such access is suspicious or not. operation oi O = {o1, o2, …, oΖ}, where Z 0 and 0 i Z, is a user activity to access or 3.1.2 The UODP (User- consume resources defined in Ω,whereZ Operation-Data Asset-Access denote the total number of operations that can Path) Model be performed by work roles defined in Ω. An inside activity model UODP is defined for The specific operations in O include read, an arbitrary healthcare information system Ω. write, execute, delete, shutdown, print, copy, In such a model, to reach a sensitive asset di, and any other operation that is defined in a an insider needs to have known or unknown specific business sector. Based on the access paths to asset di (Ellard & Megquier, preference levels, some operations will be 2004; Kowalski et al, 2008; Moore, Cappelli, & performed regularly, and some should rarely be Trzeciak, 2008). performed or may never be performed. For example, an IT system administrator may have

Page 30 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. Definition 2.1 (access path): An access a monitored USB device, the 4-tuples (IT

path pi P = {p1, p2, …, pK},where K 0 and system administrator, copy, dk, monitored 0 i K, denote the access channels or USB) is not an inside activity since the access media for users to access or consume monitored USB device is still under the control assets in the healthcare enterprise of the healthcare enterprise and will not lead environment, where K is number of access to data loss of dk at the current stage. With paths enabled in Ω. sufficient resources, the elements in U, O, D can be well classified and identified based on A specific access path can be USB access, current technologies. However, due to the CD access, VM instance access, email access, complexity of the healthcare information or any other access mechanism that allows system, data storage techniques, user access users to access the sensitive asset, in controls, and usage obligations, the legitimately way or illegitimately way. For identification and classification of access paths example, to steal an asset di for personal use, is still challenging to for healthcare enterprises. an insider may copy asset di from the data storage site and then send to a personal USB 3.2 Inside Activity Modeling for device that has been attached to a system Incident Tracking and within a healthcare enterprise environment. An Reconstruction insider may first create secret user account and setup a virtual machine (VM) instance, and The attack tree approach that is first proposed by Schneier (1999) is used to systematically then copy di to the VM instance to be accessed analyze security threats. Attacks are modeled later. Let Path({ui}, k) denote the set of access and represented by a tree structure where the paths to asset dk by a subset of users {ui}, then root node represents the final goal, other Path({ui}, k) can be defined by a set of 4- interior nodes represent subgoals, and leaf tuples (ui, oj, dk, pm). nodes are attacking approaches to achieve the Definition 2 (UODP Model): The final goal (Poolsapassit & Ray, 2007). Children UODP Model is a 4-tuple {U, O, D, P} data of a node in the tree can be one of the two access path model, Where the first field logical types: AND and OR. To reach the goal, represents a user in U, the second field all of its AND children, or at least one of its represents an operation in O, the third field OR children, must be accomplished. Attack represents an asset in D, and the last field trees grow incrementally by time and they represents an access path in P. capture knowledge in a reusable form. First, With the 4-tuples (U, O, D, P) model, it is possible attack goals must be identified. Each possible to determine whether an access attack goal becomes the root of its own attack activity is an inside activity or not. An tree. Construction continues by considering all application developer ui copy healthcare possible attacks against the given goal. These records dk to a un-monitored VM instance, the attacks form the AND and OR children of the 4-tuples (application developer, copy, dk, un- goal. Next, each of these attacks becomes a monitored virtual machine) can be definitely goal and their children are generated. Figure 2 considered as a suspicious inside activity since shows an example of an attack tree of the an un-monitored VM instance is beyond the inside threat, “achieving the root privilege”. In control of the healthcare enterprise and can such an attack, the attacker is a regular user

lead to data loss of dk. While an IT system and has a lower access privilege to the target administrator ui copy healthcare records dk to (which needs root privilege), and conducts a

© 2015 ADFSL Page 31 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. series of attacking operations to achieve the the “AND” relationship among the states or root privilege as the system user. Note that sub-goals, which are working together to links that are connected with a line represents achieve the same parent goal.

Figure 2. An attack tree of an internal threat “achieving the root privilege”.

4. DATA LOSS RESULTS threat vectors, examine potential data loss AND ANALYSIS threats, and to analyze potential data loss controls, the following studies will be Based on the UODP access model, if a conducted. First, potential threat vectors will legitimate user accesses and operates be enumerated and feasible operation sensitive data through an uncontrolled access controls will be listed for each threat vector. path, it can lead to potential data loss. The The status of the enforcement of such combination of such uncontrolled access controls is also indicated. Second, statistical paths and access operations are threats to data loss prevention results are provided. data loss, and are defined as the data loss threat vectors. Therefore, to prevent and 4.1 Data Loss Threat Vectors control data loss in healthcare enterprise Identification environment, the first critical task is to Data loss threat vectors can be categorized identify the set of data loss threat vectors, as external storage media and transmission more specifically, the set of uncontrolled media. External hard disks, USB flash drives, access paths. In this research, we will explore PDA’s, CD/DVD, floppy disks, and tapes the potential threat vectors in the healthcare are traditional storage media, while cell enterprise environment. Safend Data phones, SD card readers, IPAD, FTP, web Protection Suite, an end point security sites, and printing can be categorized as product from Wave, has been used in this transmission media. The only exception is research to regulate data loss prevention in cloud storage which is a new technology an enterprise healthcare environment. Data combining transmission and storage. To loss results are collected before and after the control data operations data, port controls placement and enforcement of end point such as block, allow, force encryption, set to security protection. To identify data loss

Page 32 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. read only are enforced. Data filtering credit card numbers, social security numbers, technologies based on expressions are and healthcare records. The results are deployed to filter sensitive data such as shown in Table 1.

Table 1 The enumeration of data loss threat vectors in an enterprise healthcare environment Threat Vectors Port Control Options Enforcement Status Block/Allow/Force External Hard Drives Encryption/Set To Read Only Enforced Block/Allow/Force USB Flash Drives Encryption/Set To Read Only Enforced Cell Phones Block/Allow Not Implemented PDA's Block/Allow Enforced as external storage media SD Card Readers Block/Allow/Set To Read Only Not Implemented iPad Block/Allow Not Implemented Block/Allow/Force CD/DVD Encryption/Set To Read Only Enforced No data to report - technology in environment does not allow for floppy Floppy Drives Block/Allow drives No data to report - technology in Tape Drives Block/Allow environment does not allow for tape drives Due to product, high administration efforts Websites none to identify and analyze risks. Blocked by perimeter within the domain, by static IP, site IP allowed to use, and FTP none user security - 3 factor authentications. Cloud Storage none Not Implemented Email filtering, algorithms to look for Email none sensitive data, will force encryption can block physical printers from connecting, but not Printing network printers Not Implemented

As indicated in table 1, traditional as FTP can easily be controlled since FTP storage media are usually well controlled by can easily be replaced with an alternative enforcing port controls, since they have been secure technology. It means that these well documented and the monitoring and technologies are not required to accomplish control technologies have been well designed. healthcare activities and thus can be blocked. Cloud storage is a new technology and not Some other transmission media such as well documented (Biggs & Vidalis, 2010; printing are not easily controlled since Bruening & Treacy, 2009; Brunette & Mogull, printing is required for routine business 2009), thus, mature control technologies are activities. Also, due to the nature of printing not ready yet. Some transmission media such (graphical presentation of information),

© 2015 ADFSL Page 33 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. sophisticated identification and examine period, Safend Security Protection Suite was technologies are needed to filter sensitive deployed in the enterprise healthcare data. Currently, efficient deployment of such environment to control data loss. Then, a 90- technologies has not been ready yet. day time period data collection is conducted with the deployment of the end point 4.1 Data Loss Analysis security protection technology (denoted as /A). Due to the limitation of the technology A 90-day time period data collection is and the feasibility of the policy enforcement conducted prior to the deployment of any in the enterprise environment, only part of end point security protection technology the threat vectors, USB, CD/DVD, external (denoted as /P). After the 90-day time hard disk, and phone, are controlled.

Table 2 The potential data loss path accesses and operations before and after the deployment of Safend. # # Data Data Threat Vector Users/P Users/A # Files/P # Files/A Size/P Size/A USB 2765 413 4449429 374015 1123 G 432.4G CD/DVD 157 44 212067 8530 291.7 G 76.5G External Hard Disk 161 21 443805 9356 804.39 G 2.4G Phone 426 5805 0 0 0 0

As indicated in Table 2, the number of number of files and the size of files moved in users access potential data loss threat vectors, Table 2, employees tend to abuse such threat such as USB, CD/DVD, and external hard vector accesses and operations without data disks have been significantly reduced (USB loss control, since such significant reductions users from 2765 to 413, CD/DVD users from (for examples, the number of USB accessed 157 to 44, external hard disk users from 161 files from 4449429 to 374015, the number of to 21). The only exception is the use of CD/DVD accessed files from 212067 to 8530, phone and the usage has been significantly the number of external disk accessed files increased (from 426 to 5805). One reason from 443805 to 9536) does not affect business could be the block or reluctance of the use of activities in the enterprise healthcare email and other controlled communication environment. Please note that no data is paths. However, users may plug in to charge transferred to phones due to the reason that devices without proper removable media most phones when connected are seen as protection, phones can be used to taking removable storage or external hard drives, pictures and then transmitted out without and thus will adhere to the policies already control. Therefore, such an abrupt increase enforced for that media type. In Table 3, it needs to be carefully analyzed and better to indicates that a large part of accesses are conduct a thorough investigation. A encrypted, which can significantly reduce the countermeasure to such data loss threat potential of unintentional data loss due to through phone can be achieved to enforce theft, mis-sent, and misconfiguration. non-personal phone policy in sensitive However, there are some accesses and working environment. As indicated by the operations are unencrypted but all of them

Page 34 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. are approved. As stated in the security usage predesigned processes such that all files policy and recorded in the usage logs, such accessed and operations on files are all logged approved usages are required to follow in audit reports. Table 3 The encrypted and unencrypted data loss path accesses and operations after the deployment of Safend # # Data # Data Threat Vector Users/A Files/A Size/A Users/A # Files/A Size/A Encrypted Use Unencrypted Use USB 187 247680 334 226 126335 98.4 CD/DVD 32 N/A 64.6 12 N/A 11.9 Ext. Hard Disk N/A N/A N/A N/A N/A N/A Phone N/A N/A N/A N/A N/A N/A

5. INSIDER ACTIVITY With deployment of end point security IDENTIFICATION & protection product such as Safend Security Protection Suite, it is possible to control data TRACKING loss through traditional external storage Even with data loss threat vectors media. For example, with appropriate access identification, control, and monitoring, inside control, any data accessed can be logged and activities cannot be detected or identified can be blocked to be moved to USB storage with current access control techniques since media or other external storage media. the access operation and access path are both However, potential uncontrolled data loss legitimate user privileges. Therefore, forensics access paths could still exist. investigation on inside activities in (1) A combination of multiple access healthcare enterprise environment, including technologies in the extended healthcare incident detection and reconstruction is enterprise environment. For the purpose of critically needed (Tu et al, 2012). Current business trip, an employee ui with work role research on inside threat detection and wj (e.g., sales representative) may need to identification (Eberle & Holder, 2009; Moore, move data (e.g., dn) outside of the enterprise Cappelli, & Trzeciak, 2008; Phua, Lee, network by applying access operation (i.e., Smith, & Gayler, 2007) and event copy), and such access has a high access reconstruction mechanisms (Case et al, 2008; preference for wj. Then, based on the WDOA Tang, & Daniels, 2005; Tu et al, 2012) are model, the 4-tuple (sales representative, dn, limited in real world since they require a copy, high access preference), will be defined comprehensive set of information including as a legitimate access without an alert. By social information and explicit dependence applying UODP, it can help to detect knowledge, which are not available in an potential data loss due to access violations. enterprise environment. Hence, a novel End point security product can monitor mechanisms are critical to identify potential regular access to the data and any violation inside activity and reconstruct the inside (e.g., copy dn to an unauthorized personnel activity for tracking. USB device pm) may result in an alert since the union of copy and p (copy p ) has 5.1 Data Loss Identification m m been pre-defined as data loss threat vector.

© 2015 ADFSL Page 35 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. In this way, the combination of UODP and 5.2 Inside Activity Operation and end security protection product together can Evidence Modeling create an extended enterprise environment Inside activity identification and track outside the physical enterprise network require that inside activities to be thoroughly boundary. However, there will be other studied. In such study, inside activities will techniques available to bypass the control. be modeled as attack tree and then For example, employee can photograph the conducted in a simulated environment and a data if the read access is permitted, or forensic investigation will be followed for storage media can be bit-by-bit imaged each attack successfully committed. without leaving any evidence on the media if Fingerprints will be located and identified for it is connected through write block devices. each operation of the inside activity. The (2) Forgotten paths due to unsuccessful metadata of the fingerprints of each attack change management. For example, operation, such as log name, format, location, misconfiguration could be unnoticed during timestamps, and security features. are system update, security product update, or composed into nodes, which will then become human resource change. With such child nodes of the leaf nodes in the misconfiguration, media block and media augmented attack tree. This entire process access log may not be enabled. will finally result in an evidence tree for each (3) A combination of uncontrolled inside activity studied. Fingerprints of accesses within the physical network of sensitive operations of the evidence trees will healthcare enterprise environment. In Table be identified as incident identifiers and the 1, some access technologies, such as web site, evidence tree can provide the contextual phone, and cloud technology, have not been information to reconstruct security incidents controlled. As analyzed in the above section, automatically. phone technology has the potential to result In this research, two inside activities in data loss. With appropriate surveillance have been studied, both of which utilize technology deployed, such data loss access removable media (USB drive and CD-ROM) path can be monitored and detected. Cloud as the access paths leading to data loss in the technology, web site, and local virtualization healthcare enterprise environment. technology can provide a perfect uncontrolled 5.2.1 Inside Activity Operation data loss access path. Local data can be moved between physical storage within the Modeling network and a local VM instance, which can Inside Activity A, as shown in Figure 2, is then connect to remote private cloud storage a typical industrial espionage inside activity. website. With this secret path, data In such incident, the insider has all the encrypted in the VM instance can be moved needed privileges to access data and the USB outside the physical network boundary of the ports which are required to perform the healthcare business. After the local VM user’s duty. However, those sensitive data instance deleted, little evidence will be left should not be copied to personal USB devices within the boundary of the healthcare since this may result in potential information enterprise network. The only feasible control leakage. To perform such an attack, the user is to block any encrypted traffic (Wippich, is logged into system with all needed 2007). privileges, navigate to sensitive data, copy and paste the sensitive data into the USB

Page 36 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. device. The USB device is then removed and the user is logged out of the system later.

Figure 2. Case Two Internal Attack A (USB copy)

Inside Activity B, as shown in Figure potential information leakage. To perform 3, is also a typical industrial espionage inside such an attack, the user can log into system attack. In such an attack, the attacker has with all needed privileges and navigate to all the needed privileges to access sensitive sensitive data, and then burns the sensitive data and to access the CD-ROM Drive, data onto a CD-ROM. The CD-ROM is then which is needed to perform the user’s duty. removed and the user is logged out of the However, those sensitive data should not be system. copied to CD-ROM since this may result in

Figure 3. Case Two Internal Attack B (CD-Rom copy attack)

© 2015 ADFSL Page 37 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. 5.2.2 Inside Activity Evidence a Windows machine may leave some forensic Modeling traces in the registry, some are persistent for a long time and some are volatile. If a piece The results of Attack A and Attack B are of registry fingerprint is coupled with shown in Figures 4 and 5. Each figure information from the event logs and file contains an augmented threat tree that systems, the insider attack may be tracked represents the vulnerability exploited, the and reconstructed. Based on our observation, steps needed to exploit it, the attacker's relevant fingerprints can be located in operations, and the fingerprint generated by machine’s System hive, Software hive, the those operations. The final goal of both user’s NTuser.dat hive, the setupapi.log that attacks is to steal sensitive information from keeps a history of all devices installed via a business information system with desired plug and play, and the Security event log. system permissions. Operations conducted on

Figure 4. A part of the evidence tree for Inside Activity A.

The inside attack A is conducted on with driver letter E. Examining the 7/29/2011. Based on information in the RecentDocs registry key with the tool registry, at 1:03:39 AM, a Centon USB RegExtract shows that _USBSTOR.sql, device with a serial number of 6AFA4AAD80 Removable Disk (E:), _USB.sql, and a file was attached to the machine. At 1:04:34 AM, named “highly sensitive things” which is the attack was logged into the system and flagged in the honeypot as a sensitive file, left fingerprints in the security event log. were recently accessed. At 1:14:44 AM, User Based on additional fingerprints in the synchronized the document titled with registry, the USB device with serial number “highly sensitive things”, with the Removable 6AFA4AAD80 can be linked with the disk

Page 38 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. Disk (E:). The evidence tree of attack A is sensitive” at 5:53:12. Analysis of the IDE shown in Figure 4. Device Class registry shows that a CD ROM was documented at 5:47:34, a minute after The inside attack B is conducted on Worker 2 logged on to the system. Finally, 7/29/2011. Based on fingerprints in the the user Worker 2 is found to burn the file security event log, user Worker 2 logged into “highly sensitive things” to the CD ROM at the system at 5:46: 26, and attempted to 5:53:12. The evidence tree of attack B is create a hard link with “highly sensitive very shown in Figure 5.

Figure 5. A part of the evidence tree for Inside Activity B

Now we discuss how to determine the to identify a potential inside activity. incident identifiers for the two inside Operations on sensitive assets can be labeled activities. To perform the job allowed for a as safe or highly risky for each work role wi, user’s work role, the operation and access and can be defined by a tuple {{W, O, D, path of an inside activity are allowed and P}, R}, where R defines the risk levels. cannot be prevented, thus, any individual Hence, a risk table containing entries of operation on data and access path cannot {{W, O, D, P}, R} can be developed for identify an inside activity. One approach is each service. Once a risky operation (e.g., to apply the UODP model and utilize the copy di) has been performed by user ui (with contextual information of the incident such work role of wj), the ui’s operations will be as a joint of the operations of data access tracked to look for an operation pi (an access and path access (e.g., copy di access USB) path) associate with di (e.g., USB access

CD access email access) performed by ui. Spafford, 2004; Popovsky, Frincke, and If the two operations (data access O or Taylor, 2007; Rowlinson, 2004; Tan, 2001; access path P) are discovered, then a Tang & Daniels, 2005; Wilson & Wolfe, potential inside activity is identified. Since 2003; Yasinsac & Manzano, 2001). These such combinations cannot be totally existing research efforts focus on prohibited and the knowledge of such organization-level framework design such as combinations cannot be obtained from access policy or management. None of them has control, the detection has to rely on the addressed the details of the technology part logged information in the system. of forensics readiness, e.g., mechanisms of the application and system event logging, Another challenge in digital forensics fingerprint storage and archiving, and investigation is the lack of efficient digital evidence-handling procedures, which are forensics investigation mechanisms. Huge essential to enable forensics readiness for amount of artifacts of events and operations computer information systems. Our research are logged in the system, which may presented in this paper attempts to provide a introduce inefficiency to internal incident practical mechanism to automatically tracking and reconstruction. Many of the identify, track, and reconstruct attacks or security breaches are not investigated due to inside activities, through the identification the unaffordable effort required to perform a and tracking of the evidences of the attacks forensics investigation (Sheyner, 2002; or inside activities. Todtmann, Riebach, & Rathgeb, 2007; Tu et al, 2012). Therefore, to improve the An insider usually has the desired responsiveness and to free businesses and privilege and does not need to conduct any public organizations’ burden on the incident malicious activity (or attack) to obtain the report and investigation process, an incident privilege to access sensitive assets. Current reconstruction mechanism should be in place malicious activity monitoring and detection to track inside activity incident techniques have limitations to effectively automatically. To automate the detect inside activities (Moore, Cappelli, & reconstruction of an inside activity incident, Trzeciak, 2008; Tu et al, 2012). Some external contextual information is needed to research works have attempted to address correlate individual operations of such inside threat modeling and detection issues incident, which can only be learned from (Bradford, Brown & Perdue, 2004; Burford, logged information from the networks and Lewis, & Jakobson, 2008; Chivers, 2009; information systems within the healthcare Eberle & Holder, 2009). Bradford, Brown, & enterprise environment. Therefore, Perdue (2004) proposed principles for mechanisms such as automatic tracking and proactive computer-system forensics reconstruction of a crime scene should be investigation on security incidents include designed (Tu et al, 2012). internal threats, but no technical implementation of the proposed principles has been given and their focus is not threat 6. RELATED WORK detection. Burford, Lewis, & Jakobson (2008) Forensics readiness has recently been a big proposed a comprehensive framework research concern in digital forensic defining a large set of internal threat investigation and information assurance ‘observables’, and a graph theory based (Carrier & Spafford, 2003; Carrier & method to model individuals’ behavior

Page 40 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. (Chivers, 2009). Eberle & Holder (2009) by a subject (or a user) with higher (or equal) proposed an inside activity detection method security clearance and an object (data) can in which behavioral events are modeled as only be read by a subject (or a user) with graphs and abnormal behaviors such as lower (or equal) security clearance. These inside activities can be identified by access control mechanisms can protect searching abnormal subgraphs. The above confidentiality and integrity upon the approaches offer the advantage of modeling authorization of access requests from users, potential attacker and providing interesting however, they have no control on data after insights into observable behavior (Chivers, accesses are granted. Also, insiders usually 2009). However, their applications are have the desired access privileges to access limited by the availability of social data objects, thus, access control mechanisms knowledge of the insiders. Our research, will have limited effectiveness on inside however, will simply require the locating and activity identification and tracking. identification the fingerprints left in the systems by operations of attacks or inside activities, with the guidance of evidence 7. CONCLUSION models, attack identifiers, and access This paper addressed the data loss preference models. prevention management problem in The WDOA (Work Role-Data Asset-User healthcare enterprise environment. First, a Operation-Access Preference) Model and the novel approach is provided to model inside UODP (User-Operation-Data Asset-Access activities and a UODP inside activity Path) model reply on the classification of modeling mechanism is proposed. With work role and data asset. Similarly, access inside activities modeling, data loss paths control models such as role based access and threat vectors are formally described and models and Lattice based access models identified. Second, threat vectors and [Harris, 2012] all rely on such on the potential data loss paths have been classification or labeling of subjects and investigated in a healthcare enterprise objects. The role based access control models environment. Threat vectors have been classify users to a set of work roles and each enumerated and data loss statistics results role is assigned with a set of access privileges. for some threat vectors have been collected A subject (or a user) can exercise a and analyzed. After that, issues on data loss permission only if the permission is prevention and inside activity incident authorized for the subject's (or user’s) active identification, tracking, and reconstruction role. The Lattice based access models are are discussed. Finally, inside activities are mandatory access control models. The Bell- conducted in a simulated healthcare LaPadula Model focuses on the protection of environment, evidences of inside activities confidentiality of information such that an are collected, analyzed and then modeled. object (data) can only be read by a subject Evidence trees have been developed for inside (or a user) with higher (or equal) security activities, which are expected to provide clearance and an object (data) can only be guidance for internal activity incident write by a subject (or a user) with lower (or identification and reconstruction. equal) security clearance. The Biba Model focuses on the protection of system integrity such that an object (data) can only be write

REFERENCES Biggs, S. and Vidalis, S. (2010). Cloud CENZIC. (2008). Q1 Cenzic application Computing Storms: IJICR 1(1), pp. 61- security trends report. [online]. Available: 68. http://www.cenzic.com/downloads/Cenzi c_AppSecTrends_Q3_Q4-2008.pdf. Bradford, P., Brown, M., Perdue, J. (2004). Towards proactive computer-system Chen, P., Laih, C., Pouget, E. and Dacier, forensics. IEEE International Conference M. (2005). Comarative survey of local on Information Technology: Coding and honeypot sensor to assist network Computing (ITCC 2004). forensics. Proceedings of the 1st International Workshop on Systematic Bruening, P. J. and Treacy, B. C. (2009). Approach to Digital Forensics Cloud computing: privacy, security Engineering, 120-132. challenges. Privacy & Security Law Report by The Bureau of National Chivers, H., Nobles, P., Shaikh, S., Clark, J., Affairs, Inc. [online]. Available: Chen, H. (2009). Accumulating Evidence http://www.bna.com. of Insider Attacks. 1st International Workshop on Managing Inside Security Brunette, G. and Mogull, R. (2009). Security Threats (MIST09). Guidance for critical areas of focus in Cloud Computing V2. 1. CSA (Cloud Eberle, W. and Holder, L. (2009). Insider Security Alliance), USA. [online]. threat detection using graph-based Available: approaches. Proceedings of IEEE http://www.cloudsecurityalliance.org/gui Cybersecurity Applications & Technology dance/csaguide. Conference for Homeland Security (CATCH), 237-241. Burford, J., Lewis, L., and Jakobson, G. (2008). Insider threat detection using Ellard, D. and Megquier, J. (2004). DISP: situation-aware MAS. In IEEE 11th practical, efficient, secure and fault- International Conference on Information tolerant distributed data storage. ACM Fusion, 1–8, Germany. Transactions on Storage. 1(1). 71-94. Carrier, B. & Spafford, E. (2003). Getting El Emam, K., Neri, E., Jonker, E., Sokolova, physical with the digital investigation M., Peyton, L., Neisa, A., Scassa, T. process. International Journal of Digital (2010). The inadvertent disclosure of Evidence, 2(2). personal health information through peer‐ to‐peer file sharing programs. J. Carrier, B. & Spafford, E. (2004, July). An American Medical Informatics Assoc., event-based digital forensic investigation 17(2), 148–158. framework. In Proceedings of Digital Forensic Research Workshop. Ernst & Young. (2011). Data loss prevention: keeping your sensitive data out of the Case, A. Cristina, A., Marziale, L., Richard public domain. White Paper. [online]. G., & Roussev, V. (2008). FACE: Available: automated digital evidence discovery and https://www.watchguard.com/tips- correlation. Digital Investigation,5, s65- s75.

Page 42 © 2015 ADFSL Data Loss Prevention and Control: Inside Activity Incident Monitoring... JDFSL V10N1 This work is licensed under a Creative Commons Attribution 4.0 International License. resources/grc/wp-data-loss- Information Security. 39, 17-52. prevention.asp. Murphey, R. (2007). Automated windows Fratto, M. (2008). Security survey: we’re event logs forensics. Journal of Digital spending more, but data’s no safer than Investigations. 4S, S92-S100. last year. [online]. Available: http://www.informationweek.com/news/s Phua, C., Lee, V., Smith, K. and Gayler, R. ecurity/management/showArticle.jhtml?a (2007). A comprehensive survey of data rticleID=208800942. mining-based fraud detection research. [online]. Available: Halbesleben, J.R.B, Wakefield, D.S. and http://www.bsys.monash.edu.au/people/ Wakefield, B.J. (2008). Work-arounds in cphua/. healthcare settings: literature review and research agenda. Health Care Manage- Poolsapassit, N. & Ray, I. (2007). ment Rev., 33(1), pp. 2–12. Investigating computer attacks using attack trees. IFIP International Harris, S. (2012). CISSP All-In-One Exam Federation for Information Processing, Guide. 6th edition, ISBN: 978- Vol. 242. Advanced Digital Forensics III. 0071781749. Popovsky, B. E. & Frincke, D. (2004). Hoffman, P. (2007). RSA security reports Adding the fourth “R”. In Proceeding of low level of trust in online banking the 2004 IEEE Workshop on Information security. eWeek News. [online]. Assurance. Available:http://www.eweek.com/c/a/Se curity/RSA-Survey-Reports-Low-Level- Popovsky, B. E., Frincke, D., and Taylor, C. of-Trust-in-Online-Banking-Security/. (2007). A theoretical framework for organizational network forensic readiness. Johnson, M. E, and Willey, N. (2011). Journal of Computers. Vol. 2, No. 3. Usability failures and healthcare data hemorrhages. IEEE Security and Privacy. Ramzan, Z. (2008). Security trends of 2008 Issue March/April 2011, pp. 18-25. and predictions for 2009. Net Security News, [online]. Available: Kowalski, E., Conway, T., Keverline, S., http://www.net- Williams, M., Cappelli, D. and Moore, A. security.org/article.php?id=1194. Dec. (2008). Insider threat study: illicit cyber 24. activity in the government sector. [online]. Available: Randazzo, M. Keeney, M., Kowalski, E., http://www.cert.org/insider_threat/. Cappelli, D. and Moore, A. (2004). Insider threat study: illicit cyber activity Mauw, S. & Oostdijk, M. (2005). in the banking and finance sector,” Foundations of attack trees. In Won, D., [online]. Available: Kim, S., eds.: International Conference http://www.cert.org/insider_threat/. on Information Security and Cryptology – ICISC 2005.Volume 3935 of LNCS, Rowlinson, R. (2004). Ten steps to forensic Springer 186–198. readiness. International Journal of Digital Evidence, 2(3). Moore, A., Cappelli, D.. & Trzeciak, R. (2008). The “big picture” Rozinat, A. van der Aalst, W., Dustdar, S., of insider IT sabotage across U.S. Fiadeiro, J. and Sheth, A. (2006). critical infrastructures. Advances in Decision mining in ProM. In: Lecture

© 2015 ADFSL Page 43 JDFSL V10N1 Data Loss Prevention and Control: Inside Activity Incident Monitoring... This work is licensed under a Creative Commons Attribution 4.0 International License. Notes in Computer Science. 4102. Tan, J. (2001). Forensics readiness. Springer, Berli Electronic version available at HTUhttp://www.arcert.gov.ar/webs/text Rozinat, A., Mans, R., Song, M. and van der os/forensic_readiness.pdf. Aalst, W. (2008). Discovering colored petri nets from event logs. International Tang, Y. and Daniels, T. (2005). A simple Journal on Software Tools for framework for distributed forensics. In Technology Transfer, 10(1). Proceedings of the 25th IEEE International Conference on Distributed RSA Security. (2008). CSI computer crime & Computing Systems Workshops, 163-169. security survey. [online]. Available: http://i.zdnet.com/blogs/csisurvey2008.p Todtmann, B., Riebach, S. and Rathgeb, E. df. (2007). The honeynet quarantine: reducing collateral damage caused by Saini, V., Duan, Q., Paruchuri, V. (2008). early intrusion response. In proceedings Threat modeling using attack trees. J. of the 6th international Conference on Comput.Small Coll. 23(4). Networking, 464-465. Schneier, B. (1999). Attack trees: modeling Tu, M., Xu, D., Butler, E., and Schwartz, A. security threats. Dr. Dobb’s Journal. (2012). Locating and identifying forensic Seltxer, L. (2006). Is online banking too evidence for attacks against online dangerous? eWeek News. [online]. business information systems by using Available: honeynet. Journal of Digital Forensics, http://www.eweek.com/c/a/Security/Is- Security, and Law. 7(4), 73- 97. Online-Banking-Too-Dangerous/. Wilson, W. & Wolfe, H. (2003). Management Shah, A. (2009). More employees neglecting strategies for implementing forensic data security, survey says. [online]. security measures. Information Security Available: Technical Report, 8(2). http://www.networkworld.com/news/200 Wippich, B. (2007). Detecting and 9/061009-more-employees-neglecting- preventing unauthorized outbound data-security.html. IDG News Service. traffic. White Paper, SANs Institute Sheyner, O., Haines, J., Jha, S., Lippmann, Reading Room. [online]. Available: R. and Wing, J. (2002). Automated https://www.sans.org/reading- generation and analysis of attack graphs. room/whitepapers/detection/detecting- Proceedings of the IEEE Symposium on preventing-unauthorized-outbound- Security and Privacy, 273-284. traffic-1951. Singleton, T., Singleton, A., Bologna, G., and Yasinsac, A. and Manzano, Y. (2001). Lindquist, R. (2006). Fraud Auditing and Policies to enhance computer and Forensic Accounting, 3rd edition. ISBN: network forensics. Proceedings of the 9780471785910. Wiley. 2001 IEEE Workshop on Information Siponen, M. and Oinas-Kukkonen, H. (2007). Assurance and Security. A review of information security issues and respective research contributions. Database for Advances in Information Systems. 38(1), 60-80.