2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)

ASPGen: an Automatic Security Policy Generating Framework for AppArmor

Yun Li, Chenlin Huang, Lu Yuan, Yan Ding
College of Computer, National University of Defense Technology, Changsha, China

Hua Cheng
State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi, China

liyun18, clhuang, [email protected], dingyan [email protected] [email protected]

Abstract: The security of the operating system has always been the basis of information systems. Several security frameworks, such as SELinux and AppArmor, have been proposed to enhance the security of the operating system. However, the major drawback of these solutions is the complexity of security policy configuration, which requires strong professional expertise. Many studies have therefore tried to optimize the process of configuring security policies, but so far the involvement of security experts is still required. In this paper, we aim to further optimize AppArmor's security policy generation process and propose ASPGen, a novel framework for generating AppArmor security policies automatically. ASPGen can auto-generate security policies with least privilege and RBAC (Role-Based Access Control) for applications, and effectively alleviates the complexity and subjectivity of manually configuring AppArmor's security policy, as well as the security threats that result from improper security policies. We implement the prototype of ASPGen in Ubuntu 16.04. Unlike previous approaches, ASPGen does not depend on experts after the expert system is built. In our experimental evaluation, several typical applications are chosen and their AppArmor security policies are generated with ASPGen. The completeness and precision of the generated policies are evaluated, and the case of mysql-server is thoroughly analyzed by comparing the default AppArmor security policies with the policies generated by ASPGen. The evaluation demonstrates that the policy generated by ASPGen is complete, precise, and fine-grained, even without expert intervention. Our contribution can be further improved with a more general and intelligent security policy generating mechanism and is being extended to support other security frameworks including SELinux and SEAndroid.

Keywords: AppArmor; RBAC

I. INTRODUCTION

The security of the operating system is the foundation of information systems. As the core of the security architecture in the operating system, MAC (Mandatory Access Control) determines whether to allow a subject access to objects by comparing the security attributes of subject and object. The mainstream MAC solutions include SELinux, AppArmor, and others. They protect the operating system and applications against attacks and misbehavior by limiting access to resources according to pre-defined security policies. These policies specify how to control user access to each upcoming access path in the specific application and what types of unauthorized access should be denied. A thorough and professional analysis is required for the generation of security policies. However, due to the complexity of software, large numbers of access control rules have to be configured or programmed in security policy configuration files to meet strict security requirements, which makes it impossible for normal users to configure security policies for a new application.

SELinux is the most famous MAC solution in Linux because of its strict policies, but it is overly complex and ill-suited to management by ordinary users. To reduce the complexity of security policies, the path-based MAC solution AppArmor was designed and implemented; it binds access control attributes to software rather than to users. The security policy is centrally controlled by the security administrator, and the user has no right to override the policy in AppArmor.

To ease the burden of policy generation, AppArmor provides an audit and rule-judging mechanism. A security expert is required to run the application with the goal of achieving higher code coverage. When the application misbehaves at access points in LSM (Linux Security Module), audit messages are sent to the log files. The expert then generates a corresponding policy by selecting the permissions to grant based on the logs. The policy configuration file can be generated by recording the operations allowed for the application, and any operations not listed are denied.

There are two major drawbacks with the existing solution: it imposes intense manual labor and it generates only coarse-grained AppArmor policies. To meet these challenges, we propose ASPGen, an AppArmor Security Policy Generating framework based on an expert system, which reduces the dependence on human experts when generating AppArmor policy. ASPGen contains three stages, as shown in Fig.1. First, a more complete dataset of apparmor-related logs is obtained by running the application or collecting the history logs extensively. Second, an expert system is established to guide whether the permissions required for a resource are granted. Finally, the security policies are automatically generated for the specified application by the expert system. We implement ASPGen's prototype in Ubuntu 16.04, and its effectiveness and ease of use are evaluated by analyzing five commonly used applications.
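The three stages described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the authors' implementation: the sample line follows the usual AppArmor key=value audit layout, `KNOWLEDGE` is a hypothetical in-memory stand-in for the knowledge base, and the function names `parse_log`, `infer_permission` and `make_rule` are invented for this example.

```python
import re

# Illustrative audit line in the usual AppArmor key="value" layout.
SAMPLE_LOG = (
    'audit: type=1400 audit(1574580816.875:26): apparmor="DENIED" '
    'operation="open" profile="/usr/bin/evince-thumbnailer" '
    'name="/usr/bin/fix-qdf" pid=3878 comm="evince-thumbnai" '
    'requested_mask="r" denied_mask="r" fsuid=1000 ouid=0'
)

# Hypothetical knowledge base: role -> resource pattern -> permission string.
KNOWLEDGE = {
    "user": {"/usr/bin/**": "r", "/etc/**": "r"},
}

# Matches key=value pairs where the value is either quoted or a bare token.
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_log(line):
    """Stage 1: split one audit line into a dict of fields."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


def infer_permission(path, role):
    """Stage 2: try an exact match, then strip trailing path components
    until a pattern such as '/usr/bin/**' is found in the knowledge base."""
    rules = KNOWLEDGE[role]
    if path in rules:
        return rules[path]
    parts = path.strip("/").split("/")
    for depth in range(len(parts) - 1, 0, -1):
        candidate = "/" + "/".join(parts[:depth]) + "/**"
        if candidate in rules:
            return rules[candidate]
    return None  # nothing in the knowledge base covers this path


def make_rule(fields, role="user"):
    """Stage 3: emit one AppArmor file rule, or None if nothing matched."""
    perm = infer_permission(fields["name"], role)
    if perm is None:
        return None
    return f'{fields["name"]} {perm},'


print(make_rule(parse_log(SAMPLE_LOG)))
```

The dictionary lookup keeps the sketch self-contained; a real deployment would query a database per role and would also have to handle capability, network and other non-file rules, which are left out here.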

978-0-7381-3199-3/20/$31.00 ©2020 IEEE. DOI 10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00075

The evaluation results demonstrate the effectiveness of our framework, which achieves the expected precision and completeness. ASPGen can be widely used to generate security policies for various Linux applications.

Our contributions can be summarized as follows:
• A path-based confidence model for AppArmor is proposed to support the auto-generation of security policy. It follows the principle of least privilege and infers permissions based on the matching degree of the resource paths and the frequency of the chosen permission.
• An expert system based on the path-based confidence model is designed to guide policy generation, in which a knowledge base is constructed by categorizing resource objects and assigning access rights of each category to each role. According to our research, there is currently no public knowledge base for guiding AppArmor security policy generation, so we have made the knowledge base available at https://github.com/lyeeer/ASPGen.
• A novel AppArmor security policy auto-generating framework based on the expert system is proposed and implemented, which offers a fine-grained generation capability and reduces reliance on the intensive labor of human experts. In our evaluation of ASPGen with five typical applications, ASPGen successfully obtains a total of 337 apparmor-related logs and auto-generates a total of 337 security rules accordingly. The completeness and precision of the security policy generated by ASPGen are evaluated by comparison with the default security policy.

The rest of this paper is organized as follows. The related work is discussed in Section 2. The operating principle and design of ASPGen are explained in detail in Section 3. Section 4 presents the experimental setup and evaluates the performance of ASPGen. Finally, Section 5 concludes the paper and discusses future work.

II. RELATED WORK

The analysis and generation of security policy in the operating system can be divided into the following two parts:

Security Policy Generation. Current research on security policy generation usually focuses on SELinux and SEAndroid. Some research relies on static code analysis. Lachmund [5] chooses a call-graph-based static analysis approach and distinguishes application-initiated resource access from user-initiated resource access. Polgen [2] is a framework that automatically generates policy and creates a high-level description of the privileges required by software through binary analysis. The call tree generated from an application's binary image is not comprehensive, so the generated policies are incomplete. The generated policy only includes access rights not derived from user interactions. Other research relies on the audit-logging feature of the security framework. EASEAndroid [3] is the first SEAndroid analysis platform for automatic policy analysis and refinement, and it analyzes audit logs by semi-supervised learning. The work in [8] mines audit logs by machine learning to automatically generate SELinux policy. These methods need to collect a large amount of log information from different models of machines, which may violate user privacy. SPOKE [4] extracts domain knowledge from semantically rich functional tests and uses this knowledge to represent the attack surface of SEAndroid policy rules. In addition, SELinux has released some automatic policy generation tools, among which audit2allow [6] converts the audit log into rules to automatically generate and refine SELinux policies. It shares a problem with [5]: the conversion process of audit2allow is very mechanical and cannot know the true intentions of the user operations, which gives rise to over-privilege. PyBE [7] resolves this problem by specifying user-specific security policies. It predicts policy decisions for new scenarios based on the policy examples provided by users. The policy decisions specify whether actions should be allowed or denied in certain scenarios.

Role-Based Access Control. RBAC [10]–[12] focuses on the relationship between USERS, ROLES and PERMISSIONS, where USERS perform ROLES and the permissions of users are determined by their roles, as shown in Fig.2. In addition to common user roles, system administrators, security administrators, and audit administrators are introduced based on the principle of separation of powers. In terms of permission scope, the system administrator is responsible for the entire system user group, role, user creation, and permission assignment. The security administrator shall audit the authorization of a special role or user group. The audit administrator shall inspect the system audit log to supervise other administrators and common users, who shall not have the right to create users and authorize approval. Research on minimizing privileges usually evaluates the allocation of user privileges based on completeness (under-privilege) and security (over-privilege), in order to strike a balance between the security risks of granting over-privilege and the effort of under-privilege. Sanders and Yue [8] define the balance between over-privilege and under-privilege as the Privilege Error Minimization Problem (PEMP) for RBAC and propose a method for quantitatively scoring security policy. Because Attribute Based Access Control (ABAC) has better granularity, flexibility and usability, the authors of [13] implement the automatic generation of least-privilege ABAC policies, which minimize both under-privilege and over-privilege by taking a rule mining approach to mine the system's audit logs.

III. ASPGEN

To achieve the goal of auto-generating security policies and producing more fine-grained policies, we propose ASPGen, which is an AppArmor Security Policy Generating

Figure 1. Overview of ASPGen.

framework based on an expert system. As illustrated in Fig.1, in the framework, 1) the apparmor-related logs with the maximum code coverage are obtained by running the application fully, and resources (or objects) are extracted from them for generation. 2) Then, an expert system is constructed to guide the correspondence between resources (or objects) and operations. 3) The permission assignments are determined by the expert system, and the security policies that can be used in the AppArmor mechanism are generated automatically.

Figure 2. The RBAC model on which ASPGen relies

A. Log Obtainment

Building the log dataset is the first step of auto-generating policies in ASPGen. AppArmor is based on the LSM framework, which provides many hook functions to determine the access control of tasks, program loading, IPC, file system, and other aspects of the application at runtime. It covers almost all aspects of operating on objects (i.e. resources) within the system [14]. The hook functions are also used for audit extensions, and record audit data by calling the audit module to generate related audit logs.

The policy configuration files are used to restrict the applications' interactions with resources or objects in the system. AppArmor has two modes, enforce and complain, and both log and report violation attempts (either via auditd or syslog). In complain mode, AppArmor records violation attempts or errors in logs instead of enforcing policy. To allow the execution of all resource access points in the application, the ASPGen framework runs in "complain" mode for the data-collection stage. To fully use the learning feature of AppArmor in complain mode, which records violation attempts or error operations in log files, the concepts and tools of white-box testing from the field of software testing are introduced into ASPGen. White-box testing can be performed in multiple stages of testing. For example, during the unit testing phase, white-box testing can cover paths within units [15]. In the first stage of ASPGen, we recommend either using a white-box testing tool to automate the execution of the application, so as to trigger every possible path that needs to access resources with maximum code coverage, or exercising the application manually, in order to obtain more reports of security violations. Since the role of the user is determined by the scope of permissions granted, there is no need to distinguish roles during log generation.

An apparmor-related log for a file manipulation is shown in Listing 1. The type of the entry is 1400, indicating that it records a "DENIED" error operation (field apparmor). This is an open operation (field operation) between the evince-thumbnai application subject (field comm) and the "/usr/bin/fix-qdf" file object (field name). Because AppArmor is path-based, the generated policy filename (field profile) is based on the path of the application.

Nov 24 15:33:36 ubuntu kernel: [5321.126995] audit: type=1400 audit(1574580816.875:26): apparmor="DENIED" operation="open" profile="/usr/bin/evince-thumbnailer" name="/usr/bin/fix-qdf" pid=3878 comm="evince-thumbnai" requested_mask="r" denied_mask="r" fsuid=1000 ouid=0

Listing 1: The log form of the evince-thumbnai application

By analyzing the form of the log, we find that the parts of a log that are closely related to the generation of security policies are: operation, profile, name, requested_mask, and other parts. We use the Python language to perform the audit log segmentation, and store the parsed logs by nesting the defaultdict in the collections module multiple times to

implement LogParser.

B. Expert System Construction

Following the security principles of least privilege, user decentralization and data abstraction, RBAC achieves the goal of extending traditional AppArmor and meets the requirements of ASPGen. The blue lines in Fig.2 are the ones that ASPGen focuses on, that is, assigning permissions to different roles. The permission contains resources (or objects) and the executable image of the corresponding application (i.e. operations), and the type of operation depends on the type of resource. For example, the possible operations in the file system are read, write, execute, etc.

An expert system is constructed to guide the correspondence between resources (or objects) and operations. The expert system is composed of a knowledge base and a permission inference module. As a core part, the knowledge base stores the set of security expert knowledge in the form of domain knowledge, which is designed based on the AppArmor security policies. The permission inference module can be regarded as the "brain" of the expert system, which simulates the thinking process of the security expert in authorization. It mainly includes a path-based confidence evaluation model to grant permissions to each accessed resource.

1) Knowledge Base: Security profiles are used to restrict how applications interact with resources (or objects) in the system. We take the limitations on capability and file manipulation as examples to describe the format of the security policy rules in the AppArmor policy configuration file.

CAPABILITY RULE = [ 'audit' ] [ 'allow' | 'deny' ] [ 'owner' ] 'capability' [ CAPABILITY LIST ]
FILE RULE = [ 'audit' ] [ 'allow' | 'deny' ] [ 'owner' ] ( 'file' | [ 'file' ] ( ACCESS ) )
ACCESS = ( 'r' | 'w' | 'a' | 'l' | 'k' | 'm' | 'ix' | 'ux' | 'Ux' | ... )

The first three elements of CAPABILITY RULE and FILE RULE are qualifiers. 'audit' specifies that messages about the related access will be logged to the audit log. 'deny' indicates that the operation on the resource is explicitly not allowed; however, because profiles actually use a whitelisting mechanism, access is denied by default whenever permissions are not granted to the corresponding path. The purpose of 'owner' is to make the rule conditional on the ownership of the object. Capabilities are distinct units derived from Linux's partitioning of the privileges traditionally associated with superuser, so CAPABILITY LIST is a list of unit objects whose current roles are restricted by security mechanisms. The ACCESS option in FILE RULE represents the operations on the specified 'file' that are restricted through the AppArmor configuration file, as well as through a combination of suboptions within the option, but not all combinations are allowed. For example, the AppArmor security policy rule for the log entry in Listing 1 is "allow /usr/bin/fix-qdf r" (or "/usr/bin/fix-qdf r").

From the format of the policy rule, we conclude that five essential elements make up a rule. Mining the domain knowledge expression corresponding to a rule, we represent it as a 5-tuple:

knowledge = (role, resource, operation, isdeny, isowner)

Since ASPGen introduces RBAC, the role of the current user needs to be identified, and the domain knowledge is determined accordingly. Next, based on the format of the AppArmor configuration file, domain knowledge should indicate what operation is available to the current role for one resource or a group of resources, and whether qualifiers such as "deny" and "owner" should be enabled in the corresponding rules. As a special case, operation will be "capability" when the unit object is restricted by AppArmor, or "rlimit" when the application's hard limit is controlled by AppArmor, etc.

Since "everything is a file" is one of the basic philosophies in Linux, all resources can be treated as files. To create a taxonomy for the numerous files in Linux and ignore the differences in file names across different distributions or versions, ASPGen divides the resource objects into different categories according to the functions, characteristics, and practical application requirements of the files. The domain knowledge base is constructed by assigning access rights for each category to multiple roles.

Figure 3. Partial results of classifying files by system function and weighting them by role

A small part of the knowledge base is illustrated as an example in Fig.3. As defined in the Linux Filesystem Hierarchy Standard (FHS) [16], /usr/bin contains almost all user commands. It is the software library of the system. Consequently, other commands are stored in /bin or

/usr/local/bin. Therefore, these three directories and the files under them are classified into the bin category. For the commands in the bin category that are required for booting, the execution permissions are granted to all roles. For other executable files in the bin category related to different users, the permissions are granted according to their roles. For system administrators, additional read and write permissions are granted.

2) Permission Inference Module: Since AppArmor's security policy is based on resource paths, a path-based confidence evaluation model is proposed in ASPGen to achieve the goal of least privilege. According to this confidence evaluation model, the permission inference module included in the ASPGen framework infers the appropriate permissions of the current role for each accessed resource. Since the knowledge base is managed in a database, it is very convenient to match the actual path with the resource in the knowledge with SQL statements. This allows AppArmor security policies to be generated automatically. When a match is achieved, the security rules are adopted immediately. Otherwise, on a SQL miss, the inference module provides recommended permissions together with a confidence value.

Resources that require permission determination are submitted as query strings via SQL statements. For example, to query a given access resource (i.e. path) p, the query (a standard SQL statement with pseudo-code constraints in the WHERE clause) is:

SELECT (resource, operation, is_deny, is_owner) FROM {user_rule or sec_admin_rule or audit_admin_rule or sys_admin_rule} WHERE resource {= or LIKE} p

Because of the RBAC mechanism, the queried operation table (user_rule, sec_admin_rule, sys_admin_rule, audit_admin_rule) is selected based on the role of the current user. As mentioned earlier, a policy is made up of qualifiers, resources and operations. For a capability, one can simply query whether the specified role has the corresponding capability. For the limitations of the file system, however, the resource is expressed in the form of a path. If the query result is not NULL, the rule is directly constituted by the returned result. If the query result is NULL, we use '/' as a delimiter, split the directory string at each level of the path, and continuously remove subdirectories for secondary matching. These steps are repeated until a matching path in the knowledge base is found.

We take /usr/include/python2.[4567]/pyconfig.h as an example, and suppose the current user is a common user. By querying the user_rule table, we find that there is no matched information for this resource in the knowledge base, so we cannot directly grant permissions through matching. Next, we remove the lowest-level subdirectory and use /usr/include/python2.[4567]/ as the keyword to match. We keep removing subdirectories until the current path /usr/include/ successfully matches /usr/include/** ('**' represents any file or directory in /usr/include/ or any directory below /usr/include/) in the knowledge base. After the matching is completed, the inference module applies a path-based confidence evaluation model. The confidence is calculated as follows:

$$CONF = \frac{Count_{sumPer}}{Count_{currentPer}} \times \frac{level_{currentPath}}{level_{fullPath}} \tag{1}$$

When the related knowledge is matched, the first factor quantifies the accuracy of the permission based on the frequency of the currently selected permission, and the second factor quantifies the matching degree based on the directory level of the resource path.

Based on this understanding, an algorithm to calculate the confidence is designed as shown in Algorithm 1. First, we keep removing the lowest directory of the path and then query the relevant permission knowledge K_path and PATH_set in the knowledge base until K_path is not empty, that is, until the permission information with the maximum path matching degree has been found (lines 2-5). We then get the permission set per_set based on the permission information of each path in the query path set PATH_set (lines 8-11), and obtain the permission that occurs most frequently (line 12). Finally, the confidence degree is composed of the path matching degree and the permission frequency (line 15).

C. Security Policy Generation

The goal of the third stage in ASPGen is to generate and deploy the AppArmor policy configuration file for specific applications following the principle of least privilege. The original AppArmor does not support the separation of powers by roles, so ASPGen follows the implementation of AppArmorRBAC [17] to support it. Because user behaviors are closely related to their roles, and ASPGen follows the RBAC mechanism, ASPGen meets users' requirements to a certain extent.

By default, four users are set up corresponding to four roles. To simplify the operation, we write the security policy specific to each user in the hat of the corresponding application, which finally realizes the separation of powers in AppArmor.

The automated generation process of security policies is shown in Fig.4. The apparmor-related logs are forwarded to the Log Parser, which parses the logs to obtain the resources that the application needs to access. Each resource is passed to the MySQL Database that stores the knowledge of security experts. Through the interaction between the MySQL Database and the Permission Inference Module, the information about the permission of each resource can be obtained. Finally, the obtained information is combined into an AppArmor configuration file which can be automatically deployed to the system. After that, AppArmor will use the enforce mode

Algorithm 1 Confidence Calculator
Input: PATH_query, the path string to be queried; K, the knowledge base
Output: PER, the recommended permissions; CONF, the confidence of the output permission
1: PATH_query, per_set, PER, CONF ← ∅
2: while K_path is null do
3:   PATH_current ← remove the lowest directory in PATH_query
4:   K_path, PATH_set ← query permission knowledge about PATH_current from K
5: end while
6: PATH_current ← remove the lowest directory in PATH_query
7: level_currentPath ← calculate the level of PATH_current
8: level_fullPath ← calculate the level of PATH_query
9: for path ∈ PATH_set do
10:   per ← select permissions about path from K
11:   per_set ← per_set ∪ per
12: end for
13: PER ← find the permissions that occur most frequently in per_set
14: Count_sumPer ← count the total of permissions in per_set
15: Count_currentPer ← count the frequency with which PER appears
16: CONF ← (Count_sumPer / Count_currentPer) × (level_currentPath / level_fullPath)
17: return PER, CONF

Figure 4. The automated generation of security policies

and provide applications with the least privilege protection. It is important to note that the dashed line section indicates the need for role intervention between the resource and the knowledge base due to the need for decentralization.

IV. EVALUATION

We implement a prototype of ASPGen in Ubuntu 16.04 based on the apparmor-utils 2.13.2-10 package, which contains command-line utilities that help create and manage AppArmor profiles. To generate and parse apparmor-related logs, we modify the aa-genprof module in apparmor-utils to support retrieving and parsing the contents of custom log files. For permission decision and policy generation, we modify the aa module to support the permission decision of the expert system.

A. Experiment Datasets and Environment

According to our extensive research, there is currently no public knowledge base for guiding AppArmor security policy generation. In order to construct the knowledge base, we divide system resources according to the functions, characteristics and practical application requirements of the files. For each role's permission to each resource category, we first grant permissions based on the resource attributes of each category and the role's permission scope. Next, our security experts dynamically adjust and update the various types of permissions to form the final knowledge, aiming to maintain a stable and secure state for both applications and systems. A total of 256 security policies are collected. We classify the collected security policies by roles and security policy categories, and establish a domain knowledge base in a MySQL database.

We require the applications used for evaluation to meet the following requirements: 1) the application is open source and uses Git as its version control tool, 2) the application has a large market share or technical influence in its application field, and 3) the application has a long development cycle or a large scale. Putting these requirements together, we choose five applications, including sudo, login, ssh and mysql-server, as our experimental subjects in this paper. Next, we obtain the log dataset. By extensively executing the applications and collecting related logs from historical execution records, we obtain a total of 337 apparmor-related logs related to access control from these five applications, including 298 for the file system and 15 for capability. A small part of them requests access to resources such as network, dbus, mount, etc.

The AppArmor version in our system is AppArmor 2.10.95-0ubuntu2.11, and we use Ubuntu 16.04 as the development environment.

B. Completeness and Precision of generation strategy

First, we use ASPGen to generate security policies for the five applications without human intervention. Then we take the default policies preset by AppArmor as the baseline, and evaluate the performance of ASPGen in actual scenarios. The default policies are typical policies provided by security experts, so we consider them authoritative. Moreover, most users do not have the ability to modify the policies and therefore use the default policies preset by AppArmor, so we think the comparison is meaningful.

To normalize the description, we denote the default and ASPGen-generated AppArmor policy configuration files for the same application as P_default and P_ASPGen, and describe the completeness and precision in the form of sets. Since the default configuration files are not based on the RBAC mechanism, the two sets are not formally the same. We express the completeness as:

$$\forall p \in P_{default},\ p \in P_{ASPGen}$$
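The completeness condition above is a subset test, so it can be checked with ordinary set operations. The following Python sketch is a minimal illustration under stated assumptions: each policy is modelled as a set of (path, permission) pairs, the example rules in `P_DEFAULT` and `P_ASPGEN` are hypothetical and not taken from the evaluation data, and computing recall as the covered share of the default policy is our assumption about how the percentage would be obtained.

```python
def compare_policies(p_default, p_aspgen):
    """Compare a generated policy against the default one using sets."""
    common = p_default & p_aspgen
    lack = p_default - common      # default rules missing from the generated policy
    extra = p_aspgen - common      # generated rules that the default policy lacks
    complete = p_default <= p_aspgen  # every default rule is also generated
    # Assumed definition: share of default rules that the generated policy covers.
    recall = len(common) / len(p_default) if p_default else 1.0
    return {"complete": complete, "lack": lack, "extra": extra, "recall": recall}


# Hypothetical rule sets, for illustration only.
P_DEFAULT = {("/etc/mysql/**", "r"), ("/var/lib/mysql/**", "rwk")}
P_ASPGEN = P_DEFAULT | {("/run/mysqld/mysqld.pid", "rw")}

result = compare_policies(P_DEFAULT, P_ASPGEN)
print(result["complete"], sorted(result["extra"]), result["recall"])
```

Treating rules as exact (path, permission) pairs is a simplification: two rules that cover the same access with different globs would count as different here, so a real comparison would need to normalize paths first.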

Where p is an AppArmor security policy in P_default. If the above formula is satisfied, the policy automatically generated by ASPGen meets the completeness requirement. Otherwise, we use P_lack = P_default − P_ASPGen ∩ P_default to represent the missing part of P_ASPGen, and measure the recall rate as a percentage.

For precision, since ASPGen runs programs with the goal of maximizing code coverage, the generated security policies are in theory more fine-grained. We use P_extra = P_ASPGen − P_ASPGen ∩ P_default to denote the portion of the security policies generated by ASPGen that goes beyond the default policy. Due to AppArmor's whitelisting mechanism, we only need to test whether each access resource in the P_extra collection is accessible in the context of the default policy. If the access fails, the policy is valid and should be added.

Figure 5. The automated generation of security policies

Fig.5 shows the comparison between our method and the default. We can find that the policy generated by ASPGen is outnumbered by the default for su, mysql-server and lightdm. In this regard, the apparmor-related logs are the source for ASPGen to generate policies, since all error operations are recorded in the audit logs by the AppArmor framework. Our goal is to collect more logs for the specific application, so multiple methods for log collection are adopted. First, by drawing on the ideas and methods of white-box testing, we can trigger as many resource access points as possible and generate the apparmor-related logs with the maximum code coverage. Second, we can also collect the application's history logs extensively. As more resource access points are distinguished, the security policies generated by ASPGen perform very well in terms of completeness. We believe that the reason for this situation is that su, mysql-server and lightdm are all written in C/C++. Consequently, we use white-box testing tools for C/C++ in their log collection process, such as the unit and integration testing tool Cantata, which provides a unique AutoTest function [18]. As a result, the code coverage is very high. For applications developed in other languages without an appropriate testing tool, we can collect historical logs by running them manually multiple times.

As for the precision, we take the default policy as the standard and consider P_ASPGen ∩ P_default in P_ASPGen to be accurate. We then find that, for the five applications listed, the precision of the policy generated by ASPGen is above 90%. After verifying the correctness of P_lack and P_extra for each application, we found that the policies in P_extra are desirable permissions for application security, and that the policies in P_lack will not disrupt the functioning of the application. Because there are very few missing policies, this means that the security policies generated by ASPGen will improve security without affecting the functioning of the application.

The permission inference module in ASPGen uses the SQL queries mentioned in Section 3.2.2 to attempt to match these resources with the knowledge base. By analyzing the calculation rules of confidence, we believe that when the confidence exceeds 50%, the inference relies on more knowledge base information, that is, the reasoning results are more reliable. Therefore, we split the partially matched results using a confidence of 50% as the cut-off point. As shown in Table 1, only a small number of these resources cannot be matched. This shows that the constructed knowledge base meets the requirements.

Table I
SUMMARY AND DESCRIPTION OF THE MATCHING RESULTS

Matching result      Matching degree   Numbers
Matched              100%              185
Partially Matched    ≥ 50%             134
Partially Matched    < 50%             16
Unmatched            0%                2

For partial matches with a confidence level of less than 50%, the most common reason is that the resource is dynamically generated. For example, when the security policy is generated for su, one of the apparmor-related logs is a request to write to /dev/pts/6. /dev/pts is a directory where console device files are created after remote login (telnet, ssh, etc.), so it is dynamically generated. This differs from other device files, which are hard disk nodes generated when the system is built. Therefore, there are no corresponding records in the knowledge base, and the permission can only be inferred from other resources that belong to the dev category. As for the unmatched cases, when generating for su, read and write operations on @{HOMEDIRS}/*/.Xauthority are requested, and when generating for ssh, a write to @{HOMEDIRS}/.Xauth* is requested. The variable names represented by HOMEDIRS may differ between operating environments, so the knowledge base does not contain domain knowledge with variable names.

In addition, we use ASPGen to generate policies for applications that have preset security policies in Ubuntu LTS versions such as 12.04, 14.04 and 16.04. The effect is similar to that of the five typical applications. From these experiments we conclude that the completeness and precision of our work are both higher than 90%. For some applications written in C/C++, the completeness can reach 100%.

C. Case Study: mysql-server

We take mysql-server as an example to elaborate on the use of ASPGen. The corresponding location of the mysql-server application in the system is /usr/sbin/mysqld. According to the ASPGen framework, we obtain 53 apparmor-related logs. After parsing the logs, we find that there were 48 different resource access requests, so we

Figure 6. The comparison of the configuration file generated by ASPGen and the default (partially)

grained. The dotted line indicates that the permissions of the resource parts are inconsistent. The black line indicates that the policies generated by the two methods are consistent. From the comparison figure, we see that most of the policies generated by ASPGen correspond to those in the default, and that the generated policies are more fine-grained and meet the requirements of completeness.

D. Discussion

From the above experimental evaluation, we find that the policies generated by ASPGen can meet the requirements of completeness and accuracy. As mentioned in the related work, many existing studies can generate security policies by analyzing the program dependencies or the binary. In our case, we design a permission inference module in ASPGen to create a novel way to leverage the knowledge base for AppArmor policy generation. Although ASPGen needs many pre-defined rules and a knowledge base in order to generate policies, file locations under Linux are relatively fixed, so the rules that need to be defined in advance are predictable and acceptable. This work is preset in the ASPGen framework, allowing ASPGen to truly automate AppArmor policy generation for users.

V. CONCLUSION

In this paper, we presented ASPGen: a novel AppArmor security policy auto-generating framework based on the expert system, which offers a fine-grained generation capability and reduces reliance on the intensive labor of human experts.
Inside ASPGen, a path-based confidence get an AppArmor configuration file generated by ASPGen model for AppArmor is used to support the auto-generation that contains 48 security policies. There is a total of 43 of security policy, which follows the principle of the least security policies in the default AppArmor configuration file privilege and RBAC. And an expert system based on the of mysql-server. The security policies in these two files model is designed to guide policy generation according to are classified into 15 categories based on the type of access access rules in a knowledge base. ASPGen is the first public (with two special rule forms, include and capability). security framework supporting automated policy generation We compare the generated results from these two methods for AppArmor as far as we know. Unlike previous approach- based on the 15 categories. Figure 8 only shows access to es, ASPGen does not entirely depend on exerts after the system resource, network, config and data-dir. In the ap- expert system is built. The evaluation demonstrates that the pendix, Figure 9 shows the complete comparison chart. The policies generated by ASPGen are complete, precise, and right side represents the policies generated by ASPGen under fine-grained, even without expert intervention. In the future, the corresponding classification, while the line represents we attempt to automate the deployment of other security the comparison between the corresponding policies in the frameworks including SELinux and SEAndroid, and we default configuration file and the configuration file generated will explore the commonality in these security frameworks by ASPGen. The blue line represents the policies that are not to propose a more general and intelligent security policy included in the default, that is, the part generated by ASPGen generating mechanism. to meet the completeness requirement. The red line indicates a more fine-grained policy. 
For example, in allowing con- VI.ACKNOWLEDGMENT figuration access, the default policy is ”/etc/mysql/ ∗ ∗ r”, that is, all files in this folder are allowed to be read, while This work has been supported by the Joint Funds of ASPGen only allows permissions for files that need to be the National Natural Science Foundation of China (Grant accessed. Consequently, the resulting policy is more fine- No.U19A2060).

REFERENCES

[1] P. Centonze, R. J. Flynn, and M. Pistoia, "Combining static and dynamic analysis for automatic identification of precise access-control policies," in Twenty-Third Annual Computer Security Applications Conference, 2007, pp. 292–303.

[2] T. Rauter, A. Höller, N. Kajtazovic, and C. Kreiner, "Towards an automated generation of application confinement policies with binary analysis," in International Symposium on Networks, Computers and Communications, 2015, pp. 1–6.

[3] R. Wang et al., "EASEAndroid: automatic policy analysis and refinement for security enhanced android via large-scale semi-supervised learning," in 24th USENIX Security Symposium, 2015, pp. 351–366.

[4] R. Wang et al., "Spoke: Scalable knowledge collection and attack surface analysis of access control policy for security enhanced android," in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, 2017, pp. 612–624.

[5] S. Lachmund, "Auto-generating access control policies for applications by static analysis with user input recognition," in Proceedings of the 2010 ICSE Workshop on Software Engineering for Secure Systems, 2010, pp. 8–14.

[6] Userspace tools, https://github.com/SELinuxProject/selinux/wiki. Last accessed 2 May 2020.

[7] A. Nadkarni, W. Enck, S. Jha, and J. Staddon, "Policy by Example: An Approach for Security Policy Specification," arXiv preprint arXiv:1707.03967, 2017.

[8] M. W. Sanders and C. Yue, "Minimizing privilege assignment errors in cloud services," in Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy, 2018, pp. 2–12.

[9] C. Huang, C. Hou, L. He, et al., "Policy-Customized: A New Abstraction for Building Security as a Service," in 2017 14th International Symposium on Pervasive Systems, Algorithms and Networks & 2017 11th International Conference on Frontier of Computer Science and Technology & 2017 Third International Symposium of Creative Computing (ISPAN-FCST-ISCC), IEEE Computer Society, 2017.

[10] G. H. Wang, "Role-Based Access Control," Computer, vol. 29, no. 2, pp. 38–47, 2002.

[11] L. U. O. Yu and H. E. Lian-yue, "Research and Implementation of the Audit Subsystem in the Kylin Operating System," Comput. Eng. Sci., 2007.

[12] M. Liu, X. Zhang, C. Yang, et al., "Privacy-preserving detection of statically mutually exclusive roles constraints violation in interoperable role-based access control," in 2017 IEEE Trustcom/BigDataSE/ICESS, IEEE, 2017.

[13] M. W. Sanders and C. Yue, "Mining Least Privilege Attribute Based Access Control Policies," 2019.

[14] J. Wu and K. Qu, "Research and realization of secure audit mechanism based on LSM," in 2009 International Conference on Management and Service Science, 2009, pp. 1–5.

[15] S. Nidhra and J. Dondeti, "Black box and white box testing techniques-a literature review," Int. J. Embed. Syst. Appl. IJESA, vol. 2, no. 2, pp. 29–50, 2012.

[16] Filesystem Hierarchy Standard, http://www.pathname.com/fhs/2.2/fhs-5.13.html. Last accessed 2 May 2020.

[17] AppArmorRBAC, https://gitlab.com/apparmor/apparmor/-/wikis/AppArmorRBAC. Last accessed 2 May 2020.

[18] Cantata, http://www.softtest.cn/show/46.html. Last accessed 2 May 2020.

Appendix

Figure 7. The comparison of the configuration file generated by ASPGen and the default
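
As a supplementary illustration of the evaluation, the set-based comparison between a generated policy and the default policy can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the ASPGen implementation: the function name `compare_policies`, the example rule strings other than "/etc/mysql/** r", and the exact ratio formulas for recall and precision are assumptions, since the paper only defines the sets P_lack and P_extra and reports percentages. Rules are compared by exact string equality, which ignores the fact that a glob rule in the default may cover several specific rules in the generated policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyComparison:
    common: frozenset   # P_ASPGen ∩ P_default
    lack: frozenset     # P_lack  = P_default - (P_ASPGen ∩ P_default)
    extra: frozenset    # P_extra = P_ASPGen  - (P_ASPGen ∩ P_default)
    recall: float       # assumed: |common| / |P_default|
    precision: float    # assumed: |common| / |P_ASPGen|


def compare_policies(p_aspgen, p_default):
    """Compare two policies given as collections of rule strings."""
    generated = frozenset(p_aspgen)
    default = frozenset(p_default)
    common = generated & default
    lack = default - common
    extra = generated - common
    # Ratio definitions are an assumption; empty inputs are treated as perfect.
    recall = len(common) / len(default) if default else 1.0
    precision = len(common) / len(generated) if generated else 1.0
    return PolicyComparison(common, lack, extra, recall, precision)


if __name__ == "__main__":
    # Hypothetical rule strings for illustration only.
    p_default = {
        "/etc/mysql/** r",
        "/usr/sbin/mysqld mr",
        "/var/lib/mysql/** rwk",
        "capability dac_override",
    }
    p_aspgen = {
        "/usr/sbin/mysqld mr",
        "/var/lib/mysql/** rwk",
        "capability dac_override",
        "/etc/mysql/my.cnf r",
    }
    result = compare_policies(p_aspgen, p_default)
    print("missing from generated policy:", sorted(result.lack))
    print("beyond the default policy:", sorted(result.extra))
    print(f"recall={result.recall:.2f} precision={result.precision:.2f}")
```

In this sketch each rule in `extra` would still need the check described in the evaluation: attempt the access under the default policy and keep the rule only if the access is denied. A precision figure that also counts such validated extra rules as accurate would be higher than the ratio computed here.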
