secmodel_sandbox: An application sandbox for NetBSD

Stephen Herwig University of Maryland, College Park

Abstract

We introduce a new security model for NetBSD, secmodel_sandbox, that allows per-process policies for restricting privileges. Privileges correspond to kauth authorization requests, such as a request to create a socket or read a file, and policies specify the sandbox's decision: deny, defer, or allow. Processes may apply multiple sandbox policies to themselves, in which case the policies stack, and child processes inherit their parent's sandbox. Sandbox policies are expressed in Lua, and the evaluation of policies uses NetBSD 7's experimental in-kernel Lua interpreter. As such, policies may express static authorization decisions, or may register Lua functions that secmodel_sandbox invokes for a decision.

1 Introduction

A process sandbox is a mechanism for limiting the privileges of a process, as in restricting the operations the process may perform, the resources it may use, or its view of the system. Sandboxes address the dual problems of limiting the potential damage caused by running an untrusted binary, and mitigating the effects of exploitation of a trusted binary. In either case, the goal is to restrict a process to only the privileges necessary for the purported task, and, in the latter case, to also drop privileges when they are no longer needed.

Although NetBSD currently lacks a sandbox mechanism, sandbox implementations exist for various operating systems. systrace [5], a multi-platform mechanism used in earlier versions of NetBSD, and seccomp [2], a Linux-specific implementation, exemplify the approach of specifying a per-process system call policy and using system call interposition to enforce the policy filter. For systrace, the policy format is systrace-specific, whereas seccomp specifies the policy as a BPF program. OpenBSD's pledge system call [4] offers a simplified interface for dropping privileges: OpenBSD groups the POSIX interface into categories, and allows processes to whitelist, or pledge, their use of certain categories; an attempt to perform an operation from a non-pledged category kills the process.

We implement an application sandbox for NetBSD, secmodel_sandbox, that allows per-process restriction of privileges. secmodel_sandbox plugs into the kauth framework, and uses NetBSD's support for in-kernel Lua [7] to both specify and evaluate sandbox policies. We are developing several facilities with secmodel_sandbox, such as a secure chroot and a partial emulation of OpenBSD's pledge.

2 NetBSD Overview

2.1 kauth

NetBSD 4.0 introduced the kauth kernel subsystem [3], a clean-room implementation of Apple's kauth framework [6] for OS X, to handle authorization requests for privileged operations. Privileged operations are represented as triples of the form (scope, action, optional subaction). The predefined scopes are system, process, network, machdep, device, and vnode, each forming a namespace that is further refined by the action and subaction components. For instance, the operation to create a socket is identified by the triple (network, socket, open), and the operation to read a file by (vnode, read_data).

Some authorizations, such as (process, nice), are triggered by a single system call (setpriority); some, such as (system, mount, update), are triggered when a system call (mount) is called with specific arguments (the MNT_UPDATE flag); and others, such as (system, filehandle), may be triggered by more than one system call (fhopen and fhstat). Many system calls do not trigger a kauth request.

kauth uses an observer pattern whereby listeners register for operation requests for a given scope; when a request occurs, each listener is called. Each listener receives as arguments the operation triple, the credentials of the object (typically, the process) that triggered the authorization request, and additional context specific to the request.

Each listener returns a decision: either allow, deny, or defer. If any listener returns deny, the request is denied. If at least one listener returns allow and none returns deny, the request is allowed. If all listeners return defer, the decision is scope-dependent. For all scopes other than the vnode scope, the result is to deny the authorization. For the vnode scope, the authorization request contains a "fall-back" decision, which nearly always specifies a decision conforming to traditional 4.4BSD file access permissions.

2.2 secmodel

While the NetBSD kernel source contains many listeners (typically in accordance with kernel configuration options), the secmodel framework offers a lightweight convention for developing and managing a set of listeners that represents a larger security model. By default, NetBSD uses secmodel_bsd44, which implements the traditional security model based on 4.4BSD, and which itself is composed of three separate models: secmodel_suser, secmodel_securelevel, and secmodel_extensions.

An important, subtle point with the default security model is that many authorization requests are deferred, relying on kauth's default behavior when all listeners return defer to fully implement the policy.

3 Design

We developed secmodel_sandbox as a loadable kernel module with a companion user-space library, libsandbox. By convention, we install the device file for secmodel_sandbox at /dev/sandbox.

A process interacts with secmodel_sandbox via the sandbox(const char *script, int flags) function of libsandbox. The script argument is a Lua script that specifies the sandbox policy. The flags argument specifies the action to take when a process attempts a denied operation: a value of 0 means that the operation returns an appropriate errno as dictated by kauth (typically EACCES for kauth's vnode scope and EPERM for all other scopes); a value of SANDBOX_ON_DENY_KILL specifies the pledge behavior of killing the process. The sandbox function packages these arguments into a struct and, via an ioctl call, passes the struct to /dev/sandbox.

secmodel_sandbox evaluates the Lua script in a Lua environment that is pre-populated with a sandbox Lua module. The sandbox Lua module allows a script to set policy rules via the following interface:

    sandbox.default(result)
    sandbox.allow(req)
    sandbox.deny(req)
    sandbox.on(req, func)

The sandbox.default function specifies a result of either 'allow', 'deny', or 'defer'. The result is the sandbox's decision for any kauth request for which the script does not specify a more specific rule.

The sandbox.allow and sandbox.deny functions specify allow and deny rules, respectively, for the kauth request given as req.

The sandbox Lua module uses strings of the form 'scope.action.subaction' to represent the requests; hence, a request to open a socket corresponds to the string 'network.socket.open', and a request to read a file to 'vnode.read_data'. A script may specify a complete request name, or a prefix. When the process triggers an authorization request, secmodel_sandbox selects the policy rule that has the longest prefix match with the given request. As an example, a sandbox policy script of:

    sandbox.default('deny')
    sandbox.allow('network')

would allow any request in the network scope, but would deny requests from all other scopes.

The sandbox.on Lua function registers a Lua function func to be called for the given kauth request. The signature for func is:

    func(req, cred, arg0, arg1, arg2, arg3)

where req is the kauth request that generated the callback, cred is a Lua table that represents the credentials of the requesting object or process, and the remaining arguments are request-specific. All parameters for func exist only in the Lua environment; manipulating the values does not affect the underlying C objects that they represent.

For many requests, the values for arg0 through arg3 are nil, as the kauth request carries no additional context. For the requests that do contain context, we translate the context into appropriate Lua values. For example, for the request 'network.socket.open', the arguments are Lua integers representing the arguments to the socket system call that triggered the request. For clarity in script writing, we pre-populate the sandbox Lua module with symbols for common constants, such as sandbox.AF_INET and sandbox.SOCK_STREAM. For requests in the process scope, arg0 is a Lua table that represents a subset of the fields of the struct proc that is the target of the request, such as the pid, ppid, comm (program name), and nice value. Callback functions for the vnode scope receive as arg0 a Lua table that contains the pathname and file status information of the target vnode. Completely representing the context with Lua values is an ongoing effort.
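The longest-prefix rule selection described in this section can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the module's code: the names enum decision, struct rule, prefix_matches, and lookup are hypothetical, the rules are scanned linearly rather than stored in a tree, and matching on whole dot-separated components is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the KAUTH_RESULT_* values in sys/kauth.h. */
enum decision { DECISION_ALLOW, DECISION_DENY, DECISION_DEFER };

/* One static policy rule: a request name or prefix, and its decision. */
struct rule {
    const char *prefix;
    enum decision result;
};

/*
 * Return nonzero if `prefix` matches `req` on whole dot-separated
 * components (assumed): "network" matches "network.socket.open",
 * but does not match "networking.x".
 */
int
prefix_matches(const char *prefix, const char *req)
{
    size_t n = strlen(prefix);

    if (strncmp(prefix, req, n) != 0)
        return 0;
    return req[n] == '\0' || req[n] == '.';
}

/*
 * Return the decision of the rule with the longest matching prefix,
 * or `dflt` (the value given to sandbox.default) if no rule matches.
 */
enum decision
lookup(const struct rule *rules, size_t nrules, enum decision dflt,
    const char *req)
{
    enum decision best = dflt;
    size_t best_len = 0;
    int found = 0;
    size_t i;

    for (i = 0; i < nrules; i++) {
        size_t len = strlen(rules[i].prefix);

        if (prefix_matches(rules[i].prefix, req) &&
            (!found || len > best_len)) {
            best = rules[i].result;
            best_len = len;
            found = 1;
        }
    }
    return best;
}
```

With a single rule {"network", DECISION_ALLOW} and a default of DECISION_DENY, lookup returns DECISION_ALLOW for "network.socket.open" and DECISION_DENY for "vnode.read_data", mirroring the two-line policy script above.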

4 Sandbox Implementation

Our design and implementation of secmodel_sandbox considered several important requirements and features.

First, while expressing rules in Lua is elegant, having to call into Lua to find a matching rule for each request is not. Thus, we implemented secmodel_sandbox so that evaluating the policy script "compiles" the rules into a prefix tree, mimicking the natural hierarchy provided by the (scope, action, subaction) format of requests. As a result, secmodel_sandbox can quickly find a matching rule, and only needs to call into Lua for functional rules, that is, rules specified as Lua functions via sandbox.on.

Second, we wanted to allow sandboxes to be dynamic; that is, to allow a functional rule to set other rules. For example, a script might create rules based on the requesting credential, as in the following, which installs a functional rule for the network scope so that different rules may be created for the root user and for ordinary users:

    sandbox.on('network', function(rule, cred)
      if cred.euid == 0 then
        sandbox.allow('network.bind')
        ...
      else
        sandbox.deny('network.bind')
        ...
      end
    end)

Third, we had to be mindful of the subtleties of the default security model, particularly its dependency on kauth's default decisions when all listeners return defer, so as not to allow sandboxes to elevate a process's privileges beyond the default model. In a similar vein, we needed to isolate multiple sandboxes on a single process so that the process is not able to install a new sandbox that loosens or undoes a rule of an existing sandbox.

Finally, in order to ensure that child processes inherit the sandboxes of their parent, but that, after process creation, parent and child may apply additional sandboxes independently of one another, we had to extend the normal forking behavior.

4.1 Sandbox creation

When a process sets a sandbox policy via libsandbox, the kernel creates a new sandbox, represented as a struct sandbox. A sandbox contains two main items: a Lua state and a ruleset. The Lua state is the Lua environment in which secmodel_sandbox evaluates all Lua code for that particular sandbox. The ruleset is a prefix tree that secmodel_sandbox searches during a kauth request to find the sandbox's matching rule.

Before secmodel_sandbox evaluates the policy script in the newly created Lua state, it adds the sandbox Lua functions (e.g., sandbox.allow) and constants (e.g., sandbox.AF_INET) to the state. Each sandbox Lua function is a closure that contains a pointer to the struct sandbox. In Lua terminology, the struct sandbox is a light userdata upvalue.

When the script calls a sandbox Lua function, the function, which is implemented in C code, performs argument checking, retrieves the ruleset from the struct sandbox upvalue, and inserts the rule and the rule's value into the ruleset.

For sandbox.allow, sandbox.deny, and sandbox.default, the rule's value is a trilean: one of KAUTH_RESULT_ALLOW, KAUTH_RESULT_DENY, or KAUTH_RESULT_DEFER, as defined in sys/kauth.h. For sandbox.on, the value is an index into Lua's registry. The Lua registry is a global table that can only be accessed from C code. When a script invokes sandbox.on, secmodel_sandbox stores the callback function at an unused index in the Lua registry, and the ruleset stores this index as the rule's value.

After evaluating the policy script, secmodel_sandbox attaches the struct sandbox to the current process's credentials. The data that secmodel_sandbox attaches to a credential is in fact a list of struct sandbox objects, so that a process can apply multiple sandboxes during the course of its execution. If the list does not exist, secmodel_sandbox first creates it and inserts the new sandbox; otherwise, the new struct sandbox is added to the existing list.

Storing the struct sandbox as an upvalue supports the creation of dynamic rules; that is, a sandbox.on callback function that creates rules for other requests as part of its evaluation. If the callback function creates new rules by calling any of the sandbox Lua module functions, then the C implementations of these functions can immediately find the corresponding ruleset for the given Lua state.

4.2 Evaluating Authorization Requests

secmodel_sandbox registers listeners for all kauth scopes. When one of the secmodel_sandbox listeners is called, the listener checks whether a list of struct sandbox objects is attached to the requesting credential. If a list is not attached, the listener defers; if a list is attached, secmodel_sandbox searches the ruleset of each struct sandbox for a value, calling into Lua if the value represents a registry index for a callback function.

If any sandbox in the list returns deny, secmodel_sandbox returns deny for the request; if at least one sandbox returns allow and none returns deny, secmodel_sandbox returns defer, not allow as one would presume. The reason for "converting" allow to defer lies in subtleties in the implementation of kauth(9) and of the default security models that implement the traditional 4.4BSD security policy. In particular, since a large part of the traditional security model is implemented by having all listeners defer, and thus relying on kauth's "fall-back" behavior, secmodel_sandbox must also defer, so as not to allow the elevation of privileges.
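The way a listener folds the per-sandbox decisions into a single kauth result can be sketched as below. This is an illustration only: enum decision and combine_sandbox_decisions are hypothetical names, and the array of results stands in for walking the list of sandboxes attached to the credential.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the KAUTH_RESULT_* values in sys/kauth.h. */
enum decision { DECISION_ALLOW, DECISION_DENY, DECISION_DEFER };

/*
 * Combine the decisions of every sandbox stacked on a credential.
 * results[i] is the decision of sandbox i for the current request;
 * n is 0 when no sandbox list is attached to the credential.
 */
enum decision
combine_sandbox_decisions(const enum decision *results, size_t n)
{
    size_t i;

    /* A deny from any sandbox is final. */
    for (i = 0; i < n; i++) {
        if (results[i] == DECISION_DENY)
            return DECISION_DENY;
    }

    /*
     * Otherwise defer. An allow is deliberately reported as defer so
     * that the default security model still makes the final decision;
     * an empty list and an all-defer list defer as well.
     */
    return DECISION_DEFER;
}
```

Note that the function never returns DECISION_ALLOW, which is the property that keeps a sandbox from granting more than the default model would.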

4.3 Process forking

In NetBSD, a process's credentials are represented by the kauth_cred_t type. The kauth framework emits events corresponding to a credential's life-cycle via the cred scope. As with other kauth scopes, listeners may register callback functions.

When a process forks, the normal behavior is for the parent and child to share the same kauth_cred_t, and to simply increment the credential's reference count. During the fork process, however, the kauth framework emits a fork event, thereby allowing for other behavior. For the fork event, the listener callback functions receive the struct proc of the parent and child, as well as the shared credential.

secmodel_sandbox registers a callback for credential events. During a fork event, secmodel_sandbox checks whether the credential contains a list of sandboxes. If it does, secmodel_sandbox creates a new credential for the child process that is identical to the parent's credential, with the exception that the child credential contains a new list head for the list of sandboxes. Although the list heads of the parent and child are different, they both point to the same initial struct sandbox. Thus, each sandboxed process has its own kauth_cred_t and its own sandbox list head, but the individual struct sandbox objects are shared among the related processes, and hence reference counted.

Handling sandboxes in this manner means that the child is restricted by the same sandboxes as its parent at the time of the child's creation, but that after the child's creation, parent and child may add additional sandbox policies that do not affect the other process.

4.4 Mapping vnodes to pathnames

The request context for the vnode kauth scope contains the vnode that is the target of the operation. For a sandbox policy, however, it is much more natural to work with pathnames rather than vnodes.

secmodel_sandbox uses methods similar to those in sys/kern/vfs_getcwd.c to attempt to retrieve a pathname. The method is to search for the basename of the vnode in the namei cache via cache_revlookup, and then walk back to the root vnode by interleaving calls to VOP_LOOKUP (to retrieve a parent vnode) and VOP_READDIR (to find the path name component of the child vnode). While we would expect the initial vnode to be present in the cache, an obvious weakness of this method is the reliance on a cache hit, which cannot be guaranteed.

4.5 Safeguards

Evaluating user-provided Lua scripts in the kernel raises a few concerns. An obvious concern is denial-of-service caused by a Lua script with an infinite loop. While not yet implemented, the defense is straightforward, and is used in the Lua kernel module to handle creating Lua states with luactl.

In short, as part of its C API, Lua provides the function lua_sethook for an application to register a C hook function to be called at various Lua VM events. In particular, an application can register to receive a callback after every n Lua VM instructions. The approach is therefore to set a hook function to be called after some maximum number of VM instructions; if the hook is called, the hook stops execution of the Lua VM by calling lua_error. Lua allows only one hook function per Lua state; in order to "restore" the VM instruction count back to zero, the hook function must be reset before every evaluation of a Lua script or function.

Another concern is that the struct sandbox objects or the callbacks registered via sandbox.on might be accessed and modified from Lua code. Values in the Lua registry and upvalues are, provided Lua's debug library is not loaded, only accessible from C code. secmodel_sandbox does not load the debug library. Moreover, secmodel_sandbox does not provide a require function or any other means to load additional Lua libraries.

5 Applications

In this section, we describe the tools and facilities we are developing with secmodel_sandbox.

5.1 Secure chroot

One application that we are developing is a secure chroot. In 2011, Aleksey Cheusov proposed
the secmodel_securechroot security model [1]. secmodel_securechroot was developed as a kernel module that, once loaded, modifies the chroot system call to place additional restrictions on the chrooted process. The restrictions impose process containment by preventing processes with one root directory from viewing information about processes with a different root directory, as well as denying the chrooted process several system-wide operations, such as rebooting, modifying sysctls, or adding devices.

On NetBSD's tech-kern mailing list, there was disagreement over the exact operations that should be allowed and denied within a secure chroot. Moreover, some users expressed a desire to not override the default chroot behavior, but rather have an additional system call for secure chroot, so that users could choose the level of restriction for each chrooted process. While some of the changes to kauth needed to support secmodel_sandbox were merged into the NetBSD kernel source, the secmodel itself was not.

We are developing an implementation of secmodel_securechroot as an auxiliary function, sandbox_securechroot, in libsandbox, with an associated command-line tool. Development of the tool demonstrates that previously proposed secmodels can be implemented using secmodel_sandbox, and that secmodel_sandbox offers users flexibility in choosing the proper level of containment.

5.2 pledge

We are also developing the libsandbox auxiliary function sandbox_pledge, which attempts to emulate OpenBSD's pledge system call using secmodel_sandbox.

A sandbox policy that mimics pledge is essentially a whitelist: it explicitly allows actions that correspond to a given category, and denies all others. Certain categories are easily implemented. For instance, the pledge categories that correspond to the access and modification of file metadata, such as rpath, wpath, fattr, and chown, are, with small exceptions, handled by appropriate vnode scope rules. Similarly, categories that limit network access to certain domains, such as inet and unix, are covered by rules for 'network.bind' and 'network.socket.open'.

Several pledge categories, however, reference system calls that, in NetBSD, do not trigger kauth requests. For example, the flock category, which allows file locking, and the dns category, which allows DNS network transactions, lack appropriate kauth requests. As a result, such categories cannot be implemented.

6 Conclusion

We have introduced and developed a new security model, secmodel_sandbox, for NetBSD that allows per-process specification and restriction of privileges. While several secmodels exist for NetBSD, secmodel_sandbox is novel in its use of NetBSD's in-kernel Lua interpreter to allow processes to express privileges, subject to the bounds of the traditional 4.4BSD security model. We designed secmodel_sandbox to limit excessive calls into Lua, to allow sandboxes to dynamically create rules during the execution of a process, to allow a process to specify multiple, isolated sandboxes during the course of its execution, and to ensure that a child process inherits the sandbox of its parent. We are developing concrete, familiar applications in order to demonstrate our design's ease and flexibility in developing secure software.

References

[1] Aleksey Cheusov. RFC: New security model secmodel_securechroot(9). URL: https://mail-index..org/tech-kern/2011/07/09/msg010903.html.
[2] Jake Edge. "A seccomp overview". Sept. 2015. URL: https://lwn.net/Articles/656307/.
[3] Elad Efrat. "Recent Security Enhancements in NetBSD". In: Proceedings of EuroBSDCon 2006. 2006. URL: http://www.netbsd.org/~elad/recent/recent06.pdf.
[4] pledge(2). OpenBSD 6.0.
[5] Niels Provos. "Improving Security with System Call Policies". In: Proceedings of the 12th Conference on USENIX Security Symposium - Volume 12. 2003.
[6] Technical Note TN2127: Kernel Authorization. Tech. rep. Apple Inc., 2010. URL: https://developer.apple.com/library/content/technotes/tn2127/_index.html.
[7] Lourival Vieira Neto et al. "Scriptable Operating Systems with Lua". In: Proceedings of the 10th ACM Symposium on Dynamic Languages. 2014.
