CADETS: Blending Tracing and Security
By Jonathan Anderson, George V. Neville-Neil, Arun Thomas, and Robert N. M. Watson
FreeBSD Journal, May/June 2017

The FreeBSD operating system has been used in many products that require strong security properties, from storage appliances to network switches and routers, as well as gaming consoles. Over the 20-plus years of FreeBSD's development, there have been significant additions to the system in the area of security.

The Jails [1] system introduced what today would be considered a lightweight version of virtualization. The Mandatory Access Control (MAC) framework was added in 2002 to provide fine-grained, system-wide control over what users and programs could do within the system [2]. The Audit subsystem added the ability to track, on a system-call-by-system-call basis, what actions were occurring on the system and who or what was initiating those actions [3]. More recently, Capsicum has introduced capabilities into the operating system that are the basis for application sandboxes [4].

As part of a new research project, the Causal, Adaptive, Distributed, and Efficient Tracing System (CADETS), we are adding new security primitives to FreeBSD as well as leveraging existing primitives to produce an operating system with the maximum amount of transparency. Apart from the security implications, having better visibility into systems will help us to improve overall performance and provide new runtime debugging tools. In this article, we'll cover our work with DTrace and the Audit system, which form the core components of the work, and talk about the challenges of using DTrace as an always-on tracing system to track security and other events.

Starting at the Beginning

One of the main components of FreeBSD we are exploiting in CADETS is DTrace. Originally developed for Sun's Solaris operating system in the early years of the 21st century [5], DTrace was meant to solve the following problem: most software systems have some sort of logging framework, which usually depends on the ubiquitous printf function, which formats text and sends it to some form of console.

    printf("Hello world!");

All C programmers know the code shown above, and every programmer knows the equivalent idiom in their own language, whether it's Python, Rust, Go, or PHP. There are a few problems with using print statements for logging systems. The first problem is that print statements have a high performance overhead. If you've ever wondered how much work is done on the programmer's behalf by printf, see Brooks Davis's "Everything you ever wanted to know about 'hello, world.'" [6] Using print statements for a logging system in code that is otherwise supposed to have low overhead is not an option, and so most logging systems are enabled or disabled either at compile time, using an #ifdef/#endif statement, or at runtime, by wrapping the logging in an if statement. An example of this idiom in the C language is shown below.

    #ifdef LOGGING
    if (log)
        printf("You have written %d bytes", len);
    #endif /* LOGGING */

The next problem with print-based logging systems is that they are both static and prone to errors. If the programmer did not expose a piece of information via the logging system before the code was built, then it will not be possible, without modifying the code, to learn about any other data upon which the same function might be operating.

Together, these two problems mean that most high-performance systems, such as operating systems, are shipped without logging enabled, and when logging is enabled, the data that is available is limited to whatever the original programmer wished to expose.

As open-source developers, we are used to the idea that we can "just recompile the code," but in production systems, that's not always possible. Imagine you have sold a system to a large bank. At 3 a.m., the system has a fault of some sort and logs an error. Someone in the IT department gets a message reporting the fault, they call support, support calls a programmer, and the programmer then says, "Stop the system, rebuild the code, and rerun it with logging turned on." Handling errors in that way is terrible both for the customer and for the developer. The customer will be annoyed at having to rebuild and restart their system, and the developer is unlikely to get helpful information because the error is now far in the past. Enter Dynamic Tracing.

DTrace was designed to always be available for use, without reducing system performance and without running the risk of outright crashing the system. The clearest way to think about DTrace is as a runtime debugger with significant logging and statistical capabilities. DTrace avoids the overhead of printf-based logging systems by using a few tricks to subvert the execution of compiled code. The complete details are covered in The Design and Implementation of the FreeBSD Operating System [7], but, briefly, DTrace can override the entry or exit point of any function that is compiled into the kernel or into a user-space program. The way in which functions are collected into libraries and programs is a well-defined process, and each function has a set of instructions that indicate where the function begins. When DTrace traces a function, it replaces a few instructions with some of its own so that when the code reaches that point, DTrace's code gets called first, and it can collect and process the function's arguments. The practical upshot of this is that when DTrace is not in use, it has zero overhead.

Tracing for Security

How can we apply DTrace to the problems of security? One aspect of security is finding where the bad actor lies in a system. A key question to ask when looking at a system is "Who did what to whom and when?" We'll refer to this as "Question 1." Establishing the tree of operations such that we can trace it back to its root is one part of forensic analysis that can help us find our bad actor and also show us what we need to change to prevent future security breaches.

Imagine that, regardless of the performance cost, we had complete transparency into every operation performed on a computer system. With sufficient time, analysis, and tooling, we would be able to take the output generated by the tracing system and find out the answer to Question 1.

DTrace gives us the basis on which to build such a system, but there remain some challenges. In the previous section, we stated that by using some clever tricks to replace certain instructions at runtime, DTrace would have zero overhead when not in use. That feature of DTrace was what made it viable to ship it, by default, with Solaris, FreeBSD, and Mac OS in the first place. It was not until there was a way to ship a tracing system that didn't have a performance impact on a running system that such a system could be fielded.

[...] introduced by tracing gets too high, then the kernel will terminate the tracing as a form of self-preservation. The second tenet of the DTrace system, that the tracing system must not unduly tax the system, is one of the key challenges of using DTrace as a security technology. An attacker who knows there might be tracing will first cause a great deal of irrelevant load on the system, causing the tracing system to exit, and will then go about attacking the system. Another component of the CADETS project looks at the provenance of traces in order to thwart such attacks on the tracing system itself.

The majority of the current use cases for DTrace involve using it as a runtime debugger, turning on tracing when someone suspects there is a problem on a system, rather than leaving the tracing running all the time. A small subset of users have built complex telemetry systems around DTrace, including Fishworks [8], but in these first attempts at always-on tracing the number of things being traced was a small subset of the possible tracepoints. To build a system that has complete visibility, we need not only to have always-on tracing, but also to increase the performance of the tracing system such that the overhead of collecting the data does not overwhelm the system's ability to do productive work. Another requirement of a tracing system targeted at security is that trace records cannot be dropped or lost due to high load. DTrace was designed in such a way that under high load it was acceptable to drop trace records, long before the kernel might terminate the dtrace collection process due to the system being unresponsive. A system that is trying to collect trace data for later forensic analysis turns that concept on its head.

Any system where tracing is always on will generate a lot of data, and that data will need to be analyzed to track down attackers and how they are able to compromise the system. There are several systems for taking arbitrary textual output from various tools and trying to make some sense of it, including Splunk [9]. The goal of the CADETS project is to produce data for use by such tools. It will be much easier for consumers if the data is in a machine-readable format.