Compiling for Resource-Constrained Platforms Using Asshes: Application- and System-Specific Heuristics
Total Page:16
File Type:pdf, Size:1020Kb
Compiling for Resource-Constrained Platforms using ASSHes: Application- and System-Specific Heuristics Alastair Reid Nathan Cooprider John Regehr School of Computing, University of Utah Unpublished draft — November 2004 ABSTRACT tem. Instead of creating fast and not-too-big code, the compiler We present and evaluate application- and system-specific heuristics should aggressively try to generate code that fits into the available (ASSHes) for optimizing embedded software. ASSHes are scripts code and data space. Furthermore, the compiler should expend ex- that decide which compiler optimizations to apply; they make deci- tra optimization effort on tasks with real-time requirements. sions using feedback from previous compilations, using the results Developers using current tools can affect optimization heuristics of program analyses, and using high-level information, e.g., “there in various ways that are coarse-grained (e.g., global optimization are only 4 KB of RAM, and interrupt handlers 5–7 should be fast flags) and fine grained (e.g., profile data, pragmas). In general since we expect them to execute very frequently.” In contrast, tradi- this control is insufficient for meeting complex application- and tional compilers base optimization decisions on built-in heuristics platform-specific goals that involve multiple resources. There are that are oblivious to high-level application- and platform-specific two fundamental problems with existing tools: concerns. The ASSH effectively becomes the high-level optimiza- 1. Compilers are oblivious to the high-level non-functional re- tion policy and the existing compiler becomes a low-level mecha- quirements that are externally imposed on an embedded sys- nism for applying code transformations. ASSHes are more generic tem. than previous approaches to compiling embedded software: they permit compilation to be directed towards meeting many different 2. Compilers rely on a multitude of built-in heuristics for de- goals and combinations of goals. ciding when to apply optimizations that are not universally So far we have investigated ASSHes that control two optimiza- beneficial. It is difficult to gain effective overall control over tions: loop unrolling and function inlining. We evaluate ASSHes a diverse collection of opaque heuristics. by applying them to TinyOS sensor network applications, where they have successfully reduced resource usage. For example, for Lacking compiler support for meeting high-level resource con- one large application we can either reduce worst-case stack depth straints, developers are forced to manually specialize code to tune by 16%, reduce code size by 3.5%, or reduce the amount of time the resource usage. Manual specialization is harmful because it is time- processor is active by 14% (our baseline for comparison is the out- consuming and can introduce bugs. Also, it discourages reuse: the put of the nesC compiler, which itself performs aggressive cross- resulting code often sacrifices modularity and, furthermore, it is module inlining). These various improvements conflict: they can fragile with respect to changes in the underlying hardware plat- be achieved individually but not together—dealing with tradeoffs form: today’s resource constraints are not necessarily tomorrow’s. among resources is central to designing efficient embedded sys- This paper proposes a way to make existing compilers more suit- tems. able for compiling embedded software: we use application- and system-specific heuristics (ASSHes) to override optimization de- cisions made by the compiler. ASSHes have access to high-level 1. INTRODUCTION system requirements and to external program analyzers, code in- Optimizing compilers try to generate code that is fast and not too strumentation techniques, feedback mechanisms, etc. to direct their large—an appropriate design point for desktop and workstation ap- choices. Particularly important is giving ASSHes access to re- plications. Embedded software developers, on the other hand, are source bounding analyses [2, 21] that can statically determine if often not well-served by these compilers: their software must run a program will, for example, dynamically run out of memory. In on platforms with diverse and severe non-functional requirements. effect, the ASSH becomes the high-level optimization policy while For example, consider a program that will be run on a processor the existing compiler becomes the low-level mechanism for apply- with a low clock speed, limited battery energy, and little space for ing code transformations. This mechanism/policy separation is cru- code and data—in other words, a typical low-power embedded sys- cial: ASSHes present developers with a very narrow interface to the compiler, giving them the capability to extend it in a constrained way. In contrast, a generally extensible compiler is more power- ful but exports a much wider interface, for example exposing the compiler’s intermediate representations. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are The benefits of ASSHes come not from including any particular not made or distributed for profit or commercial advantage and that copies analyzer or heuristic—after all, compiler writers could equally well bear this notice and the full citation on the first page. To copy otherwise, to include that analyzer or heuristic—but from the ease with which republish, to post on servers or to redistribute to lists, requires prior specific new ASSHes can be written or combined. This is a corollary to permission and/or a fee. Robison’s observation [23] that Copyright 2004 ACM X-XXXXX-XX-X/XX/XX ...$5.00. 1 . small programming niches that would greatly ben- 32 Size-driven ASSH efit from certain optimizations cannot get those opti- 30 Bloat-driven ASSH mizations implemented, because payback for the com- piler vendor is not there. 28 26 Robison was not talking about the embedded domain but his obser- vation is particularly applicable there. The large diversity of archi- 24 nesC tectures, compilers, and operating systems for embedded systems Code size (KB) 22 best produces a high degree of market fragmentation. This fragmenta- 20 tion, in combination with the diverse non-functional requirements 18 imposed on embedded systems, guarantees the existence of many 0 20 40 60 80 100 small niches whose needs are not met by off-the-shelf tools. Percentage of functions inlined The contributions of this paper are: • We describe ASSHes: a novel organization for extensible op- Figure 1: Code size for the Surge Reliable application us- timization heuristics that exploits existing optimization passes ing two simple ASSHes, nesC, and our best heuristic. in order to meet application- and platform-specific goals. We intend ASSHes to be developed by programmers with a basic understanding of compilers and compiler optimizations—not • ASSHes can easily be made aware of high-level platform- by compiler specialists. and application-specific requirements, • We show that ASSHes are effective: we present experimen- • multiple ASSHes can be composed to support more sophis- tal results showing that they can produce code substantially ticated optimizations, and better than that produced by competing techniques. For ex- • ASSHes provide a convenient platform for making use of ample, on a large example we can reduce stack memory con- feedback to improve generated code. sumption by 16% or we can reduce the amount of time that the processor is active (and using energy) by 14%. 2.1 Background: TinyOS and nesC • We show that ASSHes are general: they support a variety of TinyOS is extremely lightweight systems software for the Berke- underlying optimizations (in this paper we describe ASSHes ley mote sensor network platform [10]. Motes are microcontroller- that utilize loop unrolling and function inlining) and they per- based embedded systems equipped with sensors and a packet radio. mit a wide variety of high-level goals to be met. ASSHes can The need to conserve energy and reduce costs imposes severe re- also be composed to meet more complex goals. source constraints: the mica2 motes that we use as an experimental platform are based on Atmel’s ATmega128 chip running at about • We show that ASSHes are practical: they can be added to 7 MHz. The Atmel chip implements the 8-bit AVR architecture. an existing compiler such as gcc either by modifying its code It has 128 KB of flash memory for storing programs and 4 KB of in minor ways, or by preprocessing its input using a source- RAM for data. to-source transformer. The nesC language [8] addresses the particular requirements of the mote platform by providing static component linking, whole- ASSHes significantly generalize our previous work on reducing program optimization through inlining, and static detection of con- stack depth [21], which considered tradeoffs between stack mem- currency errors. The nesC compiler emits C that is fed to a gcc ory and code memory. cross-compiler for AVR. Throughout this paper, our baseline for Section 2 provides a concrete introduction to application- and comparison is provided by the binaries generated by nesC and gcc. system-specific heuristics for optimizing embedded software by The Surge Reliable application, developed at Crossbow Inc., using a variety of ASSHes to optimize a nesC/TinyOS program. is the motivating application for this section. It performs multihop In Section 3 we describe the use of aggressive feedback-driven routing of sensor data to a sink node. Surge Reliable is rela- ASSHes to better compile embedded software. Section 4 describes tively large: it uses code from 43 components and, when compiled the interface between ASSHes and the compiler, and how to mod- by nesC, generates about 15,700 lines of C code in about 500 func- ify compilers to support ASSHes. In Section 5 we evaluate the tions. effectiveness of several ASSHes in reducing resource usage of a number of TinyOS applications that run on Berkeley motes [10]. 2.2 Code-size-driven ASSHes We compare ASSHes to previous work in Section 6 and conclude Our first example uses a variety of heuristics to achieve code-size in Section 7.