Characterizing Antivirus Workload Execution
Derek Uluski, Micha Moffie and David Kaeli
Computer Architecture Research Laboratory
Northeastern University
Boston, MA
{duluski,mmoffie,kaeli}@ece.neu.edu

Abstract

Despite the pervasive use of anti-virus (AV) software, there has not been a systematic study of the characteristics of the execution of this workload. In this paper we present a characterization of four commonly used anti-virus software packages. Using the Virtutech Simics toolset, we profile the behavior of four popular anti-virus packages as run on an Intel Pentium IV platform running Microsoft Windows XP.

In our study, we focus on the overhead introduced by the anti-virus software during on-access execution. The overhead associated with anti-virus execution can dominate overall performance. The AV-Test group has already reported that this overhead can range from 23-129% on live systems running on-access experiments [3].¹ The performance impact of anti-virus execution is clearly an important issue, and we present the first quantitative study of the characteristics of this workload. Our study includes the impact of both operating system execution and system call execution.

¹ Comparison tests were done during 2001-02 on earlier versions of the anti-virus packages. We are using more recent versions of these packages.

1 Introduction

Security is an important issue for all computer users. A significant amount of overhead is introduced if we enable anti-virus scanning. Many users are unhappy with the performance penalty they must pay for security. The amount of overhead introduced can be so significant that many users will defer virus scanning or totally disable their anti-virus software, leaving their systems vulnerable to viruses. Thus, it is important to address the performance overhead associated with anti-virus software execution.

Most anti-virus software packages employ a range of scanning techniques to decide whether or not a given file is infected. More complex techniques also exist, such as sandboxing, digital watermarking, and heuristic-based techniques [11].

There are two main usage models when running anti-virus software: 1) on-demand, and 2) on-access. The on-demand model involves the user specifying which files to scan. In this case, the anti-virus software will usually be running for a period of time, scanning numerous files. On-demand scanning is usually performed offline, when the user is not using the computer. The on-access model can be thought of as a daemon process that monitors system-level and user-level operations and intervenes (scans) when a predefined event occurs. Most AV software is configured to run in on-access mode. In this paper we will focus on the execution overhead associated with the on-access model.

The rest of this paper is organized as follows. First, we present data showing the performance penalty due to anti-virus execution in section 3. In section 4 we discuss our Simics environment and in section 5 we present some results from our workload characterization. We conclude the paper in section 6.

2 Related work

Many methods exist today that are used to guard against virus attacks. Anti-virus packages are commonly used to guard against known viruses. Most anti-virus software packages employ signature matching as the main mechanism to identify viruses [11]. An alternative strategy involves behavior blocking, wherein the behavior of a binary is analyzed and the rate of connections to a new host is limited [14]. Mechanisms that execute untrusted software in a sandbox, while monitoring behavior, are described in [11].

An important class of software-based intrusions includes stack smashing attacks [6, 15]. This class of attacks enables an intruder to redirect execution to malicious code by overwriting the return address that is stored on the program call stack. Stack smashing attacks can be addressed in several ways. StackGuard [7] is a compiler-based approach which places a canary key next to the return address on the program stack and validates the integrity of the return address. Libsafe [4] presents a method where special libraries are loaded dynamically that intercept calls to known, unsafe functions.

Hardware-based solutions for stack smashing also exist. StackGhost [8] provides hardware-based stack protection; the hardware is responsible for encrypting and decrypting return addresses. Another approach described in [15] enhances the return stack address to detect buffer overflow attacks.

In the area of anti-virus software characterization, the AV-Test group has published online results of measuring the overhead associated with different anti-virus software packages [3]. They compared the impact of running a range of anti-virus scenarios. Another comparison of different anti-virus software can be found in [5].

There have been a few studies that have proposed solutions to overcome anti-virus execution overhead. In [9], the authors analyze the underlying algorithms of open source anti-virus projects [1, 2] and propose a CAM-based co-processor for boosting anti-virus software execution performance. In [10], Symantec (the developers of Norton Anti-Virus) describe an anti-virus scanning hardware mechanism that would exist on a telecommunications network. They suggest using a finite state machine to match multiple signatures. Tatari [12] describes the implementation of a co-processor that is capable of simultaneously matching complex regular expressions.

Next, we will present a number of characteristics of anti-virus software execution.

3 Anti-virus performance degradation

Next we will quantify the amount of overhead introduced by anti-virus software. We will defer a discussion of the details of our evaluation framework until section 4. Figure 1 plots the increase in execution time due to anti-virus overhead. We study three different test scenarios: 1) copying a small executable from the CDROM to the hard disk, 2) executing calc.exe, and 3) executing wordpad.exe. All of this execution is running under Windows XP Professional. The value shown in each bar is the percent increase in execution time relative to a base case (the base case is the same scenario run without any anti-virus software present).

[Figure 1: Anti-virus performance degradation. Percent increase in cycles for the copy, calc and wordpad scenarios under Cillin, F-Prot, McAfee and Norton.]

We conducted a second experiment to determine the number of extra instructions executed while performing file system operations and while loading/executing a binary. Both scenarios involve a small Helloworld binary of 28KB in size.

Most of the anti-virus code executed is located in tight loops that perform string scans. We have found that anti-virus execution is dominated by a very small number of very hot basic blocks in each anti-virus package: 3 basic blocks for Cillin and F-Prot, and fewer than 20 basic blocks for McAfee and Norton (containing 109 and 226 instructions total, respectively).

In figure 2, we plot the number of dynamic instructions executed. We show the total number of instructions executed (total) and also the number of instructions executed that reside in hot basic blocks. We consider a basic block hot if it is visited more than 50,000 times. We collect all the virtual addresses, labeling each basic block as hot or cold, and compute the percentage of instructions executed that reside in hot basic blocks.

[Figure 2: Anti-virus overhead. Number of dynamic instructions (in millions) for Copy (total), Copy (Freq. AV code), Execute (total) and Execute (Freq. AV code) under Base, Cillin, F-Prot, McAfee and Norton.]

For Cillin, McAfee and Norton, the scanning algorithm used has a relatively small footprint and is frequently revisited. This opens the door for optimizing the most frequent basic blocks, which may lead to a significant reduction in the performance penalty introduced by anti-virus execution.

Next, we will discuss our simulation environment for this characterization work.

4 Simulation framework

To study anti-virus behavior, it only makes sense to use a platform where a majority of the virus attacks have been targeted, and where there exist a number of commercial anti-virus packages available. We have chosen to build our studies on top of the Virtutech Simics toolset [13], a full machine-state architectural simulator that can emulate a faithful model of a large number of micro-architectures. Simics allows us to profile the complete instruction stream executed by the processor (including operating system and library execution), as well as capture all memory and I/O activity. The Simics toolset also includes a cycle-accurate micro-architectural model which we use to obtain cycle-accurate performance numbers.

The Simics model we are using is known as the Dredd model, a 2GHz Intel Pentium IV with 256MB of memory. This model contains a generic motherboard containing a model of the Intel 440BX chipset. The goal in modeling this class of machine is to capture the execution of anti-virus software on a representative system.

Processor Model                  Intel Pentium 4 2.0A
Processor Operating Frequency    2GHz
L1 Trace Cache                   12K entry
L1 Data Cache                    8KB
L2 Cache                         512KB
Main Memory                      256MB

Table 1: Structure of the P4 microarchitecture used in this work.

In order to obtain performance metrics, the instruction stream executed is passed to [...]

[...] XP Professional (2002). This is the Base configuration and it has no anti-virus software installed. We then created four more configurations on top of the Base configuration, one for each anti-virus software package.
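
As an illustration of the hot basic block accounting described in section 3, the following is a minimal Python sketch, not the authors' tooling. It labels a basic block hot when it is visited more than 50,000 times (the threshold given in the text) and computes the percentage of dynamic instructions that reside in hot blocks. The `BasicBlock` record, its field names, and the assumption that each visit executes every instruction in the block are hypothetical choices made for this example.

```python
from dataclasses import dataclass

# Threshold from the text: a block is hot if visited more than 50,000 times.
HOT_VISIT_THRESHOLD = 50_000


@dataclass(frozen=True)
class BasicBlock:
    """Hypothetical per-block profile record (not a Simics data structure)."""
    address: int           # virtual address of the block's first instruction
    num_instructions: int  # static number of instructions in the block
    visits: int            # how many times the block was entered


def is_hot(block, threshold=HOT_VISIT_THRESHOLD):
    """Return True if the block was visited strictly more than `threshold` times."""
    return block.visits > threshold


def dynamic_instructions(block):
    """Dynamic instruction count, assuming every visit runs the whole block."""
    return block.num_instructions * block.visits


def hot_instruction_percentage(blocks, threshold=HOT_VISIT_THRESHOLD):
    """Percentage of all dynamic instructions that reside in hot blocks."""
    total = sum(dynamic_instructions(b) for b in blocks)
    if total == 0:
        return 0.0  # empty profile: avoid dividing by zero
    hot = sum(dynamic_instructions(b) for b in blocks if is_hot(b, threshold))
    return 100.0 * hot / total


if __name__ == "__main__":
    # Made-up profile data, for illustration only.
    profile = [
        BasicBlock(address=0x1000, num_instructions=10, visits=60_000),
        BasicBlock(address=0x2000, num_instructions=20, visits=10_000),
        BasicBlock(address=0x3000, num_instructions=5, visits=50_000),
    ]
    print(f"{hot_instruction_percentage(profile):.1f}% of instructions are in hot blocks")
```

The comparison is strict, so a block visited exactly 50,000 times is treated as cold, matching the "more than 50,000 times" wording.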