SECURITY

Variadic Functions
How they contribute to security vulnerabilities and how to fix them

BY ROBERT C. SEACORD

NOVEMBER 2005

Many of the formatted I/O functions in the ISO/IEC 9899:1999 C language standard (C99), such as printf() and scanf(), are defined as variadic functions (including formatted output functions that operate on multibyte characters [e.g., ASCII] and wide characters [e.g., UNICODE]). These functions accept a fixed format string argument that specifies, among other things, the number and type of arguments that are expected. If the contents of the format string are incorrect (by error or by malicious intent), the resulting behavior of the function is undefined.

Incautious use of formatted I/O functions has led to numerous exploitable vulnerabilities. The majority of these vulnerabilities occur when a potentially malicious user is able to control all or some portion of the format specification string, as shown in the following program:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void usage(char *pname) {
        char usageStr[1024];
        snprintf(usageStr, 1024, "Usage: %s <target>\n", pname);
        printf(usageStr);  /* vulnerable: pname becomes part of the format string */
    }

    int main(int argc, char *argv[]) {
        if (argc < 2) {
            usage(argv[0]);
            exit(-1);
        }
    }

These vulnerabilities are often referred to as "format string" vulnerabilities. Exploits take a variety of forms, the most dangerous of which involves using the %n conversion specifier to overwrite memory and transfer control to arbitrary code of the attacker's choosing. The easiest way to prevent format string vulnerabilities is to ensure that the format string does not include characters from untrusted sources. Because of internationalization, however, format strings and message text are often moved into external catalogs or files that the program opens at runtime. An attacker can alter the values of the formats and strings in the program by modifying the contents of these files. The entire topic of formatted output is covered in detail in my book on Secure Coding in C/C++.

Format string vulnerabilities have been discovered in a variety of deployed C language programs, including:
• The Washington University FTP daemon wu-ftpd, which is shipped with many distributions of Linux and other UNIX operating systems (CA-2000-13).
• The Common Desktop Environment (CDE), an integrated graphical user interface that runs on UNIX and Linux operating systems (CA-2001-27).
• Helix Player, and media players based on the Helix Player, including RealPlayer for Linux systems (VU#361181).

C/C++ language variadic functions are functions that accept a variable number of arguments. Variadic functions are implemented using either the ANSI C stdarg approach or, historically, the UNIX System V vararg approach. Both approaches require that the contract between the developer and user of the variadic function not be violated by the user. The following is an example of a variadic function implementation using ANSI stdarg:

    #include <stdarg.h>

    int average(int first, ...) {
        int count = 0, sum = 0, i = first;
        va_list marker;
        va_start(marker, first);
        while (i != -1) {  /* an argument of -1 terminates the list */
            sum += i;
            count++;
            i = va_arg(marker, int);
        }
        va_end(marker);
        return (sum ? (sum / count) : 0);
    }

Variadic functions are declared using a partial parameter list followed by the ellipsis notation. The variadic average() function accepts a single, fixed integer argument followed by a variable argument list. As with other functions, the arguments to the variadic function are pushed on the calling stack.

Variadic functions are problematic for a number of reasons. The first and foremost is that the implementation has no real way of knowing how many arguments were passed (even though this information is available at compile time). The termination condition for the argument list is a contract between the programmers who implement the library function and the programmers who use the function in an application. In this implementation of the average() function, termination of the variable argument list is indicated by an argument whose value is -1. This means, for example, that average(5, -1, 2, -1) is 5, not 2, as the programmer might expect. Also, if the programmer calling the function neglects to provide this argument, the average() function will continue to process the next argument indefinitely until a -1 value is encountered or an exception occurs.

A second problem with variadic functions is a complete lack of type checking. In the case of formatted output functions, the type of the arguments is determined by the corresponding conversion specifier in the format string. For example, if a %d conversion specifier is encountered, the formatted output function assumes that the corresponding argument is an integer. If a %s is found, the corresponding argument is interpreted as a pointer to a string. This could result in a program fault, for example, if the corresponding argument was actually a small integer value.

Every time a variadic function consumes an argument, an internal argument pointer is incremented to reference the next argument on the stack. If there is some type confusion, it is possible that the argument pointer is incorrectly incremented. This happens less than you might imagine on a 32-bit architecture such as the 32-bit Intel Architecture (IA-32) because almost all arguments (including addresses, char, short, int, and long int) use four bytes. However, conversion specifiers such as a, A, e, E, f, F, g, or G are used to output a 64-bit floating-point number, thereby incrementing the argument pointer by 8.

The standard C formatted output functions need modifications to print 64-bit integer and pointer values in hexadecimal. The %x modifier will only print out the first 32 bits of the value that is passed to it and increment the internal argument pointer by 4 bytes. To print out a 64-bit pointer, the ANSI C %p directive needs to be used rather than %x or %u. To print 64-bit integers, you need to use the one size specifier.

Solutions
One property of format string exploits is that the number of arguments referenced by the attacker's format string is greater than the number of arguments in the call to the formatted output function. Unfortunately, there is currently no mechanism by which a variadic function implementation can determine the number of arguments (or [...]

[...] calling sequence (partially implemented in hardware instructions) did pass a count of the number of long words making up the argument list. This was carried over into Alpha, and HP VMS for Alpha still does this.

If a byte count were passed, the va_arg() macro (which currently returns the next argument and increments the argument pointer based on the size of the argument) could also decrement the count and force a runtime-constraint violation when a variadic function attempts to access more arguments than have actually been provided.

While the C Standard allows compiler implementations to pass a byte count for variadic functions and not for normal functions, most implementations do not provide a different calling sequence for variadic functions. A common reason is to preserve compatibility between normal and variadic calls.

Unfortunately, it's unreasonable to modify the C language specification to require a byte count, as this change would break binary compatibility between existing applications and libraries. However, it might be possible to introduce a new syntax that could be used to enable the compiler to pass a byte count. So, for example, instead of:

    int printf(const char *format, ...) { }

we might have:

    int safe_printf(const char *format, argc+...) { }

or some other, similar syntax.

Type Safety
Knowing the number of arguments does not eliminate the possibility of format string vulnerabilities. For example, the types of those arguments would still not be known, possibly causing confusion if an integer is interpreted as, say, a pointer. [...]

[...] its types and to generate versions of variadic functions that examine the expected argument type and the actual argument type and generate a runtime error if it finds an unsafe or insecure mismatch. The biggest drawback of this approach is that it might introduce considerable overhead in processing variadic function calls.

Summary and Conclusion
The current implementation of variadic functions in the C programming language is error prone and a major factor in format string vulnerabilities in C and C++. Changes are possible (but in some cases unlikely) within the current constraints of the C language specification. Requiring a stdarg variant in which the compiler implementation provides a byte count is a possible mitigation for format string exploits, but it does not address type safety concerns. A more comprehensive solution that addresses type safety concerns should be researched. In the meantime, programmers should take care that untrusted user input is not incorporated into format specifications for formatted I/O functions and that other uses of variadic functions cannot be used to compromise system security. Better implementations for the average() function, for example, include:
1. Giving the number of arguments followed by the values: average(3, 5, -1, 2)
2. Giving the number of arguments followed by an array pointer: average(3, a)

The first of these implementations is the "poor man's" equivalent to having the compiler automatically pass the argument count (but requires additional programming that may also be erroneous).

Acknowledgments
I would like to acknowledge the contributions of my coworkers, in particular [...]

ABOUT THE AUTHOR
Robert C. Seacord is a senior vulnerability analyst at the CERT/Coordination Center (CERT/CC) at the Software Engineering Institute (SEI) in Pittsburgh, PA, and author of Secure Coding in C and C++ (Addison-Wesley, 2005). An eclectic technologist, Robert is coauthor of two previous books, Building Systems from Commercial Components (Addison-Wesley, 2002) and Modernizing Legacy Systems (Addison-Wesley, 2003).
[email protected]
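The two alternative designs for average() that the article lists (passing the number of arguments followed by the values, or passing the number of arguments followed by an array pointer) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names average_counted and average_array are hypothetical, as is the choice to return 0 for an empty list, and neither function is taken from the article.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical count-first variant: the caller states how many int
   values follow, so -1 is an ordinary value rather than a terminator.
   The count itself is still unchecked; a wrong count remains an error. */
int average_counted(int count, ...) {
    va_list args;
    long sum = 0;
    int i;

    if (count <= 0) {
        return 0;               /* assumption: empty list averages to 0 */
    }
    va_start(args, count);
    for (i = 0; i < count; i++) {
        sum += va_arg(args, int);
    }
    va_end(args);
    return (int)(sum / count);
}

/* Hypothetical array variant: not variadic at all, so the compiler
   can type-check the element pointer passed by the caller. */
int average_array(size_t count, const int *values) {
    long sum = 0;
    size_t i;

    if (count == 0 || values == NULL) {
        return 0;               /* assumption: empty list averages to 0 */
    }
    for (i = 0; i < count; i++) {
        sum += values[i];
    }
    return (int)(sum / (long)count);
}
```

With these definitions, average_counted(3, 5, -1, 2) evaluates to 2, because -1 is summed like any other value. The array form gives the same result for an array holding 5, -1, and 2, and it avoids the untyped argument list entirely, at the cost of requiring the caller to build the array first.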
