Linux System Calls for HLA Programmers
1 Introduction

This document describes the interface between HLA and Linux via direct system calls. The HLA Standard Library provides a header file with a set of important constants, data types, and procedure prototypes that you can use to make Linux system calls. Unlike the "C Standard Library," the HLA-based system calls add very little overhead to a system call (generally, they move parameters into registers and invoke Linux with very little other processing).

Note that I have copied information from the Linux man pages into this document. So whatever copyright (copyleft) applies to that information applies here as well. As far as I am concerned, my work on this document is public domain, so this document inherits the Linux documentation copyright (I don’t know the details, but it’s probably the "Free Documentation" license; whatever it is, it applies equally here).

Note that Linux man pages are known to contain some misinformation. That misinformation was copied straight through to this document. Of course, in a document of this size, there are probably some defects I’ve introduced as well, so keep this in mind when using this document.

Disclaimer: I would like to claim that every effort has been taken to ensure the accuracy of the material appearing within this document. That, however, would be a complete lie. In reality, I copied the man pages to this document and made a bunch of quick changes to their descriptions for HLA/assembly programmers. Undoubtedly, this process has introduced additional defects into the descriptions. Therefore, if something doesn’t seem right or doesn’t seem to work properly, it’s probably due to an error in this documentation. Keep this in mind when using this document. Hopefully as time passes and this document matures, many of the defects will disappear.
Also note that (as this was being written) I have not had time to completely test every system call, constant, and wrapper function provided with HLA. If you’re having a problem with a system call, be sure to check out the HLA source code for the Linux wrapper functions and whatever constants and data types you’re using from the linux.hhf module.

1.1 Direct System Calls from Assembly Language

To invoke a Linux system call (i.e., a Linux API call), you load parameters into various registers, load a system call opcode into the EAX register, and execute an INT($80) instruction. Linux returns results in the EAX register and, possibly, via certain pass by reference parameters.

The HLA "linux.hhf" header file contains constant declarations for most of the Linux system call opcodes. These constants take the form "sys_function" where function represents a Linux system call name. For example, "sys_exit" is the symbolic name for the Linux "_exit" call (this constant just happens to have the value one).

If you read the on-line documentation for the Linux system calls, you’ll find that the API calls are specified using a "C" language syntax. However, it’s very easy to convert the C examples to assembly language. Just load the associated system call constant into EAX and then load the 80x86 registers with the following values:

• 1st parameter: EBX
• 2nd parameter: ECX
• 3rd parameter: EDX
• 4th parameter: ESI
• 5th parameter: EDI

Certain Linux 2.4 calls pass a sixth parameter in EBP. Calls compatible with earlier versions of the kernel pass six or more parameters in a parameter block and pass the address of the parameter block in EBX (this change was probably made in kernel 2.4 because someone noticed that an extra copy between kernel and user space was slowing down those functions with exactly six parameters; who knows the real reason, though).

As an example, consider the Linux exit system call.
This has a "C" prototype similar to the following:

    void exit( int returnCode );

The assembly invocation of this function takes the following form:

    mov( sys_exit, eax );
    mov( returnCode, ebx );
    int( $80 );

Released to the Public Domain by Randall Hyde

As you can see, calls to Linux are very similar to BIOS or DOS calls on the PC (for those of you who are familiar with such system calls).

While it is certainly possible for you to load the system call parameters directly into the 80x86 registers, load a system call "opcode" into EAX, and execute an INT($80) instruction directly, this is a lot of work if your program makes several Linux system calls. To make life easier for assembly programmers, the Linux system call module provided with the HLA Standard Library provides wrapper functions that make Linux system calls a lot more convenient. These functions let you pass parameters on the stack (using the HLA high level procedure call syntax), which is much more convenient than loading the registers and executing INT($80). For example, consider the following implementation of the "linux._exit" function the Linux module provides:

    procedure _exit( RtnCode: dword ); @nodisplay;
    begin _exit;

        mov( sys_exit, eax );
        mov( RtnCode, ebx );
        int( $80 );

    end _exit;

You can call this function using the HLA syntax:

    linux._exit( returnValue );

As you can see, this is far more convenient to use than the INT($80) sequence given earlier. Furthermore, this calling sequence is very similar to the "C" syntax, so it should be very familiar to those reading Linux documentation (which is based on "C").

Your code would probably be slightly smaller and a tiny bit faster if you made the INT($80) calls directly. However, since the transition from user space to kernel space is very expensive, the few extra cycles needed to pass the parameters on the stack to the HLA functions are nearly meaningless.
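To see the register convention listed earlier applied to a call with more than one parameter, here is a minimal sketch of a direct write call. It rests on two assumptions: that the call number is available as linux.sys_write (following the "sys_function" naming pattern, qualified with the linux namespace the same way linux._exit is), and that including linux.hhf alone is enough to see it. The program name directWrite and the msg buffer are made up for this illustration. If your copy of linux.hhf exposes the constants differently, adjust the names accordingly.

```hla
program directWrite;
#include( "linux.hhf" )

static
    // Six bytes: "Hello" followed by a newline (#10).
    msg: char[6] := [ 'H', 'e', 'l', 'l', 'o', #10 ];

begin directWrite;

    // C form: write( 1, msg, 6 );
    // The three parameters go into EBX, ECX, and EDX, in that order.
    mov( linux.sys_write, eax );    // assumed constant name
    mov( 1, ebx );                  // 1st parameter: file descriptor 1 (stdout)
    lea( ecx, msg );                // 2nd parameter: address of the buffer
    mov( 6, edx );                  // 3rd parameter: number of bytes
    int( $80 );                     // Linux leaves its result in EAX

    // C form: _exit( 0 );
    mov( linux.sys_exit, eax );
    mov( 0, ebx );
    int( $80 );

end directWrite;
```

If the assumptions hold, this writes the six bytes of msg to standard output and then terminates with a return code of zero.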
In a typical (large) program, the memory savings would probably be measured in hundreds of bytes, if not less. So you’re not really going to gain much by making the INT($80) calls. Since the HLA code is much more convenient to use, you really should call the Standard Library functions.

For those who are concerned about inefficiencies, the linux._exit implementation given earlier shows what a typical HLA Standard Library Linux system call looks like. As you can see, there’s not much to these functions. So you shouldn’t worry at all about efficiency loss.

On occasion, certain Linux system calls become obsolete. Linux has maintained the calls for the older functions for those programs that require the old semantics, while adding new API calls that support additional features. A classic example is the LSEEK and LLSEEK functions. Originally, there was only LSEEK (which only supports two gigabyte file lengths). Linux added the LLSEEK function to allow access to larger files. Still, the old LSEEK function exists for code that was written prior to the development of the LLSEEK call. So if you use the INT($80) mechanism to invoke Linux, you probably don’t have to worry too much about certain system calls disappearing on you.

There is, however, a big advantage to using the HLA wrapper functions. If you use the INT($80) calling mechanism and a system call becomes obsolete, your program will probably still work, but it won’t be able to take advantage of the new Linux features unless you rewrite the affected INT($80) calls. On the other hand, if you call the HLA wrappers, this problem exists in only one place: the HLA Standard Library wrapper functions. This means that whenever the Linux system calls change, you need only modify the affected wrapper function (typically in one place), recompile the HLA Standard Library, recompile your applications, and you’re in business. This is much easier than attempting to locate every INT($80) call in your code and checking to see if you need to change it.
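The "one place" argument can be sketched with a small wrapper of your own, written in the same pattern as the linux._exit function. The procedure putBytes below is hypothetical (it is not part of the HLA Standard Library), and the constant name linux.sys_write is an assumption based on the "sys_function" naming pattern. The point is only that the call number and the register loading live in a single body, so the call sites would stay untouched if the underlying call were ever superseded the way LSEEK was by LLSEEK.

```hla
program wrapperDemo;
#include( "linux.hhf" )

static
    // Three bytes: "ok" followed by a newline (#10).
    msg: char[3] := [ 'o', 'k', #10 ];

// putBytes is a hypothetical wrapper, not a Standard Library routine.
// The system call number appears only in this body.
// Like any raw call, it overwrites EAX, EBX, ECX, and EDX.
procedure putBytes( fd:dword; var buf:var; count:dword ); @nodisplay;
begin putBytes;

    mov( linux.sys_write, eax );    // assumed constant name
    mov( fd, ebx );                 // 1st parameter
    mov( buf, ecx );                // 2nd parameter: buf holds the data's address
    mov( count, edx );              // 3rd parameter
    int( $80 );                     // result comes back in EAX

end putBytes;

begin wrapperDemo;

    // The call site mentions neither the call number nor the registers.
    putBytes( 1, msg, 3 );

end wrapperDemo;
```

Every call site looks like the single putBytes line in the main program, so a change to the underlying system call means editing and recompiling only the procedure body.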
Combined with the ease of calling the HLA wrapper functions, this maintenance advantage means you should seriously consider whether it’s worth it to call Linux via INT($80). For this reason, the remainder of this document will assume that you’re using the HLA Linux module to call the Linux APIs. If you choose to use the INT($80) calling mechanism instead, conversion is fairly trivial (as noted above).

Version: 4/5/02. Written by Randall Hyde.

1.2 A Quick Note About Naming Conventions

Most Linux documentation was written assuming that the reader would be calling Linux from a C/C++ program. While the HLA header files (and this document) attempt to stick as closely to the original Linux names as possible, there are a few areas where HLA names deviate from the C names. This can occur for any of four reasons:

• The C name conflicts with an HLA reserved word (e.g., "exit" becomes "_exit" because "exit" is an HLA reserved word).
• C uses different namespaces for structs and other objects, and some Linux identifiers are the same for both structs and variables (HLA doesn’t allow this).
• HLA uses case neutral identifiers, while C uses case sensitive identifiers. Therefore, if two C identifiers are the same except for alphabetic case, one of them must be changed when converting to HLA.
• Many Linux constant and macro declarations use the (stylistically dubious) convention of all uppercase characters. Since uppercase is hard to read, such identifiers have been converted to all lowercase in the HLA header files.

1.3 A Quick Note About Error Return Values

C/C++ programmers probably expect Linux system calls to return -1 if an error occurs, and then they expect to find the actual error code in the errno global variable.