Data structure alignment
From Wikipedia, the free encyclopedia

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding. When a modern computer reads from or writes to a memory address, it does this in word-sized chunks (e.g. 4-byte chunks on a 32-bit system). Data alignment means putting the data at a memory offset that is some multiple of the word size, which increases the system's performance because of the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next; this is data structure padding.

For example, when the computer's word size is 4 bytes (a byte means 8 bits on most machines, but could be different on some systems), the data to be read should be at a memory offset that is some multiple of 4. When this is not the case, e.g. the data starts at the 14th byte instead of the 16th byte, the computer has to read two 4-byte chunks and do some calculation before the requested data has been read, or it may generate an alignment fault. Even though the previous data structure ends at the 13th byte, the next data structure should start at the 16th byte; two padding bytes are inserted between the two data structures to align the next data structure to the 16th byte.

Although data structure alignment is a fundamental issue for all modern computers, many computer languages and computer language implementations handle data alignment automatically. Ada,[1][2] certain C and C++ implementations, D,[3] and assembly language allow at least partial control of data structure padding, which may be useful in certain special circumstances.

Contents

◾ 1 Definitions
◾ 2 Problems
◾ 3 Architectures
  ◾ 3.1 RISC
  ◾ 3.2 x86
  ◾ 3.3 Compatibility
◾ 4 Data structure padding
  ◾ 4.1 Computing padding
◾ 5 Typical alignment of C structs on x86
  ◾ 5.1 Default packing and #pragma pack
◾ 6 Allocating memory aligned to cache lines
◾ 7 Hardware significance of alignment requirements
◾ 8 See also
◾ 9 References
◾ 10 Further reading
◾ 11 External links

Definitions

A memory address a is said to be n-byte aligned when a is a multiple of n bytes (where n is a power of 2). In this context a byte is the smallest unit of memory access, i.e. each memory address specifies a different byte. An n-byte aligned address has log2(n) least-significant zeros when expressed in binary. The alternative wording b-bit aligned designates a b/8-byte aligned address (e.g. 64-bit aligned is 8-byte aligned).

A memory access is said to be aligned when the datum being accessed is n bytes long and the datum address is n-byte aligned. When a memory access is not aligned, it is said to be misaligned. Note that by definition byte memory accesses are always aligned.

A memory pointer that refers to primitive data that is n bytes long is said to be aligned if it is only allowed to contain addresses that are n-byte aligned; otherwise it is said to be unaligned. A memory pointer that refers to a data aggregate (a data structure or array) is aligned if (and only if) each primitive datum in the aggregate is aligned.
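In C, this power-of-two definition can be checked by masking the low bits of an address. The following is a minimal illustrative sketch (the helper name is a placeholder, not part of the article or any standard library):

#include <stdint.h>
#include <stdbool.h>

/* Returns true if the address held in p is n-byte aligned, assuming n is a
 * power of two: an n-byte aligned address has log2(n) least-significant zero
 * bits, so masking the address with (n - 1) must yield zero. */
static bool is_aligned(const void *p, uintptr_t n) {
    return ((uintptr_t)p & (n - 1)) == 0;
}

For example, is_aligned(p, 8) tests whether p is 8-byte (64-bit) aligned.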
Note that the definitions above assume that each primitive datum is a power of two bytes long. When this is not the case (as with 80-bit floating-point on x86), the context influences the conditions under which the datum is considered aligned or not.

Data structures can be stored in memory on the stack with a static size, known as bounded, or on the heap with a dynamic size, known as unbounded.

Problems

A computer accesses memory one memory word at a time. As long as the memory word size is at least as large as the largest primitive data type supported by the computer, aligned accesses will always access a single memory word. This may not be true for misaligned data accesses.

If the highest and lowest bytes in a datum are not within the same memory word, the computer must split the datum access into multiple memory accesses. This requires complex circuitry to generate the memory accesses and coordinate them. To handle the case where the memory words are in different memory pages, the processor must either verify that both pages are present before executing the instruction or be able to handle a TLB miss or a page fault on any memory access during the instruction execution.

When a single memory word is accessed the operation is atomic, i.e. the whole memory word is read or written at once and other devices must wait until the read or write operation completes before they can access it. This may not be true for unaligned accesses to multiple memory words; e.g. the first word might be read by one device, both words written by another device, and then the second word read by the first device, so that the value read is neither the original value nor the updated value. Although such failures are rare, they can be very difficult to identify.

Architectures

RISC

Most RISC processors will generate an alignment fault when a load or store instruction accesses a misaligned address. This allows the operating system to emulate the misaligned access using other instructions. For example, the alignment fault handler might use byte loads or stores (which are always aligned) to emulate a larger load or store instruction.

Some architectures, like MIPS, have special unaligned load and store instructions. One unaligned load instruction gets the bytes from the memory word with the lowest byte address and another gets the bytes from the memory word with the highest byte address. Similarly, store-high and store-low instructions store the appropriate bytes in the higher and lower memory words respectively.

The Alpha architecture has a two-step approach to unaligned loads and stores. The first step is to load the upper and lower memory words into separate registers. The second step is to extract or modify the memory words using special low/high instructions similar to the MIPS instructions. An unaligned store is completed by storing the modified memory words back to memory. The reason for this complexity is that the original Alpha architecture could only read or write 32-bit or 64-bit values. This proved to be a severe limitation that often led to code bloat and poor performance. To address this limitation, an extension called the Byte-Word Extensions (BWX) was added to the original architecture. It consisted of instructions for byte and word loads and stores.
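At the source level, the byte-wise emulation described above is commonly expressed in portable C with memcpy rather than a direct pointer dereference, letting the compiler choose an access sequence that is safe for the target. A minimal sketch, with an illustrative (not standardized) function name:

#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from an address that may be misaligned by copying it
 * byte by byte into an aligned local. On targets that fault on misaligned
 * loads, the compiler can fall back to byte accesses, which are always
 * aligned by definition; on targets that allow misaligned access it can
 * emit a single load. */
static uint32_t load_u32_unaligned(const void *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}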
Because these instructions are larger and slower than the normal memory load and store instructions, they should only be used when necessary. Some C and C++ compilers have an "unaligned" attribute that can be applied to pointers that need the unaligned instructions.

x86

While the x86 architecture originally did not require aligned memory access and still works without it, SSE2 instructions on x86 CPUs do require the data to be 128-bit (16-byte) aligned, and there can be substantial performance advantages from using aligned data on these architectures. However, there are also instructions for unaligned access, such as MOVDQU.

Compatibility

The advantage of supporting unaligned access is that it is easier to write compilers that do not need to align memory, at the expense of slower access. One way to increase performance in RISC processors, which are designed to maximize raw performance, is to require data to be loaded or stored on a word boundary. So although memory is commonly addressed by 8-bit bytes, loading a 32-bit integer or a 64-bit floating-point number would be required to start at every 64 bits on a 64-bit machine. The processor could flag a fault if it were asked to load a number that was not on such a boundary, but this would result in a slower call to a routine that would need to figure out which word or words contained the data and extract the equivalent value.

Data structure padding

Although the compiler (or interpreter) normally allocates individual data items on aligned boundaries, data structures often have members with different alignment requirements. To maintain proper alignment the translator normally inserts additional unnamed data members so that each member is properly aligned. In addition, the data structure as a whole may be padded with a final unnamed member. This allows each member of an array of structures to be properly aligned.

Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by descending alignment requirements, a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure.
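The effect of member ordering on padding can be observed with sizeof and offsetof. The structs below are hypothetical examples, not from the article, and the exact layout is implementation-defined; the comments show typical values for a 32-bit or 64-bit x86 C implementation, where int is 4 bytes with 4-byte alignment.

#include <stdio.h>
#include <stddef.h>

struct Padded {
    char c;   /* 1 byte, typically followed by 3 padding bytes so i is 4-byte aligned */
    int  i;   /* 4 bytes */
    char d;   /* 1 byte, typically followed by 3 trailing padding bytes so the  */
};            /* struct size is a multiple of its 4-byte alignment requirement  */

struct Reordered {
    int  i;   /* 4 bytes */
    char c;   /* 1 byte */
    char d;   /* 1 byte, typically followed by 2 trailing padding bytes */
};

int main(void) {
    printf("sizeof(struct Padded)      = %zu\n", sizeof(struct Padded));        /* typically 12 */
    printf("offsetof(struct Padded, i) = %zu\n", offsetof(struct Padded, i));   /* typically 4  */
    printf("sizeof(struct Reordered)   = %zu\n", sizeof(struct Reordered));     /* typically 8  */
    return 0;
}

Placing the member with the largest alignment requirement first, as in struct Reordered, eliminates the interior padding and leaves only the trailing padding needed to align elements of an array of the structure.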