<<

InIn computercomputer science,,aa pointer is aa programmingprogramming language datadata type whosewhose value refers directly to (or "points to") another value stored elsewhere in thethe computecomputerr memory usingusing its address..ForFor high-levelhigh-level programming languages,,pointerspointers effectively take the place of generalof general purpose registers inin low-level languages such as assemblyas oror machinemachine code,,butbut may be in availableavailable memorymemory..AA pointer references a location in memory, and obtaining the value at the location a pointer refers to is known as dereferencing the pointer. A pointer is a simple, more concrete implementation of the more abstract abstract reference datadata type. Several languages support some type of pointer, although some have more restrictions on their use than others. As an analogy, a page number in a book could be considered a pointer to the corresponding pagepage;; dereferencing such a pointer would be done by flippingflipping to the page with the given page number.

Pointers to data significantly improve performance for repetitive operations such as traversingtraversing stringsstrings,, lookuplookup tables,, controlcontrol tables andand treetree structures.structures. In particular, it is often much cheaper in time and space to copy and dereferencedereference pointers than it iiss toto copy and access the data to whiPointers are also used to hold the addresses of entry points forfor calledcalled inin proceduralprocedural programming andand for run-timefor run-time linking to dynamic link libraries (DLLs)..InIn object-orientedobject-oriented programming,, pointerspointers to functions areare used forfor bindingbinding methodsmethods,, often using what are calledcalled virtualvirtual method tables..

While "pointer" has been used to refer to references in general, it more properly applies to data structures whose explicitly allows the pointer to be manipulated (arithmetically via pointer arithmetic) as a , as opposed to aa magicmagic cookiecookie oror capabilitycapability where this is not possible.[[citation needed ]] BecausBecausee pointers allow both protected and unprotected access to memory addresses, there are risks associated with using them particularly in the latter case. Primitive pointers are often stored in a format similar to an integeran integer;;however,however, attempting to dereference or "look up" a pointer whose value was never a valid memory address would cause a program to crash. To ameliorate this potential problem, as a matter of type safety,,pointerspointers are considered a separate type to the tytypepe of data they point to, even if the underlying representation is

ch the pointers point.

Use in data structures

When setting upup datadata structures likelike listslists,, queuesqueues andand trees, it is necessary to have pointers to help manage how the structure is implemented and controlled.controlled. Typical examples of pointerspointers are start pointers, end pointers, andand stack stack pointers.pointers. These pointers can either be absolute (the actualactual physicalphysical address oror a virtuala virtual address inin virtualvirtual memory))oror relative (an(an offsetoffset fromfrom an absolute start address ("base") that typically uses fewer than a full address, but will usually require one additional arithmetic operation to resolve). A two- offset, containing a 1616-,-bit, unsigned integer,integer, can be used to provide relarelativetive addressing for up to 64 kilobytes of a . This can easily be extendedextended to 128K, 256K or 512K if the address pointed to is forced to be on a half-word, word or double-word boundary (but, requiring an additional "shift left" bitwisebitwise operation — by 1,2 or 3 bits — inin order to adjust the offset by a factor of 2,3 or 4, before its to the base address).address). Generally, though, such schemes are a lot of trouble,trouble, and for convenience to the programmer aa flat is preferred.

A one byte offset, such as the ASCIIhexadecimal ASCII valuevalue of a (e.g. X'29') can be used to point to an alternative integer value (or index) in an array (e.g. X'01'). In this way, characters can be very efficiently translated from 'raw 'raw data'' to a usable sequentialsequential indexindex andand then to an absolute address without aa lookuplookup table..

[[editedit] Use in control tables

Control tables,,thatthat are used to control programcontrol program flow,,usuallyusually make extensive use of pointers. The pointers, usually embeddeembeddedd in a table entry, may,may, for instance, be used to holdhold the entry points toto subroutinessubroutines toto be executed, based on certain conditions defineddefined in the same table entry. The pointers can however be simply indexes toto other separate, but associated, tables comprising an array of the actual addresses or the addresses themselves (depending(depending upon the programming language constructs availabavailable).le). They cancan also be used to point (back)(back) to earlier table entries (as in loop processing) or forward to skip some table entries (as in a switcha switch oror "early" exit from a loop). For this latter purpose, the "pointer" may simply be the tabletable entry number itself and can be transformed into an actual address by simple arithmetic.

[[editedit] Architectural roots

Pointers are a very thinthin abstractionabstraction onon top of the addressing capabilities provided by mostmost modernmodern architecturesarchitectures..InIn the simplest scheme, an an address,,oror a numericnumeric indexindex,,isis assigned to each unit of memory in the system, where the unit is typically either aa bytebyte oror aa wordword,, effectively transforming all of memory into a very largelarge arrayarray..Then,Then, if we have an address, the system provides an operation to retrieve the value stored in the memory unit at that address (usually utilizing the machine's generalmachine's general purpose registers))..

In the usual case, a pointer is large enough to hold more addresses than there are units of memory in the system. This introduces the possibilitypossibility that a program may attemptattempt to access an address which corresponds to no unit of memory, either because not enough memory is installed (i.e. beyond the range of available memory)memory) or the architecture does not supportsupport such addresses. The first case may, in certain platformsplatforms such as thethe IntelIntel architecture,architecture, be

called aa segmentationsegmentation fault (segfault).(segfault). The second case is possible in the currentcurrent implementation of of AMD64AMD64,,wherewhere pointers are 64 bit long and addresses only extendextend to 48 bits. There, pointers must conform to certain rules (canonical addresses), so if a noncanonicallnoncanonica pointer is dereferenced, the raises aa gegeneralneral protection fault..

On the other hand, some systems have more units of memory than there are addresses.addresses. In this case, a more complex scheme such asas memorymemory segmentationsegmentation oror pagingpaging isis employed to use different parts of the memory at different times. The last incarnations of the x86 architecture support up to 36 bits of physical memory addresses, which were mapped to the 32-bit linear address space through thethe PAEPAE pagingpaging mechanism. Thus, only 1/16 of the possible totaltotal memory may be accessed at a time. Another example in the same computer family was the 16-bit16-bit protectedprotected mode ofof the 80286the 80286 processor,processor, which, though supporting only 16 MiB of of physical memory, could accesaccesss up to 1 GiB of , but the combinationcombination of 16-bit address and segment registers made accessing more than 64 KiB in one data structure cumbersome. Some restrictions of ANSI pointer arithmetic may have been due to the segmented memory models of this processor family.

In order to provide a consistent interface, some architecturesarchitectures provide memory-mappedprovide memory-mapped I/O,, which allows some addresses to refer to units of memory while others refer toto devicedevice registers ofof other devices in the computer. There are analogousanalogous concepts such as file offsets, array indices, and remote object references that serve some of the same purposes as addresses for other types of objects. [[editedit] Uses

Pointers are directly supported without restrictions in languages such as PL/Ias PL/I,, CC,, ++C++,, PascalPascal,, and mostmost assemblyassembly languages..TheyThey are primarily used for constructing referencesconstructing references,,whichwhich in turn are fundamental to constructing nearly allall dadatata structures,,asas well as in passing data between different parts of a program.

In languages that rely heavily on lilists,sts, pointers and references are managed abstractly by the language using internal constructs likelike conscons..

When dealing with arrays, the critcriticalical lookuplookup operationoperation typically involves a stage called address calculation which involves constructing a pointer to the desireddesired data element in the array. If the data elements in the array have lengths that are divisible by powers of two, tthishis arithmetic is usually much moremore efficientefficient..PaddingPadding is frequently used as a mechanism for ensuring this is the case, despite the increased memorymemory requirement. In other data structures, such asas linkedlinked lists,,pointerspointers are used as references to explicitly tie oneone piece of the structure to another.

Pointers are used to pass parameters by reference. This is useful if the programmer wants a function's modifications to a parameter to be visible to the function's caller. This is also useful for returning multiple values from a function.

Pointers can also be used toto allocateallocate andand deallocate dynamic variables and arrays in memomemory.ry. Since a variable will often become redundant after it has served its purpose, it is a waste of memory to keep it, and therefore it is good practice to deallocate it (using the original pointer reference) when it is no longer needed. Failure to do so may result in aa memory leak (where(where

available free memory gradually, or in severe cases rapidly, diminishes because of an accumulation of numerous redundant memory blocks).

[[editedit]]CC pointers

The basic syntax to define a pointer isis::[1]

int *ptr;

This declares ptr as the identifier of an object of tthehe following type:type:

 pointer that points to an object of type int

This is usually stated more succinctly as ''ptr is a pointer to int..

Because the C language does not specspecifyify an implicit initialization for objects of automautomaticatic [2] storage durationduration,, carecare should often be taken to ensure that the addressaddress to which ptr points is valid; this is why it is sometimes suggested thatthat a pointer be explicitlyexplicitly initialized to thethe nullnull [3] pointer value,value, which is traditionally specified in C with thethe standardized macro NULL::

int *ptr = NULL;

Dereferencing a null pointer in C produces undefineundefinedd behaviobehaviorr[4],,whichwhich could be catastrophic. However, most implementations[[citation needed ]] simply halt execution of the program in question, usually with a segmentation fault.fault.

However, initializing pointers unnecessarily could hinder program analyses, thereby hiding bugs.

In any case, once a pointer has been declared, the next logical step is for it to point at something:

int a = 5; int *ptr = NULL;

ptr = &a;

This assigns the value of ptr to be the address of aa. For example, if aa is stored at memory location of 0x8130 then the value of ptr will be 0x8130 after the assignment. To dereference the pointer, an asterisk is used again:

*ptr = 8;

This means take the contents of ptr (which is 0x8130), "locate" that address in memory and its value to 8. If aa is later accessed again, its new value will be 8.

This example may be more clear if memory is examined directly. Assume that aa is located at address 0x8130 in memory and ptr at 0x8134; also assume this is a 32-bit machine such that an int is 32-bits wide. The following is what would be in memory after the following code snippet is executed:

int a = 5; int *ptr = NULL; Address Contents 0x8130 0x00000005 0x8134 0x00000000

(The NULL pointer shown here is 0x00000000.) By assigningassigning the address of aa toto ptr::

ptr = &a;

yields the following memory values:

Address Contents 0x8130 0x00000005 0x8134 0x00008130

Then by dereferencing ptr by coding:

*ptr = 8;

the computer will take the contents of ptr (which is 0x8130), 'locate' that address, and assign 8 to that location yielding the followingfollowing memory:

Address Contents 0x8130 0x00000008 0x8134 0x00008130

Clearly, accessing aa will yield the value of 8 because the previous instruction modified the contents of aa by way of the pointer ptr

Dynamic memory allocation

Pointers are used to store and manage the addresses of of dynamicallydynamically allocated blocksblocks of memory. Such blocks are used to store data objects or arrays of objects. Most structured and object-oriented languages provide an area of memory, called the heap oror free store, from which objects are ddynamicallyynamically allocated.

The example C code below illustrates how structure objects are dynamically allocated and referenced. TheThe standardstandard C library providesprovides the function function malloc() forfor allocating memory blocks from the heap. It takes the size of an object to allocate as a parameter and returns a pointer to a newly allocated block of memory suitable for storing the object, or it returns a null pointer if the allocation failed.