Backwards-Compatible Bounds Checking for Arrays and Pointers In

Backwards-compatible bounds checking for arrays and pointers in C programs

Richard W. M. Jones and Paul H. J. Kelly
Department of Computing, Imperial College of Science, Technology and Medicine, Queen's Gate, London SW BZ

Presented at AADEBUG, Linköping, Sweden.

Abstract

This paper presents a new approach to enforcing array bounds and pointer checking in the C language. Checking is rigorous in the sense that the result of pointer arithmetic must refer to the same object as the original pointer (this object is sometimes called the "intended referent"). The novel aspect of this work is that checked code can interoperate without restriction with unchecked code, without interface problems, with some effective checking, and without false alarms. This "backwards compatibility" property allows the overheads of checking to be confined to suspect modules, and also facilitates the use of libraries for which source code is not available. The paper describes the scheme and its prototype implementation (as an extension to the GNU C compiler), presents experimental results to evaluate its effectiveness, and discusses performance issues and the effectiveness of some simple optimisations.

Introduction and related work

C is unusual among programming languages in providing the programmer with the full power of pointers. Languages in the Pascal/Algol family have arrays and pointers, with the restriction that arithmetic on pointers is disallowed. Languages like BCPL allow arbitrary operations on pointers, but lack types and so require clumsy scaling by object sizes.

An advantage of the Pascal/Algol approach is that array references can be checked at run-time fairly efficiently, in fact so efficiently that there is a good case for bounds checking in production code. Bounds checking is easy for arrays because the array subscript syntax specifies both the address calculation and the array within which the resulting pointer should point.

A pointer in C can be used in a context divorced from the name of the storage region for which it is valid (its intended referent), and this has prevented a fully satisfactory bounds checking mechanism from being developed. There is overwhelming evidence that bounds checking is desirable, and a number of schemes have been presented. The main difference between our work and Kendall's bcc and Steffen's rtcc is that in our scheme the representation of pointers is unchanged. This is crucial, since it means that interoperation with non-checked modules and libraries still works, and much checking is still possible. Compared with interpretative schemes like Sabre-C, we offer the potential for much higher performance. Patil and Fischer present a sophisticated technique with very low overheads, using a second CPU to perform checking in parallel. Unfortunately, their scheme requires function interfaces to be changed to carry information about pointers, so it also has the interoperation problem.

Another approach is exemplified by the commercially available checking package Purify. Purify processes the binary representation of the software, so it can handle binary-only code. Each memory access instruction is modified to maintain a bit map of valid storage regions, and of whether each byte has been initialised. Accesses to unallocated or uninitialised locations are reported as errors. Purify catches many important bugs and is fairly efficient. However, Purify does not catch abuse of pointer arithmetic which yields a pointer to a valid region which is not the intended referent. Fischer and Patil provide evidence for the importance of this refinement.

Our goals in this paper are to describe a method of bounds checking C programs that fulfils the following criteria:

  • Backwards compatibility: the ability to mix checked code and unchecked libraries (for which the source may be proprietary or otherwise unavailable).
  • Works with all common C programming styles.
  • Rigorously rejects violations of the ANSI C standard.
  • Checks static and stack objects as well as objects dynamically allocated with malloc.
  • Understands scope of automatic variables.
  • Performance, including the ability to distribute programs with checks compiled in.

There remain some circumstances in which checking is incomplete, as we describe later; these are fairly uncommon in practice. The main shortcoming of the implementation described in this paper is that the performance is currently poor. However, the approach has fundamental performance advantages over previously published work. Because checked code interoperates easily with unchecked code, the performance penalty is confined to those modules where it is needed. Furthermore, there is substantial scope for optimisation of loop-invariant pointers and pointers which are induction variables. Because the pointer representation is unchanged, there is no residual overhead once checking code is eliminated. We return to this issue in a later section.

Overview of this paper

The next section reviews the problem of bounds checking for C, and the limitations the language places on the checking that can be done. In the following section, the new approach is introduced, and we explain how, unlike earlier schemes, our bounds checking scheme allows interoperation with unchecked code. Then we give some details of our implementation, and discuss some optimisations and their effectiveness. Finally, we discuss the effectiveness of the scheme in the light of our experience with some large and well-known C programs.

Objects, bounds checking in C and its limitations

ANSI C conveniently allows us to define an object as the fundamental unit of memory allocation. Objects are created by declarations or allocations such as those shown in the table below, which may be static, automatic (i.e. stack-allocated) or dynamically allocated. Objects are stored sequentially in memory and cannot overlap. Operations are permitted which manipulate pointers within objects, but pointer operations are not permitted to cross between two objects. There is no ordering defined between objects, and the programmer should never be allowed to make assumptions about how objects are arranged in memory.

Table: Typical objects

  int a;                 A simple variable
  int a[...];            An array
  struct {...} a;        A single record
  struct {...} a[...];   An array of records
  malloc(...)            A single unit of memory allocated with malloc

Bounds checking is not blocked or weakened by the use of a cast (i.e. type coercion). Casts can properly be used to change the type of the object to which a pointer refers, but cannot be used to turn a pointer to one object into a pointer to another. A corollary is that bounds checking is not type checking: it does not prevent storage from being declared with one data structure and used with another. More subtly, note that for this reason, bounds checking in C cannot easily validate use of arrays of structs which contain arrays in turn.

Casts and unions can be used to create a pointer from an object of any other type in a machine-dependent way. This cannot be checked using our technique, nor by earlier approaches to bounds checking, since there is no object for the pointer to be derived from.

The technique and its advantages

In this section we review earlier approaches and explain the basis for the new approach.

Earlier approaches to carrying bounds information

[Figure: Modified pointer representation (pointer, base address, extent triple). An enhanced pointer carries base, pointer and limit fields that refer to a storage object.]

In earlier work in this area, bounds information is carried with each pointer at run-time. A simple approach is to represent each pointer as a triple: the pointer together with the storage region's base address and limit (or extent). Checking is then straightforward. The larger size of pointers requires changes in storage allocation, and the code generator must be modified to copy pointers correctly. The change in pointer size can be avoided by replacing each pointer with an index into a table which contains the pointer-base-limit triple.

The net effect of both methods is the same. When the program at run-time comes to use a pointer, it must first verify that the operation that is about to be performed is correct. It uses the information about the base and size of the array or structure being pointed to in order to decide if a particular index is legal.

Unchanged pointer representation

The problem with both these schemes is that the modified pointer representation is not interpreted correctly by code compiled without bounds checking enabled. This is a problem wherever a pointer is passed to or from an unchecked procedure, whether as a parameter, a result or in a global variable. It is of course often possible to translate pointers where necessary (called encapsulation in bcc and rtcc), but this is inconvenient and difficult to do reliably (e.g. where a function pointer may refer either to a checked or an unchecked routine). Because of these difficulties, in rtcc only operating system calls are encapsulated; all libraries

Every valid pointer-valued expression in C derives its result from exactly one original storage object. If the result of the pointer calculation refers to a different object, it is invalid.

Although it is sometimes useful to know where an invalid pointer has been calculated, reporting every instance can yield many false alarms. We therefore replace such incorrectly-derived pointers with a pointer value which is always invalid, called ILLEGAL (defined as a void pointer constant in our implementation). This ensures that a bounds error is reported when the pointer is actually used.

Example: pointers to objects

[Figure: objects in memory, with dead space between objects.]
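The ILLEGAL idea can be illustrated with a minimal sketch in C. This is not the authors' GNU C extension: the helper names `checked_add` and `checked_load` are hypothetical, the caller passes the object's base and size explicitly instead of the run-time looking them up, and the sentinel is the address of a private dummy byte, not the constant used in the paper. The point shown is only that out-of-object arithmetic silently yields ILLEGAL, and that an error is reported when that pointer is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical sentinel standing in for the paper's ILLEGAL value. */
static char illegal_sentinel;
#define ILLEGAL ((void *)&illegal_sentinel)

/* Hypothetical helper: add `offset` bytes to `p`, which must point into
   the object [base, base + size) or one past its end. A result outside
   that range is replaced by ILLEGAL; nothing is reported yet. */
static void *checked_add(char *base, size_t size, char *p, ptrdiff_t offset)
{
    assert(base != NULL);
    if (p == ILLEGAL)
        return ILLEGAL;              /* ILLEGAL stays ILLEGAL */
    ptrdiff_t pos = (p - base) + offset;
    if (pos < 0 || (size_t)pos > size)
        return ILLEGAL;              /* left the intended referent */
    return base + pos;               /* one-past-the-end is allowed */
}

/* Hypothetical helper: the error is reported only at the point of use.
   Returns 1 and stores the byte in *out on success, 0 on a bounds error. */
static int checked_load(const char *base, size_t size, const char *p, char *out)
{
    uintptr_t lo = (uintptr_t)base, addr = (uintptr_t)p;
    if (p == ILLEGAL || addr < lo || addr - lo >= size) {
        fprintf(stderr, "bounds error: invalid pointer used\n");
        return 0;
    }
    *out = *p;
    return 1;
}
```

Because the checked pointer is still an ordinary `char *`, it could be handed to unchecked code unchanged, which is the property the unchanged representation is meant to preserve.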

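For comparison, the pointer, base and limit triple used by the earlier schemes might look roughly like the following sketch. The type `struct fat_ptr` and the helpers `fat_make` and `fat_index_ok` are hypothetical names chosen for illustration and are not taken from bcc or rtcc. The sketch shows how the base and extent decide whether an index is legal, and why the representation is larger than a plain pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "enhanced" pointer: the pointer plus its region's bounds. */
struct fat_ptr {
    char *base;   /* first byte of the storage object */
    char *ptr;    /* current pointer value */
    char *limit;  /* one past the last byte of the storage object */
};

/* Build an enhanced pointer to the start of an object of `size` bytes. */
static struct fat_ptr fat_make(void *obj, size_t size)
{
    struct fat_ptr fp;
    assert(obj != NULL);
    fp.base = (char *)obj;
    fp.ptr = fp.base;
    fp.limit = fp.base + size;
    return fp;
}

/* Decide whether element `index` (of `elem_size` bytes each), counted from
   the current pointer, lies wholly inside the storage object. */
static int fat_index_ok(struct fat_ptr fp, ptrdiff_t index, size_t elem_size)
{
    ptrdiff_t start = fp.ptr - fp.base;      /* current offset in bytes */
    ptrdiff_t extent = fp.limit - fp.base;   /* object size in bytes */
    ptrdiff_t off = start + index * (ptrdiff_t)elem_size;
    return off >= 0 && off + (ptrdiff_t)elem_size <= extent;
}
```

A `struct fat_ptr` occupies three machine pointers, so code compiled without checking, which expects a single `char *`, cannot interpret it. This is the interoperation problem that the unchanged representation avoids.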