Understanding and Effectively Preventing the ABA Problem in Descriptor-Based Lock-Free Designs

Damian Dechev, Sandia National Laboratories, Scalable Computing R & D Department, Livermore, CA 94551-0969, [email protected]
Peter Pirkelbauer, Texas A&M University, Department of Computer Science, College Station, TX 77843-3112, [email protected]
Bjarne Stroustrup, Texas A&M University, Department of Computer Science, College Station, TX 77843-3112, [email protected]

Abstract

An increasing number of modern real-time systems and the nowadays ubiquitous multicore architectures demand the application of programming techniques for reliable and efficient concurrent synchronization. Some recently developed Compare-And-Swap (CAS) based nonblocking techniques hold the promise of delivering practical and safer concurrency. The ABA¹ problem is a fundamental problem for many CAS-based designs. Its significance has increased with the suggested use of CAS as a core atomic primitive for the implementation of portable lock-free algorithms. The ABA problem's occurrence is due to the intricate and complex interactions of the application's concurrent operations and, if not remedied, ABA can significantly corrupt the semantics of a nonblocking algorithm. The current state of the art leaves the elimination of the ABA hazards to the ingenuity of the software designer. In this work we provide the first systematic and detailed analysis of the ABA problem in lock-free Descriptor-based designs. We study the semantics of Descriptor-based lock-free data structures and propose a classification of their operations that helps us better understand the ABA problem and subsequently derive an effective ABA prevention scheme. Our ABA prevention approach outperforms the alternative CAS-based ABA prevention schemes by a large factor. It offers speeds comparable to the use of the architecture-specific CAS2 instruction used for version counting. We demonstrate our ABA prevention scheme by integrating it into an advanced nonblocking data structure, a lock-free dynamically resizable array.

Keywords: ABA problem, nonblocking synchronization, lock-free programming techniques

¹ ABA is not an acronym; it is shorthand for stating that a value at a shared location can change from A to B and then back to A.

1 Introduction

The modern ubiquitous multi-core architectures demand the design of programming libraries and tools that allow fast and reliable concurrency. In addition, providing safe and efficient concurrent synchronization is of critical importance to the engineering of many modern real-time systems. Lock-free programming techniques [11] have been demonstrated to be effective in delivering performance gains and preventing some hazards typically associated with the application of mutual exclusion, such as deadlock, livelock, and priority inversion [5], [2]. The ABA problem is a fundamental problem for many CAS-based nonblocking designs. Avoiding the hazards of ABA imposes an extra challenge on a lock-free algorithm's design and implementation. To the best of our knowledge, the literature does not offer an explicit and detailed analysis of the ABA problem, its relation to the most commonly applied nonblocking programming techniques (such as the use of Descriptors) and correctness guarantees, and the possibilities for its avoidance. Thus, at present, eliminating the hazards of ABA in a nonblocking algorithm is left to the ingenuity of the software designer. In this work we study in detail and define the conditions that lead to ABA in a nonblocking Descriptor-based design. Based on our analysis, we define a generic and practical condition, called the λδ approach, for ABA avoidance in a lock-free Descriptor-based linearizable design (Section 4). We demonstrate the application of our approach by incorporating it in a complex and advanced nonblocking data structure, a lock-free dynamically resizable array (vector) [2]. The ISO C++ Standard Template Library [17] vector offers a combination of dynamic memory management and constant-time random access. We survey the literature for other known ABA prevention techniques (usually described as part of a nonblocking algorithm's implementation) and study in detail three known solutions to the ABA problem (Sections 2.1 and 2.3). Our performance evaluation (Section 5) establishes that the single-word CAS-based λδ approach is fast, efficient, and practical.

2 The ABA Problem

The Compare-And-Swap (CAS) atomic primitive (commonly known as Compare and Exchange, CMPXCHG, on the Intel x86 and Itanium architectures [12]) is a CPU instruction that allows a processor to atomically test and modify a single-word memory location. The application of a CAS-controlled speculative manipulation of a shared location (Li) is a fundamental programming technique in the engineering of nonblocking algorithms [11] (an example is shown in Algorithm 1).

Algorithm 1 CAS-based speculative manipulation of Li
1: repeat
2:   value_type Ai = ^Li
3:   value_type Bi = fComputeB
4: until CAS(Li, Ai, Bi) == Bi

In our pseudocode we use the symbols ^, &, and : to indicate pointer dereferencing, obtaining an object's address, and integrated pointer dereferencing and field access, respectively. When the value stored at Li is the target value of a CAS-based speculative manipulation, we call Li and ^Li the control location and the control value, respectively. We indicate the control value's type with the string value_type. The size of value_type must be equal to or less than the maximum number of bits that a hardware CAS instruction can exchange atomically (typically the size of a single memory word). In the most common cases, value_type is either an integer or a pointer value. In Algorithm 1, the function fComputeB yields the new value, Bi, to be stored at Li.

Definition 1: The ABA problem is a false positive execution of a CAS-based speculation on a shared location Li.

Table 1. ABA at Li
Step 1: P1 reads Ai from Li
Step 2: Pk interrupts P1; Pk stores the value Bi into Li
Step 3: Pj stores the value Ai into Li
Step 4: P1 resumes; P1 executes a false positive CAS

As illustrated in Table 1, ABA can occur if a process P1 is interrupted at any time after it has read the old value (Ai) and before it attempts to execute the CAS instruction from Algorithm 1. An interrupting process (Pk) might change the value at Li to Bi. Afterwards, either Pk or any other process Pj ≠ P1 can eventually store Ai back to Li. When P1 resumes, its CAS loop succeeds (a false positive execution) despite the fact that Li's value has been manipulated in the meantime.

Definition 2: A nonblocking algorithm is ABA-free when its semantics cannot be corrupted by the occurrence of ABA. ABA-freedom is achieved when: a) the occurrence of ABA is harmless to the algorithm's semantics, or b) ABA is avoided. The former scenario is uncommon and strictly specific to the algorithm's semantics. The latter scenario is the general case, and in this work we focus on providing details of how to eliminate ABA.
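
To make the scenario concrete, the following C++ sketch (not code from the paper) renders the CAS loop of Algorithm 1 with standard atomics and marks where the window of Table 1 opens. The names Li, value_type, and fComputeB mirror the pseudocode; the choice of long as value_type, the function name cas_speculative_manipulation, and the "+ 1" body of fComputeB are illustrative assumptions only.

#include <atomic>

// Minimal sketch of Algorithm 1 with C++11 atomics (illustrative only).
using value_type = long;                // word-sized control value
std::atomic<value_type> Li{0};          // shared control location Li

value_type fComputeB(value_type Ai) {   // yields the new value Bi
    return Ai + 1;                      // placeholder computation
}

void cas_speculative_manipulation() {
    value_type Ai, Bi;
    do {
        Ai = Li.load();                 // Step 1 (Table 1): read Ai from Li
        Bi = fComputeB(Ai);
        // ABA window: before the CAS below runs, other threads may store
        // Bi into Li and later store Ai back (Steps 2-3, Table 1).
    } while (!Li.compare_exchange_strong(Ai, Bi));
    // Step 4: the CAS compares only the current value against Ai, so it
    // succeeds even after an A -> B -> A history (a false positive).
}

The sketch shows why CAS alone cannot distinguish "unchanged" from "changed and restored": it compares values, not histories.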

2.1 Known ABA Avoidance Techniques I

A general strategy for ABA avoidance is based on the fundamental guarantee that no process Pj (Pj ≠ P1) can possibly store Ai again at location Li (Step 3, Table 1). One way to satisfy such a guarantee is to require all values stored in a given control location to be unique. To enforce this uniqueness invariant we can place a constraint on the user and request each value stored at Li to be used only once (Known Solution 1). For a large majority of concurrent algorithms, enforcing uniqueness typing would not be a suitable solution since their applications imply the usage of a value or reference more than once.

An alternative approach to satisfying the uniqueness invariant is to apply a version tag attached to each value. The usage of version tags is the most commonly cited solution for ABA avoidance [6]. The approach is effective, when it is possible to apply, but suffers from a significant flaw: a single-word CAS is insufficient for the atomic update of a word-sized control value and a word-sized version tag. An effective application of a version tag [3] requires the hardware architecture to support a more complex atomic primitive that allows the atomic update of two memory locations, such as CAS2 (compare-and-swap two co-located words) or DCAS (compare-and-swap two memory locations). The availability of such atomic primitives might lead to much simpler, elegant, and efficient concurrent designs (in contrast to a CAS-based design). It is not desirable to suggest a CAS2/DCAS-based ABA solution for a CAS-based algorithm, unless the implementor explores the optimization possibilities of the algorithm upon the availability of CAS2/DCAS. A proposed hardware implementation (entirely built into a present cache coherence protocol) of an innovative Alert-On-Update (AOU) instruction [16] has been suggested by Spear et al. to eliminate the CAS deficiency of allowing ABA. Some suggested approaches [15] split a version counter into two half-words (Known Solution 2): a half-word used to store the control value and a half-word used as a version tag. Such techniques lead to severe limitations on the addressable memory space and the number of possible writes into the shared location. To guarantee the uniqueness invariant of a control value of type pointer in a concurrent system with dynamic memory usage, we face an extra challenge: even if we write a pointer value no more than once in a given control location, the memory al-
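
As an illustration of Known Solution 2 above, the C++ sketch below (an assumed example, not code from the paper) packs a 32-bit control value and a 32-bit version tag into one 64-bit word so that a single-word CAS updates both together; the 32/32 split and the name store_with_tag are assumptions, and the split is exactly what restricts the representable values and the number of writes.

#include <atomic>
#include <cstdint>

// Known Solution 2 sketch: the low half-word holds the control value, the
// high half-word a version tag; one 64-bit CAS updates both atomically.
static std::atomic<uint64_t> Li{0};     // packed control location Li

static uint64_t pack(uint32_t value, uint32_t tag) {
    return (static_cast<uint64_t>(tag) << 32) | value;
}

void store_with_tag(uint32_t new_value) {
    uint64_t observed = Li.load();
    uint64_t desired;
    do {
        uint32_t tag = static_cast<uint32_t>(observed >> 32);
        desired = pack(new_value, tag + 1);   // every write bumps the tag
        // Even if the value half has gone A -> B -> A in the meantime, the
        // tag half differs, so the CAS fails instead of succeeding falsely.
    } while (!Li.compare_exchange_weak(observed, desired));
}

After 2^32 successful writes the tag wraps around and the protection lapses, which is one concrete form of the write-count limitation noted above; the halved value width likewise restricts the addressable range when the control value is a pointer.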