Efficient Subtyping Tests with PQ-Encoding


JOSEPH (YOSSI) GIL¹ and YOAV ZIBIN
Technion – Israel Institute of Technology

Given a type hierarchy, a subtyping test determines whether one type is a direct or indirect descendant of another type. Such tests are a frequent operation during the execution of object-oriented programs. The implementation challenge lies in finding a space-efficient encoding of the type hierarchy that simultaneously permits efficient subtyping tests. We present a new scheme for encoding multiple and single inheritance hierarchies which, on the standard benchmark hierarchies, yields a smaller footprint than all previously published schemes. Our scheme is called PQ-encoding (PQE) after PQ-trees, a data structure previously used in graph theory for finding the orderings that satisfy a collection of constraints. In particular, we show that in the traditional object layout model, the extra memory requirement for single inheritance hierarchies is zero. In PQE, subtyping tests take constant time and use only two comparisons. The encoding creation time of PQE also compares favorably with previous results: it is less than a second on all standard benchmarks on a contemporary architecture, while the average time for processing a type is less than one millisecond. However, PQE is not an incremental algorithm. In addition to PQ-trees, PQE employs several novel optimization techniques. These techniques can also be applied to improve the performance of other, previously published encoding schemes.
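As a rough illustration of what a constant-time, two-comparison test can look like, the sketch below numbers the types of a single-inheritance tree in preorder and answers "is a a subtype of b" by checking whether the number of a falls inside the interval covered by the subtree of b. This is the classical relative-numbering idea for trees, used here only as an assumed stand-in; it is not the PQE construction itself, and the names RangeSubtypeSketch, addType, encode, and isSubtype are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: preorder numbering of a single-inheritance tree. */
final class RangeSubtypeSketch {
    private final Map<String, List<String>> children = new HashMap<>();
    private final Map<String, Integer> id = new HashMap<>();   // preorder number of each type
    private final Map<String, Integer> last = new HashMap<>(); // largest number in its subtree
    private int counter = 0;

    /** Registers a type; parent is null for the root. */
    void addType(String name, String parent) {
        children.putIfAbsent(name, new ArrayList<>());
        if (parent != null) {
            children.computeIfAbsent(parent, k -> new ArrayList<>()).add(name);
        }
    }

    /** Computes the encoding in one depth-first pass starting at the root. */
    void encode(String root) {
        counter = 0;
        id.clear();
        last.clear();
        number(root);
    }

    private void number(String type) {
        id.put(type, counter++);
        for (String child : children.get(type)) {
            number(child);
        }
        last.put(type, counter - 1);
    }

    /** The test itself: two integer comparisons, independent of hierarchy size. */
    boolean isSubtype(String a, String b) {
        int ia = id.get(a);
        return id.get(b) <= ia && ia <= last.get(b);
    }

    public static void main(String[] args) {
        RangeSubtypeSketch h = new RangeSubtypeSketch();
        h.addType("Object", null);
        h.addType("A", "Object");
        h.addType("B", "A");
        h.addType("C", "Object");
        h.encode("Object");
        System.out.println(h.isSubtype("B", "A")); // true
        System.out.println(h.isSubtype("C", "A")); // false
    }
}
```

When the supertype b is known at compile time, its two interval bounds could be emitted as constants in the test code, leaving only the load of the number of a at run time.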
Categories and Subject Descriptors: D.1.5 [Programming Techniques]: Object-oriented Programming; D.3.3 [Programming Languages]: Language Constructs and Features – Data types; structures; G.4 [Mathematical Software]: Algorithm design; analysis

General Terms: Algorithms, Design, Measurement, Performance, Theory

Additional Key Words and Phrases: Casting, Encoding, Hierarchy, Inheritance, Partially Ordered Sets, PQ, PQE, Subtyping, Type inclusion

1. INTRODUCTION

One of the basic operations in the runtime environment of object-oriented (OO) programs is a subtyping test. Given an object o and a type b, a subtyping test determines whether a, the runtime type of o, is a subtype of b, i.e., a is a direct or indirect descendant of b in the inheritance hierarchy. These subtyping tests (also known as type inclusion tests) are carried out at runtime, and are distinct from the static subtyping tests done by the language type checker.

¹ Research supported in part by the generous funding of the Israel Science Foundation, grant No. 128/02. Authors' address: Technion City, Haifa 32000, Israel. Contact the authors for patent information. A preliminary version of this paper was published in the proceedings of the 16th Annual Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'01).

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. © 2001 ACM 0164-0925/99/0100-0111 $00.75

ACM Transactions on Programming Languages and Systems, Vol. ?, No. ?, January 2005, Pages 1–36.

1.1 Subtyping Tests in OO Languages

A programmer may apply a subtyping test explicitly using dedicated constructs such as JAVA's [Arnold and Gosling 1996] instanceof, and SMALLTALK's [Goldberg 1984] isKindOf: method. In addition, several other language constructs are implemented using these tests. For example, subtyping tests are implicit in the execution of type cast operations, e.g., ?= in the EIFFEL programming language [Meyer 1992] and dynamic_cast in C++ [Stroustrup 1997]. In JAVA, for example, the lack of parametric polymorphism is a common source of such casts. When extracting an element o from a collection class, e.g., Vector, the type of o is Object. Therefore, it is typically necessary to cast o to the expected type.

Also, the covariant nature of array subtyping in JAVA renders subtyping tests necessary in assignments to elements of arrays whose dynamic type is unknown. Consider for example the following code fragment:

    void f(Object x[]) { x[1] = new A(); }

It may be a bit surprising that the assignment to x[1] in f requires a subtyping test. To understand why, suppose that function f was invoked with a value of type B[] (an array of elements of type B) as the actual argument x, e.g.,

    f(new B[3]);

This invocation is correct because type B[] is a subtype of Object[] (an array of objects). However, the assignment x[1] = new A(); is legal only if type A is a subtype of type B. Otherwise, the runtime environment throws an ArrayStoreException.

Yet another construct which is implemented using subtyping tests is covariant overriding of arguments in EIFFEL. The compiler typically inserts subtyping tests in conjunction with calls to methods that use this feature in order to ensure type-safety².

Finally, we note that subtyping tests may also be part of the implementation of exception handling in JAVA, C++, and other languages. The following code excerpt is an example of a try block, followed by a catch clause and block.
    try { f(); ... } catch (B b) { ... }

Suppose that an object o is thrown from the try block, e.g., from within function f(). Then the program should execute the catch block, but only if type B is a supertype of the thrown object's type. Thus, the catch clause can be implemented with a subtyping test. In JAVA, the subtyping test is with respect to the dynamic type of o, while in C++ it is with respect to the static type of o. However, in both cases, this test must be carried out at runtime.

² Various mechanisms of static analysis have been proposed to eliminate this requirement, but none of these have been implemented.

1.2 Problem Definition

The problem we deal with is defined formally as follows. A hierarchy is a partially ordered set $(\mathcal{T}, \preceq)$ where $\mathcal{T}$ is a set of types³ and $\preceq$ is a reflexive, transitive and anti-symmetric subtype relation. If a and b are types, and $a \preceq b$ holds, we say that they are comparable, that a is a subtype of b and that b is a supertype of a. Given a hierarchy $(\mathcal{T}, \preceq)$, $|\mathcal{T}| = n$, the subtyping problem is to build a data structure supporting queries of the sort $a \preceq b$. This data structure is called an encoding of the hierarchy.

The subtyping problem has enjoyed considerable attention recently (see e.g., [Kaci et al. 1989; Agrawal et al. 1989; Caseau 1993; Habib and Nourine 1994; Capelle 1994; Fall 1995; 1996; Krall et al. 1997; Vitek et al. 1997; Habib et al. 1999; van Bommel and Beck 2000; Raynaud and Thierry 2001; Filman 2002; Palacz and Vitek 2003; Corsaro and Cytron 2003]). The challenge of implementing subtyping tests is to simultaneously optimize its four complexity measures:

(1) Space. Encoding methods associate certain data with each type. We measure the average number of bits per type, also called the encoding length. Note that we do not include in the measure the space consumed by each object.
Although the space overhead per object depends on the object layout model, it is usually assumed that each object includes a pointer to a type information record.

(2) Instruction count. This is the number of machine instructions in the test code, on a certain hardware architecture. There are indications [Vitek et al. 1997] that the space consumed by the test code, which can appear many times in a program, can dominate the encoding length. An encoding is said to be uniform⁴ if there exists an implementation of the test code in which the instruction count does not depend on the size of the hierarchy. Only uniform encodings will interest us.

(3) Test time. The time complexity of the test code is of major interest. Since the test code might contain loops, the time complexity may not be constant even in uniform encodings. Our main concern here is encodings with constant-time tests (which are always uniform). To improve timing performance, the loops of non-constant-time tests may be unrolled, giving rise to a non-constant instruction count, without violating the uniformity condition. (Bit-vector encoding, presented in Sec. 4, is an example of a uniform encoding which is not constant-time.) In explicit subtyping tests and in type casts, the supertype b is known at compile time. Therefore, the generated subtyping test code can be specialized, by pre-computing values that depend only on b and using them in the test code. Specialization thus benefits both instruction count and test time, and may even reduce the encoding length.

(4) Encoding creation time. Another important complexity measure is the time for generating the actual encoding, which can be large. It is essential that a compiler be able to finish its computation in a reasonable time. In many cases, this is not possible. For example, the problem of finding an optimal bit-vector encoding was proved to be intractable [Habib and Nourine 1994]. Heuristics of bit vector encoding [Kaci et al.
³ The distinction between type, class, interface, signature, etc., as it may occur in various languages does not concern us here. We shall refer to all these collectively as types.

⁴ The term is borrowed from circuit complexity. A family of circuits for the size-dependent incarnations of a certain problem is called uniform if this family can be generated by a single Turing machine.
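The problem definition of Section 1.2 can be made concrete with a deliberately naive encoding: store, for every type, the set of all its supertypes as a bit set, so that a query reduces to a single bit lookup. The sketch below is only a baseline under stated assumptions. It is neither PQE nor necessarily the bit-vector encoding the paper presents in Sec. 4, it spends n bits per type, and the names AncestorBitsSketch, declare, and isSubtype are hypothetical. It assumes each type is declared once, after all of its parents.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Illustrative baseline only: one ancestor bit set per type (n bits per type). */
final class AncestorBitsSketch {
    private final Map<String, Integer> index = new HashMap<>();    // type -> bit position
    private final Map<String, BitSet> ancestors = new HashMap<>(); // type -> its supertypes

    /** Declares a type; every parent must already have been declared. */
    void declare(String name, String... parents) {
        int position = index.size();
        index.put(name, position);
        BitSet bits = new BitSet();
        bits.set(position); // reflexivity: every type is a subtype of itself
        for (String parent : parents) {
            BitSet parentBits = ancestors.get(parent);
            if (parentBits == null) {
                throw new IllegalArgumentException("undeclared parent: " + parent);
            }
            bits.or(parentBits); // transitivity: inherit all supertypes of the parent
        }
        ancestors.put(name, bits);
    }

    /** Answers the query "a is a subtype of b" with a single bit lookup. */
    boolean isSubtype(String a, String b) {
        return ancestors.get(a).get(index.get(b));
    }

    public static void main(String[] args) {
        AncestorBitsSketch h = new AncestorBitsSketch();
        h.declare("Object");
        h.declare("A", "Object");
        h.declare("B", "Object");
        h.declare("D", "A", "B"); // multiple inheritance
        h.declare("E", "B");
        System.out.println(h.isSubtype("D", "B")); // true
        System.out.println(h.isSubtype("E", "A")); // false
    }
}
```

The test here is constant-time and handles multiple inheritance, but the encoding length grows linearly with the number of types, which is the cost the space measure above is meant to capture.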