Performance Optimization Strategies for Transactional Memory Applications

Total Page:16

File Type:pdf, Size:1020Kb

Performance Optimization Strategies for Transactional Memory Applications Performance Optimization Strategies for Transactional Memory Applications zur Erlangung des akademischen Grades eines Doktors der Ingenieurwissenschaften von der Fakultät für Informatik des Karlsruher Instituts für Technologie (KIT) genehmigte Dissertation von Martin Otto Schindewolf aus Eschwege Tag der mündlichen Prüfung: 19. April 2013 Erster Gutachter: Prof. Dr. Wolfgang Karl Zweiter Gutachter: Prof. Dr. Albert Cohen KIT – Universität des Landes Baden-Württemberg und nationales Forschungszentrum der Helmholtz-Gesellschaft www.kit.edu Ich versichere hiermit wahrheitsgemäß, die Arbeit bis auf die dem Aufgabensteller bereits bekannte Hilfe selbständig angefertigt, alle benutzten Hilfsmittel vollständig und genau angegeben und alles kenntlich gemacht zu haben, was aus Arbeiten anderer unverändert oder mit Abänderung entnommen wurde. Karlsruhe, den 4. März 2013 Martin Schindewolf Abstract Transactional Memory (TM) has been proposed as an architectural extension to enable lock-free data structures. With the ubiquity of multi-core systems, the idea of TM gains new momentum. The motivation for the invention of TM was to simplify the synchronization of parallel threads in a shared memory system. TM features optimistic concurrency as opposed to the pessimistic concurrency with traditional locking. This optimistic approach lets two transactions execute in parallel and assumes that there is no data race. In case of a data race, e.g., both transactions write to the same address, this conflict must be detected and resolved. Therefore a TM run time system monitors shared memory accesses inside transactions. These TM systems can be implemented in software (STM), hardware (HTM) or as a hybrid combination of both. Most of the research in TM focuses on language extensions, compiler support, and the optimization of algorithmic details of TM systems. Each of the resulting TM systems has a different performance characteristic and comes with a number of parameters that can be tuned. This optimization space overwhelms the application developer who has been the target audience for the invention of TM. In contrast to other research activities, this thesis proposes TM tools that are candidates to bridge the gap between the application developer and the designer of the TM system. These tools for TM provide insights in the run time behavior of the TM application and guide the application developer to optimize the TM application. In contrast to related work presenting tools for a specific TM system coupled with a programming language, this thesis presents solutions that cover multiple TM systems (Software, Hardware, and hybrid TM) and account for different parameter settings in each of the TM systems. Moreover, the proposed methods and strategies use information of all different layers of the TM software stack. Therefore, static information, information about the run time behavior, and expert-level knowledge need to be extracted to develop these new methods and strategies for the optimization of TM applications. Extracting and using this information poses the following challenges that need to be addressed. The first challenge is to capture the genuine TM application’s behavior at run time. This is especially important because transactions are sensitive to artificially introduced delays of a thread which may cause an introduced TM conflict. This may lead to a recorded application’s behavior that is biased through the tracing machinery. Therefore, a lightweight trace generation scheme with a small probe-effect needs to be developed and evaluated. This thesis presents similar solutions for STM and hybrid TM that generate event traces. For STM, the logging of frequently occurring events, such as TM loads and stores, quickly saturates the write bandwidth of the hard disk. Thus, a reduction of the amount of data to be written to hard disk needs to be achieved. Therefore, we research how these events can be compressed online without disturbing the application. A multi-threaded trace compression scheme is designed, implemented, and evaluated. We enhance the TinySTM, a word-based open source STM, with tracing facilities. Due to the lightweight and compressing tracing i ii scheme, the resulting TracingTinySTM generates event traces that capture the genuine TM application’s behavior. For an FPGA-based hybrid TM system, called TMbox, a low-overhead tracing solution is shown. Each processing element is enhanced with an event generation unit for monitoring. When an instruction of interest is encountered, a corresponding time-stamped event is generated and passed on to the log unit. This unit uses idle times on the bus to transfer it to the memory of the host machine. Although all methods are tailored to and applied in the context of TM, they are universally applicable in a different context. Example application areas are the monitoring of arbitrary events in an FPGA-based processor prototype or the logging and compressing of events in a math library. The second challenge arises with the release of IBM’s new Blue Gene/Q architecture with HTM support. We establish a set of best practices for the new architecture so that the application developer can exploit the full potential of the architecture including the TM subsystem. For this purpose, we introduce a new benchmark, CLOMP-TM, that has a series of parameters that allow us to explore varying transaction granularities and conflict rates, coupled with typical computational kernels found in scientific codes. This enables new insights through evaluating the TM system and comparing the performance with other synchronization primitives of OpenMP. Further, this new BG/Q architecture also requires tool support to enable optimizations of scientific applications. A trace-based solution to capture the run time behavior on a per event basis is not feasible for this proprietary HTM system. This HTM system requires a complete software stack, including compiler and run time system. Thus, we design, implement and evaluate three different tools for TM that enable programmers to explore the subtleties of TM execution and contribute the following to the state-of-the-art: • the first tool that profiles applications using MPI and OpenMP with TM on BG/Q, • a tracing tool for TM that enables in-depth inspection of thread-level execution and utilization of the architecture through visualization with the state-of-the-art visualization tool Vampir, • a tool that measures overheads associated with TM, designed to dissect these over- heads and direct optimization efforts for the TM stack. These tools uncover the subtle interaction of the TM system and the prefetching on BG/Q and help to study the implications for designing applications and choosing the TM mode of execution. Moreover, these tools enable to obtain a comprehensive understanding of the performance of synchronization mechanisms in LULESH, a Lagrange hydrodynamics proxy application, and find the cause for the missing performance with TM. The third challenge is to correlate the gathered TM characteristics with microarchitecture events for the optimization of TM applications using STMs. Although this correlation is not new by itself, it is extremely helpful and has not been researched extensively in this context. Besides choosing well-suited parameters to monitor the microarchitecture, the correct interpretation of the obtained values is of key importance. To simplify the optimization for humans, the Visualization Of TM Applications (VisOTMA) framework is developed that visualizes these traces and enables the programmer to identify frequently conflicting transactions and the corresponding values of microarchitecture parameters in a post-processing step. The visualization of this aggregated data needs to be achieved in a way that an inexperienced as well as an experienced programmer can identify pathological execution patterns – posing ii iii the fourth challenge. So far optimizations have been carried out by TM experts with excellent knowledge of the underlying TM system. An inexperienced programmer does not have or may not be willing to acquire the knowledge of the TM system. Thus, the inexperienced programmer follows a trial-and-error strategy to optimize the TM application. To speedup this process, we invent EigenOpt – an exploration tool based on EigenBench – as part of the VisOTMA framework. With the help of EigenOpt, any programmer can capture the TM characteristic of the application in terms of parameters for Eigenbench. These parameters combined with Eigenbench are straightforwardly used to explore the space available through optimizations. With this tool, unrewarding optimization directions can be excluded without modifying the application. In this thesis, we will research how to identify and avoid optimization attempts with diminishing returns. This will speedup the optimization process for an inexperienced programmer and also yield new insights for an experienced one. The fifth challenge is to detect and exploit a potential phase behavior of TM applications and integrate this analysis in the VisOTMA framework. In case the behavior of the TM application has periods with high and low conflict probability, this behavior of the application can be detected and exploited. Exploiting these phases is motivated through the different proposed TM designs: optimistic conflict detection schemes detect conflicts at commit-time whereas pessimistic schemes check for conflicts at encounter-time. In the optimistic case, a conflict early in the execution of a transaction is noticed at commit time so
Recommended publications
  • Mull It Over: Mutation Testing Based on LLVM
    Mull it over: mutation testing based on LLVM Alex Denisov Stanislav Pankevich Independent researcher Independent researcher Berlin, Germany Berlin, Germany Email: [email protected] Email: [email protected] Abstract—This paper describes Mull, an open-source tool for framework [4]. It uses two components of LLVM: IR, its low- mutation testing based on the LLVM framework. Mull works level intermediate language, to perform mutations and JIT with LLVM IR, a low-level intermediate representation, to per- for runtime compilation and execution of a tested program form mutations, and uses LLVM JIT for just-in-time compilation. This design choice enables the following two capabilities of Mull: and its mutated counterparts. LLVM IR is also referred to language independence and fine-grained control over compilation as LLVM Bitcode or simply as bitcode. We use these terms and execution of a tested program and its mutations. Mull interchangeably. can work with code written in any programming language that supports compilation to LLVM IR, such as C, C++, Rust, or We consider the following criteria important for a practical Swift. Direct manipulation of LLVM IR allows Mull to do less implementation of mutation testing tool: the tool must be work to generate mutations: only modified fragments of IR code fast, configurable and easy to set up and use. The tool are recompiled, and this results in faster processing of mutated should allow smooth integration with build tools. The tool programs. To our knowledge, no existing mutation testing tool provides these capabilities for compiled programming languages. should be ready for use in mutation testing analysis of real- We describe the algorithm and implementation details of Mull, world production and open source projects.
    [Show full text]
  • A Comparative Survey on Flight Software Frameworks for 'New Space'
    https://doi.org/10.5028/jatm.v11.1081 ORIGINAL PAPER xx/xx A Comparative Survey on Flight Software Frameworks for ‘New Space’ Nanosatellite Missions Danilo José Franzim Miranda1,2,*, Maurício Ferreira3, Fabricio Kucinskis1, David McComas4 How to cite Miranda DJF https://orcid.org/0000-0002-9186-1740 Miranda DJF; Ferreira M; Kucinskis F; McComas D (2019) A Ferreira M https://orcid.org/0000-0002-6229-9453 Comparative Survey on Flight Software Frameworks for ‘New Kucinskis F https://orcid.org/0000-0001-6171-761X Space’ Nanosatellite Missions. J Aerosp Technol Manag, 11: e4619. https://doi.org/10.5028/jatm.v11.1081 McComas D https://orcid.org/0000-0002-2545-5015 ABSTRACT: Nanosatellite missions are becoming increasingly popular nowadays, especially because of their reduced cost. Therefore, many organizations are entering the space sector due to the paradigm shift caused by nanosatellites. Despite the reduced size of these spacecrafts, their Flight Software (FSW) complexity is not proportional to the satellite volume, thus creating a great barrier for the entrance of new players on the nanosatellite market. On the other side, there are some available frameworks that can provide mature FSW design approaches, implying in considerable reduction in software project timeframe and cost. This paper presents a comparative survey between six relevant fl ight software frameworks, compared according to commonly required ‘New Space’ criteria, and fi nally points out the most suitable one to the VCUB1 reference nanosatellite mission. KEYWORDS: Flight Software, On-Board Software, NASA cFS, New Space. INTRODUCTION Flight Soft ware (FSW) is soft ware that runs on a processor embedded in a spacecraft ’s avionics.
    [Show full text]
  • Integrated Solution for Rapid Development of Complex Gnc Software
    INTEGRATED SOLUTION FOR RAPID DEVELOPMENT OF COMPLEX GNC SOFTWARE Jean-Sébastien Ardaens (1), Gabriella Gaias (2) (1) German Space Operations Center (DLR/GSOC), 82234 Wessling, Germany, [email protected] (2) German Space Operations Center (DLR/GSOC), 82234 Wessling, Germany, [email protected] ABSTRACT low altitude (500km) with very different ballistic coefficients, it has to be avoided to fix problems - which The paper describes the integrated software solution could have been detected before - after establishing the retained for the design and development of the formation because of the high propellant cost required AVANTI experiment, a challenging on-board to maintain the formation. autonomous formation-flying endeavour to be After a brief description of the envisioned experiment, conducted in 2016. This solution aims at enabling rapid the paper proposes some ideas to reduce the prototyping by providing a powerful development, development and validation efforts of such a GNC validation and testing environment, able to support experiment while ensuring a smooth integration in the simultaneously the design and validation of novel satellite bus to augment the chance of successful Guidance, Navigation and Control algorithms, the completion of the experiment. definition and documentation of the interfaces with the ground segment, the implementation of the onboard 2. OVERVIEW software using space quality standards, the integration into an existing satellite bus and all related testing 2.1. The AVANTI Experiment activities. As mentioned in the introduction, AVANTI will 1. INTRODUCTION demonstrate the capability to perform rendezvous and receding approaches with respect to a noncooperative The possibility to conduct a spaceborne experiment is a client satellite making use of vision-based angles-only rare and precious opportunity that has to be taken even measurements.
    [Show full text]
  • Source Code Verification for Embedded Systems Using Prolog
    Source Code Verification for Embedded Systems using Prolog Frank Flederer Ludwig Ostermayer University of Wuerzburg University of Wuerzburg Aerospace Information Technology Knowledge-based Systems [email protected] [email protected] Dietmar Seipel Sergio Montenegro University of Wuerzburg University of Wuerzburg Knowledge-based Systems Aerospace Information Technology [email protected] [email protected] System relevant embedded software needs to be reliable and, therefore, well tested, especially for aerospace systems. A common technique to verify programs is the analysis of their abstract syntax tree (AST). Tree structures can be elegantly analyzed with the logic programming language Prolog. Moreover, Prolog offers further advantages for a thorough analysis: On the one hand, it natively provides versatile options to efficiently process tree or graph data structures. On the other hand, Prolog’s non-determinism and backtracking eases tests of different variations of the program flow without big effort. A rule-based approach with Prolog allows to characterize the verification goals in a concise and declarative way. In this paper, we describe our approach to verify the source code of a flash file system with the help of Prolog. The flash file system is written in C++ and has been developed particularly for the use in satellites. We transform a given abstract syntax tree of C++ source code into Prolog facts and derive the call graph and the execution sequence (tree), which then are further tested against verification goals. The different program flow branching due to control structures is derived by backtracking as subtrees of the full execution sequence.
    [Show full text]
  • Software Evolution from TET-1 to Eu:CROPIS
    Software Evolution from TET-1 to Eu:CROPIS Olaf Maibaum (1), Ansgar Heidecker (2) (1) German Aerospace Center (DLR), Simulation and Software Technology, Lilienthalplatz 7, 38108 Braunschweig, Germany (2) German Aerospace Center (DLR), Institute of Space Systems, Robert Hooke Str. 7, 28359 Bremen, Germany ABSTRACT The base of the Eu:CROPIS (Euglena Combined Regenerative Organic food Production In Space) Attitude and Orbit Control System (AOCS) is the three layer AOCS software architecture of the TET-1 satellite (Technology demonstrator). Because of different AOCS requirements between TET-1 and Eu:CROPIS, a software reuse is only possible for software components in the interface layer. In the other two architecture layers, the software components have to be replaced by new implementations to fulfil the changed requirements of the Eu:CROPIS mission. In contrast to the former software evolution from BIRD (Bispectral Infra-Red Detection) to the TET-1 AOCS, the software evolu- tion is forced in Eu:CROPIS by the reuse of software design principals applied in TET- 1. This enables the change of the underlying scheduling mechanisms from a fixed-time approach to a more reactive software behavior presented in this paper. 1. INTRODUCTION The reactive scheduling mechanism used in the Eu:CROPIS AOCS is a result from ex- periences made in the BIRD mission [7]. This mechanism, named “Tasking Frame- work”, resolves one weakness in the BIRD and TET-1 AOCS software [8]: the scant timing for the control torque computation. The Tasking Framework is the core element in the runtime system development of DLR’s OBC-NG (Onboard Computer – Next Generation) project [5], which will provide a distributed onboard computer platform for reconfigurable and high redundant systems.
    [Show full text]
  • Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain
    ACCEPTED FOR PUBLICATION ON IEEE TRANSACTIONS ON SOFTWARE ENGINEERING 1 Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results in the Space Domain Oscar Cornejo, Fabrizio Pastore, Member, IEEE, and Lionel C. Briand, Fellow, IEEE Abstract—On-board embedded software developed for spaceflight systems (space software) must adhere to stringent software quality assurance procedures. For example, verification and validation activities are typically performed and assessed by third party organizations. To further minimize the risk of human mistakes, space agencies, such as the European Space Agency (ESA), are looking for automated solutions for the assessment of software testing activities, which play a crucial role in this context. Though space software is our focus here, it should be noted that such software shares the above considerations, to a large extent, with embedded software in many other types of cyber-physical systems. Over the years, mutation analysis has shown to be a promising solution for the automated assessment of test suites; it consists of measuring the quality of a test suite in terms of the percentage of injected faults leading to a test failure. A number of optimization techniques, addressing scalability and accuracy problems, have been proposed to facilitate the industrial adoption of mutation analysis. However, to date, two major problems prevent space agencies from enforcing mutation analysis in space software development. First, there is uncertainty regarding the feasibility of applying mutation analysis optimization techniques in their context. Second, most of the existing techniques either can break the real-time requirements common in embedded software or cannot be applied when the software is tested in Software Validation Facilities, including CPU emulators and sensor simulators.
    [Show full text]
  • Design and Implementation of Multi-Core Support for an Embedded Real-Time Operating System for Space Applications
    Design and Implementation of Multi-core Support for an Embedded Real-time Operating System for Space Applications Master of Science Thesis KTH Royal Institute of Technology Author: Wei Zhang, KTH, Sweden Supervisor: Ting Peng, DLR, Germany Examiner: Assoc. Prof. Ingo Sander, KTH, Sweden Abstract Nowadays, multi-core processors are widely used in embedded applications due to the advantages of higher performance and lower power consumption. However, the complexity of multi-core architectures makes it a considerably challenging task to extend a single-core version of a real-time operating system to support multi-core platform. This thesis documents the process of design and implementation of a multi-core version of RODOS - an embedded real-time operating system developed by German Aerospace Center and the University of Würzburg - on a dual-core platform. Two possible models are proposed: Symmetric Multiprocessing and Asymmetric Multi- processing. In order to prevent the collision of the global components initialization, a new multi-core boot loader is created to allow that each core boots up in a proper manner. A working version of multi-core RODOS is implemented that has an ability to run tasks on a multi-core platform. Several test cases are applied and verified that the performance on the multi-core version of RODOS achieves around 180% improved than the same tasks running on the original RODOS. Deadlock free communication and synchronization APIs are provided to let parallel applications share data and messages in a safe manner. Key words: real-time operating system, multi-core architecture, embedded sys- tem Acknowledgment This thesis is dedicated to my parents whose support and help over the years let me study abroad and pursue my dream.
    [Show full text]
  • RODOS Real Time Kernel Design for Dependability
    RODOS Real Time Kernel Design for Dependability Dr. Sergio Montenegro, Frank Dannemann DLR, Institute of Space Systems, Robert-Hooke-Str. 7, 28359 Bremen, Germany Web: http://www.dlr.de/irs E-Mail: [email protected], [email protected] Introduction The NetworkCentric core avionics machine consists of several harmonised components which work together to implement dependable computing in a simple way. The NetworkCentric Machine Computing units (CPU +MEM) are managed by the local real-time kernel operating system (OS) RODOS. On top of the kernel runs the software middleware (MW) of RODOS and around this middleware the user can implement its applications (AP). To communicate with external units, including devices and other computing units, each node provides a gateway to the network and around the network's several devices (IO Devices and computing nodes) may be attached to the system. RODOS is a real-time embedded operating system (OS) designed for applications demanding high dependability. Simplicity is our main strategy for achieving dependability, as complexity is the cause of most development faults. The system was developed in C++, using an object-oriented framework simple enough to be understood and applied in several application domains. Although targeting minimal complexity, no fundamental functionality is missing, as its microkernel provides support for resource management, thread synchronisation and communication, input/output and interrupts management. The system is fully pre-emptive and uses priority-based scheduling and round robin for same priority threads. RODOS Core RODOS was designed as a framework offering the following features: • object oriented C++ interfaces, • ultra fast booting • real time priority controlled primitives multithreading, • time management (as a central point), • thread safe communication and synchronisation, • event propagation RODOS can be executed on embedded target hardware and on top of Linux.
    [Show full text]
  • Luna Gräfje – Embedded Software Engineer
    Ottostraße 3 Luna Gräfje 10555 Berlin H +49 1590 2239111 B [email protected] Embedded Software Engineer ∗ 03.07.1994, Berlin Education 2016–2019 Msc. Computer Science, TU Berlin. Including various space technology courses taken as elective Average grade: 1.0 2012–2016 Bsc. Computer Science, TU Berlin. Minor subject: Mathematics Master Thesis Title Model-driven Verification Using Symbolic Execution Supervisors Prof. Dr. Jean-Pierre Seifert, Prof. Dr.-Ing. Klauß Brieß Description I developed a method to employ symbolic execution to prove the conformance between abstract software models of embedded software and its implementation. The developed method was demonstrated on software from the TechnoSat nanosatellite Languages German Native language English Fluent (TOEFL iBT 112/120 points) Technical skills Programming C++, Python, C, Bash, Assembly (x86 and ARM) Languages Communication I2C, SPI, UART / Serial, CANBUS protocols Technologies Microcontrollers, Catch2, Make, Git, LATEX, Qt, ARM Architecture, Jenkins Operating systems Linux, RODOS, FreeRtos Methodologies Test Driven Development, Agile software development Other skills Reading schematics, debugging communication with a logic analyzer Experience Employment 2016–2017 Student Assistant / Embedded Software Engineer, SecT TU Berlin. Performance optimization of an Armv7-A paravirtualization hypervisor { Created benchmarks to evaluate the performance of the system { Implemented support for new software features { Modified the guest OS’s Linux Kernel to further improve performance { Keywords:
    [Show full text]
  • Event-Driven Multithreading Execution Platform for Real-Time On-Board Software Systems
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Institute of Transport Research:Publications Event-Driven Multithreading Execution Platform for Real-Time On-Board Software Systems Zain A. H. Hammadeh, Tobias Franz, Olaf Maibaum, Andreas Gerndt, Daniel Ludtke¨ Simulation and Software Technology German Aerospace Center (DLR) Braunschweig, Germany fzain.hajhammadeh, tobias.franz, olaf.maibaum, andreas.gerndt, [email protected] Abstract—The high computational demand and the modularity of the satellite. This filter comprises several estimation and of future space applications make the effort of developing prediction modules executed by the controller thread. Each multithreading reusable middlewares worthwhile. In this paper, estimation module computes one value in the attitude state we present a multihreading execution platform and a software development framework that consists of abstract classes with vector, for example, the sun vector from the sun sensor input virtual methods. The presented work is written in C++ following values, the predicted sun vector and expected control effect the event-driven programming paradigm and based on the in- from the last control cycle, a rate from the new sun vector verse of control programming principle. The platform is portable or magnetic field vector, or the best rate by cross checking over different operating systems, e.g., Linux and RTEMS. This magnetic field vector rate, sun vector rate and measured rate platform is supported with a modeling language to automatically generate the code from the given requirements. Our platform has from gyroscopes. The computation order is given by the data been used in already flying satellites, e.g., Eu:CROPIS.
    [Show full text]
  • Jrodos User Guide
    JRodos User Guide Versio n 3 .4 (JRo do s February 2017 u2 ) => Updated Chapters to version 3.1 are marked in yellow<= => Text updates to version 3.21 are marked in cyan <= => Text updates to version 3.31 are marked in pink <= => Text updates to version 3.33 are marked in green <= Ievgen Ievdin Ukrainian Centre for Environmental and Water Projects Dmytro Trybushnyi, Christian Staudt, Claudia Landman Karlsruher Institute of Technology, Institut für Kern- und Energietechnik December 2018 2 Table of Contents ABBREVIATIONS, ACRONYMS, DENOTATIONS ............................................................ 6 ANNOTATED FIGURE OF THE JRODOS WINDOW ......................................................... 8 ANNOTATED FIGURE OF THE JRODOS TOOL BAR ICONS ......................................... 9 STARTING JRODOS; WHAT TO DO IN CASE OF PROBLEMS; TRAINING MATERIAL .......................................................................................................................................... 10 Starting JRodos and logging in; client and server ....................................................................................................... 10 What to do in case of problems; Bugzilla ...................................................................................................................... 10 Filing a bug and closing a bug in Bugzilla ................................................................................................................... 11 Training material ...........................................................................................................................................................
    [Show full text]
  • RODOS Porting Guide
    Porting the Real-Time Operating System RODOS to SPARC V9 RODOS Porting Guide Andre Bartke Würzburg, August 2012 In this document we will briefly explore the interfaces that must be provided for a port of the RODOS operating system to a new hardware platform. The necessary code will be put to the src/bare-metal/<platform>/ directory where in our case <platform> is sparc64. This direc- tory has to contain, among others, some specific header files that will be required at compile time. Other interfaces demand symbols for linkage and some are completely optional or depend on the respective target architecture. In the following sections, optional interfaces will be mentioned as stubs, meaning, their symbol must be present but they are not doing anything. This section focuses on operating system specifics but we also need to support a few manda- tory C++ run-time features. Besides this section it might be valuable to take a look at some language-related implementation details outlined in section 2 on p. 8. 1.1 Development Environment Usually a new hardware platform requires a GNU cross compiler. This is necessary since de- velopment and boot image creation is not done from the same architecture we are targeting. For the development of this port a cross toolchain consisting of binutils and gcc is sufficient and must be configured to emit code for the target. This is done by building them with the --target=sparc64-unknown-elfi target triplet. On a common setup the build and host machine (on which the cross compiler is run from) is x86-based.
    [Show full text]