Gcc Link Time Optimization and the Linux Kernel Andi Kleen Intel


gcc link time optimization and the Linux kernel
Andi Kleen, Intel OTC, Apr 2013
[email protected]

Acknowledgments
● Lots of people helped/contributed
● Ralf Baechle, Richard Biener, Tim Bird, Honza Hubicka, H.J. Lu, Joe Mario, Markus Trippelsdorf, Changlong Xie, others

Why LTO?
● Optimize over the whole binary
  – Not just per function (gcc 3.0) or per file (gcc < 4.5)
● Avoid inline dependency hell in header files
● Without changing Makefiles significantly

gcc 4.7+ LTO WHOPR crash course
● Compiler parses files, writes GIMPLE to object files (LGEN)
  – Function optimization summaries are computed
● Linker calls lto1 with all files for sequential whole program analysis (WPA)
  – Merges types, generates global callgraph, IPA optimization summaries, writes partitions
● Generate code per partition in parallel (LTRANS)
  – Inline inside partition, run IPA optimizations for real, run per function optimizations, generate object code

LTO IPA optimizations (4.8)
● Inlining between files
● Function cloning for specific arguments
● Constant propagation
● Scalar replacement of aggregates
● Constructor / destructor merging
● De-virtualization
● Change ABI (SSE)
● Remove unused code (-fwhole-program)
● Increase alignment
● Discover pure/const
● Keep globals/statics alive over calls
green: does not benefit kernel today (the color coding is not preserved in this text)

Build time (small config)
[Figure: four bar charts comparing gcc 4.7, gcc 4.8, 4.7 -lto and 4.8 -lto: build time; parallelism (user time / real time); user time (s); faults (M).]
LTO is slow and not parallel enough

Parallelism small build 4.7
[Figure: runnable processes over time (s) for an LTO kernel build, small config. Annotated phases: object file generation (parsing / LGEN); WPA + real linker (type merging), 4 times; LTRANS code generation; modules without job server.]

Small config parallelism
[Figure: user time / runtime for gcc 4.7, gcc 4.7 no lto, gcc 4.8 and gcc 4.8 no lto.]
WHOPR still has poor parallelism

Multilink
● vmlinux links 2-4x: runs LTO that often
● Generates integrated symbol table (kallsyms)
● So far not fixed
  – KALLSYMS can be disabled
● One of those unexpected quirks of real build systems
[Figure: bar chart for gcc 4.8 w/o kallsyms, gcc 4.8 no lto and gcc 4.7 with kallsyms.]

Memory usage 4.7 small build
[Figure: active memory (GB) over time (s), kernel LTO build, small config.]
[Figure: faults (M), small config, for gcc 4.7, gcc 4.8, 4.7 -lto and 4.8 -lto.]
Even small build swaps in 4GB system
Memory peaks all in WPA

Memory consumption
● Temporary data can be a problem, together with WPA
  – Early on, many swap storms with /tmp = tmpfs
  – Partitioning algorithm was improved
  – Use TMPDIR=objdir
  – With modules, need to avoid too large -j* for parallel WPA
  – Jobserver has to be disabled, which makes it worse

Large build 4.8 (allyes)
[Figure: runnable processes over time (s), Linux allyes, no kallsyms, gcc 4.8, -j16.]
Kallsyms disabled (disables various features)
~42 min
~1h 11min with kallsyms
~15GB peak
[Figure: active memory (GB) over time (s), Linux allyes, no kallsyms, gcc 4.8.]

WPA cycle profile 4.8
Samples: 257K of event 'cycles', Event count (approx.): 183142453583
-  9.71%  lto1-wpa  lto1          [.] htab_find_slot_with_hash
   - htab_find_slot_with_hash
      + 61.79% gimple_type_hash(void const*)
      + 26.54% gimple_register_type_1(tree_node*, bool)
      + 4.79% visit(tree_node*, sccs*, unsigned int, vec<tree_node*, va_heap, v
      + 1.53% iterative_hash_canonical_type(tree_node*, unsigned int)
      + 0.98% iterative_hash_gimple_type(tree_node*, unsigned int, vec<tree_nod
      + 0.92% symtab_get_node(tree_node const*)
      + 0.77% gimple_type_eq(void const*, void const*)
      + 0.53% build_int_cst_wide(tree_node*, unsigned long, long)
      + 0.53% streamer_string_index(output_block*, char const*, unsigned int, b
+  9.00%  lto1-wpa  libc-2.17.so  [.] _int_malloc
+  7.34%  lto1-wpa  lto1          [.] iterative_hash_hashval_t(unsigned i
+  7.33%  lto1-wpa  lto1          [.] gimple_type_eq(void const*, void co
+  6.66%  lto1-wpa  libc-2.17.so  [.] __memset_sse2
+  5.35%  lto1-wpa  libc-2.17.so  [.] malloc_consolidate
+  4.11%  lto1-wpa  libc-2.17.so  [.] _int_free
+  4.10%  lto1-wpa  lto1          [.] tree_map_base_eq(void const*, void
+  2.85%  lto1-wpa  lto1          [.] pointer_map_insert(pointer_map_t*,
+  2.76%  lto1-wpa  lto1          [.] inflate_fast
+  2.17%  lto1-wpa  lto1          [.] lto_output_tree(output_block*, tree
Most time spent in hashing types for type merging

Type merging
Complicated iterative hash consisting of 4 hash tables (including two “hash cache hashes”)
Each LGEN has its own types, need to be unified
[Diagram: struct foo ** is a pointer_type to a pointer_type to struct foo (record_type), which has members.]

Hashing improvements
● Use tcmalloc instead of glibc malloc
  – build time -9%; peak mem +35%
● Start with large type hash tables / caches
● Use murmur for hashing / fix iterative / fix pointer hash to handle all bits
  – (collisions 90%->70%->64%)
● Splay trees for type merging (bad idea)
[Figure: build time improvement from the hash changes (gcc 4.8, small config); bars for tcmalloc, murmur and fullhash, y axis from 0% to 12%.]

Possible future hash improvements
● Precompute hash in LGEN
  – And stream types out in hash order
  – Requires even larger hash tables to avoid collisions (or a good tree)
● Only merge per partition?
  – And do that in parallel
● Parallelize WPA further?
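
The "fix pointer hash to handle all bits" bullet on the hashing slide can be illustrated with a small sketch. This is not gcc's code: `weak_pointer_hash`, the simulated pointer layout and the table size are assumptions made only for illustration, and the mixing step is the well-known 64-bit finalizer from MurmurHash3 rather than whatever the patched gcc actually used. The sketch shows how aligned pointers collide heavily when only their low bits select a bucket, and how mixing all bits spreads them out.

```python
"""Sketch: why a pointer hash should mix all bits (illustrative, not gcc's code)."""

MASK64 = (1 << 64) - 1


def weak_pointer_hash(ptr: int) -> int:
    """Hypothetical weak hash: the raw pointer value, unchanged."""
    return ptr & MASK64


def fmix64(ptr: int) -> int:
    """MurmurHash3 64-bit finalizer: mixes high bits into the low bits."""
    k = ptr & MASK64
    k ^= k >> 33
    k = (k * 0xFF51AFD7ED558CCD) & MASK64
    k ^= k >> 33
    k = (k * 0xC4CEB9FE1A85EC53) & MASK64
    k ^= k >> 33
    return k


def occupied_buckets(keys, hash_fn, table_bits: int) -> int:
    """Number of distinct buckets hit in a power-of-two sized table."""
    mask = (1 << table_bits) - 1
    return len({hash_fn(key) & mask for key in keys})


def collision_rate(keys, hash_fn, table_bits: int) -> float:
    """Fraction of keys that land in a bucket an earlier key already used."""
    return 1.0 - occupied_buckets(keys, hash_fn, table_bits) / len(keys)


# Simulated heap pointers: 64-byte aligned objects (assumed allocation pattern).
pointers = [0x7F0000000000 + 64 * i for i in range(4096)]
TABLE_BITS = 12  # 4096 buckets, one per key in the ideal case

weak_rate = collision_rate(pointers, weak_pointer_hash, TABLE_BITS)
mixed_rate = collision_rate(pointers, fmix64, TABLE_BITS)
print(f"weak hash collision rate:  {weak_rate:.1%}")
print(f"mixed hash collision rate: {mixed_rate:.1%}")
```

With the weak hash only 64 of the 4096 buckets are ever used, because the low six bits of every simulated pointer are zero. The absolute numbers here say nothing about the 90%->70%->64% figures on the slide, which were measured on real type hashes.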

Incremental linking
● Kernel build system uses ld -r / ar extensively
● Linker merges all sections with the same name
  – Added random postfixes to LTO sections
● Still problem with pure assembler files
  – .S assembler code dropped during LTO
  – Originally attempted to fix with slim LTO
● Then fixed in Linux binutils
  – Unfortunately changes not merged into mainline binutils
  – Use Linux binutils for kernel

Kernel modifications (excerpt)
● Section attributes
  – const __initconst char * vs __initconst char * const
● Toplevel inline assembler
  – Need globals and visible for any symbols
● Lots of externally_visible
● Disable LTO for some files
● Workarounds for 4.7 partitioning bugs (visible)
● Gcc-ld
● A few fixes for optimizations (rare)
● Various changes for dot NUMBER symbols
Kernel changes: 108 files changed, 740 insertions(+), 313 deletions(-) (v3.8; without most section attributes)

Debugging
● Kernel debugging: more difficult with more inlining
  – Especially with initialization functions
  – May need inline aware backtracer (“dwarf kallsyms”)
● Compiler
  – No simple test cases. Lots of regressions; complicated delta.
  – Complex multi process
  – Hard to find right process to attach/profile
  – -dH (core dumps) are most useful
LTO has challenges for debugging

Global Optimizations (small config)
  Inlining                           21k new inlines between files, 13k unique
  Scalar replacement                 No change
  Partial Inline                     +16%
  Constant Propagation               +40%
  Cloning for constant propagation   266->1052 const clones, +2% text (~200k)
  (CONFIG_LTO_CLONE)

Generated code: static instruction changes
[Figure: changes in static instructions (vmlinux), gcc 4.8 small LTO config, delta in percent (more = better); bars for calls, functions, avg fsize, push, pop, mov, instruction.]

Generated code: runtime
● Runtime
  – Mixed results

  LKP            dbench  tbench  AIM7  OLTP
  Sandy Bridge   0%      0%      +3%   0%
  Nehalem        0%      0%      +4%   +6%

● Likely need some more optimizations
  – Open: Atom testing (more sensitive to bad code)

Text size
● Text size
  – Slight increase for large configs, but lower for small configs
● small config +3%
● allno shrinks -50k
  – The smaller the kernel the less infrastructure is used. But EXPORT_SYMBOL defeats it again.

ARM embedded improvements (Tim Bird)
vmlinux.non-lto => vmlinux.lto
           baseline    other     change   percent
           --------    -----     ------   -------
  text:     5242000  4869340    -372660       -7%
  data:      490184   476200     -13984       -2%
  bss:       121824   122112        288        0%
  total:    5854008  5467652    -386356       -6%

  Kernel:                       non-lto     lto
  ---------------------------   -------     --------
  time for full build:          1m5.87s     3m22s
  time for incremental build:   0m5.24s     2m12s
  kernel image size:            5854008     5467652
  compressed uImage size:       2849920     2694224
  system meminfo MemTotal:      17804 kB    18188 kB
  system meminfo MemFree:       11020 kB    11260 kB
  boot time:                    2.466369    2.414428

gcc LTO improvements wish list
● Per file options (partial patch)
● Fix passing through of options
● Standard gcc-ld
● Don't error for section mismatches
● Better procedure for LTO error reporting
● Better messages for type mismatches
● No .XXXX postfixes
● -fno-toplevel-reorder by symbol?
● Automatic procedure for externally_visible discovery?
● Better syntax for “symbol visible to toplevel assembler”

Status
● Tested on x86, ARM, MIPS
● Up to date with Linux 3.8
● Some options disabled (function-trace, gcov, modversions)

References
● LTO kernel tree http://github.com/andikleen/linux-misc
  – Branch lto-3.x (x increasing)
● Linux binutils http://www.kernel.org/pub/linux/devel/binutils/
● Tcmalloc http://code.google.com/p/gperftools/
● [email protected] http://halobates.de

Backup

Future kernel improvements
● Avoid multi-link
● Use constructors to allow disabling -fno-toplevel-reorder
● Support new option to disable instrument_function

Quick HOWTO for LTO kernel
● Get git://github.com/andikleen/linux-misc lto-3.x tree
● Need gcc 4.7+ set up for LTO / Linux binutils
● Enable CONFIG_LTO
  – Make sure FUNCTION_PROFILER et al. / MODVERSIONS are disabled
● make CC=compiler LD=linux-ld AR=gcc-ld -jX
  – Check LTO does not get disabled
● See Documentation/lto-build for more details
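
The configuration checks in the HOWTO (CONFIG_LTO enabled, FUNCTION_PROFILER and MODVERSIONS disabled, LTO not silently turned off) could be automated with a small script run against the generated .config. The sketch below is a hypothetical helper, not part of the LTO kernel tree. The symbol spellings CONFIG_FUNCTION_PROFILER and CONFIG_MODVERSIONS are assumed from the names on the slide, and the "et al." options are not listed because the slides do not name them.

```python
"""Sketch: check a kernel .config against the HOWTO (hypothetical helper)."""
import re

# CONFIG_LTO is named on the slides; the two symbols below are assumed
# spellings of "FUNCTION_PROFILER" and "MODVERSIONS" from the HOWTO.
MUST_BE_ENABLED = ("CONFIG_LTO",)
MUST_BE_DISABLED = ("CONFIG_FUNCTION_PROFILER", "CONFIG_MODVERSIONS")

_SET_RE = re.compile(r"^(CONFIG_\w+)=(.*)$")


def parse_config(text: str) -> dict:
    """Map option name to value for every 'CONFIG_FOO=value' line.

    Lines such as '# CONFIG_FOO is not set' are comments and are skipped,
    so disabled options are simply absent from the result.
    """
    options = {}
    for line in text.splitlines():
        match = _SET_RE.match(line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def check_lto_config(text: str) -> list:
    """Return human-readable problems; an empty list means no problem found."""
    options = parse_config(text)
    problems = []
    for name in MUST_BE_ENABLED:
        if options.get(name) != "y":
            problems.append(f"{name} is not enabled")
    for name in MUST_BE_DISABLED:
        if options.get(name) in ("y", "m"):
            problems.append(f"{name} must be disabled for LTO")
    return problems


# Made-up .config fragment used only to exercise the checker.
example = """\
CONFIG_LTO=y
# CONFIG_MODVERSIONS is not set
CONFIG_FUNCTION_PROFILER=y
"""
for problem in check_lto_config(example):
    print(problem)
```

Running the check again after the build is a cheap way to follow the "Check LTO does not get disabled" advice, since it only reads the final .config text.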