
ROOT and x32-ABI Nathalie Rauschmayr On behalf of LHCb collaboration March 13, 2013 Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 1 / 20 Outline 1 History of application binary interfaces Why is a new binary interface necessary? x32-ABI - A new 32-bit ABI for x86-64 Preconditions 2 How ROOT can profit Results within other CERN applications Results within ROOT-Benchmarks Reasons for performance differences 3 Support of x32-ABI 4 Conclusion Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 1 / 20 History of application binary interfaces x86: 32-bit word size ! 4GB addressable x86 64 as an extension of x86: runs 32-bit as well as 64-bit applications 64-bit mode supports new hardware features Number of CPU registers Floating-point performance Function parameters passed via registers Syscall instruction ... Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 2 / 20 Why is a new binary interface necessary? x86-64 (64-bit mode): new hardware features no memory limit slowdown due to memory issues contention page faults paging ... x86: very low memory footprint performance issues Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 3 / 20 x32-ABI - A new 32-bit ABI for x86-64 Application Binary Interface based on 64-bit x86 architecture Reduces size of pointers and C-datatype long to 32-bit Takes advantage of all x64-features Avoids memory overhead advantages of 64-bit instruction set + memory footprint of a 32-bit application Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 4 / 20 x32-ABI - A new 32-bit ABI for x86-64 Developed by H.J. Lu (Intel) Introduced in Linux-Kernel 3.4 (released in summer 2012) http://www.linuxplumbersconf.org/2011/ocw/sessions/531 Opinions in Linux-community differ quite a lot 32-bit time values will overflow adressable memory ... Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 5 / 20 x32-ABI - A new 32-bit ABI for x86-64 Impressive results from x32-developers: 181.mcf from SPEC CPU 2000 (memory bound): Intel Core i7 ∼ 40% faster than x86-64 ∼ 2% slower than ia32 Intel Atom ∼ 40% faster than x86-64 ∼ 1% faster than ia32 Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 6 / 20 x32-ABI - A new 32-bit ABI for x86-64 186.crafty from SPEC CPU 2000 (64bit integer): Intel Core i7 ∼ 3 % faster than x86-64 ∼ 40% faster than ia32 Intel Atom ∼ 4 % faster than x86-64 ∼ 26% faster than ia32 =) CERN applications will definitely reduce memory footprint and very likely CPU-time Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 7 / 20 Preconditions x32-ABI requires: Linux Kernel 3.4 compiled with CONFIG X86 X32=y Gcc 4.7 Binutils 2.22 Glibc 2.16 Recompiling all system libraries, required by an application, with gcc -mx32 ELF 32-bit LSB shared object, x86-64, version 1 (SYSV) Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 8 / 20 How ROOT can profit CERN applications use millions of pointers GAUDI framework stores all events as pointers Many virtual functions (virtual tables: function pointers and hidden pointers) ROOT (histogram as a tree of pointers ...) ... Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 9 / 20 Results within other CERN applications Memory Time 30 30 25 20 20 10 15 0 10 −10 5 Reduction of time in % Memory reduction in % −20 0 −30 −5 namd dealII soplex povray omnetpp astar xalancbmk namd dealII soplex povray omnetpp astar xalancbmk Resident Size (32 vs x32) Resident Size (64 vs x32) CPU Time (32 vs x32) CPU Time (64 vs x32) Figure : Comparison of memory consumption and CPU-time within the HEPSPEC-benchmarks Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 10 / 20 Results within other CERN applications LHCb Reconstruction with Gaudiv23r3 and ROOT 5.34.00 Reconstruction of 1000 Events: Physical Memory: -20 % Total elapsed: -2 % reduction LHCb Analysis with Gaudiv23r3 and ROOT 5.34.00 Analysis of 10000 Events: Physical Memory: -21 % Total elapsed: 1.5 % increase Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 11 / 20 Results within ROOT-Benchmarks Following ROOT-benchmarks: stressHistogram stress 1000 stressHepix IO linear algebra vector sparse matrix Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 12 / 20 Results within ROOT-Benchmarks Memory Time 30 60 25 50 20 40 15 30 10 20 Reduction of time in % Memory reduction in % 5 10 0 0 Histogram Events Spectrum Linear Fit Histogram Events Spectrum Linear Fit Resident Size (64 vs x32) Real Time (64 vs x32) CPU Time (64 vs x32) Figure : Comparison of memory consumption and CPU-time within the ROOT-benchmarks Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 13 / 20 Results within ROOT-Benchmarks stress 1000: Memory ·106 100 1 0:8 80 0:6 60 40 Memory in kB 0:4 Memory reduction in % 0:2 20 0 0 0 200 400 600 800 1;000 1;200 Histogram Events Spectrum Linear Fit Snapshots RSS m64 VSS m64 RSS x32 VSS x32 heap - profiling (64 vs x32) heap - maps (64 vs x32) Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 14 / 20 Reasons for performance differences Memory seen by system and by process are different Default thresholds of malloc: t = 4 · 1024 · 1024 · sizeof (long) if x > t: malloc uses mmap if x < t: malloc uses sbrk mmap: memory blocks can be independently given back anonymous memory blocks as an extension of the heap sbrk: memory blocks cannot be returned to the operating system Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 15 / 20 Reasons for performance differences Looking into dealII, omnetpp, ROOT stress: dealII : page faults reduced by 10% omnetpp: page faults reduced by 38% stress: difference in CPU-time likely caused by memory issues Further Issues: code and data alignment x32 loop unrolling needs some tuning zero-extensions for pointer conversion =) x32-ABI still under development and there are possiblities for optimization Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 16 / 20 Outline 1 History of application binary interfaces Why is a new binary interface necessary? x32-ABI - A new 32-bit ABI for x86-64 Preconditions 2 How ROOT can profit Results within other CERN applications Results within ROOT-Benchmarks Reasons for performance differences 3 Support of x32-ABI 4 Conclusion Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 16 / 20 Support of x32-ABI Ubuntu: Ongoing work, support planned for 13.04: https://blueprints.launchpad.net/ubuntu/+spec/ foundations-r-x32-planning Gentoo: Release candidate available since summer 2012: http://www.gentoo.org/news/20120608-x32_abi.xml Red Hat: No plans according to: http://comments.gmane.org/ gmane.linux.redhat.fedora.devel/164019 LLVM-compiler: supports x32-ABI since summer 2012 http: //www.phoronix.com/scan.php?page=news_item&px=MTI3NTk Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 17 / 20 Outline 1 History of application binary interfaces Why is a new binary interface necessary? x32-ABI - A new 32-bit ABI for x86-64 Preconditions 2 How ROOT can profit Results within other CERN applications Results within ROOT-Benchmarks Reasons for performance differences 3 Support of x32-ABI 4 Conclusion Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 17 / 20 Conclusion New room for improvement, ( CERN ) applications can profit substantially Computing Grid limits memory anyway to 4 GB per process Gain in performance is for FREE Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 18 / 20 Conclusion Recompilation: Cast from time values to long (...) will produce wrong results (xrootd) New pointer size required modifications in CINT (function and data pattern) Getting a working environment: does not work out of the box building gcc fails due to missing glibc x32 building glibc x32 with a partly built gcc Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 19 / 20 Any questions? Rauschmayr (CERN) ROOT and x32-ABI March 13, 2013 20 / 20.
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages23 Page
-
File Size-