Retargeting Open64 to a RISC Processor

Total Page:16

File Type:pdf, Size:1020Kb

Retargeting Open64 to a RISC Processor Retargeting Open64 to A RISC processor -- A Student’s Perspective Huimin Cui, Xiaobing Feng Key Laboratory of Computer System and Architecture, Institute of Computing Technology, CAS, 100190 Beijing, China {cuihm, fxb}@ict.ac.cn compiler [6]. Abstract Open64 has been retargeted to a number of This paper presents a student’s experience architectures. Pathscale modified Open64 to in Open64-Mips prototype development, we create EkoPath, a compiler for the AMD64 and summarize three retargeting observations. X8664 architecture. The University of Open64 is easy to be retargeted and the Delaware's Computer Architecture and Parallel procedure takes only a short period. With the Systems Laboratory (CAPSL) modified Open64 retarget procedure done, the compiler can to create the Kylin Compiler, a compiler for achieve good and stable performance. Open64 Intel's X-Scale architecture [1]. Besides the also provides many supports for debugging, with targets mentioned above, there are several other which a beginner can debug the compiler supported targets including PowerPC [6], without difficulty. We also share some NVISA [9], Simplight [10] and Qualcomm[11]. experiences of our retarget, including In this paper, we will discuss three issues methodology of verifying the compiler when retargeting Open64: ease of retarget, framework, attention to switches in Open64, performance, and debuggability. The discussion importance of debugging and reading generated is based on our work in retargeting Open64 to code. the MIPS platform, using the Simplight branch as our starting point, which is a RISC style DSP 1 Introduction with mixed 32/16 bit instruction set. During our discussion, we will use GCC (Mips target) for Open64 receives contributions from a comparison. This retarget work is just a student number of compiler groups around the world, project, so we have some non-goals, which will from industry as well as from academia. It is be discussed in section 5. derived from the SGI MIPSPro64 compiler [1]. And we will also share some of our Good performance and industrial strength origin retargeting experiences, in four folds: 1) Steps to make the Open64 compiler a popular choice for verify the compiler framework. 2) Turn on/off research projects. The Open64 compiler supports switches for special hardware features. 3) C/C++ and Fortran95 languages and many Debugging is very important, not only for researchers from industry and academia have retarget, but also for later extensive work. 4) Pay contributed to the retargeting of the Open64 attention to the generated code if it doesn’t work 1 as expected. isa/<targ>/isa.cxx The paper is organized as follows. Section isa/<targ>/isa_properties.cxx 2 introduces the retarget procedure. Section 3 isa/<targ>/isa_subset.cxx presents our major retarget results and analysis. isa/<targ>/isa_registers.cxx Section 4 summarizes our experiences and isa/<targ>/isa_enums.cxx expectations. Section 5 shows our non-goals and isa/<targ>/isa_lits.cxx section 6 concludes the paper. isa/<targ>/isa_operands.cxx isa/<targ>/isa_hazards.cxx 2 Suggested Retarget Procedure isa/<targ>/isa_print.cxx isa/<targ>/isa_pack.cxx This section presents our retargeting isa/<targ>/isa_bundle.cxx procedure, and it is applicable if one can find a isa/<targ>/isa_decode.cxx similar processor in Open64’s supporting targets. isa/<targ>/isa_pseudo.cxx Here, letters (A,B,..) represent the file creation abi/<targ>/abi_properties.cxx order, and numbers (1,2...) represent the building proc/<targ>/proc.cxx order. proc/<targ>/proc_properties.cxx (A). Add make rules for your target(<targ>) in proc/<targ>/<targ>_si.cxx top level Makefile.gsetup This directory contains most of the basic (B). Create dir targia32_targ<targ> setting for retarget purposes. Most retarget This is the build directory for your new problems will be due to wrong settings in this target. Host is specified as ia32. directory. All these files should be written (C). Create and set up Makefile in each subdir carefully. under targia32_targ<targ> (3). Build targ_info Make sure the BUILD_TARGET is set (F). Deal with frontend in kgccfe and kg++fe correctly so it goes to the right target directory directories. for each sub component. In kgccfe/gnu, kgccfe/gnu/config, kg++fe/ Set your BUILD_VENDOR correctly. gnu, kg++fe/gnu/config, create directory of (1). Build include <targ> by copying from ia64 or x8664 and (D). In common/com, create <targ>/ config_ fixing the values and the include file names as targ.h well as define statement. Add macros for Is_Target_xxx, xxx Fix up the gnu_config.h to include the represents the supported processors of the corresponding config.h file for your target. new target, e.g, Is_Target_R10K() for MIPS. Fix up the <targ>/config.h, <targ>/ Add ABI macros for the new target. hconfig.h, <targ>/tconfig.h, <targ>/tm_p.h, to Set GP area size. include the corresponding file for your target. Set target debugging information. Make sure Makefile and Makefile.gbase In common/util, create <targ>c_qwmultu.c by know to go to gnu/<targ> directory also. It copying from ia64, or as appropriate. requires one to check whether gnu/<targ> has (2). Build libcmplrs, libiberty, libcomutil been added into SRC_DIRS. Sometimes, it helps to show the compile In include/elf.h, the #define for Elf32_Byte line during building of the compiler, simply and Elf64_Byte can always be enabled. change gcommondefs and gcommonrules in If need to add intrinsic, the following files linux/make/. needs changing. (E). In common/targ_info, create <targ> dir's intrinsic.def – defines and their files. Use this order: 2 INTRINSIC_ID, property of consistency. intrinsic (order is important, config_cache_targ.cxx sets cache models. indexed by INTRINSIC_ID) Targ_em_* files are for assembly output FE definition files, kgccfe/gnu control, section types, relocations, dwarf etc. Builtins.def – id, name, One only need to change these files when focus prototype, attribute in BE portion. Builtin-types.def – needed if new Basically one needs to tailor files type is needed mentioned about and then continue building Wfe_expr.cxx – translate GNU components as follows. builtin to WHIRL. (4). Build gccfe (G). Create the following files: (5). Build g++fe common/com/<targ>/config_platform.h (6). Build ir_tools common/com/<targ>/config_asm.h (H). Create be/be/<targ> and be/com/<targ> common/com/<targ>/config_cache_targ.cxx directories common/com/<targ>/config_elf_targ.cxx Create driver_targ.cxx, fill_align_targ.cxx common/com/<targ>/config_host.c in be/be/<targ> dir common/com/<targ>/config_platform.c Create betarget.cxx, sections.cxx in common/com/<targ>/config_targ.cxx be/com/<targ> dir common/com/<targ>/config_targ.h There is no need to be optimal at this point, common/com/<targ>/config_targ_opt.cxx just make it work, and then make it perform. e.g. common/com/<targ>/config_targ_opt.h One can set Can_Do_Fast_Multiply to return common/com/<targ>/targ_const.cxx FALSE, and return to that later. common/com/<targ>/targ_const_private.h (7). Build be common/com/<targ>/targ_ctrl.h (8). Build libelf, libelfutil, libdwarf, libunwindP common/com/<targ>/targ_em_const.cxx (I). Create be/cg/<targ>/register_targ.h common/com/<targ>/targ_em_dwarf.cxx be/cg/<targ>/tn_targ.h common/com/<targ>/targ_em_dwarf.h be/cg/<targ>/op_targ.h common/com/<targ>/targ_em_elf.cxx Create one’s own target specific portion in common/com/<targ>/targ_em_elf.h lib/elf.h (or appropriate elf.h), define the right common/com/<targ>/targ_sim.cxx relocations, sections, etc. common/com/<targ>/targ_sim.h (9). Build wopt All these files can be copied from the (J). Create files in be/cg/<targ> directory of a similar target, and modified as Fix cgtarget_arch.h, such as CGTARG_ required. Copy_Op, ... Make sure the endianness is set correctly. It There might be needs to add more things in is set in config_host.c, config_targ.cxx and be/cg/op.h common/com/config.cxx. Fix cgdwarf_targ.cxx, Find_Spill_TN, ..., Make sure the assembly syntax for .s file unwind table in dwarf. output is set correctly. It is set in config_asm.c. Copy other files from a similar processor’s Make sure the calling convention and directory. Go over expand.cxx, whir2ops.cxx parameter passing are set correctly. They are set carefully. And exp_branch.cxx, exp_divrem.cxx, in targ_sim.h and targ_sim.cxx. If you need your exp_loadstore.cxx, expand.cxx, entry_exit_targ. own ABI definition, just change this file, and do cxx needs to be changed or tailored substantially. not forget to modify targ_sim.cxx for (10). Build cg 3 (11). Build driver 3.2 Main Results (12). Build ipl (13). Build lno Observation 1 (See also section 3.3.1). A (14). Build inline student can complete the retarget procedure in (15). Build whirl2c about two months, if an appropriate supported (16). Build ipa target processor can be used to start the retargeting. In our case, the Simplight SL1 3 Retarget Results & Discussion processor is a RISC based processor with 16 bit instructions and DSP extensions. We chose this 3.1 Machine Assumption & as our starting point for its RISC basis as ISA. benchmarks Observation 2 (See also section 3.3.2). Performance of the generated Open64 compiler Machine Assumption: is satisfying, in two folds. Processor: Loongson 2f, out of order 1. We expect the retargeted Open64 ISA: MIPS3 compiler will generate very good code due to the Frequency: 666MHz excellent middle ends such as Wopt and Lno Issues: 4 components. Indeed, when we achieved ALU: 2 competitive or better performance compared to a FALU: 2 matured GCC compiler, for all the benchmarks MEM Unit: 1 we tested as a result of our effort. It is customary to use SPEC CPU2000 or 2. Open64 provides obvious performance 2006 to evaluate a compiler performance, but improvement with optimization option switched that is our non-goal (discussed in section 5). As a from –O2 to –O3, especially for C++ programs student project, we only consider the following with higher abstraction levels.
Recommended publications
  • Resourceable, Retargetable, Modular Instruction Selection Using a Machine-Independent, Type-Based Tiling of Low-Level Intermediate Code
    Reprinted from Proceedings of the 2011 ACM Symposium on Principles of Programming Languages (POPL’11) Resourceable, Retargetable, Modular Instruction Selection Using a Machine-Independent, Type-Based Tiling of Low-Level Intermediate Code Norman Ramsey Joao˜ Dias Department of Computer Science, Tufts University Department of Computer Science, Tufts University [email protected] [email protected] Abstract Our compiler infrastructure is built on C--, an abstraction that encapsulates an optimizing code generator so it can be reused We present a novel variation on the standard technique of selecting with multiple source languages and multiple target machines (Pey- instructions by tiling an intermediate-code tree. Typical compilers ton Jones, Ramsey, and Reig 1999; Ramsey and Peyton Jones use a different set of tiles for every target machine. By analyzing a 2000). C-- accommodates multiple source languages by providing formal model of machine-level computation, we have developed a two main interfaces: the C-- language is a machine-independent, single set of tiles that is machine-independent while retaining the language-independent target language for front ends; the C-- run- expressive power of machine code. Using this tileset, we reduce the time interface is an API which gives the run-time system access to number of tilers required from one per machine to one per archi- the states of suspended computations. tectural family (e.g., register architecture or stack architecture). Be- cause the tiler is the part of the instruction selector that is most dif- C-- is not a universal intermediate language (Conway 1958) or ficult to reason about, our technique makes it possible to retarget an a “write-once, run-anywhere” intermediate language encapsulating instruction selector with significantly less effort than standard tech- a rigidly defined compiler and run-time system (Lindholm and niques.
    [Show full text]
  • The LLVM Instruction Set and Compilation Strategy
    The LLVM Instruction Set and Compilation Strategy Chris Lattner Vikram Adve University of Illinois at Urbana-Champaign lattner,vadve ¡ @cs.uiuc.edu Abstract This document introduces the LLVM compiler infrastructure and instruction set, a simple approach that enables sophisticated code transformations at link time, runtime, and in the field. It is a pragmatic approach to compilation, interfering with programmers and tools as little as possible, while still retaining extensive high-level information from source-level compilers for later stages of an application’s lifetime. We describe the LLVM instruction set, the design of the LLVM system, and some of its key components. 1 Introduction Modern programming languages and software practices aim to support more reliable, flexible, and powerful software applications, increase programmer productivity, and provide higher level semantic information to the compiler. Un- fortunately, traditional approaches to compilation either fail to extract sufficient performance from the program (by not using interprocedural analysis or profile information) or interfere with the build process substantially (by requiring build scripts to be modified for either profiling or interprocedural optimization). Furthermore, they do not support optimization either at runtime or after an application has been installed at an end-user’s site, when the most relevant information about actual usage patterns would be available. The LLVM Compilation Strategy is designed to enable effective multi-stage optimization (at compile-time, link-time, runtime, and offline) and more effective profile-driven optimization, and to do so without changes to the traditional build process or programmer intervention. LLVM (Low Level Virtual Machine) is a compilation strategy that uses a low-level virtual instruction set with rich type information as a common code representation for all phases of compilation.
    [Show full text]
  • Tech Tools Tech Tools
    NEWS Tech Tools Tech Tools LLVM 3.0 Released Low Level Virtual Machine (LLVM) recently announced ver- replacing the C/ Objective-C compiler in the GCC system with a sion 3.0 of it compiler infrastructure. Originally implemented more easily integrated system and wider support for multithread- for C/ C++, the language-agnostic design of LLVM has ing. Other features include a new register allocator (which can spawned a wide variety of provide substantial perfor- front ends, including Objec- mance improvements in gen- tive-C, Fortran, Ada, Haskell, erated code), full support for Java bytecode, Python, atomic operations and the Ruby, ActionScript, GLSL, new C++ memory model, Clang, and others. and major improvement in The new release of LLVM the MIPS back end. represents six months of de- All LLVM releases are velopment over the previous available for immediate version and includes several download from the LLVM re- major changes, including dis- leases web site at: http:// continued support for llvm- llvm.​­org/​­releases/. For more gcc; the developers recom- information about LLVM, mend switching to Clang or visit the main LLVM website DragonEgg. Clang is aimed at at: http://​­llvm.​­org/. YaCy Search Engine Online Open64 5.0 Released The YaCy project has released version 1.0 of its Open64, the open source (GPLv2-licensed) peer-to-peer Free Software search engine. YaCy compiler for C/C++ and Fortran that’s does not use a central server; instead, its search re- backed by AMD and has been developed by sults come from a network of independent peers. SGI, HP, and various universities and re- According to the announcement, in this type of dis- search organizations, recently released ver- tributed network, no single entity decides which sion 5.0.
    [Show full text]
  • Automatic Isolation of Compiler Errors
    Automatic Isolation of Compiler Errors DAVID B.WHALLEY Flor ida State University This paper describes a tool called vpoiso that was developed to automatically isolate errors in the vpo com- piler system. The twogeneral types of compiler errors isolated by this tool are optimization and nonopti- mization errors. When isolating optimization errors, vpoiso relies on the vpo optimizer to identify sequences of changes, referred to as transformations, that result in semantically equivalent code and to pro- vide the ability to stop performing improving (or unnecessary) transformations after a specified number have been performed. Acompilation of a typical program by vpo often results in thousands of improving transformations being performed. The vpoiso tool can automatically isolate the first improving transforma- tion that causes incorrect output of the execution of the compiled program by using a binary search that varies the number of improving transformations performed. Not only is the illegaltransformation automati- cally isolated, but vpoiso also identifies the location and instant the transformation is performed in vpo. Nonoptimization errors occur from problems in the front end, code generator,and necessary transforma- tions in the optimizer.Ifanother compiler is available that can produce correct (but perhaps more ineffi- cient) code, then vpoiso can isolate nonoptimization errors to a single function. Automatic isolation of compiler errors facilitates retargeting a compiler to a newmachine, maintenance of the compiler,and sup- porting experimentation with newoptimizations. General Terms: Compilers, Testing Additional Key Words and Phrases: Diagnosis procedures, nonoptimization errors, optimization errors 1. INTRODUCTION To increase portability compilers are often split into twoparts, a front end and a back end.
    [Show full text]
  • Automatic Derivation of Compiler Machine Descriptions
    Automatic Derivation of Compiler Machine Descriptions CHRISTIAN S. COLLBERG University of Arizona We describe a method designed to significantly reduce the effort required to retarget a compiler to a new architecture, while at the same time producing fast and effective compilers. The basic idea is to use the native C compiler at compiler construction time to discover architectural features of the new architecture. From this information a formal machine description is produced. Given this machine description, a native code-generator can be generated by a back-end generator such as BEG or burg. A prototype automatic Architecture Discovery Tool (called ADT) has been implemented. This tool is completely automatic and requires minimal input from the user. Given the Internet address of the target machine and the command-lines by which the native C compiler, assembler, and linker are invoked, ADT will generate a BEG machine specification containing the register set, addressing modes, instruction set, and instruction timings for the architecture. The current version of ADT is general enough to produce machine descriptions for the integer instruction sets of common RISC and CISC architectures such as the Sun SPARC, Digital Alpha, MIPS, DEC VAX, and Intel x86. Categories and Subject Descriptors: D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement—Portability; D.3.2 [Programming Languages]: Language Classifications— Macro and assembly languages; D.3.4 [Programming Languages]: Processors—Translator writ- ing systems and compiler generators General Terms: Languages Additional Key Words and Phrases: Back-end generators, compiler configuration scripts, retargeting 1. INTRODUCTION An important aspect of a compiler implementation is its retargetability.For example, a new programming language whose compiler can be quickly retar- geted to a new hardware platform or a new operating system is more likely to gain widespread acceptance than a language whose compiler requires extensive retargeting effort.
    [Show full text]
  • Sun Microsystems: Sun Java Workstation W1100z
    CFP2000 Result spec Copyright 1999-2004, Standard Performance Evaluation Corporation Sun Microsystems SPECfp2000 = 1787 Sun Java Workstation W1100z SPECfp_base2000 = 1637 SPEC license #: 6 Tested by: Sun Microsystems, Santa Clara Test date: Jul-2004 Hardware Avail: Jul-2004 Software Avail: Jul-2004 Reference Base Base Benchmark Time Runtime Ratio Runtime Ratio 1000 2000 3000 4000 168.wupwise 1600 92.3 1733 71.8 2229 171.swim 3100 134 2310 123 2519 172.mgrid 1800 124 1447 103 1755 173.applu 2100 150 1402 133 1579 177.mesa 1400 76.1 1839 70.1 1998 178.galgel 2900 104 2789 94.4 3073 179.art 2600 146 1786 106 2464 183.equake 1300 93.7 1387 89.1 1460 187.facerec 1900 80.1 2373 80.1 2373 188.ammp 2200 158 1397 154 1425 189.lucas 2000 121 1649 122 1637 191.fma3d 2100 135 1552 135 1552 200.sixtrack 1100 148 742 148 742 301.apsi 2600 170 1529 168 1549 Hardware Software CPU: AMD Opteron (TM) 150 Operating System: Red Hat Enterprise Linux WS 3 (AMD64) CPU MHz: 2400 Compiler: PathScale EKO Compiler Suite, Release 1.1 FPU: Integrated Red Hat gcc 3.5 ssa (from RHEL WS 3) PGI Fortran 5.2 (build 5.2-0E) CPU(s) enabled: 1 core, 1 chip, 1 core/chip AMD Core Math Library (Version 2.0) for AMD64 CPU(s) orderable: 1 File System: Linux/ext3 Parallel: No System State: Multi-user, Run level 3 Primary Cache: 64KBI + 64KBD on chip Secondary Cache: 1024KB (I+D) on chip L3 Cache: N/A Other Cache: N/A Memory: 4x1GB, PC3200 CL3 DDR SDRAM ECC Registered Disk Subsystem: IDE, 80GB, 7200RPM Other Hardware: None Notes/Tuning Information A two-pass compilation method is
    [Show full text]
  • Cray Scientific Libraries Overview
    Cray Scientific Libraries Overview What are libraries for? ● Building blocks for writing scientific applications ● Historically – allowed the first forms of code re-use ● Later – became ways of running optimized code ● Today the complexity of the hardware is very high ● The Cray PE insulates users from this complexity • Cray module environment • CCE • Performance tools • Tuned MPI libraries (+PGAS) • Optimized Scientific libraries Cray Scientific Libraries are designed to provide the maximum possible performance from Cray systems with minimum effort. Scientific libraries on XC – functional view FFT Sparse Dense Trilinos BLAS FFTW LAPACK PETSc ScaLAPACK CRAFFT CASK IRT What makes Cray libraries special 1. Node performance ● Highly tuned routines at the low-level (ex. BLAS) 2. Network performance ● Optimized for network performance ● Overlap between communication and computation ● Use the best available low-level mechanism ● Use adaptive parallel algorithms 3. Highly adaptive software ● Use auto-tuning and adaptation to give the user the known best (or very good) codes at runtime 4. Productivity features ● Simple interfaces into complex software LibSci usage ● LibSci ● The drivers should do it all for you – no need to explicitly link ● For threads, set OMP_NUM_THREADS ● Threading is used within LibSci ● If you call within a parallel region, single thread used ● FFTW ● module load fftw (there are also wisdom files available) ● PETSc ● module load petsc (or module load petsc-complex) ● Use as you would your normal PETSc build ● Trilinos ●
    [Show full text]
  • Porting Open64 to the Cygwin Environment
    Porting Open64 to the Cygwin Environment Nathan Tallent Mike Fagan ∗ November 2003 Abstract Cygwin is a Linux-like environment for Windows. It is sufficiently com- plete and stable that porting even very large codes to the environment is relatively straightforward. Cygwin is easy enough to download and install that it provides a convenient platform for traveling with code and giving conference demonstra- tions (via laptop) – even potentially expanding a code’s audience and user base (via desktop). However, for various reasons, Cygwin is not 100% compatible with Linux. We describe the major problems we encountered porting Rice University’s Open64/SL code base to Cygwin and present our solutions. 1 Introduction Cygwin is a Linux-like environment for Windows. It provides a Windows DLL that emulates nearly all of the Linux API. Programs written for Linux can link with the Cygwin DLL and thus run on Windows. A reasonably proficient Unix user can easily download and install a relatively complete Linux environment, including shells, de- velopment tools (GCC, make, gdb), editors (emacs, vi) and a graphical environment (XFree86). Moreover, the environment is complete and stable enough that it is rela- tively straightforward to port even very large codes to Cygwin. Because of the preva- lence of Windows-based desktops and laptops, Cygwin provides an attractive means for traveling with code and demonstrating it at conferences – even potentially expand- ing a code’s audience and user base. However, for full-time development or for critical applications, Unix remains superior. More information about Cygwin can be found on its web site, http://www.cygwin.com.
    [Show full text]
  • Introduchon to Arm for Network Stack Developers
    Introducon to Arm for network stack developers Pavel Shamis/Pasha Principal Research Engineer Mvapich User Group 2017 © 2017 Arm Limited Columbus, OH Outline • Arm Overview • HPC SoLware Stack • Porng on Arm • Evaluaon 2 © 2017 Arm Limited Arm Overview © 2017 Arm Limited An introduc1on to Arm Arm is the world's leading semiconductor intellectual property supplier. We license to over 350 partners, are present in 95% of smart phones, 80% of digital cameras, 35% of all electronic devices, and a total of 60 billion Arm cores have been shipped since 1990. Our CPU business model: License technology to partners, who use it to create their own system-on-chip (SoC) products. We may license an instrucBon set architecture (ISA) such as “ARMv8-A”) or a specific implementaon, such as “Cortex-A72”. …and our IP extends beyond the CPU Partners who license an ISA can create their own implementaon, as long as it passes the compliance tests. 4 © 2017 Arm Limited A partnership business model A business model that shares success Business Development • Everyone in the value chain benefits Arm Licenses technology to Partner • Long term sustainability SemiCo Design once and reuse is fundamental IP Partner Licence fee • Spread the cost amongst many partners Provider • Technology reused across mulBple applicaons Partners develop • Creates market for ecosystem to target chips – Re-use is also fundamental to the ecosystem Royalty Upfront license fee OEM • Covers the development cost Customer Ongoing royalBes OEM sells • Typically based on a percentage of chip price
    [Show full text]
  • Sun Microsystems
    OMPM2001 Result spec Copyright 1999-2002, Standard Performance Evaluation Corporation Sun Microsystems SPECompMpeak2001 = 11223 Sun Fire V40z SPECompMbase2001 = 10862 SPEC license #:HPG0010 Tested by: Sun Microsystems, Santa Clara Test site: Menlo Park Test date: Feb-2005 Hardware Avail:Apr-2005 Software Avail:Apr-2005 Reference Base Base Peak Peak Benchmark Time Runtime Ratio Runtime Ratio 10000 20000 30000 310.wupwise_m 6000 432 13882 431 13924 312.swim_m 6000 424 14142 401 14962 314.mgrid_m 7300 622 11729 622 11729 316.applu_m 4000 536 7463 419 9540 318.galgel_m 5100 483 10561 481 10593 320.equake_m 2600 240 10822 240 10822 324.apsi_m 3400 297 11442 282 12046 326.gafort_m 8700 869 10011 869 10011 328.fma3d_m 4600 608 7566 608 7566 330.art_m 6400 226 28323 226 28323 332.ammp_m 7000 1359 5152 1359 5152 Hardware Software CPU: AMD Opteron (TM) 852 OpenMP Threads: 4 CPU MHz: 2600 Parallel: OpenMP FPU: Integrated Operating System: SUSE LINUX Enterprise Server 9 CPU(s) enabled: 4 cores, 4 chips, 1 core/chip Compiler: PathScale EKOPath Compiler Suite, Release 2.1 CPU(s) orderable: 1,2,4 File System: ufs Primary Cache: 64KBI + 64KBD on chip System State: Multi-User Secondary Cache: 1024KB (I+D) on chip L3 Cache: None Other Cache: None Memory: 16GB (8x2GB, PC3200 CL3 DDR SDRAM ECC Registered) Disk Subsystem: SCSI, 73GB, 10K RPM Other Hardware: None Notes/Tuning Information Baseline Optimization Flags: Fortran : -Ofast -OPT:early_mp=on -mcmodel=medium -mp C : -Ofast -mp Peak Optimization Flags: 310.wupwise_m : -mp -Ofast -mcmodel=medium -LNO:prefetch_ahead=5:prefetch=3
    [Show full text]
  • Best Practice Guide - Generic X86 Vegard Eide, NTNU Nikos Anastopoulos, GRNET Henrik Nagel, NTNU 02-05-2013
    Best Practice Guide - Generic x86 Vegard Eide, NTNU Nikos Anastopoulos, GRNET Henrik Nagel, NTNU 02-05-2013 1 Best Practice Guide - Generic x86 Table of Contents 1. Introduction .............................................................................................................................. 3 2. x86 - Basic Properties ................................................................................................................ 3 2.1. Basic Properties .............................................................................................................. 3 2.2. Simultaneous Multithreading ............................................................................................. 4 3. Programming Environment ......................................................................................................... 5 3.1. Modules ........................................................................................................................ 5 3.2. Compiling ..................................................................................................................... 6 3.2.1. Compilers ........................................................................................................... 6 3.2.2. General Compiler Flags ......................................................................................... 6 3.2.2.1. GCC ........................................................................................................ 6 3.2.2.2. Intel ........................................................................................................
    [Show full text]
  • Automated Malware Analysis Report for Funny Linux.Elf
    ID: 449051 Sample Name: funny_linux.elf Cookbook: defaultlinuxfilecookbook.jbs Time: 05:20:57 Date: 15/07/2021 Version: 33.0.0 White Diamond Table of Contents Table of Contents 2 Linux Analysis Report funny_linux.elf 3 Overview 3 General Information 3 Detection 3 Signatures 3 Classification 3 Analysis Advice 3 General Information 3 Process Tree 3 Yara Overview 3 Jbx Signature Overview 4 Mitre Att&ck Matrix 4 Malware Configuration 4 Behavior Graph 4 Antivirus, Machine Learning and Genetic Malware Detection 5 Initial Sample 5 Dropped Files 5 Domains 5 URLs 5 Domains and IPs 5 Contacted Domains 5 Contacted IPs 5 Runtime Messages 6 Joe Sandbox View / Context 6 IPs 6 Domains 6 ASN 6 JA3 Fingerprints 6 Dropped Files 6 Created / dropped Files 6 Static File Info 6 General 6 Static ELF Info 7 ELF header 7 Sections 7 Program Segments 8 Dynamic Tags 8 Symbols 8 Network Behavior 10 System Behavior 10 Analysis Process: funny_linux.elf PID: 4576 Parent PID: 4498 10 General 10 File Activities 10 File Read 10 Copyright Joe Security LLC 2021 Page 2 of 10 Linux Analysis Report funny_linux.elf Overview General Information Detection Signatures Classification Sample funny_linux.elf Name: SSaampplllee hhaass sstttrrriiippppeedd ssyymbboolll tttaabblllee Analysis ID: 449051 Sample has stripped symbol table MD5: e0ba4089e9b457… Ransomware SHA1: 21b3392a2fdab2a… Miner Spreading SHA256: d2544462756205… mmaallliiiccciiioouusss malicious Evader Phishing Infos: sssuusssppiiiccciiioouusss suspicious cccllleeaann clean Exploiter Banker Spyware Trojan / Bot Adware Score:
    [Show full text]