And 64-Bit Windows® Operating Systems
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
07 Vectorization for Intel C++ & Fortran Compiler .Pdf
Vectorization for Intel® C++ & Fortran Compiler Presenter: Georg Zitzlsberger Date: 10-07-2015 1 Agenda • Introduction to SIMD for Intel® Architecture • Compiler & Vectorization • Validating Vectorization Success • Reasons for Vectorization Fails • Intel® Cilk™ Plus • Summary 2 Optimization Notice Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Vectorization • Single Instruction Multiple Data (SIMD): . Processing vector with a single operation . Provides data level parallelism (DLP) . Because of DLP more efficient than scalar processing • Vector: . Consists of more than one element . Elements are of same scalar data types (e.g. floats, integers, …) • Vector length (VL): Elements of the vector A B AAi i BBi i A B Ai i Bi i Scalar Vector Processing + Processing + C CCi i C Ci i VL 3 Optimization Notice Copyright © 2015, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Evolution of SIMD for Intel Processors Present & Future: Goal: Intel® MIC Architecture, 8x peak FLOPs (FMA) over 4 generations! Intel® AVX-512: • 512 bit Vectors • 2x FP/Load/FMA 4th Generation Intel® Core™ Processors Intel® AVX2 (256 bit): • 2x FMA peak Performance/Core • Gather Instructions 2nd Generation 3rd Generation Intel® Core™ Processors Intel® Core™ Processors Intel® AVX (256 bit): • Half-float support • 2x FP Throughput • Random Numbers • 2x Load Throughput Since 1999: Now & 2010 2012 2013 128 bit Vectors Future Time 4 Optimization Notice -
Exploring the X64
Exploring the x64 Junichi Murakami Executive Officer, Director of Research Fourteenforty Research Institute, Inc. Who am I? • Junichi Murakami – @Fourteenforty Research Institute, Inc. – Both Windows and Linux kernel development – Reversing malware and P2P software, etc. – Speaker at: • Black Hat 2008 US and Japan, AVAR 2009, RSA Conference(2009-) – Instructor at Security & Programming Camp(2006-) 2 Environment • Windows 7 x64 Edition • Visual Studio 2008 • Windbg • IDA Pro Advanced – STD doesn’t support x64, an offering is needed! 4 Agenda • Windows x64 • ABI(Application Binary Interface) • API Hooking • Code Injection 5 Windows x64 • Native x64 and WoW64 • Virtual Address Space – 2^64 = 16 Exa Byte ( Exa: 10^18) – but, limited to 16TB by Microsoft • File/Registry reflection • New 64-bit APIs – IsWow64Process, GetNativeSystemInfo, etc. 6 ABI • Binary Format • Register • Calling Convention • Exception Handling • Systemcall(x64, WoW64) 11 Binary Format(Cont.) • Some fields were extended to 64-bits – IMAGE_NT_HEADERS.IMAGE_OPTIONAL_HEADER • ImageBase • SizeOfStackReserve • SizeOfStackCommit • SizeOfHeapReserve • SizeOfHeapCommit 13 Calling Convention • first 4 parameters are passed by RCX, RDX, R8, R9 – 5th and later are passed on the stack • caller allocates register home space on the stack • RAX is used for return values • leaf / non-leaf function – leaf function: never use stack – PE32+ contains non-leaf function’s information in its EXCEPTION DIRECTORY • Register’s volatility – volatile: RAX, RCX, RDX, R8-R11 15 Exception Handling • -
Through the Looking Glass: Webcam Interception and Protection in Kernel
VIRUS BULLETIN www.virusbulletin.com Covering the global threat landscape THROUGH THE LOOKING GLASS: and WIA (Windows Image Acquisition), which provides a WEBCAM INTERCEPTION AND still image acquisition API. PROTECTION IN KERNEL MODE ATTACK VECTORS Ronen Slavin & Michael Maltsev Reason Software, USA Let’s pretend for a moment that we’re the bad guys. We have gained control of a victim’s computer and we can run any code on it. We would like to use his camera to get a photo or a video to use for our nefarious purposes. What are our INTRODUCTION options? When we talk about digital privacy, the computer’s webcam The simplest option is just to use one of the user-mode APIs is one of the most relevant components. We all have a tiny mentioned previously. By default, Windows allows every fear that someone might be looking through our computer’s app to access the computer’s camera, with the exception of camera, spying on us and watching our every move [1]. And Store apps on Windows 10. The downside for the attackers is while some of us think this scenario is restricted to the realm that camera access will turn on the indicator LED, giving the of movies, the reality is that malware authors and threat victim an indication that somebody is watching him. actors don’t shy away from incorporating such capabilities A sneakier method is to spy on the victim when he turns on into their malware arsenals [2]. the camera himself. Patrick Wardle described a technique Camera manufacturers protect their customers by incorporating like this for Mac [8], but there’s no reason the principle into their devices an indicator LED that illuminates when can’t be applied to Windows, albeit with a slightly different the camera is in use. -
Analysis of SIMD Applicability to SHA Algorithms O
1 Analysis of SIMD Applicability to SHA Algorithms O. Aciicmez Abstract— It is possible to increase the speed and throughput of The remainder of the paper is organized as follows: In an algorithm using parallelization techniques. Single-Instruction section 2 and 3, we introduce SIMD concept and the SIMD Multiple-Data (SIMD) is a parallel computation model, which has architecture of Intel including MMX technology and SSE already employed by most of the current processor families. In this paper we will analyze four SHA algorithms and determine extensions. Section 4 describes SHA algorithm and Section possible performance gains that can be achieved using SIMD 5 discusses the possible improvements on SHA performance parallelism. We will point out the appropriate parts of each that can be achieved by using SIMD instructions. algorithm, where SIMD instructions can be used. II. SIMD PARALLEL PROCESSING I. INTRODUCTION Single-instruction multiple-data execution model allows Today the security of a cryptographic mechanism is not the several data elements to be processed at the same time. The only concern of cryptographers. The heavy communication conventional scalar execution model, which is called single- traffic on contemporary very large network of interconnected instruction single-data (SISD) deals only with one pair of data devices demands a great bandwidth for security protocols, and at a time. The programs using SIMD instructions can run much hence increasing the importance of speed and throughput of a faster than their scalar counterparts. However SIMD enabled cryptographic mechanism. programs are harder to design and implement. A straightforward approach to improve cryptographic per- The most common use of SIMD instructions is to perform formance is to implement cryptographic algorithms in hard- parallel arithmetic or logical operations on multiple data ware. -
Fibre Channel Interface
Fibre Channel Interface Fibre Channel Interface ©2006, Seagate Technology LLC All rights reserved Publication number: 100293070, Rev. A March 2006 Seagate and Seagate Technology are registered trademarks of Seagate Technology LLC. SeaTools, SeaFONE, SeaBOARD, SeaTDD, and the Wave logo are either registered trade- marks or trademarks of Seagate Technology LLC. Other product names are registered trade- marks or trademarks of their owners. Seagate reserves the right to change, without notice, product offerings or specifications. No part of this publication may be reproduced in any form without written permission of Seagate Technol- ogy LLC. Revision status summary sheet Revision Date Writer/Engineer Sheets Affected A 03/08/06 C. Chalupa/J. Coomes All iv Fibre Channel Interface Manual, Rev. A Contents 1.0 Contents . i 2.0 Publication overview . 1 2.1 Acknowledgements . 1 2.2 How to use this manual . 1 2.3 General interface description. 2 3.0 Introduction to Fibre Channel . 3 3.1 General information . 3 3.2 Channels vs. networks . 4 3.3 The advantages of Fibre Channel . 4 4.0 Fibre Channel standards . 5 4.1 General information . 6 4.1.1 Description of Fibre Channel levels . 6 4.1.1.1 FC-0 . .6 4.1.1.2 FC-1 . .6 4.1.1.3 FC-1.5 . .6 4.1.1.4 FC-2 . .6 4.1.1.5 FC-3 . .6 4.1.1.6 FC-4 . .7 4.1.2 Relationship between the levels. 7 4.1.3 Topology standards . 7 4.1.4 FC Implementation Guide (FC-IG) . 7 4.1.5 Applicable Documents . -
AMD Opteron™ Shared Memory MP Systems Ardsher Ahmed Pat Conway Bill Hughes Fred Weber Agenda
AMD Opteron™ Shared Memory MP Systems Ardsher Ahmed Pat Conway Bill Hughes Fred Weber Agenda • Glueless MP systems • MP system configurations • Cache coherence protocol • 2-, 4-, and 8-way MP system topologies • Beyond 8-way MP systems September 22, 2002 Hot Chips 14 2 AMD Opteron™ Processor Architecture DRAM 5.3 GB/s 128-bit MCT CPU SRQ XBAR HT HT HT 3.2 GB/s per direction 3.2 GB/s per direction @ 0+]'DWD5DWH @ 0+]'DWD5DWH 3.2 GB/s per direction @ 0+]'DWD5DWH HT = HyperTransport™ technology September 22, 2002 Hot Chips 14 3 Glueless MP System DRAM DRAM MCT CPU MCT CPU SRQ SRQ non-Coherent HyperTransport™ Link XBAR XBAR HT I/O I/O I/O HT cHT cHT cHT cHT Coherent HyperTransport ™ cHT cHT I/O I/O HT HT cHT cHT XBAR XBAR CPU MCT CPU MCT SRQ SRQ HT = HyperTransport™ technology DRAM DRAM September 22, 2002 Hot Chips 14 4 MP Architecture • Programming model of memory is effectively SMP – Physical address space is flat and fully coherent – Far to near memory latency ratio in a 4P system is designed to be < 1.4 – Latency difference between remote and local memory is comparable to the difference between a DRAM page hit and a DRAM page conflict – DRAM locations can be contiguous or interleaved – No processor affinity or NUMA tuning required • MP support designed in from the beginning – Lower overall chip count results in outstanding system reliability – Memory Controller and XBAR operate at the processor frequency – Memory subsystem scale with frequency improvements September 22, 2002 Hot Chips 14 5 MP Architecture (contd.) • Integrated Memory Controller -
Minimum Hardware and Operating System
Hardware and OS Specifications File Stream Document Management Software – System Requirements for v4.5 NB: please read through carefully, as it contains 4 separate specifications for a Workstation PC, a Web PC, a Server and a Web Server. Further notes are at the foot of this document. If you are in any doubt as to which specification is applicable, please contact our Document Management Technical Support team – we will be pleased to help. www.filestreamsystems.co.uk T Support +44 (0) 118 989 3771 E Support [email protected] For an in-depth list of all our features and specifications, please visit: http://www.filestreamsystems.co.uk/document-management-specification.htm Workstation PC Processor (CPU) ⁴ Supported AMD/Intel x86 (32bit) or x64 (64bit) Compatible Minimum Intel Pentium IV single core 1.0 GHz Recommended Intel Core 2 Duo E8400 3.0 GHz or better Operating System ⁴ Supported Windows 8, Windows 8 Pro, Windows 8 Enterprise (32bit, 64bit) Windows 10 (32bit, 64bit) Memory (RAM) ⁵ Minimum 2.0 GB Recommended 4.0 GB Storage Space (Disk) Minimum 50 GB Recommended 100 GB Disk Format NTFS Format Recommended Graphics Card Minimum 128 MB DirectX 9 Compatible Recommended 128 MB DirectX 9 Compatible Display Minimum 1024 x 768 16bit colour Recommended 1280 x 1024 32bit colour Widescreen Format Yes (minimum vertical resolution 800) Dual Monitor Yes Font Settings Only 96 DPI font settings are supported Explorer Internet Minimum Microsoft Internet Explorer 11 Network (LAN) Minimum 100 MB Ethernet (not required on standalone PC) Recommended -
Architecture and Application of Infortrend Infiniband Design
Architecture and Application of Infortrend InfiniBand Design Application Note Version: 1.3 Updated: October, 2018 Abstract: Focusing on the architecture and application of InfiniBand technology, this document introduces the architecture, application scenarios and highlights of the Infortrend InfiniBand host module design. Infortrend InfiniBand Host Module Design Contents Contents ............................................................................................................................................. 2 What is InfiniBand .............................................................................................................................. 3 Overview and Background .................................................................................................... 3 Basics of InfiniBand .............................................................................................................. 3 Hardware ....................................................................................................................... 3 Architecture ................................................................................................................... 4 Application Scenarios for HPC ............................................................................................................. 5 Current Limitation .............................................................................................................................. 6 Infortrend InfiniBand Host Board Design ............................................................................................ -
SIMD Extensions
SIMD Extensions PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 12 May 2012 17:14:46 UTC Contents Articles SIMD 1 MMX (instruction set) 6 3DNow! 8 Streaming SIMD Extensions 12 SSE2 16 SSE3 18 SSSE3 20 SSE4 22 SSE5 26 Advanced Vector Extensions 28 CVT16 instruction set 31 XOP instruction set 31 References Article Sources and Contributors 33 Image Sources, Licenses and Contributors 34 Article Licenses License 35 SIMD 1 SIMD Single instruction Multiple instruction Single data SISD MISD Multiple data SIMD MIMD Single instruction, multiple data (SIMD), is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data simultaneously. Thus, such machines exploit data level parallelism. History The first use of SIMD instructions was in vector supercomputers of the early 1970s such as the CDC Star-100 and the Texas Instruments ASC, which could operate on a vector of data with a single instruction. Vector processing was especially popularized by Cray in the 1970s and 1980s. Vector-processing architectures are now considered separate from SIMD machines, based on the fact that vector machines processed the vectors one word at a time through pipelined processors (though still based on a single instruction), whereas modern SIMD machines process all elements of the vector simultaneously.[1] The first era of modern SIMD machines was characterized by massively parallel processing-style supercomputers such as the Thinking Machines CM-1 and CM-2. These machines had many limited-functionality processors that would work in parallel. -
Chapter 6 MIDI, SCSI, and Sample Dumps
MIDI, SCSI, and Sample Dumps SCSI Guidelines Chapter 6 MIDI, SCSI, and Sample Dumps SCSI Guidelines The following sections contain information on using SCSI with the K2600, as well as speciÞc sections dealing with the Mac and the K2600. Disk Size Restrictions The K2600 accepts hard disks with up to 2 gigabytes of storage capacity. If you attach an unformatted disk that is larger than 2 gigabytes, the K2600 will still be able to format it, but only as a 2 gigabyte disk. If you attach a formatted disk larger than 2 gigabytes, the K2600 will not be able to work with it; you could reformat the disk, but thisÑof courseÑwould erase the disk entirely. Configuring a SCSI Chain Here are some basic guidelines to follow when conÞguring a SCSI chain: 1. According to the SCSI SpeciÞcation, the maximum SCSI cable length is 6 meters (19.69 feet). You should limit the total length of all SCSI cables connecting external SCSI devices with Kurzweil products to 17 feet (5.2 meters). To calculate the total SCSI cable length, add the lengths of all SCSI cables, plus eight inches for every external SCSI device connected. No single cable length in the chain should exceed eight feet. 2. The Þrst and last devices in the chain must be terminated. There is a single exception to this rule, however. A K2600 with an internal hard drive and no external SCSI devices attached should have its termination disabled. If you later add an external device to the K2600Õs SCSI chain, you must enable the K2600Õs termination at that time. -
Application Note 904 an Introduction to the Differential SCSI Interface
DS36954 Application Note 904 An Introduction to the Differential SCSI Interface Literature Number: SNLA033 An Introduction to the Differential SCSI Interface AN-904 National Semiconductor An Introduction to the Application Note 904 John Goldie Differential SCSI Interface August 1993 OVERVIEW different devices to be connected to the same daisy chained The scope of this application note is to provide an introduc- cable (SCSI-1 and 2 allows up to eight devices while the pro- tion to the SCSI Parallel Interface and insight into the differ- posed SCSI-3 standard will allow up to 32 devices). A typical ential option specified by the SCSI standards. This applica- SCSI bus configuration is shown in Figure 1. tion covers the following topics: WHY DIFFERENTIAL SCSI? • The SCSI Interface In comparison to single-ended SCSI, differential SCSI costs • Why Differential SCSI? more and has additional power and PC board space require- • The SCSI Bus ments. However, the gained benefits are well worth the addi- • SCSI Bus States tional IC cost, PCB space, and required power in many appli- • SCSI Options: Fast and Wide cations. Differential SCSI provides the following benefits over single-ended SCSI: • The SCSI Termination • Reliable High Transfer Rates — easily capable of operat- • SCSI Controller Requirements ing at 10MT/s (Fast SCSI) without special attention to termi- • Summary of SCSI Standards nations. Even higher data rates are currently being standard- • References/Standards ized (FAST-20 @ 20MT/s). The companion Application Note (AN-905) focuses on the THE SCSI INTERFACE features of National’s new RS-485 hex transceiver. The The Small Computer System Interface is an ANSI (American DS36BC956 specifically designed for use in differential SCSI National Standards Institute) interface standard defining a applications is also optimal for use in other high speed, par- peer to peer generic input/output bus (I/O bus). -
Programming Model Intel Itanium 64
11/11/2003 64-bit computing AMD Opteron 64 Application of Win32 Executable File Legacy 64 bit platforms Inbuilt 128-bit bus DDR memory controller with memory bandwidth speed up to 5.3GB/s. Infectors on Intel Itanium and AMD Benefits of 64-bit processors Opteron Based Win64 Systems Use of hyper transport protocol, “glueless” architecture. Oleg Petrovsky and Shali Hsieh Increased integer dynamic range Computer Associates International Inc. Available in up to 8 way configuration with the clock speeds 1 Computer Associates Plaza, Islandia, NY 11749, Much larger addressable memory space of 1.4 GHz, 1.6 GHz and 1.8 GHz . USA Benefits to database, scientific and cryptography Reuses already familiar 32-bit x86 instruction set and applications extends it to support 64-bit operands, registers and memory pointers. AMD64 Programming Model AMD64: Programming model Intel Itanium 64 X86 32-64 64 bit Itanium line of processors is being developed by Intel XMM8 X86 80-Bit Extends general use registers to 64-bit, adds additional eight 64-Bit X87 general purpose 64-bit registers. Itanium - 800 MHz, no on die L3 cache, Itanium 2 - 1GHz, RAX EAX AX 3MB L3 on die, Itanium 2003 (Madison) - 1.5 GHz, 6MB L3 on die cache, 410M transistors, largest integration on a RBX Reuses x86 instruction set. single silicon crystal today. XMM15 RCX Runs 32-bit code without emulation or translation to a native Itanium line of processors utilizes more efficient and robust XMM0 than legacy x86 instruction set architecture F instruction set. R8 L A Itanium has to use x86-to-IA-64 decoder a specifically Minimizes learning curve.