Software Optimization Guide for AMD Family 15h Processors

Publication No.: 47414
Revision: 3.08
Date: January 2014

Advanced Micro Devices

© 2014 Advanced Micro Devices Inc. All rights reserved.

The information contained herein is for informational purposes only, and is subject to change without notice. While every precaution has been taken in the preparation of this document, it may contain technical inaccuracies, omissions and typographical errors, and AMD is under no obligation to update or otherwise correct this information. Advanced Micro Devices, Inc. makes no representations or warranties with respect to the accuracy or completeness of the contents of this document, and assumes no liability of any kind, including the implied warranties of noninfringement, merchantability or fitness for particular purposes, with respect to the operation or use of AMD hardware, software or other products described herein. No license, including implied or arising by estoppel, to any intellectual property rights is granted by this document. Terms and limitations applicable to the purchase or use of AMD’s products are as set forth in a signed agreement between the parties or in AMD’s Standard Terms and Conditions of Sale.

Trademarks

AMD, the AMD Arrow logo, and combinations thereof, AMD Athlon, AMD Opteron, 3DNow!, AMD Virtualization, and AMD-V are trademarks of Advanced Micro Devices, Inc. HyperTransport is a licensed trademark of the HyperTransport Technology Consortium. Linux is a registered trademark of Linus Torvalds. Microsoft and Windows are registered trademarks of Microsoft Corporation. MMX is a trademark of Intel Corporation. PCI-X and PCI Express are registered trademarks of the PCI-Special Interest Group (PCI-SIG). Solaris is a registered trademark of Sun Microsystems, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
Contents

Tables
Figures
Revision History

Chapter 1 Introduction
  1.1 Intended Audience
  1.2 Getting Started
  1.3 Using This Guide
    1.3.1 Special Information
    1.3.2 Numbering Systems
    1.3.3 Typographic Notation
  1.4 Important New Terms
    1.4.1 Multi-Core Processors
    1.4.2 Internal Instruction Formats
    1.4.3 Types of Instructions
  1.5 Key Optimizations
    1.5.1 Implementation Guideline
  1.6 What’s New on AMD Family 15h Processors
    1.6.1 AMD Instruction Set Enhancements
    1.6.2 Floating-Point Improvements
    1.6.3 Load-Execute Instructions for Unaligned Data
    1.6.4 Instruction Fetching Improvements
    1.6.5 Instruction Decode and Floating-Point Pipe Improvements
    1.6.6 Notable Performance Improvements
    1.6.7 Additional Enhancements for Models 30h–4Fh
    1.6.8 AMD Virtualization™ Optimizations

Chapter 2 Microarchitecture of AMD Family 15h Processors
  2.1 Key Microarchitecture Features
  2.2 Microarchitecture of AMD Family 15h Processors
  2.3 Superscalar Processor
  2.4 Processor Block Diagram
  2.5 AMD Family 15h Processor Cache Operations
    2.5.1 L1 Instruction Cache
    2.5.2 L1 Data Cache
    2.5.3 L2 Cache
    2.5.4 L3 Cache
  2.6 Branch-Prediction
  2.7 Instruction Fetch and Decode
  2.8 Integer Execution
  2.9 Translation-Lookaside Buffer
    2.9.1 L1 Instruction TLB Specifications
    2.9.2 L1 Data TLB Specifications
    2.9.3 L2 Instruction TLB Specifications
    2.9.4 L2 Data TLB Specifications
  2.10 Integer Unit
    2.10.1 Integer Scheduler
    2.10.2 Integer Execution Unit
  2.11 Floating-Point Unit
  2.12 Load-Store Unit
  2.13 Write Combining
  2.14 Integrated Memory Controller
  2.15 HyperTransport™ Technology Interface
    2.15.1 HyperTransport Assist

Chapter 3 C and C++ Source-Level Optimizations
  3.1 Declarations of Floating-Point Values
  3.2 Using Arrays and Pointers
  3.3 Use of Function Prototypes
  3.4 Unrolling Small Loops
  3.5 Expression Order in Compound Branch Conditions
  3.6 Arrange Boolean Operands for Quick Expression Evaluation
  3.7 Long Logical Expressions in If Statements
  3.8 Pointer Alignment
  3.9 Unnecessary Store-to-Load Dependencies
  3.10 Matching Store and Load Size
  3.11 Use of const Type Qualifier
  3.12 Generic Loop Hoisting
  3.13 Local Static Functions
  3.14 Explicit Parallelism in Code
  3.15 Extracting Common Subexpressions
  3.16 Sorting and Padding C and C++ Structures
  3.17 Replacing Integer Division with Multiplication
  3.18 Frequently Dereferenced Pointer Arguments
  3.19 32-Bit Integral Data Types ...