
Master Thesis ICT/ECS-2006-71

Multi-IP-Based SoC Design Including CCM Security Mode of Operation

By Solmaz Ghaznavi

A thesis presented to the University of Waterloo and KTH University in the fulfillment of the thesis requirement for the degree of Master of Science in System on-Chip Design

Waterloo, Ontario, Canada, 2006 © Solmaz Ghaznavi, 2006

Supervisor: Professor Cathy Gebotys Examiner: Professor Axel Jantsch

I hereby declare that I am the sole author of this thesis. I authorize the University of Waterloo and KTH University to lend this thesis to other institutions or individuals for the purpose of scholarly research.

I further authorize the University of Waterloo and KTH University to reproduce this thesis by photocopying or by other means, in total or in part, at the request of other institutions or individuals for the purpose of scholarly research.

Abstract

Embedding security in mobile electronic devices is of great importance. With the emergence of powerful self-contained FPGAs that include microprocessors, memory and other resources for SoC designs, the focus has shifted to these programmable platforms. A co-design approach can be used to optimize speed, area and power consumption by partitioning functions between the on-chip microprocessor and the programmable logic blocks. FPGAs typically provide higher efficiency than software. On the other hand, they offer more flexibility and much lower design and debug costs than purpose-built hardware. This thesis implements the CCM security mode of operation on an FPGA platform using the AES algorithm, and then builds a complete SoC based on multiple IP cores, including CCM. Except for the hard on-chip IP cores (i.e. microprocessors and memory), the device controllers, the PLB and OPB buses and CCM are all soft IP peripherals that together build a complex system. Building the elements as soft IP cores makes further on-chip development or modification very easy. The CCM core sits on the PLB bus at 80 MHz and can easily communicate with the PowerPC and with the DDR SDRAM and BRAM controllers, which are on the same bus. The implementation exploits the iterative structure of AES to save hardware resources; it implements the key expansion core as well. The thesis also reports on the challenges and problems encountered throughout the implementation.


Acknowledgements

I would like to thank my supervisor, Professor Cathy Gebotys, for all her advice, guidance and encouragement. I would like to acknowledge CMC (Canadian Microelectronics Corporation) support for using the AP1100 board. I would also like to thank my parents and my best friend Adela for their support.

Table of Contents

Abstract

List of Figures

List of Tables

1 Introduction
1.1 Thesis Objective
1.2 Security Algorithm Choice
1.3 Thesis Overview

2 Board and the FPGA Features
2.1 Board Architecture
2.2 Configuration, Debugging and Power Connections
2.3 FPGA Features
2.3.1 Configurable Logic Blocks
2.3.2 Slice Description
2.3.3 Memory Style
2.3.3.1 Distributed SelectRAM+
2.3.3.2 Block SelectRAM+
2.3.4 FPGA Clocking

3 Security Standards
3.1 CCM
3.1.1 CCM Cryptographic Techniques
3.1.1.1 Counter Mode Encryption (CTR)
3.1.1.2 CBC-MAC
3.1.2 CCM Security Assurance
3.2 Advanced Encryption Standard (AES)
3.2.1 AES Cipher
3.2.2 Key Expansion

4 Design and Analysis of CCM in SoC
4.1 Security Design Objective
4.2 High Level Design Architecture
4.2.1 User Logic S/W Register Support
4.2.2 Memory Map of PowerPC
4.3 CCM Implementation and Analysis
4.3.1 Key Expansion and Synthesis Analysis
4.3.2 Cipher Module and Synthesis Analysis
4.3.3 Comparison with Previous Research
4.3.3.1 Microprocessor Implementation
4.3.3.2 FPGA Implementation
4.3.3.2.1 AES Iterative Implementation
4.3.3.2.2 AES Unrolled Implementation
4.3.4 Conclusion
4.4 Testing and Debugging
4.5 Software Tools and Some Practical Hints

5 Discussion and Conclusions
5.1 Summary
5.2 Limitations and Future Work

References

Appendix A: AES Cipher HDL Synthesis Report
Appendix B: MixColumns HDL Synthesis Report
Appendix C: Key Expansion HDL Synthesis Report
Appendix D: S-box (AES Forward Cipher)
Appendix E: Test Vectors
Appendix F: VHDL Codes

List of Figures

Figure 2-1. AP1100 Board Architecture
Figure 2-2. Virtex-II Pro CLB Element
Figure 2-3. General Slice in Virtex-II Pro
Figure 2-4. Half Slice in Virtex-II Pro
Figure 2-5. Single-port Distributed SelectRAM+
Figure 2-6. Dual-port Distributed SelectRAM+
Figure 3-1. CTR Block Diagram
Figure 3-2. CBC-MAC Block Diagram
Figure 3-3. Forward Cipher Operation
Figure 3-4. Key Expansion
Figure 4-1. Baseline Block Diagram with CCM Added as Part of the System
Figure 4-2. XMD Window Showing How to Trigger Key Expansion and CCM
Figure 4-3. Memory Map of PowerPC
Figure 4-4. CBC-MAC Schematic
Figure 4-5. Key Expansion RTL Schematic
Figure 4-6. S-box After Synthesis
Figure 4-7. Cipher RTL Schematic
Figure 4-8. AES Iterative Implementation
Figure 4-9. AES Unrolled Pipelined Architecture

List of Tables

Table 2-1. Virtex-II Pro Resources
Table 2-2. Resources in a CLB (4 slices)
Table 2-3. Resources Used by Distributed Memory
Table 2-4. Supported Memory Configurations for Single-port and Dual-port Modes
Table 2-5. Distributed RAM Switching Characteristics
Table 2-6. Block RAM Switching Characteristics
Table 3-1. CTRi Formatting
Table 3-2. Flags Byte
Table 3-3. Block Zero (B0) for CBC-MAC
Table 3-4. Flags Byte in B0
Table 3-5. Parameters Dependent on
Table 3-6. Round Constant Bytes, RC in Hexadecimal
Table 4-1. IPIF Software Reset Register Description
Table 4-2. Instructions Execution for MixColumns
Table 4-3. AES Encryption Results

1 Introduction

The emergence of many electronic devices such as Subscriber Identity Module (SIM) cards in mobile phones, cash cards and Radio Frequency Identification (RFID) chips has increased the need for security. As a consequence, there is a growing desire to embed security in system-on-chip (SoC) designs within many mobile and ubiquitous devices.

There are three common ways of implementing algorithms in electronic devices. The first is to use hardware built specifically for the algorithm, such as an Application Specific Integrated Circuit (ASIC). This method produces a device that is highly efficient with respect to speed, area and power consumption. However, it has some downsides, including a long and expensive design process, inflexibility (if the product must be modified because of flaws or updates, it usually has to be remanufactured) and cost.

The second approach is to use a software-programmed microprocessor. Its great advantage is that modifications are made in software, which makes it very flexible compared with purpose-built hardware. On the other hand, it may not be efficient with respect to speed, area and power consumption.

Reconfigurable devices such as Field Programmable Gate Arrays (FPGAs) can be considered an intermediate option. FPGAs provide Configurable Logic Blocks (CLBs) and routing resources that are programmable. Since the design can be tested and verified at the user site, it benefits from a less expensive design process than an ASIC. FPGAs provide very high flexibility and can produce devices that are relatively efficient with respect to speed, area and power consumption. The need for flexibility in case of modification or damage can be crucial; for instance, cosmic rays might affect electronic devices in space equipment or satellites, so the capability to reprogram the devices remotely could be of extreme importance.

In this thesis, a security mode of operation called CCM (Counter with Cipher Block Chaining-Message Authentication Code) that uses the AES block cipher is implemented as an IP peripheral. The implementation has all the elements of a system-on-chip design, including a microprocessor, memory and different buses; the buses, controllers and CCM are all soft IP cores.

1.1 Thesis Objective

Current state-of-the-art FPGAs provide ample hardware resources for a complete SoC design. An IP-based approach, in which different devices, controllers and buses are kept as soft IP cores in a library, could lead to a very flexible, efficient and fast methodology for SoC designs on FPGAs. The main objective of this thesis is to implement the CCM security algorithm as an IP peripheral, following an IP-based design approach, in order to provide a very flexible and powerful SoC implementation on the Xilinx Virtex-II Pro FPGA. This self-contained FPGA implementation includes all the necessary elements of a SoC design, such as the microprocessor, memory and buses (the device controllers, buses and CCM are soft IP cores), which can easily communicate with each other.

1.2 Security Algorithm Choice

CCM is a security mode that provides authentication assurance through the scarcity of valid ciphertexts, meaning that an attacker without access to the key cannot easily generate a valid ciphertext. The output of the decryption-verification process is therefore either an error indicating that the input is invalid or the valid plaintext. However, an attacker can still produce a valid ciphertext with a certain probability. An important parameter in CCM mode can be set to control the probability of accepting inauthentic data as authentic; more security comes at the price of larger bandwidth [ref. 1]. CCM is based on an approved symmetric key block cipher algorithm whose block size is 128 bits. In this thesis the underlying symmetric block cipher is the Advanced Encryption Standard (AES), which was approved as the standard to replace the Data Encryption Standard (DES) [ref. 1]. One advantage of CCM is that it uses only the forward cipher in both the generation-encryption and decryption-verification processes. Another advantage is that it allows preprocessing: the counter blocks may be generated in advance. AES was published by NIST (National Institute of Standards and Technology) in 2001. Among the initial 21 candidates, 5 were chosen according to the following criteria [ref. 2]:

- General Security: AES has no known attacks as yet, although it has received some criticism about the potential vulnerability of its mathematical structure.
- Software Implementations: AES has high potential for parallelism, which leads to efficient use of processor resources in software implementations.
- Restricted-Space Environments: AES is very well suited for environments where only encryption or only decryption is required. The downside where both are needed is the ROM requirement.
- Hardware Implementations: AES has the potential for parallelism and concurrency through unrolled or pipelined implementations, which come at the price of larger area.
- Attacks on Implementations: Compared with the other candidate algorithms, AES was among the easiest to defend against power and timing attacks without significant performance degradation.
- Encryption vs. Decryption: AES performance does not vary significantly between encryption and decryption, although the key setup takes longer for decryption.
- Key Agility: This refers to the ability to change the key quickly with minimum resources. AES requires the key expansion to run once for a specific key. Key expansion requires some hardware resources in either encryption or decryption.
- Other Versatility and Flexibility: AES supports key sizes of 128, 192 and 256 bits, which can be selected according to the level of security needed, and a data block size of 128 bits.
- Potential for Instruction-Level Parallelism: AES has a very high capability for concurrency within a single block encryption.

1.3 Thesis Overview

This thesis is composed of 5 Chapters. Chapter 2 provides the technical information on the board and the FPGA used in the thesis. Chapter 3 describes the security algorithm (CCM) used in this thesis in a clear and concise manner. Chapter 4 presents the thesis contributions in implementing CCM in an IP-based system, compares it with previous research, and discusses other design architectures. Chapter 5 describes the limitations and contributions of this research.

2 Board and the FPGA Features

In this thesis the implementation on the FPGA is based on multiple IP cores that provide the elements and devices needed in a SoC. The main hard SoC units on the FPGA are 444 RAM blocks of 18 Kb each and two microprocessors; the other elements, such as the PLB bus, OPB bus, PLB DDR SDRAM controller, PLB BRAM controller and the security module (CCM), are all soft IP cores in VHDL or Verilog. Building a SoC from IP cores offers flexibility and speed in the design process and in further modifications. This chapter is devoted to the AP1100 board and the Xilinx Virtex-II Pro FPGA used in this thesis. Its purpose is not to go through the details of the datasheets; instead, it explains the main practical features that were involved in this project or could be useful in future work.

2.1 Board Architecture [ref. 4]

Figure 2-1 shows the AP1100 board hardware architecture. The Virtex-II Pro FPGA is the main component of the board and has interfaces to the different on-board devices. The two DDR SDRAM banks (64 MB each) provide a 32-bit data width for the two on-chip microprocessors. There are also two separate 18 Mb synchronous SRAMs providing a large data width. These SRAMs can be accessed as a single 72-bit bank or as two completely separate 36-bit banks. The memory controllers inside the FPGA are soft IP cores. As shown in Figure 2-1, the configuration Flash, program Flash and System ACE are accessible through the local bus interface.

A ported Linux distribution (2.4.18) is included for use with the AP1100. The kernel binary is stored on the board in program Flash at 0x20060000, and a ramdisk image is stored in program Flash at 0x20160000. By default, the AP1100 loads the kernel and mounts the ramdisk when powered on. U-Boot is a bootloader that provides the ability to load Linux, as well as a monitor program that allows access to the AP1100 resources. It is stored in program Flash at 0x20000000 and is transferred to memory for execution. While U-Boot is running, it makes use of the SDRAM.

The System ACE provides the FPGA with an additional high-speed, high-performance configuration solution. The Processor Bus Dual PCI Bridge provides an interface to additional devices on the local PCI bus. This bus makes it possible to add a wide variety of I/Os by installing a PMC module; it also allows an Ethernet controller to be attached for network access. Furthermore, high-speed network interfaces can be used through the two Gigabit Ethernet physical layer devices that are connected directly to the Virtex-II Pro.

Additional expansion connectors are available through the Expansion I/O ports on the board. These expansion ports allow either cabling or custom PCB daughter cards to be connected directly to the Virtex-II Pro. The CompactFlash, PCI 10/100/1000 Ethernet and RS-232D connectors are accessible from outside the chassis, while the remaining connectors are accessible from within the system chassis.

Figure 2-1. AP1100 Board Architecture [ref. 3]

Here is the list of main features of AP1100:

- Xilinx Virtex-II Pro Platform FPGA with two embedded PowerPC405 processors
- Dual 64 MB DDR SDRAM Banks
- Dual 2 MB SRAM Banks
- 16 MB Program Flash
- 16 MB Configuration Flash
- Xilinx System ACE CompactFlash Interface
- 64-bit/66 MHz System PCI Bus
- 32-bit/66 MHz Local PCI Bus
- Four HSSDC2 Connectors offering direct access to four Virtex-II Pro MGTs (Multi-Gigabit Transceivers)
- Dual 10/100/1000BASE-T Ethernet Ports
- 10/100/1000BASE-T Ethernet Port
- RS-232D Serial Port
- Single IEEE 1386.1 PMC site
- U-Boot 1.1.1
- Linux 4.0

2.2 Configuration, Debugging and Power Connections

Amirix AP1100 provides several connectors that can be found in the datasheet. The connectors that were involved in this project are as follows [ref. 4]:

- PCI bus: the card can be installed in any PCI slot; to benefit from the full PCI bandwidth it should be placed in a 64-bit PCI slot at 66 MHz. The AP1100 operates from a single 3.3 V supply and meets the maximum power consumption requirement for a PCI card (25 W).
- Xilinx Parallel-IV cable: this high-speed download cable configures or programs the Xilinx FPGA. It connects to the JTAG port of the FPGA. The cable uses the IEEE 1284 ECP protocol and the Xilinx iMPACT software to reach download speeds more than eight times faster than existing solutions. A 3-way mouse port cable between the mouse connector and the PC's mouse port provides power for the Parallel-IV cable.

The configuration mode used is boundary-scan, an industry standard (IEEE 1149.1 and 1532) for serial programming. External logic from a cable, microprocessor or other device drives the JTAG-specific pins: Test Data In (TDI), Test Mode Select (TMS), Test Clock (TCK) and Test Data Out (TDO, used to sense the device response). This is the most popular configuration mode because of its standardization and its ability to program FPGAs, PLDs and PROMs through only these four JTAG pins. In boundary-scan mode the data is transferred at one bit per TCK cycle [ref. 5]. The PowerPC has a built-in JTAG port for debugging. The JTAG ports of both PowerPC processors can be chained with the JTAG port of the FPGA using a bus interface called JTAGPPC. “EDK provides wrappers (jtagppc_cntlr) for connecting the PowerPC and JTAGPPC. This way, the same JTAG cable used by the iMPACT tool for configuring the FPGA with a bitstream file can also be used for debugging PowerPC programs” [ref. 6]. The Xilinx Microprocessor Debugger (XMD) provides a Tool Command Language (Tcl) interface. The XMD console can be used for command-line control, testing and debugging of the target. It is also capable of running complex test scripts to verify a complete system. XMD communicates with the PowerPC through the JTAG connection on the board [ref. 7].

2.3 FPGA Features

The Virtex-II Pro family consists of user-programmable gate arrays for designs that are based on soft IP peripherals. The family includes multi-gigabit transceivers and PowerPC microprocessor blocks within the FPGA. It is built on a 0.13 µm, nine-layer copper CMOS process. The specifications of the Xilinx FPGA on the AP1100 board used in this project are as follows [ref. 8]:

- Architecture: virtex-ii pro
- Device size: xc2vp100
- Package: ff1704
- Grade: -6

Table 2-1 shows the resources in the Virtex-II Pro FPGA:

Table 2-1. Virtex-II Pro Resources [ref. 8]

| Resource | xc2vp100 |
|---|---|
| PowerPC405 processor blocks | 2 |
| Logic cells (1) | 99216 |
| CLB slices (1 CLB = 4 slices = max 128 bits) | 44096 |
| CLB max distributed RAM (Kb) | 1378 |
| 18 x 18 bit multiplier blocks | 444 |
| Block SelectRAM+ 18 Kb blocks | 444 |
| Block SelectRAM+ max block RAM (Kb) | 7992 |
| DCMs (2) | 12 |
| Maximum user I/O pads | 1164 |

Notes:
1- A logic cell includes a 4-input LUT + (1) FF + carry logic.
2- DCM: Digital clock manager.
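The figures in Table 2-1 are related by simple arithmetic, which the following sketch reproduces as a consistency check. The constants come from the table and from the slice description in this chapter; the factor of 2.25 logic cells per slice is an assumption inferred from the table values rather than something stated in the text.

```python
# Values taken from Table 2-1 for the xc2vp100 device.
SLICES = 44096
BLOCK_RAMS = 444

SLICES_PER_CLB = 4     # one CLB holds four slices
LUTS_PER_SLICE = 2     # function generators F and G
BITS_PER_LUT = 16      # one LUT stores a 16 x 1-bit memory
BLOCK_RAM_KB = 18      # size of one block SelectRAM+ in Kb

clbs = SLICES // SLICES_PER_CLB
distributed_ram_kb = SLICES * LUTS_PER_SLICE * BITS_PER_LUT // 1024
block_ram_kb = BLOCK_RAMS * BLOCK_RAM_KB

# Assumption: the "logic cell" count is a weighted figure of
# 2.25 cells per slice (9/4), inferred from the table itself.
logic_cells = SLICES * 9 // 4

print(clbs, distributed_ram_kb, block_ram_kb, logic_cells)
```

The computed distributed RAM, block RAM and logic cell totals agree with the corresponding table entries.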

2.3.1 Configurable Logic Blocks

The Virtex-II Pro configurable logic blocks (CLBs) are organized in an array and are used to build combinatorial and sequential circuits. As shown in Figure 2-2, each logic block is attached to a switch matrix to access the routing resources. A CLB element consists of 4 identical slices, with fast local feedback within the logic block [ref. 8]. The four slices are split into two columns of two slices, with two independent carry chains and one common shift chain.

Figure 2-2. Virtex-II Pro CLB Element [ref. 8]

Table 2-2 summarizes the available resources in one CLB (4 slices). All of the CLBs are identical.

Table 2-2. Resources in a CLB (4 slices) [ref. 8]

| Resource | Per CLB |
|---|---|
| Slices | 4 |
| LUTs | 8 |
| Flip-Flops | 8 |
| Logic multiplexers | 8 |
| MULT_ANDs | 8 |
| Arithmetic & Carry Chains | 2 |
| SOP Chains (1) | 2 |
| Distributed SelectRAM+ | 128 bits |
| Shift Register | 128 bits |
| TBUF | 2 |

Notes:
1- SOP: Sum of products.

2.3.2 Slice Description

Each slice includes two 4-input function generators, a fast carry look-ahead chain, arithmetic logic gates, wide function multiplexers and two storage elements. Figure 2-3 shows a general slice in Virtex-II Pro. The function generators F and G are each configurable as one of the following:

- 4-input look-up table (LUT)
- 16-bit shift register
- 16-bit distributed SelectRAM+ memory

When the Virtex-II Pro function generators (F and G in Figure 2-3) are implemented as 4-input LUTs, four input lines are provided to each of them and act as the address lines. Such a function generator can implement any arbitrary 4-input Boolean function [ref. 8].
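To illustrate why a 4-input LUT can realize any 4-input Boolean function, the sketch below models the LUT as a 16-bit truth table addressed by its four inputs. The function names `make_lut4` and `lut4_read` are hypothetical and only serve this illustration; they do not correspond to any Xilinx primitive or tool.

```python
def make_lut4(func):
    """Return the 16-bit truth table that configures a 4-input LUT for func."""
    table = 0
    for addr in range(16):
        a3, a2, a1, a0 = (addr >> 3) & 1, (addr >> 2) & 1, (addr >> 1) & 1, addr & 1
        if func(a3, a2, a1, a0):
            table |= 1 << addr
    return table


def lut4_read(table, a3, a2, a1, a0):
    """The four inputs act as address lines selecting one stored bit."""
    addr = (a3 << 3) | (a2 << 2) | (a1 << 1) | a0
    return (table >> addr) & 1


# Example: a 4-input XOR (parity) function stored in one LUT.
parity_table = make_lut4(lambda a3, a2, a1, a0: a3 ^ a2 ^ a1 ^ a0)
print(hex(parity_table))
```

Since the table has one bit for each of the 16 input combinations, each of the 2^16 possible tables corresponds to exactly one 4-input function.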

Figure 2-3. General Slice in Virtex-II Pro [ref. 9]

As shown in Figure 2-4, the output signal of a function generator has five options. It can:

- exit the slice (X or Y output),
- feed the dedicated XOR gate,
- feed the carry-logic multiplexer,
- feed the D input of the storage element,
- go to the MUXF5 (the multiplexers are not shown in Figure 2-4).

The Virtex-II Pro slice contains multiplexers (MUXF5 and MUXFX) that, when combined with function generators, can provide any function of five, six, seven or eight inputs. The MUXFX is either MUXF6, MUXF7 or MUXF8, depending on the slice considered in the logic block [ref. 8].

The storage element in a slice can be either an edge-triggered D flip-flop or a level-sensitive latch. The clock enable signal (CE) is active High by default.
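The way MUXF5 extends two 4-input function generators to a 5-input function can be sketched as follows. This is a behavioural illustration only: the helper names are hypothetical, and the choice of which LUT is selected when the fifth input is 0 is an assumption made for the example.

```python
def lut4(table, addr):
    """Read one bit of a 16-bit truth table using a 4-bit address."""
    return (table >> (addr & 0xF)) & 1


def split_function5(func):
    """Split a 5-input function into two LUT tables, one per value of the fifth input."""
    table_f = sum(func(0, addr) << addr for addr in range(16))
    table_g = sum(func(1, addr) << addr for addr in range(16))
    return table_f, table_g


def muxf5(table_f, table_g, sel, addr):
    """Model of MUXF5: the fifth input selects between the F and G outputs."""
    return lut4(table_g, addr) if sel else lut4(table_f, addr)


# Example: 5-input majority function (output is 1 when three or more inputs are 1).
def majority5(sel, addr):
    return int(bin((sel << 4) | addr).count("1") >= 3)


table_f, table_g = split_function5(majority5)
print(muxf5(table_f, table_g, 1, 0b0011))
```

The same idea applies one level up: MUXF6 selects between two MUXF5 outputs to give a 6-input function, and so on up to eight inputs.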

Figure 2-4. Half Slice in Virtex-II Pro [ref. 8] Note: Multiplexers MUXF5, MUXF6, MUXF7 and MUXF8 are not shown in this figure.

2.3.3 Memory Style

XST offers two choices of memory (RAM or ROM) style in its HDL options: distributed memory, which consists of LUTs, or on-chip block memory. The distributed style is used for the S-boxes in this project.

2.3.3.1 Distributed SelectRAM+

Each LUT which has four address lines can implement a 16 x 1-bit synchronous RAM called a distributed SelectRAM+ element. Distributed SelectRAM+ elements could be configured within a CLB as follows [ref. 8]:

- Single-Port 16x8-bit RAM
- Single-Port 32x4-bit RAM
- Single-Port 64x2-bit RAM
- Single-Port 128x1-bit RAM
- Dual-Port 16x4-bit RAM
- Dual-Port 32x2-bit RAM
- Dual-Port 64x1-bit RAM

Distributed SelectRAM+ memory modules are write-synchronous. The asynchronous read access time is short, while the synchronous write simplifies high-speed designs. The distributed SelectRAM+ memory and the register share the same clock signal. Table 2-3 shows the number of LUTs (2 per slice) used in each distributed SelectRAM+ configuration.

Table 2-3. Resources Used by Distributed Memory [ref. 8]

| RAM/ROM | # of LUTs |
|---|---|
| 16x1S | 1 |
| 16x1D | 2 |
| 32x1S | 2 |
| 32x1D | 4 |
| 64x1S | 4 |
| 64x1D | 8 |
| 128x1S | 8 |

Notes: S = single-port configuration, D = dual-port configuration.
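The LUT counts in Table 2-3 follow a simple rule: one LUT per 16 bits of depth, doubled for dual-port operation. The sketch below encodes that rule in a hypothetical helper, `distributed_ram_luts`. Extending it to wider or deeper memories (such as a 256 x 8 S-box ROM) is an assumption that counts storage LUTs only and ignores any multiplexing resources the synthesis tool may add.

```python
def distributed_ram_luts(depth, width=1, dual_port=False):
    """Estimate the number of 16 x 1 LUTs used to store a distributed memory."""
    if depth <= 0 or depth % 16 != 0:
        raise ValueError("depth must be a positive multiple of 16")
    luts_per_bit = depth // 16          # each LUT stores 16 bits
    if dual_port:
        luts_per_bit *= 2               # second LUT provides the extra read port
    return luts_per_bit * width


# Rows of Table 2-3.
for name, depth, dual in [("16x1S", 16, False), ("16x1D", 16, True),
                          ("32x1D", 32, True), ("128x1S", 128, False)]:
    print(name, distributed_ram_luts(depth, 1, dual))

# Rough storage estimate for one 256 x 8 S-box held in distributed ROM.
print("256x8", distributed_ram_luts(256, 8))
```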

In single-port configurations, synchronous writes and asynchronous reads use the same address lines, as shown in Figure 2-5. In dual-port mode, the address lines of one LUT are connected to shared read and write addresses. The second LUT uses the same address lines for the synchronous write and a separate set of address lines for the second asynchronous read, as shown in Figure 2-6 [ref. 8].

Figure 2-5. Single-port Distributed SelectRAM+ [ref. 8] Notes: A is asynchronous read address lines, WG is synchronous write address lines.


Figure 2-6. Dual-port Distributed SelectRAM+ [ref. 8] Notes: G is asynchronous read address lines, WG is synchronous write address lines.

2.3.3.2 Block SelectRAM+

In addition to distributed memory, Virtex-II Pro devices include a large amount of 18 Kb block SelectRAM+ resources. The 18 Kb SelectRAM+ blocks can be cascaded to implement deeper or wider single-port or dual-port memory. There are 444 blocks of 18 Kb, for a total of 7992 Kb of block SelectRAM+. Table 2-4 gives the supported memory configurations for single-port and dual-port modes [ref. 8].

Table 2-4. Supported Memory Configurations for Single-port and Dual-port Modes [ref. 8]

- 16K x 1 bit
- 8K x 2 bits
- 4K x 4 bits
- 2K x 9 bits
- 1K x 18 bits
- 512 x 36 bits
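As an illustration of cascading, the sketch below estimates how many 18 Kb blocks a memory of a given depth and width would need, by tiling it with each aspect ratio from Table 2-4 and keeping the smallest count. The helper `blocks_needed` is hypothetical and uses a deliberately simple tiling model; the actual mapping chosen by the synthesis and place-and-route tools may differ.

```python
# (depth, width) pairs from Table 2-4.
CONFIGS = [(16384, 1), (8192, 2), (4096, 4), (2048, 9), (1024, 18), (512, 36)]


def ceil_div(a, b):
    return -(-a // b)


def blocks_needed(depth, width):
    """Smallest block count over all aspect ratios, assuming plain tiling."""
    return min(ceil_div(depth, d) * ceil_div(width, w) for d, w in CONFIGS)


# Bits available in each configuration: 16 Kb for the 1, 2 and 4-bit widths,
# 18 Kb for the 9, 18 and 36-bit widths.
capacities = sorted({d * w for d, w in CONFIGS})
print(capacities)

# Example: a 4K x 32-bit memory.
print(blocks_needed(4096, 32))
```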

Since the block SelectRAM+ resources have a regular array structure, the place-and-route software takes advantage of this feature to deliver optimum system performance and fast compile times. The segmented routing resources are essential to guarantee IP core portability and to efficiently handle an incremental design flow that is based on modular implementations. Total design time is reduced due to fewer and shorter design iterations. Another feature of the block SelectRAM+ is that there is one optimized multiplier associated with each 18 Kb block SelectRAM+ resource. These 18-bit x 18-bit multipliers have the same organization as the block SelectRAM+; they are optimized for high-speed operation and have a lower power consumption than an 18-bit x 18-bit multiplier built in slices [ref. 8]. The switching characteristics of both the distributed and the block type are given in Table 2-5 and Table 2-6 for comparison.

Table 2-5. Distributed RAM Switching Characteristics [ref. 8]

| Description | -6 (Speed grade) | Units |
|---|---|---|
| Sequential Delays | | |
| Clock CLK to X/Y outputs (WE active) in 16 x 1 mode | 1.38 | ns, max |
| Clock CLK to X/Y outputs (WE active) in 32 x 1 mode | 1.75 | ns, max |
| Clock CLK to F5 output | 1.68 | ns, max |
| Setup and Hold Times Before/After Clock CLK | | |
| BX/BY data inputs (DIN) | 0.41 / -0.07 | ns, min |
| F/G address inputs | 0.47 / 0.00 | ns, min |
| SR input | 0.24 / 0.05 | ns, min |
| Clock CLK | | |
| Minimum Pulse Width, High | 0.72 | ns, min |
| Minimum Pulse Width, Low | 0.72 | ns, min |
| Minimum clock period to meet address write cycle time | 1.44 | ns, min |

Table 2-6. Block RAM Switching Characteristics [ref. 8]

| Description | -6 (Speed grade) | Units |
|---|---|---|
| Sequential Delays | | |
| Clock CLK to DOUT output | 1.50 | ns, max |
| Setup and Hold Times Before/After Clock CLK | | |
| Address inputs | 0.31 / 0.25 | ns, min |
| Data inputs (DIN) | 0.23 / 0.25 | ns, min |
| EN inputs | 0.32 / 0.00 | ns, min |
| RST input | 0.32 / 0.00 | ns, min |
| WEN input | 0.35 / 0.00 | ns, min |
| Clock CLK | | |
| Minimum Pulse Width, High | 1.30 | ns, min |
| Minimum Pulse Width, Low | 1.30 | ns, min |
| Minimum clock period to meet address write cycle time | 2.60 | ns, min |

Notes:
1- A zero “0” hold time listing indicates no hold time or a negative hold time. Negative values cannot be guaranteed “best-case”, but if a “0” is listed, there is no positive hold time.
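To put the minimum clock periods of Table 2-5 and Table 2-6 in perspective, the short sketch below converts them to upper bounds on the write clock frequency and compares them with the 80 MHz PLB clock used in this project. These bounds concern the memory primitives alone; the achievable system frequency also depends on logic and routing delays, which are not modelled here.

```python
def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in ns to a maximum frequency in MHz."""
    return 1000.0 / min_period_ns


DISTRIBUTED_MIN_PERIOD_NS = 1.44   # Table 2-5
BLOCK_MIN_PERIOD_NS = 2.60         # Table 2-6
PLB_CLOCK_MHZ = 80.0               # clock used by the CCM core

distributed_fmax = max_frequency_mhz(DISTRIBUTED_MIN_PERIOD_NS)
block_fmax = max_frequency_mhz(BLOCK_MIN_PERIOD_NS)

print(f"distributed RAM: {distributed_fmax:.1f} MHz")
print(f"block RAM:       {block_fmax:.1f} MHz")
print(f"PLB period:      {1000.0 / PLB_CLOCK_MHZ:.1f} ns")
```

Both memory styles leave a wide margin at the 12.5 ns PLB clock period.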

2.3.4 FPGA Clocking

All Virtex-II Pro devices from XC2VP2 to XC2VP100 have 16 global clock buffers and support 16 global clock domains. Up to eight of these clocks can be used in any quadrant of the device by the synchronous logic elements (that is, registers, 18 Kb block RAM, pipeline multipliers) and the IOBs. The software tools place and route these global clocks automatically. The digital clock manager (DCM) and global clock multiplexer buffers provide a complete solution for designing high-speed clock schemes. Up to twelve DCM blocks are available. Each DCM can be used to eliminate clock distribution delay in order to generate deskewed internal or external clocks. The DCM also provides 90-, 180- and 270-degree phase-shifted versions of its output clocks. Fine-grained phase shifting offers high-resolution phase adjustments in increments of 1/256 of the clock period. Very flexible frequency synthesis provides a clock output frequency equal to a fractional or integer multiple of the input clock frequency [ref. 8]. In this project, since CCM sits on the PLB bus, the same PLB clock (80 MHz) is used, but a higher frequency could be obtained by using a DCM as a frequency multiplier. The following clock signals (extracted from the user constraint file, “system.ucf”) feed the FPGA clock pins:

- fpga_opb_clk: 40 MHz clock
- fpga_plb_clk: 80 MHz clock
- fpga_125diff_n: 125 MHz differential
- fpga_125diff_p: 125 MHz differential
- v2p_20mhz_clk: 20 MHz clock
- v2p_25mhz_clk: 25 MHz clock
- lpci_v2p_clk: 33/66 MHz clock
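The DCM figures above can be made concrete with a little arithmetic. The sketch below computes the fine phase-shift step for the 80 MHz PLB clock and shows how a multiply/divide ratio yields a fractional multiple of the input frequency. The function names and the example ratio of 3/2 are hypothetical choices for illustration; the allowed multiplier and divider ranges of the device are not modelled.

```python
from fractions import Fraction


def fine_phase_step_ps(freq_mhz):
    """Size of one fine phase-shift step (1/256 of the clock period) in ps."""
    period_ps = 1_000_000 / freq_mhz
    return period_ps / 256


def synthesized_mhz(input_mhz, multiply, divide):
    """Output frequency as a fractional multiple of the input frequency."""
    return Fraction(input_mhz) * multiply / divide


plb_step = fine_phase_step_ps(80)
faster_clock = synthesized_mhz(80, 3, 2)   # hypothetical ratio for the example

print(f"phase step at 80 MHz: {plb_step:.2f} ps")
print(f"80 MHz x 3/2 = {faster_clock} MHz")
```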

3 Security Standards

This chapter briefly describes a mode of operation, called CCM, for a symmetric key block cipher algorithm, which is the focus of the thesis. Before delving into details in section 3.1, the following is a high-level introduction to CCM and the security terminology. “In essence, a mode of operation is a technique for enhancing the effect of a cryptographic algorithm or adapting the algorithm for an application, such as applying a block cipher to a sequence of data blocks or a data stream.” [ref. 2] The National Institute of Standards and Technology (NIST) introduced five modes of operation in Special Publication 800-38A to satisfy the requirements of the applications that use the AES algorithm. CCM, which is implemented in this thesis, is an algorithm that combines two modes of operation: the Cipher Block Chaining (CBC) mode and the Counter (CTR) mode. NIST published the CCM mode for authentication and confidentiality in Special Publication 800-38C. Authentication assures the recipient that the sender of the message is the source that it claims to be. Confidentiality protects data from eavesdropping or monitoring.

The CBC and CTR modes both use the Advanced Encryption Standard (AES) block cipher, a symmetric block cipher used in a wide range of applications. AES is an encryption/decryption scheme in which a block of ciphertext is produced for a block of plaintext; both have the same length. AES is a symmetric algorithm, also referred to as a single-key, secret-key or conventional algorithm, meaning that both sender and receiver use the same key. The secret key is a value independent of the plaintext (the input of the encryption) and of the algorithm. The AES algorithm produces different ciphertexts (scrambled messages) depending on the secret key [ref. 2].

There are three inputs to the encryption process of CCM, namely the nonce, the associated data and the payload. Depending on the application, the nonce may be a timestamp, a counter or a random number. The nonce is required to be non-repeating across any two distinct data pairs during the lifetime of the key. The associated data is a header that needs to be protected from modification; it is authenticated but not encrypted, and remains readable. For instance, in the IPSec protocol the associated data is used for data integrity in situations where data is not secret but must be authenticated, for example where access is enforced by IPSec to trusted computers only, or where network intrusion detection, QoS or firewall filtering requires traffic inspection. The payload is the actual data [ref. 1].

Typically in the CTR mode the first input block is set to a distinct value in each data transmission; CCM uses the nonce to maintain distinctness, and the subsequent input blocks are built by incrementing the previous input block by 1. The CBC mode has a chained structure and applies a formatting function to the inputs (the nonce, the payload and the associated data) to produce the input blocks.
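The construction of CTR input blocks from the nonce can be sketched as below. The layout (a flags octet, the nonce, then a big-endian counter filling the rest of the 16-octet block) follows the counter formatting of NIST SP 800-38C as understood here; the function names are hypothetical. Because the blocks depend only on the nonce and the index, they can be generated before any payload arrives.

```python
BLOCK_SIZE = 16  # octets in one 128-bit cipher block


def counter_block(nonce: bytes, index: int) -> bytes:
    """Build counter block number `index` for the given nonce."""
    q = BLOCK_SIZE - 1 - len(nonce)          # octets left for the counter field
    if not 2 <= q <= 8:
        raise ValueError("nonce must be 7 to 13 octets long")
    if not 0 <= index < 1 << (8 * q):
        raise ValueError("counter does not fit in the counter field")
    flags = q - 1                             # low three bits encode q - 1
    return bytes([flags]) + nonce + index.to_bytes(q, "big")


def counter_blocks(nonce: bytes, count: int):
    """Precompute `count` consecutive counter blocks, each one more than the last."""
    return [counter_block(nonce, i) for i in range(count)]


nonce = bytes.fromhex("10111213141516")
for block in counter_blocks(nonce, 3):
    print(block.hex())
```

Consecutive blocks differ only in the counter field, which is equivalent to incrementing the previous block by 1.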

The decryption process of CCM (not discussed in this project) has three inputs, namely the nonce, the associated data and the ciphertext from the CCM encryption. Like the CCM encryption, it uses both the CTR and CBC modes, both of which use the AES forward cipher; AES decryption is not used in the CCM mode of operation.

In general it is assumed that the attacker knows the encryption algorithm; consequently the typical goal of attacking an encryption system is to recover the key. There are generally two categories of attacks for deducing the key [ref. 2]:

- Cryptanalysis: Cryptanalysis attacks try to exploit the nature of the algorithm. In addition, the attacker might have some information on the plaintext or ciphertext.
- Brute-force attack: Brute-force attacks try every possible key on a ciphertext to derive the plaintext.

Once the key is deduced by the attacker the effect is catastrophic: all ciphertexts produced with this key can be decrypted. The CCM mode provides confidentiality through the AES block cipher, meaning that it is computationally infeasible to obtain the plaintext without knowing the secret key. Authentication is provided by the scarcity of valid ciphertexts: an attacker without access to the secret key can produce a valid ciphertext only with a small probability. Consequently, any ciphertext that passes the CCM decryption was probably generated legitimately [ref. 1].

3.1 CCM [ref. 1]

Counter with Cipher Block Chaining-Message Authentication Code (abbreviated CCM) is a mode of operation of a block cipher algorithm that can provide assurance of the confidentiality and authenticity of data. In this project CCM is based on the Advanced Encryption Standard (AES), a symmetric cipher algorithm with a block size of 128 bits. The AES key expansion must be carried out beforehand to produce the expanded key. The total number of invocations of the cipher during the lifetime of the key must be limited to $2^{61}$ [ref. 1]. There are three inputs in CCM [ref. 1]:

- Payload: data that will be both authenticated and encrypted; Plen is the bit length of the payload.
- Associated data: data that will be authenticated but not encrypted; Alen is the bit length of the associated data.
- Nonce: a value that is assigned to the payload and the associated data; Nlen is the bit length of the nonce.

CCM consists of two related functions: generation-encryption and decryption-verification. It combines two cryptographic techniques: counter mode encryption (CTR) and Cipher Block Chaining-Message Authentication Code (CBC-MAC). Only the forward cipher function of the AES algorithm is used within these techniques. The CTR output can be precomputed before the input data is received. In generation-encryption, CBC-MAC is applied to the payload, the associated data, and the nonce to generate a message authentication code (MAC); the CTR output is then applied to the MAC (the cryptographic checksum) and the payload to transform them into unreadable data, called the ciphertext. In decryption-verification, counter mode decryption is used to recover the MAC and the corresponding payload; cipher block chaining is then applied to the recovered payload, the received associated data, and the received nonce to verify the MAC. Successful verification indicates that the payload and the associated data originated from a source with access to the key. A MAC is a cryptographic checksum designed to detect intentional, unauthorized modifications of the data as well as accidental ones; it therefore provides stronger assurance of authenticity than a non-cryptographic checksum or an error-detecting code, which is designed to detect only accidental modifications of the data [ref. 1].

3.1.1 CCM Cryptographic Techniques

CCM combines two cryptographic modes that are based on forward cipher (AES cipher) [ref. 1]:

- Counter mode encryption (CTR)
- Cipher Block Chaining-Message Authentication Code (CBC-MAC)

Each mechanism applies a specific formatting to the input data (payload or associated data or nonce) to produce the input sequence of blocks (the block size is 16 bytes).

3.1.1.1 Counter mode encryption (CTR) [ref. 1]

One mechanism in CCM is CTR, which is a confidentiality mode. The formatting function generates the input sequence of blocks to the CTR unit, called the counter blocks (the 16-byte blocks CTRi). The counter blocks must be distinct within a single invocation and across all other invocations of CTR under a given key. Distinctness within a single invocation is guaranteed by the counter bits, which count from 0 to m = ceil(Plen/128) for each 16-byte CTRi (0 <= i <= m). Distinctness across all other invocations of CTR under a key is assured by the nonce: the nonce must be non-repeating, meaning that any two distinct data pairs must be assigned distinct nonces, but nonces do not need to be random. The formatting of CTRi is shown in table 3-1; all the blocks in CTR have the same format.

Table 3-1. CTRi Formatting [ref. 1]

Byte number:   0       1 … n    n+1 … 15
Contents:      Flags   Nonce    [i]_8q

Notes:
- n = (the octet length of the nonce) = Nlen/8
- i is the counter value, which is unique in each block
- In [i]_8q, 8q is the number of bits used to represent i in binary

The flags byte (octet 0) is the same for all CTRis and has the following formatting shown in table 3-2:

Table 3-2. Flags Byte [ref. 1]

Bit number:   7   6   5   4   3   2 1 0
Contents:     0   0   0   0   0   [q-1]_3

Notes:
- q = 15 - (the octet length of the nonce) = 15 - Nlen/8
- In [q-1]_3, 3 is the number of bits used to represent q-1 in binary

CTR block diagram is shown in figure 3-1:

Figure 3-1. CTR Block Diagram Notes: m is equal to ceil(Plen/128).

Since the counter blocks CTRi are generated from the nonce alone, and the nonce only needs to be distinct in each invocation, the CTR output blocks can be precomputed before the input data is available.
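The counter block layout of tables 3-1 and 3-2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `format_counter_block` and the example nonce are hypothetical, and the nonce-length check (7 to 13 octets) is the range permitted by NIST SP 800-38C rather than something stated in the text above.

```python
def format_counter_block(nonce: bytes, i: int) -> bytes:
    """Build the 16-byte counter block CTRi = Flags || Nonce || [i]_8q."""
    n = len(nonce)                     # octet length of the nonce (Nlen/8)
    if not 7 <= n <= 13:               # assumption: range from NIST SP 800-38C
        raise ValueError("nonce must be 7 to 13 octets long")
    q = 15 - n                         # octets available for the counter i
    flags = q - 1                      # bits 7..3 are zero, bits 2..0 hold [q-1]_3
    return bytes([flags]) + nonce + i.to_bytes(q, "big")


# Hypothetical 7-octet nonce (Nlen = 56), which gives q = 8.
nonce = bytes(range(0x10, 0x17))
print(format_counter_block(nonce, 0).hex())  # 07101112131415160000000000000000
print(format_counter_block(nonce, 1).hex())  # 07101112131415160000000000000001
```

Because only the nonce and the counter value enter the block, all counter blocks for a message can be produced as soon as the nonce and the payload length are known.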

3.1.1.2 CBC-MAC [ref. 1]

The other cryptographic mechanism in CCM is CBC-MAC, an authentication technique built on the CBC confidentiality mode, whose encryption process features the combining (“chaining”) of the plaintext blocks with the previous ciphertext blocks. It provides authenticity by applying CBC with an initialization vector of zero to the data to be authenticated. The cryptographic checksum (MAC) is taken from the final block of the CBC-MAC output, possibly truncated, and serves as the message authentication code. The formatting function generates the input sequence of blocks (the 16-byte blocks Bi) to CBC-MAC. The formatting is done in three sections: it is applied to the nonce, the associated data and the payload (three examples are given in appendix E). In the first section the formatting is done on the nonce and produces the first block (B0) for CBC-MAC, as shown in table 3-3.

Table 3-3. Block Zero (B0) for CBC-MAC [ref. 1]

Byte number:   0       1 … n    n+1 … 15
Contents:      Flags   Nonce    [P_oct]_8q

Notes:
- n = (the octet length of the nonce) = Nlen/8
- q = 15 - (the octet length of the nonce) = 15 - Nlen/8
- P_oct is the octet length of the payload, that is Plen/8
- In [P_oct]_8q, 8q is the number of bits used to represent P_oct in binary

The flags byte (octet 0) has the formatting given in table 3-4:

Table 3-4. Flags Byte in B0 [ref. 1]

Bit number:   7   6       5 4 3          2 1 0
Contents:     0   Adata   [(t-2)/2]_3    [q-1]_3

Notes:
- t is the octet length of the MAC
- q = 15 - (the octet length of the nonce) = 15 - Nlen/8
- In [q-1]_3, 3 is the number of bits used to represent q-1 in binary
- If there is no associated data, Adata is ‘0’; otherwise it is ‘1’

In the second section formatting is done on the associated data that produces B1, B2, … Bu. If there is no associated data then no block will be produced otherwise the following rules will be applied (A_oct is the octet length of the associated data) [ref. 1]:

- If $0 < A_{oct} < 2^{16} - 2^{8}$, then $[A_{oct}]_{16}$ forms the first 16 bits of the first block (B1).
- If $2^{16} - 2^{8} \le A_{oct} < 2^{32}$, then the first 48 bits of the first block (B1) are 0xff || 0xfe || $[A_{oct}]_{32}$.
- If $2^{32} \le A_{oct} < 2^{64}$, then the first 80 bits of the first block (B1) are 0xff || 0xff || $[A_{oct}]_{64}$.

The above formatting rules ensure that the three cases do not overlap. The associated data follows this length encoding. The resulting bit string is then followed by the minimum number of zeros such that it can be partitioned into 16-byte blocks. In the third section the formatting is done on the payload, which produces Bu+1, Bu+2, …, Br, where r = u + ceil(Plen/128) and Plen is the bit length of the payload. To form the blocks, the payload is followed by the minimum number of zeros such that the resulting string can be partitioned into 16-byte blocks. After the blocks have been formatted, the CBC-MAC mechanism operates on them as shown in figure 3-2. The MAC (a cryptographic checksum on data that is designed to reveal both accidental errors and intentional modifications of the data) is the Tlen most significant bits of the last result of CBC-MAC.
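The three formatting sections described above can be sketched as follows. This is a minimal illustration, not the formatting procedure of the VHDL package; the function names are hypothetical, and the example values are chosen only to match the generics Tlen=32, Nlen=56, Alen=64, Plen=32.

```python
BLOCK = 16  # block size in octets


def zero_pad(data: bytes) -> bytes:
    """Append the minimum number of zero octets to reach a multiple of 16."""
    return data + bytes(-len(data) % BLOCK)


def encode_adata_length(a_oct: int) -> bytes:
    """Length prefix placed in front of the associated data in B1."""
    if a_oct < 2**16 - 2**8:
        return a_oct.to_bytes(2, "big")
    if a_oct < 2**32:
        return b"\xff\xfe" + a_oct.to_bytes(4, "big")
    return b"\xff\xff" + a_oct.to_bytes(8, "big")


def format_cbc_mac_input(nonce: bytes, adata: bytes, payload: bytes, t: int):
    """Return the list of 16-byte blocks B0..Br; t is the MAC length in octets."""
    q = 15 - len(nonce)
    flags = (0x40 if adata else 0x00) | (((t - 2) // 2) << 3) | (q - 1)
    b0 = bytes([flags]) + nonce + len(payload).to_bytes(q, "big")
    a_part = zero_pad(encode_adata_length(len(adata)) + adata) if adata else b""
    data = b0 + a_part + zero_pad(payload)
    return [data[k:k + BLOCK] for k in range(0, len(data), BLOCK)]


# Hypothetical inputs: 7-octet nonce, 8-octet associated data, 4-octet payload.
blocks = format_cbc_mac_input(bytes(range(0x10, 0x17)), bytes(range(8)),
                              bytes(range(0x20, 0x24)), t=4)
for index, block in enumerate(blocks):
    print(f"B{index} = {block.hex()}")   # B0 starts with the flags byte 0x4f
```

For these inputs u = 1 and r = 2, so CBC-MAC processes three blocks: B0 from the nonce, B1 from the associated data and B2 from the payload.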

$$\mathrm{MAC} = \mathrm{MSB}_{Tlen}(Y_r)$$

where $\mathrm{MSB}_{Tlen}(X)$ denotes the Tlen most significant bits of X and Tlen is the bit length of the MAC.

Let the blocks $S_i$ be the CTR outputs shown in figure 3-1, and define S as the result of the following concatenation:

$$S = S_1 \,\|\, S_2 \,\|\, \dots \,\|\, S_m, \qquad m = \lceil Plen/128 \rceil$$

The final result of CCM is defined as follows:

$$\text{CCM final result} = \bigl(\text{Payload} \oplus \mathrm{MSB}_{Plen}(S)\bigr) \,\|\, \bigl(\mathrm{MAC} \oplus \mathrm{MSB}_{Tlen}(S_0)\bigr)$$
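The way the CBC-MAC chain and the CTR outputs are combined can be traced with a short Python sketch. It assumes the blocks Bi and the counter blocks CTRi have already been formatted, and it takes the forward cipher as a parameter `ciph`. The identity function used in the example is NOT a cipher and provides no security; it is a stand-in chosen only so that the data flow can be followed by hand. All names are hypothetical.

```python
from typing import Callable, Sequence

BLOCK = 16  # block size in octets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings; the result has the length of the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def ccm_generate_encrypt(ciph: Callable[[bytes], bytes],
                         b_blocks: Sequence[bytes],
                         ctr_blocks: Sequence[bytes],
                         payload: bytes,
                         tlen_bits: int) -> bytes:
    # CBC-MAC: chain the formatted blocks B0..Br, starting from a zero block.
    y = bytes(BLOCK)
    for block in b_blocks:
        y = ciph(xor_bytes(y, block))
    mac = y[:tlen_bits // 8]                       # MSB_Tlen(Y_r)

    # CTR: S0 masks the MAC, S1..Sm mask the payload.
    s0 = ciph(ctr_blocks[0])
    s = b"".join(ciph(c) for c in ctr_blocks[1:])

    # The zip-based XOR truncates S to Plen bits and S0 to Tlen bits.
    return xor_bytes(payload, s) + xor_bytes(mac, s0)


def identity(block: bytes) -> bytes:
    """Stand-in for the forward cipher, for tracing only (no security)."""
    return block


nonce = bytes(range(0x10, 0x17))
payload = bytes(range(0x20, 0x24))
b_blocks = [b"\x4f" + nonce + bytes(7) + b"\x04",      # B0
            b"\x00\x08" + bytes(range(8)) + bytes(6),  # B1 (associated data)
            payload + bytes(12)]                       # B2 (payload)
ctr_blocks = [b"\x07" + nonce + i.to_bytes(8, "big") for i in range(2)]

result = ccm_generate_encrypt(identity, b_blocks, ctr_blocks, payload, 32)
print(result.hex())  # 2731333168292222 with the identity stand-in
```

Replacing `identity` with the AES forward cipher under the expanded key turns the same data flow into the actual generation-encryption function; note that no inverse cipher is needed anywhere.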

Figure 3-2. CBC-MAC Block Diagram

3.1.2 CCM Security Assurance

CCM provides authentication assurance through the scarcity of valid ciphertexts, meaning that an attacker without access to the key cannot easily generate a valid ciphertext. Consequently the result of the decryption-verification process is either an error message indicating an invalid input or the valid plaintext. However, this assurance is not absolute: an attacker can produce a valid ciphertext with a certain probability. The length of the MAC (the Tlen parameter) can be set to control the probability of accepting inauthentic data as authentic. Larger values of Tlen provide more security at the price of more bandwidth [ref. 1].

3.2 Advanced Encryption Standard (AES)

The Advanced Encryption Standard (AES) was published by NIST (National Institute of Standards and Technology) in 2001. AES is a symmetric block cipher that is intended to replace DES as the approved standard for a wide range of applications. AES takes a 128-bit block as the input data (plaintext), has a key size of 128, 192, or 256 bits, and produces a 128-bit block as the output data (ciphertext). No attacks on it are known, but it has been criticized on the grounds that its mathematical structure may lead to attacks [ref. 2]. One of the main features of AES is its simplicity, which is achieved by symmetry at different levels and by the choice of basic operations. The symmetry comes from the fact that AES encrypts the 128-bit plaintext by repeatedly applying the same round transformation; the number of rounds depends on the key size, as shown in table 3-5.

Table 3-5. Parameters Dependent on Key Size [ref. 8]

Key size (bits)                       128      192      256
Number of rounds                      10       12       14
Key expansion result (words/bytes)    44/176   52/208   60/240

This project uses only the AES forward cipher operation, with a 128-bit key.

3.2.1 AES Cipher

The number of rounds executed in the AES cipher depends on the key size. The 128-bit input block (plaintext) is represented as a 4x4 matrix of bytes (this matrix is called the state), which is modified in each round. For a 128-bit key there are 10 rounds, as shown in figure 3-3. The input key is expanded into an array of forty-four 32-bit words. The key expansion must be done before the cipher operation, and each round uses 4 words of the expanded key [ref. 2]. Each round consists of the following four stages, except the final round, which omits MixColumns [ref. 8]:

- SubBytes substitutes each byte in the state.
- ShiftRows shifts each row by an offset.
- MixColumns is a column-wise operation over GF($2^8$).
- AddRoundKey is a bitwise XOR of the current state with a portion of the expanded key (4 words of the expanded key).

Figure 3-3. Forward Cipher Operation [ref. 2]

SubBytes

SubBytes is the only non-linear function in AES. All four stages together provide confusion, diffusion and non-linearity. The SubBytes operation uses a 16x16 matrix of bytes called the S-box (given in appendix D) that contains a permutation of all possible 8-bit values. The contents of the table can be computed by a finite-field inversion in GF($2^8$) followed by an affine transformation. Each byte of the state is mapped to a byte from the S-box: the 4 leftmost bits are used as the row index while the 4 rightmost bits are used as the column index. The S-box is designed to be resistant to known cryptanalytic attacks. Specifically, it has a low correlation between input bits and output bits, and the output cannot be described as a simple mathematical function of the input [ref. 2].
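The construction of the S-box by inversion followed by an affine transformation can be sketched as below. This is a slow, readable illustration with hypothetical function names; the affine step uses the standard AES constant 0x63, and a real implementation would simply store the 256-entry table.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Multiplicative inverse in GF(2^8), computed as a^254; 0 maps to 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def affine(x: int) -> int:
    """AES affine transformation: XOR of four left rotations and 0x63."""
    rotl = lambda v, k: ((v << k) | (v >> (8 - k))) & 0xFF
    return x ^ rotl(x, 1) ^ rotl(x, 2) ^ rotl(x, 3) ^ rotl(x, 4) ^ 0x63


SBOX = [affine(gf_inverse(v)) for v in range(256)]


def sub_byte(b: int) -> int:
    row, column = b >> 4, b & 0x0F     # 4 leftmost bits / 4 rightmost bits
    return SBOX[16 * row + column]


print(hex(sub_byte(0x53)))  # 0xed
```

Indexing with `16 * row + column` is the same as indexing with the byte itself; it is written out here only to mirror the row and column selection described above.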

ShiftRows

ShiftRows is a byte-wise circular left shift by an offset that equals the row index. The first row (row number 0) is not changed. The second row (row number 1) is circularly left-shifted by one byte. For the third row (row number 2) a 2-byte circular left shift is performed. For the fourth row (row number 3) a 3-byte circular left shift is performed. Since the MixColumns and AddRoundKey operations are done column by column, ShiftRows ensures that the 4 bytes of one column are spread out to four different columns [ref. 2].
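A small sketch makes the spreading effect of ShiftRows visible. The state is represented here as a list of four rows, and each byte is labelled with its column-major position; both choices are assumptions made for the illustration.

```python
def shift_rows(state):
    """Rotate row r of the 4x4 state left by r bytes (state is a list of rows)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


# Label each byte by its column-major position: column 0 holds bytes 0..3.
state = [[r + 4 * c for c in range(4)] for r in range(4)]
shifted = shift_rows(state)
for row in shifted:
    print(row)
# Bytes 0, 1, 2, 3 (the original column 0) now sit in columns 0, 3, 2 and 1.
```

In a 128-bit hardware data path the same permutation needs no logic at all, since it only changes which wire feeds which byte position.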

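MixColumns

The MixColumns function operates on the state column by column; each byte of a column is mapped to a new value that is a function of all four bytes in that column, as follows:

$$
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} s_{0,0} & s_{0,1} & s_{0,2} & s_{0,3} \\ s_{1,0} & s_{1,1} & s_{1,2} & s_{1,3} \\ s_{2,0} & s_{2,1} & s_{2,2} & s_{2,3} \\ s_{3,0} & s_{3,1} & s_{3,2} & s_{3,3} \end{bmatrix}
=
\begin{bmatrix} s'_{0,0} & s'_{0,1} & s'_{0,2} & s'_{0,3} \\ s'_{1,0} & s'_{1,1} & s'_{1,2} & s'_{1,3} \\ s'_{2,0} & s'_{2,1} & s'_{2,2} & s'_{2,3} \\ s'_{3,0} & s'_{3,1} & s'_{3,2} & s'_{3,3} \end{bmatrix}
$$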

The matrix multiplication is done in GF($2^8$), meaning that the addition is the bitwise XOR and the multiplication is the polynomial multiplication modulo $x^8 + x^4 + x^3 + x + 1$. MixColumns can be implemented in different ways, depending on the platform being used, to obtain the maximum efficiency. The MixColumns operation ensures a good mixing among the bytes of each column. ShiftRows and MixColumns together ensure that after executing the rounds all output bits depend on all input bits [ref. 2].
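As an illustration of the arithmetic, the following sketch applies the MixColumns matrix to a single column using only shifts and XORs. The helper names are hypothetical, and this is a software sketch rather than the ROM-based approach used in the hardware design.

```python
def xtime(b: int) -> int:
    """Multiply by 02 in GF(2^8): shift left, then reduce with 0x11B on overflow."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


def mul3(b: int) -> int:
    """Multiply by 03, that is 02*b XOR b."""
    return xtime(b) ^ b


def mix_column(column):
    """Apply the MixColumns matrix to one column [s0, s1, s2, s3]."""
    s0, s1, s2, s3 = column
    return [
        xtime(s0) ^ mul3(s1) ^ s2 ^ s3,
        s0 ^ xtime(s1) ^ mul3(s2) ^ s3,
        s0 ^ s1 ^ xtime(s2) ^ mul3(s3),
        mul3(s0) ^ s1 ^ s2 ^ xtime(s3),
    ]


print(bytes(mix_column([0xDB, 0x13, 0x53, 0x45])).hex())  # 8e4da1bc
```

Each output byte depends on all four input bytes of the column, which is the mixing property referred to above.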

AddRoundKey

The AddRoundKey operation is designed to be as simple as possible: all 128 bits of the state are XORed with 4 words (128 bits) of the expanded key resulting from the key expansion. AddRoundKey is the only operation that uses the key to ensure security. The key expansion is based on the SubBytes operation in addition to some simple byte-level operations [ref. 2].

3.2.2 Key expansion

The key expansion operation takes the 16-byte input key and outputs a 44-word expanded key array; each round of the AES cipher uses 4 words of that array. The AES developers designed the key expansion to be resistant to known cryptanalytic attacks. It is designed in such a way that each key bit affects many round key bits. The first 4 words of the output array are the 16-byte input key. Every word whose index is not a multiple of 4 is made by XORing the preceding word with the word four positions back, as shown in figure 3-4. For a word whose index is a multiple of 4, the preceding word first goes through a more complex function (called function g) before it is XORed with the word four positions back.

Figure 3-4. Key Expansion [ref. 2] Notes: Kis are bytes and Wis are words

The function g takes the preceding word and performs a one-byte circular left shift on it; it then performs SubBytes on each byte of the shifted result. In the last step it XORs the substituted word with the round constant word “RC(i), 0, 0, 0”; RC(i) is given in hexadecimal for each round in table 3-6.

Table 3-6. Round Constant Bytes, RC in Hexadecimal [ref. 2]

i (round number)   1    2    3    4    5    6    7    8    9    10
RC(i)              01   02   04   08   10   20   40   80   1B   36

The purpose of using round constants is to eliminate symmetries and similarities in making the 4-word expanded key for each round.
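The key expansion for a 128-bit key can be sketched in software as follows. This is an illustrative model, not the RTL of this project: the function names are hypothetical, the S-box is rebuilt on the fly so that the block is self-contained, and the key in the example is an arbitrary test value.

```python
from typing import List


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = (a << 1) ^ (0x11B if a & 0x80 else 0)
        b >>= 1
    return result


def build_sbox() -> List[int]:
    """S-box = affine transformation of the GF(2^8) inverse (0 maps to 0)."""
    sbox = []
    for v in range(256):
        inv = next((c for c in range(1, 256) if gf_mul(v, c) == 1), 0)
        x = inv
        for k in range(1, 5):
            x ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(x ^ 0x63)
    return sbox


SBOX = build_sbox()
RC = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def g(word: bytes, round_number: int) -> bytes:
    """Rotate left by one byte, substitute each byte, XOR RC into the first byte."""
    rotated = word[1:] + word[:1]
    substituted = bytes(SBOX[b] for b in rotated)
    return bytes([substituted[0] ^ RC[round_number - 1]]) + substituted[1:]


def expand_key(key: bytes) -> List[bytes]:
    """Expand a 16-byte key into 44 four-byte words w[0..43]."""
    if len(key) != 16:
        raise ValueError("this sketch handles 128-bit keys only")
    w = [key[4 * k:4 * k + 4] for k in range(4)]
    for i in range(4, 44):
        temp = g(w[i - 1], i // 4) if i % 4 == 0 else w[i - 1]
        w.append(bytes(a ^ b for a, b in zip(w[i - 4], temp)))
    return w


words = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print(len(words), words[4].hex())  # 44 a0fafe17
```

Round r of the cipher then uses the four words w[4r] to w[4r+3], with w[0] to w[3] applied before the first round.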

4 Design and Analysis of CCM in SoC

This chapter describes the CCM mode implementation on an IP-based platform. It begins with a top-level description of the design and its communication with the other main SoC modules, and then goes through the details of each module that makes up CCM. It then compares the results with previous research on software and FPGA implementations of the AES cipher. The remainder of the chapter describes testing and debugging, and provides some useful hints and solutions to practical problems with the software tools that were encountered in this project.

4.1 Security design Objective

Security algorithms can be implemented in software, hardware or a combination of both, trading off speed, area and power consumption. In general, software implementations tend to have lower throughput than hardware implementations, since they may not have an efficient instruction set or operand size for a particular algorithm. On the other hand they are more cost-efficient and flexible than hardware solutions. Since throughput is one of the objectives of this project, the security algorithm is implemented on the FPGA. Another goal is to make the SoC design very flexible for future modifications. It provides the capability to split the design such that the key expansion runs on the microprocessor and the cipher runs on the FPGA. Since the key expansion needs to be executed only once in the lifetime of a key, it has little impact on overall performance. At a higher level the design offers options to communicate with other devices that might be added in the future.

4.2 High Level Design Architecture

The underlying foundation used in this project is the baseline provided by CMC as the system-on-chip platform (CMC is a federally incorporated non-profit corporation that provides microsystems researchers with industry-calibre design resources, access to state-of-the-art manufacturing technologies, and support services). The CCM core is connected to the PLB bus as a slave module (shown in figure 4-1) through an IP interface (abbreviated as IPIF) and uses the address space 0x40000000-0x400001FF. It uses the 80 MHz PLB clock. Besides standard functions like address decoding, the IPIF module offers other commonly used services, namely [ref. 10]:

- S/W reset and Module information register (RST/MIR)
- Burst and cacheline transaction support
- DMA
- FIFO
- User logic interrupt support
- User logic S/W register support
- User logic master support
- User logic address range support

Two of the above services have been used in this project:

- S/W reset and Module information register (RST/MIR)
- User logic S/W register support (explained in section 4.2.1)

Figure 4-1. Baseline Block Diagram with CCM Added as Part of the System [ref. 4]
Notes: The PLB2OPB bridge shown in the figure functions as a slave on the PLB side and a master on the OPB side.

The software reset allows individual peripherals to be reset from the software application. The peripheral has a special write-only address. When a specific word is written to this address, the IPIF generates a reset signal for the peripheral (table 4-1). The peripheral resets itself using this signal.

Table 4-1. IPIF Software Reset Register Description [ref. 10]

Bits    Access   Core register’s address        Description
0-31    Write    C_BASEADDR(1) + 0x00000100     “0x0000000A” generates a reset

Notes: (1) C_BASEADDR is the address of the IPIF sitting on the PLB bus.

The inputs come through the slave registers to the main core named ccm_core.vhd. Since the design is based on active high reset, generating a software reset causes all the slave registers to be set to ‘1’ accordingly.

4.2.1 User Logic S/W Register Support

The user logic S/W register service provides a maximum of 32 registers of 64 bits each (the same width as the PLB bus). Selecting the same data width as the PLB bus is more efficient, since a data width smaller than that of the target bus results in more resource usage due to byte steering logic [ref. 10]. The inputs are read asynchronously and stored in a write-only address space, whereas the outputs are written synchronously and stored in a read-only address space. The input and the output data share the same addresses: data written to that address space is stored in the write-only input registers, while data read from that address space comes from the output registers. This saves address space, giving 256 bytes for the inputs and 256 bytes for the outputs at the same addresses. The inputs, which are K (the cipher key), N (the nonce), A (the associated data string) and P (the payload), should be written sequentially starting at C_BASEADDR + 0x00000000. The maximum overall length allowed is 255 bytes, since the last byte is used as the control byte for triggering the circuit. There are two reset signals for triggering the module: one goes to the key expansion module and the other goes to CCM. CCM must not be triggered before the key expansion output is valid; the two can, however, be triggered at the same time. The resets are active-high and are located at the following addresses:

Key expansion reset: C_BASEADDR + 0x00000002 (second bit position from the right)

CCM reset: C_BASEADDR + 0x00000001 (first bit from the right)

The following example shows how to trigger the circuit through the XMD console; the generics are Klen=128, Tlen=32, Nlen=56, Alen=64, Plen=32.

Figure 4-2. XMD Window Showing How to Trigger Key Expansion and CCM

mwr 0x400000ff 0x00000003 -- activates the reset for both key expansion and CCM.
mrd 0x40000000 4 -- reads four output words.
mwr 0x40000000 {0x40 0x41…0x23} 35 b -- writes the inputs “K” (16 bytes), “N” (7 bytes), “A” (8 bytes) and “P” (4 bytes); the letter b indicates bytes.
mwr 0x400000ff 0x00000000 -- releases the reset of both key expansion and CCM.
mrd 0x40000000 2 -- displays the output.
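The shared-address behaviour of the slave registers can be modelled with a small Python class. This is a purely behavioural toy model written for illustration; the class and method names are hypothetical and it does not represent the IPIF or the VHDL code. It assumes a 256-byte window whose last byte (offset 0xFF) is the control byte, as described above.

```python
class SharedRegisterWindow:
    """Toy model: one address window, a write-only bank and a read-only bank."""

    SIZE = 256             # bytes per bank (32 registers of 64 bits)
    CONTROL_OFFSET = 0xFF  # last input byte, used as the control byte

    def __init__(self):
        self._inputs = bytearray(self.SIZE)    # written by the bus master
        self._outputs = bytearray(self.SIZE)   # written by the core

    def _check(self, offset: int, length: int) -> None:
        if offset < 0 or offset + length > self.SIZE:
            raise ValueError("access outside the register window")

    def bus_write(self, offset: int, data: bytes) -> None:
        self._check(offset, len(data))
        self._inputs[offset:offset + len(data)] = data

    def bus_read(self, offset: int, length: int) -> bytes:
        self._check(offset, length)
        return bytes(self._outputs[offset:offset + length])

    def core_inputs(self) -> bytes:
        """What the core sees: K, N, A, P packed back to back (255 bytes max)."""
        return bytes(self._inputs[:self.CONTROL_OFFSET])

    def control_byte(self) -> int:
        return self._inputs[self.CONTROL_OFFSET]

    def core_store(self, data: bytes) -> None:
        self._check(0, len(data))
        self._outputs[:len(data)] = data


window = SharedRegisterWindow()
window.bus_write(0x00, bytes(range(35)))   # hypothetical 35 input bytes
window.bus_write(0xFF, b"\x03")            # control byte: both reset bits set
print(window.bus_read(0x00, 4).hex())      # 00000000: reads come from the output bank
window.core_store(b"\xAA\xBB\xCC\xDD")     # the core publishes a (made-up) result
print(window.bus_read(0x00, 4).hex())      # aabbccdd
```

The point of the model is that a read at an address never returns what was written there, which is why the same 256 addresses can serve both directions.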

4.2.2 Memory Map of PowerPC

The entire 4 GB memory space is given in figure 4-3, with the lower 1.25 GB zoomed in. The map shows the default baseline address decoding employed on the AP1100. The AP1100 Baseline Platform supports 64 MB of SDRAM. This physical memory is aliased throughout the 512 MB range shown in the memory map. The lower and upper 1 MB portions within this space are used to store the U-Boot code, and care must be taken to ensure that this area of memory is not overwritten. When downloading data or other software to the AP1100, these lower and upper 1 MB portions of SDRAM are best avoided [ref. 4].

Figure 4-3. Memory Map of PowerPC [ref. 4] Notes: 1- Blue section indicates that these devices reside on the Local Bus. The PowerPC can access them through the PLB2OPB and OPB_EXT bridges. 2- Yellow section indicates that these devices reside on the OPB. The PowerPC can access them through the PLB2OPB bridge.

In this project CCM is a slave module that uses the memory space 0x40000000-0x400001FF, taken from the unused space (0x40000000-0x4B000000) shown in figure 4-3. Since it sits on the same PLB bus as the PowerPC, it can communicate with it directly. There are other options for CCM: it might be a slave on the OPB bus or a master on the PLB or OPB bus. It might be designed to communicate with the DDR SDRAM in case a large memory is needed for the result; in this case CCM should act as a master on the bus. If it sits as a master on the OPB bus, it needs an OPB2PLB bridge, which is a master on the PLB side and a slave on the OPB side.

4.3 CCM Implementation and Analysis

This section explains the CCM design from the RTL angle and analyzes some details of the synthesis report. The building blocks of CCM (ccm_core.vhd) are the cipher module, the key expansion and a control unit. It also needs a formatting procedure on the inputs, which is done in the package datatypes.vhd. The source VHDL files are provided in appendix F. The optimization goal in the Xilinx Synthesis Tool (XST) is set to speed, the optimization effort is set to normal, RAM/ROM extraction is activated and the RAM/ROM style is set to auto. The RTL schematic of CBC-MAC without the control unit is shown in figure 4-4.

Figure 4-4. CBC-MAC Schematic

The CTR mode duplicates the AES cipher n times, where n = ceil(Plen/128) + 1. Plen values of 32, 128 and 192 have been configured on the FPGA; among those cases ceil(192/128) + 1 = 3 was the maximum number of AES ciphers that needed to be built for CTR.

4.3.1 Key Expansion and Synthesis Analysis

The key expansion RTL schematic that is used for the VHDL code is shown in figure 4-5. The control unit, which basically uses a counter and drives the select lines for the multiplexers and other control lines (register and output enables etc.), is not shown in this figure.

Figure 4-5. Key Expansion RTL Schematic

As shown in the RTL schematic and the synthesis report (appendix C), there are 4 ROMs (16x128-bit ROM) and 4 multiplexers (8-bit 16-to-1 multiplexer). These resources are used to build the S-boxes, shown after synthesis in figure 4-6 (select lines and address lines are not shown in this figure). The 4 leftmost bits select one row (128 bits) of the S-box; the multiplexer then selects 8 bits out of the 128 coming from the ROM, using the 4 rightmost bits as its select signals to select the column.

Figure 4-6. S-box After Synthesis

The other multiplexer (8-bit 11-to-1 multiplexer) in the synthesis report is used to select the round constant using i/4 as its select signals. As for the timing analysis it takes 39 clock cycles from the time reset (synchronous reset) goes inactive and is sensed by the falling edge of the clock until the key expansion output (43 words) is produced.

4.3.2 Cipher Module and Synthesis Analysis

The cipher RTL schematic that is used for the VHDL code is shown in figure 4-7. The control unit, which basically consists of a counter and drives the select lines for the multiplexers and other control lines (register and output enables etc.), is not shown in this figure.

Figure 4-7. Cipher RTL Schematic

The file mix.vhd doing MixColumns operation is combinational and add.vhd is sequential feeding the feedback registers (the source VHDL files are provided in appendix F). The MixColumns operation used in this project is based on using ROMs; the transformation is given below [ref. 2]:

$$
\begin{bmatrix} c_{0,j} \\ c_{1,j} \\ c_{2,j} \\ c_{3,j} \end{bmatrix}
=
\begin{bmatrix} 02 & 03 & 01 & 01 \\ 01 & 02 & 03 & 01 \\ 01 & 01 & 02 & 03 \\ 03 & 01 & 01 & 02 \end{bmatrix}
\begin{bmatrix} b_{0,j} \\ b_{1,j} \\ b_{2,j} \\ b_{3,j} \end{bmatrix}
=
\left( \begin{bmatrix} 02 \\ 01 \\ 01 \\ 03 \end{bmatrix} \bullet b_{0,j} \right)
\oplus
\left( \begin{bmatrix} 03 \\ 02 \\ 01 \\ 01 \end{bmatrix} \bullet b_{1,j} \right)
\oplus
\left( \begin{bmatrix} 01 \\ 03 \\ 02 \\ 01 \end{bmatrix} \bullet b_{2,j} \right)
\oplus
\left( \begin{bmatrix} 01 \\ 01 \\ 03 \\ 02 \end{bmatrix} \bullet b_{3,j} \right)
$$

Each multiplication is implemented in a ROM (256x32-bit) that takes as input a byte value bi, j and returns a column (32-bit vector). These MixColumns tables are highly optimized automatically by the synthesis tool (XST).

As shown in the synthesis report (appendix A), there are 16 ROMs (16x128-bit ROM) and 16 multiplexers (8-bit 16-to-1 multiplexer). These are used to build the 16 S-boxes in the same way as explained for the key expansion. There are also 16 ROMs (256x32-bit ROM, appendix A) used for the MixColumns operation, according to the VHDL code. A rough calculation of the number of 4-input LUTs for the MixColumns ROMs gives the following: a single 256x1-bit ROM needs 16 4-input LUTs plus one 16-to-1 multiplexer; thus one 256x32-bit ROM needs 16*32 = 512 LUTs, and 16 such ROMs need 512*16 = 8192 LUTs. The number of LUTs shown in the synthesis report (MixColumns report) is as follows:

# LUT2 (2-input LUT) : 80
# LUT4 (4-input LUT) : 176

This is much lower than the estimate, because the ROMs for the MixColumns operation can be optimized extensively; the optimization is done automatically by XST. As for the timing, it takes 10 clock cycles from the time the reset (synchronous reset) goes inactive and is sensed by the falling edge of the clock until the output (16 bytes) is produced. The key expansion results must be valid in advance for the cipher unit to produce valid output data.

4.3.3 Comparison with Previous Research

Since the main underlying block cipher algorithm in CCM is AES with a 128-bit key, this section describes some previous implementations of AES on microprocessors and FPGA platforms. Software implementations use different processing data sizes on different platforms; they are based on an 8-bit, 16-bit or 32-bit (etc.) architecture depending on the microprocessor. AES hardware implementations are based on larger data path widths than software in order to gain higher throughput; a 128-bit implementation gives the highest throughput, in the Gigabit range, since it offers the greatest degree of parallelism in the computations. There are other techniques to increase throughput: some implementations unroll the rounds and others use pipelining inside the round. Nevertheless these techniques are not applicable in all modes of operation; for instance the CBC-MAC mode (described in section 3.1.1.2) cannot fully exploit unrolling with pipelining because of its feedback structure. Since the result of the previous encryption is needed as the input for the next step, the pipeline stalls, so the performance gain is small and hardware resources are wasted. All these techniques for increasing throughput come at a price. Using larger data path sizes in the architecture results in larger circuits. For instance an 8-bit architecture needs one S-box, while a 128-bit architecture uses 16 S-boxes to provide fully parallel processing (the S-boxes account for a large portion of the circuit area and are the largest parts of the AES implementation). As another example, for a 128-bit key the fully unrolled implementation uses roughly ten times more hardware resources than the iterative implementation [ref. 11].

4.3.3.1 Microprocessor Implementation

As mentioned in the introduction chapter, software implementations are generally slower than hardware, if not in clock frequency then in throughput (because of the high number of clock cycles), mainly because they lack instructions for modular arithmetic operations on long operands (128-bit operands in a fully concurrent implementation of AES), so they need more clock cycles to produce the result. Their throughput is usually in the Megabit range. However, previous research has produced some techniques that lead to more efficient software implementations.

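There are different ways of doing MixColumns. After multiplying the two matrices in GF($2^8$) (described in section 3.2.1), MixColumns is expressed as

$$
\begin{aligned}
s'_{0,j} &= 2 \cdot s_{0,j} \oplus 3 \cdot s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{1,j} &= s_{0,j} \oplus 2 \cdot s_{1,j} \oplus 3 \cdot s_{2,j} \oplus s_{3,j} \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus 2 \cdot s_{2,j} \oplus 3 \cdot s_{3,j} \\
s'_{3,j} &= 3 \cdot s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus 2 \cdot s_{3,j}
\end{aligned}
$$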

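Multiplication by 2 in GF($2^8$) is a 1-bit left shift followed by a conditional bitwise XOR with “00011011” (applied when the bit shifted out is 1). In [ref. 12], which uses a 32-bit processor, the above equations are rewritten in the following format:

$$
\begin{aligned}
s'_{0,j} &= s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \oplus \bigl(2 \cdot (s_{0,j} \oplus s_{1,j})\bigr) \\
s'_{1,j} &= s_{0,j} \oplus s_{2,j} \oplus s_{3,j} \oplus \bigl(2 \cdot (s_{1,j} \oplus s_{2,j})\bigr) \\
s'_{2,j} &= s_{0,j} \oplus s_{1,j} \oplus s_{3,j} \oplus \bigl(2 \cdot (s_{2,j} \oplus s_{3,j})\bigr) \\
s'_{3,j} &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus \bigl(2 \cdot (s_{0,j} \oplus s_{3,j})\bigr)
\end{aligned}
$$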

Consequently the instructions can be executed in the manner given in table 4-2. In their implementation the S-box is stored in memory, and they report the following numbers of clock cycles for encryption: 1675 cycles on the ARM7TDMI, 1384 cycles on the ARM9TDMI and 1119 cycles on the Pentium III.

Table 4-2. Instructions Execution for MixColumns [ref. 12]

First Instruction      Second Instruction   Third Instruction
y0 = x1 ⊕ x2 ⊕ x3      x0 = 2.x0            y0 = x0 ⊕ x1
y1 = x0 ⊕ x2 ⊕ x3      x1 = 2.x1            y1 = x1 ⊕ x2
y2 = x0 ⊕ x1 ⊕ x3      x2 = 2.x2            y2 = x2 ⊕ x3
y3 = x0 ⊕ x1 ⊕ x2      x3 = 2.x3            y3 = x0 ⊕ x3

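Another approach, introduced in [ref. 2] for 8-bit processors, rewrites MixColumns as follows:

$$
\begin{aligned}
tmp &= s_{0,j} \oplus s_{1,j} \oplus s_{2,j} \oplus s_{3,j} \\
s'_{0,j} &= s_{0,j} \oplus tmp \oplus \bigl[2 \cdot (s_{0,j} \oplus s_{1,j})\bigr] \\
s'_{1,j} &= s_{1,j} \oplus tmp \oplus \bigl[2 \cdot (s_{1,j} \oplus s_{2,j})\bigr] \\
s'_{2,j} &= s_{2,j} \oplus tmp \oplus \bigl[2 \cdot (s_{2,j} \oplus s_{3,j})\bigr] \\
s'_{3,j} &= s_{3,j} \oplus tmp \oplus \bigl[2 \cdot (s_{3,j} \oplus s_{0,j})\bigr]
\end{aligned}
$$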

They also suggest that in order to make the implementation resistant against timing attacks, multiplication by 2 in the Galois Field (that is a conditional XOR operation) can be replaced by a lookup table [ref. 2].
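The equivalence of the direct form and the two rewritten forms, together with the lookup-table variant of the multiplication by 2, can be checked with a short sketch. The function and table names are hypothetical, and the random test columns are arbitrary.

```python
import random


def xtime(b: int) -> int:
    """Multiply by 2 in GF(2^8) with a data-dependent conditional XOR."""
    b <<= 1
    return b ^ 0x11B if b & 0x100 else b


# Lookup table for multiplication by 2: no data-dependent branch at run time.
XTIME_TABLE = [xtime(v) for v in range(256)]


def mix_direct(s):
    m2 = XTIME_TABLE
    m3 = [m2[v] ^ v for v in range(256)]
    s0, s1, s2, s3 = s
    return [m2[s0] ^ m3[s1] ^ s2 ^ s3,
            s0 ^ m2[s1] ^ m3[s2] ^ s3,
            s0 ^ s1 ^ m2[s2] ^ m3[s3],
            m3[s0] ^ s1 ^ s2 ^ m2[s3]]


def mix_32bit_form(s):
    """Rewritten form attributed to [ref. 12] in the text."""
    m2 = XTIME_TABLE
    s0, s1, s2, s3 = s
    return [s1 ^ s2 ^ s3 ^ m2[s0 ^ s1],
            s0 ^ s2 ^ s3 ^ m2[s1 ^ s2],
            s0 ^ s1 ^ s3 ^ m2[s2 ^ s3],
            s0 ^ s1 ^ s2 ^ m2[s0 ^ s3]]


def mix_8bit_form(s):
    """Rewritten form with the shared term tmp, attributed to [ref. 2]."""
    m2 = XTIME_TABLE
    s0, s1, s2, s3 = s
    tmp = s0 ^ s1 ^ s2 ^ s3
    return [s0 ^ tmp ^ m2[s0 ^ s1],
            s1 ^ tmp ^ m2[s1 ^ s2],
            s2 ^ tmp ^ m2[s2 ^ s3],
            s3 ^ tmp ^ m2[s3 ^ s0]]


rng = random.Random(2006)
columns = [[rng.randrange(256) for _ in range(4)] for _ in range(1000)]
all_equal = all(mix_direct(c) == mix_32bit_form(c) == mix_8bit_form(c)
                for c in columns)
print(all_equal)  # True
```

Both rewritten forms need only one multiplication by 2 per output byte, which is what makes them attractive on processors without GF($2^8$) support.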

4.3.3.2 FPGA Implementations

This section describes different design implementations with high throughput as the main optimization goal. FPGA implementations mostly use a 128-bit architecture to reach the full parallelism and concurrency in computations within each round.

4.3.3.2.1 AES Iterative Implementation

The proposed 128-bit iterative architecture has been designed to reach the Gigabit throughput range. It exploits the iterative structure of AES and provides maximum hardware utilization, since it reuses the hardware in each round. The general iterative block diagram without the control unit and signals is shown in figure 4-8. In the round transformation, the ShiftRows operation in a 128-bit architecture comes for free because no logic resources are used; in this case ShiftRows is a routing issue and is accomplished by simple rewiring.


Figure 4-8. AES Iterative Implementation Notes: ShiftRows operation is just rewiring in 128-bit architecture without any hardware cost.
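The statement that ShiftRows is only rewiring can be made concrete with a small Python sketch: the operation is a fixed permutation of the 16 byte positions, so no data-dependent logic is involved. The sketch assumes the usual AES column-major byte order (byte index = row + 4 × column); the name SHIFT_ROWS_WIRING is illustrative.

```python
# Output byte (row r, column c) is wired to input byte (row r, column (c + r) mod 4).
SHIFT_ROWS_WIRING = [r + 4 * ((c + r) % 4) for c in range(4) for r in range(4)]


def shift_rows(state):
    """Apply ShiftRows to a 16-byte state by pure index selection."""
    return [state[src] for src in SHIFT_ROWS_WIRING]


print(SHIFT_ROWS_WIRING)
print(shift_rows(list(range(16))))
```

In the cipher code of appendix F, the assignment s(i)(j)<=p((i+j) mod 4)(j) appears to express the same fixed mapping, with i as the column index and j as the row index.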

This architecture was completely implemented on the SoC platform in this thesis. The maximum clock frequency reported by synthesis was 176.398 MHz. However, the fastest clock available in the provided baseline system was 80 MHz, and the CCM core works at this frequency. The cipher uses 10 clock cycles to produce one 128-bit AES ciphertext block; at the synthesis estimate of 176 MHz this gives a throughput of

(128 × 176) / 10 Mbit/s ≈ 2.25 Gbit/s

The CCM throughput depends on the input data length. For instance, in the case where Tlen=32, Nlen=56, Alen=64 and Plen=32, CBC-MAC needs three executions of the AES cipher, which yields a throughput of

2.25 Gbit/s / 3 = 0.75 Gbit/s

It is important to mention that in order to reach higher frequencies, a DCM (digital clock manager) unit should be used as a frequency multiplier. This was not implemented in this thesis due to time constraints. For the detailed FPGA resources that have been used, refer to the synthesis report (appendix A).
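The throughput arithmetic can be reproduced with a few lines of Python. The sketch evaluates the same formula at the synthesis estimate and, as an additional illustration that is implied by the numbers above but not reported as a measurement, at the 80 MHz PLB clock that actually drives the core. Function names are illustrative.

```python
def aes_throughput_mbit(clock_mhz, cycles_per_block=10, block_bits=128):
    """Throughput in Mbit/s of an iterative core producing one block every N cycles."""
    return block_bits * clock_mhz / cycles_per_block


def ccm_throughput_mbit(clock_mhz, aes_runs):
    """Throughput when the sequential CBC-MAC needs `aes_runs` cipher executions."""
    return aes_throughput_mbit(clock_mhz) / aes_runs


print(aes_throughput_mbit(176.0))      # rounded synthesis estimate used in the text
print(aes_throughput_mbit(80.0))       # PLB clock of the baseline system
print(ccm_throughput_mbit(176.0, 3))   # Tlen=32, Nlen=56, Alen=64, Plen=32
```

The gap between the first two results shows how much of the synthesized speed remains unused while the core is clocked from the PLB.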

4.3.3.2.2 AES Unrolled Implementation

In applications where even higher throughput is required, loop unrolling could be used. In order to achieve the highest throughput, for instance when the key size is 128 bits all ten rounds can be unrolled and pipelining registers can be inserted between the rounds. Obviously the highest throughput comes at the price of about ten times more hardware resources [ref. 11]. The general block diagram of a fully unrolled pipelined architecture is given in figure 4-9 (the control unit and signals are not shown).


Figure 4-9. AES Unrolled Pipelined Architecture [ref. 11]

It is important to mention that efficient place and route may become a critical issue in a large pipelined cipher architecture, compared with smaller iterative implementations [ref. 11].
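The difference between the iterative and the fully unrolled pipelined architecture can be sketched as a simple cycle count. The model below is an idealization and not a result from [ref. 11]: it assumes both designs run at the same clock, one pipeline register per round, and input blocks that are independent of each other (as in CTR mode; a chained mode such as CBC-MAC cannot keep the pipeline full because each block depends on the previous output).

```python
def iterative_cycles(blocks, rounds=10):
    """One round per clock, hardware reused: each block occupies the core for all rounds."""
    return blocks * rounds


def pipelined_cycles(blocks, rounds=10):
    """Fully unrolled, one register per round: fill the pipeline once, then one block per clock."""
    if blocks == 0:
        return 0
    return rounds + (blocks - 1)


for n in (1, 10, 100, 1000):
    print(n, iterative_cycles(n), pipelined_cycles(n))
```

For long streams the ratio approaches the number of rounds, which matches the roughly tenfold hardware cost mentioned for the unrolled design.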

4.3.4 Conclusion

The CCM unit sits as a device on the PLB bus (refer to section 4.2 for details) and the platform has all the necessary elements for a SoC design; this makes further on-chip development or modification of this project very easy. CCM can easily communicate with the PowerPC, the DDR SDRAM controller or the BRAM controller, which sit on the same PLB bus at 80 MHz. Since the main part of CCM is the AES cipher, results from some previous research on AES encryption are given in table 4-3 for rough comparison; the results are not directly comparable since they use different FPGA technologies, such as Virtex-E.

Table 4-3. AES Encryption Results

Implementation   # of LUTs   # of Slices   # of RAM Blocks   Throughput (Mbit/s)
[ref. 15]        NA          2222          100               6956
[ref. 14]        3516        2784          100               11776
[ref. 16]        889         NA            10                1187
[ref. 15]        877         542           10                1450
[ref. 17]        NA          1880          0                 589
[ref. 15]        2524        1767          0                 2085
[ref. 17]        NA          2529          0                 833
[ref. 15]        3846        2257          0                 2008
Our design       2948        2717          0                 2250

The implementations above use either an iterative or an unrolled architecture, each of which offers a different tradeoff between resources and throughput. An iterative implementation uses fewer hardware resources but has lower throughput than an unrolled architecture.

4.4 Testing and Debugging

The Xilinx Microprocessor Debugger (XMD) console has been used for testing and debugging the circuit. The XMD console provides a Tool Command Language (Tcl) interface. This interface can be used for command line control and debugging of the target as well as for running complex verification scripts to test a system thoroughly [ref. 7]. The PowerPC JTAG logic in the baseline system is connected through the native JTAG port of the FPGA (series connection) [ref. 13]. The JTAG chain inside the FPGA runs through the two PowerPCs. The chain includes an interface bus named JTAGPPC that contains all the JTAG signals. The test vectors for this project are taken from NIST Special Publication 800-38C [ref. 1] and are given in appendix E with the corresponding generics that are applied before the FPGA configuration.

4.5 Software Tools and Some Practical Recommendations

The software tools used to support the multi-IP-based SoC platform are as follows:

- ISE (version: 7.1.04i)
- ModelSim Simulator
- Platform Studio (version: Xilinx EDK 7.1.2)
- iMPACT (version: 7.1.04i) for configuring the FPGA

The purpose of this section is to go through the issues that were poorly explained in the documentation and therefore caused some time-consuming problems throughout the research.

Connecting the Design to a Specific Interface

In general there are two ways to hook up a design to a specific IP interface (IPIF) in the Platform Studio software; however, the manual does not clearly provide this important high-level view of connecting IPs to the system. One way is to use “Create/Import Peripheral…” from the Tools menu, which provides a friendly user interface. The particular problem that we had with “Create/Import Peripheral…” was dealing with multi-dimensional arrays as inputs or outputs of the ccm_core.vhd code; these arrays were supposed to be connected to the PLB controller IPIF. The alternative way, which we used to tackle this problem without changing the VHDL code, was to modify user_logic.vhd for the PLB controller IPIF and define ccm_core.vhd as a component within this core.

Defining the Order in a Modular Design

The Peripheral Analyze Order file (.pao file) defines the ordered list of HDL files in a library needed for synthesis and simulation. The files must be listed bottom-up. For instance, if a core named A.vhd contains B.vhd, then B must precede A in the .pao file.
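The bottom-up rule amounts to a topological ordering of the HDL files. The Python sketch below derives such an order for the files of this project; the dependency table is read off the component and package usage in appendix F, and the printed `lib <library> <file>` line format is an assumption that should be checked against the EDK documentation before use.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# file -> files it instantiates or imports (taken from appendix F and user_logic.vhd)
DEPENDS_ON = {
    "datatypes": set(),
    "mix": {"datatypes"},
    "add": {"datatypes"},
    "expandkey": {"datatypes"},
    "cipher_mod": {"datatypes", "mix", "add"},
    "ccm_core": {"datatypes", "cipher_mod", "expandkey"},
    "user_logic": {"ccm_core"},
}


def bottom_up_order(depends_on):
    """Return the files so that every file appears after everything it contains."""
    return list(TopologicalSorter(depends_on).static_order())


for name in bottom_up_order(DEPENDS_ON):
    # Assumed .pao line format; the library name is the one used in appendix F.
    print(f"lib ccm_v1_00_a {name}")
```

Several valid orders exist (for example, mix and add may be swapped); any order that keeps each file after its dependencies satisfies the rule.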

Some Useful Miscellaneous Recommendations

Due to poor documentation, the useful paths that were found while tackling the problems are listed below.

- Platform Studio creates a project file that can also be opened in the ISE software, which makes it easy to switch between Platform Studio and XST. This .ise file is located in “MY_ProjectFolder\pcores\MY_Peripheral\devl\projnav\”.
- There is a README.txt file located in “MY_ProjectFolder\pcores\MY_Peripheral\devl\” that gives some useful information on your peripheral and some signal definitions.
- Depending on the design complexity, the place and route (PAR) process can be very lengthy. To make it faster, you can reduce the overall effort level (ol) to the lowest level, which is standard. The modification is done in the fast_runtime.opt file located in “MY_ProjectFolder\pcores\MY_Peripheral\etc\”.
- There was one particular problem with the synthesis tool when concatenating (using the ‘&’ operator) all 32 slave registers. The tool was unable to perform the synthesis on the following line:

read_vec<=slv_reg0 & slv_reg1 & … & slv_reg30 & slv_reg31;

The solution was to concatenate the registers in groups of eight into temporary signals and then concatenate these temporary signals:

read_vec_0<=slv_reg0 & slv_reg1 & … & slv_reg6 & slv_reg7;
read_vec_1<=slv_reg8 & slv_reg9 & … & slv_reg14 & slv_reg15;
read_vec_2<=slv_reg16 & slv_reg17 & … & slv_reg22 & slv_reg23;
read_vec_3<=slv_reg24 & slv_reg25 & … & slv_reg30 & slv_reg31;
read_vec<= read_vec_0 & read_vec_1 & read_vec_2 & read_vec_3;

5 Discussion and Conclusions

5.1 Summary

The purpose of this thesis is to build a self-contained on-chip system that includes CCM as the security core. This research was one of two projects among Canadian universities that were tested to get the SoC platform working successfully, after much trouble due to a lack of documentation. This research was also the first in Canada to implement a system including the CCM mode of operation that is based on a multi-IP approach to produce a complete SoC. CCM is implemented based on the iterative architecture (described in chapter 4). The main feature that makes this system very flexible is the use of soft IP peripherals. Except for the hard IP cores (i.e. microprocessors and block RAMs), the other cores, such as CCM, the buses and the device controllers, are soft IP peripherals. With the emergence of powerful FPGAs with efficient on-chip cores (i.e. microprocessors and memory), the idea of building a SoC from multiple soft IP cores could yield very flexible self-contained solutions.

5.2 Limitations and Future Work

In order to save FPGA resources without significant degradation in performance, it is possible to split the security algorithm into two sections: key expansion could be implemented on the PowerPC while the cipher section could be implemented using CLBs. This would have little impact on performance since key expansion has to be executed only once in each key lifetime.

From the top-level architectural point of view, in future work CCM could also be transformed into a device that is able to interrupt the PowerPC, or into a master device communicating with the DDR SDRAM or BRAM available on the board, according to the application needs.

There are other ways of building the S-box tables or performing the MixColumns operation. Instead of using lookup tables, at the algorithm level it is possible to use mathematical operations over the Galois Field. These methods could be further investigated to determine how they would affect speed, area or power consumption.

As mentioned before, the maximum clock frequency that has been tested is the PLB clock at 80 MHz; in order to test higher frequencies, a Digital Clock Manager (DCM) has to be used as a frequency multiplier that feeds the CCM input clock.

The counter (CTR) mode in this project is not designed optimally, since it duplicates the cipher core (⌈Plen/128⌉ + 1) times without improving the overall CCM speed: the CBC-MAC mode, which works with CTR to produce the output, slows down CCM no matter how fast CTR works. This could be researched in the future.
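The imbalance between CTR and CBC-MAC can be quantified with a short Python sketch. It counts the parallel cipher cores instantiated for CTR and the sequential cipher executions that CBC-MAC needs, using the block formatting of [ref. 1] under the assumption that the associated data is shorter than 2^16 − 2^8 bytes (two-byte length prefix). The function names are illustrative.

```python
def ceil_div(n, d):
    """Integer ceiling of n / d for non-negative n and positive d."""
    return -(-n // d)


def ctr_cipher_cores(plen_bits):
    """Cipher cores duplicated for CTR: one per payload block plus one for S0."""
    return ceil_div(plen_bits, 128) + 1


def cbc_mac_runs(alen_bits, plen_bits):
    """Sequential cipher executions of CBC-MAC: B0, associated data blocks, payload blocks."""
    a_bytes = alen_bits // 8
    p_bytes = plen_bits // 8
    runs = 1                                  # B0
    if a_bytes > 0:
        runs += ceil_div(2 + a_bytes, 16)     # 2-byte length prefix + data, zero padded
    runs += ceil_div(p_bytes, 16)             # payload, zero padded
    return runs


for alen, plen in [(64, 32), (128, 128), (160, 192)]:
    print(alen, plen, ctr_cipher_cores(plen), cbc_mac_runs(alen, plen))
```

All CTR cores finish after one cipher latency, while the CBC-MAC chain needs as many latencies as it has blocks, so a single CTR core reused over the same period would deliver its keystream blocks no later than the MAC is ready whenever the number of CTR blocks does not exceed the number of CBC-MAC runs, as is the case for the three parameter sets printed here.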

References

[ref. 1] Morris Dworkin, “Recommendation for Block Cipher Modes of Operation: The CCM Mode for Authentication and Confidentiality”, NIST Special Publication 800-38C, May 2004. http://csrc.nist.gov/publications/nistpubs/800-38C/SP800-38C.pdf.

[ref. 2] William Stallings, “Cryptography and Network Security”, Prentice Hall, fourth edition, 2006.

[ref. 3] Amirix AP1000 Datasheet, Amirix Systems Inc., Oct. 2005. http://www.amirix.com/downloads/ap1000.pdf.

[ref. 4] AP1000 FPGA Development Board Users Guide, AMIRIX Systems Inc., Sep. 2007. Document #: DOC-004017 Version 02.

[ref. 5] iMPACT Overview, Xilinx Inc., 2005. http://toolbox.xilinx.com/docsan/xilinx7/help/iseguide/mergedProjects/impact/html/imp_b_overview.htm.

[ref. 6] Platform Studio Debugging PowerPC Hardware Setup, Xilinx Inc., 2005. http://toolbox.xilinx.com/docsan/xilinx8/EDKHelp/platform_studio/html/ps_p_dbg_debugging_ppc_hw_setup.htm.

[ref. 7] Embedded System Tools Reference Manual Embedded Development Kit EDK 7.1i

[ref. 8] Virtex-II Pro and Virtex-II Pro X Platform FPGAs Complete Data Sheet, Xilinx Inc., Oct. 2005. http://www.xilinx.com/bvdocs/publications/ds083.pdf.

[ref. 9] Virtex-II Pro and Virtex-II Pro X FPGA User Guide, Xilinx Inc., March 2005. http://www.xilinx.com/bvdocs/userguides/ug012.pdf.

[ref. 10] IPIF PLB Xilinx core, Xilinx Inc., Aug. 2004.

[ref. 11] Martin Feldhofer, Kerstin Lemke, Elisabeth Oswald, François-Xavier Standaert, Thomas Wollinger and Johannes Wolkerstorfer, “State of the Art in Hardware Architectures”, ECRYPT, Sep. 2005. http://www.iaik.tugraz.at/research/krypto/AES/VAM2-IAIK-17-D.VAM2-1_0.pdf.

[ref. 12] Guido Bertoni, Luca Breveglieri, Pasqualina Fragneto, Marco Macchetti, and Stefano Marchesin, “Efficient Software Implementation of AES on 32-Bit Platforms”, CHES 2003, Germany. http://www.springerlink.com/media/1lbfddawqm0urn5m8eeq/contributions/u/v/x/5/uvx5nqgnn55vk199.pdf.

[ref. 13] PowerPC 405 Processor Block Reference Guide, Xilinx Inc., Jul. 2005. http://www.xilinx.com/bvdocs/userguides/ug018.pdf.

[ref. 14] Francois-Xavier Standaert, Gael Rouvroy, Jean-Jacques Quisquater, and Jean-Didier Legat, “Efficient Implementation of Rijndael Encryption in Reconfigurable Hardware: Improvements and Design Tradeoffs”, CHES 2003, Germany.

[ref. 15] M. McLoone and J.V. McCanny, “High Performance Single Chip FPGA Rijndael Algorithm Implementations”, in the Proceedings of CHES 2001: The Third International CHES Workshop, Lecture Notes in Computer Science, LNCS 2162, pp. 65–76, Springer-Verlag.

[ref. 16] Helion Technology, “High Performance AES (Rijndael) Cores for XILINX FPGA”, CHES 2003, Germany. http://www.heliontech.com.

[ref. 17] A. Satoh et al., “Compact Hardware Architecture for 128-bit Block Cipher”, in the Proceedings of the Third NESSIE Workshop, November 6–7, 2002, Munich, Germany.


Appendix A: AES Cipher HDL Synthesis Report

======
* Final Report *
======
Final Results
RTL Top Level Output File Name : cipher_mod.ngr
Top Level Output File Name : cipher_mod
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO

Design Statistics
# IOs : 1668

Macro Statistics :
# ROMs : 32
#   16x128-bit ROM : 16
#   256x32-bit ROM : 16
# Registers : 34
#   1-bit register : 1
#   4-bit register : 1
#   8-bit register : 32
# Multiplexers : 32
#   1-bit 11-to-1 multiplexer : 16
#   8-bit 16-to-1 multiplexer : 16

Cell Usage :
# BELS : 6041
#   INV : 1
#   LUT2 : 123
#   LUT2_D : 36
#   LUT2_L : 16
#   LUT3 : 179
#   LUT3_D : 1
#   LUT3_L : 1215
#   LUT4 : 2948
#   LUT4_D : 129
#   LUT4_L : 416
#   MUXF5 : 576
#   MUXF6 : 272
#   MUXF7 : 128
#   VCC : 1
# FlipFlops/Latches : 1041
#   FDE_1 : 891
#   FDRE_1 : 150
# Clock Buffers : 1
#   BUFGP : 1
# IO Buffers : 1667
#   IBUF : 1538
#   OBUF : 129

======
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE. FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------+------+------+
Clock Signal | Clock buffer(FF name) | Load |
------+------+------+
clk | BUFGP | 1041 |
------+------+------+

Timing Summary:
------
Speed Grade: -6

Minimum period: 5.669ns (Maximum Frequency: 176.398MHz)
Minimum input arrival time before clock: 4.214ns
Maximum output required time after clock: 3.615ns
Maximum combinational path delay: No path found


Appendix B: MixColumns HDL Synthesis Report

======
* Final Report *
======
Final Results
RTL Top Level Output File Name : mix.ngr
Top Level Output File Name : mix
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO

Design Statistics
# IOs : 256

Macro Statistics :
# ROMs : 16
#   256x32-bit ROM : 16

Cell Usage :
# BELS : 256
#   LUT2 : 80
#   LUT4 : 176
# IO Buffers : 256
#   IBUF : 128
#   OBUF : 128
======


Appendix C: Key Expansion HDL Synthesis Report

======
* Final Report *
======
Final Results
RTL Top Level Output File Name : expandkey.ngr
Top Level Output File Name : expandkey
Output Format : NGC
Optimization Goal : Speed
Keep Hierarchy : NO

Design Statistics
# IOs : 1539

Macro Statistics :
# ROMs : 4
#   16x128-bit ROM : 4
# Registers : 183
#   1-bit register : 7
#   8-bit register : 176
# Multiplexers : 5
#   8-bit 11-to-1 multiplexer : 1
#   8-bit 16-to-1 multiplexer : 4
# Adders/Subtractors : 1
#   6-bit adder : 1

Cell Usage :
# BELS : 3408
#   GND : 1
#   INV : 1
#   LUT1 : 5
#   LUT2 : 72
#   LUT2_D : 4
#   LUT3 : 154
#   LUT3_D : 35
#   LUT3_L : 328
#   LUT4 : 1740
#   LUT4_D : 378
#   LUT4_L : 443
#   MUXCY : 5
#   MUXF5 : 140
#   MUXF6 : 64
#   MUXF7 : 32
#   VCC : 1
#   XORCY : 5
# FlipFlops/Latches : 1632
#   FDE_1 : 1536
#   FDRE_1 : 89
#   FDSE_1 : 7
# Clock Buffers : 1
#   BUFGP : 1
# IO Buffers : 1538
#   IBUF : 129
#   OBUF : 1409

======
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE. FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------+------+------+
Clock Signal | Clock buffer(FF name) | Load |
------+------+------+
clk | BUFGP | 1632 |
------+------+------+

Timing Summary:
------
Speed Grade: -6

Minimum period: 8.873ns (Maximum Frequency: 112.701MHz)
Minimum input arrival time before clock: 4.992ns
Maximum output required time after clock: 3.692ns
Maximum combinational path delay: No path found


Appendix D: S-box (AES Forward Cipher)

63 7C 77 7B F2 6B 6F C5 30 01 67 2B FE D7 AB 76
CA 82 C9 7D FA 59 47 F0 AD D4 A2 AF 9C A4 72 C0
B7 FD 93 26 36 3F F7 CC 34 A5 E5 F1 71 D8 31 15
04 C7 23 C3 18 96 05 9A 07 12 80 E2 EB 27 B2 75
09 83 2C 1A 1B 6E 5A A0 52 3B D6 B3 29 E3 2F 84
53 D1 00 ED 20 FC B1 5B 6A CB BE 39 4A 4C 58 CF
D0 EF AA FB 43 4D 33 85 45 F9 02 7F 50 3C 9F A8
51 A3 40 8F 92 9D 38 F5 BC B6 DA 21 10 FF F3 D2
CD 0C 13 EC 5F 97 44 17 C4 A7 7E 3D 64 5D 19 73
60 81 4F DC 22 2A 90 88 46 EE B8 14 DE 5E 0B DB
E0 32 3A 0A 49 06 24 5C C2 D3 AC 62 91 95 E4 79
E7 C8 37 6D 8D D5 4E A9 6C 56 F4 EA 65 7A AE 08
BA 78 25 2E 1C A6 B4 C6 E8 DD 74 1F 4B BD 8B 8A
70 3E B5 66 48 03 F6 0E 61 35 57 B9 86 C1 1D 9E
E1 F8 98 11 69 D9 8E 94 9B 1E 87 E9 CE 55 28 DF
8C A1 89 0D BF E6 42 68 41 99 2D 0F B0 54 BB 16
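The table can be cross-checked by computing the S-box from its mathematical definition instead of storing it: each entry is the multiplicative inverse in GF(2^8) followed by the AES affine transformation. The Python sketch below is an illustration of that definition, not the way the thesis hardware builds its ROMs; the row of an entry is its high nibble and the column its low nibble.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse as a^254; maps 0 to 0 as AES requires."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x):
    """Inverse in GF(2^8) followed by the affine transformation with constant 0x63."""
    b = gf_inv(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]

# Print in the same 16 x 16 layout as the table above.
for row in range(16):
    print(" ".join(f"{SBOX[16 * row + col]:02X}" for col in range(16)))
```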


Appendix E: Test Vectors

These are the test vectors that were used for verifying the implementation through the XMD console. The following examples are taken from [ref. 1].

Example 1
The generics in the following example: Klen = 128, Tlen = 32, Nlen = 56, Alen = 64, and Plen = 32.
K: 40414243 44454647 48494a4b 4c4d4e4f
N: 10111213 141516
A: 00010203 04050607
P: 20212223
B: 4f101112 13141516 00000000 00000004
   00080001 02030405 06070000 00000000
   20212223 00000000 00000000 00000000
T: 6084341b
Ctr0: 07101112 13141516 00000000 00000000
S0: 2d281146 10676c26 32bad748 559a679a
Ctr1: 07101112 13141516 00000000 00000001
S1: 51432378 e474b339 71318484 103cddfb
C: 7162015b 4dac255d

Example 2
The generics in the following example: Klen = 128, Tlen = 48, Nlen = 64, Alen = 128, and Plen = 128.
K: 40414243 44454647 48494a4b 4c4d4e4f
N: 10111213 14151617
A: 00010203 04050607 08090a0b 0c0d0e0f
P: 20212223 24252627 28292a2b 2c2d2e2f
B: 56101112 13141516 17000000 00000010
   00100001 02030405 06070809 0a0b0c0d
   0e0f0000 00000000 00000000 00000000
   20212223 24252627 28292a2b 2c2d2e2f
T: 7f479ffc a464
Ctr0: 06101112 13141516 17000000 00000000
S0: 6081d043 08a97dcc 20cdcc60 bf947b78
Ctr1: 06101112 13141516 17000000 00000001
S1: f280d2c3 75cf7945 20335db9 2b107712
C: d2a1f0e0 51ea5f62 081a7792 073d593d 1fc64fbf accd

Example 3
The generics in the following example: Klen = 128, Tlen = 64, Nlen = 96, Alen = 160, and Plen = 192.
K: 40414243 44454647 48494a4b 4c4d4e4f
N: 10111213 14151617 18191a1b
A: 00010203 04050607 08090a0b 0c0d0e0f 10111213
P: 20212223 24252627 28292a2b 2c2d2e2f 30313233 34353637
B: 5a101112 13141516 1718191a 1b000018
   00140001 02030405 06070809 0a0b0c0d
   0e0f1011 12130000 00000000 00000000
   20212223 24252627 28292a2b 2c2d2e2f
   30313233 34353637 00000000 00000000
T: 67c99240 c7d51048
Ctr0: 02101112 13141516 1718191a 1b000000
S0: 2f8a00bb 06658919 c3a040a6 eaed1a7f
Ctr1: 02101112 13141516 1718191a 1b000001
S1: c393238a d1923c5d b335c0c7 e1bac924
Ctr2: 02101112 13141516 1718191a 1b000002
S2: 514798ea 9077bc92 6c22ebef 2ac732dc
C: e3b201a9 f5b71a7a 9b1ceaec cd97e70b 6176aad9 a4428aa5 484392fb c1b09951
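The parts of these vectors that do not require running AES can be reproduced directly from the formatting rules of [ref. 1]. The Python sketch below rebuilds the formatted input B and the counter blocks, and reassembles C from P, T and the listed S blocks. It assumes associated data shorter than 2^16 − 2^8 bytes (two-byte length prefix); the function names are illustrative and unrelated to the format and format_ctr functions of the VHDL package.

```python
def format_b0(nonce, tag_bytes, payload_len, has_adata):
    """First CBC-MAC block: flags byte, nonce, payload length in q bytes."""
    q = 15 - len(nonce)
    flags = (0x40 if has_adata else 0) | (((tag_bytes - 2) // 2) << 3) | (q - 1)
    return bytes([flags]) + nonce + payload_len.to_bytes(q, "big")


def format_ctr(nonce, index):
    """Counter block Ctr_i: flags byte q-1, nonce, block index in q bytes."""
    q = 15 - len(nonce)
    return bytes([q - 1]) + nonce + index.to_bytes(q, "big")


def pad16(data):
    """Zero-pad to a multiple of 16 bytes."""
    return data + bytes(-len(data) % 16)


def format_b(nonce, adata, payload, tag_bytes):
    """Complete CBC-MAC input B = B0 || encoded associated data || payload."""
    b = format_b0(nonce, tag_bytes, len(payload), bool(adata))
    if adata:
        # Two-byte length encoding, valid for 0 < len(adata) < 2^16 - 2^8.
        b += pad16(len(adata).to_bytes(2, "big") + adata)
    return b + pad16(payload)


def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def assemble_ciphertext(payload, tag, s_blocks):
    """C = (P xor MSB_Plen(S1 || S2 || ...)) || (T xor MSB_Tlen(S0))."""
    keystream = b"".join(s_blocks[1:])
    return xor_bytes(payload, keystream) + xor_bytes(tag, s_blocks[0])


# Example 1
N = bytes.fromhex("10111213 141516")
A = bytes.fromhex("00010203 04050607")
P = bytes.fromhex("20212223")
T = bytes.fromhex("6084341b")
S0 = bytes.fromhex("2d281146 10676c26 32bad748 559a679a")
S1 = bytes.fromhex("51432378 e474b339 71318484 103cddfb")

print(format_b(N, A, P, 4).hex())
print(format_ctr(N, 0).hex(), format_ctr(N, 1).hex())
print(assemble_ciphertext(P, T, [S0, S1]).hex())
```

A check of this kind separates formatting errors from cipher errors when a hardware result read back through XMD does not match the expected C.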


Appendix F: VHDL Codes

CCM Core VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity ccm_core is
  generic(
    Klen: natural:=128;
    Tlen: natural:=32;
    Nlen: natural:=56;
    Alen: natural:=64;
    Plen: natural:=32
  );
  port(
    clk: in std_logic;
    rst_exp: in std_logic;
    rst_ciph: in std_logic;
    key_in: in std_logic_vector (Klen-1 downto 0);
    P: in std_logic_vector (Plen-1 downto 0);
    nonce: in std_logic_vector(Nlen-1 downto 0);
    Adata: in std_logic_vector(Alen-1 downto 0);
    oe: out std_logic;
    C_out: out std_logic_vector (Plen+Tlen-1 downto 0)
  );
end ccm_core;

architecture struct of ccm_core is
  component cipher_mod
    port(
      input: in key;
      key_exp: in word_arr;
      clk: in std_logic;
      rst: in std_logic;
      en_exp: in std_logic;
      oe: out std_logic;
      state_out: out state
    );
  end component;
  component expandkey is
    port (clk: in std_logic;
      key_in: in key;
      reset: in std_logic;
      en_exp: out std_logic;
      key_exp: out word_arr
    );
  end component;
  signal enexp, r3: std_logic;
  signal keyexp: word_arr;
  signal ctr_oe: std_logic_vector(ceil(Plen) downto 0);
  signal s: st_arr(ceil(Plen) downto 0);
  signal ctr_in: key_arr(ceil(Plen) downto 0);
  signal k: key;
  signal r_count: natural range rgen(Alen, Plen)+1 downto 1;
  signal cbcmac_oe: std_logic;
  signal cbcmac_rst: std_logic;
  signal cbcmac_in: key;
  signal y: state;
  signal T: std_logic_vector(Tlen-1 downto 0);
  signal S_bitvec: std_logic_vector(ceil(Plen)*128-1 downto 0);
  signal C_temp: std_logic_vector (Plen+Tlen-1 downto 0);
  signal S0, Yr: std_logic_vector (127 downto 0);
  signal sig_in: key_arr(rgen(Alen, Plen) downto 0);
begin
  sig_in<=format(Tlen, Nlen, Alen, Plen, nonce, Adata, P);
  ctr_in<=format_ctr(Nlen, Plen, nonce);

  key_formatting: for i in 0 to 15 generate
    k(15-i)<=key_in(i*8+7 downto i*8);
  end generate;

  -- key expansion core
  exp1: expandkey port map(
    clk=>clk,
    key_in=>k,
    reset=>rst_exp,
    en_exp=>enexp,
    key_exp=>keyexp
  );

  -- CTR mode: one cipher core per counter block
  ctr: for m in 0 to ceil(Plen) generate
    ctr_ciph: cipher_mod port map(
      input=>ctr_in(m),
      key_exp=>keyexp,
      clk=>clk,
      rst=>rst_ciph,
      en_exp=>enexp,
      oe=>ctr_oe(m),
      state_out=>s(m)
    );
  end generate ctr;

  gen1: for u in 0 to 3 generate
    gen1_0: for v in 0 to 3 generate
      S0(u*32+v*8+7 downto u*32+v*8)<=s(0)(3-u)(3-v);
      Yr(u*32+v*8+7 downto u*32+v*8)<=Y(3-u)(3-v);
    end generate gen1_0;
  end generate gen1;

  gen2: for m in 0 to ceil(Plen)-1 generate
    gen2_0: for u in 0 to 3 generate
      gen2_0_0: for v in 0 to 3 generate
        S_bitvec(m*128+u*32+v*8+7 downto m*128+u*32+v*8)<=s(ceil(Plen)-m)(3-u)(3-v);
      end generate gen2_0_0;
    end generate gen2_0;
  end generate gen2;

  gen3: for m in 0 to Plen-1 generate
    C_temp(Plen+Tlen-1-m)<=P(Plen-1-m) xor S_bitvec(ceil(Plen)*128-1-m);
  end generate gen3;

  g4: for m in 0 to Tlen-1 generate
    C_temp(Tlen-1-m)<=T(Tlen-1-m) xor S0(127-m);
  end generate g4;

  T(Tlen-1 downto 0)<=Yr(127 downto 127-Tlen+1);

  -- CBC-MAC cipher core
  cbcmac: cipher_mod port map(
    input=>cbcmac_in,
    key_exp=>keyexp,
    clk=>clk,
    rst=>cbcmac_rst,
    en_exp=>enexp,
    oe=>cbcmac_oe,
    state_out=>y
  );

  r3<='1' when r_count=rgen(Alen, Plen)+1 else '0';
  cbcmac_rst<=rst_ciph or (cbcmac_oe and not(r3));

  process(clk)
  begin
    if (clk='0' and clk'event) then
      if (rst_ciph='1') then
        C_out<=(others=>'1');
        oe<='0';
        r_count<=1;
        cbcmac_in<=sig_in(0);
      else
        if (cbcmac_oe='1') then
          if not(r_count=rgen(Alen, Plen)+1) then
            r_count<=r_count+1;
            for u in 0 to 3 loop
              for v in 0 to 3 loop
                cbcmac_in(u*4+v)<=sig_in(r_count)(u*4+v) xor Y(u)(v);
              end loop;
            end loop;
          else
            oe<='1';
            C_out<=C_temp;
          end if;
        end if;--cbcmac_oe
      end if;--reset
    end if;--clk
  end process;
end struct;

AES Cipher VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity cipher_mod is
  port(
    input: in key;
    key_exp: in word_arr;
    clk: in std_logic;
    rst: in std_logic;
    en_exp: in std_logic;
    oe: out std_logic;
    state_out: out state
  );
end cipher_mod;

architecture Behavioral of cipher_mod is
  component mix
    port(st_in: in state;
      st_out: out state
    );
  end component;
  component add
    port(
      st_exp: in state;
      st_in: in state;
      clk: in std_logic;
      reset: in std_logic;
      en_exp: in std_logic;
      round: in natural range 0 to 10;
      st_out: out state;
      fin_out: out state
    );
  end component;
  signal s, p, mix_out, q, init_st, fin_out: state;
  signal next_out, st_exp: state;
  signal round2: natural range 0 to 10;
begin
  gen1: for i in 0 to 3 generate
    gen2: for j in 0 to 3 generate
      init_st(i)(j)<= input(i*4+j);
      p(i)(j)<=Sbox(conv_integer(next_out(i)(j)(7 downto 4)))(conv_integer(next_out(i)(j)(3 downto 0)));
      s(i)(j)<=p((i+j) mod 4)(j);--rotation
    end generate gen2;
  end generate gen1;

  mi: mix port map(
    st_in=>s,
    st_out=>mix_out
  );

  ad: add port map(
    st_exp=>st_exp,
    st_in=>q,
    clk=> clk,
    reset=>rst,
    en_exp=>en_exp,
    round=>round2,
    st_out=>next_out,
    fin_out=> fin_out
  );

  q<=init_st when round2=0 else
     s when round2=10 else
     mix_out;

  decoder_1: for j in 0 to 3 generate
    st_exp(j)<=key_exp(round2*4+j);
  end generate;

  process( clk,rst)
  begin
    if ( clk='0' and clk'event) then
      if (rst='1') then
        oe<='0';
        state_out<=(((X"00"), (X"00"), (X"00"), (X"00")),
                    ((X"00"), (X"00"), (X"00"), (X"00")),
                    ((X"00"), (X"00"), (X"00"), (X"00")),
                    ((X"00"), (X"00"), (X"00"), (X"00")));
        round2<=0;
      else
        if not(round2=10) then
          if (en_exp='0') then
            round2<=round2+1;
          end if;
        else
          oe<='1';
          state_out<=fin_out;
        end if;
      end if;--reset
    end if;--clk
  end process;
end Behavioral;


AddRoundKey VHDL Code

-- mul_luti is the multiplication for column
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity add is
  port(
    st_exp: in state;
    st_in: in state;
    clk: in std_logic;
    reset: in std_logic;
    en_exp: in std_logic;
    round: in natural range 0 to 10;
    st_out: out state;
    fin_out: out state
  );
end add;

architecture structural of add is
  signal st_comb: state;
begin
  g0: for k in 0 to 3 generate
    st_comb(k)<=st_in(k) xor st_exp(k);
  end generate g0;

  fin_out<=st_comb;

  process(clk)
  begin
    if (clk='0' and clk'event) then
      if reset='1' then
      else
        if (not(round=10) and en_exp='0') then
          st_out<=st_comb;
        end if;
      end if;--reset
    end if;--clk
  end process;
end structural;


MixColumns VHDL Code

-- mul_luti is the multiplication for column
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity mix is
  port(
    st_in: in state;
    st_out: out state);
end mix;

architecture structural of mix is
  signal r0, r1, r2, r3: state;
begin
  gn1: for m in 0 to 3 generate
    r0(m)<=mul_row0(conv_integer(st_in(m)(0)));
    r1(m)<=mul_row1(conv_integer(st_in(m)(1)));
    r2(m)<=mul_row2(conv_integer(st_in(m)(2)));
    r3(m)<=mul_row3(conv_integer(st_in(m)(3)));
    st_out(m)<=r0(m) xor r1(m) xor r2(m) xor r3(m);
  end generate;
end structural;


AES Key Expansion VHDL Code

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
library ccm_v1_00_a;
use ccm_v1_00_a.datatypes.ALL;

entity expandkey is
  port (clk: in std_logic;
    key_in: in key;
    reset: in std_logic;
    en_exp: out std_logic;
    key_exp: out word_arr
  );
end expandkey;

architecture structural of expandkey is
  signal en: std_logic;
  signal i: integer range 4 to 43;
  signal temp_w: word_arr;
  signal xor_input1, temp: word;
  signal con1: std_logic;
begin
  temp(1)<=Sbox(conv_integer(temp_w(i-1)(2)(7 downto 4)))(conv_integer(temp_w(i-1)(2)(3 downto 0)));
  temp(2)<=Sbox(conv_integer(temp_w(i-1)(3)(7 downto 4)))(conv_integer(temp_w(i-1)(3)(3 downto 0)));
  temp(0)<=Sbox(conv_integer(temp_w(i-1)(1)(7 downto 4)))(conv_integer(temp_w(i-1)(1)(3 downto 0))) xor Rcon(i/4);
  temp(3)<=Sbox(conv_integer(temp_w(i-1)(0)(7 downto 4)))(conv_integer(temp_w(i-1)(0)(3 downto 0)));
  con1<=conv_std_logic_vector(i,6)(0) or conv_std_logic_vector(i,6)(1);
  xor_input1<=temp when (con1='0') else temp_w(i-1);
  en_exp<=en;
  key_exp<=temp_w;

  process(clk)
    variable xor_in1: word;
    variable z: std_logic_vector (1 downto 0);
  begin
    if (clk='0' and clk'event) then
      if (reset='1') then
        for r in 0 to 3 loop
          for s in 0 to 3 loop
            temp_w(r)(s)<=key_in(r*4+s);
          end loop;
        end loop;
        i<=4;
        en<='1';
      else
        if (not(i=43) and en='1') then
          i<=i+1;
        else
          en<='0';
        end if;
        if (en='1') then
          temp_w(i)<=temp_w(i-4) xor xor_input1;
        end if;
      end if;--reset
    end if;--clk
  end process;
end structural;
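A software reference model is useful when debugging the key expansion core through XMD. The Python sketch below is such a model for 128-bit keys, written from the standard AES key schedule rather than translated from the VHDL; it mirrors the same structure (words whose index is a multiple of four pass through RotWord, SubWord and the round constant, which corresponds to the con1 condition and the Rcon(i/4) term in the code). The S-box is computed instead of typed in, and the test key is the one from the AES standard's key expansion example, not from this thesis.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    """AES S-box entry: inverse in GF(2^8) (x^254), then the affine transformation."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def expand_key_128(key):
    """Return the 44 four-byte words of the expanded 128-bit key."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 44):
        temp = list(w[i - 1])
        if i % 4 == 0:
            temp = temp[1:] + temp[:1]            # RotWord
            temp = [SBOX[b] for b in temp]        # SubWord
            temp[0] ^= RCON[i // 4 - 1]           # round constant
        w.append([a ^ b for a, b in zip(w[i - 4], temp)])
    return w


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
words = expand_key_128(key)
print(bytes(words[4]).hex(), bytes(words[43]).hex())
```

Comparing these 44 words with the key_exp output of the core isolates key schedule problems before the cipher rounds are examined.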


Package Datatypes VHDL Code library IEEE; use IEEE.STD_LOGIC_1164.all; use IEEE.STD_LOGIC_ARITH.ALL; package datatypes is type word is array (0 to 3) of std_logic_vector(7 downto 0); type key is array (0 to 15) of std_logic_vector(7 downto 0); type word_arr is array (0 to 43) of word ; type box is array (0 to 15) of key; type RCbox is array (0 to 10) of std_logic_vector(7 downto 0); type round_arr is array (1 to 10) of word; type mul_table is array (0 to 255) of word; type state is array (0 to 3) of word; type st_arr is array (natural range <>) of state; type key_arr is array (natural range <>) of key; constant mul_row0: mul_table:= ( (("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000010"), ("00000001"), ("00000001"), ("00000011")), (("00000100"), ("00000010"), ("00000010"), ("00000110")), (("00000110"), ("00000011"), ("00000011"), ("00000101")), (("00001000"), ("00000100"), ("00000100"), ("00001100")), (("00001010"), ("00000101"), ("00000101"), ("00001111")), (("00001100"), ("00000110"), ("00000110"), ("00001010")), (("00001110"), ("00000111"), ("00000111"), ("00001001")), (("00010000"), ("00001000"), ("00001000"), ("00011000")), (("00010010"), ("00001001"), ("00001001"), ("00011011")), (("00010100"), ("00001010"), ("00001010"), ("00011110")), (("00010110"), ("00001011"), ("00001011"), ("00011101")), (("00011000"), ("00001100"), ("00001100"), ("00010100")), (("00011010"), ("00001101"), ("00001101"), ("00010111")), (("00011100"), ("00001110"), ("00001110"), ("00010010")), (("00011110"), ("00001111"), ("00001111"), ("00010001")), (("00100000"), ("00010000"), ("00010000"), ("00110000")), (("00100010"), ("00010001"), ("00010001"), ("00110011")), (("00100100"), ("00010010"), ("00010010"), ("00110110")), (("00100110"), ("00010011"), ("00010011"), ("00110101")), (("00101000"), ("00010100"), ("00010100"), ("00111100")), (("00101010"), ("00010101"), ("00010101"), ("00111111")), (("00101100"), ("00010110"), ("00010110"), ("00111010")), (("00101110"), 
("00010111"), ("00010111"), ("00111001")), (("00110000"), ("00011000"), ("00011000"), ("00101000")), (("00110010"), ("00011001"), ("00011001"), ("00101011")), (("00110100"), ("00011010"), ("00011010"), ("00101110")), (("00110110"), ("00011011"), ("00011011"), ("00101101")), (("00111000"), ("00011100"), ("00011100"), ("00100100")), (("00111010"), ("00011101"), ("00011101"), ("00100111")), (("00111100"), ("00011110"), ("00011110"), ("00100010")), (("00111110"), ("00011111"), ("00011111"), ("00100001")), (("01000000"), ("00100000"), ("00100000"), ("01100000")), (("01000010"), ("00100001"), ("00100001"), ("01100011")),

62 (("01000100"), ("00100010"), ("00100010"), ("01100110")), (("01000110"), ("00100011"), ("00100011"), ("01100101")), (("01001000"), ("00100100"), ("00100100"), ("01101100")), (("01001010"), ("00100101"), ("00100101"), ("01101111")), (("01001100"), ("00100110"), ("00100110"), ("01101010")), (("01001110"), ("00100111"), ("00100111"), ("01101001")), (("01010000"), ("00101000"), ("00101000"), ("01111000")), (("01010010"), ("00101001"), ("00101001"), ("01111011")), (("01010100"), ("00101010"), ("00101010"), ("01111110")), (("01010110"), ("00101011"), ("00101011"), ("01111101")), (("01011000"), ("00101100"), ("00101100"), ("01110100")), (("01011010"), ("00101101"), ("00101101"), ("01110111")), (("01011100"), ("00101110"), ("00101110"), ("01110010")), (("01011110"), ("00101111"), ("00101111"), ("01110001")), (("01100000"), ("00110000"), ("00110000"), ("01010000")), (("01100010"), ("00110001"), ("00110001"), ("01010011")), (("01100100"), ("00110010"), ("00110010"), ("01010110")), (("01100110"), ("00110011"), ("00110011"), ("01010101")), (("01101000"), ("00110100"), ("00110100"), ("01011100")), (("01101010"), ("00110101"), ("00110101"), ("01011111")), (("01101100"), ("00110110"), ("00110110"), ("01011010")), (("01101110"), ("00110111"), ("00110111"), ("01011001")), (("01110000"), ("00111000"), ("00111000"), ("01001000")), (("01110010"), ("00111001"), ("00111001"), ("01001011")), (("01110100"), ("00111010"), ("00111010"), ("01001110")), (("01110110"), ("00111011"), ("00111011"), ("01001101")), (("01111000"), ("00111100"), ("00111100"), ("01000100")), (("01111010"), ("00111101"), ("00111101"), ("01000111")), (("01111100"), ("00111110"), ("00111110"), ("01000010")), (("01111110"), ("00111111"), ("00111111"), ("01000001")), (("10000000"), ("01000000"), ("01000000"), ("11000000")), (("10000010"), ("01000001"), ("01000001"), ("11000011")), (("10000100"), ("01000010"), ("01000010"), ("11000110")), (("10000110"), ("01000011"), ("01000011"), ("11000101")), (("10001000"), 
("01000100"), ("01000100"), ("11001100")), (("10001010"), ("01000101"), ("01000101"), ("11001111")), (("10001100"), ("01000110"), ("01000110"), ("11001010")), (("10001110"), ("01000111"), ("01000111"), ("11001001")), (("10010000"), ("01001000"), ("01001000"), ("11011000")), (("10010010"), ("01001001"), ("01001001"), ("11011011")), (("10010100"), ("01001010"), ("01001010"), ("11011110")), (("10010110"), ("01001011"), ("01001011"), ("11011101")), (("10011000"), ("01001100"), ("01001100"), ("11010100")), (("10011010"), ("01001101"), ("01001101"), ("11010111")), (("10011100"), ("01001110"), ("01001110"), ("11010010")), (("10011110"), ("01001111"), ("01001111"), ("11010001")), (("10100000"), ("01010000"), ("01010000"), ("11110000")), (("10100010"), ("01010001"), ("01010001"), ("11110011")), (("10100100"), ("01010010"), ("01010010"), ("11110110")), (("10100110"), ("01010011"), ("01010011"), ("11110101")), (("10101000"), ("01010100"), ("01010100"), ("11111100")), (("10101010"), ("01010101"), ("01010101"), ("11111111")), (("10101100"), ("01010110"), ("01010110"), ("11111010")), (("10101110"), ("01010111"), ("01010111"), ("11111001")), (("10110000"), ("01011000"), ("01011000"), ("11101000")), (("10110010"), ("01011001"), ("01011001"), ("11101011")), (("10110100"), ("01011010"), ("01011010"), ("11101110")),

(("10110110"), ("01011011"), ("01011011"), ("11101101")), (("10111000"), ("01011100"), ("01011100"), ("11100100")), (("10111010"), ("01011101"), ("01011101"), ("11100111")), (("10111100"), ("01011110"), ("01011110"), ("11100010")), (("10111110"), ("01011111"), ("01011111"), ("11100001")), (("11000000"), ("01100000"), ("01100000"), ("10100000")), (("11000010"), ("01100001"), ("01100001"), ("10100011")), (("11000100"), ("01100010"), ("01100010"), ("10100110")), (("11000110"), ("01100011"), ("01100011"), ("10100101")), (("11001000"), ("01100100"), ("01100100"), ("10101100")), (("11001010"), ("01100101"), ("01100101"), ("10101111")), (("11001100"), ("01100110"), ("01100110"), ("10101010")), (("11001110"), ("01100111"), ("01100111"), ("10101001")), (("11010000"), ("01101000"), ("01101000"), ("10111000")), (("11010010"), ("01101001"), ("01101001"), ("10111011")), (("11010100"), ("01101010"), ("01101010"), ("10111110")), (("11010110"), ("01101011"), ("01101011"), ("10111101")), (("11011000"), ("01101100"), ("01101100"), ("10110100")), (("11011010"), ("01101101"), ("01101101"), ("10110111")), (("11011100"), ("01101110"), ("01101110"), ("10110010")), (("11011110"), ("01101111"), ("01101111"), ("10110001")), (("11100000"), ("01110000"), ("01110000"), ("10010000")), (("11100010"), ("01110001"), ("01110001"), ("10010011")), (("11100100"), ("01110010"), ("01110010"), ("10010110")), (("11100110"), ("01110011"), ("01110011"), ("10010101")), (("11101000"), ("01110100"), ("01110100"), ("10011100")), (("11101010"), ("01110101"), ("01110101"), ("10011111")), (("11101100"), ("01110110"), ("01110110"), ("10011010")), (("11101110"), ("01110111"), ("01110111"), ("10011001")), (("11110000"), ("01111000"), ("01111000"), ("10001000")), (("11110010"), ("01111001"), ("01111001"), ("10001011")), (("11110100"), ("01111010"), ("01111010"), ("10001110")), (("11110110"), ("01111011"), ("01111011"), ("10001101")), (("11111000"), ("01111100"), ("01111100"), ("10000100")), (("11111010"),
("01111101"), ("01111101"), ("10000111")), (("11111100"), ("01111110"), ("01111110"), ("10000010")), (("11111110"), ("01111111"), ("01111111"), ("10000001")), (("00011011"), ("10000000"), ("10000000"), ("10011011")), (("00011001"), ("10000001"), ("10000001"), ("10011000")), (("00011111"), ("10000010"), ("10000010"), ("10011101")), (("00011101"), ("10000011"), ("10000011"), ("10011110")), (("00010011"), ("10000100"), ("10000100"), ("10010111")), (("00010001"), ("10000101"), ("10000101"), ("10010100")), (("00010111"), ("10000110"), ("10000110"), ("10010001")), (("00010101"), ("10000111"), ("10000111"), ("10010010")), (("00001011"), ("10001000"), ("10001000"), ("10000011")), (("00001001"), ("10001001"), ("10001001"), ("10000000")), (("00001111"), ("10001010"), ("10001010"), ("10000101")), (("00001101"), ("10001011"), ("10001011"), ("10000110")), (("00000011"), ("10001100"), ("10001100"), ("10001111")), (("00000001"), ("10001101"), ("10001101"), ("10001100")), (("00000111"), ("10001110"), ("10001110"), ("10001001")), (("00000101"), ("10001111"), ("10001111"), ("10001010")), (("00111011"), ("10010000"), ("10010000"), ("10101011")), (("00111001"), ("10010001"), ("10010001"), ("10101000")), (("00111111"), ("10010010"), ("10010010"), ("10101101")), (("00111101"), ("10010011"), ("10010011"), ("10101110")),

(("00110011"), ("10010100"), ("10010100"), ("10100111")), (("00110001"), ("10010101"), ("10010101"), ("10100100")), (("00110111"), ("10010110"), ("10010110"), ("10100001")), (("00110101"), ("10010111"), ("10010111"), ("10100010")), (("00101011"), ("10011000"), ("10011000"), ("10110011")), (("00101001"), ("10011001"), ("10011001"), ("10110000")), (("00101111"), ("10011010"), ("10011010"), ("10110101")), (("00101101"), ("10011011"), ("10011011"), ("10110110")), (("00100011"), ("10011100"), ("10011100"), ("10111111")), (("00100001"), ("10011101"), ("10011101"), ("10111100")), (("00100111"), ("10011110"), ("10011110"), ("10111001")), (("00100101"), ("10011111"), ("10011111"), ("10111010")), (("01011011"), ("10100000"), ("10100000"), ("11111011")), (("01011001"), ("10100001"), ("10100001"), ("11111000")), (("01011111"), ("10100010"), ("10100010"), ("11111101")), (("01011101"), ("10100011"), ("10100011"), ("11111110")), (("01010011"), ("10100100"), ("10100100"), ("11110111")), (("01010001"), ("10100101"), ("10100101"), ("11110100")), (("01010111"), ("10100110"), ("10100110"), ("11110001")), (("01010101"), ("10100111"), ("10100111"), ("11110010")), (("01001011"), ("10101000"), ("10101000"), ("11100011")), (("01001001"), ("10101001"), ("10101001"), ("11100000")), (("01001111"), ("10101010"), ("10101010"), ("11100101")), (("01001101"), ("10101011"), ("10101011"), ("11100110")), (("01000011"), ("10101100"), ("10101100"), ("11101111")), (("01000001"), ("10101101"), ("10101101"), ("11101100")), (("01000111"), ("10101110"), ("10101110"), ("11101001")), (("01000101"), ("10101111"), ("10101111"), ("11101010")), (("01111011"), ("10110000"), ("10110000"), ("11001011")), (("01111001"), ("10110001"), ("10110001"), ("11001000")), (("01111111"), ("10110010"), ("10110010"), ("11001101")), (("01111101"), ("10110011"), ("10110011"), ("11001110")), (("01110011"), ("10110100"), ("10110100"), ("11000111")), (("01110001"), ("10110101"), ("10110101"), ("11000100")), (("01110111"),
("10110110"), ("10110110"), ("11000001")), (("01110101"), ("10110111"), ("10110111"), ("11000010")), (("01101011"), ("10111000"), ("10111000"), ("11010011")), (("01101001"), ("10111001"), ("10111001"), ("11010000")), (("01101111"), ("10111010"), ("10111010"), ("11010101")), (("01101101"), ("10111011"), ("10111011"), ("11010110")), (("01100011"), ("10111100"), ("10111100"), ("11011111")), (("01100001"), ("10111101"), ("10111101"), ("11011100")), (("01100111"), ("10111110"), ("10111110"), ("11011001")), (("01100101"), ("10111111"), ("10111111"), ("11011010")), (("10011011"), ("11000000"), ("11000000"), ("01011011")), (("10011001"), ("11000001"), ("11000001"), ("01011000")), (("10011111"), ("11000010"), ("11000010"), ("01011101")), (("10011101"), ("11000011"), ("11000011"), ("01011110")), (("10010011"), ("11000100"), ("11000100"), ("01010111")), (("10010001"), ("11000101"), ("11000101"), ("01010100")), (("10010111"), ("11000110"), ("11000110"), ("01010001")), (("10010101"), ("11000111"), ("11000111"), ("01010010")), (("10001011"), ("11001000"), ("11001000"), ("01000011")), (("10001001"), ("11001001"), ("11001001"), ("01000000")), (("10001111"), ("11001010"), ("11001010"), ("01000101")), (("10001101"), ("11001011"), ("11001011"), ("01000110")), (("10000011"), ("11001100"), ("11001100"), ("01001111")),

(("10000001"), ("11001101"), ("11001101"), ("01001100")), (("10000111"), ("11001110"), ("11001110"), ("01001001")), (("10000101"), ("11001111"), ("11001111"), ("01001010")), (("10111011"), ("11010000"), ("11010000"), ("01101011")), (("10111001"), ("11010001"), ("11010001"), ("01101000")), (("10111111"), ("11010010"), ("11010010"), ("01101101")), (("10111101"), ("11010011"), ("11010011"), ("01101110")), (("10110011"), ("11010100"), ("11010100"), ("01100111")), (("10110001"), ("11010101"), ("11010101"), ("01100100")), (("10110111"), ("11010110"), ("11010110"), ("01100001")), (("10110101"), ("11010111"), ("11010111"), ("01100010")), (("10101011"), ("11011000"), ("11011000"), ("01110011")), (("10101001"), ("11011001"), ("11011001"), ("01110000")), (("10101111"), ("11011010"), ("11011010"), ("01110101")), (("10101101"), ("11011011"), ("11011011"), ("01110110")), (("10100011"), ("11011100"), ("11011100"), ("01111111")), (("10100001"), ("11011101"), ("11011101"), ("01111100")), (("10100111"), ("11011110"), ("11011110"), ("01111001")), (("10100101"), ("11011111"), ("11011111"), ("01111010")), (("11011011"), ("11100000"), ("11100000"), ("00111011")), (("11011001"), ("11100001"), ("11100001"), ("00111000")), (("11011111"), ("11100010"), ("11100010"), ("00111101")), (("11011101"), ("11100011"), ("11100011"), ("00111110")), (("11010011"), ("11100100"), ("11100100"), ("00110111")), (("11010001"), ("11100101"), ("11100101"), ("00110100")), (("11010111"), ("11100110"), ("11100110"), ("00110001")), (("11010101"), ("11100111"), ("11100111"), ("00110010")), (("11001011"), ("11101000"), ("11101000"), ("00100011")), (("11001001"), ("11101001"), ("11101001"), ("00100000")), (("11001111"), ("11101010"), ("11101010"), ("00100101")), (("11001101"), ("11101011"), ("11101011"), ("00100110")), (("11000011"), ("11101100"), ("11101100"), ("00101111")), (("11000001"), ("11101101"), ("11101101"), ("00101100")), (("11000111"), ("11101110"), ("11101110"), ("00101001")), (("11000101"),
("11101111"), ("11101111"), ("00101010")), (("11111011"), ("11110000"), ("11110000"), ("00001011")), (("11111001"), ("11110001"), ("11110001"), ("00001000")), (("11111111"), ("11110010"), ("11110010"), ("00001101")), (("11111101"), ("11110011"), ("11110011"), ("00001110")), (("11110011"), ("11110100"), ("11110100"), ("00000111")), (("11110001"), ("11110101"), ("11110101"), ("00000100")), (("11110111"), ("11110110"), ("11110110"), ("00000001")), (("11110101"), ("11110111"), ("11110111"), ("00000010")), (("11101011"), ("11111000"), ("11111000"), ("00010011")), (("11101001"), ("11111001"), ("11111001"), ("00010000")), (("11101111"), ("11111010"), ("11111010"), ("00010101")), (("11101101"), ("11111011"), ("11111011"), ("00010110")), (("11100011"), ("11111100"), ("11111100"), ("00011111")), (("11100001"), ("11111101"), ("11111101"), ("00011100")), (("11100111"), ("11111110"), ("11111110"), ("00011001")), (("11100101"), ("11111111"), ("11111111"), ("00011010"))
);

-- mul_row1(b) = ({03}.b, {02}.b, b, b) for b = 0 to 255, with products taken in the AES field GF(2^8).
constant mul_row1: mul_table:= (
(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000011"), ("00000010"), ("00000001"), ("00000001")), (("00000110"), ("00000100"), ("00000010"), ("00000010")),

(("00000101"), ("00000110"), ("00000011"), ("00000011")), (("00001100"), ("00001000"), ("00000100"), ("00000100")), (("00001111"), ("00001010"), ("00000101"), ("00000101")), (("00001010"), ("00001100"), ("00000110"), ("00000110")), (("00001001"), ("00001110"), ("00000111"), ("00000111")), (("00011000"), ("00010000"), ("00001000"), ("00001000")), (("00011011"), ("00010010"), ("00001001"), ("00001001")), (("00011110"), ("00010100"), ("00001010"), ("00001010")), (("00011101"), ("00010110"), ("00001011"), ("00001011")), (("00010100"), ("00011000"), ("00001100"), ("00001100")), (("00010111"), ("00011010"), ("00001101"), ("00001101")), (("00010010"), ("00011100"), ("00001110"), ("00001110")), (("00010001"), ("00011110"), ("00001111"), ("00001111")), (("00110000"), ("00100000"), ("00010000"), ("00010000")), (("00110011"), ("00100010"), ("00010001"), ("00010001")), (("00110110"), ("00100100"), ("00010010"), ("00010010")), (("00110101"), ("00100110"), ("00010011"), ("00010011")), (("00111100"), ("00101000"), ("00010100"), ("00010100")), (("00111111"), ("00101010"), ("00010101"), ("00010101")), (("00111010"), ("00101100"), ("00010110"), ("00010110")), (("00111001"), ("00101110"), ("00010111"), ("00010111")), (("00101000"), ("00110000"), ("00011000"), ("00011000")), (("00101011"), ("00110010"), ("00011001"), ("00011001")), (("00101110"), ("00110100"), ("00011010"), ("00011010")), (("00101101"), ("00110110"), ("00011011"), ("00011011")), (("00100100"), ("00111000"), ("00011100"), ("00011100")), (("00100111"), ("00111010"), ("00011101"), ("00011101")), (("00100010"), ("00111100"), ("00011110"), ("00011110")), (("00100001"), ("00111110"), ("00011111"), ("00011111")), (("01100000"), ("01000000"), ("00100000"), ("00100000")), (("01100011"), ("01000010"), ("00100001"), ("00100001")), (("01100110"), ("01000100"), ("00100010"), ("00100010")), (("01100101"), ("01000110"), ("00100011"), ("00100011")), (("01101100"), ("01001000"), ("00100100"), ("00100100")), (("01101111"),
("01001010"), ("00100101"), ("00100101")), (("01101010"), ("01001100"), ("00100110"), ("00100110")), (("01101001"), ("01001110"), ("00100111"), ("00100111")), (("01111000"), ("01010000"), ("00101000"), ("00101000")), (("01111011"), ("01010010"), ("00101001"), ("00101001")), (("01111110"), ("01010100"), ("00101010"), ("00101010")), (("01111101"), ("01010110"), ("00101011"), ("00101011")), (("01110100"), ("01011000"), ("00101100"), ("00101100")), (("01110111"), ("01011010"), ("00101101"), ("00101101")), (("01110010"), ("01011100"), ("00101110"), ("00101110")), (("01110001"), ("01011110"), ("00101111"), ("00101111")), (("01010000"), ("01100000"), ("00110000"), ("00110000")), (("01010011"), ("01100010"), ("00110001"), ("00110001")), (("01010110"), ("01100100"), ("00110010"), ("00110010")), (("01010101"), ("01100110"), ("00110011"), ("00110011")), (("01011100"), ("01101000"), ("00110100"), ("00110100")), (("01011111"), ("01101010"), ("00110101"), ("00110101")), (("01011010"), ("01101100"), ("00110110"), ("00110110")), (("01011001"), ("01101110"), ("00110111"), ("00110111")), (("01001000"), ("01110000"), ("00111000"), ("00111000")), (("01001011"), ("01110010"), ("00111001"), ("00111001")), (("01001110"), ("01110100"), ("00111010"), ("00111010")), (("01001101"), ("01110110"), ("00111011"), ("00111011")),

(("01000100"), ("01111000"), ("00111100"), ("00111100")), (("01000111"), ("01111010"), ("00111101"), ("00111101")), (("01000010"), ("01111100"), ("00111110"), ("00111110")), (("01000001"), ("01111110"), ("00111111"), ("00111111")), (("11000000"), ("10000000"), ("01000000"), ("01000000")), (("11000011"), ("10000010"), ("01000001"), ("01000001")), (("11000110"), ("10000100"), ("01000010"), ("01000010")), (("11000101"), ("10000110"), ("01000011"), ("01000011")), (("11001100"), ("10001000"), ("01000100"), ("01000100")), (("11001111"), ("10001010"), ("01000101"), ("01000101")), (("11001010"), ("10001100"), ("01000110"), ("01000110")), (("11001001"), ("10001110"), ("01000111"), ("01000111")), (("11011000"), ("10010000"), ("01001000"), ("01001000")), (("11011011"), ("10010010"), ("01001001"), ("01001001")), (("11011110"), ("10010100"), ("01001010"), ("01001010")), (("11011101"), ("10010110"), ("01001011"), ("01001011")), (("11010100"), ("10011000"), ("01001100"), ("01001100")), (("11010111"), ("10011010"), ("01001101"), ("01001101")), (("11010010"), ("10011100"), ("01001110"), ("01001110")), (("11010001"), ("10011110"), ("01001111"), ("01001111")), (("11110000"), ("10100000"), ("01010000"), ("01010000")), (("11110011"), ("10100010"), ("01010001"), ("01010001")), (("11110110"), ("10100100"), ("01010010"), ("01010010")), (("11110101"), ("10100110"), ("01010011"), ("01010011")), (("11111100"), ("10101000"), ("01010100"), ("01010100")), (("11111111"), ("10101010"), ("01010101"), ("01010101")), (("11111010"), ("10101100"), ("01010110"), ("01010110")), (("11111001"), ("10101110"), ("01010111"), ("01010111")), (("11101000"), ("10110000"), ("01011000"), ("01011000")), (("11101011"), ("10110010"), ("01011001"), ("01011001")), (("11101110"), ("10110100"), ("01011010"), ("01011010")), (("11101101"), ("10110110"), ("01011011"), ("01011011")), (("11100100"), ("10111000"), ("01011100"), ("01011100")), (("11100111"), ("10111010"), ("01011101"), ("01011101")), (("11100010"),
("10111100"), ("01011110"), ("01011110")), (("11100001"), ("10111110"), ("01011111"), ("01011111")), (("10100000"), ("11000000"), ("01100000"), ("01100000")), (("10100011"), ("11000010"), ("01100001"), ("01100001")), (("10100110"), ("11000100"), ("01100010"), ("01100010")), (("10100101"), ("11000110"), ("01100011"), ("01100011")), (("10101100"), ("11001000"), ("01100100"), ("01100100")), (("10101111"), ("11001010"), ("01100101"), ("01100101")), (("10101010"), ("11001100"), ("01100110"), ("01100110")), (("10101001"), ("11001110"), ("01100111"), ("01100111")), (("10111000"), ("11010000"), ("01101000"), ("01101000")), (("10111011"), ("11010010"), ("01101001"), ("01101001")), (("10111110"), ("11010100"), ("01101010"), ("01101010")), (("10111101"), ("11010110"), ("01101011"), ("01101011")), (("10110100"), ("11011000"), ("01101100"), ("01101100")), (("10110111"), ("11011010"), ("01101101"), ("01101101")), (("10110010"), ("11011100"), ("01101110"), ("01101110")), (("10110001"), ("11011110"), ("01101111"), ("01101111")), (("10010000"), ("11100000"), ("01110000"), ("01110000")), (("10010011"), ("11100010"), ("01110001"), ("01110001")), (("10010110"), ("11100100"), ("01110010"), ("01110010")), (("10010101"), ("11100110"), ("01110011"), ("01110011")), (("10011100"), ("11101000"), ("01110100"), ("01110100")),

(("10011111"), ("11101010"), ("01110101"), ("01110101")), (("10011010"), ("11101100"), ("01110110"), ("01110110")), (("10011001"), ("11101110"), ("01110111"), ("01110111")), (("10001000"), ("11110000"), ("01111000"), ("01111000")), (("10001011"), ("11110010"), ("01111001"), ("01111001")), (("10001110"), ("11110100"), ("01111010"), ("01111010")), (("10001101"), ("11110110"), ("01111011"), ("01111011")), (("10000100"), ("11111000"), ("01111100"), ("01111100")), (("10000111"), ("11111010"), ("01111101"), ("01111101")), (("10000010"), ("11111100"), ("01111110"), ("01111110")), (("10000001"), ("11111110"), ("01111111"), ("01111111")), (("10011011"), ("00011011"), ("10000000"), ("10000000")), (("10011000"), ("00011001"), ("10000001"), ("10000001")), (("10011101"), ("00011111"), ("10000010"), ("10000010")), (("10011110"), ("00011101"), ("10000011"), ("10000011")), (("10010111"), ("00010011"), ("10000100"), ("10000100")), (("10010100"), ("00010001"), ("10000101"), ("10000101")), (("10010001"), ("00010111"), ("10000110"), ("10000110")), (("10010010"), ("00010101"), ("10000111"), ("10000111")), (("10000011"), ("00001011"), ("10001000"), ("10001000")), (("10000000"), ("00001001"), ("10001001"), ("10001001")), (("10000101"), ("00001111"), ("10001010"), ("10001010")), (("10000110"), ("00001101"), ("10001011"), ("10001011")), (("10001111"), ("00000011"), ("10001100"), ("10001100")), (("10001100"), ("00000001"), ("10001101"), ("10001101")), (("10001001"), ("00000111"), ("10001110"), ("10001110")), (("10001010"), ("00000101"), ("10001111"), ("10001111")), (("10101011"), ("00111011"), ("10010000"), ("10010000")), (("10101000"), ("00111001"), ("10010001"), ("10010001")), (("10101101"), ("00111111"), ("10010010"), ("10010010")), (("10101110"), ("00111101"), ("10010011"), ("10010011")), (("10100111"), ("00110011"), ("10010100"), ("10010100")), (("10100100"), ("00110001"), ("10010101"), ("10010101")), (("10100001"), ("00110111"), ("10010110"), ("10010110")), (("10100010"),
("00110101"), ("10010111"), ("10010111")), (("10110011"), ("00101011"), ("10011000"), ("10011000")), (("10110000"), ("00101001"), ("10011001"), ("10011001")), (("10110101"), ("00101111"), ("10011010"), ("10011010")), (("10110110"), ("00101101"), ("10011011"), ("10011011")), (("10111111"), ("00100011"), ("10011100"), ("10011100")), (("10111100"), ("00100001"), ("10011101"), ("10011101")), (("10111001"), ("00100111"), ("10011110"), ("10011110")), (("10111010"), ("00100101"), ("10011111"), ("10011111")), (("11111011"), ("01011011"), ("10100000"), ("10100000")), (("11111000"), ("01011001"), ("10100001"), ("10100001")), (("11111101"), ("01011111"), ("10100010"), ("10100010")), (("11111110"), ("01011101"), ("10100011"), ("10100011")), (("11110111"), ("01010011"), ("10100100"), ("10100100")), (("11110100"), ("01010001"), ("10100101"), ("10100101")), (("11110001"), ("01010111"), ("10100110"), ("10100110")), (("11110010"), ("01010101"), ("10100111"), ("10100111")), (("11100011"), ("01001011"), ("10101000"), ("10101000")), (("11100000"), ("01001001"), ("10101001"), ("10101001")), (("11100101"), ("01001111"), ("10101010"), ("10101010")), (("11100110"), ("01001101"), ("10101011"), ("10101011")), (("11101111"), ("01000011"), ("10101100"), ("10101100")), (("11101100"), ("01000001"), ("10101101"), ("10101101")),

(("11101001"), ("01000111"), ("10101110"), ("10101110")), (("11101010"), ("01000101"), ("10101111"), ("10101111")), (("11001011"), ("01111011"), ("10110000"), ("10110000")), (("11001000"), ("01111001"), ("10110001"), ("10110001")), (("11001101"), ("01111111"), ("10110010"), ("10110010")), (("11001110"), ("01111101"), ("10110011"), ("10110011")), (("11000111"), ("01110011"), ("10110100"), ("10110100")), (("11000100"), ("01110001"), ("10110101"), ("10110101")), (("11000001"), ("01110111"), ("10110110"), ("10110110")), (("11000010"), ("01110101"), ("10110111"), ("10110111")), (("11010011"), ("01101011"), ("10111000"), ("10111000")), (("11010000"), ("01101001"), ("10111001"), ("10111001")), (("11010101"), ("01101111"), ("10111010"), ("10111010")), (("11010110"), ("01101101"), ("10111011"), ("10111011")), (("11011111"), ("01100011"), ("10111100"), ("10111100")), (("11011100"), ("01100001"), ("10111101"), ("10111101")), (("11011001"), ("01100111"), ("10111110"), ("10111110")), (("11011010"), ("01100101"), ("10111111"), ("10111111")), (("01011011"), ("10011011"), ("11000000"), ("11000000")), (("01011000"), ("10011001"), ("11000001"), ("11000001")), (("01011101"), ("10011111"), ("11000010"), ("11000010")), (("01011110"), ("10011101"), ("11000011"), ("11000011")), (("01010111"), ("10010011"), ("11000100"), ("11000100")), (("01010100"), ("10010001"), ("11000101"), ("11000101")), (("01010001"), ("10010111"), ("11000110"), ("11000110")), (("01010010"), ("10010101"), ("11000111"), ("11000111")), (("01000011"), ("10001011"), ("11001000"), ("11001000")), (("01000000"), ("10001001"), ("11001001"), ("11001001")), (("01000101"), ("10001111"), ("11001010"), ("11001010")), (("01000110"), ("10001101"), ("11001011"), ("11001011")), (("01001111"), ("10000011"), ("11001100"), ("11001100")), (("01001100"), ("10000001"), ("11001101"), ("11001101")), (("01001001"), ("10000111"), ("11001110"), ("11001110")), (("01001010"), ("10000101"), ("11001111"), ("11001111")), (("01101011"),
("10111011"), ("11010000"), ("11010000")), (("01101000"), ("10111001"), ("11010001"), ("11010001")), (("01101101"), ("10111111"), ("11010010"), ("11010010")), (("01101110"), ("10111101"), ("11010011"), ("11010011")), (("01100111"), ("10110011"), ("11010100"), ("11010100")), (("01100100"), ("10110001"), ("11010101"), ("11010101")), (("01100001"), ("10110111"), ("11010110"), ("11010110")), (("01100010"), ("10110101"), ("11010111"), ("11010111")), (("01110011"), ("10101011"), ("11011000"), ("11011000")), (("01110000"), ("10101001"), ("11011001"), ("11011001")), (("01110101"), ("10101111"), ("11011010"), ("11011010")), (("01110110"), ("10101101"), ("11011011"), ("11011011")), (("01111111"), ("10100011"), ("11011100"), ("11011100")), (("01111100"), ("10100001"), ("11011101"), ("11011101")), (("01111001"), ("10100111"), ("11011110"), ("11011110")), (("01111010"), ("10100101"), ("11011111"), ("11011111")), (("00111011"), ("11011011"), ("11100000"), ("11100000")), (("00111000"), ("11011001"), ("11100001"), ("11100001")), (("00111101"), ("11011111"), ("11100010"), ("11100010")), (("00111110"), ("11011101"), ("11100011"), ("11100011")), (("00110111"), ("11010011"), ("11100100"), ("11100100")), (("00110100"), ("11010001"), ("11100101"), ("11100101")), (("00110001"), ("11010111"), ("11100110"), ("11100110")),

(("00110010"), ("11010101"), ("11100111"), ("11100111")), (("00100011"), ("11001011"), ("11101000"), ("11101000")), (("00100000"), ("11001001"), ("11101001"), ("11101001")), (("00100101"), ("11001111"), ("11101010"), ("11101010")), (("00100110"), ("11001101"), ("11101011"), ("11101011")), (("00101111"), ("11000011"), ("11101100"), ("11101100")), (("00101100"), ("11000001"), ("11101101"), ("11101101")), (("00101001"), ("11000111"), ("11101110"), ("11101110")), (("00101010"), ("11000101"), ("11101111"), ("11101111")), (("00001011"), ("11111011"), ("11110000"), ("11110000")), (("00001000"), ("11111001"), ("11110001"), ("11110001")), (("00001101"), ("11111111"), ("11110010"), ("11110010")), (("00001110"), ("11111101"), ("11110011"), ("11110011")), (("00000111"), ("11110011"), ("11110100"), ("11110100")), (("00000100"), ("11110001"), ("11110101"), ("11110101")), (("00000001"), ("11110111"), ("11110110"), ("11110110")), (("00000010"), ("11110101"), ("11110111"), ("11110111")), (("00010011"), ("11101011"), ("11111000"), ("11111000")), (("00010000"), ("11101001"), ("11111001"), ("11111001")), (("00010101"), ("11101111"), ("11111010"), ("11111010")), (("00010110"), ("11101101"), ("11111011"), ("11111011")), (("00011111"), ("11100011"), ("11111100"), ("11111100")), (("00011100"), ("11100001"), ("11111101"), ("11111101")), (("00011001"), ("11100111"), ("11111110"), ("11111110")), (("00011010"), ("11100101"), ("11111111"), ("11111111"))
);

-- mul_row2(b) = (b, {03}.b, {02}.b, b) for b = 0 to 255, with products taken in the AES field GF(2^8).
constant mul_row2: mul_table:= (
(("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000001"), ("00000011"), ("00000010"), ("00000001")), (("00000010"), ("00000110"), ("00000100"), ("00000010")), (("00000011"), ("00000101"), ("00000110"), ("00000011")), (("00000100"), ("00001100"), ("00001000"), ("00000100")), (("00000101"), ("00001111"), ("00001010"), ("00000101")), (("00000110"), ("00001010"), ("00001100"), ("00000110")), (("00000111"), ("00001001"), ("00001110"), ("00000111")), (("00001000"), ("00011000"), ("00010000"),
("00001000")), (("00001001"), ("00011011"), ("00010010"), ("00001001")), (("00001010"), ("00011110"), ("00010100"), ("00001010")), (("00001011"), ("00011101"), ("00010110"), ("00001011")), (("00001100"), ("00010100"), ("00011000"), ("00001100")), (("00001101"), ("00010111"), ("00011010"), ("00001101")), (("00001110"), ("00010010"), ("00011100"), ("00001110")), (("00001111"), ("00010001"), ("00011110"), ("00001111")), (("00010000"), ("00110000"), ("00100000"), ("00010000")), (("00010001"), ("00110011"), ("00100010"), ("00010001")), (("00010010"), ("00110110"), ("00100100"), ("00010010")), (("00010011"), ("00110101"), ("00100110"), ("00010011")), (("00010100"), ("00111100"), ("00101000"), ("00010100")), (("00010101"), ("00111111"), ("00101010"), ("00010101")), (("00010110"), ("00111010"), ("00101100"), ("00010110")), (("00010111"), ("00111001"), ("00101110"), ("00010111")), (("00011000"), ("00101000"), ("00110000"), ("00011000")), (("00011001"), ("00101011"), ("00110010"), ("00011001")), (("00011010"), ("00101110"), ("00110100"), ("00011010")), (("00011011"), ("00101101"), ("00110110"), ("00011011")), (("00011100"), ("00100100"), ("00111000"), ("00011100")), (("00011101"), ("00100111"), ("00111010"), ("00011101")),

(("00011110"), ("00100010"), ("00111100"), ("00011110")), (("00011111"), ("00100001"), ("00111110"), ("00011111")), (("00100000"), ("01100000"), ("01000000"), ("00100000")), (("00100001"), ("01100011"), ("01000010"), ("00100001")), (("00100010"), ("01100110"), ("01000100"), ("00100010")), (("00100011"), ("01100101"), ("01000110"), ("00100011")), (("00100100"), ("01101100"), ("01001000"), ("00100100")), (("00100101"), ("01101111"), ("01001010"), ("00100101")), (("00100110"), ("01101010"), ("01001100"), ("00100110")), (("00100111"), ("01101001"), ("01001110"), ("00100111")), (("00101000"), ("01111000"), ("01010000"), ("00101000")), (("00101001"), ("01111011"), ("01010010"), ("00101001")), (("00101010"), ("01111110"), ("01010100"), ("00101010")), (("00101011"), ("01111101"), ("01010110"), ("00101011")), (("00101100"), ("01110100"), ("01011000"), ("00101100")), (("00101101"), ("01110111"), ("01011010"), ("00101101")), (("00101110"), ("01110010"), ("01011100"), ("00101110")), (("00101111"), ("01110001"), ("01011110"), ("00101111")), (("00110000"), ("01010000"), ("01100000"), ("00110000")), (("00110001"), ("01010011"), ("01100010"), ("00110001")), (("00110010"), ("01010110"), ("01100100"), ("00110010")), (("00110011"), ("01010101"), ("01100110"), ("00110011")), (("00110100"), ("01011100"), ("01101000"), ("00110100")), (("00110101"), ("01011111"), ("01101010"), ("00110101")), (("00110110"), ("01011010"), ("01101100"), ("00110110")), (("00110111"), ("01011001"), ("01101110"), ("00110111")), (("00111000"), ("01001000"), ("01110000"), ("00111000")), (("00111001"), ("01001011"), ("01110010"), ("00111001")), (("00111010"), ("01001110"), ("01110100"), ("00111010")), (("00111011"), ("01001101"), ("01110110"), ("00111011")), (("00111100"), ("01000100"), ("01111000"), ("00111100")), (("00111101"), ("01000111"), ("01111010"), ("00111101")), (("00111110"), ("01000010"), ("01111100"), ("00111110")), (("00111111"), ("01000001"), ("01111110"), ("00111111")), (("01000000"),
("11000000"), ("10000000"), ("01000000")), (("01000001"), ("11000011"), ("10000010"), ("01000001")), (("01000010"), ("11000110"), ("10000100"), ("01000010")), (("01000011"), ("11000101"), ("10000110"), ("01000011")), (("01000100"), ("11001100"), ("10001000"), ("01000100")), (("01000101"), ("11001111"), ("10001010"), ("01000101")), (("01000110"), ("11001010"), ("10001100"), ("01000110")), (("01000111"), ("11001001"), ("10001110"), ("01000111")), (("01001000"), ("11011000"), ("10010000"), ("01001000")), (("01001001"), ("11011011"), ("10010010"), ("01001001")), (("01001010"), ("11011110"), ("10010100"), ("01001010")), (("01001011"), ("11011101"), ("10010110"), ("01001011")), (("01001100"), ("11010100"), ("10011000"), ("01001100")), (("01001101"), ("11010111"), ("10011010"), ("01001101")), (("01001110"), ("11010010"), ("10011100"), ("01001110")), (("01001111"), ("11010001"), ("10011110"), ("01001111")), (("01010000"), ("11110000"), ("10100000"), ("01010000")), (("01010001"), ("11110011"), ("10100010"), ("01010001")), (("01010010"), ("11110110"), ("10100100"), ("01010010")), (("01010011"), ("11110101"), ("10100110"), ("01010011")), (("01010100"), ("11111100"), ("10101000"), ("01010100")), (("01010101"), ("11111111"), ("10101010"), ("01010101")), (("01010110"), ("11111010"), ("10101100"), ("01010110")),

(("01010111"), ("11111001"), ("10101110"), ("01010111")), (("01011000"), ("11101000"), ("10110000"), ("01011000")), (("01011001"), ("11101011"), ("10110010"), ("01011001")), (("01011010"), ("11101110"), ("10110100"), ("01011010")), (("01011011"), ("11101101"), ("10110110"), ("01011011")), (("01011100"), ("11100100"), ("10111000"), ("01011100")), (("01011101"), ("11100111"), ("10111010"), ("01011101")), (("01011110"), ("11100010"), ("10111100"), ("01011110")), (("01011111"), ("11100001"), ("10111110"), ("01011111")), (("01100000"), ("10100000"), ("11000000"), ("01100000")), (("01100001"), ("10100011"), ("11000010"), ("01100001")), (("01100010"), ("10100110"), ("11000100"), ("01100010")), (("01100011"), ("10100101"), ("11000110"), ("01100011")), (("01100100"), ("10101100"), ("11001000"), ("01100100")), (("01100101"), ("10101111"), ("11001010"), ("01100101")), (("01100110"), ("10101010"), ("11001100"), ("01100110")), (("01100111"), ("10101001"), ("11001110"), ("01100111")), (("01101000"), ("10111000"), ("11010000"), ("01101000")), (("01101001"), ("10111011"), ("11010010"), ("01101001")), (("01101010"), ("10111110"), ("11010100"), ("01101010")), (("01101011"), ("10111101"), ("11010110"), ("01101011")), (("01101100"), ("10110100"), ("11011000"), ("01101100")), (("01101101"), ("10110111"), ("11011010"), ("01101101")), (("01101110"), ("10110010"), ("11011100"), ("01101110")), (("01101111"), ("10110001"), ("11011110"), ("01101111")), (("01110000"), ("10010000"), ("11100000"), ("01110000")), (("01110001"), ("10010011"), ("11100010"), ("01110001")), (("01110010"), ("10010110"), ("11100100"), ("01110010")), (("01110011"), ("10010101"), ("11100110"), ("01110011")), (("01110100"), ("10011100"), ("11101000"), ("01110100")), (("01110101"), ("10011111"), ("11101010"), ("01110101")), (("01110110"), ("10011010"), ("11101100"), ("01110110")), (("01110111"), ("10011001"), ("11101110"), ("01110111")), (("01111000"), ("10001000"), ("11110000"), ("01111000")), (("01111001"),
("10001011"), ("11110010"), ("01111001")), (("01111010"), ("10001110"), ("11110100"), ("01111010")), (("01111011"), ("10001101"), ("11110110"), ("01111011")), (("01111100"), ("10000100"), ("11111000"), ("01111100")), (("01111101"), ("10000111"), ("11111010"), ("01111101")), (("01111110"), ("10000010"), ("11111100"), ("01111110")), (("01111111"), ("10000001"), ("11111110"), ("01111111")), (("10000000"), ("10011011"), ("00011011"), ("10000000")), (("10000001"), ("10011000"), ("00011001"), ("10000001")), (("10000010"), ("10011101"), ("00011111"), ("10000010")), (("10000011"), ("10011110"), ("00011101"), ("10000011")), (("10000100"), ("10010111"), ("00010011"), ("10000100")), (("10000101"), ("10010100"), ("00010001"), ("10000101")), (("10000110"), ("10010001"), ("00010111"), ("10000110")), (("10000111"), ("10010010"), ("00010101"), ("10000111")), (("10001000"), ("10000011"), ("00001011"), ("10001000")), (("10001001"), ("10000000"), ("00001001"), ("10001001")), (("10001010"), ("10000101"), ("00001111"), ("10001010")), (("10001011"), ("10000110"), ("00001101"), ("10001011")), (("10001100"), ("10001111"), ("00000011"), ("10001100")), (("10001101"), ("10001100"), ("00000001"), ("10001101")), (("10001110"), ("10001001"), ("00000111"), ("10001110")), (("10001111"), ("10001010"), ("00000101"), ("10001111")),

73 (("10010000"), ("10101011"), ("00111011"), ("10010000")), (("10010001"), ("10101000"), ("00111001"), ("10010001")), (("10010010"), ("10101101"), ("00111111"), ("10010010")), (("10010011"), ("10101110"), ("00111101"), ("10010011")), (("10010100"), ("10100111"), ("00110011"), ("10010100")), (("10010101"), ("10100100"), ("00110001"), ("10010101")), (("10010110"), ("10100001"), ("00110111"), ("10010110")), (("10010111"), ("10100010"), ("00110101"), ("10010111")), (("10011000"), ("10110011"), ("00101011"), ("10011000")), (("10011001"), ("10110000"), ("00101001"), ("10011001")), (("10011010"), ("10110101"), ("00101111"), ("10011010")), (("10011011"), ("10110110"), ("00101101"), ("10011011")), (("10011100"), ("10111111"), ("00100011"), ("10011100")), (("10011101"), ("10111100"), ("00100001"), ("10011101")), (("10011110"), ("10111001"), ("00100111"), ("10011110")), (("10011111"), ("10111010"), ("00100101"), ("10011111")), (("10100000"), ("11111011"), ("01011011"), ("10100000")), (("10100001"), ("11111000"), ("01011001"), ("10100001")), (("10100010"), ("11111101"), ("01011111"), ("10100010")), (("10100011"), ("11111110"), ("01011101"), ("10100011")), (("10100100"), ("11110111"), ("01010011"), ("10100100")), (("10100101"), ("11110100"), ("01010001"), ("10100101")), (("10100110"), ("11110001"), ("01010111"), ("10100110")), (("10100111"), ("11110010"), ("01010101"), ("10100111")), (("10101000"), ("11100011"), ("01001011"), ("10101000")), (("10101001"), ("11100000"), ("01001001"), ("10101001")), (("10101010"), ("11100101"), ("01001111"), ("10101010")), (("10101011"), ("11100110"), ("01001101"), ("10101011")), (("10101100"), ("11101111"), ("01000011"), ("10101100")), (("10101101"), ("11101100"), ("01000001"), ("10101101")), (("10101110"), ("11101001"), ("01000111"), ("10101110")), (("10101111"), ("11101010"), ("01000101"), ("10101111")), (("10110000"), ("11001011"), ("01111011"), ("10110000")), (("10110001"), ("11001000"), ("01111001"), ("10110001")), (("10110010"), 
("11001101"), ("01111111"), ("10110010")), (("10110011"), ("11001110"), ("01111101"), ("10110011")), (("10110100"), ("11000111"), ("01110011"), ("10110100")), (("10110101"), ("11000100"), ("01110001"), ("10110101")), (("10110110"), ("11000001"), ("01110111"), ("10110110")), (("10110111"), ("11000010"), ("01110101"), ("10110111")), (("10111000"), ("11010011"), ("01101011"), ("10111000")), (("10111001"), ("11010000"), ("01101001"), ("10111001")), (("10111010"), ("11010101"), ("01101111"), ("10111010")), (("10111011"), ("11010110"), ("01101101"), ("10111011")), (("10111100"), ("11011111"), ("01100011"), ("10111100")), (("10111101"), ("11011100"), ("01100001"), ("10111101")), (("10111110"), ("11011001"), ("01100111"), ("10111110")), (("10111111"), ("11011010"), ("01100101"), ("10111111")), (("11000000"), ("01011011"), ("10011011"), ("11000000")), (("11000001"), ("01011000"), ("10011001"), ("11000001")), (("11000010"), ("01011101"), ("10011111"), ("11000010")), (("11000011"), ("01011110"), ("10011101"), ("11000011")), (("11000100"), ("01010111"), ("10010011"), ("11000100")), (("11000101"), ("01010100"), ("10010001"), ("11000101")), (("11000110"), ("01010001"), ("10010111"), ("11000110")), (("11000111"), ("01010010"), ("10010101"), ("11000111")), (("11001000"), ("01000011"), ("10001011"), ("11001000")),

74 (("11001001"), ("01000000"), ("10001001"), ("11001001")), (("11001010"), ("01000101"), ("10001111"), ("11001010")), (("11001011"), ("01000110"), ("10001101"), ("11001011")), (("11001100"), ("01001111"), ("10000011"), ("11001100")), (("11001101"), ("01001100"), ("10000001"), ("11001101")), (("11001110"), ("01001001"), ("10000111"), ("11001110")), (("11001111"), ("01001010"), ("10000101"), ("11001111")), (("11010000"), ("01101011"), ("10111011"), ("11010000")), (("11010001"), ("01101000"), ("10111001"), ("11010001")), (("11010010"), ("01101101"), ("10111111"), ("11010010")), (("11010011"), ("01101110"), ("10111101"), ("11010011")), (("11010100"), ("01100111"), ("10110011"), ("11010100")), (("11010101"), ("01100100"), ("10110001"), ("11010101")), (("11010110"), ("01100001"), ("10110111"), ("11010110")), (("11010111"), ("01100010"), ("10110101"), ("11010111")), (("11011000"), ("01110011"), ("10101011"), ("11011000")), (("11011001"), ("01110000"), ("10101001"), ("11011001")), (("11011010"), ("01110101"), ("10101111"), ("11011010")), (("11011011"), ("01110110"), ("10101101"), ("11011011")), (("11011100"), ("01111111"), ("10100011"), ("11011100")), (("11011101"), ("01111100"), ("10100001"), ("11011101")), (("11011110"), ("01111001"), ("10100111"), ("11011110")), (("11011111"), ("01111010"), ("10100101"), ("11011111")), (("11100000"), ("00111011"), ("11011011"), ("11100000")), (("11100001"), ("00111000"), ("11011001"), ("11100001")), (("11100010"), ("00111101"), ("11011111"), ("11100010")), (("11100011"), ("00111110"), ("11011101"), ("11100011")), (("11100100"), ("00110111"), ("11010011"), ("11100100")), (("11100101"), ("00110100"), ("11010001"), ("11100101")), (("11100110"), ("00110001"), ("11010111"), ("11100110")), (("11100111"), ("00110010"), ("11010101"), ("11100111")), (("11101000"), ("00100011"), ("11001011"), ("11101000")), (("11101001"), ("00100000"), ("11001001"), ("11101001")), (("11101010"), ("00100101"), ("11001111"), ("11101010")), (("11101011"), 
("00100110"), ("11001101"), ("11101011")), (("11101100"), ("00101111"), ("11000011"), ("11101100")), (("11101101"), ("00101100"), ("11000001"), ("11101101")), (("11101110"), ("00101001"), ("11000111"), ("11101110")), (("11101111"), ("00101010"), ("11000101"), ("11101111")), (("11110000"), ("00001011"), ("11111011"), ("11110000")), (("11110001"), ("00001000"), ("11111001"), ("11110001")), (("11110010"), ("00001101"), ("11111111"), ("11110010")), (("11110011"), ("00001110"), ("11111101"), ("11110011")), (("11110100"), ("00000111"), ("11110011"), ("11110100")), (("11110101"), ("00000100"), ("11110001"), ("11110101")), (("11110110"), ("00000001"), ("11110111"), ("11110110")), (("11110111"), ("00000010"), ("11110101"), ("11110111")), (("11111000"), ("00010011"), ("11101011"), ("11111000")), (("11111001"), ("00010000"), ("11101001"), ("11111001")), (("11111010"), ("00010101"), ("11101111"), ("11111010")), (("11111011"), ("00010110"), ("11101101"), ("11111011")), (("11111100"), ("00011111"), ("11100011"), ("11111100")), (("11111101"), ("00011100"), ("11100001"), ("11111101")), (("11111110"), ("00011001"), ("11100111"), ("11111110")), (("11111111"), ("00011010"), ("11100101"), ("11111111")) ); constant mul_row3: mul_table:= (

75 (("00000000"), ("00000000"), ("00000000"), ("00000000")), (("00000001"), ("00000001"), ("00000011"), ("00000010")), (("00000010"), ("00000010"), ("00000110"), ("00000100")), (("00000011"), ("00000011"), ("00000101"), ("00000110")), (("00000100"), ("00000100"), ("00001100"), ("00001000")), (("00000101"), ("00000101"), ("00001111"), ("00001010")), (("00000110"), ("00000110"), ("00001010"), ("00001100")), (("00000111"), ("00000111"), ("00001001"), ("00001110")), (("00001000"), ("00001000"), ("00011000"), ("00010000")), (("00001001"), ("00001001"), ("00011011"), ("00010010")), (("00001010"), ("00001010"), ("00011110"), ("00010100")), (("00001011"), ("00001011"), ("00011101"), ("00010110")), (("00001100"), ("00001100"), ("00010100"), ("00011000")), (("00001101"), ("00001101"), ("00010111"), ("00011010")), (("00001110"), ("00001110"), ("00010010"), ("00011100")), (("00001111"), ("00001111"), ("00010001"), ("00011110")), (("00010000"), ("00010000"), ("00110000"), ("00100000")), (("00010001"), ("00010001"), ("00110011"), ("00100010")), (("00010010"), ("00010010"), ("00110110"), ("00100100")), (("00010011"), ("00010011"), ("00110101"), ("00100110")), (("00010100"), ("00010100"), ("00111100"), ("00101000")), (("00010101"), ("00010101"), ("00111111"), ("00101010")), (("00010110"), ("00010110"), ("00111010"), ("00101100")), (("00010111"), ("00010111"), ("00111001"), ("00101110")), (("00011000"), ("00011000"), ("00101000"), ("00110000")), (("00011001"), ("00011001"), ("00101011"), ("00110010")), (("00011010"), ("00011010"), ("00101110"), ("00110100")), (("00011011"), ("00011011"), ("00101101"), ("00110110")), (("00011100"), ("00011100"), ("00100100"), ("00111000")), (("00011101"), ("00011101"), ("00100111"), ("00111010")), (("00011110"), ("00011110"), ("00100010"), ("00111100")), (("00011111"), ("00011111"), ("00100001"), ("00111110")), (("00100000"), ("00100000"), ("01100000"), ("01000000")), (("00100001"), ("00100001"), ("01100011"), ("01000010")), (("00100010"), 
("00100010"), ("01100110"), ("01000100")), (("00100011"), ("00100011"), ("01100101"), ("01000110")), (("00100100"), ("00100100"), ("01101100"), ("01001000")), (("00100101"), ("00100101"), ("01101111"), ("01001010")), (("00100110"), ("00100110"), ("01101010"), ("01001100")), (("00100111"), ("00100111"), ("01101001"), ("01001110")), (("00101000"), ("00101000"), ("01111000"), ("01010000")), (("00101001"), ("00101001"), ("01111011"), ("01010010")), (("00101010"), ("00101010"), ("01111110"), ("01010100")), (("00101011"), ("00101011"), ("01111101"), ("01010110")), (("00101100"), ("00101100"), ("01110100"), ("01011000")), (("00101101"), ("00101101"), ("01110111"), ("01011010")), (("00101110"), ("00101110"), ("01110010"), ("01011100")), (("00101111"), ("00101111"), ("01110001"), ("01011110")), (("00110000"), ("00110000"), ("01010000"), ("01100000")), (("00110001"), ("00110001"), ("01010011"), ("01100010")), (("00110010"), ("00110010"), ("01010110"), ("01100100")), (("00110011"), ("00110011"), ("01010101"), ("01100110")), (("00110100"), ("00110100"), ("01011100"), ("01101000")), (("00110101"), ("00110101"), ("01011111"), ("01101010")), (("00110110"), ("00110110"), ("01011010"), ("01101100")), (("00110111"), ("00110111"), ("01011001"), ("01101110")), (("00111000"), ("00111000"), ("01001000"), ("01110000")),

76 (("00111001"), ("00111001"), ("01001011"), ("01110010")), (("00111010"), ("00111010"), ("01001110"), ("01110100")), (("00111011"), ("00111011"), ("01001101"), ("01110110")), (("00111100"), ("00111100"), ("01000100"), ("01111000")), (("00111101"), ("00111101"), ("01000111"), ("01111010")), (("00111110"), ("00111110"), ("01000010"), ("01111100")), (("00111111"), ("00111111"), ("01000001"), ("01111110")), (("01000000"), ("01000000"), ("11000000"), ("10000000")), (("01000001"), ("01000001"), ("11000011"), ("10000010")), (("01000010"), ("01000010"), ("11000110"), ("10000100")), (("01000011"), ("01000011"), ("11000101"), ("10000110")), (("01000100"), ("01000100"), ("11001100"), ("10001000")), (("01000101"), ("01000101"), ("11001111"), ("10001010")), (("01000110"), ("01000110"), ("11001010"), ("10001100")), (("01000111"), ("01000111"), ("11001001"), ("10001110")), (("01001000"), ("01001000"), ("11011000"), ("10010000")), (("01001001"), ("01001001"), ("11011011"), ("10010010")), (("01001010"), ("01001010"), ("11011110"), ("10010100")), (("01001011"), ("01001011"), ("11011101"), ("10010110")), (("01001100"), ("01001100"), ("11010100"), ("10011000")), (("01001101"), ("01001101"), ("11010111"), ("10011010")), (("01001110"), ("01001110"), ("11010010"), ("10011100")), (("01001111"), ("01001111"), ("11010001"), ("10011110")), (("01010000"), ("01010000"), ("11110000"), ("10100000")), (("01010001"), ("01010001"), ("11110011"), ("10100010")), (("01010010"), ("01010010"), ("11110110"), ("10100100")), (("01010011"), ("01010011"), ("11110101"), ("10100110")), (("01010100"), ("01010100"), ("11111100"), ("10101000")), (("01010101"), ("01010101"), ("11111111"), ("10101010")), (("01010110"), ("01010110"), ("11111010"), ("10101100")), (("01010111"), ("01010111"), ("11111001"), ("10101110")), (("01011000"), ("01011000"), ("11101000"), ("10110000")), (("01011001"), ("01011001"), ("11101011"), ("10110010")), (("01011010"), ("01011010"), ("11101110"), ("10110100")), (("01011011"), 
("01011011"), ("11101101"), ("10110110")), (("01011100"), ("01011100"), ("11100100"), ("10111000")), (("01011101"), ("01011101"), ("11100111"), ("10111010")), (("01011110"), ("01011110"), ("11100010"), ("10111100")), (("01011111"), ("01011111"), ("11100001"), ("10111110")), (("01100000"), ("01100000"), ("10100000"), ("11000000")), (("01100001"), ("01100001"), ("10100011"), ("11000010")), (("01100010"), ("01100010"), ("10100110"), ("11000100")), (("01100011"), ("01100011"), ("10100101"), ("11000110")), (("01100100"), ("01100100"), ("10101100"), ("11001000")), (("01100101"), ("01100101"), ("10101111"), ("11001010")), (("01100110"), ("01100110"), ("10101010"), ("11001100")), (("01100111"), ("01100111"), ("10101001"), ("11001110")), (("01101000"), ("01101000"), ("10111000"), ("11010000")), (("01101001"), ("01101001"), ("10111011"), ("11010010")), (("01101010"), ("01101010"), ("10111110"), ("11010100")), (("01101011"), ("01101011"), ("10111101"), ("11010110")), (("01101100"), ("01101100"), ("10110100"), ("11011000")), (("01101101"), ("01101101"), ("10110111"), ("11011010")), (("01101110"), ("01101110"), ("10110010"), ("11011100")), (("01101111"), ("01101111"), ("10110001"), ("11011110")), (("01110000"), ("01110000"), ("10010000"), ("11100000")), (("01110001"), ("01110001"), ("10010011"), ("11100010")),

77 (("01110010"), ("01110010"), ("10010110"), ("11100100")), (("01110011"), ("01110011"), ("10010101"), ("11100110")), (("01110100"), ("01110100"), ("10011100"), ("11101000")), (("01110101"), ("01110101"), ("10011111"), ("11101010")), (("01110110"), ("01110110"), ("10011010"), ("11101100")), (("01110111"), ("01110111"), ("10011001"), ("11101110")), (("01111000"), ("01111000"), ("10001000"), ("11110000")), (("01111001"), ("01111001"), ("10001011"), ("11110010")), (("01111010"), ("01111010"), ("10001110"), ("11110100")), (("01111011"), ("01111011"), ("10001101"), ("11110110")), (("01111100"), ("01111100"), ("10000100"), ("11111000")), (("01111101"), ("01111101"), ("10000111"), ("11111010")), (("01111110"), ("01111110"), ("10000010"), ("11111100")), (("01111111"), ("01111111"), ("10000001"), ("11111110")), (("10000000"), ("10000000"), ("10011011"), ("00011011")), (("10000001"), ("10000001"), ("10011000"), ("00011001")), (("10000010"), ("10000010"), ("10011101"), ("00011111")), (("10000011"), ("10000011"), ("10011110"), ("00011101")), (("10000100"), ("10000100"), ("10010111"), ("00010011")), (("10000101"), ("10000101"), ("10010100"), ("00010001")), (("10000110"), ("10000110"), ("10010001"), ("00010111")), (("10000111"), ("10000111"), ("10010010"), ("00010101")), (("10001000"), ("10001000"), ("10000011"), ("00001011")), (("10001001"), ("10001001"), ("10000000"), ("00001001")), (("10001010"), ("10001010"), ("10000101"), ("00001111")), (("10001011"), ("10001011"), ("10000110"), ("00001101")), (("10001100"), ("10001100"), ("10001111"), ("00000011")), (("10001101"), ("10001101"), ("10001100"), ("00000001")), (("10001110"), ("10001110"), ("10001001"), ("00000111")), (("10001111"), ("10001111"), ("10001010"), ("00000101")), (("10010000"), ("10010000"), ("10101011"), ("00111011")), (("10010001"), ("10010001"), ("10101000"), ("00111001")), (("10010010"), ("10010010"), ("10101101"), ("00111111")), (("10010011"), ("10010011"), ("10101110"), ("00111101")), (("10010100"), 
("10010100"), ("10100111"), ("00110011")), (("10010101"), ("10010101"), ("10100100"), ("00110001")), (("10010110"), ("10010110"), ("10100001"), ("00110111")), (("10010111"), ("10010111"), ("10100010"), ("00110101")), (("10011000"), ("10011000"), ("10110011"), ("00101011")), (("10011001"), ("10011001"), ("10110000"), ("00101001")), (("10011010"), ("10011010"), ("10110101"), ("00101111")), (("10011011"), ("10011011"), ("10110110"), ("00101101")), (("10011100"), ("10011100"), ("10111111"), ("00100011")), (("10011101"), ("10011101"), ("10111100"), ("00100001")), (("10011110"), ("10011110"), ("10111001"), ("00100111")), (("10011111"), ("10011111"), ("10111010"), ("00100101")), (("10100000"), ("10100000"), ("11111011"), ("01011011")), (("10100001"), ("10100001"), ("11111000"), ("01011001")), (("10100010"), ("10100010"), ("11111101"), ("01011111")), (("10100011"), ("10100011"), ("11111110"), ("01011101")), (("10100100"), ("10100100"), ("11110111"), ("01010011")), (("10100101"), ("10100101"), ("11110100"), ("01010001")), (("10100110"), ("10100110"), ("11110001"), ("01010111")), (("10100111"), ("10100111"), ("11110010"), ("01010101")), (("10101000"), ("10101000"), ("11100011"), ("01001011")), (("10101001"), ("10101001"), ("11100000"), ("01001001")), (("10101010"), ("10101010"), ("11100101"), ("01001111")),

78 (("10101011"), ("10101011"), ("11100110"), ("01001101")), (("10101100"), ("10101100"), ("11101111"), ("01000011")), (("10101101"), ("10101101"), ("11101100"), ("01000001")), (("10101110"), ("10101110"), ("11101001"), ("01000111")), (("10101111"), ("10101111"), ("11101010"), ("01000101")), (("10110000"), ("10110000"), ("11001011"), ("01111011")), (("10110001"), ("10110001"), ("11001000"), ("01111001")), (("10110010"), ("10110010"), ("11001101"), ("01111111")), (("10110011"), ("10110011"), ("11001110"), ("01111101")), (("10110100"), ("10110100"), ("11000111"), ("01110011")), (("10110101"), ("10110101"), ("11000100"), ("01110001")), (("10110110"), ("10110110"), ("11000001"), ("01110111")), (("10110111"), ("10110111"), ("11000010"), ("01110101")), (("10111000"), ("10111000"), ("11010011"), ("01101011")), (("10111001"), ("10111001"), ("11010000"), ("01101001")), (("10111010"), ("10111010"), ("11010101"), ("01101111")), (("10111011"), ("10111011"), ("11010110"), ("01101101")), (("10111100"), ("10111100"), ("11011111"), ("01100011")), (("10111101"), ("10111101"), ("11011100"), ("01100001")), (("10111110"), ("10111110"), ("11011001"), ("01100111")), (("10111111"), ("10111111"), ("11011010"), ("01100101")), (("11000000"), ("11000000"), ("01011011"), ("10011011")), (("11000001"), ("11000001"), ("01011000"), ("10011001")), (("11000010"), ("11000010"), ("01011101"), ("10011111")), (("11000011"), ("11000011"), ("01011110"), ("10011101")), (("11000100"), ("11000100"), ("01010111"), ("10010011")), (("11000101"), ("11000101"), ("01010100"), ("10010001")), (("11000110"), ("11000110"), ("01010001"), ("10010111")), (("11000111"), ("11000111"), ("01010010"), ("10010101")), (("11001000"), ("11001000"), ("01000011"), ("10001011")), (("11001001"), ("11001001"), ("01000000"), ("10001001")), (("11001010"), ("11001010"), ("01000101"), ("10001111")), (("11001011"), ("11001011"), ("01000110"), ("10001101")), (("11001100"), ("11001100"), ("01001111"), ("10000011")), (("11001101"), 
("11001101"), ("01001100"), ("10000001")), (("11001110"), ("11001110"), ("01001001"), ("10000111")), (("11001111"), ("11001111"), ("01001010"), ("10000101")), (("11010000"), ("11010000"), ("01101011"), ("10111011")), (("11010001"), ("11010001"), ("01101000"), ("10111001")), (("11010010"), ("11010010"), ("01101101"), ("10111111")), (("11010011"), ("11010011"), ("01101110"), ("10111101")), (("11010100"), ("11010100"), ("01100111"), ("10110011")), (("11010101"), ("11010101"), ("01100100"), ("10110001")), (("11010110"), ("11010110"), ("01100001"), ("10110111")), (("11010111"), ("11010111"), ("01100010"), ("10110101")), (("11011000"), ("11011000"), ("01110011"), ("10101011")), (("11011001"), ("11011001"), ("01110000"), ("10101001")), (("11011010"), ("11011010"), ("01110101"), ("10101111")), (("11011011"), ("11011011"), ("01110110"), ("10101101")), (("11011100"), ("11011100"), ("01111111"), ("10100011")), (("11011101"), ("11011101"), ("01111100"), ("10100001")), (("11011110"), ("11011110"), ("01111001"), ("10100111")), (("11011111"), ("11011111"), ("01111010"), ("10100101")), (("11100000"), ("11100000"), ("00111011"), ("11011011")), (("11100001"), ("11100001"), ("00111000"), ("11011001")), (("11100010"), ("11100010"), ("00111101"), ("11011111")), (("11100011"), ("11100011"), ("00111110"), ("11011101")),

    (("11100100"), ("11100100"), ("00110111"), ("11010011")),
    (("11100101"), ("11100101"), ("00110100"), ("11010001")),
    (("11100110"), ("11100110"), ("00110001"), ("11010111")),
    (("11100111"), ("11100111"), ("00110010"), ("11010101")),
    (("11101000"), ("11101000"), ("00100011"), ("11001011")),
    (("11101001"), ("11101001"), ("00100000"), ("11001001")),
    (("11101010"), ("11101010"), ("00100101"), ("11001111")),
    (("11101011"), ("11101011"), ("00100110"), ("11001101")),
    (("11101100"), ("11101100"), ("00101111"), ("11000011")),
    (("11101101"), ("11101101"), ("00101100"), ("11000001")),
    (("11101110"), ("11101110"), ("00101001"), ("11000111")),
    (("11101111"), ("11101111"), ("00101010"), ("11000101")),
    (("11110000"), ("11110000"), ("00001011"), ("11111011")),
    (("11110001"), ("11110001"), ("00001000"), ("11111001")),
    (("11110010"), ("11110010"), ("00001101"), ("11111111")),
    (("11110011"), ("11110011"), ("00001110"), ("11111101")),
    (("11110100"), ("11110100"), ("00000111"), ("11110011")),
    (("11110101"), ("11110101"), ("00000100"), ("11110001")),
    (("11110110"), ("11110110"), ("00000001"), ("11110111")),
    (("11110111"), ("11110111"), ("00000010"), ("11110101")),
    (("11111000"), ("11111000"), ("00010011"), ("11101011")),
    (("11111001"), ("11111001"), ("00010000"), ("11101001")),
    (("11111010"), ("11111010"), ("00010101"), ("11101111")),
    (("11111011"), ("11111011"), ("00010110"), ("11101101")),
    (("11111100"), ("11111100"), ("00011111"), ("11100011")),
    (("11111101"), ("11111101"), ("00011100"), ("11100001")),
    (("11111110"), ("11111110"), ("00011001"), ("11100111")),
    (("11111111"), ("11111111"), ("00011010"), ("11100101"))
);

constant sbox: box:= (
    --0
    (X"63", X"7c", X"77", X"7b", X"f2", X"6b", X"6f", X"c5", X"30", X"01", X"67", X"2b", X"fe", X"d7", X"ab", X"76"),
    --1
    (X"ca", X"82", X"c9", X"7d", X"fa", X"59", X"47", X"f0", X"ad", X"d4", X"a2", X"af", X"9c", X"a4", X"72", X"c0"),
    --2
    (X"b7", X"fd", X"93", X"26", X"36", X"3f", X"f7", X"cc", X"34", X"a5", X"e5", X"f1", X"71", X"d8", X"31", X"15"),
    --3
    (X"04", X"c7", X"23", X"c3", X"18", X"96", X"05", X"9a", X"07", X"12", X"80", X"e2", X"eb", X"27", X"b2", X"75"),
    --4
    (X"09", X"83", X"2c", X"1a", X"1b", X"6e", X"5a", X"a0", X"52", X"3b", X"d6", X"b3", X"29", X"e3", X"2f", X"84"),
    --5
    (X"53", X"d1", X"00", X"ed", X"20", X"fc", X"b1", X"5b", X"6a", X"cb", X"be", X"39", X"4a", X"4c", X"58", X"cf"),
    --6
    (X"d0", X"ef", X"aa", X"fb", X"43", X"4d", X"33", X"85", X"45", X"f9", X"02", X"7f", X"50", X"3c", X"9f", X"a8"),
    --7
    (X"51", X"a3", X"40", X"8f", X"92", X"9d", X"38", X"f5", X"bc", X"b6", X"da", X"21", X"10", X"ff", X"f3", X"d2"),
    --8
    (X"cd", X"0c", X"13", X"ec", X"5f", X"97", X"44", X"17", X"c4", X"a7", X"7e", X"3d", X"64", X"5d", X"19", X"73"),
    --9
    (X"60", X"81", X"4f", X"dc", X"22", X"2a", X"90", X"88", X"46", X"ee", X"b8", X"14", X"de", X"5e", X"0b", X"db"),
    --10
    (X"e0", X"32", X"3a", X"0a", X"49", X"06", X"24", X"5c", X"c2", X"d3", X"ac", X"62", X"91", X"95", X"e4", X"79"),
    --11
    (X"e7", X"c8", X"37", X"6d", X"8d", X"d5", X"4e", X"a9", X"6c", X"56", X"f4", X"ea", X"65", X"7a", X"ae", X"08"),
    --12
    (X"ba", X"78", X"25", X"2e", X"1c", X"a6", X"b4", X"c6", X"e8", X"dd", X"74", X"1f", X"4b", X"bd", X"8b", X"8a"),
    --13
    (X"70", X"3e", X"b5", X"66", X"48", X"03", X"f6", X"0e", X"61", X"35", X"57", X"b9", X"86", X"c1", X"1d", X"9e"),
    --14
    (X"e1", X"f8", X"98", X"11", X"69", X"d9", X"8e", X"94", X"9b", X"1e", X"87", X"e9", X"ce", X"55", X"28", X"df"),
    --15
    (X"8c", X"a1", X"89", X"0d", X"bf", X"e6", X"42", X"68", X"41", X"99", X"2d", X"0f", X"b0", X"54", X"bb", X"16")
);

constant Rcon: RCbox:= (

    (X"00", X"01", X"02", X"04", X"08", X"10", X"20", X"40", X"80", X"1b", X"36")
);

function "xor"(a, b: in word) return word;
function ceil(a: in natural) return natural;
function rgen(a, b: in natural) return natural;
function ugen(a: in natural) return natural;
function format(
    Tlen, Nlen, Alen, Plen : in natural;
    nonce, Ain, Pin: in std_logic_vector
) return key_Arr;
function format_ctr(
    Nlen, Plen : in natural;
    nonce: in std_logic_vector
) return key_Arr;

end datatypes;

package body datatypes is

-- Byte-wise XOR of two 4-byte words.
function "xor"(a, b: in word) return word is
    variable temp: word;
begin
    temp(0):= a(0) xor b(0);
    temp(1):= a(1) xor b(1);
    temp(2):= a(2) xor b(2);
    temp(3):= a(3) xor b(3);
    return temp;
end function "xor";

-- Number of 128-bit blocks needed to hold a bits.
function ceil(a: in natural) return natural is
    variable temp: natural;
begin
    if (a mod 128=0) then
        temp:= a/128;
    else
        temp:= a/128+1;
    end if;
    return temp;
end function ceil;

-- Index of the last formatted block: blocks for the associated data
-- (a bits plus its length encoding) plus blocks for the payload (b bits).
function rgen(a,b: in natural) return natural is
    variable temp: natural;
begin
    if (0<(a/8) and (a/8)<2**16-2**8) then
        temp:=ceil(16+a)+ceil(b);
    else
        --if ( 2**16-2**8<=(a/8) and (a/8)<2**32 ) then
        temp:=ceil(48+a)+ceil(b);
        --else
        --if (2**32<=(a/8) and (a/8)<2**64) then
        --temp:=ceil(80+a)+ceil(b);
        --end if;
        --end if;
    end if;
    return temp;
end function rgen;

-- Number of blocks for the associated data (a bits plus its length encoding).
function ugen(a: in natural) return natural is
    variable temp: natural;
begin
    if (0<(a/8) and (a/8)<2**16-2**8) then
        temp:=ceil(16+a);
    else
        --if ( 2**16-2**8<=(a/8) and (a/8)<2**32 ) then
        temp:=ceil(48+a);
        --else
        --if (2**32<=(a/8) and (a/8)<2**64) then
        --temp:=ceil(80+a);
        --end if;
        --end if;
    end if;
    return temp;
end function ugen;

-- Builds the CCM formatted input: block 0 (flags, nonce, payload length),
-- then the associated-data blocks, then the payload blocks.
function format(
    Tlen, Nlen, Alen, Plen : in natural;
    nonce, Ain, Pin: in std_logic_vector
) return key_Arr is
    variable temp: key_arr(rgen(Alen, Plen) downto 0);
    variable t, q, n, p: natural;
    variable count: natural;
    variable Q_str: std_logic_vector((15-Nlen/8)*8-1 downto 0);
    variable a_str: std_logic_vector(ugen(Alen)*128-1 downto 0);
    variable p_str: std_logic_vector(ugen(Plen)*128-1 downto 0);
begin
    t:=Tlen/8;
    n:=Nlen/8;
    q:=15-n;
    p:=Plen/8;
    -- Flags byte of block 0: reserved bit, Adata bit, (t-2)/2, q-1.
    temp(0)(0)(7):='0';
    if Alen=0 then
        temp(0)(0)(6):='0';
    else
        temp(0)(0)(6):='1';
    end if;
    temp(0)(0)(5 downto 3):=conv_std_logic_vector((t-2)/2, 3);
    temp(0)(0)(2 downto 0):=conv_std_logic_vector(q-1, 3);
    for j in 1 to n loop
        temp(0)(n+1-j):= nonce((j-1)*8+7 downto (j-1)*8);
    end loop;

    Q_str:= conv_std_logic_vector(p, q*8);
    for j in n+1 to 15 loop
        temp(0)(15+n+1-j):= Q_str((j-(n+1))*8+7 downto (j-(n+1))*8);
    end loop;
    if (0<(Alen/8) and (Alen/8)<2**16-2**8) then
        -- Two-byte length encoding followed by the associated data, zero padded.
        a_str(7 downto 0):= conv_std_logic_vector(Alen/8, 16)(15 downto 8);
        a_str(15 downto 8):= conv_std_logic_vector(Alen/8, 16)(7 downto 0);
        for i in 0 to Alen/8-1 loop
            a_str(16+i*8+7 downto 16+i*8):= Ain((Alen/8-1-i)*8+7 downto (Alen/8-1-i)*8);
        end loop;
        a_str(ugen(Alen)*128-1 downto 16+Alen):=(others=>'0');
        for j in 1 to ugen(Alen) loop
            for k in 0 to 15 loop
                temp(j)(k):= a_str((j-1)*128+k*8+7 downto (j-1)*128+k*8);
            end loop;
        end loop;
    else
        -- Reached when Alen = 0 or Alen/8 >= 2**16-2**8;
        -- the longer length encodings are left unimplemented.
        --if ( 2**16-2**8<=(Alen/8) and (Alen/8)<2**32 ) then
        --else
        --if (2**32<=(Alen/8) and (Alen/8)<2**64) then
        --end if;
        --end if;
    end if;
    for i in 0 to Plen/8-1 loop
        p_str(i*8+7 downto i*8):= Pin((Plen/8-1-i)*8+7 downto (Plen/8-1-i)*8);
    end loop;
    p_str(ceil(Plen)*128-1 downto Plen):= (others=>'0');
    for j in ugen(Alen)+1 to rgen(Alen, Plen) loop
        for k in 0 to 15 loop
            temp(j)(k):= p_str((j-(ugen(Alen)+1))*128+k*8+7 downto (j-(ugen(Alen)+1))*128+k*8);
        end loop;
    end loop;
    return temp;
end function format;

-- Builds the counter blocks Ctr_0 .. Ctr_m: flags byte, nonce, counter value i.
function format_ctr(
    Nlen, Plen : in natural;
    nonce: in std_logic_vector
) return key_Arr is
    variable q, n: natural;
    variable temp: key_arr(ceil(Plen) downto 0);
begin
    n:=Nlen/8;
    q:=15-n;
    for i in 0 to ceil(Plen) loop
        temp(i)(0)(7 downto 3):=(others=>'0');
        temp(i)(0)(2 downto 0):=conv_std_logic_vector(q-1, 3);
        for j in 1 to n loop
            temp(i)(n+1-j):= nonce((j-1)*8+7 downto (j-1)*8);
        end loop;
        for j in n+1 to 15 loop
            temp(i)(15+n+1-j):= conv_std_logic_vector(i, 8*q)((j-(n+1))*8+7 downto (j-(n+1))*8);
        end loop;
    end loop;
    return temp;
end function format_ctr;

end package body datatypes;
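The entries of mul_row2 and mul_row3 in the listing appear to hold each input byte together with its products by 2 and 3 in GF(2^8), as used by the AES MixColumns step. The following is a minimal simulation-only sketch for spot-checking a few entries against that arithmetic. The entity name gf_mul_check and the helper functions xtime and mul3 are assumptions introduced for illustration and are not part of the thesis sources.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking testbench; not part of the thesis sources.
entity gf_mul_check is
end entity gf_mul_check;

architecture sim of gf_mul_check is
  subtype byte is std_logic_vector(7 downto 0);

  -- Multiply by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1:
  -- shift left by one bit and reduce with 0x1B if the top bit was set.
  function xtime(a : byte) return byte is
    variable r : byte;
  begin
    r := a(6 downto 0) & '0';
    if a(7) = '1' then
      r := r xor "00011011";
    end if;
    return r;
  end function xtime;

  -- Multiply by 3, computed as (2 * a) xor a.
  function mul3(a : byte) return byte is
  begin
    return xtime(a) xor a;
  end function mul3;
begin
  check : process
  begin
    -- Listing entry for input 01010111: product by 3 is 11111001, by 2 is 10101110.
    assert xtime("01010111") = "10101110"
      report "xtime mismatch for 01010111" severity failure;
    assert mul3("01010111") = "11111001"
      report "mul3 mismatch for 01010111" severity failure;

    -- Listing entry for input 10000000: product by 3 is 10011011, by 2 is 00011011.
    assert xtime("10000000") = "00011011"
      report "xtime mismatch for 10000000" severity failure;
    assert mul3("10000000") = "10011011"
      report "mul3 mismatch for 10000000" severity failure;

    -- Listing entry for input 11111111: product by 3 is 00011010, by 2 is 11100101.
    assert xtime("11111111") = "11100101"
      report "xtime mismatch for 11111111" severity failure;
    assert mul3("11111111") = "00011010"
      report "mul3 mismatch for 11111111" severity failure;

    report "spot checks finished" severity note;
    wait;
  end process check;
end architecture sim;
```

Storing the products in constant tables, as the listing does, trades memory for logic; the functions above would be the logic-based alternative, and the two can be cross-checked over all 256 inputs in the same manner.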

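As a worked example of the arithmetic at the start of the format function, the sketch below recomputes the flags byte of block 0 and the block counts for one illustrative parameter set: Tlen = 32, Nlen = 56, Alen = 64 and Plen = 32 bits. It is an assumption-laden illustration rather than the thesis code: the names ccm_flags_check, blocks and b0_flags are hypothetical, and ieee.numeric_std is used here in place of the conv_std_logic_vector conversions of the listing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking testbench; not part of the thesis sources.
entity ccm_flags_check is
end entity ccm_flags_check;

architecture sim of ccm_flags_check is
  -- Number of 128-bit blocks for a bit length (same result as ceil in the listing).
  function blocks(bits : natural) return natural is
  begin
    return (bits + 127) / 128;
  end function blocks;

  -- Flags byte of block 0 from the tag, nonce and associated-data lengths in bits.
  function b0_flags(Tlen, Nlen, Alen : natural) return std_logic_vector is
    variable f : std_logic_vector(7 downto 0) := (others => '0');
    variable t : natural := Tlen / 8;       -- tag length in octets
    variable q : natural := 15 - Nlen / 8;  -- octets of the payload-length field
  begin
    if Alen /= 0 then
      f(6) := '1';                          -- Adata bit
    end if;
    f(5 downto 3) := std_logic_vector(to_unsigned((t - 2) / 2, 3));
    f(2 downto 0) := std_logic_vector(to_unsigned(q - 1, 3));
    return f;
  end function b0_flags;
begin
  check : process
  begin
    -- t = 4 gives (t-2)/2 = 1, n = 7 gives q-1 = 7: flags = 0 1 001 111.
    assert b0_flags(32, 56, 64) = "01001111"
      report "unexpected flags byte" severity failure;

    -- Associated data: 16-bit length encoding plus 64 bits fits in one block.
    assert blocks(16 + 64) = 1
      report "unexpected associated-data block count" severity failure;

    -- One associated-data block plus one payload block after block 0.
    assert blocks(16 + 64) + blocks(32) = 2
      report "unexpected total block count" severity failure;

    report "flag and block-count checks finished" severity note;
    wait;
  end process check;
end architecture sim;
```

With these parameters the formatted input therefore spans three blocks (indices 2 downto 0), which matches the range rgen(Alen, Plen) downto 0 declared for temp in format.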