Intel® 80310 I/O Processor Chipset AAU Coding Techniques

White Paper

January 14, 2002

Document Number: 273649-001

Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

The Intel® 80310 I/O processor may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an ordering number and are referenced in this document, or other Intel literature may be obtained by calling 1-800-548-4725 or by visiting Intel's website at http://www.intel.com.
Copyright© Intel Corporation, 2002 AlertVIEW, i960, AnyPoint, AppChoice, BoardWatch, BunnyPeople, CablePort, , Chips, Commerce Cart, CT Connect, CT Media, Dialogic, DM3, EtherExpress, ETOX, FlashFile, GatherRound, , , iCat, iCOMP, Insight960, InstantIP, Intel, Intel logo, Intel386, Intel486, , IntelDX2, IntelDX4, IntelSX2, Intel ChatPad, Intel Create&Share, Intel Dot.Station, Intel GigaBlade, Intel InBusiness, Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetStructure, Intel Play, Intel Play logo, Intel Pocket Concert, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel TeamStation, Intel WebOutfitter, Intel , Intel XScale, , JobAnalyst, LANDesk, LanRover, MCS, MMX, MMX logo, NetPort, NetportExpress, Optimizer logo, OverDrive, Paragon, PC Dads, PC Parents, , Pentium II Xeon, Pentium III Xeon, Performance at Your Command, ProShare, RemoteExpress, Screamline, Shiva, SmartDie, Solutions960, Sound Mark, StorageExpress, The Computer Inside, The Journey Inside, This Way In, TokenExpress, Trillium, Vivonic, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.


Contents

1.0 White Paper Purpose and Description
  1.1 Document Highlights
  1.2 Related Documents
2.0 Application Accelerator Unit
  2.0.1 Overview
3.0 Low-Level Design Document
  3.1 Objective
    3.1.1 AAU Implementation
      3.1.1.1 Overview
    3.1.2 Assumptions
    3.1.3 Initialization
    3.1.4 AAU Data Structures
    3.1.5 Data Path
    3.1.6 API Functions
      3.1.6.1 API Listing
        3.1.6.1.1 AAU Public
        3.1.6.1.2 AAU Private (Static)
      3.1.6.2 Selected API Descriptions
        3.1.6.2.1 static int __init aau_init(void);
        3.1.6.2.2 static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);
        3.1.6.2.3 int aau_request(u32 *aau_context);
        3.1.6.2.4 int aau_suspend(u32 aau_context);
        3.1.6.2.5 int aau_resume(u32 aau_context);
        3.1.6.2.6 int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);
        3.1.6.2.7 static int aau_flush_all(u32 aau_context);
        3.1.6.2.8 int aau_free(u32 aau_context);
        3.1.6.2.9 static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);
        3.1.6.2.10 static void aau_process(iop310_aau_t *aau);
        3.1.6.2.11 static void aau_result_handler(void *aau);
        3.1.6.2.12 aau_sgl_t * aau_get_buffer(u32 aau_context, u32 num_buf);
        3.1.6.2.13 void aau_return_buffer(u32 aau_context, sgl_list_t *list);
4.0 Code Commentary
  4.1 Section Objectives
    4.1.1 File Organization Overview
      4.1.1.1 Key Data Structure and Use of Casting
    4.1.2 Cache Memory
    4.1.3 Other AAU Hardware
    4.1.4 Virtual to Physical Memory
    4.1.5 Interrupt Handling
      4.1.5.1 Top Half Interrupt Handler: aau_irq_handler()
      4.1.5.2 Bottom Half Interrupt Handler: aau_task()
    4.1.6 Linux Kernel APIs
  4.2 Optimization Related
    4.2.1 Stack versus Queue
    4.2.2 Chaining and Resume
    4.2.3 Requiring the Application to Supply Physical Addresses in AAU Descriptor (versus virtual addresses)
    4.2.4 Allocations of Memory for AAU Descriptors During Initialization
    4.2.5 Using AAU for Local Memory to Local Memory Copy: mem_copy()
5.0 Potential Enhancements
  5.1 Error Handling
  5.2 Lookaside Cache Scheme (This is Linux specific)
  5.3 Extensive Intel Optimization Related Documentation
6.0 Conclusion

A AAU Source Code
  A.1 Public Definitions for Intel® 80310 I/O Processor Chipset AAU: \include\aau.h
  A.2 Private Definitions for Intel® XScale™ Microarchitecture AAU: \src\aau.h
  A.3 Support Functions for the Intel® 80310 I/O Processor Chipset AAU: \src\aau.c
B Example Calling Source Code
  B.1 Standard Calls
C MMU Functions for Intel® XScale™ Microarchitecture


Figures

1 Application Accelerator Unit
2 AAU State Trace Diagram
3 Interrupt Handler Functional Flow Diagram

Tables

1 Acronyms
2 AAU Control Registers
3 DC Field Description
4 AAU Registers
5 AAU Hardware Descriptor Format
6 AAU Hardware Descriptor
7 AAU Software Descriptor Structure
8 AAU Device Descriptor
9 User SGL Header
10 AAU User SGL Structure


Revision History

Date          Revision  Description
January 2002  001       Initial Release.


1.0 White Paper Purpose and Description

Increasing I/O demands are central to Network and Storage high performance applications. Intel® XScale™ microarchitecture (ARM* architecture compliant) addresses this trend with the Intel® 80310 I/O processor chipset (80310). Features of the Intel® 80310 solution include the Application Accelerator Unit (AAU).

The purpose of this paper is to give Intel customers a fast development ramp in using the Application Accelerator Unit (AAU) on the 80310. This is achieved by providing an implementation case study. The contents of this document are meant to supplement the Intel® 80312 I/O Companion Chip Developer’s Manual, Chapter 10, the Intel-referenced Optimization Guides, and the other Intel documentation listed in Section 1.2, “Related Documents”.

1.1 Document Highlights

• Section 2.0, “Application Accelerator Unit”: AAU hardware overview.
• Section 1.2, “Related Documents”: A listing of related documents and web links.
• Section 3.0, “Low-Level Design Document”: A case study presenting a Low-Level Design Document used in a Linux implementation of the AAU hardware.
• Section A, “AAU Source Code”: The Linux implementation source code.
• Section 4.0, “Code Commentary” and Section 5.0, “Potential Enhancements”: Code commentary discussing the implementation with source code line references. The commentary identifies the optimizations implemented, interrupt handling, and potential enhancements to the existing implementation.
• Section B, “Example Calling Source Code”: Examples of calling the source code APIs.
• Section C, “MMU Functions for Intel® XScale™ Microarchitecture”: A listing of the MMU implementation called in the source code.

1.2 Related Documents

• Intel® 80312 I/O Companion Chip Developer’s Manual (273410).
• Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual (273411).
• Intel® IQ80310 Evaluation Platform Board Manual (273431).
• Intel® XScale™ Microarchitecture Coding Techniques White Paper (273578).

Other Application Notes and tools:
• http://www.intel.com/design/iio/docs/iop310.htm.
• http://www.intel.com/design/iio/devtools/tptools.htm.
• http://www.intel.com/design/intelxscale/.


2.0 Application Accelerator Unit

2.0.1 Overview

The AAU provides low-latency, high-throughput data transfer capability between the AAU and the Intel® 80200 processor based on XScale™ microarchitecture (ARM* architecture compliant) local memory. It executes data transfers to and from Intel® 80200 processor (80200) local memory and also provides the necessary programming interface. The Application Accelerator performs the following functions:
• Transfers data (read) from memory controller.
• Performs an optional boolean operation (XOR) on read data.
• Transfers data (write) to memory controller.

The AAU features:
• 1 KB store queue, arranged as 8-byte x 128-deep.
  — Configurable to a 512-byte store queue, arranged as 8-byte x 64-deep.
• Utilization of the Intel® 80312 I/O companion chip (80312) memory controller interface.
• 2^32 addressing range on the 80200 local memory interface.
• Hardware support for unaligned data transfers for the internal bus.
• Fully programmable from the 80200.
• Support for automatic data chaining for gathering and scattering of data blocks.

Figure 1 shows a simplified connection of the Application Accelerator to the 80312 internal bus.

Figure 1. Application Accelerator Unit

[Figure: the Application Accelerator Unit attached to the internal bus.]


3.0 Low-Level Design Document

3.1 Objective

This section presents the low level design details of the AAU API for Intel® XScale™ microarchitecture embedded Linux.

Table 1. Acronyms

Term  Definition
AAU   Application Accelerator Unit
API   Application Programming Interface
OS    Operating System
PCI   Peripheral Component Interconnect
SGL   Scatter-Gather List

3.1.1 AAU Implementation

3.1.1.1 Overview

The 80312 contains an AAU that implements the XOR algorithm in hardware. It is capable of performing an XOR operation on multiple blocks of source data and storing the result back in 80200 local memory. The embedded Linux for Intel® XScale™ microarchitecture does not currently support the AAU functionality of the 80312. As a result, it cannot take advantage of the AAU's XOR capabilities when it performs certain checksum calculations for a RAID 5 storage solution, and the XOR operations done in software cause a drastic performance hit. The implementation outlined here describes the changes that need to be made to embedded Linux for Intel® XScale™ microarchitecture in order to utilize the AAU and take advantage of the hardware acceleration.

The AAU API is intended to abstract the hardware away from driver developers and provide the functions a developer needs to utilize the AAU. The AAU contains the following registers:

Table 2. AAU Control Registers

Register  Register Name                             Description
ACR       Accelerator Control Register              Accelerator Control Word specifies parameters that dictate the overall operating environment such as enabling the accelerator and others.
ASR       Accelerator Status Register               Accelerator Control Status shows the status of the accelerator that includes transfer task done and errors.
ADAR      Descriptor Address Register               Address of Current Chain Descriptor is the address of the descriptor currently being processed.
ANDAR     Next Descriptor Address Register          Address of Next Chain Descriptor points to the next descriptor that is linked to the current descriptor. A NULL value indicates it is the end of the descriptor chain.
SAR[4]    80312 Local Source Address Registers      80200 Address of Source points to the local address of the source data.
DAR       80312 Local Destination Address Register  80200 Address of Destination points to the local memory region where the computed result is written to.
ABCR      Byte Counter Register                     The Byte Count contains the number of bytes to transfer for a XOR-transfer operation.
ADCR      Descriptor Control Register               Chain Descriptor Control Word contains control values for data transfer on a per-chain descriptor basis.
SARE[4]   Extended Local Source Address Registers   Additional 4 registers for source data (same as SAR).

Table 3 shows the various bits in the Descriptor Control (DC) field.

Table 3. DC Field Description

Bit    Default  Description
31     0        Destination Write Enable – This bit triggers the write back of the XOR operation result.
30:27  0        Reserved
26:25  00       Supplemental Block Control Interpreter – These two bits enable the extended source blocks:
                  00 – 0 additional blocks
                  01 – 4 additional blocks
                  10 – reserved
                  11 – reserved
24:01  0        Command Controls for all source blocks. The function performed on each block is either nothing or XOR. Block 1 (bits 03:01) can also be given the Direct Fill command (binary 111) instead of the XOR command (binary 001). Direct Fill puts the data directly into the buffer instead of XORing it with what is already in the buffer. This command is also useful when using the AAU for copying data blocks.
00     0        Interrupt Enable – When set, the AAU triggers an interrupt to the Intel® 80200 processor upon completion of the descriptor.
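As an illustration of how the DC field might be assembled, the following is a minimal sketch in C. The macro and function names are hypothetical and do not come from the source code in Appendix A. It assumes that the Direct Fill and XOR command values are the 3-bit patterns 111 and 001, and that block n owns the 3-bit command field that starts at bit 1 + 3*(n-1), which is extrapolated from Block 1 occupying bits 03:01 within the 24:01 range.

```c
#include <stdint.h>

/* Bit positions taken from Table 3. */
#define AAU_DC_IE          (1u << 0)   /* Interrupt Enable */
#define AAU_DC_SBCI_4_MORE (1u << 25)  /* bits 26:25 = 01: four additional blocks */
#define AAU_DC_DWE         (1u << 31)  /* Destination Write Enable */

/* 3-bit block commands (values assumed as described in the lead-in). */
#define AAU_CMD_NULL        0x0u       /* 000b: do nothing */
#define AAU_CMD_XOR         0x1u       /* 001b */
#define AAU_CMD_DIRECT_FILL 0x7u       /* 111b */

/* Assumption: block n (1..8) owns the 3-bit field starting at bit 1 + 3*(n-1). */
uint32_t aau_dc_block_cmd(unsigned block, uint32_t cmd)
{
    return (cmd & 0x7u) << (1u + 3u * (block - 1u));
}

/*
 * Hypothetical helper: build a DC word that XORs nsrc (1..8) source blocks.
 * Block 1 uses Direct Fill so that stale buffer contents are not mixed in.
 * Returns 0 for an out-of-range source count.
 */
uint32_t aau_dc_for_xor(unsigned nsrc, int irq_on_done)
{
    uint32_t dc;
    unsigned b;

    if (nsrc < 1u || nsrc > 8u)
        return 0u;

    dc = AAU_DC_DWE | aau_dc_block_cmd(1u, AAU_CMD_DIRECT_FILL);
    for (b = 2u; b <= nsrc; b++)
        dc |= aau_dc_block_cmd(b, AAU_CMD_XOR);
    if (nsrc > 4u)
        dc |= AAU_DC_SBCI_4_MORE;      /* enable SARE[0..3] */
    if (irq_on_done)
        dc |= AAU_DC_IE;
    return dc;
}
```

With these assumptions, a four-source XOR with the interrupt enabled yields 0x8000049F. The exact field encodings should be checked against the Intel® 80312 I/O Companion Chip Developer’s Manual before use.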

3.1.2 Assumptions

In the Linux environment, memory is cached unless otherwise stated. Cache coherency must be maintained by the AAU API by performing cleaning or invalidating at appropriate data locations.

The AAU API assumes that the application driver that utilizes the API follows strict usage guidelines outlined in this document.


3.1.3 Initialization

Initialization is done during kernel initialization. The AAU is initialized after the interrupt controller has been initialized during kernel setup. The AAU registers are all in default reset values before initialization. Following is the AAU initialization sequence:
• Disable accelerator by clearing the ACR register.
• Setup and initialize all resource queues and stack.
• Setup and initialize all spinlocks.
• Allocate a number of AAU hardware descriptors.
• Align hardware descriptors to eight 32-bit word boundaries.
• Allocate a corresponding number of AAU software descriptors.
• Link each hardware descriptor to software descriptor.
• Put software descriptors on the free resource stack.
• Assign appropriate interrupt numbers.
• Assign proper registers.
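The alignment step in the sequence above can be sketched as follows. This is a user-space illustration only: malloc() stands in for the kernel allocator, and the helper names are hypothetical. The raw pointer is returned separately because it, not the aligned pointer, must be passed to free() later.

```c
#include <stdint.h>
#include <stdlib.h>

#define AAU_DESC_ALIGN 32u  /* eight 32-bit words */

/* Round an address up to the next 32-byte boundary. */
uintptr_t aau_align_up(uintptr_t addr)
{
    return (addr + (AAU_DESC_ALIGN - 1u)) & ~(uintptr_t)(AAU_DESC_ALIGN - 1u);
}

/*
 * Hypothetical helper: over-allocate by (alignment - 1) bytes and hand back
 * an aligned pointer. *raw_out receives the unaligned address for free().
 */
void *aau_alloc_aligned(size_t size, void **raw_out)
{
    void *raw = malloc(size + AAU_DESC_ALIGN - 1u);

    if (raw == NULL)
        return NULL;
    *raw_out = raw;
    return (void *)aau_align_up((uintptr_t)raw);
}
```

Keeping both addresses mirrors the idea of recording the unaligned descriptor address alongside the aligned one, so the original allocation can still be released.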

3.1.4 AAU Data Structures

The data structure in Table 4 maps directly onto the AAU registers to allow easy register access.

Table 4. AAU Registers

    typedef struct _aau_regs_t
    {
        volatile u32 ACR;    /* Accelerator Control Register */
        volatile u32 ASR;    /* Accelerator Status Register */
        volatile u32 ADAR;   /* Descriptor Address Register */
        volatile u32 ANDAR;  /* Next Desc Address Register */
        volatile u32 LSAR;   /* Local Source Address */
        volatile u32 LDAR;   /* Local Destination Address */
        volatile u32 ABCR;   /* Byte Count */
        volatile u32 ADCR;   /* Descriptor Control */
    } aau_regs_t;


To start an AAU operation, an AAU hardware descriptor chain is built in local memory. The hardware descriptor must be aligned on an 8-word boundary and comprises eight contiguous words, plus four optional words for the extended source addresses. The hardware descriptor format is illustrated in Table 5. One or more hardware descriptors form an AAU descriptor chain.

Table 5. AAU Hardware Descriptor Format

Next Descriptor Address (NDA)
Source Address (SAR[0])
Source Address (SAR[1])
Source Address (SAR[2])
Source Address (SAR[3])
Destination Address (DAR)
Byte Count (BC)
Descriptor Control (DC)
Source Address (SARE[0]) [optional]
Source Address (SARE[1]) [optional]
Source Address (SARE[2]) [optional]
Source Address (SARE[3]) [optional]

The NDA points to the next descriptor, thus forming a chain. The chain is terminated by a null-valued NDA. The descriptor provides pointers to four source addresses, which supply the source data for the XOR computation. The result of the XOR computation is written to the local memory location pointed to by the DAR. The BC field contains the number of bytes in the block of data at each source address. All blocks of data pointed to by the source addresses have the same size. For example, when SAR[0] points to 1024 bytes of data, each of the other valid source addresses must also point to a 1024-byte block. A bit in the DC field enables four additional source address fields for processing when more than four data sources are required for the XOR computation. The optional fields shall not be used until all four existing source fields are utilized. The DC field also contains various mode bits that control operations on a per-descriptor basis.
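For reference, the computation that a single descriptor requests can be written in software as below. This is only a plain C model of the result described above (every source block has the same byte count, and the destination receives the XOR of all of them); the function name is hypothetical and the code says nothing about how the hardware performs the operation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Software model of one descriptor:
 * dest[i] = src[0][i] ^ src[1][i] ^ ... ^ src[nsrc-1][i], for i < byte_count.
 */
void xor_blocks(uint8_t *dest, const uint8_t *const src[],
                size_t nsrc, size_t byte_count)
{
    size_t i, s;

    for (i = 0; i < byte_count; i++) {
        uint8_t acc = 0;

        for (s = 0; s < nsrc; s++)
            acc ^= src[s][i];
        dest[i] = acc;
    }
}
```

In a RAID 5 setting the same routine serves both directions: XORing the data blocks produces the parity block, and XORing the parity block with the surviving data blocks rebuilds a missing one.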

The hardware descriptor for the AAU is presented in Table 6. This format is required by the AAU hardware. Source addresses 5 through 8 are optional. Any source address field that is not used must contain the NULL value. When any source address contains the NULL value, all the following source addresses must also contain the NULL value. All the source addresses and the destination address must be 80200 local addresses, and they must be physical addresses rather than virtual ones.

Table 6. AAU Hardware Descriptor

    typedef struct _aau_desc_t
    {
        u32 NDA;      /* Next Descriptor Address */
        u32 SAR[4];   /* Source Addresses 0-3 */
        u32 DAR;      /* Destination Address */
        u32 BC;       /* Byte Count */
        u32 DC;       /* Descriptor Control */
        u32 SARE[4];  /* Extended Source Addresses 0-3 */
    } aau_desc_t;
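The NULL-ordering rule for source addresses can be checked with a few lines of C. The sketch below repeats the Table 6 layout so that it is self-contained; the function aau_desc_count_sources() is a hypothetical helper, not part of the AAU API.

```c
#include <stdint.h>

typedef uint32_t u32;

typedef struct _aau_desc_t
{
    u32 NDA;      /* Next Descriptor Address */
    u32 SAR[4];   /* Source Addresses 0-3 */
    u32 DAR;      /* Destination Address */
    u32 BC;       /* Byte Count */
    u32 DC;       /* Descriptor Control */
    u32 SARE[4];  /* Extended Source Addresses 0-3 */
} aau_desc_t;

/*
 * Hypothetical helper: returns the number of source addresses in use (0..8),
 * or -1 when a non-NULL source follows a NULL one, which the rule forbids.
 */
int aau_desc_count_sources(const aau_desc_t *d)
{
    u32 all[8];
    int i, n = 0, seen_null = 0;

    for (i = 0; i < 4; i++) {
        all[i] = d->SAR[i];
        all[4 + i] = d->SARE[i];
    }
    for (i = 0; i < 8; i++) {
        if (all[i] == 0)
            seen_null = 1;
        else if (seen_null)
            return -1;
        else
            n++;
    }
    return n;
}
```

A check like this could run before a descriptor is queued, at the cost of a few cycles per descriptor.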


A software descriptor is created to encapsulate each AAU hardware descriptor. The software descriptor contains additional information and status about the hardware descriptor that the hardware descriptor itself does not hold. The software descriptor also enables the use of stack and queue data structures to keep track of and manipulate the hardware descriptors without making any format changes to the hardware descriptor. A pool of software descriptors is allocated during initialization and put on a stack. An equal number of hardware descriptors are created and encapsulated by the software descriptors. The resource pool removes the performance penalty of dynamically allocating descriptors during operation.

The data structure in Table 7 describes the AAU software descriptor.

Table 7. AAU Software Descriptor Structure

    typedef struct _sw_aau_t
    {
        aau_desc_t aau_desc;     /* AAU HW desc */
        u32 status;              /* AAU Status */
        struct _aau_sgl *next;   /* pointer to next sgl */
        void *dest;              /* Destination */
        void *src[4];            /* Source */
        void *ext_src[4];        /* Extended Source */
        u32 total_src;           /* total src addresses */
        struct list_head link;   /* link to queue */
        u32 aau_phys;            /* AAU Physical Addr */
        u32 desc_addr;           /* HW unaligned addr */
        u32 sgl_head;            /* User SGL head Addr */
        struct _sw_aau_t *head;  /* Head of list */
        struct _sw_aau_t *tail;  /* Tail of list */
    } sw_aau_t;
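A pre-allocated pool kept on a free stack can be sketched as follows. The types and function names are hypothetical and heavily simplified; in particular, the spinlock that would protect the stack in the kernel is omitted.

```c
#include <stddef.h>

/* Simplified stand-in for a software descriptor. */
typedef struct sw_desc {
    struct sw_desc *next_free;  /* link used only while on the free stack */
    int id;
} sw_desc_t;

typedef struct {
    sw_desc_t *top;
} free_stack_t;

/* Return a descriptor to the pool (LIFO). */
void stack_push(free_stack_t *s, sw_desc_t *d)
{
    d->next_free = s->top;
    s->top = d;
}

/* Take a descriptor from the pool, or NULL when the pool is exhausted. */
sw_desc_t *stack_pop(free_stack_t *s)
{
    sw_desc_t *d = s->top;

    if (d != NULL)
        s->top = d->next_free;
    return d;
}

/* One-time setup: push every pre-allocated descriptor onto the stack. */
void pool_init(free_stack_t *s, sw_desc_t *pool, size_t count)
{
    size_t i;

    s->top = NULL;
    for (i = 0; i < count; i++) {
        pool[i].id = (int)i;
        stack_push(s, &pool[i]);
    }
}
```

Both operations are constant time and never call an allocator, which is the property the resource pool is meant to provide during operation.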

The AAU shall also have a global device descriptor that allows access to the accelerator registers, processing queues, queue locks, and accelerator status.


The data structure in Table 8 describes the AAU device. It keeps track of all the variables that are related to the AAU.

Table 8. AAU Device Descriptor

    typedef struct _iop310_aau_t
    {
        const char *dev_id;         /* Device ID */
        list_t process_q;           /* Processing Q */
        list_t holding_q;           /* Holding Q */
        spinlock_t lock_pq;         /* PQ spinlock */
        spinlock_t lock_hq;         /* HQ spinlock */
        aau_regs_t *regs;           /* AAU registers */
        int irq;                    /* IRQ number */
        sw_aau_t *last_aau;         /* ptr to last AAU desc */
        struct tq_struct aau_task;  /* AAU task entry */
        wait_queue_head_t wait_q;   /* AAU wait queue */
        atomic_t ref_count;         /* AAU Reference count */
    } iop310_aau_t;

The following structures represent the data format that applications use to pass data to the AAU API. The application creates an SGL header that points to an SGL. When no callback function is required, the callback field must be set to NULL. The status field should be zeroed before the header is passed down. The end of the list is always marked by a NULL next_sgl pointer in the last SGL element.

Table 9. User SGL Header

    typedef struct _aau_sgl_head_t
    {
        u32 total;                /* total SGLs */
        aau_sgl_t *list;          /* Pointer to list head */
        u32 status;               /* SG status */
        aau_callback_t callback;  /* Callback func ptr */
    } aau_sgl_head_t;


The AAU descriptor data structure is filled out by the user. It maps over the first portion of the software descriptor sw_aau_t data structure. Casting eliminates copying values from one data structure to another. When the user passes a correct user SGL list, all the API has to do is re-cast the list into software descriptors and feed it to the processing queue. This requires slightly more knowledge of the AAU fields on the user's part, but improves the performance of the AAU operation considerably.

Table 10. AAU User SGL Structure

    typedef struct _aau_sgl_t
    {
        aau_desc_t aau_desc;
        u32 status;
        struct _aau_sgl_t *next_sgl;  /* Pointer to next SG */
        void *dest;
        void *src[4];                 /* Source group 1 */
        void *src_ext[4];             /* Source group 2 */
        u32 total_src;                /* Total number of sources passed down */
    } aau_sgl_t;
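The casting idea can be illustrated with a reduced example. Note that this sketch is a variation rather than the layout used in the tables: instead of two parallel structure definitions that share their leading fields, it embeds the user-visible structure as the first member of the private one, a form for which standard C guarantees that a pointer to the outer structure and a pointer to its first member can be converted to each other. All type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

typedef struct {
    u32 NDA, SAR[4], DAR, BC, DC, SARE[4];
} hw_desc_t;

/* What the caller sees and fills in. */
typedef struct user_sgl {
    hw_desc_t aau_desc;
    u32 status;
    struct user_sgl *next_sgl;
    u32 total_src;
} user_sgl_t;

/* What the driver allocates: user view first, private bookkeeping after. */
typedef struct sw_desc {
    user_sgl_t user;   /* must stay the first member */
    u32 aau_phys;      /* driver-only: physical address of aau_desc */
} sw_desc_t;

/* Hand a pooled descriptor to the caller without copying anything. */
user_sgl_t *sw_to_user(sw_desc_t *sw)
{
    return &sw->user;
}

/* Recover the full descriptor from the pointer the caller passes back.
 * Valid only for pointers that originally came from sw_to_user(). */
sw_desc_t *user_to_sw(user_sgl_t *u)
{
    return (sw_desc_t *)u;
}
```

The conversion is only safe because the caller's buffers are themselves software descriptors handed out by the driver, which is the role a function such as aau_get_buffer() plays.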

3.1.5 Data Path

The following is required for an application to utilize the AAU hardware through the AAU API. The application must first attempt to request the usage of the AAU by calling the aau_request() function. This function requests and registers an interrupt for the AAU. When successful, the application is allowed to use the AAU. The API also keeps track of the usage of the AAU by using a reference count method. When unsuccessful the error –EBUSY is returned to the caller.

Driver applications are required to create a scatter-gather list (SGL) in the aau_sgl_t format with all information for the AAU operation filled in. The driver application is responsible for allocating and keeping track of the memory that stores the AAU input data and result. The application calls the aau_queue_buffer() function to pass down the user SGL. The AAU API generates an AAU descriptor chain from the passed-down SGL using AAU software descriptors from the free AAU resource stack. When no free software descriptors are available, the API sleeps for a short period and then tries again, up to ten times, before giving up and returning the -ENOMEM error. The function sets the Interrupt Enable bit in the DC field of the last hardware descriptor in the chain to indicate the end of the chain. The function then queues the AAU chain into the processing queue and calls aau_start() for the application. The aau_start() function checks whether the AAU is active. If it is not active, this is a new operation, which requires setting the appropriate bits, linking the chain accordingly, and starting the AAU. If it is active, this is an ongoing operation, which requires appending to the existing chain and setting the chain resume bit. At this point aau_queue_buffer() returns control to the application while the AAU does its work.

The application has two choices in handling the result of AAU completion:
1. Sleep on the AAU's wait queue until notified by the bottom half interrupt handler that the operation is complete.
2. Continue, and be notified through the callback function supplied in the passed-down SGL when the operation on the chain is complete.
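The bounded retry when the free stack is empty can be sketched as below. The function and type names are hypothetical, the sleep is abstracted behind a callback so the example stays portable, and "ten times" is read here as ten attempts in total, which is an assumption. The two fake_ functions exist only to exercise the loop.

```c
#include <errno.h>
#include <stddef.h>

#define AAU_MAX_TRIES 10  /* assumption: ten attempts in total */

typedef void *(*try_get_fn)(void *ctx);  /* returns NULL when nothing is free */
typedef void (*nap_fn)(void *ctx);       /* short sleep between attempts */

/* Try to obtain a descriptor, napping between failed attempts. */
int get_desc_with_retry(try_get_fn try_get, nap_fn nap, void *ctx, void **out)
{
    int attempt;

    for (attempt = 0; attempt < AAU_MAX_TRIES; attempt++) {
        void *d = try_get(ctx);

        if (d != NULL) {
            *out = d;
            return 0;
        }
        if (nap != NULL && attempt + 1 < AAU_MAX_TRIES)
            nap(ctx);
    }
    return -ENOMEM;
}

/* Demo stubs: a pool that starts handing out its item on call number succeed_on. */
struct fake_pool {
    int calls;
    int succeed_on;
    int naps;
    int item;
};

void *fake_try_get(void *ctx)
{
    struct fake_pool *p = ctx;

    p->calls++;
    return (p->calls >= p->succeed_on) ? (void *)&p->item : NULL;
}

void fake_nap(void *ctx)
{
    ((struct fake_pool *)ctx)->naps++;
}
```

Bounding the loop keeps a caller from blocking forever when descriptors are never returned to the pool.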

The AAU meanwhile processes the chain and triggers an interrupt when it encounters the Interrupt Enable Bit being set in a descriptor being processed or an error condition is encountered.


When an AAU interrupt is asserted, the interrupt handler function aau_irq_handler() is called. Clearing an interrupt requires clearing it at the source, which in this case is the Accelerator Status Register. Once the Accelerator Status Register has been cleared in the interrupt handler, the hardware clears the interrupt assert bit as a consequence. As long as the AAU interrupt remains asserted due to new AAU interrupts, the interrupt handler continues to remove descriptors from the channel process queue and put them in the channel holding queue until the ADAR value equals the address of the descriptor or the queue is empty. When the ADAR equals the descriptor address and the ASR indicates that the channel is active, that descriptor is not removed. Once the interrupt handler no longer sees an AAU interrupt asserted, it schedules a bottom half handler in the immediate task queue to process the holding queue and notify the application of the progress of the AAU operation.

The application calls the function aau_free() when it no longer needs the AAU and wants to release it. Depending on the reference count, the IRQ requested for the AAU may be freed. When the AAU encounters any errors, the AAU registers are cleared, all resources are returned, and the reference count is reset to 0.

Figure 2 shows the state trace diagram for a normal operation of the AAU. The diagram demonstrates all the necessary function calls that are performed during a normal, simple AAU execution path. The section explaining the APIs in detail follows.

Figure 2. AAU State Trace Diagram

[Figure: trace across the User, AAU, INTC, and System lanes. Calls shown, in order: aau_init(); aau_get_buffer(); aau_queue_buffer(); aau_start(); AAU Complete; aau_irq_handler(); Sleep on wait queue or Proceed on callback; aau_process(); aau_task(); Callback or Wake if Sleeping; aau_return_buffer(); aau_free().]


3.1.6 API Functions

The following functions shall be implemented to support the AAU API for Intel® XScale™ microarchitecture embedded Linux.

3.1.6.1 API Listing

3.1.6.1.1 AAU Public

int aau_request(u32 *aau_context);

int aau_suspend(u32 aau_context);

int aau_resume(u32 aau_context);

int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);

int aau_free(u32 aau_context);

aau_sgl_t* aau_get_buffer(u32 aau_context, u32 num_buff);

void aau_return_buffer(u32 aau_context, sgl_list_t *list);

int aau_memcpy(void *, void *, u32);

3.1.6.1.2 AAU Private (Static)

static int __init aau_init(void);

static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);

static int aau_flush_all(u32 aau_context);

static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);

static void aau_process(iop310_aau_t *aau);

static void aau_result_handler(void *aau);


3.1.6.2 Selected API Descriptions

3.1.6.2.1 static int __init aau_init(void);

Input: N/A
Output: Success -- OK
        Error -- -ENOMEM
Purpose: This function initializes the AAU during kernel init. The function initializes all the variables to ready state and allocates memory for the resource pools. The AAU is at post reset state at this point. After initialization the AAU should be in the idle state.

Operation:
• Initialize free resource stack
• Initialize stack lock
• Allocate memory for software descriptors
  — Return error if fail
• Align memory on 8-word boundary
  — Return error if fail
• Push software descriptors onto free resource stack
• Set register addresses for AAU
• Initialize AAU queues and locks
• Initialize wait queue
• Assign interrupt number
• Initialize AAU reference count
• Initialize interrupt bottom handler for immediate process queue
• Zero out ACR


3.1.6.2.2 static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);

Input: aau – pointer to AAU device descriptor
       aau_chain – pointer to AAU descriptor chain to be sent to AAU
Output: Success/Error condition
Purpose: This function starts the AAU or appends an AAU chain and resumes the operation when a chain is being processed.

Operation:
• If AAU not active
  — Write AAU descriptor address to ANDAR
  — Set enable accelerator bit in ACR
• Else
  — Link chain to last AAU list tail ANDAR
  — Flush cache for range of tail descriptor ANDAR
  — If channel no longer active
    • Set chain resume bit in ACR
• Set last descriptor pointer in AAU device descriptor
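The start-or-append decision can be modelled against a register block held in ordinary memory. Everything in this sketch is hypothetical: the type and function names are invented, the bit positions chosen for the enable, chain resume, and active flags are assumptions that must be checked against the Intel® 80312 I/O Companion Chip Developer’s Manual, and the cache maintenance step is reduced to a comment. Following the Data Path description in Section 3.1.5, the sketch sets the chain resume bit whenever it appends to an active chain.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

#define ACR_ENABLE       (1u << 0)   /* assumed bit position */
#define ACR_CHAIN_RESUME (1u << 1)   /* assumed bit position */
#define ASR_ACTIVE       (1u << 10)  /* assumed bit position */

typedef struct {
    volatile u32 ACR, ASR, ADAR, ANDAR;
} regs_t;

typedef struct desc {
    u32 NDA;            /* next descriptor (physical address) */
    u32 phys;           /* physical address of this descriptor */
    struct desc *tail;  /* last descriptor of the chain headed here */
} desc_t;

typedef struct {
    regs_t *regs;
    desc_t *last;       /* last descriptor handed to the AAU */
} dev_t;

int start_sketch(dev_t *dev, desc_t *chain)
{
    if (!(dev->regs->ASR & ASR_ACTIVE)) {
        /* Idle: point the AAU at the new chain and enable it. */
        dev->regs->ANDAR = chain->phys;
        dev->regs->ACR |= ACR_ENABLE;
    } else {
        /* Busy: link the new chain behind the previous tail. */
        dev->last->NDA = chain->phys;
        /* A real driver would clean the cache line of the old tail here. */
        dev->regs->ACR |= ACR_CHAIN_RESUME;
    }
    dev->last = chain->tail;
    return 0;
}
```

Remembering the tail of the most recent chain is what makes the append a constant-time operation.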

3.1.6.2.3 int aau_request(u32 *aau_context);

Input: aau_context – pass by reference AAU context. Written back by function.
Output: success -- OK
        failed -- -EINVAL
Purpose: This function requests an interrupt for the AAU from the kernel and returns the AAU descriptor to the driver application.

Operation:
• Register IRQ with kernel
• Increment reference count of AAU
• Return AAU device descriptor to user

3.1.6.2.4 int aau_suspend(u32 aau_context);

Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function suspends the AAU operation. It calls aau_stop() to perform the operation.

Operation:
• Unset bit in ACR that enables AAU operation


3.1.6.2.5 int aau_resume(u32 aau_context);

Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function resumes the AAU operation.

Operation:
• If ASR contains errors
  — Clear errors
  — Flush AAU pipeline
  — Return with error
• Set enable bit in ACR

3.1.6.2.6 int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);

Input: aau_context – AAU device context
       sgl – User SGL for AAU to transform to AAU descriptor chain
Output: Success/Error condition
Purpose: This function converts the user SGL to an AAU descriptor chain. The function then puts the chain in the processing queue and starts the AAU.

Operation:
• For all elements in SGL
  — Get AAU software descriptor from free resource stack
  — Convert to AAU descriptor
  — Init appropriate variables in AAU software descriptor
  — Flush cache in appropriate regions
  — Link up AAU chain
• Call aau_start() and pass AAU chain


3.1.6.2.7 static int aau_flush_all(u32 aau_context);

Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function flushes the AAU pipeline, returns all resources to the free stack and clears any of the error conditions. This function is only called by the interrupt handler for handling errors.

Operation:
• For all descriptors in processing queue
  — Remove from processing queue
  — Set AAU_INCOMPLETE status mode in descriptor status
  — Put in holding queue
• Clear ASR

3.1.6.2.8 int aau_free(u32 aau_context);

Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function attempts to release the IRQ held by AAU.

Operation:
• Decrement AAU reference counter
• If AAU ref count <= 1
  — Free IRQ

3.1.6.2.9 static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);

Input: irq – IRQ number
       dev_id – Device descriptor
       regs – CPU registers (not used but required)
Output: N/A
Purpose: This is the interrupt handler for AAU interrupts. It handles error interrupts or chain complete interrupts depending on the status in the ASR. A bottom handler queued in the immediate task queue by this function begins to process everything in the holding queue once this function exits and the kernel leaves interrupt context.

Operation:
• If not AAU interrupt
  - Exit
• If AAU error
  - Call aau_flush_all()
• While AAU complete INTs
  - Clear ASR
  - Call aau_process()
• Register bottom handler

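Figure 3. Interrupt Handler Functional Flow Diagram

[Flow diagram not reproduced. Steps shown: enter the interrupt handler; check whether the interrupt number is for the AAU; check for errors in the ASR and, if any are found, flush the AAU and clear the errors; move descriptors from the processing queue to the holding queue until done; schedule the bottom handler; exit.]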

3.1.6.2.10 static void aau_process(iop310_aau_t *aau);

Input: aau – AAU device descriptor
Output: N/A
Purpose: This function removes the completed descriptors from the processing queue and puts them in the holding queue to be processed later by the bottom handler. This function is only called by the interrupt handler.

Operation:
• Do while descriptor address != ADAR and queue not empty
  - Remove from processing queue
  - Put on holding queue
  - If IE bit set in ADCR, set AAU_DONE on chain head descriptor
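The loop above can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver implementation: the queue type and all function names are hypothetical, the ADAR value is passed in as a plain integer, and the descriptor whose address equals ADAR is treated as still in progress.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_node {
    uint32_t phys;             /* physical address of the descriptor */
    struct sketch_node *next;
};

/* Hypothetical singly linked FIFO. */
struct sketch_fifo {
    struct sketch_node *first;
    struct sketch_node *last;
};

void fifo_push(struct sketch_fifo *q, struct sketch_node *n)
{
    n->next = NULL;
    if (q->last != NULL)
        q->last->next = n;
    else
        q->first = n;
    q->last = n;
}

struct sketch_node *fifo_pop(struct sketch_fifo *q)
{
    struct sketch_node *n = q->first;

    if (n != NULL) {
        q->first = n->next;
        if (q->first == NULL)
            q->last = NULL;
    }
    return n;
}

/*
 * Move descriptors from the processing queue to the holding queue until
 * the descriptor the hardware is still working on (adar) is reached or
 * the processing queue is empty. Returns the number of descriptors moved.
 */
int sketch_process(struct sketch_fifo *process_q, struct sketch_fifo *hold_q,
                   uint32_t adar)
{
    int moved = 0;

    while (process_q->first != NULL && process_q->first->phys != adar) {
        fifo_push(hold_q, fifo_pop(process_q));
        moved++;
    }
    return moved;
}
```

Because both queues are FIFOs, descriptors reach the holding queue in the order in which they were submitted. The driver in Appendix A additionally protects each queue with its own spinlock.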

3.1.6.2.11 static void aau_result_handler(void *aau);

Input: *aau – AAU device descriptor
Output: N/A
Purpose: This function is scheduled by the interrupt handler to finish processing AAU descriptors after the INT handler is done and exits interrupt context. It notifies the driver that requested the AAU operation, either by waking the driver up if it is sleeping or by calling a callback function provided by the driver.

Operation:
• Do while descriptor status == AAU_DONE
  - Remove descriptor from holding queue
  - Set status on user SGL
  - Return descriptor to free stack
  - If callback function exists
    • Call callback
  - Else if sleeping on wait queue
    • Wake up sleeping process

3.1.6.2.12 aau_sgl_t * aau_get_buffer(u32 aau_context, u32 num_buf);

Input: aau_context – AAU device context
       num_buf – number of buffers to acquire
Output: aau_sgl_t * – chain of acquired AAU buffers, NULL if failed
Purpose: This function is used to acquire a chain of user SGL buffers. After obtaining the list, the user needs to fill it out, link it to an SGL head and pass it to the aau_queue_buffer() function.

Operation:
• While free stack not empty
  - Acquire buffer
  - If failed
    • Retry
    • If retry fails
      - Return all acquired buffers
      - Return NULL
  - Fill out necessary fields
  - Link buffer to list
• Return list
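The acquire-with-rollback pattern can be sketched as below. This is a minimal user-space illustration: the types and function names are hypothetical, the retry step is omitted, and no locking is shown.

```c
#include <stddef.h>

struct sketch_buf {
    struct sketch_buf *next;
};

/* Hypothetical LIFO free stack: push and pop happen at the top. */
struct sketch_stack {
    struct sketch_buf *top;
};

void stack_push(struct sketch_stack *s, struct sketch_buf *b)
{
    b->next = s->top;
    s->top = b;
}

struct sketch_buf *stack_pop(struct sketch_stack *s)
{
    struct sketch_buf *b = s->top;

    if (b != NULL)
        s->top = b->next;
    return b;
}

/*
 * Acquire num_buf buffers as a NULL-terminated list. If the stack runs
 * dry, every buffer taken so far is pushed back and NULL is returned.
 */
struct sketch_buf *sketch_get_buffers(struct sketch_stack *s, int num_buf)
{
    struct sketch_buf *list = NULL;
    int i;

    for (i = 0; i < num_buf; i++) {
        struct sketch_buf *b = stack_pop(s);

        if (b == NULL) {
            while (list != NULL) {      /* roll back partial acquisition */
                struct sketch_buf *n = list->next;

                stack_push(s, list);
                list = n;
            }
            return NULL;
        }
        b->next = list;                 /* link buffer to list */
        list = b;
    }
    return list;
}
```

Rolling back on failure keeps the free stack at its original size, so a caller that receives NULL can simply try again later without leaking descriptors.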

3.1.6.2.13 void aau_return_buffer(u32 aau_context, sgl_list_t *list);

Input: aau_context – AAU device context
       *list – SGL list to be returned
Output: N/A
Purpose: This function takes the SGL list passed in by the user and returns it to the free stack.

Operation:
• While not end of list
  - Put SGL element on free stack

4.0 Code Commentary

4.1 Section Objectives

Primary Objective: To identify and describe aspects of the implementation that relate to 80310 hardware and standard operating system issues.

Secondary Objective: Provide additional background on the Linux APIs to facilitate reading the code and understanding the implementation.

This code was written to be integrated in the Linux Kernel. Therefore Linux data structures and APIs defined and optimized by the Linux community are used.

The recommended approach to understanding the code is to begin with aau_init() and follow the function call sequence in Figure 2, “AAU State Trace Diagram” on page 16. Also see the sections provided for additional implementation support:
• Appendix B, “Example Calling Source Code”
• Appendix C, “MMU Functions for Intel® XScale™ Microarchitecture”

4.1.1 File Organization Overview

There are three files included in Appendix A: • \include\aau.h • \src\aau.h • \src\aau.c

File \include\aau.h includes the public definitions and function APIs. Note that the public data structure definition of struct aau_sgl_t is cast to private definition struct sw_aau_t.

Files \src\aau.h and \src\aau.c include private definitions, APIs and function calls. APIs that are static are private and local to the file; those that are not static are public calls. The static modifier localizes a function to the C file, and its symbol is not exported.

4.1.1.1 Key Data Structure and Use of Casting

The primary data structure used by the application to initiate an AAU transaction is struct aau_sgl_t (see code line 72). When the application fills out the source and destination addresses in the descriptor, physical addresses, not virtual addresses, are required. The aau_sgl_t is cast to the data structure sw_aau_t for processing (line 444). Note that the descriptors are chained together within function aau_queue_buffer(), line 463.
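The public-to-private cast can be illustrated with a small sketch. The appendix code repeats the public fields at the start of the private structure; the sketch below instead embeds the public structure as the first member of the private one, which C guarantees to sit at offset zero. All type and function names here are hypothetical and the structures are shortened.

```c
#include <stddef.h>
#include <stdint.h>

/* Public view handed to the application (hypothetical, shortened). */
struct pub_sgl {
    uint32_t status;
    struct pub_sgl *next;
};

/* Private view: the public part first, then driver-only bookkeeping. */
struct priv_sgl {
    struct pub_sgl pub;   /* must stay the first member */
    uint32_t phys;        /* driver-only field, invisible to the caller */
};

/* The driver allocates the private type and exposes the public view. */
struct pub_sgl *sketch_expose(struct priv_sgl *p)
{
    return &p->pub;
}

/*
 * Recover the private view from a public pointer. This is valid only
 * for pointers that were originally produced by sketch_expose().
 */
struct priv_sgl *sketch_recover(struct pub_sgl *u)
{
    return (struct priv_sgl *)u;
}
```

The cast is only safe when every public object is really allocated as the private type, which is why the buffers must come from the driver's own allocator rather than from the application.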

4.1.2 Cache Memory

The following function is required for AAU descriptors because the programmer must manage coherence between cache memory and RAM. The AAU engine reads the AAU descriptors from RAM, so the programmer must flush values held in cache to RAM (see Appendix C in this document for implementation).
• cpu_xscale_dcache_clean_range(start, end)
  For the specified virtual address range, ensure that all caches contain clean data, such that peripheral accesses to the physical RAM fetch correct data.
  start: virtual start address
  end: virtual end address
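A clean-range routine works on whole cache lines, so the number of lines written back depends on how the descriptor is aligned. The arithmetic can be sketched as follows. The function names are hypothetical, and the 32-byte line size is an assumption that should be checked against the processor documentation.

```c
#include <stdint.h>

#define SKETCH_LINE_SIZE 32u   /* assumed data cache line size in bytes */

/* Round an address down to the start of its cache line. */
uintptr_t sketch_line_floor(uintptr_t addr)
{
    return addr & ~(uintptr_t)(SKETCH_LINE_SIZE - 1);
}

/*
 * Count the cache lines touched by the byte range [start, end).
 * A clean-range routine has to write back each of these lines.
 */
unsigned sketch_lines_in_range(uintptr_t start, uintptr_t end)
{
    if (end <= start)
        return 0;
    return (unsigned)((sketch_line_floor(end - 1) - sketch_line_floor(start))
                      / SKETCH_LINE_SIZE) + 1;
}
```

Under this assumption, a 48-byte descriptor (AAU_DESC_SIZE) that starts on a line boundary covers two lines, while one that starts 24 bytes into a line covers three.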

4.1.3 Other AAU Hardware

The AAU hardware is described in the Intel® 80312 I/O Companion Chip Developer’s Manual pages 10-1 through 10-33. For register definitions see pages 10-23 through 10-31.

In the Appendix A code, see Descriptor Control Register (DC) bit definitions line 40 through line 54. For Accelerator Control Register (ACR) and Accelerator Status Register (ASR) see bit definitions at lines 124 through 136.

The memory-mapped registers are referenced through #defines. See examples in code lines 301, 305 and 320.
• IOP310_AAUANDAR - Address of Accelerator Next Descriptor Address Register
• IOP310_AAUACR - Address of Accelerator Control Register
• IOP310_AAUASR - Address of Accelerator Status Register

4.1.4 Virtual to Physical memory

Cache flush/invalidate operations and memory-mapped register accesses use virtual memory addresses.

AAU descriptor operations work on physical memory and require physical addresses. For an example, see Appendix A.3, line 895.

4.1.5 Interrupt Handling

Linux interrupt handling is split between top and bottom halves. The top half interrupt handler is called when the hardware interrupt is invoked and performs only minimal critical tasks, including scheduling the bottom half handlers. Bottom half handlers are scheduled by marking the handler for future execution.

Three status registers are involved in interrupt handling. Clearing an interrupt requires clearing it at the source, which in this case is the Accelerator Status Register. Clearing the interrupt requires writing a 1 to the bit to be cleared (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 1-7, section 1.4.2). The three registers are:
• FIQ1 Interrupt Status Register (IOP310_FIQ1ISR). Appendix A, lines 637 & 666. This register is used to determine the cause of the interrupt. If bit 5 is set, an Application Accelerator interrupt is pending (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-12).
• Accelerator Status Register (IOP310_AAUASR). Appendix A, lines 648 & 669. This register contains the AAU status flags. The interrupt is cleared by writing 1s to the set bits.
• IRQ Interrupt Status Register (not used in this code). Bit 10 indicates an Application Accelerator Unit error (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-15).
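The write-1-to-clear behavior of the status register can be modeled in a few lines. This is a software model only: the function names are hypothetical, and the bit values mirror the AAU_ASR_* definitions in Appendix A.2.

```c
#include <stdint.h>

#define SKETCH_ASR_MA_ABORT 0x00000020u  /* mirrors AAU_ASR_MA_ABORT */
#define SKETCH_ASR_EOC      0x00000100u  /* mirrors AAU_ASR_DONE_EOC */
#define SKETCH_ASR_EOT      0x00000200u  /* mirrors AAU_ASR_DONE_EOT */

/*
 * Model of a write-1-to-clear register: every bit written as 1 is
 * cleared, every bit written as 0 is left unchanged.
 * Returns the register contents after the write.
 */
uint32_t sketch_w1c_write(uint32_t reg, uint32_t value)
{
    return reg & ~value;
}

/* Acknowledge only the completion flags that are currently set. */
uint32_t sketch_ack_done(uint32_t asr)
{
    uint32_t pending = asr & (SKETCH_ASR_EOC | SKETCH_ASR_EOT);

    return sketch_w1c_write(asr, pending);
}
```

Writing back only the completion bits leaves a pending error flag untouched, so an error raised at the same time as a completion is still visible to the error path.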

4.1.5.1 Top Half Interrupt Handler: aau_irq_handler()

See Appendix A, line 629.

The following statuses are obtained: • FIQ1 Interrupt Status Register (IOP310_FIQ1ISR). Appendix A, lines 637 & 666. • Accelerator Status Register (IOP310_AAUASR) Appendix A, lines 648 & 669.

The AAU interrupt is cleared. Appendix A, line 657.

When the End of Transfer or End of Chain Interrupt is set, the function aau_process() is called. The purpose of aau_process() is to move all the AAU descriptors in the processing queue that are considered done to the holding queue.

The bottom half handler is marked scheduled. Appendix A, line 672.

4.1.5.2 Bottom Half Interrupt Handler: aau_task()

See Appendix A, line 758. This function processes all the completed AAU chain descriptors in the holding queue, wakes up the user and frees the resources.

4.1.6 Linux Kernel APIs

This code contains numerous calls to Linux kernel macros and APIs. These are primarily Linux calls for declaring and handling queue and stack data structures and for controlling variable access. When developing custom applications, users of this document will call their operating system's equivalent APIs and macros.

4.2 Optimization Related

4.2.1 Stack versus Queue

Both are built and controlled with Linux kernel data structures and APIs. Using a stack for free descriptors increases the likelihood that requested descriptors are still in the cache. A queue is required for the chain because chaining demands FIFO (first in, first out) ordering. As previously stated, Linux kernel data structures have been optimized by the Linux community.
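The difference between the two orderings can be shown with a minimal array-based sketch. The types and functions are hypothetical and stand in for the Linux list primitives used by the driver; descriptor identifiers are assumed to be non-negative so that -1 can signal an empty container.

```c
#include <stddef.h>

#define SKETCH_CAP 8

/* LIFO free stack of descriptor indices. */
struct sketch_lifo {
    int item[SKETCH_CAP];
    size_t count;
};

/* FIFO chain queue of descriptor indices (ring buffer). */
struct sketch_ring {
    int item[SKETCH_CAP];
    size_t head;    /* index of the oldest element */
    size_t count;
};

int lifo_push(struct sketch_lifo *s, int v)
{
    if (s->count == SKETCH_CAP)
        return -1;
    s->item[s->count++] = v;
    return 0;
}

/* Returns the most recently pushed value, or -1 when empty. */
int lifo_pop(struct sketch_lifo *s)
{
    return s->count ? s->item[--s->count] : -1;
}

int ring_put(struct sketch_ring *q, int v)
{
    if (q->count == SKETCH_CAP)
        return -1;
    q->item[(q->head + q->count) % SKETCH_CAP] = v;
    q->count++;
    return 0;
}

/* Returns the oldest value, or -1 when empty. */
int ring_get(struct sketch_ring *q)
{
    int v;

    if (q->count == 0)
        return -1;
    v = q->item[q->head];
    q->head = (q->head + 1) % SKETCH_CAP;
    q->count--;
    return v;
}
```

The stack hands back the descriptor that was freed most recently, which is the one most likely to still be cached, while the ring preserves submission order as the hardware chain requires.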

4.2.2 Chaining and Resume

Chaining allows the application to build a list of transfers which may not require the use of the Intel 80200 processor until all transfers are complete (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-9). In addition, while the AAU is executing an existing chain, an additional descriptor or chain of descriptors can be appended concurrently by using the Chain Resume feature (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-16). The expanded chain then executes as a single uninterrupted set of transactions.

See Appendix A, lines 297 through 326 for implementation.
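The start-or-resume decision made in aau_start() can be sketched against simulated registers. This is a user-space illustration, not the driver code: struct sketch_unit and sketch_start() are hypothetical names, the registers are plain variables, and the bit values mirror the definitions in Appendix A.2.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ACR_ENABLE       0x00000001u  /* mirrors AAU_ACR_ENABLE */
#define SKETCH_ACR_CHAIN_RESUME 0x00000002u  /* mirrors AAU_ACR_CHAIN_RESUME */
#define SKETCH_ASR_ACTIVE       0x00000400u  /* mirrors AAU_ASR_ACTIVE */

struct sketch_hw_desc {
    uint32_t nda;    /* next descriptor address */
    uint32_t phys;   /* physical address of this descriptor */
};

/* Simulated registers plus the tail of the chain handed to the unit. */
struct sketch_unit {
    uint32_t acr, asr, andar;
    struct sketch_hw_desc *last;
};

/*
 * Idle unit: load ANDAR with the new chain and set the enable bit.
 * Active unit: append the new chain to the current tail and request a
 * chain resume so that the appended descriptors are picked up.
 */
void sketch_start(struct sketch_unit *u, struct sketch_hw_desc *head,
                  struct sketch_hw_desc *tail)
{
    if (!(u->asr & SKETCH_ASR_ACTIVE)) {
        u->andar = head->phys;
        u->acr |= SKETCH_ACR_ENABLE;
    } else {
        u->last->nda = head->phys;
        /* real hardware: clean the cache line holding last->nda here */
        u->acr |= SKETCH_ACR_CHAIN_RESUME;
    }
    u->last = tail;
}
```

Keeping a pointer to the tail of the last submitted chain is what makes the append a constant-time operation.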

4.2.3 Requiring the Application to Supply Physical Addresses in AAU Descriptor (versus virtual addresses)

This requirement minimizes time between hand off from application and AAU processing software.

4.2.4 Allocation of Memory for AAU Descriptors During Initialization

Preallocating memory for AAU descriptors eliminates costly runtime memory allocations.

4.2.5 Using AAU for Local Memory to Local Memory Copy: aau_memcpy()

The advantages of using the AAU for local memory to local memory copying:
• In absolute terms it is faster for non-trivial copies.
• It happens in parallel with other core processing.

When calling aau_memcpy(), use exactly the same syntax as memcpy(). See Intel® 80312 I/O Companion Chip Developer’s Manual, pages 10-31 through 10-33 for a full description.

Appendix A, lines 1108 and 1109
• Convert the virtual address to a physical address and write the physical address to the AAU descriptor

Appendix A, line 1112
• AAU_DCR_WRITE: Sets bit 31. Description of operation specified: Write Enable
• AAU_DCR_BLKCTRL_1_DF: Sets all bits 03:01 for Block 1 Command Control. Description of operation specified: Direct Fill

Appendix A, line 1115
• Sets Interrupt Enable for this descriptor
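How the descriptor control word for a copy is composed can be sketched as follows. The structure is a hypothetical, shortened stand-in for aau_desc_t with a single source address, the function name is invented for illustration, and the bit values mirror the AAU_DCR_* definitions in Appendix A.1.

```c
#include <stdint.h>

#define SKETCH_DCR_WRITE        0x80000000u  /* AAU_DCR_WRITE: bit 31 */
#define SKETCH_DCR_BLKCTRL_1_DF 0x0000000Eu  /* AAU_DCR_BLKCTRL_1_DF: bits 3:1 */
#define SKETCH_DCR_IE           0x00000001u  /* AAU_DCR_IE: bit 0 */

/* Hypothetical, shortened hardware descriptor for a one-source copy. */
struct sketch_copy_desc {
    uint32_t nda;   /* next descriptor address, 0 for a single descriptor */
    uint32_t sar;   /* physical source address */
    uint32_t dar;   /* physical destination address */
    uint32_t bc;    /* byte count */
    uint32_t dc;    /* descriptor control */
};

/* Fill a descriptor that copies len bytes from src_phys to dst_phys. */
void sketch_fill_copy(struct sketch_copy_desc *d, uint32_t dst_phys,
                      uint32_t src_phys, uint32_t len)
{
    d->nda = 0;
    d->sar = src_phys;
    d->dar = dst_phys;
    d->bc = len;
    /* write result back, direct-fill block 1, interrupt when done */
    d->dc = SKETCH_DCR_WRITE | SKETCH_DCR_BLKCTRL_1_DF | SKETCH_DCR_IE;
}
```

With these three flags combined, the control word is 0x8000000F. Both addresses must already be physical, as described in section 4.1.4.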

5.0 Potential Enhancements

5.1 Error Handling

When the Application Accelerator Unit generates an error during the execution of an AAU descriptor, an interrupt is triggered and bit 10 of the IRQ Interrupt Status Register is set. Bit 10 being set indicates an Application Accelerator Unit error (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-5). After identifying the source of the interrupt as the AAU, the application should test the Accelerator Status Register (ASR) bits (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-25):
• Bit 10 is clear: The Accelerator Active Flag being clear indicates the channel is idle. This bit may be cleared as a result of a bus error.
• Bit 5 is set: The Master-abort bit is set when a master abort occurs during a transaction in which the AAU is the master on the internal bus.

Before clearing the interrupt, the application can use the Accelerator Descriptor Address Register (ADAR) to identify the currently executing descriptor. The descriptor can be marked as failed before the interrupt is cleared and processing continues. One approach is to write the contents of the ASR to a status variable attached to the descriptor (see Intel® 80312 I/O Companion Chip Developer’s Manual, sections 10.8 and 10.9 for Interrupt States and Error Conditions).
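One way to record the failure on the descriptor is sketched below. This is an illustration of the suggested enhancement, not code from Appendix A: the structure, the asr_at_err field and the function name are hypothetical, and the ADAR and ASR values are passed in as plain integers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ASR_MA_ABORT 0x00000020u  /* mirrors AAU_ASR_MA_ABORT */
#define SKETCH_INCOMPLETE   0x0020u      /* mirrors AAU_INCOMPLETE */

struct sketch_err_desc {
    uint32_t phys;        /* physical address of the descriptor */
    uint32_t status;      /* software status flags */
    uint32_t asr_at_err;  /* hypothetical field: ASR snapshot on failure */
};

/*
 * Find the descriptor whose physical address matches ADAR, record the
 * ASR contents on it and flag it incomplete. Returns the descriptor,
 * or NULL when no error is pending or no descriptor matches.
 */
struct sketch_err_desc *sketch_mark_failed(struct sketch_err_desc *descs,
                                           size_t n, uint32_t adar,
                                           uint32_t asr)
{
    size_t i;

    if (!(asr & SKETCH_ASR_MA_ABORT))
        return NULL;
    for (i = 0; i < n; i++) {
        if (descs[i].phys == adar) {
            descs[i].asr_at_err = asr;
            descs[i].status |= SKETCH_INCOMPLETE;
            return &descs[i];
        }
    }
    return NULL;
}
```

The snapshot has to be taken before the ASR is cleared, because clearing the register discards the information that identifies the cause of the error.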

5.2 Lookaside Cache Scheme (This is Linux specific)

When implementing in Linux, device driver developers should consider using the Lookaside Cache scheme instead of allocating memory with kmalloc when creating hardware descriptors. The Lookaside Cache provides memory address alignment and other features that allow efficient use of Linux memory management in device driver development.

5.3 Extensive Intel Optimization Related Documentation

Intel provides extensive optimization related documentation. As part of the application development process it is recommended to review the Intel® XScale™ Microarchitecture Coding Techniques White Paper and the Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual, Appendix B for optimization opportunities.

6.0 Conclusion

As discussed, increasing I/O demands are central to high performance network and storage applications. Intel® XScale™ microarchitecture addresses this trend with the 80310, whose features include the AAU.

This paper and the accompanying source code have presented an AAU implementation, including the low-level design, coded implementation and code commentary, to give software developers a template that speeds the ramp for developing AAU applications. For their unique applications, developers can design and build their own custom solutions using this template along with the Intel optimization literature.

Appendix A AAU Source Code

A.1 Public Definitions for Intel® 80310 I/O Processor Chipset AAU: \include\aau.h

1/*

2 * Definitions for IOP310 AAU

3*

4 * Author: Dave Jiang ([email protected])

5 * Copyright (C) 2001 Intel Corporation

6*

7 * This program is free software; you can redistribute it and/or modify

8 * it under the terms of the GNU General Public License version 2 as

9 * published by the Free Software Foundation.

10 *

11 */

12

13 #ifndef _IOP310_AAU_H_

14 #define _IOP310_AAU_H_

15

16

17 #define DEFAULT_AAU_IRQ_THRESH 10

18

19 #define MAX_AAU_DESC 1024/* 64 */

20 #define AAU_SAR_GROUP 4

21

22

23 #define AAU_DESC_DONE 0x0010

24 #define AAU_INCOMPLETE 0x0020

25 #define AAU_HOLD 0x0040

26 #define AAU_END_CHAIN 0x0080

27 #define AAU_COMPLETE 0x0100

28 #define AAU_NOTIFY 0x0200

29 #define AAU_NEW_HEAD 0x0400

30

31 #define AAU_USER_MASK (AAU_NOTIFY | AAU_INCOMPLETE | \

32 AAU_HOLD | AAU_COMPLETE)

33

34 #define DESC_HEAD 0x0010

35 #define DESC_TAIL 0x0020

36

37 /* result writeback */

38 #define AAU_DCR_WRITE 0x80000000

39 /* source block extension */

40 #define AAU_DCR_BLK_EXT 0x02000000

41 #define AAU_DCR_BLKCTRL_8_XOR 0x00400000

42 #define AAU_DCR_BLKCTRL_7_XOR 0x00080000

43 #define AAU_DCR_BLKCTRL_6_XOR 0x00010000

44 #define AAU_DCR_BLKCTRL_5_XOR 0x00002000

45 #define AAU_DCR_BLKCTRL_4_XOR 0x00000400

46 #define AAU_DCR_BLKCTRL_3_XOR 0x00000080

47 #define AAU_DCR_BLKCTRL_2_XOR 0x00000010

48 #define AAU_DCR_BLKCTRL_1_XOR 0x00000002

49 /* first block direct fill instead of XOR to buffer */

50 #define AAU_DCR_BLKCTRL_1_DF 0x0000000E

51 /* interrupt enable */

52 #define AAU_DCR_IE 0x00000001

53

54 #define DCR_BLKCTRL_OFFSET 3

55

56

57 /* AAU callback */

58 typedef void (*aau_callback_t) (void *buf_id);

59

60 /* hardware descriptor */

61 typedef struct _aau_desc

62 {

63 u32 NDA; /* next descriptor address */

64 u32 SAR[AAU_SAR_GROUP];/* src addrs */

65 u32 DAR; /* destination addr */

66 u32 BC; /* byte count */

67 u32 DC; /* descriptor control */

68 u32 SARE[AAU_SAR_GROUP];/* extended src addrs */

69 } aau_desc_t;

70

71 /* user SGL format */

72 typedef struct _aau_sgl

73 {

74 aau_desc_t aau_desc;/* AAU HW Desc */

75 u32 status;

76 struct _aau_sgl *next;/* pointer to next SG */

77 void *dest; /* destination addr */

78 void *src[AAU_SAR_GROUP];/* source addr[4] */

79 void *ext_src[AAU_SAR_GROUP];/* ext src addr[4] */

80 u32 total_src; /* total number of source */

81 } aau_sgl_t;

82

83 /* header for user SGL */

84 typedef struct _aau_head

85 {

86 u32 total;

87 u32 status; /* SGL status */

88 aau_sgl_t *list; /* ptr to head of list */

89 aau_callback_t callback;/* callback func ptr */

90 } aau_head_t;

91

92 /* prototypes */

93 int aau_request(u32 *, const char *);

94 int aau_queue_buffer(u32, aau_head_t *);

95 void aau_suspend(u32);

96 void aau_resume(u32);

97 void aau_free(u32);

98 void aau_set_irq_threshold(u32, int);

99 void aau_return_buffer(u32, aau_sgl_t *);

100 aau_sgl_t *aau_get_buffer(u32, int);

101 int aau_memcpy(void *, void *, u32);

102

103 #endif

104 /* EOF */

A.2 Private Definitions for Intel® XScale™ Microarchitecture AAU: \src\aau.h

105 /*

106 * Private Definitions for Intel® XScale™ microarchitecture AAU

107 *

108 * Author: Dave Jiang ([email protected])

109 * Copyright (C) 2001 Intel Corporation

110 *

111 * This program is free software; you can redistribute it and/or modify

112 * it under the terms of the GNU General Public License version 2 as

113 * published by the Free Software Foundation.

114 *

115 */

116

117 #ifndef _AAU_PRIVATE_H_

118 #define _AAU_PRIVATE_H_

119

120 #define SLEEP_TIME 50

121 #define AAU_DESC_SIZE 48

122 #define AAU_INT_MASK 0x0020

123

124 #define AAU_ACR_CLEAR 0x00000000

125 #define AAU_ACR_ENABLE 0x00000001

126 #define AAU_ACR_CHAIN_RESUME 0x00000002

127 #define AAU_ACR_512_BUFFER 0x00000004

128

129 #define AAU_ASR_CLEAR 0x00000320

130 #define AAU_ASR_MA_ABORT 0x00000020

131 #define AAU_ASR_ERROR_MASK AAU_ASR_MA_ABORT

132 #define AAU_ASR_DONE_EOT 0x00000200

133 #define AAU_ASR_DONE_EOC 0x00000100

134 #define AAU_ASR_DONE_MASK (AAU_ASR_DONE_EOT | AAU_ASR_DONE_EOC)

135 #define AAU_ASR_ACTIVE 0x00000400

136 #define AAU_ASR_MASK (AAU_ASR_ERROR_MASK | AAU_ASR_DONE_MASK)

137

138 /* software descriptor */

139 typedef struct _sw_aau

140 {

141 aau_desc_t aau_desc;/* AAU HW Desc */

142 u32 status;

143 struct _aau_sgl *next; /* pointer to next SG */

144 void *dest; /* destination addr */

145 void *src[AAU_SAR_GROUP];/* source addr[4] */

146 void *ext_src[AAU_SAR_GROUP];/* ext src addr[4] */

147 u32 total_src; /* total number of source */

148 struct list_head link; /* Link to queue */

149 u32 aau_phys; /* AAU Phys Addr (aligned) */

150 u32 desc_addr; /* unaligned HWDESC virtual addr */

151 u32 sgl_head;

152 struct _sw_aau *head; /* head of list */

153 struct _sw_aau *tail; /* tail of list */

154 } sw_aau_t;

155

156 /* AAU registers */

157 typedef struct _aau_regs_t

158 {

159 volatile u32 ACR; /* Accelerator Control Register */

160 volatile u32 ASR; /* Accelerator Status Register */

161 volatile u32 ADAR; /* Descriptor Address Register */

162 volatile u32 ANDAR; /* Next Desc Address Register */

163 volatile u32 LSAR[AAU_SAR_GROUP];/* source addrs */

164 volatile u32 LDAR; /* local destination address register */

165 volatile u32 ABCR; /* byte count */

166 volatile u32 ADCR; /* Descriptor Control */

167 volatile u32 LSARE[AAU_SAR_GROUP];/* extended src addrs */

168 } aau_regs_t;

169

170

171 /* device descriptor */

172 typedef struct _iop310_aau_t

173 {

174 const char *dev_id; /* Device ID */

175 struct list_head process_q;/* Process Q */

176 struct list_head hold_q;/* Holding Q */

177 spinlock_t process_lock;/* PQ spinlock */

178 spinlock_t hold_lock;/* HQ spinlock */

179 aau_regs_t *regs; /* AAU registers */

180 int irq; /* IRQ number */

181 sw_aau_t *last_aau; /* ptr to last AAU desc */

182 struct tq_struct aau_task;/* AAU task entry */

183 wait_queue_head_t wait_q;/* AAU wait queue */

184 atomic_t ref_count; /* AAU ref count */

185 atomic_t irq_thresh;/* IRQ threshold */

186 } iop310_aau_t;

187

188 #define SW_ENTRY(list) list_entry((list), sw_aau_t, link)

189

190 #endif

A.3 Support Functions for the Intel® 80310 I/O Processor Chipset AAU: \src\aau.c

191 /**************************************************************************

192 * arch/arm/mach-iop310/aau.c

193 *

194 * Support functions for the Intel 80310 AAU.

195 * (see also Documentation/arm/XScale/IOP310/aau.txt)

196 *

197 * Author: Dave Jiang ([email protected])

198 * Copyright (C) 2001 Intel Corporation

199 *

200 * This program is free software; you can redistribute it and/or modify

201 * it under the terms of the GNU General Public License version 2 as

202 * published by the Free Software Foundation.

203 *

204 * Todos: Thorough Error handling

205 * Do zero-size AAU transfer/channel at init

206 * so all we have to do is chaining

207 *

208 *

209 * History: (07/18/2001, DJ) Initial Creation

210 * (08/22/2001, DJ) Changed spinlock calls to no save flags

211 * (08/27/2001, DJ) Added irq threshold handling

212 * (09/11/2001, DJ) Changed AAU to list data structure,

213 * modified the user interface with embedded descriptors.

214 *

215 *************************************************************************/

216

217 #include

218 #include

219 #include

220 #include

221 #include

222 #include

223 #include

224 #include

225 #include

226 #include

227 #include

228 #include

229 #include

230 #include

231 #include

232 #include

233 #include

234 #include

235

236 #include

237

238 #include "aau.h"

239

240 #ifndef EXPORT_SYMTAB

241 #define EXPORT_SYMTAB

242 #include

243 #endif

244

245 #undef DEBUG

246 #ifdef DEBUG

247 #define DPRINTK(s, args...) printk("80310AAU: " s, ## args)

248 #else

249 #define DPRINTK(s, args...)

250 #endif

251

252 /* globals */

253 static iop310_aau_t aau_dev;/* AAU device */

254 static struct list_head free_stack;/* free AAU desc stack */

255 static spinlock_t free_lock;/* free AAU stack lock */

256

257 /* static prototypes */

258 static int __init aau_init(void);

259 static int aau_start(iop310_aau_t *, sw_aau_t *);

260 static int aau_flush_all(u32);

261 static void aau_process(iop310_aau_t *);

262 static void aau_task(void *);

263 static void aau_irq_handler(int, void *, struct pt_regs *);

264

265 /*======*/

266 /* Procedure: aau_start() */

267 /* */

268 /* Description: This function starts the AAU. If the AAU */

269 /* has already started then chain resume is done */

270 /* */

271 /* Parameters: aau: AAU device */

272 /* aau_chain: AAU data chain to pass to the AAU */

273 /* */

274 /* Returns: int -- success: OK */

275 /* failure: -EBUSY */

276 /* */

277 /* Notes/Assumptions: */

278 /* */

279 /* History: Dave Jiang 07/18/01 Initial Creation */

280 /*======*/

281 static int aau_start(iop310_aau_t * aau, sw_aau_t * aau_chain)

282 {

283 u32 status;

284

285 /* get accelerator status */

286 status = *(IOP310_AAUASR);

287

288 /* check accelerator status error */

289 if(status & AAU_ASR_ERROR_MASK)

290 {

291 DPRINTK("start: Accelerator Error %x\n", status);

292 /* should clean the accelerator up then, or let int handle it? */

293 return -EBUSY;

294 }

295

296 /* if first time */

297 if(!(status & AAU_ASR_ACTIVE))

298 {

299 /* set the next descriptor address register */

300

301 *(IOP310_AAUANDAR) = aau_chain->aau_phys;

302

303 DPRINTK("Enabling accelerator now\n");

304 /* enable the accelerator */

305 *(IOP310_AAUACR) |= AAU_ACR_ENABLE;

306 }

307 else

308 {

309 DPRINTK("Resuming chain\n");

310 /* if active, chain up to last AAU chain */

311

312 aau->last_aau->aau_desc.NDA = aau_chain->aau_phys;

313

314 /* flush cache since we changed the field */

315 /* 32bit word long */

316 cpu_dcache_clean_range((u32)&aau->last_aau->aau_desc.NDA,

317 (u32)(&aau->last_aau->aau_desc.NDA));

318

319 /* resume the chain */

320 *(IOP310_AAUACR) |= AAU_ACR_CHAIN_RESUME;

321 }

322

323 /* set the last accelerator descriptor to last descriptor in chain */

324 aau->last_aau = aau_chain->tail;

325

326 return 0;

327 }

328

329

330 /*======*/

331 /* Procedure: aau_request() */

332 /* */

333 /* Description: This function requests the AAU */

334 /* */

335 /* Parameters: aau_context: aau context */

336 /* device_id -- unique device name */

337 /* */

338 /* Returns: 0 - ok */

339 /* NULL -- failed */

340 /* */

341 /* Notes/Assumptions: */

342 /* */

343 /* History: Dave Jiang 07/18/01 Initial Creation */

344 /*======*/

345 int aau_request(u32 * aau_context, const char *device_id)

346 {

347 iop310_aau_t *aau = &aau_dev;

348

349 DPRINTK("Entering AAU request\n");

350 /* increment reference count */

351 atomic_inc(&aau->ref_count);

352

353 /* get interrupt if ref count is less than or equal to 1 */

354 if(atomic_read(&aau->ref_count) <= 1)

355 {

356 /* set device ID */

357 aau->dev_id = device_id;

358 }

359

360 DPRINTK("Assigning AAU\n");

361 *aau_context = (u32) aau;

362

363 return 0;

364 }

365

366 /*======*/

367 /* Procedure: aau_suspend() */

368 /* */

369 /* Description: This function suspends the AAU at the earliest */

370 /* instant it is capable of. */

371 /* */

372 /* Parameters: aau: AAU device context */

373 /* */

374 /* Returns: N/A */

375 /* */

376 /* Notes/Assumptions: */

377 /* */

378 /* History: Dave Jiang 07/18/01 Initial Creation */

379 /*======*/

380 void aau_suspend(u32 aau_context)

381 {

382 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

383 *(IOP310_AAUACR) &= ~AAU_ACR_ENABLE;

384 }

385

386 /*======*/

387 /* Procedure: aau_resume() */

388 /* */

389 /* Description: This function resumes the AAU operations */

390 /* */

391 /* Parameters: aau: AAU device context */

392 /* */

393 /* Returns: N/A */

394 /* */

395 /* Notes/Assumptions: */

396 /* */

397 /* History: Dave Jiang 07/18/01 Initial Creation */

398 /*======*/

399 void aau_resume(u32 aau_context)

400 {

401 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

402 u32 status;

403

404 status = *(IOP310_AAUASR);

405

406 /* if it's already active */

407 if(status & AAU_ASR_ACTIVE)

408 {

409 DPRINTK("Accelerator already active\n");

410 return;

411 }

412 else if(status & AAU_ASR_ERROR_MASK)

413 {

414 printk("80310 AAU in error state! Cannot resume\n");

415 return;

416 }

417 else

418 {

419 *(IOP310_AAUACR) |= AAU_ACR_ENABLE;

420 }

421 }

422

423 /*======*/

424 /* Procedure: aau_queue_buffer() */

425 /* */

426 /* Description: This function creates an AAU buffer chain from the */

427 /* user supplied SGL chain. It also puts the AAU chain */

428 /* onto the processing queue. This then starts the AAU */

429 /* */

430 /* Parameters: aau: AAU device context */

431 /* listhead: User SGL */

432 /* */

433 /* Returns: int: success -- OK */

434 /* failed: -ENOMEM */

435 /* */

436 /* Notes/Assumptions: User SGL must point to kernel memory, not user */

437 /* */

438 /* History: Dave Jiang 07/18/01 Initial Creation */

439 /* Dave Jiang 07/20/01 Removed some junk code not suppose */

440 /* to be there that causes infinite loop */

441 /*======*/

442 int aau_queue_buffer(u32 aau_context, aau_head_t * listhead)

443 {

444 sw_aau_t *sw_desc = (sw_aau_t *) listhead->list;

445 sw_aau_t *prev_desc = NULL;

446 sw_aau_t *head = NULL;

447 aau_head_t *sgl_head = listhead;

448 int err = 0;

449 int i;

450 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

451 DECLARE_WAIT_QUEUE_HEAD(wait_q);

452

453 DPRINTK("Entering aau_queue_buffer()\n");

454

455 /* scan through entire user SGL */

456 while(sw_desc)

457 {

458 sw_desc->sgl_head = (u32) listhead;

459

460 /* we clean the cache for previous descriptor in chain */

461 if(prev_desc)

462 {

463 prev_desc->aau_desc.NDA = sw_desc->aau_phys;

464 cpu_dcache_clean_range((u32)&prev_desc->aau_desc,

465 (u32)&prev_desc->aau_desc + AAU_DESC_SIZE);

466 }

467 else

468 {

469 /* no previous descriptor, so we set this to be head */

470 head = sw_desc;

471 }

472

473 sw_desc->head = head;

474 /* set previous to current */

475 prev_desc = sw_desc;

476

477 /* put descriptor on process */

478 spin_lock_irq(&aau->process_lock);

479 list_add_tail(&sw_desc->link, &aau->process_q);

480 spin_unlock_irq(&aau->process_lock);

481

482 sw_desc = (sw_aau_t *)sw_desc->next;

483 }

484 DPRINTK("Done converting SGL to AAU Chain List\n");

485

486 /* if our tail exists */

487 if(prev_desc)

488 {

489 /* set the head pointer on tail */

490 prev_desc->head = head;

491 /* set the header pointer's tail to tail */

492 head->tail = prev_desc;

493 prev_desc->tail = prev_desc;

494

495 /* clean cache for tail */

496 cpu_dcache_clean_range((u32)&prev_desc->aau_desc,

497 (u32)&prev_desc->aau_desc + AAU_DESC_SIZE);

498

499 DPRINTK("Starting AAU accelerator\n");

500 /* start the AAU */

501 DPRINTK("Starting at chain: 0x%x\n", (u32)head);

502 if((err = aau_start(aau, head)) >= 0)

503 {

504 DPRINTK("ASR: %#x\n", *IOP310_AAUASR);

505 if(!sgl_head->callback)

506 {

507 wait_event_interruptible(aau->wait_q,

508 (sgl_head->status & AAU_COMPLETE));

509 }

510 return 0;

511 }

512 else

513 {

514 DPRINTK("AAU start failed!\n");

515 return err;

516 }

517 }

518

519 return -EINVAL;

520 }

521

522 /*======*/

523 /* Procedure: aau_flush_all() */

524 /* */

525 /* Description: This function flushes the entire process queue for */

526 /* the AAU. It also clears the AAU. */

527 /* */

528 /* Parameters: aau: AAU device context */

529 /* */

530 /* Returns: int: success -- OK */

531 /* */

532 /* Notes/Assumptions: */

533 /* */

534 /* History: Dave Jiang 07/19/01 Initial Creation */

535 /*======*/

536 static int aau_flush_all(u32 aau_context)

537 {

538 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

539 unsigned long flags;

540 sw_aau_t *sw_desc;

541

542 DPRINTK("Flushall is being called\n");

543

544 /* clear ACR */

545 /* read clear ASR */

546 *(IOP310_AAUACR) = AAU_ACR_CLEAR;

547 *(IOP310_AAUASR) |= AAU_ASR_CLEAR;

548

549 /* clean up processing Q */

550 while(!list_empty(&aau->process_q))

551 {

552 spin_lock_irqsave(&aau->process_lock, flags);

553 sw_desc = SW_ENTRY(aau->process_q.next);

554 list_del(aau->process_q.next);

555 spin_unlock_irqrestore(&aau->process_lock, flags);

556

557 /* set status to be incomplete */

558 sw_desc->status |= AAU_INCOMPLETE;

559 /* put descriptor on holding queue */

560 spin_lock_irqsave(&aau->hold_lock, flags);

561 list_add_tail(&sw_desc->link, &aau->hold_q);

562 spin_unlock_irqrestore(&aau->hold_lock, flags);

563 }

564

565 return 0;

566 }

567

568 /*======*/

569 /* Procedure: aau_free() */

570 /* */

571 /* Description: This function frees the AAU from usage. */

572 /* */

573 /* Parameters: aau -- AAU device context */

574 /* */

575 /* Returns: int: success -- OK */

576 /* */

577 /* Notes/Assumptions: */

578 /* */

579 /* History: Dave Jiang 07/19/01 Initial Creation */

580 /*======*/

581 void aau_free(u32 aau_context)

582 {

583 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

584

585 atomic_dec(&aau->ref_count);

586

587 /* if ref count is 1 or less, you are the last owner */

588 if(atomic_read(&aau->ref_count) <= 1)

589 {

590 /* flush AAU channel */

591 aau_flush_all(aau_context);

592 /* flush holding queue */

593 aau_task(aau);

594

595 if(aau->last_aau)

596 {

597 aau->last_aau = NULL;

598 }

599

600 DPRINTK("Freeing IRQ %d\n", aau->irq);

601 /* free the IRQ */

602 free_irq(aau->irq, (void *)aau);

603 }

604

605 DPRINTK("freed\n");

606 }

607

608 /*======*/

609 /* Procedure: aau_irq_handler() */

610 /* */

611 /* Description: This function is the int handler for the AAU */

612 /* driver. It removes the done AAU descriptors from the */

613 /* process queue and puts them on the holding Q. It */

614 /* continues to process until process queue empty or */

615 /* the current AAU desc on the accelerator is the one */

616 /* we are inspecting */

617 /* */

618 /* Parameters: irq: IRQ activated */

619 /* dev_id: device */

620 /* regs: registers */

621 /* */

622 /* Returns: NONE */

623 /* */

624 /* Notes/Assumptions: Interrupt is masked */

625 /* */

626 /* History: Dave Jiang 07/19/01 Initial Creation */

627 /* Dave Jiang 07/20/01 Check FIQ1 instead of ASR for INTs */

628 /*======*/

629 static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs)

630 {

631 iop310_aau_t *aau = (iop310_aau_t *) dev_id;

632 u32 int_status = 0;

633 u32 status = 0;

634 u32 thresh;

635

636 /* get FIQ1 status */

637 int_status = *(IOP310_FIQ1ISR);

638

639 DPRINTK("IRQ: irq=%d status=%#x\n", irq, int_status);

640

641 /* this is not our interrupt */

642 if(!(int_status & AAU_INT_MASK))

643 {

644 return;

645 }

646

647 /* get accelerator status */

648 status = *(IOP310_AAUASR);

649

650 /* get threshold */

651 thresh = atomic_read(&aau->irq_thresh);

652

653 /* process while we have INT */

654 while((int_status & AAU_INT_MASK) && thresh--)

655 {

656 /* clear ASR */

657 *(IOP310_AAUASR) &= AAU_ASR_MASK;

658

659 /* done status is set: retire the finished descriptors */

660 if(status & AAU_ASR_DONE_MASK)

661 {

662 aau_process(aau);

663 }

664

665 /* read accelerator status */

666 status = *(IOP310_AAUASR);

667

668 /* get interrupt status */

669 int_status = *(IOP310_FIQ1ISR);

670 }

671

672 /* schedule bottom half */

673 aau->aau_task.data = (void *)aau;

674 /* task goes to the immediate task queue */

675 queue_task(&aau->aau_task, &tq_immediate);

676 /* mark IMMEDIATE BH for execute */

677 mark_bh(IMMEDIATE_BH);

678 }

679

680

681 /*======*/

682 /* Procedure: aau_process() */

683 /* */

684 /* Description: This function moves all the AAU desc in */

685 /* the processing queue that are considered done to the */

686 /* holding queue. It is called by the int when the */

687 /* done INTs are asserted. It continues until */

688 /* either the process Q is empty or current AAU desc */

689 /* equals the one in the ADAR */

690 /* */

691 /* Parameters: aau: AAU device as parameter */

692 /* */

693 /* Returns: NONE */

694 /* */

695 /* Notes/Assumptions: Interrupt is masked */

696 /* */

697 /* History: Dave Jiang 07/19/01 Initial Creation */

698 /*======*/

699 static void aau_process(iop310_aau_t * aau)

700 {

701 sw_aau_t *sw_desc;

702 u8 same_addr = 0;

703

704 DPRINTK("Entering aau_process()\n");

705

706 while(!same_addr && !list_empty(&aau->process_q))

707 {

708 spin_lock(&aau->process_lock);

709 sw_desc = SW_ENTRY(aau->process_q.next);

710 list_del(aau->process_q.next);

711 spin_unlock(&aau->process_lock);

712

713 if(sw_desc->head->tail->status & AAU_NEW_HEAD)

714 {

715 DPRINTK("Found new head\n");

716 sw_desc->tail->head = sw_desc;

717 sw_desc->head = sw_desc;

718 sw_desc->tail->status &= ~AAU_NEW_HEAD;

719 }

720

721 sw_desc->status |= AAU_DESC_DONE;

722

723 /* if we see end of chain, we set head status to DONE */

724 if(sw_desc->aau_desc.DC & AAU_DCR_IE)

725 {

726 if(sw_desc->status & AAU_END_CHAIN)

727 {

728 sw_desc->tail->status |= AAU_COMPLETE;

729 }

730 else

731 {

732 sw_desc->head->tail = sw_desc;

733 sw_desc->tail = sw_desc;

734 sw_desc->tail->status |= AAU_NEW_HEAD;

735 }

736 sw_desc->tail->status |= AAU_NOTIFY;

737 }

738

739 /* if descriptor equal same being processed, put it back */

740 if(((u32) sw_desc == *(IOP310_AAUADAR)

741 ) && ( *(IOP310_AAUASR) & AAU_ASR_ACTIVE))

742 {

743 spin_lock(&aau->process_lock);

744 list_add(&sw_desc->link, &aau->process_q);

745 spin_unlock(&aau->process_lock);

746 same_addr = 1;

747 }

748 else

749 {

750 spin_lock(&aau->hold_lock);

751 list_add_tail(&sw_desc->link, &aau->hold_q);

752 spin_unlock(&aau->hold_lock);

753 }

754 }

755 DPRINTK("Exit aau_process()\n");

756 }

757

758 /*======*/

759 /* Procedure: aau_task() */

760 /* */

761 /* Description: This func is the bottom half handler of the AAU INT */

762 /* handler. It is queued as an imm task on the imm */

763 /* task Q. It processes all the complete AAU chains in the */

764 /* holding Q and wakes up the user and frees the */

765 /* resource. */

766 /* */

767 /* Parameters: aau_dev: AAU device as parameter */

768 /* */

769 /* Returns: NONE */

770 /* */

771 /* Notes/Assumptions: */

772 /* */

773 /* History: Dave Jiang 07/19/01 Initial Creation */

774 /*======*/

775 static void aau_task(void *aau_dev)

776 {

777 iop310_aau_t *aau = (iop310_aau_t *) aau_dev;

778 u8 end_chain = 0;

779 sw_aau_t *sw_desc = NULL;

780 aau_head_t *listhead = NULL;/* user list */

781

782 DPRINTK("Entering bottom half\n");

783

784 if(!list_empty(&aau->hold_q))

785 {

786 sw_desc = SW_ENTRY(aau->hold_q.next);

787 listhead = (aau_head_t *) sw_desc->sgl_head;

788 }

789 else

790 return;

791

792 /* process while AAU chain is complete */

793 while(sw_desc && (sw_desc->tail->status & (AAU_NOTIFY | AAU_INCOMPLETE)))

794 {

795 /* clean up until end of AAU chain */

796 while(!end_chain)

797 {

798 /* IE flag indicate end of chain */

799 if(sw_desc->aau_desc.DC & AAU_DCR_IE)

800 {

801 end_chain = 1;

802 listhead->status |=

803 sw_desc->tail->status & AAU_USER_MASK;

804

805 sw_desc->status |= AAU_NOTIFY;

806

807 if(sw_desc->status & AAU_END_CHAIN)

808 listhead->status |= AAU_COMPLETE;

809 }

810

811 spin_lock_irq(&aau->hold_lock);

812 /* remove from holding queue */

813 list_del(&sw_desc->link);

814 spin_unlock_irq(&aau->hold_lock);

815

816 cpu_dcache_invalidate_range((u32)&sw_desc->aau_desc,

817 (u32)&sw_desc->aau_desc + AAU_DESC_SIZE);

818

819 if(!list_empty(&aau->hold_q))

820 {

821 sw_desc = SW_ENTRY(aau->hold_q.next);

822 listhead = (aau_head_t *) sw_desc->sgl_head;

823 }

824 else

825 sw_desc = NULL;

826 }

827

828 /* reset end of chain flag */

829 end_chain = 0;

830

831 /* wake up user function waiting for return */

832 /* or use callback if exist */

833 if(listhead->callback)

834 {

835 DPRINTK("Calling callback\n");

836 listhead->callback((void *)listhead);

837 }

838 else if(listhead->status & AAU_COMPLETE)

839 /* if(waitqueue_active(&aau->wait_q)) */

840 {

841 DPRINTK("Waking up waiting process\n");

842 wake_up_interruptible(&aau->wait_q);

843 }

844 } /* end while */

845 DPRINTK("Exiting bottom task\n");

846 }

847

848 /*======*/

849 /* Procedure: aau_init() */

850 /* */

851 /* Description: This function initializes the AAU. */

852 /* */

853 /* Parameters: NONE */

854 /* */

855 /* Returns: int: success -- OK */

856 /* */

857 /* Notes/Assumptions: */

858 /* */

859 /* History: Dave Jiang 07/18/01 Initial Creation */

860 /*======*/

861 static int __init aau_init(void)

862 {

863 int i;

864 sw_aau_t *sw_desc;

865 int err;

866 void *desc = NULL;

867

868 printk("Intel 80310 AAU Copyright(c) 2001 Intel Corporation\n");

869 DPRINTK("Initializing...");

870

871 /* set the IRQ */

872 aau_dev.irq = IRQ_IOP310_AAU;

873

874 err = request_irq(aau_dev.irq, aau_irq_handler, SA_INTERRUPT,

875 NULL, (void *)&aau_dev);

876 if(err < 0)

877 {

878 printk(KERN_ERR "unable to request IRQ %d for AAU: %d\n",

879 aau_dev.irq, err);

880 return err;

881 }

882

883 /* init free stack */

884 INIT_LIST_HEAD(&free_stack);

885 /* init free stack spinlock */

886 spin_lock_init(&free_lock);

887

888

889 /* pre-alloc AAU descriptors */

890 for(i = 0; i < MAX_AAU_DESC; i++)

891 {

892 desc = kmalloc((sizeof(sw_aau_t) + 0x20), GFP_KERNEL);

893 memset(desc, 0, sizeof(sw_aau_t) + 0x20);

894 sw_desc = (sw_aau_t *) (((u32) desc & 0xffffffe0) + 0x20);

895 sw_desc->aau_phys = virt_to_phys((void *)sw_desc);

896 /* we keep track of original address before alignment adjust */

897 /* so we can free it later */

898 sw_desc->desc_addr = (u32) desc;

899

900 spin_lock_irq(&free_lock);

901 /* put the descriptors on the free stack */

902 list_add_tail(&sw_desc->link, &free_stack);

903 spin_unlock_irq(&free_lock);

904 }

905

906 /* set the register data structure to the mapped memory regs AAU */

907 aau_dev.regs = (aau_regs_t *) IOP310_AAUACR;

908

909 atomic_set(&aau_dev.ref_count, 0);

910

911 /* init process Q */

912 INIT_LIST_HEAD(&aau_dev.process_q);

913 /* init holding Q */

914 INIT_LIST_HEAD(&aau_dev.hold_q);

915 /* init locks for Qs */

916 spin_lock_init(&aau_dev.hold_lock);

917 spin_lock_init(&aau_dev.process_lock);

918

919 aau_dev.last_aau = NULL;

920

921 /* initialize BH task */

922 aau_dev.aau_task.sync = 0;

923 aau_dev.aau_task.routine = (void *)aau_task;

924

925 /* initialize wait Q */

926 init_waitqueue_head(&aau_dev.wait_q);

927

928 /* clear AAU channel control register */

929 *(IOP310_AAUACR) = AAU_ACR_CLEAR;

930 *(IOP310_AAUASR) = AAU_ASR_CLEAR;

931 *(IOP310_AAUANDAR) = 0;

932

933 /* set default irq threshold */

934 atomic_set(&aau_dev.irq_thresh, DEFAULT_AAU_IRQ_THRESH);

935 DPRINTK("Done!\n");

936

937 return 0;

938 }

939

940 /*======*/

941 /* Procedure: aau_set_irq_threshold() */

942 /* */

943 /* Description: This function readjusts the threshold for the irq. */

944 /* */

945 /* Parameters: aau: pointer to aau device descriptor */

946 /* value: value of new irq threshold */

947 /* */

948 /* Returns: N/A */

949 /* */

950 /* Notes/Assumptions: default is set at 10 */

951 /* */

952 /* History: Dave Jiang 08/27/01 Initial Creation */

953 /*======*/

954 void aau_set_irq_threshold(u32 aau_context, int value)

955 {

956 iop310_aau_t *aau = (iop310_aau_t *) aau_context;

957 atomic_set(&aau->irq_thresh, value);

958 } /* End of aau_set_irq_threshold() */

959

960

961 /*======*/

962 /* Procedure: aau_get_buffer() */

963 /* */

964 /* Description: This function acquires an SGL element for the user */

965 /* and returns that. It retries multiple times if no */

966 /* descriptor is available. */

967 /* */

968 /* Parameters: aau_context: AAU context */

969 /* num_buf: number of descriptors */

970 /* */

971 /* Returns: aau_sgl_t *: head of the SGL chain, NULL on failure */

972 /* */

973 /* Notes/Assumptions: */

974 /* */

975 /* History: Dave Jiang 9/11/01 Initial Creation */

976 /* Dave Jiang 10/04/01 Fixed list linking problem */

977 /*======*/

aau_sgl_t *aau_get_buffer(u32 aau_context, int num_buf)
{
    sw_aau_t *sw_desc = NULL;
    sw_aau_t *sw_head = NULL;
    sw_aau_t *sw_prev = NULL;

    int retry = 10;
    int i;
    DECLARE_WAIT_QUEUE_HEAD(wait_q);

    if((num_buf > MAX_AAU_DESC) || (num_buf <= 0))
    {
        return NULL;
    }

    DPRINTK("Getting %d descriptors\n", num_buf);
    for(i = num_buf; i > 0; i--)
    {
        /* start each pass with no descriptor in hand */
        sw_desc = NULL;
        spin_lock_irq(&free_lock);
        if(!list_empty(&free_stack))
        {
            sw_desc = SW_ENTRY(free_stack.next);
            list_del(free_stack.next);
        }
        spin_unlock_irq(&free_lock);

        /* free stack was empty: sleep with the lock released, then retry */
        while(!sw_desc && retry--)
        {
            interruptible_sleep_on_timeout(&wait_q, SLEEP_TIME);
            spin_lock_irq(&free_lock);
            if(!list_empty(&free_stack))
            {
                sw_desc = SW_ENTRY(free_stack.next);
                list_del(free_stack.next);
            }
            spin_unlock_irq(&free_lock);
        }

        if(!sw_desc)
        {
            /* out of retries: return what was collected so far */
            if(sw_prev)
                sw_prev->next = NULL;   /* terminate the partial chain */
            sw_desc = sw_head;
            spin_lock_irq(&free_lock);
            while(sw_desc)
            {
                sw_desc->status = 0;
                sw_desc->head = NULL;
                sw_desc->tail = NULL;
                list_add(&sw_desc->link, &free_stack);
                sw_desc = (sw_aau_t *) sw_desc->next;
            } /* end while */
            spin_unlock_irq(&free_lock);
            return NULL;
        }

        if(sw_prev)
        {
            sw_prev->next = (aau_sgl_t *) sw_desc;
            sw_prev->aau_desc.NDA = sw_desc->aau_phys;
        }
        else
        {
            sw_head = sw_desc;
        }

        sw_prev = sw_desc;
    } /* end for */

    /* terminate the chain at the last descriptor */
    sw_desc->aau_desc.NDA = 0;
    sw_desc->next = NULL;
    sw_desc->status = 0;
    return (aau_sgl_t *) sw_head;
}

1050

1051

1052 /*======*/

1053 /* Procedure: aau_return_buffer() */

1054 /* */

1055 /* Description: This function takes a list of SGL and returns it to */

1056 /* the free stack. */

1057 /* */

1058 /* Parameters: aau_context: AAU context */

1059 /* list: SGL list to return to free stack */

1060 /* */

1061 /* Returns: N/A */

1062 /* */

1063 /* Notes/Assumptions: */

1064 /* */

1065 /* History: Dave Jiang 9/11/01 Initial Creation */

1066 /*======*/

1067 void aau_return_buffer(u32 aau_context, aau_sgl_t * list)

1068 {

1069 sw_aau_t *sw_desc = (sw_aau_t *) list;

1070

1071 spin_lock_irq(&free_lock);

1072 while(sw_desc)

1073 if(sw_desc)

1074 {

1075 list_add(&sw_desc->link, &free_stack);

1076 sw_desc = (sw_aau_t *) sw_desc->next;

1077 }

1078 spin_unlock_irq(&free_lock);

1079 }

1080

1081 int aau_memcpy(void *dest, void *src, u32 size)

1082 {

1083

1084 iop310_aau_t *aau = &aau_dev; /* Global variable */

1085 aau_head_t head;

1086 aau_sgl_t *list;

1087 int err;

1088

1089 head.total = size;

1090 head.status = 0;

1091 head.callback = NULL;

1092

1093 list = aau_get_buffer((u32) aau, 1);

1094 if(list)

1095 {

1096 head.list = list;

1097 }

1098 else

1099 {

1100 return -ENOMEM;

1101 }

1102

1103

1104

1105 while(list)

1106 {

1107 list->status = 0;

1108 list->src[0] = src;

1109 list->aau_desc.SAR[0] = (u32) virt_to_phys(src);

1110 list->dest = dest;

1111 list->aau_desc.DAR = (u32) virt_to_phys(dest);

1112 list->aau_desc.BC = size;

1113 list->aau_desc.DC = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF;

1114 if(!list->next)

1115 {

1116 list->aau_desc.DC |= AAU_DCR_IE;

1117 list->status |= AAU_END_CHAIN;

1118 break;

1119 }

1120 list = list->next;

1121 }

1122 err = aau_queue_buffer((u32) aau, &head);

1123 aau_return_buffer((u32) aau, head.list);

1124 return err;

1125 }

1126

1127 EXPORT_SYMBOL_NOVERS(aau_request);

1128 EXPORT_SYMBOL_NOVERS(aau_queue_buffer);

1129 EXPORT_SYMBOL_NOVERS(aau_suspend);

1130 EXPORT_SYMBOL_NOVERS(aau_resume);

1131 EXPORT_SYMBOL_NOVERS(aau_free);

1132 EXPORT_SYMBOL_NOVERS(aau_set_irq_threshold);

1133 EXPORT_SYMBOL_NOVERS(aau_get_buffer);

1134 EXPORT_SYMBOL_NOVERS(aau_return_buffer);

1135 EXPORT_SYMBOL_NOVERS(aau_memcpy);

1136

1137 module_init(aau_init);

Appendix B Example Calling Source Code

B.1 Standard Calls

Support functions for the 80310 AAU

======

Dave Jiang

Last updated: 09/18/2001

The Intel® 80312 I/O companion chip in the 80310 chipset contains an AAU. The
AAU can process up to 8 source data blocks and perform XOR operations on
them. It is typically used to accelerate the XOR operations required by
RAID storage device drivers such as RAID 5. This API provides a set of
functions that take advantage of the AAU. The AAU can also transfer data
blocks and so serve as a memory copier. Because the AAU transfers memory
faster than a CPU copy, it is recommended to use the AAU for memory copy.
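As a point of reference for the XOR operation described above, here is a minimal CPU-side sketch of the result the AAU produces in hardware. The helper name xor_blocks() and the constant XOR_MAX_SRC are assumptions made for this illustration; they are not part of the driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XOR_MAX_SRC 8   /* the text states the AAU handles up to 8 sources */

/*
 * CPU reference for the operation the AAU offloads: dest[i] becomes the
 * XOR of byte i of every source block. xor_blocks() is a hypothetical
 * helper written for this sketch only.
 */
static void xor_blocks(uint8_t *dest, const uint8_t *const src[],
                       size_t nsrc, size_t len)
{
    size_t i, s;

    assert(nsrc >= 1 && nsrc <= XOR_MAX_SRC);
    for (i = 0; i < len; i++) {
        uint8_t acc = src[0][i];

        for (s = 1; s < nsrc; s++)
            acc ^= src[s][i];
        dest[i] = acc;
    }
}
```

With two sources this yields RAID 5 style parity. XORing that parity with one of the source blocks gives back the other, which is how a lost block is rebuilt.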

------

int aau_request(u32 *aau_context, const char *device_id);

This function lets the user acquire control of the AAU. It returns an
AAU context to the user and allocates an interrupt for the AAU. The user
must pass the context as a parameter to the other AAU API calls.

int aau_queue_buffer(u32 aau_context, aau_head_t *listhead);

This function starts the AAU operation. The user must create an SGL
header with an SGL attached. The format is presented below. The SGL is
built from kernel memory.

/* hardware descriptor */

typedef struct _aau_desc

{

u32 NDA; /* next descriptor address [READONLY] */

u32 SAR[AAU_SAR_GROUP]; /* src addrs */

u32 DAR; /* destination addr */

u32 BC; /* byte count */

u32 DC; /* descriptor control */

u32 SARE[AAU_SAR_GROUP]; /* extended src addrs */

} aau_desc_t;

/* user SGL format */

typedef struct _aau_sgl

{

aau_desc_t aau_desc; /* AAU HW Desc */

u32 status; /* status of SGL [READONLY] */

struct _aau_sgl *next; /* pointer to next SG [READONLY] */

void *dest; /* destination addr */

void *src[AAU_SAR_GROUP]; /* source addr[4] */

void *ext_src[AAU_SAR_GROUP]; /* ext src addr[4] */

u32 total_src; /* total number of source */

} aau_sgl_t;

/* header for user SGL */

typedef struct _aau_head

{

u32 total; /* total descriptors allocated */

u32 status; /* SGL status */

aau_sgl_t *list; /* ptr to head of list */

aau_callback_t callback; /* callback func ptr */

} aau_head_t;

The function queues the SGL to the processing queue and then calls
aau_start() to start the AAU. After that it either:

a. Sleeps on the wait queue aau->wait_q until the chain completes, if no
callback has been provided, or

b. Returns immediately; the provided callback function is called once the
AAU interrupt has been triggered.
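The chain handed to aau_queue_buffer() has to mark its last element so the driver can tell where it ends. The sketch below shows that step in isolation, following the pattern used by aau_memcpy() in Appendix A. The struct demo_sgl type and the DEMO_* bit values are stand-ins invented for this example; the real definitions live in iop310-aau.h and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for this sketch only. */
#define DEMO_DCR_IE     0x00000001u  /* interrupt enable: marks end of chain */
#define DEMO_END_CHAIN  0x00000001u  /* software status flag on last element */

/* Cut-down stand-in for aau_sgl_t. */
struct demo_sgl {
    uint32_t dc;             /* stands in for aau_desc.DC */
    uint32_t bc;             /* stands in for aau_desc.BC */
    uint32_t status;
    struct demo_sgl *next;
};

/* Fill every element and flag only the last one. Returns the element count. */
static int demo_prepare_chain(struct demo_sgl *list, uint32_t dc_cmd,
                              uint32_t bytes)
{
    int n = 0;

    assert(list != NULL);
    for (; list; list = list->next) {
        list->status = 0;
        list->bc = bytes;
        list->dc = dc_cmd;
        if (!list->next) {
            /* OR the flag in so the command bits are kept */
            list->dc |= DEMO_DCR_IE;
            list->status |= DEMO_END_CHAIN;
        }
        n++;
    }
    return n;
}
```

Setting the interrupt enable bit with an OR rather than a plain assignment matters: an assignment would discard the command bits already written to the last descriptor.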

int aau_suspend(u32 aau_context);

Stops or suspends the AAU operation.

void aau_free(u32 aau_context);

Releases ownership of the AAU. Call this when AAU service is no longer needed.

aau_sgl_t * aau_get_buffer(u32 aau_context, int num_buf);

This function obtains an AAU SGL for the user. User must specify the number

of descriptors to be allocated in the chain that is returned.

void aau_return_buffer(u32 aau_context, aau_sgl_t *list);

This function returns all SGL back to the API after user is done.
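The get/return pair behaves like an all-or-nothing allocator over a fixed pool: a request for N descriptors either yields a chain of N or nothing at all. The following user-space sketch illustrates that idea only. It leaves out the spinlock and the sleep-and-retry loop of the real driver, and every demo_* name and the pool size are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 4   /* stands in for MAX_AAU_DESC; value is arbitrary */

struct demo_desc { struct demo_desc *next; };

static struct demo_desc demo_pool[DEMO_POOL_SIZE];
static struct demo_desc *demo_free;   /* top of the free stack */

static void demo_pool_init(void)
{
    int i;

    demo_free = NULL;
    for (i = 0; i < DEMO_POOL_SIZE; i++) {
        demo_pool[i].next = demo_free;
        demo_free = &demo_pool[i];
    }
}

static int demo_free_count(void)
{
    int n = 0;
    struct demo_desc *d;

    for (d = demo_free; d; d = d->next)
        n++;
    return n;
}

/* Push a whole chain back on the free stack (the aau_return_buffer role). */
static void demo_return(struct demo_desc *list)
{
    while (list) {
        struct demo_desc *next = list->next;  /* save before relinking */

        list->next = demo_free;
        demo_free = list;
        list = next;
    }
}

/* Hand out n chained descriptors, or NULL and leave the pool unchanged. */
static struct demo_desc *demo_get(int n)
{
    struct demo_desc *head = NULL, *tail = NULL;

    if (n <= 0 || n > DEMO_POOL_SIZE)
        return NULL;
    while (n--) {
        struct demo_desc *d = demo_free;

        if (!d) {
            demo_return(head);   /* undo the partial allocation */
            return NULL;
        }
        demo_free = d->next;
        d->next = NULL;
        if (tail)
            tail->next = d;
        else
            head = d;
        tail = d;
    }
    assert(tail != NULL && tail->next == NULL);
    return head;
}
```

A failed request leaves the pool exactly as it found it, so a caller can simply try again later or fall back to a CPU path.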

int aau_memcpy(void *dest, void *src, u32 size);

This function is a shortcut for copying memory with the AAU, which handles
large block copies better than the CPU does. It is used much like a
typical memcpy() call.

* User is responsible for the source address(es) and the destination address.

The source and destination should all be cached memory.
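Because aau_memcpy() returns a negative errno when no descriptor is available (for example -ENOMEM), a caller may want a CPU fallback. The sketch below shows one way to arrange that. The wrapper copy_with_fallback(), the cut-over size DEMO_OFFLOAD_MIN, and the two demo_offload_* stand-ins are assumptions made for this illustration and are not part of the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Signature modeled on aau_memcpy(): negative errno on failure. */
typedef int (*copy_offload_fn)(void *dest, void *src, unsigned int size);

#define DEMO_OFFLOAD_MIN 256u  /* hypothetical cut-over size; tune by measurement */

/* Returns 1 if the offload engine did the copy, 0 if the CPU did. */
static int copy_with_fallback(copy_offload_fn offload, void *dest, void *src,
                              unsigned int size)
{
    assert(dest != NULL && src != NULL);
    if (offload && size >= DEMO_OFFLOAD_MIN && offload(dest, src, size) >= 0)
        return 1;
    memcpy(dest, src, size);   /* small block, no engine, or engine failed */
    return 0;
}

/* Stand-in that always reports "no descriptors available". */
static int demo_offload_busy(void *dest, void *src, unsigned int size)
{
    (void)dest; (void)src; (void)size;
    return -ENOMEM;
}

/* Stand-in that behaves like a successful offloaded copy. */
static int demo_offload_ok(void *dest, void *src, unsigned int size)
{
    memcpy(dest, src, size);
    return 0;
}
```

In a driver, the function pointer would be aau_memcpy itself. The size threshold is a design choice rather than something the source specifies; the text only states that the AAU is better suited to large blocks.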

void aau_test(void)
{
    u32 aau;
    char dev_id[] = "AAU";
    int size = 2;
    int err = 0;
    aau_head_t *head;
    aau_sgl_t *list;    /* head of the SGL chain */
    aau_sgl_t *sgl;     /* cursor used while filling the chain */
    void *src, *dest;

    printk("Starting AAU test\n");
    if((err = aau_request(&aau, dev_id)) < 0)
    {
        printk("test - AAU request failed: %d\n", err);
        return;
    }
    else
    {
        printk("test - AAU request successful\n");
    }

    head = kmalloc(sizeof(aau_head_t), GFP_KERNEL);
    src = kmalloc(1024, GFP_KERNEL);
    dest = kmalloc(1024, GFP_KERNEL);
    list = aau_get_buffer(aau, size);
    if(!head || !src || !dest || !list)
    {
        printk("Can't get buffers\n");
        goto out;
    }

    head->total = size;
    head->status = 0;
    head->callback = NULL;
    head->list = list;

    for(sgl = list; sgl; sgl = sgl->next)
    {
        sgl->status = 0;
        /* aau_desc is embedded in the SGL; the AAU needs physical addresses */
        sgl->src[0] = src;
        sgl->aau_desc.SAR[0] = (u32) virt_to_phys(src);
        sgl->dest = dest;
        sgl->aau_desc.DAR = (u32) virt_to_phys(dest);
        sgl->aau_desc.BC = 1024;
        /* see iop310-aau.h for more DCR commands */
        sgl->aau_desc.DC = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF;
        if(!sgl->next)
        {
            /* last descriptor: add interrupt enable, keep the command bits */
            sgl->aau_desc.DC |= AAU_DCR_IE;
            sgl->status |= AAU_END_CHAIN;
        }
    }

    printk("test- Queueing buffer for AAU operation\n");
    err = aau_queue_buffer(aau, head);
    if(err >= 0)
    {
        printk("AAU Queue Buffer is done...\n");
    }
    else
    {
        printk("AAU Queue Buffer failed...: %d\n", err);
    }

out:
    printk("freeing the AAU\n");
    if(list)
        aau_return_buffer(aau, list);
    aau_free(aau);
    kfree(src);
    kfree(dest);
    kfree((void *)head);
}

All Disclaimers apply. Use this at your own discretion. Neither Intel nor I

will be responsible if anything goes wrong. =)

TODO

____

* Testing

* Do zero-size AAU transfer/channel at init

so all we have to do is chaining

Appendix C MMU Functions for Intel® XScale™ Microarchitecture

/*

* linux/arch/arm/mm/proc-.S

*

* Author: Nicolas Pitre

* Created: November 2000

* Copyright: (C) 2000, 2001 MontaVista Software Inc.

*

* This program is free software; you can redistribute it and/or modify

* it under the terms of the GNU General Public License version 2 as

* published by the Free Software Foundation.

*

* MMU functions for the Intel® XScale™ microarchitecture

*

* 2001 Aug 21:

* some contributions by Brett Gaines

* Copyright 2001 by Intel Corp.

*

* 2001 Sep 08:

* Completely revisited, many important fixes

* Nicolas Pitre

*/

#include

#include

#include

#include

#include

/*

* This is the maximum size of an area which will be flushed. If the area

* is larger than this, then we flush the whole cache

*/

#define MAX_AREA_SIZE 32768

/*

* the cache line size of the I and D cache

*/

#define CACHELINESIZE 32

/*

* the size of the data cache

*/

#define CACHESIZE 32768

/*

* and the page size

*/

#define PAGESIZE 4096

/*

* Virtual address used to allocate the cache when flushed

*

* This must be an address range which is _never_ used. It should

* apparently have a mapping in the corresponding page table for

* compatibility with future CPUs that _could_ require it. For instance we

* don't care.

*

* This must be aligned on a 2*CACHESIZE boundary. The code selects one of

* the 2 areas alternating each time the clean_d_cache macro is used.

* Without this the Intel® XScale™ core exhibits cache eviction problems and no one

* knows why.

*

* Reminder: the vector table is located at 0xffff0000-0xffff0fff.

*/

#define CLEAN_ADDR 0xfffe0000

/*

* This macro is used to wait for a CP15 write and is needed

* when we have to ensure that the last operation to the co-pro

* was completed before continuing with operation.

*/

        .macro  cpwait, rd
        mrc     p15, 0, \rd, c2, c0, 0  @ arbitrary read of cp15
        mov     \rd, \rd                @ wait for completion
        sub     pc, pc, #4              @ flush instruction pipeline
        .endm

        .macro  cpwait_ret, lr, rd
        mrc     p15, 0, \rd, c2, c0, 0  @ arbitrary read of cp15
        sub     pc, \lr, \rd, LSR #32   @ wait for completion and
                                        @ flush instruction pipeline
        .endm

/*

* This macro cleans the entire dcache using line allocate.

* The main loop has been unrolled to reduce loop overhead.

* rd and rs are two scratch registers.

*/

        .macro  clean_d_cache, rd, rs
        ldr     \rs, =clean_addr
        ldr     \rd, [\rs]
        eor     \rd, \rd, #CACHESIZE
        str     \rd, [\rs]
        add     \rs, \rd, #CACHESIZE
1:      mcr     p15, 0, \rd, c7, c2, 5  @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5  @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5  @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5  @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        teq     \rd, \rs
        bne     1b
        .endm

.data

clean_addr:     .word   CLEAN_ADDR

.text

/*

* cpu_xscale_data_abort()

*

* obtain information about current aborted instruction

*

* r0 = address of aborted instruction

*

* Returns:

* r0 = address of abort

* r1 != 0 if writing

* r3=FSR

*/

        .align  5
ENTRY(cpu_xscale_data_abort)
        mov     r2, r0
        mrc     p15, 0, r0, c6, c0, 0   @ get FAR
        mrc     p15, 0, r3, c5, c0, 0   @ get FSR
        ldr     r1, [r2]                @ read aborted instruction
        tst     r1, r1, lsr #21         @ C = bit 20
        sbc     r1, r1, r1              @ r1 = C - 1
        and     r3, r3, #255
        mov     pc, lr

/*

* cpu_xscale_check_bugs()

*/

ENTRY(cpu_xscale_check_bugs)

mrs ip, cpsr

bic ip, ip, #F_BIT

msr cpsr, ip

mov pc, lr

/*

* cpu_xscale_proc_init()

*

* Nothing too exciting at the moment

*/

ENTRY(cpu_xscale_proc_init)

mov pc, lr

/*

* cpu_xscale_proc_fin()

*/

ENTRY(cpu_xscale_proc_fin)
        str     lr, [sp, #-4]!
        mov     r0, #F_BIT|I_BIT|SVC_MODE
        msr     cpsr_c, r0
        mrc     p15, 0, r0, c1, c0, 0   @ ctrl register
        bic     r0, r0, #0x1800         @ ...IZ......
        bic     r0, r0, #0x0006         @ ...... CA.
        mcr     p15, 0, r0, c1, c0, 0   @ disable caches
        bl      cpu_xscale_cache_clean_invalidate_all   @ clean caches
        ldr     pc, [sp], #4

/*

* cpu_xscale_reset(loc)

*

* Perform a soft reset of the system. Put the CPU into the

* same state as it would be if it had been reset, and branch

* to what would be the reset vector.

*

* loc: location to jump to for soft reset

*/

        .align  5
ENTRY(cpu_xscale_reset)
        mov     r1, #F_BIT|I_BIT|SVC_MODE
        msr     cpsr_c, r1              @ reset CPSR
        mrc     p15, 0, r1, c1, c0, 0   @ ctrl register
        bic     r1, r1, #0x0086         @ ...... B....CA.
        bic     r1, r1, #0x1900         @ ...IZ..S......
        mcr     p15, 0, r1, c1, c0, 0   @ ctrl register
        mcr     p15, 0, ip, c7, c7, 0   @ invalidate I,D caches & BTB
        bic     r1, r1, #0x0001         @ ...... M
        mcr     p15, 0, r1, c1, c0, 0   @ ctrl register
        @ CAUTION: MMU turned off from this point. We count on the pipeline
        @ already containing those two last instructions to survive.
        mcr     p15, 0, ip, c8, c7, 0   @ invalidate I & D TLBs
        mov     pc, r0

/*

* cpu_xscale_do_idle(type)

*

* Cause the processor to idle

*

* type:

* 0 = slow idle

* 1 = fast idle

* 2 = switch to slow processor clock

* 3 = switch to fast processor clock

*

* For now we do nothing but go to idle mode for every case

*

* Intel® XScale™ microarchitecture supports clock switching, but using idle mode

* support allows external hardware to react to system state changes.

*/

        .align  5
ENTRY(cpu_xscale_do_idle)
        mov     r0, #1
        mcr     p14, 0, r0, c7, c0, 0   @ Go to IDLE
        mov     pc, lr

/* ======CACHE ======*/

/*

* cpu_xscale_cache_clean_invalidate_all (void)

*

* clean and invalidate all cache lines

*

* Note:

* 1. We should preserve r0 at all times.

* 2. Even if this function implies cache "invalidation" by its name,

* we don't need to actually use explicit invalidation operations

* since the goal is to discard all valid references from the cache

* and the cleaning of it already has that effect.

* 3. Because of 2 above and the fact that kernel space memory is always

* coherent across task switches there is no need to worry about

* inconsistencies due to interrupts, hence no irq disabling.

*/

        .align  5
ENTRY(cpu_xscale_cache_clean_invalidate_all)
        mov     r2, #1
cpu_xscale_cache_clean_invalidate_all_r2:
        clean_d_cache r0, r1
        teq     r2, #0
        mcrne   p15, 0, ip, c7, c5, 0   @ Invalidate I cache & BTB
        mcr     p15, 0, ip, c7, c10, 4  @ Drain Write (& Fill) Buffer
        mov     pc, lr

/*

* cpu_xscale_cache_clean_invalidate_range(start, end, flags)

*

* clean and invalidate all cache lines associated with this area of memory

*

* start: Area start address

* end: Area end address

* flags: nonzero for I cache as well

*/

        .align  5
ENTRY(cpu_xscale_cache_clean_invalidate_range)
        bic     r0, r0, #CACHELINESIZE - 1      @ round down to cache line
        sub     r3, r1, r0
        cmp     r3, #MAX_AREA_SIZE
        bhi     cpu_xscale_cache_clean_invalidate_all_r2
1:      mcr     p15, 0, r0, c7, c10, 1  @ Clean D cache line
        mcr     p15, 0, r0, c7, c6, 1   @ Invalidate D cache line
        add     r0, r0, #CACHELINESIZE
        cmp     r0, r1
        blo     1b
        teq     r2, #0
        mcr     p15, 0, ip, c7, c10, 4  @ Drain Write (& Fill) Buffer
        moveq   pc, lr
        sub     r0, r0, r3
1:      mcr     p15, 0, r0, c7, c5, 1   @ Invalidate I cache line
        add     r0, r0, #CACHELINESIZE
        cmp     r0, r1
        blo     1b
        mcr     p15, 0, ip, c7, c5, 6   @ Invalidate BTB
        mov     pc, lr

/*

* cpu_xscale_flush_ram_page(page)

*

* clean all cache lines associated with this memory page

*

* page: page to clean

*/

.align 5

ENTRY(cpu_xscale_flush_ram_page)

mov r1, #PAGESIZE

1: mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

subs r1, r1, #2 * CACHELINESIZE

bne 1b

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr

/* ======D-CACHE ======*/

/*

* cpu_xscale_dcache_invalidate_range(start, end)

*

* throw away all D-cached data in specified region without an obligation

* to write them back. Note however that on Intel® XScale™ microarchitecture we

* must clean all entries also due to hardware errata (80200 A0 & A1 only).

*

* start: virtual start address

* end: virtual end address

*/

.align 5

ENTRY(cpu_xscale_dcache_invalidate_range)

mrc p15, 0, r2, c0, c0, 0@ Read part no.

eor r2, r2, #0x69000000

eor r2, r2, #0x00052000@ 80200 XX part no.

bics r2, r2, #0x1 @ Clear LSB in revision field

moveq r2, #0

beq cpu_xscale_cache_clean_invalidate_range@ An 80200 A0 or A1

tst r0, #CACHELINESIZE - 1

mcrne p15, 0, r0, c7, c10, 1 @ Clean D cache line

tst r1, #CACHELINESIZE - 1

mcrne p15, 0, r1, c7, c10, 1 @ Clean D cache line

bic r0, r0, #CACHELINESIZE - 1@ round down to cache line

1: mcr p15, 0, r0, c7, c6, 1@ Invalidate D cache line

add r0, r0, #CACHELINESIZE

cmp r0, r1

blo 1b

mov pc, lr
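
The part-number test at the top of cpu_xscale_dcache_invalidate_range is plain bit arithmetic and can be checked on a host machine. The C sketch below is an illustration, not part of the listing, and is_80200_a0_or_a1 is a hypothetical name. It mirrors the two eor instructions and the bics: the result is zero only when the ID register reads 0x69052000 or 0x69052001.

```c
#include <stdint.h>

/*
 * Host-side model of the part-number test:
 *   eor  r2, r2, #0x69000000
 *   eor  r2, r2, #0x00052000
 *   bics r2, r2, #0x1
 * A zero result means every bit except the LSB of the revision
 * field matches 0x69052000.
 */
static int is_80200_a0_or_a1(uint32_t cpu_id)
{
    uint32_t diff = cpu_id ^ 0x69000000u;
    diff ^= 0x00052000u;
    diff &= ~0x1u;          /* ignore the revision LSB */
    return diff == 0;
}
```

Splitting the constant across two eor instructions keeps each immediate within what a single ARM data-processing instruction can encode.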

/*

* cpu_xscale_dcache_clean_range(start, end)

*

* For the specified virtual address range, ensure that all caches contain

* clean data, such that peripheral accesses to the physical RAM fetch

* correct data.

*

* start: virtual start address

* end: virtual end address

*/

.align 5

ENTRY(cpu_xscale_dcache_clean_range)

bic r0, r0, #CACHELINESIZE - 1

sub r2, r1, r0

cmp r2, #MAX_AREA_SIZE

movhi r2, #0

bhi cpu_xscale_cache_clean_invalidate_all_r2

1: mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

cmp r0, r1

blo 1b

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr

/*

* cpu_xscale_dcache_clean_page(page)

*

* Cleans a single page of dcache so that if we have any future aliased

* mappings, they will be consistent at the time that they are created.

*

* Note:

* 1. we don't need to flush the write buffer in this case.

* 2. we don't invalidate the entries since when we write the page

* out to disk, the entries may get reloaded into the cache.

*/

.align 5

ENTRY(cpu_xscale_dcache_clean_page)

mov r1, #PAGESIZE

1: mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

add r0, r0, #CACHELINESIZE

subs r1, r1, #4 * CACHELINESIZE

bne 1b

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr

/*

* cpu_xscale_dcache_clean_entry(addr)

*

* Clean the specified entry of any caches such that the MMU

* translation fetches will obtain correct data.

*

* addr: cache-unaligned virtual address

*/

.align 5

ENTRY(cpu_xscale_dcache_clean_entry)

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr

/* ======I-CACHE ======*/

/*

* cpu_xscale_icache_invalidate_range(start, end)

*

* invalidate a range of virtual addresses from the Icache

*

* start: virtual start address

* end: virtual end address

*

* Note: This is vaguely defined as supposed to bring the dcache and the

* icache in sync by the way this function is used.

*/

.align 5

ENTRY(cpu_xscale_icache_invalidate_range)

bic r0, r0, #CACHELINESIZE - 1

1: mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

mcr p15, 0, r0, c7, c5, 1@ Invalidate I cache line

add r0, r0, #CACHELINESIZE

cmp r0, r1

blo 1b

mcr p15, 0, ip, c7, c5, 6@ Invalidate BTB

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr

/*

* cpu_xscale_icache_invalidate_page(page)

*

* invalidate all Icache lines associated with this area of memory

*

* page: page to invalidate

*/

.align 5

ENTRY(cpu_xscale_icache_invalidate_page)

mov r1, #PAGESIZE

1: mcr p15, 0, r0, c7, c5, 1@ Invalidate I cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c5, 1@ Invalidate I cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c5, 1@ Invalidate I cache line

add r0, r0, #CACHELINESIZE

mcr p15, 0, r0, c7, c5, 1@ Invalidate I cache line

add r0, r0, #CACHELINESIZE

subs r1, r1, #4 * CACHELINESIZE

bne 1b

mcr p15, 0, r0, c7, c5, 6@ Invalidate BTB

mov pc, lr

/* ======CACHE LOCKING======

*

* The Intel® XScale™ microarchitecture implements support for locking entries into

* the data and instruction cache. The following functions implement the core

* low level instructions needed to accomplish the locking. The developer's

* manual states that the code that performs the locking must be in non-cached

* memory. To accomplish this, the code in xscale-cache-lock.c copies the

* following functions from the cache into a non-cached memory region that

* is allocated through consistent_alloc().

*

*/

.align 5

/*

* xscale_icache_lock

*

* r0: starting address to lock

* r1: end address to lock

*/

ENTRY(xscale_icache_lock)

iLockLoop:

bic r0, r0, #CACHELINESIZE - 1

mcr p15, 0, r0, c9, c1, 0@ lock into cache

cmp r0, r1 @ are we done?

add r0, r0, #CACHELINESIZE@ advance to next cache line

bls iLockLoop

mov pc, lr

/*

* xscale_icache_unlock

*/

ENTRY(xscale_icache_unlock)

mcr p15, 0, r0, c9, c1, 1@ Unlock icache

mov pc, lr

/*

* xscale_dcache_lock

*

* r0: starting address to lock

* r1: end address to lock

*/

ENTRY(xscale_dcache_lock)

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov r2, #1

mcr p15, 0, r2, c9, c2, 0@ Put dcache in lock mode

cpwait ip @ Wait for completion

mrs r2, cpsr

orr r3, r2, #F_BIT | I_BIT

dLockLoop:

msr cpsr_c, r3

mcr p15, 0, r0, c7, c10, 1@ Write back line if it is dirty

mcr p15, 0, r0, c7, c6, 1@ Flush/invalidate line

msr cpsr_c, r2

ldr ip, [r0], #CACHELINESIZE @ Preload 32 bytes into cache from

@ location [r0]. Post-increment

@ r0 to next cache line

cmp r0, r1 @ Are we done?

bls dLockLoop

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov r2, #0

mcr p15, 0, r2, c9, c2, 0@ Get out of lock mode

cpwait_ret lr, ip

/*

* xscale_dcache_unlock

*/

ENTRY(xscale_dcache_unlock)

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mcr p15, 0, ip, c9, c2, 1@ Unlock cache

mov pc, lr

/*

* Needed to determine the length of the code that needs to be copied.

*/

.align 5

ENTRY(xscale_cache_dummy)

mov pc, lr

/* ======TLB ======*/

/*

* cpu_xscale_tlb_invalidate_all()

*

* Invalidate all TLB entries

*/

.align 5

ENTRY(cpu_xscale_tlb_invalidate_all)

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mcr p15, 0, ip, c8, c7, 0 @ invalidate I & D TLBs

cpwait_ret lr, ip

/*

* cpu_xscale_tlb_invalidate_range(start, end)

*

* invalidate TLB entries covering the specified range

*

* start: range start address

* end: range end address

*/

.align 5

ENTRY(cpu_xscale_tlb_invalidate_range)

bic r0, r0, #(PAGESIZE - 1) & 0x00ff

bic r0, r0, #(PAGESIZE - 1) & 0xff00

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

1: mcr p15, 0, r0, c8, c6, 1@ invalidate D TLB entry

mcr p15, 0, r0, c8, c5, 1@ invalidate I TLB entry

add r0, r0, #PAGESIZE

cmp r0, r1

blo 1b

cpwait_ret lr, ip
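
cpu_xscale_tlb_invalidate_range rounds the start address down to a page boundary with two bic instructions rather than one. An ARM data-processing immediate is an 8-bit value rotated right by an even amount, so a 12-bit mask such as 0xfff cannot be encoded in a single instruction. The C sketch below is an illustration only: it assumes 4 KB pages, and the helper names are hypothetical. It checks which masks are encodable and confirms that the two-step clear equals a single page-mask operation.

```c
#include <stdint.h>

#define PAGESIZE 4096u  /* assumption: 4 KB pages */

/* Rotate a 32-bit value left by n bits. */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v << n) | (v >> (32u - n)) : v;
}

/*
 * An ARM immediate is imm8 rotated right by an even amount. Undoing
 * that rotation (rotating left) must leave a value that fits in 8 bits.
 */
static int arm_imm_encodable(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        if ((rotl32(v, rot) & ~0xffu) == 0)
            return 1;
    }
    return 0;
}

/* Model of the two bic instructions at the top of the routine. */
static uint32_t page_align_two_step(uint32_t addr)
{
    addr &= ~((PAGESIZE - 1u) & 0x00ffu);  /* clear offset bits 0-7  */
    addr &= ~((PAGESIZE - 1u) & 0xff00u);  /* clear offset bits 8-11 */
    return addr;
}
```

Under these assumptions, 0xfff is not encodable, while 0x0ff and 0xf00 each are, which is why the mask is split.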

/*

* cpu_xscale_tlb_invalidate_page(page, flags)

*

* invalidate the TLB entries for the specified page.

*

* page: page to invalidate

* flags: non-zero if we include the I TLB

*/

.align 5

ENTRY(cpu_xscale_tlb_invalidate_page)

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

teq r1, #0

mcr p15, 0, r0, c8, c6, 1@ invalidate D TLB entry

mcrne p15, 0, r0, c8, c5, 1 @ invalidate I TLB entry

cpwait_ret lr, ip

/* ======TLB LOCKING======

*

* The Intel® XScale™ microarchitecture implements support for locking entries into

* the Instruction and Data TLBs. The following functions provide the

* low level support for supporting these under Linux. xscale-lock.c

* implements some higher level management code. Most of the following

* is taken straight out of the Developer's Manual.

*/

/*

* Lock I-TLB entry

*

* r0: Virtual address to translate and lock

*/

.align 5

ENTRY(xscale_itlb_lock)

mrs r2, cpsr

orr r3, r2, #F_BIT | I_BIT

msr cpsr_c, r3 @ Disable interrupts

mcr p15, 0, r0, c8, c5, 1@ Invalidate I-TLB entry

mcr p15, 0, r0, c10, c4, 0@ Translate and lock

msr cpsr_c, r2 @ Restore interrupts

cpwait_ret lr, ip

/*

* Lock D-TLB entry

*

* r0: Virtual address to translate and lock

*/

.align 5

ENTRY(xscale_dtlb_lock)

mrs r2, cpsr

orr r3, r2, #F_BIT | I_BIT

msr cpsr_c, r3 @ Disable interrupts

mcr p15, 0, r0, c8, c6, 1@ Invalidate D-TLB entry

mcr p15, 0, r0, c10, c8, 0@ Translate and lock

msr cpsr_c, r2 @ Restore interrupts

cpwait_ret lr, ip

/*

* Unlock all I-TLB entries

*/

.align 5

ENTRY(xscale_itlb_unlock)

mcr p15, 0, ip, c10, c4, 1@ Unlock I-TLB

mcr p15, 0, ip, c8, c5, 0@ Invalidate I-TLB

cpwait_ret lr, ip

/*

* Unlock all D-TLB entries

*/

ENTRY(xscale_dtlb_unlock)

mcr p15, 0, ip, c10, c8, 1 @ Unlock D-TLB

mcr p15, 0, ip, c8, c6, 0@ Invalidate D-TLB

cpwait_ret lr, ip

/* ======Page Table ======*/

#define USER_CACHE_WRITE_ALLOCATE 1

#define KERN_CACHE_WRITE_ALLOCATE 1

#define PMD_TYPE_MASK 0x0003

#define PMD_TYPE_SECT 0x0002

#define PMD_SECT_BUFFERABLE 0x0004

#define PMD_SECT_CACHEABLE 0x0008

#define PMD_SECT_TEX_X 0x1000

#define HPTE_TYPE_SMALLEXT 0x0003

#define HPTE_SMALLEXT_TEX_X 0x0040

/*

* cpu_xscale_set_pgd(pgd)

*

* Set the translation base pointer to be as described by pgd.

*

* pgd: new page tables

*/

.align 5

ENTRY(cpu_xscale_set_pgd)

clean_d_cache r1, r2

mcr p15, 0, ip, c7, c5, 0@ Invalidate I cache & BTB

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mcr p15, 0, r0, c2, c0, 0@ load page table pointer

mcr p15, 0, ip, c8, c7, 0 @ invalidate I & D TLBs

cpwait_ret lr, ip

/*

* cpu_xscale_set_pmd(pmdp, pmd)

*

* Set a level 1 translation table entry, and clean it out of

* any caches such that the MMUs can load it correctly.

*

* pmdp: pointer to PMD entry

* pmd: PMD value to store

*/

.align 5

ENTRY(cpu_xscale_set_pmd)

#if KERN_CACHE_WRITE_ALLOCATE

and r2, r1, #PMD_TYPE_MASK|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE

cmp r2, #PMD_TYPE_SECT|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE

orreq r1, r1, #PMD_SECT_TEX_X

#endif

str r1, [r0]

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr
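
The conditional block in cpu_xscale_set_pmd sets the TEX X bit only for section entries that are both cacheable and bufferable. The C sketch below models that test on a host machine, using the constant values defined in the page-table section of the listing. It is an illustration, not part of the listing, and pmd_with_write_allocate is a hypothetical name.

```c
#include <stdint.h>

/* Values as defined in the listing's page-table section. */
#define PMD_TYPE_MASK       0x0003u
#define PMD_TYPE_SECT       0x0002u
#define PMD_SECT_BUFFERABLE 0x0004u
#define PMD_SECT_CACHEABLE  0x0008u
#define PMD_SECT_TEX_X      0x1000u

/*
 * Host-side model of:
 *   and   r2, r1, #PMD_TYPE_MASK|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE
 *   cmp   r2, #PMD_TYPE_SECT|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE
 *   orreq r1, r1, #PMD_SECT_TEX_X
 */
static uint32_t pmd_with_write_allocate(uint32_t pmd)
{
    const uint32_t mask = PMD_TYPE_MASK | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE;
    const uint32_t want = PMD_TYPE_SECT | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE;

    if ((pmd & mask) == want)
        pmd |= PMD_SECT_TEX_X;   /* only for cacheable, bufferable sections */
    return pmd;
}
```

A section entry such as 0xC000000E gains the bit and becomes 0xC000100E. An entry that is not bufferable, or that is not a section descriptor, is returned unchanged.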

/*

* cpu_xscale_set_pte(ptep, pte)

*

* Set a PTE and flush it out

*/

.align 5

ENTRY(cpu_xscale_set_pte)

str r1, [r0], #-1024@ linux version

bic r2, r1, #0xff0

bic r2, r2, #3

eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY | LPTE_BUFFERABLE | LPTE_CACHEABLE

tst r1, #LPTE_USER | LPTE_EXEC@ User or Exec?

orrne r2, r2, #HPTE_AP_READ

tst r1, #LPTE_WRITE | LPTE_DIRTY @ Write and Dirty?

orreq r2, r2, #HPTE_AP_WRITE

#if USER_CACHE_WRITE_ALLOCATE

tst r1, #LPTE_CACHEABLE | LPTE_BUFFERABLE@ B and C

orrne r2, r2, #HPTE_TYPE_SMALL

biceq r2, r2, #0x0fc0 @ clear non-exist AP[1-3]

orreq r2, r2, #HPTE_TYPE_SMALLEXT | HPTE_SMALLEXT_TEX_X

#else

orr r2, r2, #HPTE_TYPE_SMALL

#endif

tst r1, #LPTE_PRESENT | LPTE_YOUNG@ Present and Young?

movne r2, #0

str r2, [r0] @ hardware version

mov r0, r0

mcr p15, 0, r0, c7, c10, 1@ Clean D cache line

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mov pc, lr
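
The translation that cpu_xscale_set_pte performs from the Linux PTE to the hardware PTE can be followed more easily as a host-side model. The C sketch below is an illustration under stated assumptions and is not part of the listing. The LPTE_* bit positions and the HPTE_TYPE_SMALL, HPTE_AP_READ and HPTE_AP_WRITE values are not defined in this listing and are assumed here. HPTE_TYPE_SMALLEXT and HPTE_SMALLEXT_TEX_X use the listing's values, and the model takes the USER_CACHE_WRITE_ALLOCATE path. hw_pte_from_linux_pte is a hypothetical name.

```c
#include <stdint.h>

/* Assumed Linux-side PTE bits (not defined in this listing). */
#define LPTE_PRESENT    (1u << 0)
#define LPTE_YOUNG      (1u << 1)
#define LPTE_BUFFERABLE (1u << 2)
#define LPTE_CACHEABLE  (1u << 3)
#define LPTE_USER       (1u << 4)
#define LPTE_WRITE      (1u << 5)
#define LPTE_EXEC       (1u << 6)
#define LPTE_DIRTY      (1u << 7)

/* Assumed hardware-side values (not defined in this listing). */
#define HPTE_TYPE_SMALL 0x0002u
#define HPTE_AP_READ    0x0aa0u
#define HPTE_AP_WRITE   0x0550u

/* Values taken from the listing. */
#define HPTE_TYPE_SMALLEXT  0x0003u
#define HPTE_SMALLEXT_TEX_X 0x0040u

static uint32_t hw_pte_from_linux_pte(uint32_t lpte)
{
    uint32_t hpte = lpte & ~0xff0u;            /* bic r2, r1, #0xff0 */
    hpte &= ~3u;                               /* bic r2, r2, #3     */

    /* After the eor, a listed bit that was set in lpte reads as 0. */
    uint32_t inv = lpte ^ (LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE |
                           LPTE_DIRTY | LPTE_BUFFERABLE | LPTE_CACHEABLE);

    if (inv & (LPTE_USER | LPTE_EXEC))         /* user or exec (not inverted) */
        hpte |= HPTE_AP_READ;
    if ((inv & (LPTE_WRITE | LPTE_DIRTY)) == 0) /* both write and dirty set */
        hpte |= HPTE_AP_WRITE;

    if (inv & (LPTE_CACHEABLE | LPTE_BUFFERABLE)) {
        hpte |= HPTE_TYPE_SMALL;               /* C and B not both set */
    } else {
        hpte &= ~0x0fc0u;                      /* keep only the first AP field */
        hpte |= HPTE_TYPE_SMALLEXT | HPTE_SMALLEXT_TEX_X;
    }

    if (inv & (LPTE_PRESENT | LPTE_YOUNG))     /* not present or not young */
        hpte = 0;

    return hpte;
}
```

Inverting the tested bits with a single eor lets each later tst check that all bits of a group are set by testing for a zero result, which is what the eq conditions in the listing rely on.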

.ltorg

cpu_manu_name:

.asciz "Intel"

cpu_80200_name:

.asciz "XScale-80200"

cpu_cotulla_name:

.asciz "XScale-Cotulla"

.align

.section ".text.init", #alloc, #execinstr

__xscale_setup:

mov r0, #F_BIT|I_BIT|SVC_MODE

msr cpsr_c, r0

mcr p15, 0, ip, c7, c7, 0@ invalidate I, D caches & BTB

mcr p15, 0, ip, c7, c10, 4@ Drain Write (& Fill) Buffer

mcr p15, 0, ip, c8, c7, 0@ invalidate I, D TLBs

mcr p15, 0, r4, c2, c0, 0@ load page table pointer

mov r0, #0x1f @ Domains 0, 1 = client

mcr p15, 0, r0, c3, c0, 0@ load domain access register

mrc p15, 0, r0, c1, c0, 0@ get control register

bic r0, r0, #0x0200 @ ......R.........

bic r0, r0, #0x0082 @ ........B.....A.

orr r0, r0, #0x0005 @ .............C.M

orr r0, r0, #0x3900 @ ..VIZ..S........

mov pc, lr
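
The control register value that __xscale_setup leaves in r0 is the result of four mask operations. The C sketch below reproduces that arithmetic on a host machine so the final value can be checked. It is an illustration, not part of the listing, and xscale_setup_control is a hypothetical name. The bit letters in the comments are those used in the listing.

```c
#include <stdint.h>

/*
 * Host-side model of the control register update:
 *   bic r0, r0, #0x0200
 *   bic r0, r0, #0x0082
 *   orr r0, r0, #0x0005
 *   orr r0, r0, #0x3900
 */
static uint32_t xscale_setup_control(uint32_t cr)
{
    cr &= ~0x0200u;  /* clear R (bit 9) */
    cr &= ~0x0082u;  /* clear B (bit 7) and A (bit 1) */
    cr |= 0x0005u;   /* set C (bit 2) and M (bit 0) */
    cr |= 0x3900u;   /* set V (13), I (12), Z (11) and S (8) */
    return cr;
}
```

Starting from a register value of zero, the routine returns 0x3905. Bits outside the four masks pass through unchanged.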

.text

/*

* Purpose : Function pointers used to access above functions - all calls

* come through these

*/

.type xscale_processor_functions, #object

ENTRY(xscale_processor_functions)

.word cpu_xscale_data_abort

.word cpu_xscale_check_bugs

.word cpu_xscale_proc_init

.word cpu_xscale_proc_fin

.word cpu_xscale_reset

.word cpu_xscale_do_idle

/* cache */

.word cpu_xscale_cache_clean_invalidate_all

.word cpu_xscale_cache_clean_invalidate_range

.word cpu_xscale_flush_ram_page

/* dcache */

.word cpu_xscale_dcache_invalidate_range

.word cpu_xscale_dcache_clean_range

.word cpu_xscale_dcache_clean_page

.word cpu_xscale_dcache_clean_entry

/* icache */

.word cpu_xscale_icache_invalidate_range

.word cpu_xscale_icache_invalidate_page

/* tlb */

.word cpu_xscale_tlb_invalidate_all

.word cpu_xscale_tlb_invalidate_range

.word cpu_xscale_tlb_invalidate_page

/* pgtable */

.word cpu_xscale_set_pgd

.word cpu_xscale_set_pmd

.word cpu_xscale_set_pte

.size xscale_processor_functions, . - xscale_processor_functions

.type cpu_80200_info, #object

cpu_80200_info:

.long cpu_manu_name

.long cpu_80200_name

.size cpu_80200_info, . - cpu_80200_info

.type cpu_cotulla_info, #object

cpu_cotulla_info:

.long cpu_manu_name

.long cpu_cotulla_name

.size cpu_cotulla_info, . - cpu_cotulla_info

.type cpu_arch_name, #object

cpu_arch_name:

.asciz "armv5"

.size cpu_arch_name, . - cpu_arch_name

.type cpu_elf_name, #object

cpu_elf_name:

.asciz "v5"

.size cpu_elf_name, . - cpu_elf_name

.align

.section ".proc.info", #alloc, #execinstr

.type __80200_proc_info,#object

__80200_proc_info:

.long 0x69052000

.long 0xfffffff0

.long 0x00000c0e

b __xscale_setup

.long cpu_arch_name

.long cpu_elf_name

.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP

.long cpu_80200_info

.long xscale_processor_functions

.size __80200_proc_info, . - __80200_proc_info

.type __cotulla_proc_info,#object

__cotulla_proc_info:

.long 0x69052100

.long 0xfffffff0

.long 0x00000c0e

b __xscale_setup

.long cpu_arch_name

.long cpu_elf_name

.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP

.long cpu_cotulla_info

.long xscale_processor_functions

.size __cotulla_proc_info, . - __cotulla_proc_info
