Intel® 80310 I/O Processor Chipset AAU Coding Techniques
White Paper
January 14, 2002
Document Number: 273649-001 Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The Intel® 80310 I/O processor may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an ordering number and are referenced in this document, or other Intel literature may be obtained by calling 1-800-548-4725 or by visiting Intel's website at http://www.intel.com. 
Copyright© Intel Corporation, 2002 AlertVIEW, i960, AnyPoint, AppChoice, BoardWatch, BunnyPeople, CablePort, Celeron, Chips, Commerce Cart, CT Connect, CT Media, Dialogic, DM3, EtherExpress, ETOX, FlashFile, GatherRound, i386, i486, iCat, iCOMP, Insight960, InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel ChatPad, Intel Create&Share, Intel Dot.Station, Intel GigaBlade, Intel InBusiness, Intel Inside, Intel Inside logo, Intel NetBurst, Intel NetStructure, Intel Play, Intel Play logo, Intel Pocket Concert, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel TeamStation, Intel WebOutfitter, Intel Xeon, Intel XScale, Itanium, JobAnalyst, LANDesk, LanRover, MCS, MMX, MMX logo, NetPort, NetportExpress, Optimizer logo, OverDrive, Paragon, PC Dads, PC Parents, Pentium, Pentium II Xeon, Pentium III Xeon, Performance at Your Command, ProShare, RemoteExpress, Screamline, Shiva, SmartDie, Solutions960, Sound Mark, StorageExpress, The Computer Inside, The Journey Inside, This Way In, TokenExpress, Trillium, Vivonic, and VTune are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.
Contents
1.0 White Paper Purpose and Description
1.1 Document Highlights
1.2 Related Documents
2.0 Application Accelerator Unit
2.0.1 Overview
3.0 Low-Level Design Document
3.1 Objective
3.1.1 AAU Implementation
3.1.1.1 Overview
3.1.2 Assumptions
3.1.3 Initialization
3.1.4 AAU Data Structures
3.1.5 Data Path
3.1.6 API Functions
3.1.6.1 API Listing
3.1.6.1.1 AAU Public
3.1.6.1.2 AAU Private (Static)
3.1.6.2 Selected API Descriptions
3.1.6.2.1 static int __init aau_init(void);
3.1.6.2.2 static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);
3.1.6.2.3 int aau_request(u32 *aau_context);
3.1.6.2.4 int aau_suspend(u32 aau_context);
3.1.6.2.5 int aau_resume(u32 aau_context);
3.1.6.2.6 int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);
3.1.6.2.7 static int aau_flush_all(u32 aau_context);
3.1.6.2.8 int aau_free(u32 aau_context);
3.1.6.2.9 static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);
3.1.6.2.10 static void aau_process(iop310_aau_t *aau);
3.1.6.2.11 static void aau_result_handler(void *aau);
3.1.6.2.12 aau_sgl_t * aau_get_buffer(u32 aau_context, u32 num_buf);
3.1.6.2.13 void aau_return_buffer(u32 aau_context, sgl_list_t *list);
4.0 Code Commentary
4.1 Section Objectives
4.1.1 File Organization Overview
4.1.1.1 Key Data Structure and Use of Casting
4.1.2 Cache Memory
4.1.3 Other AAU Hardware
4.1.4 Virtual to Physical memory
4.1.5 Interrupt Handling
4.1.5.1 Top Half Interrupt Handler: aau_irq_handler()
4.1.5.2 Bottom Half Interrupt Handler: aau_task()
4.1.6 Linux Kernel APIs
4.2 Optimization Related
4.2.1 Stack versus Queue
4.2.2 Chaining and Resume
4.2.3 Requiring the Application to Supply Physical Addresses in AAU Descriptor (versus virtual addresses)
4.2.4 Allocations of Memory for AAU Descriptors During Initialization
4.2.5 Using AAU for Local Memory to Local Memory Copy: mem_copy()
5.0 Potential Enhancements
5.1 Error Handling
5.2 Lookaside Cache Scheme (This is Linux specific)
5.3 Extensive Intel Optimization Related Documentation
6.0 Conclusion
A AAU Source Code
A.1 Public Definitions for Intel® 80310 I/O Processor Chipset AAU: \include\aau.h
A.2 Private Definitions for Intel® XScale™ Microarchitecture AAU: \src\aau.h
A.3 Support Functions for the Intel® 80310 I/O Processor Chipset AAU: \src\aau.c
B Example Calling Source Code
B.1 Standard Calls
C MMU Functions for Intel® XScale™ Microarchitecture
Figures

1 Application Accelerator Unit
2 AAU State Trace Diagram
3 Interrupt Handler Functional Flow Diagram

Tables

1 Acronyms
2 AAU Control Registers
3 DC Field Description
4 AAU Registers
5 AAU Hardware Descriptor Format
6 AAU Hardware Descriptor
7 AAU Software Descriptor Structure
8 AAU Device Descriptor
9 User SGL Header
10 AAU User SGL Structure
Revision History
Date          Revision  Description
January 2002  001       Initial Release.
1.0 White Paper Purpose and Description
Increasing I/O demands are central to Network and Storage high performance applications. Intel® XScale™ microarchitecture (ARM* architecture compliant) addresses this trend with the Intel® 80310 I/O processor chipset (80310). Features of the Intel® 80310 solution include the Application Accelerator Unit (AAU).
The purpose of this paper is to give Intel customers a fast development ramp in using the Application Accelerator Unit (AAU) on the 80310. This is achieved by providing an implementation case study. The contents of this document are meant to supplement the Intel® 80312 I/O Companion Chip Developer’s Manual, Chapter 10, the Intel-referenced Optimization Guides, and the other Intel documentation listed in Section 1.2, “Related Documents”.
1.1 Document Highlights
• Section 2.0, “Application Accelerator Unit”: AAU hardware overview.
• Section 1.2, “Related Documents”: A listing of related documents and web links.
• Section 3.0, “Low-Level Design Document”: A case study presenting a low-level design document used in a Linux implementation of the AAU hardware.
• Section A, “AAU Source Code”: The Linux implementation source code.
• Section 4.0, “Code Commentary” and Section 5.0, “Potential Enhancements”: Code commentary discussing the implementation with source code line references. The commentary identifies the optimizations implemented, interrupt handling, and potential enhancements to the existing implementation.
• Section B, “Example Calling Source Code”: Examples of calling the source code APIs.
• Section C, “MMU Functions for Intel® XScale™ Microarchitecture”: A listing of the MMU implementation called in the source code.
1.2 Related Documents
• Intel® 80312 I/O Companion Chip Developer’s Manual (273410).
• Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual (273411).
• Intel® IQ80310 Evaluation Platform Board Manual (273431).
• Intel® XScale™ Microarchitecture Coding Techniques White Paper (273578).

Other Application Notes and tools:
• http://www.intel.com/design/iio/docs/iop310.htm.
• http://www.intel.com/design/iio/devtools/tptools.htm.
• http://www.intel.com/design/intelxscale/.
2.0 Application Accelerator Unit
2.0.1 Overview
The AAU provides low-latency, high-throughput data transfer capability between the AAU and the Intel® 80200 processor based on XScale™ microarchitecture (ARM* architecture compliant) local memory. It executes data transfers to and from Intel® 80200 processor (80200) local memory and also provides the necessary programming interface. The Application Accelerator performs the following functions:
• Transfers data (read) from memory controller.
• Performs an optional boolean operation (XOR) on read data.
• Transfers data (write) to memory controller.

The AAU features:
• 1 KB, arranged as 8-byte x 128-deep store queue.
  - Configurable to 512 bytes, arranged as 8-byte x 64-deep store queue.
• Utilization of the Intel® 80312 I/O companion chip (80312) memory controller interface.
• 2^32 addressing range on the 80200 local memory interface.
• Hardware support for unaligned data transfers for the internal bus.
• Fully programmable from the 80200.
• Support for automatic data chaining for gathering and scattering of data blocks.
Figure 1 shows a simplified connection of the Application Accelerator to the 80312 internal bus.

Figure 1. Application Accelerator Unit

[Figure: the Application Accelerator Unit connected to the internal bus.]
3.0 Low-Level Design Document
3.1 Objective
This section presents the low level design details of the AAU API for Intel® XScale™ microarchitecture embedded Linux.
Table 1. Acronyms
Term  Definition
AAU   Application Accelerator Unit
API   Application Programming Interface
OS    Operating System
PCI   Peripheral Component Interconnect
SGL   Scatter-Gather List
3.1.1 AAU Implementation
3.1.1.1 Overview
The 80312 contains an AAU that implements the XOR algorithm in hardware. It can perform the XOR operation on multiple blocks of source data and store the result back in 80200 local memory. The embedded Linux for Intel® XScale™ microarchitecture does not currently support the AAU functionality of the 80312. As a result, it cannot take advantage of the AAU's XOR capabilities when it performs certain checksum calculations for a RAID 5 storage solution. Doing the XOR operations in software results in a drastic performance hit. The implementation outlined here describes the changes that need to be made to embedded Linux for Intel® XScale™ microarchitecture in order to utilize the AAU and take advantage of the hardware acceleration.
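For reference, the software XOR that the AAU is meant to replace can be sketched as follows. This is a minimal illustration, not code from the Linux RAID 5 driver; the function name xor_blocks and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Software reference for the operation the AAU performs in hardware:
 * XOR num_src equally sized source blocks into one destination block.
 * The name and signature are illustrative only. */
void xor_blocks(uint8_t *dest, const uint8_t *const src[],
                size_t num_src, size_t byte_count)
{
    for (size_t i = 0; i < byte_count; i++) {
        uint8_t acc = 0;
        for (size_t s = 0; s < num_src; s++)
            acc ^= src[s][i];   /* accumulate byte i of every source */
        dest[i] = acc;
    }
}
```

Because XOR is its own inverse, running the same routine over the parity block and the surviving data blocks reconstructs a lost block, which is why RAID 5 spends so much time in this loop when it runs in software.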
The AAU API is intended to abstract the hardware away from driver developers and to provide the functions a developer needs to utilize the AAU. The AAU contains the following registers:
Table 2. AAU Control Registers

Register  Register Name                             Description
ACR       Accelerator Control Register              Accelerator Control Word specifies parameters that dictate the overall operating environment, such as enabling the accelerator and others.
ASR       Accelerator Status Register               Accelerator Control Status shows the status of the accelerator, which includes transfer task done and errors.
ADAR      Descriptor Address Register               Address of Current Chain Descriptor is the address of the descriptor currently being processed.
ANDAR     Next Descriptor Address Register          Address of Next Chain Descriptor points to the next descriptor that is linked to the current descriptor. A NULL value indicates it is the end of the descriptor chain.
SAR[4]    80312 Local Source Address Registers      80200 Address of Source points to the local address of the source data.
DAR       80312 Local Destination Address Register  80200 Address of Destination points to the local memory region where the computed result is written to.
ABCR      Byte Counter Register                     The Byte Count contains the number of bytes to transfer for a XOR-transfer operation.
ADCR      Descriptor Control Register               Chain Descriptor Control Word contains control values for data transfer on a per-chain descriptor basis.
SARE[4]   Extended Local Source Address Registers   Additional 4 registers for source data (same as SAR).
Table 3 shows the various bits in the Descriptor Control (DC) field.
Table 3. DC Field Description

Bit    Default  Description
31     0        Destination Write Enable – This bit triggers the write back of the XOR operation result.
30:27  0        Reserved
26:25  00       Supplemental Block Control Interpreter – These two bits enable the extended source blocks: 00 – 0 additional blocks; 01 – 4 additional blocks; 10 – reserved; 11 – reserved.
24:01  0        Command Controls for all source blocks. The function performed on each block is either nothing or XOR. Block 1 (bits 03:01) can also have the Direct Fill command (binary 111) set instead of the XOR command (binary 001). Direct Fill puts the data directly into the buffer instead of XORing it with what is already in the buffer. This command is also useful when using the AAU for copying data blocks.
00     0        Interrupt Enable – When set, the AAU triggers an interrupt to the Intel® 80200 processor upon completion of the descriptor.
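The DC layout in Table 3 can be turned into a small helper that assembles a control word. The sketch below is an illustration under stated assumptions, not the driver's code: all macro and function names are hypothetical, the Direct Fill and XOR codes are treated as the three-bit values 111 and 001 because each block has a three-bit field, and the placement of block n at bits 3n:3n-2 is inferred from "Block 1 (bits 03:01)" and the 24:01 range.

```c
#include <stdint.h>

/* Bit positions taken from Table 3; names are hypothetical. */
#define AAU_DC_DWE        (1u << 31)  /* Destination Write Enable          */
#define AAU_DC_SBCI_4MORE (1u << 25)  /* SBCI = 01: four additional blocks */
#define AAU_DC_IE         (1u << 0)   /* Interrupt Enable                  */

/* Three-bit per-block command codes. */
#define AAU_CMD_XOR         0x1u      /* binary 001 */
#define AAU_CMD_DIRECT_FILL 0x7u      /* binary 111 */

/* ASSUMPTION: block n (1-based, 1..8) occupies bits 3n:3n-2, so block 1
 * sits in bits 3:1 and block 8 in bits 24:22. */
uint32_t aau_dc_block_cmd(unsigned block, uint32_t cmd)
{
    return (cmd & 0x7u) << (3u * (block - 1u) + 1u);
}

/* Build a DC word for num_src sources (1..8): direct-fill the first block,
 * XOR the rest in, write the result back, optionally raise an interrupt. */
uint32_t aau_build_dc(unsigned num_src, int interrupt)
{
    uint32_t dc = AAU_DC_DWE | aau_dc_block_cmd(1, AAU_CMD_DIRECT_FILL);

    for (unsigned b = 2; b <= num_src && b <= 8; b++)
        dc |= aau_dc_block_cmd(b, AAU_CMD_XOR);
    if (num_src > 4)
        dc |= AAU_DC_SBCI_4MORE;      /* enable the extended source fields */
    if (interrupt)
        dc |= AAU_DC_IE;
    return dc;
}
```

With a single source and Direct Fill, the same word describes a plain block copy, which matches the table's remark about using the AAU for copying data.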
3.1.2 Assumptions
In the Linux environment, memory is cached unless otherwise stated. Cache coherency must be maintained by the AAU API by performing cleaning or invalidating at appropriate data locations.
The AAU API assumes that the application driver that utilizes the API follows strict usage guidelines outlined in this document.
3.1.3 Initialization
Initialization is done during kernel initialization. The AAU is initialized after the interrupt controller has been initialized during kernel setup. The AAU registers are all at their default reset values before initialization. The AAU initialization sequence is as follows:
• Disable accelerator by clearing the ACR register.
• Setup and initialize all resource queues and stack.
• Setup and initialize all spinlocks.
• Allocate a number of AAU hardware descriptors.
• Align hardware descriptors to eight 32-bit word boundaries.
• Allocate a corresponding number of AAU software descriptors.
• Link each hardware descriptor to software descriptor.
• Put software descriptors on the free resource stack.
• Assign appropriate interrupt numbers.
• Assign proper registers.
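The alignment step in the sequence above amounts to rounding an address up to the next multiple of eight 32-bit words (32 bytes). A minimal sketch, with a hypothetical helper name and constant:

```c
#include <stdint.h>

#define AAU_DESC_ALIGN 32u   /* eight 32-bit words, in bytes */

/* Round addr up to the next 32-byte boundary. Addresses that are already
 * aligned are returned unchanged. */
uintptr_t aau_align_up(uintptr_t addr)
{
    return (addr + (AAU_DESC_ALIGN - 1u)) & ~(uintptr_t)(AAU_DESC_ALIGN - 1u);
}
```

An allocator that does not guarantee this alignment would need to over-allocate by up to 31 bytes and keep the original, unaligned address so the memory can be freed later.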
3.1.4 AAU Data Structures
The data structure in Table 4 maps directly onto the AAU registers to allow easy access to them.

Table 4. AAU Registers

    typedef struct _aau_regs_t {
        volatile u32 ACR;    /* Accelerator Control Register */
        volatile u32 ASR;    /* Accelerator Status Register */
        volatile u32 ADAR;   /* Descriptor Address Register */
        volatile u32 ANDAR;  /* Next Desc Address Register */
        volatile u32 LSAR;   /* Local Source Address */
        volatile u32 LDAR;   /* Local Destination Address */
        volatile u32 ABCR;   /* Byte Count */
        volatile u32 ADCR;   /* Descriptor Control */
    } aau_regs_t;
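A register overlay like this only works if the compiler lays the fields out as consecutive 32-bit words. The sketch below restates the structure with uint32_t standing in for the kernel's u32, checks the layout at compile time, and shows the first initialization step (clearing ACR). The function name aau_disable is hypothetical, and in a real driver the pointer would refer to the memory-mapped register base rather than ordinary memory.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;   /* stand-in for the kernel's u32 */

typedef struct _aau_regs_t {
    volatile u32 ACR;    /* Accelerator Control Register */
    volatile u32 ASR;    /* Accelerator Status Register  */
    volatile u32 ADAR;   /* Descriptor Address Register  */
    volatile u32 ANDAR;  /* Next Desc Address Register   */
    volatile u32 LSAR;   /* Local Source Address         */
    volatile u32 LDAR;   /* Local Destination Address    */
    volatile u32 ABCR;   /* Byte Count                   */
    volatile u32 ADCR;   /* Descriptor Control           */
} aau_regs_t;

/* Eight words, no padding: otherwise the overlay would miss registers. */
_Static_assert(sizeof(aau_regs_t) == 32, "register block must be 8 words");
_Static_assert(offsetof(aau_regs_t, ADCR) == 28, "ADCR must be the 8th word");

/* Disable the accelerator by clearing the ACR register. */
void aau_disable(aau_regs_t *regs)
{
    regs->ACR = 0;
}
```

The volatile qualifier keeps the compiler from caching or reordering away the register accesses.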
To start an AAU operation, an AAU hardware descriptor chain is built in local memory. Each hardware descriptor must be aligned on an 8-word boundary and consists of eight contiguous words, followed by four optional words for the extended source addresses. The hardware descriptor format is illustrated in Table 5. One or more hardware descriptors form an AAU descriptor chain.
Table 5. AAU Hardware Descriptor Format
Next Descriptor Address (NDA)
Source Address (SAR[0])
Source Address (SAR[1])
Source Address (SAR[2])
Source Address (SAR[3])
Destination Address (DAR)
Byte Count (BC)
Descriptor Control (DC)
Source Address (SARE[0]) [optional]
Source Address (SARE[1]) [optional]
Source Address (SARE[2]) [optional]
Source Address (SARE[3]) [optional]
The NDA points to the next descriptor, thus forming a chain. The chain is terminated by a NULL-valued NDA. The descriptor provides pointers to four source addresses, which supply the source data for the XOR computation. The result of the XOR computation is written to the local memory location pointed to by the DAR. The BC field contains the number of bytes in the block of data at each source address. All blocks of data pointed to by the source addresses have the same size. For example, when SAR[0] points to 1024 bytes of data, each of the other valid source addresses must also point to a 1024-byte block. A bit in the DC field enables four additional source address fields for processing when more than four data sources are required for the XOR computation. The optional fields shall not be used until all four existing source fields are utilized. The DC field also contains various mode bits that control operations on a per-descriptor basis.
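The chaining rule can be sketched as a helper that links an array of descriptors through their NDA fields. This is an illustration only: aau_link_chain is a hypothetical name, the physical addresses are supplied by the caller as plain numbers, and setting the Interrupt Enable bit (bit 0 of DC) on the last descriptor follows the behaviour this paper describes for aau_queue_buffer().

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

typedef struct _aau_desc_t {
    u32 NDA;      /* Next Descriptor Address */
    u32 SAR[4];   /* Source Addresses 0-3 */
    u32 DAR;      /* Destination Address */
    u32 BC;       /* Byte Count */
    u32 DC;       /* Descriptor Control */
    u32 SARE[4];  /* Extended Source Addresses 0-3 */
} aau_desc_t;

#define AAU_DC_IE (1u << 0)   /* Interrupt Enable, bit 0 of DC */

/* Link desc[0..count-1] into one chain. phys[i] is the physical address of
 * desc[i] (hypothetical input; a driver would translate the addresses). */
void aau_link_chain(aau_desc_t *desc, const u32 *phys, size_t count)
{
    if (count == 0)
        return;
    for (size_t i = 0; i + 1 < count; i++)
        desc[i].NDA = phys[i + 1];        /* point at the next descriptor */
    desc[count - 1].NDA = 0;              /* NULL NDA terminates the chain */
    desc[count - 1].DC |= AAU_DC_IE;      /* interrupt at end of chain */
}
```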
The hardware descriptor for the AAU is presented in Table 6. This format is required by the AAU hardware. Source addresses 5 through 8 are optional. Any source address field that is not used must contain the NULL value. When any source address contains the NULL value, all the following source addresses must also contain the NULL value. All the source addresses and the destination address must be 80200 local addresses, and they must be physical addresses rather than virtual addresses.

Table 6. AAU Hardware Descriptor

    typedef struct _aau_desc_t {
        u32 NDA;      /* Next Descriptor Address */
        u32 SAR[4];   /* Source Addresses 0-3 */
        u32 DAR;      /* Destination Address */
        u32 BC;       /* Byte Count */
        u32 DC;       /* Descriptor Control */
        u32 SARE[4];  /* Extended Source Addresses 0-3 */
    } aau_desc_t;
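The rule that no valid source address may follow a NULL one is easy to check before a descriptor is handed to the hardware. The following is a sketch with a hypothetical function name; it is not part of the API listed in this paper.

```c
#include <stdint.h>

typedef uint32_t u32;

typedef struct _aau_desc_t {
    u32 NDA;      /* Next Descriptor Address */
    u32 SAR[4];   /* Source Addresses 0-3 */
    u32 DAR;      /* Destination Address */
    u32 BC;       /* Byte Count */
    u32 DC;       /* Descriptor Control */
    u32 SARE[4];  /* Extended Source Addresses 0-3 */
} aau_desc_t;

/* Return the number of leading non-NULL source addresses (0..8), or -1 if
 * a non-NULL address appears after a NULL one. */
int aau_count_sources(const aau_desc_t *d)
{
    u32 addr[8];
    int n = 0;

    for (int i = 0; i < 4; i++) {
        addr[i] = d->SAR[i];        /* sources 1-4 */
        addr[i + 4] = d->SARE[i];   /* sources 5-8 */
    }
    while (n < 8 && addr[n] != 0)
        n++;
    for (int i = n; i < 8; i++)
        if (addr[i] != 0)
            return -1;              /* gap in the source list */
    return n;
}
```

A result greater than four also indicates that the extended source blocks have to be enabled in the DC field.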
A software descriptor is created to encapsulate each AAU hardware descriptor. The software descriptor contains additional information and status about the hardware descriptor that the hardware descriptor itself does not hold. The software descriptor also enables the use of stack and queue data structures to keep track of and manipulate the hardware descriptors without making any format changes to the hardware descriptor. A pool of software descriptors is allocated during initialization and put on a stack. An equal number of hardware descriptors is created and encapsulated by the software descriptors. The resource pool removes the performance penalty of dynamically allocating descriptors during operation.
The data structure in Table 7 describes the AAU software descriptor.

Table 7. AAU Software Descriptor Structure

    typedef struct _sw_aau_t {
        aau_desc_t aau_desc;      /* AAU HW desc */
        u32 status;               /* AAU Status */
        struct _aau_sgl *next;    /* pointer to next sgl */
        void *dest;               /* Destination */
        void *src[4];             /* Source */
        void *ext_src[4];         /* Extended Source */
        u32 total_src;            /* total src addresses */
        struct list_head link;    /* link to queue */
        u32 aau_phys;             /* AAU Physical Addr */
        u32 desc_addr;            /* HW unaligned addr */
        u32 sgl_head;             /* User SGL head Addr */
        struct _sw_aau_t *head;   /* Head of list */
        struct _sw_aau_t *tail;   /* Tail of list */
    } sw_aau_t;
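The preallocated pool behaves as a simple LIFO stack of free descriptors. The sketch below shows the idea with a deliberately simplified descriptor type; the type and function names are hypothetical, the real structure links through struct list_head, and the spinlock that protects the stack in the driver is omitted.

```c
#include <stddef.h>

/* Simplified stand-in for a software descriptor. */
typedef struct sw_desc {
    struct sw_desc *next_free;  /* link used only while on the free stack */
    int id;
} sw_desc_t;

typedef struct {
    sw_desc_t *top;
} free_stack_t;

/* Return a descriptor to the pool. */
void stack_push(free_stack_t *s, sw_desc_t *d)
{
    d->next_free = s->top;
    s->top = d;
}

/* Take a descriptor from the pool; NULL means the pool is exhausted. */
sw_desc_t *stack_pop(free_stack_t *s)
{
    sw_desc_t *d = s->top;

    if (d != NULL) {
        s->top = d->next_free;
        d->next_free = NULL;
    }
    return d;
}

/* One-time setup: put every preallocated descriptor on the free stack. */
void pool_init(free_stack_t *s, sw_desc_t *pool, size_t count)
{
    s->top = NULL;
    for (size_t i = 0; i < count; i++) {
        pool[i].id = (int)i;
        stack_push(s, &pool[i]);
    }
}
```

Both operations are constant time and never call the allocator, which is the point of setting the pool up once during initialization.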
The AAU shall also have a global device descriptor that allows access to the accelerator registers, processing queues, queue locks, and accelerator status.
The data structure in Table 8 describes the AAU device. It keeps track of all the variables that are related to the AAU.

Table 8. AAU Device Descriptor

    typedef struct _iop310_aau_t {
        const char *dev_id;         /* Device ID */
        list_t process_q;           /* Processing Q */
        list_t holding_q;           /* Holding Q */
        spinlock_t lock_pq;         /* PQ spinlock */
        spinlock_t lock_hq;         /* HQ spinlock */
        aau_regs_t *regs;           /* AAU registers */
        int irq;                    /* IRQ number */
        sw_aau_t *last_aau;         /* ptr to last AAU desc */
        struct tq_struct aau_task;  /* AAU task entry */
        wait_queue_head_t wait_q;   /* AAU wait queue */
        atomic_t ref_count;         /* AAU Reference count */
    } iop310_aau_t;
The following structures represent the data format applications use to pass data to the AAU API. The application creates an SGL header that points to an SGL. When no callback function is required, the callback value must be set to NULL. The status field should be zeroed out before being passed down. The end of the list is always marked by setting the next_sgl pointer of the last SGL element to NULL.

Table 9. User SGL Header

    typedef struct _aau_sgl_head_t {
        u32 total;                /* total SGLs */
        aau_sgl_t *list;          /* Pointer to list head */
        u32 status;               /* SG status */
        aau_callback_t callback;  /* Callback func ptr */
    } aau_sgl_head_t;
The AAU descriptor data structure is filled out by the user. It maps over the first portion of the software descriptor sw_aau_t data structure. Casting eliminates a copy of values from one data structure to another. When the user passes a correct user SGL list, all the API has to do is re-cast the list into software descriptors and feed it to the processing queue. This requires slightly more knowledge of the AAU fields on the user's part, but improves the performance of the AAU operation considerably.

Table 10. AAU User SGL Structure

    typedef struct _aau_sgl_t {
        aau_desc_t aau_desc;
        u32 status;
        struct _aau_sgl_t *next_sgl;  /* Pointer to next SG */
        void *dest;
        void *src[4];                 /* Source group 1 */
        void *src_ext[4];             /* Source group 2 */
        u32 total_src;                /* Total number of sources passed down */
    } aau_sgl_t;
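The casting scheme depends on the user SGL structure and the software descriptor sharing an identical leading layout. A sketch of how that assumption can be checked at compile time is shown below. The software descriptor is abbreviated to a single private field, sw_from_sgl is a hypothetical helper, and the cast is only meaningful when the buffer was originally allocated as a full software descriptor (for example, one handed out by the API itself).

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

typedef struct _aau_desc_t {
    u32 NDA; u32 SAR[4]; u32 DAR; u32 BC; u32 DC; u32 SARE[4];
} aau_desc_t;

typedef struct _aau_sgl_t {
    aau_desc_t aau_desc;
    u32 status;
    struct _aau_sgl_t *next_sgl;
    void *dest;
    void *src[4];
    void *src_ext[4];
    u32 total_src;
} aau_sgl_t;

typedef struct _sw_aau_t {
    aau_desc_t aau_desc;
    u32 status;
    struct _aau_sgl_t *next;
    void *dest;
    void *src[4];
    void *ext_src[4];
    u32 total_src;
    u32 aau_phys;   /* API-private fields follow (abbreviated here) */
} sw_aau_t;

/* The user-visible part must sit at identical offsets in both views. */
_Static_assert(offsetof(aau_sgl_t, status) == offsetof(sw_aau_t, status), "status");
_Static_assert(offsetof(aau_sgl_t, next_sgl) == offsetof(sw_aau_t, next), "next");
_Static_assert(offsetof(aau_sgl_t, dest) == offsetof(sw_aau_t, dest), "dest");
_Static_assert(offsetof(aau_sgl_t, src_ext) == offsetof(sw_aau_t, ext_src), "ext");
_Static_assert(offsetof(aau_sgl_t, total_src) == offsetof(sw_aau_t, total_src), "total");

/* Re-cast a user SGL element to its enclosing software descriptor.
 * No fields are copied; only the pointer type changes. */
sw_aau_t *sw_from_sgl(aau_sgl_t *sgl)
{
    return (sw_aau_t *)sgl;
}
```

If a field is ever added to one structure and not the other, the static assertions fail at build time instead of corrupting descriptors at run time.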
3.1.5 Data Path
The following is required for an application to utilize the AAU hardware through the AAU API. The application must first attempt to request the usage of the AAU by calling the aau_request() function. This function requests and registers an interrupt for the AAU. When successful, the application is allowed to use the AAU. The API also keeps track of the usage of the AAU by using a reference count method. When unsuccessful the error –EBUSY is returned to the caller.
The driver application is required to create a scatter-gather list (SGL) in the aau_sgl_t format with all information for the AAU operation filled in. The driver application is responsible for allocating and keeping track of the memory that stores the AAU input data and result. The application calls the aau_queue_buffer() function to pass down the user SGL. The AAU API generates an AAU descriptor chain from the passed-down SGL using AAU software descriptors from the free AAU resource stack. When no free software descriptors are available, the API goes to sleep for a short period of time and then tries again, up to ten times, before giving up and returning the –ENOMEM error. The function sets the Interrupt Enable bit in the DC field of the last hardware descriptor in the chain to indicate the end of the chain. The function then queues the AAU chain into the processing queue and calls aau_start() for the application. The aau_start() function checks whether the AAU is active. If it is not active, this is a new operation, which requires setting the appropriate bits, linking accordingly, and starting the AAU. If it is active, this is an ongoing operation, which requires appending to the existing chain and setting the chain resume bit. At this point aau_queue_buffer() returns control to the application while the AAU does its work.
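The retry behaviour when the free stack is empty can be sketched as a bounded loop. Everything here is illustrative: the function and type names are hypothetical, the allocator and the sleep are passed in as callbacks so the logic can be exercised outside the kernel, and "ten times" is read as ten attempts in total. The fake_pool helpers exist only to demonstrate the loop.

```c
#include <errno.h>
#include <stddef.h>

#define AAU_GET_DESC_RETRIES 10   /* attempts before giving up */

typedef void *(*aau_pop_fn)(void *ctx);   /* returns NULL when the pool is empty */
typedef void (*aau_wait_fn)(void *ctx);   /* stands in for a short sleep */

/* Try to obtain a descriptor, waiting between failed attempts.
 * Returns 0 and sets *out on success, -ENOMEM after the last attempt. */
int aau_get_desc_retry(aau_pop_fn pop, aau_wait_fn wait, void *ctx, void **out)
{
    for (int attempt = 0; attempt < AAU_GET_DESC_RETRIES; attempt++) {
        void *d = pop(ctx);

        if (d != NULL) {
            *out = d;
            return 0;
        }
        wait(ctx);   /* give in-flight operations time to free descriptors */
    }
    *out = NULL;
    return -ENOMEM;
}

/* Demonstration pool that stays empty for a configurable number of polls. */
struct fake_pool { int empty_polls_left; int waits; int item; };

void *fake_pop(void *ctx)
{
    struct fake_pool *p = ctx;

    if (p->empty_polls_left > 0) {
        p->empty_polls_left--;
        return NULL;
    }
    return &p->item;
}

void fake_wait(void *ctx)
{
    ((struct fake_pool *)ctx)->waits++;
}
```

Bounding the loop keeps a caller from blocking forever when descriptors are leaked or the pool is sized too small for the workload.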
The application has two choices in handling the result of AAU completion:
1. Sleep on the AAU's wait queue until notified by the bottom half interrupt handler when the operation is complete.
2. Continue, and be notified when the operation on the chain is complete by a callback function supplied via the SGL passed down.
The AAU meanwhile processes the chain and triggers an interrupt when it encounters the Interrupt Enable Bit being set in a descriptor being processed or an error condition is encountered.
When an AAU interrupt is asserted, the interrupt handler function aau_irq_handler() is called. Clearing an interrupt requires clearing it at the source, which in this case is the Accelerator Status Register. Once the Accelerator Status Register has been cleared in the interrupt handler, the hardware clears the interrupt assert bit as a consequence. As long as the AAU interrupt remains asserted due to new AAU interrupts, the interrupt handler continues to remove descriptors from the channel process queue and put them in the channel holding queue until the ADAR value equals the address of the descriptor or the queue is empty. When the ADAR equals the descriptor address and the ASR indicates that the channel is active, that descriptor is not removed. Once the interrupt handler no longer sees an AAU interrupt asserted, it schedules a bottom half handler in the immediate task queue to process the holding queue and notify the application of the progress of the AAU operation.
The application calls the function aau_free() when it no longer needs the AAU and wants to release it. Depending on the reference count, the IRQ requested for the AAU may be freed. When there are any errors for the AAU unit, the AAU registers are cleared, all resources are returned, and the reference count shall be reset to 0.
Figure 2 shows the state trace diagram for a normal operation of the AAU. The diagram demonstrates all the necessary function calls that are performed during a normal, simple AAU execution path. The section explaining the APIs in detail follows.

Figure 2. AAU State Trace Diagram

[Figure: call sequence across the User, AAU, INTC, and System columns: aau_init(), aau_get_buffer(), aau_queue_buffer(), aau_start(), AAU complete, aau_irq_handler(), aau_process(), aau_task(), aau_buffer_return(), aau_free(). After queuing, the user either sleeps on the wait queue or proceeds and waits for the callback; aau_task() then invokes the callback or wakes the sleeping caller.]
3.1.6 API Functions
The following functions shall be implemented to support the AAU API for Intel® XScale™ microarchitecture embedded Linux.
3.1.6.1 API Listing
3.1.6.1.1 AAU Public
int aau_request(u32 *aau_context);
int aau_suspend(u32 aau_context);
int aau_resume(u32 aau_context);
int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);
int aau_free(u32 aau_context);
aau_sgl_t* aau_get_buffer(u32 aau_context, u32 num_buff);
void aau_return_buffer(u32 aau_context, sgl_list_t *list);
int aau_memcpy(void *, void *, u32);
3.1.6.1.2 AAU Private (Static)
static int __init aau_init(void);

static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);
static int aau_flush_all(u32 aau_context);
static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);
static void aau_process(iop310_aau_t *aau);
static void aau_result_handler(void *aau);
3.1.6.2 Selected API Descriptions
3.1.6.2.1 static int __init aau_init(void);
Input: N/A
Output: Success -- OK
        Error -- -ENOMEM
Purpose: This function initializes the AAU during kernel init. It initializes all the variables to a ready state and allocates memory for the resource pools. The AAU is in its post-reset state at this point. After initialization the AAU should be in the idle state.

Operation:
• Initialize free resource stack
• Initialize stack lock
• Allocate memory for software descriptors
  - Return error if fail
• Align memory on 8-word (32-byte) boundary
  - Return error if fail
• Push software descriptors onto free resource stack
• Set register addresses for AAU
• Initialize AAU queues and locks
• Initialize wait queue
• Assign interrupt number
• Initialize AAU reference count
• Initialize interrupt bottom handler for immediate process queue
• Zero out ACR
3.1.6.2.2 static int aau_start(iop310_aau_t *aau, sw_aau_t *aau_chain);
Input: aau – pointer to AAU device descriptor
       aau_chain – pointer to AAU descriptor chain to be sent to AAU
Output: Success/Error condition
Purpose: This function starts the AAU, or appends an AAU chain and resumes the operation when a chain is already being processed.

Operation:
• If AAU not active
  - Write AAU descriptor address to ANDAR
  - Set enable accelerator bit in ACR
• Else
  - Link chain to last AAU list tail ANDAR
  - Flush cache for range of tail descriptor ANDAR
  - If channel no longer active
    - Set chain resume bit in ACR
• Set last descriptor pointer in AAU device descriptor
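The start-or-append decision can be sketched against a plain structure that stands in for the AAU registers. This is an illustration under loudly stated assumptions: the bit positions for the ACR enable bit, the ACR chain resume bit, and the ASR active flag are NOT given in this paper and are chosen here for demonstration only; all type and function names are hypothetical; the chain resume bit is set unconditionally on the append path, following the Data Path description; and the cache flush is reduced to a comment.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* ASSUMED bit positions, for illustration only. */
#define AAU_ACR_ENABLE       (1u << 0)
#define AAU_ACR_CHAIN_RESUME (1u << 1)
#define AAU_ASR_ACTIVE       (1u << 10)

typedef struct {
    volatile u32 ACR, ASR, ADAR, ANDAR;
} aau_regs_sketch_t;

/* Simplified descriptor: next-descriptor field plus its physical address. */
typedef struct chain_desc {
    u32 NDA;
    u32 phys;
} chain_desc_t;

typedef struct {
    aau_regs_sketch_t *regs;
    chain_desc_t *last_aau;   /* tail of the chain most recently submitted */
} aau_dev_sketch_t;

/* Start the AAU on a new chain, or append the chain to the running one. */
void aau_start_sketch(aau_dev_sketch_t *aau, chain_desc_t *head,
                      chain_desc_t *tail)
{
    if (!(aau->regs->ASR & AAU_ASR_ACTIVE)) {
        /* Idle: point the AAU at the chain and enable it. */
        aau->regs->ANDAR = head->phys;
        aau->regs->ACR |= AAU_ACR_ENABLE;
    } else {
        /* Busy: hook the new chain onto the previous tail. A real driver
         * would clean the cache line holding last_aau->NDA here so the
         * AAU sees the updated link. */
        aau->last_aau->NDA = head->phys;
        aau->regs->ACR |= AAU_ACR_CHAIN_RESUME;
    }
    aau->last_aau = tail;     /* remember where the next append goes */
}
```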
3.1.6.2.3 int aau_request(u32 *aau_context);
Input: aau_context – pass by reference AAU context. Written back by function.
Output: success -- OK
        failed -- -EINVAL
Purpose: This function requests an interrupt for the AAU from the kernel and returns the AAU descriptor to the driver application.

Operation:
• Register IRQ with kernel
• Increment reference count of AAU
• Return AAU device descriptor to user
3.1.6.2.4 int aau_suspend(u32 aau_context);
Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function suspends the AAU operation. It calls aau_stop() to perform the operation.

Operation:
• Unset bit in ACR that enables AAU operation
3.1.6.2.5 int aau_resume(u32 aau_context);
Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function resumes the AAU operation.

Operation:
• If ASR contains errors
  - Clear errors
  - Flush AAU pipeline
  - Return with error
• Set enable bit in ACR
3.1.6.2.6 int aau_queue_buffer(u32 aau_context, aau_sgl_t *sgl);
Input: aau_context – AAU device context
       sgl – User SGL for AAU to transform to AAU descriptor chain
Output: Success/Error condition
Purpose: This function converts the user SGL to an AAU descriptor chain. The function then puts the chain in the processing queue and starts the AAU.

Operation:
• For all elements in SGL
  - Get AAU software descriptor from free resource stack
  - Convert to AAU descriptor
  - Init appropriate variables in AAU software descriptor
  - Flush cache in appropriate regions
  - Link up AAU chain
• Call aau_start() and pass AAU chain
3.1.6.2.7 static int aau_flush_all(u32 aau_context);
Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function flushes the AAU pipeline, returns all resources to the free stack, and clears any error conditions. This function is only called by the interrupt handler for handling errors.

Operation:
• For all descriptors in processing queue
  - Remove from processing queue
  - Set AAU_INCOMPLETE status mode in descriptor status
  - Put in holding queue
• Clear ASR
3.1.6.2.8 int aau_free(u32 aau_context);
Input: aau_context – AAU device context
Output: Success/Error condition
Purpose: This function attempts to release the IRQ held by AAU.

Operation:
• Decrement AAU reference counter
• If AAU ref count <= 1
  - Free IRQ
3.1.6.2.9 static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs);
Input: irq – IRQ number
       dev_id – Device Descriptor
       regs – CPU registers (not used but required)
Output: N/A
Purpose: This is the interrupt handler for AAU interrupts. It handles any error interrupts or chain complete interrupts depending on the status in the ASR. A bottom handler queued in the immediate task queue by this function begins to process everything in the holding queue when this function exits and the kernel leaves the interrupt space.
Operation:
• If not AAU interrupt
  — Exit
• If AAU error
  — Call aau_flush_all()
• While AAU complete INTs
  — Clear ASR
  — Call aau_process()
• Register bottom handler
Figure 3. Interrupt Handler Functional Flow Diagram
(Flow: enter interrupt handler; check whether the interrupt number is for the AAU; check for errors in the ASR and, if any are found, flush the AAU and clear the errors; move descriptors from the processing queue to the holding queue until done; schedule the bottom handler; exit the interrupt.)
3.1.6.2.10 static void aau_process(iop310_aau_t *aau);
Input: aau – AAU device descriptor
Output: N/A
Purpose: This function removes the completed descriptors from the processing queue and puts them in the holding queue to be processed by the bottom handler later. This function is only called by the interrupt handler.
Operation:
• Do while descriptor address != ADAR and queue not empty
  — Remove from processing queue
  — Put on holding queue
  — If IE bit set in ADCR set AAU_DONE on chain head descriptor
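The loop condition above (stop at the descriptor the AAU is currently working on, or when the queue is empty) can be sketched with a small array-backed FIFO. All sim_ names are hypothetical; the queues hold only descriptor physical addresses, and the IE/AAU_DONE bookkeeping is omitted.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_QUEUE_LEN 8

/* Hypothetical FIFO of descriptor physical addresses; item[0] is oldest. */
typedef struct {
    uint32_t item[SIM_QUEUE_LEN];
    size_t count;
} sim_queue_t;

/* Append to the tail; returns -1 if the queue is full. */
static int sim_push(sim_queue_t *q, uint32_t phys)
{
    if (q->count == SIM_QUEUE_LEN)
        return -1;
    q->item[q->count++] = phys;
    return 0;
}

/* Remove and return the oldest entry; caller guarantees count > 0. */
static uint32_t sim_pop(sim_queue_t *q)
{
    uint32_t v = q->item[0];
    size_t i;

    for (i = 1; i < q->count; i++)
        q->item[i - 1] = q->item[i];
    q->count--;
    return v;
}

/* Move every descriptor that precedes the one named by ADAR from the
 * processing queue to the holding queue. Returns the number moved. */
static size_t sim_process(sim_queue_t *process_q, sim_queue_t *hold_q,
                          uint32_t adar)
{
    size_t moved = 0;

    while (process_q->count > 0 && process_q->item[0] != adar) {
        if (sim_push(hold_q, process_q->item[0]) != 0)
            break;                    /* holding queue full: stop early */
        (void)sim_pop(process_q);
        moved++;
    }
    return moved;
}
```

Because both queues are FIFO, completed descriptors reach the bottom handler in the same order in which they were submitted.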
3.1.6.2.11 static void aau_result_handler(void *aau);
Input: *aau – AAU device descriptor
Output: N/A
Purpose: This function is scheduled by the interrupt handler to finish processing AAU descriptors after the INT handler is done and exits the interrupt space. It notifies the driver performing the AAU operation either by waking the driver up when it is sleeping or by using a callback function provided by the driver.
Operation:
• Do while descriptor status == AAU_DONE
  — Remove descriptor from holding queue
  — Set status on user SGL
  — Return descriptor to free stack
  — If callback function exists
    • Call callback
  — Else if sleeping on wait queue
    • Wake up sleeping process
3.1.6.2.12 aau_sgl_t * aau_get_buffer(u32 aau_context, u32 num_buf);
Input: aau_context – AAU device context
       num_buf – number of buffers to acquire
Output: aau_sgl_t * – chain of AAU buffers acquired, NULL if failed
Purpose: This function is used to acquire a chain of user SGL buffers. After obtaining the list, the user needs to fill it out, link it to an SGL head and pass it to the aau_queue_buffer() function.
Operation:
• While free stack not empty
  — Acquire buffer
  — If failed
    • Retry
    • If retry fails
      — Return all acquired buffers
      — Return NULL
  — Fill out necessary fields
  — Link buffer to list
• Return list
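The all-or-nothing behavior described above (return every buffer already taken if one acquisition fails) can be sketched with a simple linked free stack. The sim_ names are hypothetical and the retry step is left out, so a single failed pop triggers the rollback immediately.

```c
#include <stddef.h>

/* Hypothetical buffer with just a link field. */
typedef struct sim_buf {
    struct sim_buf *next;
} sim_buf_t;

/* Hypothetical free stack: singly linked, last in first out. */
typedef struct {
    sim_buf_t *top;
} sim_stack_t;

static void sim_stack_push(sim_stack_t *s, sim_buf_t *b)
{
    b->next = s->top;
    s->top = b;
}

/* Returns the most recently pushed buffer, or NULL if the stack is empty. */
static sim_buf_t *sim_stack_pop(sim_stack_t *s)
{
    sim_buf_t *b = s->top;

    if (b != NULL) {
        s->top = b->next;
        b->next = NULL;
    }
    return b;
}

/* Acquire num_buf buffers as a linked list. On failure, push back every
 * buffer already taken and return NULL, so the stack is left unchanged. */
static sim_buf_t *sim_get_buffers(sim_stack_t *s, unsigned num_buf)
{
    sim_buf_t *list = NULL;
    unsigned i;

    for (i = 0; i < num_buf; i++) {
        sim_buf_t *b = sim_stack_pop(s);

        if (b == NULL) {
            while (list != NULL) {        /* roll back partial acquisition */
                sim_buf_t *n = list->next;
                sim_stack_push(s, list);
                list = n;
            }
            return NULL;
        }
        b->next = list;                   /* link buffer into result list */
        list = b;
    }
    return list;
}
```

Rolling back on failure means a caller never has to track a partially filled list.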
3.1.6.2.13 void aau_return_buffer(u32 aau_context, sgl_list_t *list);
Input: aau_context – AAU device context
       *list – SGL list to be returned
Output: N/A
Purpose: This function takes the SGL list passed in by the user and returns it to the free stack.
Operation:
• While not end of list
  — Put SGL element on free stack
4.0 Code Commentary
4.1 Section Objectives
Primary Objective: To identify and describe aspects of the implementation that relate to 80310 hardware and standard operating system issues.
Secondary Objective: To provide additional background on the Linux APIs to facilitate reading the code and understanding the implementation.
This code was written to be integrated in the Linux Kernel. Therefore Linux data structures and APIs defined and optimized by the Linux community are used.
The recommended approach to understanding the code is to begin with aau_init() and follow the function call sequence in Figure 2, “AAU State Trace Diagram” on page 16. Also see the sections provided for additional implementation support:
• Appendix B, “Example Calling Source Code”
• Appendix C, “MMU Functions for Intel® XScale™ Microarchitecture”
4.1.1 File Organization Overview
There are three files included in Appendix A:
• \include\aau.h
• \src\aau.h
• \src\aau.c
File \include\aau.h includes the public definitions and function APIs. Note that the public data structure definition of struct aau_sgl_t is cast to private definition struct sw_aau_t.
Files \src\aau.h and \src\aau.c include private definitions, APIs and function calls. Note that APIs that are static are private and local to the file, and those that are not static are public calls. The static modifier localizes the functions to the C file, and the symbol is not exported.
4.1.1.1 Key Data Structure and Use of Casting
The primary data structure used by the application to initiate an AAU transaction is struct aau_sgl_t (see code line 72). When the application fills out the source and destination addresses in the descriptor, physical addresses, not virtual addresses, are required. The aau_sgl_t is cast to the data structure sw_aau_t for processing (line 444). Note that the descriptors are chained together within function aau_queue_buffer, line 463.
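The appendix repeats the public fields at the start of the private structure and casts between the two pointer types. As a sketch of the same idea in a form that standard C guarantees, the example below swaps in a different technique: the public structure is embedded as the first member of the private one. The pub_/priv_ names and fields are hypothetical and much smaller than the real aau_sgl_t and sw_aau_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Public view: what the caller fills in (hypothetical, reduced). */
typedef struct pub_sgl {
    uint32_t status;
    void *dest;
} pub_sgl_t;

/* Private view: the public part must stay the first member so that a
 * pointer to the whole object and a pointer to pub are interchangeable. */
typedef struct priv_sgl {
    pub_sgl_t pub;
    uint32_t phys;      /* driver-only bookkeeping */
} priv_sgl_t;

/* Hand the caller only the public part. */
static pub_sgl_t *to_public(priv_sgl_t *p)
{
    return &p->pub;
}

/* Recover the private object. Valid only for pointers that originally
 * came from to_public(), i.e. that point into a priv_sgl_t. */
static priv_sgl_t *to_private(pub_sgl_t *p)
{
    return (priv_sgl_t *)(void *)p;
}
```

Either layout relies on the driver allocating every buffer as the private type; a caller-allocated public structure must never be passed to the private cast.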
4.1.2 Cache Memory
The following function is required for AAU descriptors because the programmer must manage coherence between cache memory and RAM. Remember that the AAU engine reads the AAU descriptors from RAM. Therefore, the programmer must flush values held in cache to RAM (see Appendix C in this document for the implementation).
• cpu_xscale_dcache_clean_range(start, end)
  For the specified virtual address range, ensure that all caches contain clean data, such that peripheral accesses to the physical RAM fetch correct data.
  start: virtual start address
  end: virtual end address
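To get a feel for how much work a clean over a descriptor involves, the sketch below counts the cache lines covered by an address range. It assumes 32-byte data cache lines and treats the range as half-open, [start, end); both the sim_ helper names and the half-open convention are assumptions of this example, not taken from Appendix C.

```c
#include <stdint.h>

#define SIM_CACHE_LINE 32u   /* assumed data cache line size in bytes */

/* Round an address down to the start of its cache line. */
static uint32_t sim_line_base(uint32_t addr)
{
    return addr & ~(SIM_CACHE_LINE - 1u);
}

/* Number of cache lines touched by the half-open range [start, end).
 * An empty or inverted range touches no lines in this model. */
static uint32_t sim_lines_in_range(uint32_t start, uint32_t end)
{
    uint32_t first;
    uint32_t last;

    if (end <= start)
        return 0u;
    first = sim_line_base(start);
    last = sim_line_base(end - 1u);
    return (last - first) / SIM_CACHE_LINE + 1u;
}
```

With the 48-byte descriptor size used in Appendix A (AAU_DESC_SIZE), a descriptor that starts on a line boundary spans two lines in this model, while one that starts 24 bytes into a line spans three.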
4.1.3 Other AAU Hardware
The AAU hardware is described in the Intel® 80312 I/O Companion Chip Developer’s Manual pages 10-1 through 10-33. For register definitions see pages 10-23 through 10-31.
In the Appendix A code, see Descriptor Control Register (DC) bit definitions line 40 through line 54. For Accelerator Control Register (ACR) and Accelerator Status Register (ASR) see bit definitions at lines 124 through 136.
The addresses of the memory mapped registers are referenced using #defines. See examples in code lines 301, 305 and 320.
• IOP310_AAUANDAR - Address of Accelerator Next Descriptor Address Register
• IOP310_AAUACR - Address of Accelerator Control Register
• IOP310_AAUASR - Address of Accelerator Status Register
4.1.4 Virtual to Physical memory
Cache flush/invalidate operations and memory mapped registers operate with virtual memory addresses.
AAU descriptor operations operate from physical memory and require physical addresses. For example, see Appendix A.3, line 895.
4.1.5 Interrupt Handling
Linux interrupt handling is split between top and bottom halves. The top half interrupt handler is called when the hardware interrupt is invoked and performs only minimal critical tasks, including scheduling the bottom half handlers. Bottom half handlers are scheduled by marking the handler for future execution.
Three status registers are involved in interrupt handling. Clearing an interrupt requires clearing the interrupt at the source, which in this case is the Accelerator Status Register. Clearing the interrupt requires writing a 1 to the bit to be cleared (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 1-7, section 1.4.2). The three registers are:
• FIQ1 Interrupt Status Register (IOP310_FIQ1ISR). Appendix A, lines 637 & 666. This register is used to determine the cause of the interrupt. If bit 5 is set, there is an Application Accelerator interrupt pending (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-12).
• Accelerator Status Register (IOP310_AAUASR). Appendix A, lines 648 & 669. This register contains the AAU status flags. The interrupt is cleared by writing 1s to the set bits.
• IRQ Interrupt Status Register (not used in this code). Bit 10 indicates an Application Accelerator Unit error (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-15).
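The write-1-to-clear behavior can be modeled in a few lines. This is a software model only: sim_w1c_reg_t and its helpers are hypothetical, the bit values come from Appendix A, and treating the active flag as unaffected by writes is an assumption of the model.

```c
#include <stdint.h>

/* ASR bit values from Appendix A (\src\aau.h). */
#define AAU_ASR_MA_ABORT 0x00000020u
#define AAU_ASR_DONE_EOC 0x00000100u
#define AAU_ASR_DONE_EOT 0x00000200u
#define AAU_ASR_ACTIVE   0x00000400u

/* Bits modeled as write-1-to-clear. */
#define SIM_ASR_W1C_MASK \
    (AAU_ASR_MA_ABORT | AAU_ASR_DONE_EOC | AAU_ASR_DONE_EOT)

/* Hypothetical model of a status register. */
typedef struct {
    uint32_t value;
} sim_w1c_reg_t;

/* A write clears exactly the W1C bits that are 1 in data; writing 0 to a
 * bit leaves it unchanged, and non-W1C bits ignore the write. */
static void sim_w1c_write(sim_w1c_reg_t *reg, uint32_t data)
{
    reg->value &= ~(data & SIM_ASR_W1C_MASK);
}

/* Read the pending flags and acknowledge only what was observed, so a
 * flag raised after the read is not lost. Returns the observed flags. */
static uint32_t sim_ack_interrupts(sim_w1c_reg_t *asr)
{
    uint32_t seen = asr->value & SIM_ASR_W1C_MASK;

    sim_w1c_write(asr, seen);
    return seen;
}
```

Writing back only the bits that were read is the reason this style of register avoids the races a read-modify-write of an ordinary register would have.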
4.1.5.1 Top Half Interrupt Handler: aau_irq_handler()
See Appendix A, line 629.
The following statuses are obtained: • FIQ1 Interrupt Status Register (IOP310_FIQ1ISR). Appendix A, lines 637 & 666. • Accelerator Status Register (IOP310_AAUASR) Appendix A, lines 648 & 669.
The AAU interrupt is cleared. Appendix A, line 657.
When the End of Transfer or End of Chain Interrupt is set, the function aau_process() is called. The purpose of aau_process() is to move all the AAU descriptors in the processing queue that are considered done to the holding queue.
The bottom half handler is marked scheduled. Appendix A, line 672.
4.1.5.2 Bottom Half Interrupt Handler: aau_task()
See Appendix A, Line 758. This function processes all the completed AAU chain descriptors in the holding Q, wakes up the user and frees the resource.
4.1.6 Linux Kernel APIs
This code contains numerous calls to Linux kernel macros or APIs. Primarily these are Linux calls used for declaring and handling queue and stack data structures and for controlling variable access. When developing custom applications, users of this document will call their operating system's equivalent APIs and macros.
4.2 Optimization Related
4.2.1 Stack versus Queue
The stack and the queue are built and controlled with Linux kernel data structures and APIs. Using a stack for free descriptors increases the likelihood that requested descriptors are still in the cache. A queue for the chain is required since chaining demands FIFO (first in, first out) sequence. As previously stated, Linux kernel data structures have been optimized by the Linux community.
4.2.2 Chaining and Resume
Chaining allows the application to build a list of transfers which may not require the use of the Intel 80200 processor until all transfers are complete (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-9). In addition, while the AAU is executing an existing chain, an incremental descriptor or chain of descriptors can be appended concurrently by using the Chain Resume feature (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-16). The expanded chain then executes as a single uninterrupted set of transactions.
See Appendix A, lines 297 through 326 for implementation.
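The decision between a fresh start and a chain resume can be sketched against simulated registers. The sim_ types and function are hypothetical, the bit values are copied from Appendix A, and the cache clean that the driver performs after patching NDA is only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values from Appendix A (\src\aau.h). */
#define AAU_ACR_ENABLE       0x00000001u
#define AAU_ACR_CHAIN_RESUME 0x00000002u
#define AAU_ASR_ACTIVE       0x00000400u

/* Hypothetical stand-ins for the registers and a hardware descriptor. */
typedef struct {
    uint32_t acr;
    uint32_t asr;
    uint32_t andar;
} sim_regs_t;

typedef struct {
    uint32_t nda;    /* next descriptor address (physical) */
    uint32_t phys;   /* this descriptor's physical address */
} sim_hw_desc_t;

/* Start the AAU on a new chain, or append the chain to a running one.
 * *last tracks the tail of the most recently submitted chain.
 * Returns 0 for a fresh start, 1 for a resume, -1 if a resume is needed
 * but no previous tail is known. */
static int sim_start_or_resume(sim_regs_t *r, sim_hw_desc_t **last,
                               sim_hw_desc_t *head, sim_hw_desc_t *tail)
{
    int resumed = 0;

    if (!(r->asr & AAU_ASR_ACTIVE)) {
        r->andar = head->phys;            /* point AAU at the new chain */
        r->acr |= AAU_ACR_ENABLE;
    } else {
        if (*last == NULL)
            return -1;
        (*last)->nda = head->phys;        /* patch old tail to new head */
        /* the real driver cleans the cache line holding NDA here */
        r->acr |= AAU_ACR_CHAIN_RESUME;
        resumed = 1;
    }
    *last = tail;
    return resumed;
}
```

Keeping a pointer to the last submitted tail is what makes the append a constant-time operation.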
4.2.3 Requiring the Application to Supply Physical Addresses in AAU Descriptor (versus virtual addresses)
This requirement minimizes time between hand off from application and AAU processing software.
4.2.4 Allocations of Memory for AAU Descriptors During Initialization
Preallocating memory for AAU descriptors eliminates costly runtime memory allocations.
4.2.5 Using AAU for Local Memory to Local Memory Copy: mem_copy()
Using the AAU for local memory to local memory copying has two advantages:
• In absolute terms it is faster for non-trivial copies.
• It happens in parallel to other core processing.
When calling aau_memcpy(), use the exact same syntax as memcpy(). See Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-31 through 10-33 for a full description.
Appendix A, lines 1108 and 1109
• Convert virtual to physical address and write physical address to AAU descriptor
Appendix A, line 1112
• AAU_DCR_WRITE — Sets bit 31. Description of operation specified: Write Enable
• AAU_DCR_BLKCTRL_1_DF — Sets all bits 03:01 for Block 1 Command Control. Description of operation specified: Direct Fill
Appendix A, line 1115
• Sets Interrupt Enable for this descriptor
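The descriptor control value for a plain copy follows directly from the three bit definitions in Appendix A. The helper name sim_memcpy_dcr() is hypothetical; the constants are copied from \include\aau.h.

```c
#include <stdint.h>

/* Descriptor control bits from Appendix A (\include\aau.h). */
#define AAU_DCR_WRITE        0x80000000u  /* bit 31: write enable */
#define AAU_DCR_BLKCTRL_1_DF 0x0000000Eu  /* bits 3:1: block 1 direct fill */
#define AAU_DCR_IE           0x00000001u  /* bit 0: interrupt enable */

/* Build the descriptor control word for a single-source copy.
 * want_interrupt selects whether completion raises an interrupt. */
static uint32_t sim_memcpy_dcr(int want_interrupt)
{
    uint32_t dc = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF;

    if (want_interrupt)
        dc |= AAU_DCR_IE;
    return dc;
}
```

With the interrupt enabled, the three fields combine to 0x8000000F; without it, the value is 0x8000000E.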
5.0 Potential Enhancements
5.1 Error Handling
When the Application Accelerator Unit generates an error during the execution of an AAU descriptor, an interrupt is triggered and bit 10 of the IRQ Interrupt Status Register is set. Bit 10 being set indicates an Application Accelerator Unit error (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 2-5). After identifying the source of the interrupt as the AAU, the application should test the Accelerator Status Register (ASR) bits (see Intel® 80312 I/O Companion Chip Developer’s Manual, page 10-25):
• Bit 10 is clear: The Accelerator Active Flag being clear indicates the channel is idle. This bit may be cleared as a result of a bus error.
• Bit 5 is set: The Master-abort bit is set when a master abort occurs during a transaction when the AAU is the master on the internal bus.
Before clearing the interrupt, the application can use the Accelerator Descriptor Address Register (ADAR) to identify the currently executing descriptor. The descriptor can be marked as having failed before the interrupt is cleared and processing continues. One approach is to write the contents of the ASR to a status variable attached to the descriptor (see Intel® 80312 I/O Companion Chip Developer’s Manual, sections 10.8 and 10.9 for Interrupt States and Error Conditions).
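One way this suggestion might look in code is sketched below. It is not part of the appendix driver: the reduced descriptor, the saved_asr field and sim_mark_failed() are hypothetical, while the AAU_INCOMPLETE and master-abort bit values are taken from Appendix A.

```c
#include <stddef.h>
#include <stdint.h>

#define AAU_INCOMPLETE   0x0020u       /* status flag from \include\aau.h */
#define AAU_ASR_MA_ABORT 0x00000020u   /* ASR bit 5 from \src\aau.h */

/* Hypothetical reduced software descriptor. */
typedef struct {
    uint32_t phys;        /* physical address of the hardware descriptor */
    uint32_t status;      /* software status flags */
    uint32_t saved_asr;   /* hypothetical: ASR snapshot taken on error */
} sim_err_desc_t;

/* If asr reports a master abort, find the descriptor whose physical
 * address matches adar, mark it incomplete and record the ASR value.
 * Returns the index of the marked descriptor, or -1 if there was no
 * error or no descriptor matched. */
static int sim_mark_failed(sim_err_desc_t *descs, size_t n,
                           uint32_t adar, uint32_t asr)
{
    size_t i;

    if (!(asr & AAU_ASR_MA_ABORT))
        return -1;
    for (i = 0; i < n; i++) {
        if (descs[i].phys == adar) {
            descs[i].status |= AAU_INCOMPLETE;
            descs[i].saved_asr = asr;
            return (int)i;
        }
    }
    return -1;
}
```

The snapshot must be taken before the ASR is cleared, because the error bits are lost once the interrupt is acknowledged.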
5.2 Lookaside Cache Scheme (This is Linux specific)
When implementing in Linux, device driver developers should consider using the Lookaside Cache scheme instead of allocating memory using kmalloc when creating hardware descriptors. The Lookaside Cache provides memory address alignment and other features that allow efficient use of Linux memory management for device driver development.
5.3 Extensive Intel Optimization Related Documentation
Intel provides extensive optimization related documentation. As part of the application development process it is recommended to review the Intel® XScale™ Microarchitecture Coding Techniques White Paper and the Intel® 80200 Processor based on Intel® XScale™ Microarchitecture Developer’s Manual, Appendix B for optimization opportunities.
6.0 Conclusion
As discussed, increasing I/O demands are central to high performance network and storage applications. Intel® XScale™ microarchitecture addresses this trend with the 80310. Features of the Intel® 80310 solution include the AAU.
This paper and the accompanying source code have presented an AAU implementation, including the low-level design, coded implementation and code commentary, to give software developers a template that speeds the ramp for developing AAU applications. For their unique applications, developers can design and build their own custom solutions using this template along with the Intel optimization literature.
Appendix A AAU Source Code
A.1 Public Definitions for Intel® 80310 I/O Processor Chipset AAU: \include\aau.h
1 /*
2 * Definitions for IOP310 AAU
3 *
4 * Author: Dave Jiang ([email protected])
5 * Copyright (C) 2001 Intel Corporation
6 *
7 * This program is free software; you can redistribute it and/or modify
8 * it under the terms of the GNU General Public License version 2 as
9 * published by the Free Software Foundation.
10 *
11 */
12
13 #ifndef _IOP310_AAU_H_
14 #define _IOP310_AAU_H_
15
16
17 #define DEFAULT_AAU_IRQ_THRESH 10
18
19 #define MAX_AAU_DESC 1024/* 64 */
20 #define AAU_SAR_GROUP 4
21
22
23 #define AAU_DESC_DONE 0x0010
24 #define AAU_INCOMPLETE 0x0020
25 #define AAU_HOLD 0x0040
26 #define AAU_END_CHAIN 0x0080
27 #define AAU_COMPLETE 0x0100
28 #define AAU_NOTIFY 0x0200
29 #define AAU_NEW_HEAD 0x0400
30
31 #define AAU_USER_MASK (AAU_NOTIFY | AAU_INCOMPLETE | \
32 AAU_HOLD | AAU_COMPLETE)
33
34 #define DESC_HEAD 0x0010
35 #define DESC_TAIL 0x0020
36
37 /* result writeback */
38 #define AAU_DCR_WRITE 0x80000000
39 /* source block extension */
40 #define AAU_DCR_BLK_EXT 0x02000000
41 #define AAU_DCR_BLKCTRL_8_XOR 0x00400000
42 #define AAU_DCR_BLKCTRL_7_XOR 0x00080000
43 #define AAU_DCR_BLKCTRL_6_XOR 0x00010000
44 #define AAU_DCR_BLKCTRL_5_XOR 0x00002000
45 #define AAU_DCR_BLKCTRL_4_XOR 0x00000400
46 #define AAU_DCR_BLKCTRL_3_XOR 0x00000080
47 #define AAU_DCR_BLKCTRL_2_XOR 0x00000010
48 #define AAU_DCR_BLKCTRL_1_XOR 0x00000002
49 /* first block direct fill instead of XOR to buffer */
50 #define AAU_DCR_BLKCTRL_1_DF 0x0000000E
51 /* interrupt enable */
52 #define AAU_DCR_IE 0x00000001
53
54 #define DCR_BLKCTRL_OFFSET 3
55
56
57 /* AAU callback */
58 typedef void (*aau_callback_t) (void *buf_id);
59
60 /* hardware descriptor */
61 typedef struct _aau_desc
62 {
63 u32 NDA; /* next descriptor address */
64 u32 SAR[AAU_SAR_GROUP];/* src addrs */
65 u32 DAR; /* destination addr */
66 u32 BC; /* byte count */
67 u32 DC; /* descriptor control */
68 u32 SARE[AAU_SAR_GROUP];/* extended src addrs */
69 } aau_desc_t;
70
71 /* user SGL format */
72 typedef struct _aau_sgl
73 {
74 aau_desc_t aau_desc;/* AAU HW Desc */
75 u32 status;
76 struct _aau_sgl *next;/* pointer to next SG */
77 void *dest; /* destination addr */
78 void *src[AAU_SAR_GROUP];/* source addr[4] */
79 void *ext_src[AAU_SAR_GROUP];/* ext src addr[4] */
80 u32 total_src; /* total number of source */
81 } aau_sgl_t;
82
83 /* header for user SGL */
84 typedef struct _aau_head
85 {
86 u32 total;
87 u32 status; /* SGL status */
88 aau_sgl_t *list; /* ptr to head of list */
89 aau_callback_t callback;/* callback func ptr */
90 } aau_head_t;
91
92 /* prototypes */
93 int aau_request(u32 *, const char *);
94 int aau_queue_buffer(u32, aau_head_t *);
95 void aau_suspend(u32);
96 void aau_resume(u32);
97 void aau_free(u32);
98 void aau_set_irq_threshold(u32, int);
99 void aau_return_buffer(u32, aau_sgl_t *);
100 aau_sgl_t *aau_get_buffer(u32, int);
101 int aau_memcpy(void *, void *, u32);
102
103 #endif
104 /* EOF */
A.2 Private Definitions for Intel® XScale™ Microarchitecture AAU: \src\aau.h
105 /*
106 * Private Definitions for Intel® XScale™ microarchitecture AAU
107 *
108 * Author: Dave Jiang ([email protected])
109 * Copyright (C) 2001 Intel Corporation
110 *
111 * This program is free software; you can redistribute it and/or modify
112 * it under the terms of the GNU General Public License version 2 as
113 * published by the Free Software Foundation.
114 *
115 */
116
117 #ifndef _AAU_PRIVATE_H_
118 #define _AAU_PRIVATE_H_
119
120 #define SLEEP_TIME 50
121 #define AAU_DESC_SIZE 48
122 #define AAU_INT_MASK 0x0020
123
124 #define AAU_ACR_CLEAR 0x00000000
125 #define AAU_ACR_ENABLE 0x00000001
126 #define AAU_ACR_CHAIN_RESUME 0x00000002
127 #define AAU_ACR_512_BUFFER 0x00000004
128
129 #define AAU_ASR_CLEAR 0x00000320
130 #define AAU_ASR_MA_ABORT 0x00000020
131 #define AAU_ASR_ERROR_MASK AAU_ASR_MA_ABORT
132 #define AAU_ASR_DONE_EOT 0x00000200
133 #define AAU_ASR_DONE_EOC 0x00000100
134 #define AAU_ASR_DONE_MASK (AAU_ASR_DONE_EOT | AAU_ASR_DONE_EOC)
135 #define AAU_ASR_ACTIVE 0x00000400
136 #define AAU_ASR_MASK (AAU_ASR_ERROR_MASK | AAU_ASR_DONE_MASK)
137
138 /* software descriptor */
139 typedef struct _sw_aau
140 {
141 aau_desc_t aau_desc;/* AAU HW Desc */
142 u32 status;
143 struct _aau_sgl *next; /* pointer to next SG */
144 void *dest; /* destination addr */
145 void *src[AAU_SAR_GROUP];/* source addr[4] */
146 void *ext_src[AAU_SAR_GROUP];/* ext src addr[4] */
147 u32 total_src; /* total number of source */
148 struct list_head link; /* Link to queue */
149 u32 aau_phys; /* AAU Phys Addr (aligned) */
150 u32 desc_addr; /* unaligned HWDESC virtual addr */
151 u32 sgl_head;
152 struct _sw_aau *head; /* head of list */
153 struct _sw_aau *tail; /* tail of list */
154 } sw_aau_t;
155
156 /* AAU registers */
157 typedef struct _aau_regs_t
158 {
159 volatile u32 ACR; /* Accelerator Control Register */
160 volatile u32 ASR; /* Accelerator Status Register */
161 volatile u32 ADAR; /* Descriptor Address Register */
162 volatile u32 ANDAR; /* Next Desc Address Register */
163 volatile u32 LSAR[AAU_SAR_GROUP];/* source addrs */
164 volatile u32 LDAR; /* local destination address register */
165 volatile u32 ABCR; /* byte count */
166 volatile u32 ADCR; /* Descriptor Control */
167 volatile u32 LSARE[AAU_SAR_GROUP];/* extended src addrs */
168 } aau_regs_t;
169
170
171 /* device descriptor */
172 typedef struct _iop310_aau_t
173 {
174 const char *dev_id; /* Device ID */
175 struct list_head process_q;/* Process Q */
176 struct list_head hold_q;/* Holding Q */
177 spinlock_t process_lock;/* PQ spinlock */
178 spinlock_t hold_lock;/* HQ spinlock */
179 aau_regs_t *regs; /* AAU registers */
180 int irq; /* IRQ number */
181 sw_aau_t *last_aau; /* ptr to last AAU desc */
182 struct tq_struct aau_task;/* AAU task entry */
183 wait_queue_head_t wait_q;/* AAU wait queue */
184 atomic_t ref_count; /* AAU ref count */
185 atomic_t irq_thresh;/* IRQ threshold */
186 } iop310_aau_t;
187
188 #define SW_ENTRY(list) list_entry((list), sw_aau_t, link)
189
190 #endif
A.3 Support Functions for the Intel® 80310 I/O Processor Chipset AAU: \src\aau.c
191 /**************************************************************************
192 * arch/arm/mach-iop310/aau.c
193 *
194 * Support functions for the Intel 80310 AAU.
195 * (see also Documentation/arm/XScale/IOP310/aau.txt)
196 *
197 * Author: Dave Jiang ([email protected])
198 * Copyright (C) 2001 Intel Corporation
199 *
200 * This program is free software; you can redistribute it and/or modify
201 * it under the terms of the GNU General Public License version 2 as
202 * published by the Free Software Foundation.
203 *
204 * Todos: Thorough Error handling
205 * Do zero-size AAU transfer/channel at init
206 * so all we have to do is chaining
207 *
208 *
209 * History: (07/18/2001, DJ) Initial Creation
210 * (08/22/2001, DJ) Changed spinlock calls to no save flags
211 * (08/27/2001, DJ) Added irq threshold handling
212 * (09/11/2001, DJ) Changed AAU to list data structure,
213 * modified the user interface with embedded descriptors.
214 *
215 *************************************************************************/
216
217 #include
218 #include
219 #include
220 #include
221 #include
222 #include
223 #include
224 #include
225 #include
226 #include
227 #include
228 #include
229 #include
230 #include
231 #include
232 #include
233 #include
234 #include
235
236 #include
237
238 #include "aau.h"
239
240 #ifndef EXPORT_SYMTAB
241 #define EXPORT_SYMTAB
242 #include
243 #endif
244
245 #undef DEBUG
246 #ifdef DEBUG
247 #define DPRINTK(s, args...) printk("80310AAU: " s, ## args)
248 #else
249 #define DPRINTK(s, args...)
250 #endif
251
252 /* globals */
253 static iop310_aau_t aau_dev;/* AAU device */
254 static struct list_head free_stack;/* free AAU desc stack */
255 static spinlock_t free_lock;/* free AAU stack lock */
256
257 /* static prototypes */
258 static int __init aau_init(void);
259 static int aau_start(iop310_aau_t *, sw_aau_t *);
260 static int aau_flush_all(u32);
261 static void aau_process(iop310_aau_t *);
262 static void aau_task(void *);
263 static void aau_irq_handler(int, void *, struct pt_regs *);
264
265 /*======*/
266 /* Procedure: aau_start() */
267 /* */
268 /* Description: This function starts the AAU. If the AAU */
269 /* has already started then chain resume is done */
270 /* */
271 /* Parameters: aau: AAU device */
272 /* aau_chain: AAU data chain to pass to the AAU */
273 /* */
274 /* Returns: int -- success: OK */
275 /* failure: -EBUSY */
276 /* */
277 /* Notes/Assumptions: */
278 /* */
279 /* History: Dave Jiang 07/18/01 Initial Creation */
280 /*======*/
281 static int aau_start(iop310_aau_t * aau, sw_aau_t * aau_chain)
282 {
283 u32 status;
284
285 /* get accelerator status */
286 status = *(IOP310_AAUASR);
287
288 /* check accelerator status error */
289 if(status & AAU_ASR_ERROR_MASK)
290 {
291 DPRINTK("start: Accelerator Error %x\n", status);
292 /* should clean the accelerator up then, or let int handle it? */
293 return -EBUSY;
294 }
295
296 /* if first time */
297 if(!(status & AAU_ASR_ACTIVE))
298 {
299 /* set the next descriptor address register */
300
301 *(IOP310_AAUANDAR) = aau_chain->aau_phys;
302
303 DPRINTK("Enabling accelerator now\n");
304 /* enable the accelerator */
305 *(IOP310_AAUACR) |= AAU_ACR_ENABLE;
306 }
307 else
308 {
309 DPRINTK("Resuming chain\n");
310 /* if active, chain up to last AAU chain */
311
312 aau->last_aau->aau_desc.NDA = aau_chain->aau_phys;
313
314 /* flush cache since we changed the field */
315 /* 32bit word long */
316 cpu_dcache_clean_range((u32)&aau->last_aau->aau_desc.NDA,
317 (u32)(&aau->last_aau->aau_desc.NDA));
318
319 /* resume the chain */
320 *(IOP310_AAUACR) |= AAU_ACR_CHAIN_RESUME;
321 }
322
323 /* set the last accelerator descriptor to last descriptor in chain */
324 aau->last_aau = aau_chain->tail;
325
326 return 0;
327 }
328
329
330 /*======*/
331 /* Procedure: aau_request() */
332 /* */
333 /* Description: This function requests the AAU */
334 /* */
335 /* Parameters: aau_context: aau context */
336 /* device_id -- unique device name */
337 /* */
338 /* Returns: 0 - ok */
339 /* NULL -- failed */
340 /* */
341 /* Notes/Assumptions: */
342 /* */
343 /* History: Dave Jiang 07/18/01 Initial Creation */
344 /*======*/
345 int aau_request(u32 * aau_context, const char *device_id)
346 {
347 iop310_aau_t *aau = &aau_dev;
348
349 DPRINTK("Entering AAU request\n");
350 /* increment reference count */
351 atomic_inc(&aau->ref_count);
352
353 /* get interrupt if ref count is less than or equal to 1 */
354 if(atomic_read(&aau->ref_count) <= 1)
355 {
356 /* set device ID */
357 aau->dev_id = device_id;
358 }
359
360 DPRINTK("Assigning AAU\n");
361 *aau_context = (u32) aau;
362
363 return 0;
364 }
365
366 /*======*/
367 /* Procedure: aau_suspend() */
368 /* */
369 /* Description: This function suspends the AAU at the earliest */
370 /* instant it is capable of. */
371 /* */
372 /* Parameters: aau: AAU device context */
373 /* */
374 /* Returns: N/A */
375 /* */
376 /* Notes/Assumptions: */
377 /* */
378 /* History: Dave Jiang 07/18/01 Initial Creation */
379 /*======*/
380 void aau_suspend(u32 aau_context)
381 {
382 iop310_aau_t *aau = (iop310_aau_t *) aau_context;
383 *(IOP310_AAUACR) &= ~AAU_ACR_ENABLE;
384 }
385
386 /*======*/
387 /* Procedure: aau_resume() */
388 /* */
389 /* Description: This function resumes the AAU operations */
390 /* */
391 /* Parameters: aau: AAU device context */
392 /* */
393 /* Returns: N/A */
394 /* */
395 /* Notes/Assumptions: */
396 /* */
397 /* History: Dave Jiang 07/18/01 Initial Creation */
398 /*======*/
399 void aau_resume(u32 aau_context)
400 {
401 iop310_aau_t *aau = (iop310_aau_t *) aau_context;
402 u32 status;
403
404 status = *(IOP310_AAUASR);
405
406 /* if it's already active */
407 if(status & AAU_ASR_ACTIVE)
408 {
409 DPRINTK("Accelerator already active\n");
410 return;
411 }
412 else if(status & AAU_ASR_ERROR_MASK)
413 {
414 printk("80310 AAU in error state! Cannot resume\n");
415 return;
416 }
417 else
418 {
419 *(IOP310_AAUACR) |= AAU_ACR_ENABLE;
420 }
421 }
422
423 /*======*/
424 /* Procedure: aau_queue_buffer() */
425 /* */
426 /* Description: This function creates an AAU buffer chain from the */
427 /* user supplied SGL chain. It also puts the AAU chain */
428 /* onto the processing queue. This then starts the AAU */
429 /* */
430 /* Parameters: aau: AAU device context */
431 /* listhead: User SGL */
432 /* */
433 /* Returns: int: success -- OK */
434 /* failed: -ENOMEM */
435 /* */
436 /* Notes/Assumptions: User SGL must point to kernel memory, not user */
437 /* */
438 /* History: Dave Jiang 07/18/01 Initial Creation */
439 /* Dave Jiang 07/20/01 Removed some junk code not suppose */
440 /* to be there that causes infinite loop */
441 /*======*/
442 int aau_queue_buffer(u32 aau_context, aau_head_t * listhead)
443 {
444 sw_aau_t *sw_desc = (sw_aau_t *) listhead->list;
445 sw_aau_t *prev_desc = NULL;
446 sw_aau_t *head = NULL;
447 aau_head_t *sgl_head = listhead;
448 int err = 0;
449 int i;
450 iop310_aau_t *aau = (iop310_aau_t *) aau_context;
451 DECLARE_WAIT_QUEUE_HEAD(wait_q);
452
453 DPRINTK("Entering aau_queue_buffer()\n");
454
455 /* scan through entire user SGL */
456 while(sw_desc)
457 {
458 sw_desc->sgl_head = (u32) listhead;
459
460 /* we clean the cache for previous descriptor in chain */
461 if(prev_desc)
462 {
463 prev_desc->aau_desc.NDA = sw_desc->aau_phys;
464 cpu_dcache_clean_range((u32)&prev_desc->aau_desc,
465 (u32)&prev_desc->aau_desc + AAU_DESC_SIZE);
466 }
467 else
468 {
469 /* no previous descriptor, so we set this to be head */
470 head = sw_desc;
471 }
472
473 sw_desc->head = head;
474 /* set previous to current */
475 prev_desc = sw_desc;
476
477 /* put descriptor on process */
478 spin_lock_irq(&aau->process_lock);
479 list_add_tail(&sw_desc->link, &aau->process_q);
480 spin_unlock_irq(&aau->process_lock);
481
482 sw_desc = (sw_aau_t *)sw_desc->next;
483 }
484 DPRINTK("Done converting SGL to AAU Chain List\n");
485
486 /* if our tail exists */
487 if(prev_desc)
488 {
489 /* set the head pointer on tail */
490 prev_desc->head = head;
491 /* set the header pointer's tail to tail */
492 head->tail = prev_desc;
493 prev_desc->tail = prev_desc;
494
495 /* clean cache for tail */
496 cpu_dcache_clean_range((u32)&prev_desc->aau_desc,
497 (u32)&prev_desc->aau_desc + AAU_DESC_SIZE);
498
499 DPRINTK("Starting AAU accelerator\n");
500 /* start the AAU */
501 DPRINTK("Starting at chain: 0x%x\n", (u32)head);
502 if((err = aau_start(aau, head)) >= 0)
503 {
504 DPRINTK("ASR: %#x\n", *IOP310_AAUASR);
505 if(!sgl_head->callback)
506 {
507 wait_event_interruptible(aau->wait_q,
508 (sgl_head->status & AAU_COMPLETE));
509 }
510 return 0;
511 }
512 else
513 {
514 DPRINTK("AAU start failed!\n");
515 return err;
516 }
517 }
518
519 return -EINVAL;
520 }
521
522 /*======*/
523 /* Procedure: aau_flush_all() */
524 /* */
525 /* Description: This function flushes the entire process queue for */
526 /* the AAU. It also clears the AAU. */
527 /* */
528 /* Parameters: aau: AAU device context */
529 /* */
530 /* Returns: int: success -- OK */
531 /* */
532 /* Notes/Assumptions: */
533 /* */
534 /* History: Dave Jiang 07/19/01 Initial Creation */
535 /*======*/
536 static int aau_flush_all(u32 aau_context)
537 {
538 iop310_aau_t *aau = (iop310_aau_t *) aau_context;
539 unsigned long flags;
540 sw_aau_t *sw_desc;
541
542 DPRINTK("Flushall is being called\n");
543
544 /* clear ACR */
545 /* read clear ASR */
546 *(IOP310_AAUACR) = AAU_ACR_CLEAR;
547 *(IOP310_AAUASR) |= AAU_ASR_CLEAR;
548
549 /* clean up processing Q */
550 while(!list_empty(&aau->hold_q))
551 {
552 spin_lock_irqsave(&aau->process_lock, flags);
553 sw_desc = SW_ENTRY(aau->process_q.next);
554 list_del(aau->process_q.next);
555 spin_unlock_irqrestore(&aau->process_lock, flags);
556
557 /* set status to be incomplete */
558 sw_desc->status |= AAU_INCOMPLETE;
559 /* put descriptor on holding queue */
560 spin_lock_irqsave(&aau->hold_lock, flags);
561 list_add_tail(&sw_desc->link, &aau->hold_q);
White Paper 49 562 spin_unlock_irqrestore(&aau->hold_lock, flags);
563 }
564
565 return 0;
566 }
567
/*======*/
/* Procedure:   aau_free() */
/* */
/* Description: This function frees the AAU from usage. */
/* */
/* Parameters:  aau_context: AAU device context */
/* */
/* Returns:     NONE */
/* */
/* Notes/Assumptions: */
/* */
/* History:     Dave Jiang 07/19/01 Initial Creation */
/*======*/
void aau_free(u32 aau_context)
{
    iop310_aau_t *aau = (iop310_aau_t *) aau_context;

    atomic_dec(&aau->ref_count);

    /* if ref count is 1 or less, you are the last owner */
    if(atomic_read(&aau->ref_count) <= 1)
    {
        /* flush AAU channel */
        aau_flush_all(aau_context);
        /* flush holding queue */
        aau_task(aau);

        if(aau->last_aau)
        {
            aau->last_aau = NULL;
        }

        DPRINTK("Freeing IRQ %d\n", aau->irq);
        /* free the IRQ */
        free_irq(aau->irq, (void *)aau);
    }

    DPRINTK("freed\n");
}

/*======*/
/* Procedure:   aau_irq_handler() */
/* */
/* Description: This function is the interrupt handler for the AAU */
/*              driver. It removes the done AAU descriptors from the */
/*              process queue and puts them on the holding Q. It */
/*              continues to process until the process queue is empty */
/*              or the current AAU desc on the accelerator is the one */
/*              we are inspecting. */
/* */
/* Parameters:  irq: IRQ activated */
/*              dev_id: device */
/*              regs: registers */
/* */
/* Returns:     NONE */
/* */
/* Notes/Assumptions: Interrupt is masked */
/* */
/* History:     Dave Jiang 07/19/01 Initial Creation */
/*              Dave Jiang 07/20/01 Check FIQ1 instead of ASR for INTs */
/*======*/
static void aau_irq_handler(int irq, void *dev_id, struct pt_regs *regs)
{
    iop310_aau_t *aau = (iop310_aau_t *) dev_id;
    u32 int_status = 0;
    u32 status = 0;
    u32 thresh;

    /* get FIQ1 status */
    int_status = *(IOP310_FIQ1ISR);

    /* the accelerator status has not been read yet, so log FIQ1 status */
    DPRINTK("IRQ: irq=%d status=%#x\n", irq, int_status);

    /* this is not our interrupt */
    if(!(int_status & AAU_INT_MASK))
    {
        return;
    }

    /* get accelerator status */
    status = *(IOP310_AAUASR);

    /* get threshold */
    thresh = atomic_read(&aau->irq_thresh);

    /* process while we have INT */
    while((int_status & AAU_INT_MASK) && thresh--)
    {
        /* clear ASR */
        *(IOP310_AAUASR) &= AAU_ASR_MASK;

        /* done bits set: move finished descriptors to the holding Q */
        if(status & AAU_ASR_DONE_MASK)
        {
            aau_process(aau);
        }

        /* read accelerator status */
        status = *(IOP310_AAUASR);

        /* get interrupt status */
        int_status = *(IOP310_FIQ1ISR);
    }

    /* schedule bottom half */
    aau->aau_task.data = (void *)aau;
    /* task goes to the immediate task queue */
    queue_task(&aau->aau_task, &tq_immediate);
    /* mark IMMEDIATE BH for execute */
    mark_bh(IMMEDIATE_BH);
}

/*======*/
/* Procedure:   aau_process() */
/* */
/* Description: This function moves all the AAU descriptors in the */
/*              processing queue that are considered done to the */
/*              holding queue. It is called by the interrupt handler */
/*              when the done INTs are asserted. It continues until */
/*              either the process Q is empty or the current AAU desc */
/*              equals the one in the ADAR. */
/* */
/* Parameters:  aau: AAU device as parameter */
/* */
/* Returns:     NONE */
/* */
/* Notes/Assumptions: Interrupt is masked */
/* */
/* History:     Dave Jiang 07/19/01 Initial Creation */
/*======*/
static void aau_process(iop310_aau_t * aau)
{
    sw_aau_t *sw_desc;
    u8 same_addr = 0;

    DPRINTK("Entering aau_process()\n");

    while(!same_addr && !list_empty(&aau->process_q))
    {
        spin_lock(&aau->process_lock);
        sw_desc = SW_ENTRY(aau->process_q.next);
        list_del(aau->process_q.next);
        spin_unlock(&aau->process_lock);

        if(sw_desc->head->tail->status & AAU_NEW_HEAD)
        {
            DPRINTK("Found new head\n");
            sw_desc->tail->head = sw_desc;
            sw_desc->head = sw_desc;
            sw_desc->tail->status &= ~AAU_NEW_HEAD;
        }

        sw_desc->status |= AAU_DESC_DONE;

        /* if we see end of chain, we set head status to DONE */
        if(sw_desc->aau_desc.DC & AAU_DCR_IE)
        {
            if(sw_desc->status & AAU_END_CHAIN)
            {
                sw_desc->tail->status |= AAU_COMPLETE;
            }
            else
            {
                sw_desc->head->tail = sw_desc;
                sw_desc->tail = sw_desc;
                sw_desc->tail->status |= AAU_NEW_HEAD;
            }
            sw_desc->tail->status |= AAU_NOTIFY;
        }

        /* if this descriptor is the one still being processed, put it back */
        if(((u32) sw_desc == *(IOP310_AAUADAR)) &&
           (*(IOP310_AAUASR) & AAU_ASR_ACTIVE))
        {
            spin_lock(&aau->process_lock);
            list_add(&sw_desc->link, &aau->process_q);
            spin_unlock(&aau->process_lock);
            same_addr = 1;
        }
        else
        {
            spin_lock(&aau->hold_lock);
            list_add_tail(&sw_desc->link, &aau->hold_q);
            spin_unlock(&aau->hold_lock);
        }
    }
    DPRINTK("Exit aau_process()\n");
}

/*======*/
/* Procedure:   aau_task() */
/* */
/* Description: This function is the bottom half of the AAU interrupt */
/*              handler. It is queued as an immediate task on the */
/*              immediate task Q. It processes all the complete AAU */
/*              chains in the holding Q, wakes up the user, and frees */
/*              the resource. */
/* */
/* Parameters:  aau_dev: AAU device as parameter */
/* */
/* Returns:     NONE */
/* */
/* Notes/Assumptions: */
/* */
/* History:     Dave Jiang 07/19/01 Initial Creation */
/*======*/
static void aau_task(void *aau_dev)
{
    iop310_aau_t *aau = (iop310_aau_t *) aau_dev;
    u8 end_chain = 0;
    sw_aau_t *sw_desc = NULL;
    aau_head_t *listhead = NULL;    /* user list */

    DPRINTK("Entering bottom half\n");

    if(!list_empty(&aau->hold_q))
    {
        sw_desc = SW_ENTRY(aau->hold_q.next);
        listhead = (aau_head_t *) sw_desc->sgl_head;
    }
    else
        return;

    /* process while AAU chain is complete */
    while(sw_desc &&
          (sw_desc->tail->status & (AAU_NOTIFY | AAU_INCOMPLETE)))
    {
        /* clean up until end of AAU chain */
        while(!end_chain)
        {
            /* IE flag indicates end of chain */
            if(sw_desc->aau_desc.DC & AAU_DCR_IE)
            {
                end_chain = 1;
                listhead->status |=
                    sw_desc->tail->status & AAU_USER_MASK;

                sw_desc->status |= AAU_NOTIFY;

                if(sw_desc->status & AAU_END_CHAIN)
                    listhead->status |= AAU_COMPLETE;
            }

            spin_lock_irq(&aau->hold_lock);
            /* remove from holding queue */
            list_del(&sw_desc->link);
            spin_unlock_irq(&aau->hold_lock);

            cpu_dcache_invalidate_range((u32)&sw_desc->aau_desc,
                                        (u32)&sw_desc->aau_desc + AAU_DESC_SIZE);

            if(!list_empty(&aau->hold_q))
            {
                sw_desc = SW_ENTRY(aau->hold_q.next);
                listhead = (aau_head_t *) sw_desc->sgl_head;
            }
            else
                sw_desc = NULL;
        }

        /* reset end of chain flag */
        end_chain = 0;

        /* wake up user function waiting for return */
        /* or use callback if exist */
        if(listhead->callback)
        {
            DPRINTK("Calling callback\n");
            listhead->callback((void *)listhead);
        }
        else if(listhead->status & AAU_COMPLETE)
        /* if(waitqueue_active(&aau->wait_q)) */
        {
            DPRINTK("Waking up waiting process\n");
            wake_up_interruptible(&aau->wait_q);
        }
    } /* end while */
    DPRINTK("Exiting bottom task\n");
}

/*======*/
/* Procedure:   aau_init() */
/* */
/* Description: This function initializes the AAU. */
/* */
/* Parameters:  NONE */
/* */
/* Returns:     int: success -- OK */
/* */
/* Notes/Assumptions: */
/* */
/* History:     Dave Jiang 07/18/01 Initial Creation */
/*======*/
static int __init aau_init(void)
{
    int i;
    sw_aau_t *sw_desc;
    int err;
    void *desc = NULL;

    printk("Intel 80310 AAU Copyright(c) 2001 Intel Corporation\n");
    DPRINTK("Initializing...");

    /* set the IRQ */
    aau_dev.irq = IRQ_IOP310_AAU;

    err = request_irq(aau_dev.irq, aau_irq_handler, SA_INTERRUPT,
                      NULL, (void *)&aau_dev);
    if(err < 0)
    {
        printk(KERN_ERR "unable to request IRQ %d for AAU: %d\n",
               aau_dev.irq, err);
        return err;
    }

    /* init free stack */
    INIT_LIST_HEAD(&free_stack);
    /* init free stack spinlock */
    spin_lock_init(&free_lock);

    /* pre-alloc AAU descriptors */
    for(i = 0; i < MAX_AAU_DESC; i++)
    {
        /* over-allocate by 0x20 so the descriptor can be 32-byte aligned */
        desc = kmalloc((sizeof(sw_aau_t) + 0x20), GFP_KERNEL);
        if(!desc)
        {
            printk(KERN_ERR "unable to allocate AAU descriptor %d\n", i);
            /* release the descriptors allocated so far */
            while(!list_empty(&free_stack))
            {
                sw_desc = SW_ENTRY(free_stack.next);
                list_del(free_stack.next);
                kfree((void *)sw_desc->desc_addr);
            }
            free_irq(aau_dev.irq, (void *)&aau_dev);
            return -ENOMEM;
        }
        /* zero the whole allocation, including the alignment slack */
        memset(desc, 0, sizeof(sw_aau_t) + 0x20);
        sw_desc = (sw_aau_t *) (((u32) desc & 0xffffffe0) + 0x20);
        sw_desc->aau_phys = virt_to_phys((void *)sw_desc);
        /* we keep track of original address before alignment adjust */
        /* so we can free it later */
        sw_desc->desc_addr = (u32) desc;

        spin_lock_irq(&free_lock);
        /* put the descriptors on the free stack */
        list_add_tail(&sw_desc->link, &free_stack);
        spin_unlock_irq(&free_lock);
    }

    /* set the register data structure to the mapped memory regs AAU */
    aau_dev.regs = (aau_regs_t *) IOP310_AAUACR;

    atomic_set(&aau_dev.ref_count, 0);

    /* init process Q */
    INIT_LIST_HEAD(&aau_dev.process_q);
    /* init holding Q */
    INIT_LIST_HEAD(&aau_dev.hold_q);
    /* init locks for Qs */
    spin_lock_init(&aau_dev.hold_lock);
    spin_lock_init(&aau_dev.process_lock);

    aau_dev.last_aau = NULL;

    /* initialize BH task */
    aau_dev.aau_task.sync = 0;
    aau_dev.aau_task.routine = (void *)aau_task;

    /* initialize wait Q */
    init_waitqueue_head(&aau_dev.wait_q);

    /* clear AAU channel control register */
    *(IOP310_AAUACR) = AAU_ACR_CLEAR;
    *(IOP310_AAUASR) = AAU_ASR_CLEAR;
    *(IOP310_AAUANDAR) = 0;

    /* set default irq threshold */
    atomic_set(&aau_dev.irq_thresh, DEFAULT_AAU_IRQ_THRESH);
    DPRINTK("Done!\n");

    return 0;
}

/*======*/
/* Procedure:   aau_set_irq_threshold() */
/* */
/* Description: This function readjusts the threshold for the irq. */
/* */
/* Parameters:  aau_context: AAU device context */
/*              value: value of new irq threshold */
/* */
/* Returns:     N/A */
/* */
/* Notes/Assumptions: default is set at 10 */
/* */
/* History:     Dave Jiang 08/27/01 Initial Creation */
/*======*/
void aau_set_irq_threshold(u32 aau_context, int value)
{
    iop310_aau_t *aau = (iop310_aau_t *) aau_context;
    atomic_set(&aau->irq_thresh, value);
} /* End of aau_set_irq_threshold() */

/*======*/
/* Procedure:   aau_get_buffer() */
/* */
/* Description: This function acquires an SGL for the user and */
/*              returns it. It retries multiple times if no */
/*              descriptor is available. */
/* */
/* Parameters:  aau_context: AAU context */
/*              num_buf: number of descriptors */
/* */
/* Returns:     head of the SGL, or NULL on failure */
/* */
/* Notes/Assumptions: */
/* */
/* History:     Dave Jiang 9/11/01 Initial Creation */
/*              Dave Jiang 10/04/01 Fixed list linking problem */
/*======*/
aau_sgl_t *aau_get_buffer(u32 aau_context, int num_buf)
{
    sw_aau_t *sw_desc = NULL;
    sw_aau_t *sw_head = NULL;
    sw_aau_t *sw_prev = NULL;

    int retry = 10;
    int i;
    DECLARE_WAIT_QUEUE_HEAD(wait_q);

    if((num_buf > MAX_AAU_DESC) || (num_buf <= 0))
    {
        return NULL;
    }

    DPRINTK("Getting %d descriptors\n", num_buf);
    for(i = num_buf; i > 0; i--)
    {
        /* nothing acquired yet in this iteration */
        sw_desc = NULL;

        spin_lock_irq(&free_lock);
        if(!list_empty(&free_stack))
        {
            sw_desc = SW_ENTRY(free_stack.next);
            list_del(free_stack.next);
        }
        spin_unlock_irq(&free_lock);

        /* free stack was empty: sleep, then retry a limited number of times */
        while(!sw_desc && retry--)
        {
            interruptible_sleep_on_timeout(&wait_q, SLEEP_TIME);
            spin_lock_irq(&free_lock);
            if(!list_empty(&free_stack))
            {
                sw_desc = SW_ENTRY(free_stack.next);
                list_del(free_stack.next);
            }
            spin_unlock_irq(&free_lock);
        }

        if(!sw_desc)
        {
            /* out of retries: give back what we already hold */
            sw_desc = sw_head;
            spin_lock_irq(&free_lock);
            while(sw_desc)
            {
                sw_desc->status = 0;
                sw_desc->head = NULL;
                sw_desc->tail = NULL;
                list_add(&sw_desc->link, &free_stack);
                sw_desc = (sw_aau_t *) sw_desc->next;
            } /* end while */
            spin_unlock_irq(&free_lock);
            return NULL;
        }

        /* the newest descriptor always terminates the chain */
        sw_desc->next = NULL;

        if(sw_prev)
        {
            sw_prev->next = (aau_sgl_t *) sw_desc;
            sw_prev->aau_desc.NDA = sw_desc->aau_phys;
        }
        else
        {
            sw_head = sw_desc;
        }

        sw_prev = sw_desc;
    } /* end for */

    sw_desc->aau_desc.NDA = 0;
    sw_desc->next = NULL;
    sw_desc->status = 0;
    return (aau_sgl_t *) sw_head;
}

/*======*/
/* Procedure:   aau_return_buffer() */
/* */
/* Description: This function takes an SGL and returns it to the */
/*              free stack. */
/* */
/* Parameters:  aau_context: AAU context */
/*              list: SGL list to return to free stack */
/* */
/* Returns:     N/A */
/* */
/* Notes/Assumptions: */
/* */
/* History:     Dave Jiang 9/11/01 Initial Creation */
/*======*/
void aau_return_buffer(u32 aau_context, aau_sgl_t * list)
{
    sw_aau_t *sw_desc = (sw_aau_t *) list;

    spin_lock_irq(&free_lock);
    while(sw_desc)
    {
        list_add(&sw_desc->link, &free_stack);
        sw_desc = (sw_aau_t *) sw_desc->next;
    }
    spin_unlock_irq(&free_lock);
}

int aau_memcpy(void *dest, void *src, u32 size)
{
    iop310_aau_t *aau = &aau_dev;    /* Global variable */
    aau_head_t head;
    aau_sgl_t *list;
    int err;

    head.total = size;
    head.status = 0;
    head.callback = NULL;

    list = aau_get_buffer((u32) aau, 1);
    if(list)
    {
        head.list = list;
    }
    else
    {
        return -ENOMEM;
    }

    while(list)
    {
        list->status = 0;
        list->src[0] = src;
        list->aau_desc.SAR[0] = (u32) virt_to_phys(src);
        list->dest = dest;
        list->aau_desc.DAR = (u32) virt_to_phys(dest);
        list->aau_desc.BC = size;
        list->aau_desc.DC = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF;
        if(!list->next)
        {
            list->aau_desc.DC |= AAU_DCR_IE;
            list->status |= AAU_END_CHAIN;
            break;
        }
        list = list->next;
    }
    err = aau_queue_buffer((u32) aau, &head);
    aau_return_buffer((u32) aau, head.list);
    return err;
}

EXPORT_SYMBOL_NOVERS(aau_request);
EXPORT_SYMBOL_NOVERS(aau_queue_buffer);
EXPORT_SYMBOL_NOVERS(aau_suspend);
EXPORT_SYMBOL_NOVERS(aau_resume);
EXPORT_SYMBOL_NOVERS(aau_free);
EXPORT_SYMBOL_NOVERS(aau_set_irq_threshold);
EXPORT_SYMBOL_NOVERS(aau_get_buffer);
EXPORT_SYMBOL_NOVERS(aau_return_buffer);
EXPORT_SYMBOL_NOVERS(aau_memcpy);

module_init(aau_init);
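The descriptor pre-allocation in aau_init() over-allocates each descriptor by 0x20 bytes and then moves the pointer to a 32-byte boundary with ((u32) desc & 0xffffffe0) + 0x20. The following user-space sketch reproduces only that arithmetic so it can be checked outside the kernel. The names align_desc() and align_slack() are hypothetical helpers and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_ALIGN 0x20u    /* 32-byte boundary, as used in aau_init() */

/* Round raw down to a 32-byte boundary, then step up by one boundary.
 * This mirrors ((u32) desc & 0xffffffe0) + 0x20 from the listing. */
static uintptr_t align_desc(uintptr_t raw)
{
    return (raw & ~(uintptr_t)(DESC_ALIGN - 1)) + DESC_ALIGN;
}

/* Bytes skipped at the front of the allocation: always 1..32, so a
 * structure of size S still fits inside an allocation of S + 0x20. */
static uintptr_t align_slack(uintptr_t raw)
{
    return align_desc(raw) - raw;
}
```

Note that an address that is already aligned is still advanced by a full 0x20 bytes, which is why the extra 0x20 bytes in the allocation are always sufficient.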
Appendix B Example Calling Source Code
B.1 Standard Calls
Support functions for the 80310 AAU
======
Dave Jiang
Last updated: 09/18/2001
The Intel® 80312 I/O companion chip in the 80310 chipset contains an AAU. The
AAU is capable of processing up to 8 data block sources and performing XOR
operations on them. This unit is typically used to accelerate the XOR
operations performed by RAID storage device drivers such as RAID 5. This
API provides a set of functions for taking advantage of the AAU. The AAU
can also transfer data blocks and so serve as a memory copier. Because the
AAU transfers memory faster than a CPU copy does, it is recommended to use
the AAU for memory copies.
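As a reference point for what the XOR operation computes, the following is a pure software sketch that combines up to 8 source blocks byte by byte. It does not touch the AAU or the driver API. The name xor_blocks() and the XOR_MAX_SOURCES constant are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XOR_MAX_SOURCES 8   /* the text above states up to 8 source blocks */

/* dest[i] = src[0][i] ^ src[1][i] ^ ... ^ src[nsrc - 1][i]
 * Returns 0 on success, -1 if the source count is out of range. */
static int xor_blocks(uint8_t *dest, const uint8_t *const src[],
                      size_t nsrc, size_t len)
{
    size_t i, s;

    if (nsrc == 0 || nsrc > XOR_MAX_SOURCES)
        return -1;

    for (i = 0; i < len; i++) {
        uint8_t acc = src[0][i];
        for (s = 1; s < nsrc; s++)
            acc ^= src[s][i];
        dest[i] = acc;
    }
    return 0;
}
```

A RAID 5 style driver could compare the output of such a routine against the AAU result when validating a new descriptor setup.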
------
int aau_request(u32 *aau_context, const char *device_id);
This function allows the user to acquire control of the AAU. The
function returns an AAU context to the user and allocates
an interrupt for the AAU. The user must pass the context as a parameter to
the other AAU API calls.
int aau_queue_buffer(u32 aau_context, aau_head_t *listhead);
This function starts the AAU operation. The user must create an SGL
header with an SGL attached. The format is presented below. The SGL is
built from kernel memory.
/* hardware descriptor */
typedef struct _aau_desc
{
    u32 NDA;                    /* next descriptor address [READONLY] */
    u32 SAR[AAU_SAR_GROUP];     /* src addrs */
    u32 DAR;                    /* destination addr */
    u32 BC;                     /* byte count */
    u32 DC;                     /* descriptor control */
    u32 SARE[AAU_SAR_GROUP];    /* extended src addrs */
} aau_desc_t;

/* user SGL format */
typedef struct _aau_sgl
{
    aau_desc_t aau_desc;            /* AAU HW Desc */
    u32 status;                     /* status of SGL [READONLY] */
    struct _aau_sgl *next;          /* pointer to next SG [READONLY] */
    void *dest;                     /* destination addr */
    void *src[AAU_SAR_GROUP];       /* source addr[4] */
    void *ext_src[AAU_SAR_GROUP];   /* ext src addr[4] */
    u32 total_src;                  /* total number of source */
} aau_sgl_t;

/* header for user SGL */
typedef struct _aau_head
{
    u32 total;                  /* total descriptors allocated */
    u32 status;                 /* SGL status */
    aau_sgl_t *list;            /* ptr to head of list */
    aau_callback_t callback;    /* callback func ptr */
} aau_head_t;
The function will call aau_start() and start the AAU after it queues
the SGL to the processing queue. The function will then either
a. Sleep on the wait queue aau->wait_q if no callback has been provided, or
b. Return immediately; the provided callback function is called once the
AAU interrupt has been triggered.
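The driver recognizes the end of a chain by the interrupt-enable bit and the end-of-chain status on the last SGL element, as the loop in aau_memcpy() in Appendix A shows. The sketch below models that marking step in user space with a cut-down structure. The struct demo_sgl type, demo_prepare_chain(), and the DEMO_* bit values are placeholders for illustration only; the real definitions live in iop310-aau.h and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values, NOT the real register encodings. */
#define DEMO_DCR_WRITE  0x80000000u
#define DEMO_DCR_IE     0x00000001u
#define DEMO_END_CHAIN  0x00000010u

struct demo_sgl {
    uint32_t dc;            /* stands in for aau_desc.DC */
    uint32_t status;        /* stands in for status */
    struct demo_sgl *next;  /* stands in for next */
};

/* Program every element for a write and flag only the last one, so the
 * chain carries exactly one interrupt-enabled end-of-chain descriptor.
 * Returns the number of elements visited. */
static int demo_prepare_chain(struct demo_sgl *list)
{
    int count = 0;

    for (; list != NULL; list = list->next) {
        list->status = 0;
        list->dc = DEMO_DCR_WRITE;
        if (list->next == NULL) {
            list->dc |= DEMO_DCR_IE;
            list->status |= DEMO_END_CHAIN;
        }
        count++;
    }
    return count;
}
```

Note that the flag is OR-ed into the control word on the last element; assigning it would discard the write command set just before.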
int aau_suspend(u32 aau_context);
Stops/Suspends the AAU operation
void aau_free(u32 aau_context);
Releases ownership of the AAU. Call it when AAU service is no longer needed.
aau_sgl_t * aau_get_buffer(u32 aau_context, int num_buf);
This function obtains an AAU SGL for the user. The user must specify the
number of descriptors to be allocated in the chain that is returned.
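The allocation behaves as all-or-nothing: either the full chain of the requested length is returned, or nothing is taken from the free stack. The following user-space sketch models that behavior with a small static pool. It leaves out the locking and the sleep-and-retry step of the driver, and every demo_* name and the pool size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_POOL_SIZE 4    /* hypothetical; the driver uses MAX_AAU_DESC */

struct demo_desc {
    struct demo_desc *next;
};

static struct demo_desc demo_pool[DEMO_POOL_SIZE];
static struct demo_desc *demo_free;     /* top of the free stack */

/* Push every pool entry onto the free stack. */
static void demo_pool_init(void)
{
    int i;

    demo_free = NULL;
    for (i = 0; i < DEMO_POOL_SIZE; i++) {
        demo_pool[i].next = demo_free;
        demo_free = &demo_pool[i];
    }
}

/* Count the entries currently on the free stack. */
static int demo_free_count(void)
{
    int n = 0;
    struct demo_desc *d;

    for (d = demo_free; d != NULL; d = d->next)
        n++;
    return n;
}

/* Push a whole NULL-terminated chain back onto the free stack. */
static void demo_return_buffer(struct demo_desc *list)
{
    while (list != NULL) {
        struct demo_desc *next = list->next;
        list->next = demo_free;
        demo_free = list;
        list = next;
    }
}

/* Pop num entries and link them into a NULL-terminated chain. On a
 * shortage, everything already taken is pushed back and NULL returned. */
static struct demo_desc *demo_get_buffer(int num)
{
    struct demo_desc *head = NULL, *tail = NULL, *d;
    int i;

    if (num <= 0 || num > DEMO_POOL_SIZE)
        return NULL;

    for (i = 0; i < num; i++) {
        d = demo_free;
        if (d == NULL) {
            demo_return_buffer(head);
            return NULL;
        }
        demo_free = d->next;
        d->next = NULL;
        if (tail != NULL)
            tail->next = d;
        else
            head = d;
        tail = d;
    }
    return head;
}
```

Terminating each newly taken entry with NULL before linking it keeps the partial chain walkable if the request has to be rolled back.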
void aau_return_buffer(u32 aau_context, aau_sgl_t *list);
This function returns the entire SGL to the API once the user is done with it.
int aau_memcpy(void *dest, void *src, u32 size);
This function is a shortcut for copying memory with the AAU, which handles
large block copies better than the CPU. It is used like a typical memcpy()
call.
* The user is responsible for the source address(es) and the destination
address. The source and destination should all be cached memory.
void aau_test(void)
{
    u32 aau;
    char dev_id[] = "AAU";
    int size = 2;
    int err = 0;
    aau_head_t *head;
    aau_sgl_t *list;
    void *src, *dest;

    printk("Starting AAU test\n");
    if((err = aau_request(&aau, dev_id)) < 0)
    {
        printk("test - AAU request failed: %d\n", err);
        return;
    }
    else
    {
        printk("test - AAU request successful\n");
    }

    head = kmalloc(sizeof(aau_head_t), GFP_KERNEL);
    src = kmalloc(1024, GFP_KERNEL);
    dest = kmalloc(1024, GFP_KERNEL);
    if(!head || !src || !dest)
    {
        printk("Can't allocate memory\n");
        goto out;
    }
    head->total = size;
    head->status = 0;
    head->callback = NULL;

    list = aau_get_buffer(aau, size);
    if(!list)
    {
        printk("Can't get buffers\n");
        goto out;
    }
    head->list = list;

    while(list)
    {
        list->status = 0;
        /* same setup as aau_memcpy(): aau_desc is embedded in the SGL */
        /* and the AAU is given physical addresses */
        list->src[0] = src;
        list->aau_desc.SAR[0] = (u32) virt_to_phys(src);
        list->dest = dest;
        list->aau_desc.DAR = (u32) virt_to_phys(dest);
        list->aau_desc.BC = 1024;
        /* see iop310-aau.h for more DCR commands */
        list->aau_desc.DC = AAU_DCR_WRITE | AAU_DCR_BLKCTRL_1_DF;
        if(!list->next)
        {
            /* last descriptor: enable the interrupt and mark end of chain */
            list->aau_desc.DC |= AAU_DCR_IE;
            list->status |= AAU_END_CHAIN;
            break;
        }
        list = list->next;
    }

    printk("test- Queueing buffer for AAU operation\n");
    err = aau_queue_buffer(aau, head);
    if(err >= 0)
    {
        printk("AAU Queue Buffer is done...\n");
    }
    else
    {
        printk("AAU Queue Buffer failed...: %d\n", err);
    }

    printk("freeing the AAU\n");
    aau_return_buffer(aau, head->list);
out:
    aau_free(aau);
    kfree(src);
    kfree(dest);
    kfree(head);
}
All Disclaimers apply. Use this at your own discretion. Neither Intel nor I
will be responsible if anything goes wrong. =)
TODO
____
* Testing
* Do zero-size AAU transfer/channel at init
so all we have to do is chaining
Appendix C MMU Functions for Intel® XScale™ Microarchitecture
/*
 * linux/arch/arm/mm/proc-xscale.S
 *
 * Author:     Nicolas Pitre
 * Created:    November 2000
 * Copyright:  (C) 2000, 2001 MontaVista Software Inc.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * MMU functions for the Intel® XScale™ microarchitecture
 *
 * 2001 Aug 21:
 *     some contributions by Brett Gaines
 *     Copyright 2001 by Intel Corp.
 *
 * 2001 Sep 08:
 *     Completely revisited, many important fixes
 *     Nicolas Pitre
 */
#include
#include
#include
#include
#include
/*
 * This is the maximum size of an area which will be flushed. If the area
 * is larger than this, then we flush the whole cache
 */
#define MAX_AREA_SIZE   32768

/*
 * the cache line size of the I and D cache
 */
#define CACHELINESIZE   32

/*
 * the size of the data cache
 */
#define CACHESIZE       32768

/*
 * and the page size
 */
#define PAGESIZE        4096

/*
 * Virtual address used to allocate the cache when flushed
 *
 * This must be an address range which is _never_ used. It should
 * apparently have a mapping in the corresponding page table for
 * compatibility with future CPUs that _could_ require it. For instance we
 * don't care.
 *
 * This must be aligned on a 2*CACHESIZE boundary. The code selects one of
 * the 2 areas alternating each time the clean_d_cache macro is used.
 * Without this the Intel® XScale™ core exhibits cache eviction problems
 * and no one knows why.
 *
 * Reminder: the vector table is located at 0xffff0000-0xffff0fff.
 */
#define CLEAN_ADDR      0xfffe0000

/*
 * This macro is used to wait for a CP15 write and is needed
 * when we have to ensure that the last operation to the co-pro
 * was completed before continuing with operation.
 */
        .macro  cpwait, rd
        mrc     p15, 0, \rd, c2, c0, 0    @ arbitrary read of cp15
        mov     \rd, \rd                  @ wait for completion
        sub     pc, pc, #4                @ flush instruction pipeline
        .endm

        .macro  cpwait_ret, lr, rd
        mrc     p15, 0, \rd, c2, c0, 0    @ arbitrary read of cp15
        sub     pc, \lr, \rd, LSR #32     @ wait for completion and
                                          @ flush instruction pipeline
        .endm

/*
 * This macro cleans the entire dcache using line allocate.
 * The main loop has been unrolled to reduce loop overhead.
 * rd and rs are two scratch registers.
 */
        .macro  clean_d_cache, rd, rs
        ldr     \rs, =clean_addr
        ldr     \rd, [\rs]
        eor     \rd, \rd, #CACHESIZE
        str     \rd, [\rs]
        add     \rs, \rd, #CACHESIZE
1:      mcr     p15, 0, \rd, c7, c2, 5    @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5    @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5    @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        mcr     p15, 0, \rd, c7, c2, 5    @ allocate D cache line
        add     \rd, \rd, #CACHELINESIZE
        teq     \rd, \rs
        bne     1b
        .endm

        .data
clean_addr:     .word   CLEAN_ADDR

        .text

/*
 * cpu_xscale_data_abort()
 *
 * obtain information about current aborted instruction
 *
 * r0 = address of aborted instruction
 *
 * Returns:
 *  r0 = address of abort
 *  r1 != 0 if writing
 *  r3 = FSR
 */
        .align  5
ENTRY(cpu_xscale_data_abort)
        mov     r2, r0
        mrc     p15, 0, r0, c6, c0, 0    @ get FAR
        mrc     p15, 0, r3, c5, c0, 0    @ get FSR
        ldr     r1, [r2]                 @ read aborted instruction
        tst     r1, r1, lsr #21          @ C = bit 20
        sbc     r1, r1, r1               @ r1 = C - 1
        and     r3, r3, #255
        mov     pc, lr

/*
 * cpu_xscale_check_bugs()
 */
ENTRY(cpu_xscale_check_bugs)
        mrs     ip, cpsr
        bic     ip, ip, #F_BIT
        msr     cpsr, ip
        mov     pc, lr

/*
 * cpu_xscale_proc_init()
 *
 * Nothing too exciting at the moment
 */
ENTRY(cpu_xscale_proc_init)
        mov     pc, lr

/*
 * cpu_xscale_proc_fin()
 */
ENTRY(cpu_xscale_proc_fin)
        str     lr, [sp, #-4]!
        mov     r0, #F_BIT|I_BIT|SVC_MODE
        msr     cpsr_c, r0
        mrc     p15, 0, r0, c1, c0, 0    @ ctrl register
        bic     r0, r0, #0x1800          @ ...IZ...........
        bic     r0, r0, #0x0006          @ .............CA.
        mcr     p15, 0, r0, c1, c0, 0    @ disable caches
        bl      cpu_xscale_cache_clean_invalidate_all    @ clean caches
        ldr     pc, [sp], #4

/*
 * cpu_xscale_reset(loc)
 *
 * Perform a soft reset of the system. Put the CPU into the
 * same state as it would be if it had been reset, and branch
 * to what would be the reset vector.
 *
 * loc: location to jump to for soft reset
 */
        .align  5
ENTRY(cpu_xscale_reset)
        mov     r1, #F_BIT|I_BIT|SVC_MODE
        msr     cpsr_c, r1               @ reset CPSR
        mrc     p15, 0, r1, c1, c0, 0    @ ctrl register
        bic     r1, r1, #0x0086          @ ........B....CA.
        bic     r1, r1, #0x1900          @ ...IZ..S........
        mcr     p15, 0, r1, c1, c0, 0    @ ctrl register
        mcr     p15, 0, ip, c7, c7, 0    @ invalidate I,D caches & BTB
        bic     r1, r1, #0x0001          @ ...............M
        mcr     p15, 0, r1, c1, c0, 0    @ ctrl register
        @ CAUTION: MMU turned off from this point. We count on the pipeline
        @ already containing those two last instructions to survive.
        mcr     p15, 0, ip, c8, c7, 0    @ invalidate I & D TLBs
        mov     pc, r0

/*
 * cpu_xscale_do_idle(type)
 *
 * Cause the processor to idle
 *
 * type:
 *  0 = slow idle
 *  1 = fast idle
 *  2 = switch to slow processor clock
 *  3 = switch to fast processor clock
 *
 * For now we do nothing but go to idle mode for every case
 *
 * Intel® XScale™ microarchitecture supports clock switching, but using
 * idle mode support allows external hardware to react to system state
 * changes.
 */
        .align  5
ENTRY(cpu_xscale_do_idle)
        mov     r0, #1
        mcr     p14, 0, r0, c7, c0, 0    @ Go to IDLE
        mov     pc, lr

/* ====== CACHE ====== */

/*
 * cpu_xscale_cache_clean_invalidate_all (void)
 *
 * clean and invalidate all cache lines
 *
 * Note:
 *  1. We should preserve r0 at all times.
 *  2. Even if this function implies cache "invalidation" by its name,
 *     we don't need to actually use explicit invalidation operations
 *     since the goal is to discard all valid references from the cache
 *     and the cleaning of it already has that effect.
 *  3. Because of 2 above and the fact that kernel space memory is always
 *     coherent across task switches there is no need to worry about
 *     inconsistencies due to interrupts, hence no irq disabling.
 */
        .align  5
ENTRY(cpu_xscale_cache_clean_invalidate_all)
        mov     r2, #1
cpu_xscale_cache_clean_invalidate_all_r2:
        clean_d_cache r0, r1
        teq     r2, #0
        mcrne   p15, 0, ip, c7, c5, 0     @ Invalidate I cache & BTB
        mcr     p15, 0, ip, c7, c10, 4    @ Drain Write (& Fill) Buffer
        mov     pc, lr

/*
 * cpu_xscale_cache_clean_invalidate_range(start, end, flags)
 *
 * clean and invalidate all cache lines associated with this area of memory
 *
 * start: Area start address
 * end:   Area end address
 * flags: nonzero for I cache as well
 */
        .align  5
ENTRY(cpu_xscale_cache_clean_invalidate_range)
        bic     r0, r0, #CACHELINESIZE - 1    @ round down to cache line
        sub     r3, r1, r0
        cmp     r3, #MAX_AREA_SIZE
        bhi     cpu_xscale_cache_clean_invalidate_all_r2
1:      mcr     p15, 0, r0, c7, c10, 1    @ Clean D cache line
        mcr     p15, 0, r0, c7, c6, 1     @ Invalidate D cache line
        add     r0, r0, #CACHELINESIZE
        cmp     r0, r1
        blo     1b
        teq     r2, #0
        mcr     p15, 0, ip, c7, c10, 4    @ Drain Write (& Fill) Buffer
        moveq   pc, lr
        sub     r0, r0, r3
1:      mcr     p15, 0, r0, c7, c5, 1     @ Invalidate I cache line
        add     r0, r0, #CACHELINESIZE
        cmp     r0, r1
        blo     1b
        mcr     p15, 0, ip, c7, c5, 6     @ Invalidate BTB
        mov     pc, lr

/*
 * cpu_xscale_flush_ram_page(page)
 *
 * clean all cache lines associated with this memory page
 *
 * page: page to clean
 */
        .align  5
ENTRY(cpu_xscale_flush_ram_page)
        mov     r1, #PAGESIZE
1:      mcr     p15, 0, r0, c7, c10, 1    @ Clean D cache line
        add     r0, r0, #CACHELINESIZE
        mcr     p15, 0, r0, c7, c10, 1    @ Clean D cache line
        add     r0, r0, #CACHELINESIZE
        subs    r1, r1, #2 * CACHELINESIZE
        bne     1b
        mcr     p15, 0, ip, c7, c10, 4    @ Drain Write (& Fill) Buffer
        mov     pc, lr

/* ====== D-CACHE ====== */

/*
 * cpu_xscale_dcache_invalidate_range(start, end)
 *
 * throw away all D-cached data in specified region without an obligation
 * to write them back. Note however that on Intel® XScale™ microarchitecture
 * we must clean all entries also due to hardware errata (80200 A0 & A1
 * only).
 *
 * start: virtual start address
 * end:   virtual end address
 */
        .align  5
ENTRY(cpu_xscale_dcache_invalidate_range)
        mrc     p15, 0, r2, c0, c0, 0     @ Read part no.
        eor     r2, r2, #0x69000000
        eor     r2, r2, #0x00052000       @ 80200 XX part no.
        bics    r2, r2, #0x1              @ Clear LSB in revision field
        moveq   r2, #0
        beq     cpu_xscale_cache_clean_invalidate_range    @ An 80200 A0 or A1
        tst     r0, #CACHELINESIZE - 1
        mcrne   p15, 0, r0, c7, c10, 1    @ Clean D cache line
        tst     r1, #CACHELINESIZE - 1
        mcrne   p15, 0, r1, c7, c10, 1    @ Clean D cache line
        bic     r0, r0, #CACHELINESIZE - 1    @ round down to cache line
1:      mcr     p15, 0, r0, c7, c6, 1     @ Invalidate D cache line
        add     r0, r0, #CACHELINESIZE
        cmp     r0, r1
        blo     1b
        mov     pc, lr

/*
* cpu_xscale_dcache_clean_range(start, end)
*
* For the specified virtual address range, ensure that all caches contain
* clean data, such that peripheral accesses to the physical RAM fetch
* correct data.
*
* start: virtual start address
* end: virtual end address
*/
.align 5
ENTRY(cpu_xscale_dcache_clean_range)
bic r0, r0, #CACHELINESIZE - 1
sub r2, r1, r0
cmp r2, #MAX_AREA_SIZE
movhi r2, #0
bhi cpu_xscale_cache_clean_invalidate_all_r2
1: mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
cmp r0, r1
blo 1b
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
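The unrolled loop in cpu_xscale_dcache_clean_range can be traced on a host with a small C model. This is only a sketch under stated assumptions: the function name and output buffer are hypothetical, the 32-byte line size is taken from the "Preload 32 bytes" comment in xscale_dcache_lock, and the MAX_AREA_SIZE shortcut to the clean-everything routine is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHELINESIZE 32u  /* assumed: 32-byte cache lines */

/* Hypothetical model (not in the original listing) of the unrolled loop.
 * Records each line address that would be cleaned into out[] (up to max
 * entries) and returns the total number of lines cleaned. */
size_t model_dcache_clean_range(uint32_t start, uint32_t end,
                                uint32_t *out, size_t max)
{
    size_t n = 0;
    uint32_t addr = start & ~(CACHELINESIZE - 1u);  /* bic: round down */

    do {
        if (n < max) out[n] = addr;                 /* first mcr  */
        n++;
        addr += CACHELINESIZE;
        if (n < max) out[n] = addr;                 /* second mcr */
        n++;
        addr += CACHELINESIZE;
    } while (addr < end);                           /* cmp / blo 1b */

    return n;
}
```

Because two lines are cleaned per iteration, the model shows that the loop can clean one line beyond the end address (for example, start 0x1004 and end 0x1020 clean the lines at 0x1000 and 0x1020). Cleaning an extra line only writes back data that was already dirty, so no data is lost.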
/*
* cpu_xscale_dcache_clean_page(page)
*
* Cleans a single page of dcache so that if we have any future aliased
* mappings, they will be consistent at the time that they are created.
*
* Note:
* 1. we don't need to flush the write buffer in this case.
* 2. we don't invalidate the entries since when we write the page
* out to disk, the entries may get reloaded into the cache.
*/
.align 5
ENTRY(cpu_xscale_dcache_clean_page)
mov r1, #PAGESIZE
1: mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
add r0, r0, #CACHELINESIZE
subs r1, r1, #4 * CACHELINESIZE
bne 1b
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
/*
* cpu_xscale_dcache_clean_entry(addr)
*
* Clean the specified entry of any caches such that the MMU
* translation fetches will obtain correct data.
*
* addr: cache-unaligned virtual address
*/
.align 5
ENTRY(cpu_xscale_dcache_clean_entry)
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
/* ====== I-CACHE ====== */
/*
* cpu_xscale_icache_invalidate_range(start, end)
*
* invalidate a range of virtual addresses from the Icache
*
* start: virtual start address
* end: virtual end address
*
* Note: judging by how this function is used, it is expected to bring the
* D-cache and the I-cache into sync for the given range.
*/
.align 5
ENTRY(cpu_xscale_icache_invalidate_range)
bic r0, r0, #CACHELINESIZE - 1
1: mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
mcr p15, 0, r0, c7, c5, 1 @ Invalidate I cache line
add r0, r0, #CACHELINESIZE
cmp r0, r1
blo 1b
mcr p15, 0, ip, c7, c5, 6 @ Invalidate BTB
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
/*
* cpu_xscale_icache_invalidate_page(page)
*
* invalidate all Icache lines associated with this area of memory
*
* page: page to invalidate
*/
.align 5
ENTRY(cpu_xscale_icache_invalidate_page)
mov r1, #PAGESIZE
1: mcr p15, 0, r0, c7, c5, 1 @ Invalidate I cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c5, 1 @ Invalidate I cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c5, 1 @ Invalidate I cache line
add r0, r0, #CACHELINESIZE
mcr p15, 0, r0, c7, c5, 1 @ Invalidate I cache line
add r0, r0, #CACHELINESIZE
subs r1, r1, #4 * CACHELINESIZE
bne 1b
mcr p15, 0, r0, c7, c5, 6 @ Invalidate BTB
mov pc, lr
/* ====== CACHE LOCKING ======
*
* The Intel® XScale™ microarchitecture implements support for locking entries into
* the data and instruction cache. The following functions implement the core
* low level instructions needed to accomplish the locking. The developer's
* manual states that the code that performs the locking must be in non-cached
* memory. To accomplish this, the code in xscale-cache-lock.c copies the
* following functions from the cache into a non-cached memory region that
* is allocated through consistent_alloc().
*
*/
.align 5
/*
* xscale_icache_lock
*
* r0: starting address to lock
* r1: end address to lock
*/
ENTRY(xscale_icache_lock)
iLockLoop:
bic r0, r0, #CACHELINESIZE - 1
mcr p15, 0, r0, c9, c1, 0 @ lock into cache
cmp r0, r1 @ are we done?
add r0, r0, #CACHELINESIZE @ advance to next cache line
bls iLockLoop
mov pc, lr
/*
* xscale_icache_unlock
*/
ENTRY(xscale_icache_unlock)
mcr p15, 0, r0, c9, c1, 1 @ Unlock icache
mov pc, lr
/*
* xscale_dcache_lock
*
* r0: starting address to lock
* r1: end address to lock
*/
ENTRY(xscale_dcache_lock)
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov r2, #1
mcr p15, 0, r2, c9, c2, 0 @ Put dcache in lock mode
cpwait ip @ Wait for completion
mrs r2, cpsr
orr r3, r2, #F_BIT | I_BIT
dLockLoop:
msr cpsr_c, r3
mcr p15, 0, r0, c7, c10, 1 @ Write back line if it is dirty
mcr p15, 0, r0, c7, c6, 1 @ Flush/invalidate line
msr cpsr_c, r2
ldr ip, [r0], #CACHELINESIZE @ Preload 32 bytes into cache from
@ location [r0]. Post-increment
@ r0 to next cache line
cmp r0, r1 @ Are we done?
bls dLockLoop
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov r2, #0
mcr p15, 0, r2, c9, c2, 0 @ Get out of lock mode
cpwait_ret lr, ip
/*
* xscale_dcache_unlock
*/
ENTRY(xscale_dcache_unlock)
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mcr p15, 0, ip, c9, c2, 1 @ Unlock cache
mov pc, lr
/*
* Needed to determine the length of the code that needs to be copied.
*/
.align 5
ENTRY(xscale_cache_dummy)
mov pc, lr
/* ====== TLB ====== */
/*
* cpu_xscale_tlb_invalidate_all()
*
* Invalidate all TLB entries
*/
.align 5
ENTRY(cpu_xscale_tlb_invalidate_all)
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mcr p15, 0, ip, c8, c7, 0 @ invalidate I & D TLBs
cpwait_ret lr, ip
/*
* cpu_xscale_tlb_invalidate_range(start, end)
*
* invalidate TLB entries covering the specified range
*
* start: range start address
* end: range end address
*/
.align 5
ENTRY(cpu_xscale_tlb_invalidate_range)
bic r0, r0, #(PAGESIZE - 1) & 0x00ff
bic r0, r0, #(PAGESIZE - 1) & 0xff00
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
1: mcr p15, 0, r0, c8, c6, 1 @ invalidate D TLB entry
mcr p15, 0, r0, c8, c5, 1 @ invalidate I TLB entry
add r0, r0, #PAGESIZE
cmp r0, r1
blo 1b
cpwait_ret lr, ip
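cpu_xscale_tlb_invalidate_range rounds the start address down to a page boundary with two bic instructions rather than one. An ARM data-processing immediate is an 8-bit value rotated right by an even amount, so a 12-bit mask cannot be encoded in a single instruction. The sketch below illustrates both points in C. The function names are hypothetical, and the 4 KB page size is an assumption, since PAGESIZE is not defined in this part of the listing.

```c
#include <stdint.h>

#define PAGESIZE 4096u  /* assumed: 4 KB pages; not defined in this section */

/* Returns 1 if v fits an ARM data-processing immediate: an 8-bit value
 * rotated right by an even number of bit positions. */
int is_arm_immediate(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating v left by rot undoes a rotate-right by rot. */
        uint32_t undone = rot ? ((v << rot) | (v >> (32 - rot))) : v;
        if (undone <= 0xffu)
            return 1;
    }
    return 0;
}

/* Hypothetical model of the two bic instructions that round r0 down
 * to a page boundary. */
uint32_t page_round_down(uint32_t addr)
{
    addr &= ~((PAGESIZE - 1u) & 0x00ffu);  /* first bic: clear bits 0-7   */
    addr &= ~((PAGESIZE - 1u) & 0xff00u);  /* second bic: clear bits 8-11 */
    return addr;
}
```

With these definitions, is_arm_immediate(0xfff) is 0 while 0xff and 0xf00 are both encodable, and page_round_down(0x12345) yields 0x12000. The same encoding limit explains why the part-number check in cpu_xscale_dcache_invalidate_range uses two eor instructions: 0x69000000 and 0x00052000 are each encodable, but 0x69052000 is not.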
/*
* cpu_xscale_tlb_invalidate_page(page, flags)
*
* invalidate the TLB entries for the specified page.
*
* page: page to invalidate
* flags: non-zero if we include the I TLB
*/
.align 5
ENTRY(cpu_xscale_tlb_invalidate_page)
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
teq r1, #0
mcr p15, 0, r0, c8, c6, 1 @ invalidate D TLB entry
mcrne p15, 0, r0, c8, c5, 1 @ invalidate I TLB entry
cpwait_ret lr, ip
/* ====== TLB LOCKING ======
*
* The Intel® XScale™ microarchitecture implements support for locking entries into
* the Instruction and Data TLBs. The following functions provide the
* low-level support for using these under Linux. xscale-lock.c
* implements some higher-level management code. Most of the following
* is taken straight from the Developer's Manual.
*/
/*
* Lock I-TLB entry
*
* r0: Virtual address to translate and lock
*/
.align 5
ENTRY(xscale_itlb_lock)
mrs r2, cpsr
orr r3, r2, #F_BIT | I_BIT
msr cpsr_c, r3 @ Disable interrupts
mcr p15, 0, r0, c8, c5, 1 @ Invalidate I-TLB entry
mcr p15, 0, r0, c10, c4, 0 @ Translate and lock
msr cpsr_c, r2 @ Restore interrupts
cpwait_ret lr, ip
/*
* Lock D-TLB entry
*
* r0: Virtual address to translate and lock
*/
.align 5
ENTRY(xscale_dtlb_lock)
mrs r2, cpsr
orr r3, r2, #F_BIT | I_BIT
msr cpsr_c, r3 @ Disable interrupts
mcr p15, 0, r0, c8, c6, 1 @ Invalidate D-TLB entry
mcr p15, 0, r0, c10, c8, 0 @ Translate and lock
msr cpsr_c, r2 @ Restore interrupts
cpwait_ret lr, ip
/*
* Unlock all I-TLB entries
*/
.align 5
ENTRY(xscale_itlb_unlock)
mcr p15, 0, ip, c10, c4, 1 @ Unlock I-TLB
mcr p15, 0, ip, c8, c5, 0 @ Invalidate I-TLB
cpwait_ret lr, ip
/*
* Unlock all D-TLB entries
*/
ENTRY(xscale_dtlb_unlock)
mcr p15, 0, ip, c10, c8, 1 @ Unlock D-TLB
mcr p15, 0, ip, c8, c6, 0 @ Invalidate D-TLB
cpwait_ret lr, ip
/* ====== Page Table ====== */
#define USER_CACHE_WRITE_ALLOCATE 1
#define KERN_CACHE_WRITE_ALLOCATE 1
#define PMD_TYPE_MASK 0x0003
#define PMD_TYPE_SECT 0x0002
#define PMD_SECT_BUFFERABLE 0x0004
#define PMD_SECT_CACHEABLE 0x0008
#define PMD_SECT_TEX_X 0x1000
#define HPTE_TYPE_SMALLEXT 0x0003
#define HPTE_SMALLEXT_TEX_X 0x0040
/*
* cpu_xscale_set_pgd(pgd)
*
* Set the translation base pointer to be as described by pgd.
*
* pgd: new page tables
*/
.align 5
ENTRY(cpu_xscale_set_pgd)
clean_d_cache r1, r2
mcr p15, 0, ip, c7, c5, 0 @ Invalidate I cache & BTB
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mcr p15, 0, r0, c2, c0, 0 @ load page table pointer
mcr p15, 0, ip, c8, c7, 0 @ invalidate I & D TLBs
cpwait_ret lr, ip
/*
* cpu_xscale_set_pmd(pmdp, pmd)
*
* Set a level 1 translation table entry, and clean it out of
* any caches such that the MMUs can load it correctly.
*
* pmdp: pointer to PMD entry
* pmd: PMD value to store
*/
.align 5
ENTRY(cpu_xscale_set_pmd)
#if KERN_CACHE_WRITE_ALLOCATE
and r2, r1, #PMD_TYPE_MASK|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE
cmp r2, #PMD_TYPE_SECT|PMD_SECT_CACHEABLE|PMD_SECT_BUFFERABLE
orreq r1, r1, #PMD_SECT_TEX_X
#endif
str r1, [r0]
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
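The value adjustment that cpu_xscale_set_pmd applies before storing the entry can be expressed as a short C function. This is a sketch for illustration only: the function name is hypothetical, the constants are the ones given by the #define lines in the listing, and it models the case where KERN_CACHE_WRITE_ALLOCATE is nonzero.

```c
#include <stdint.h>

/* Values copied from the #define lines in the listing. */
#define PMD_TYPE_MASK       0x0003u
#define PMD_TYPE_SECT       0x0002u
#define PMD_SECT_BUFFERABLE 0x0004u
#define PMD_SECT_CACHEABLE  0x0008u
#define PMD_SECT_TEX_X      0x1000u

/* Hypothetical model (not in the original listing): returns the value
 * that would be stored for a given level 1 entry. */
uint32_t model_set_pmd_value(uint32_t pmd)
{
    const uint32_t mask = PMD_TYPE_MASK | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE;
    const uint32_t want = PMD_TYPE_SECT | PMD_SECT_CACHEABLE | PMD_SECT_BUFFERABLE;

    if ((pmd & mask) == want)      /* and + cmp */
        pmd |= PMD_SECT_TEX_X;     /* orreq     */
    return pmd;
}
```

Only section entries that are both cacheable and bufferable gain the TEX X bit; for example, 0xC000000E becomes 0xC000100E, while 0xC0000002 is stored unchanged. The macro name suggests that this bit selects a write-allocate cache policy for those sections.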
/*
* cpu_xscale_set_pte(ptep, pte)
*
* Set a PTE and flush it out
*/
.align 5
ENTRY(cpu_xscale_set_pte)
str r1, [r0], #-1024 @ linux version
bic r2, r1, #0xff0
bic r2, r2, #3
eor r1, r1, #LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE | LPTE_DIRTY | LPTE_BUFFERABLE | LPTE_CACHEABLE
tst r1, #LPTE_USER | LPTE_EXEC @ User or Exec?
orrne r2, r2, #HPTE_AP_READ
tst r1, #LPTE_WRITE | LPTE_DIRTY @ Write and Dirty?
orreq r2, r2, #HPTE_AP_WRITE
#if USER_CACHE_WRITE_ALLOCATE
tst r1, #LPTE_CACHEABLE | LPTE_BUFFERABLE @ B and C
orrne r2, r2, #HPTE_TYPE_SMALL
biceq r2, r2, #0x0fc0 @ clear non-existent AP[1-3]
orreq r2, r2, #HPTE_TYPE_SMALLEXT | HPTE_SMALLEXT_TEX_X
#else
orr r2, r2, #HPTE_TYPE_SMALL
#endif
tst r1, #LPTE_PRESENT | LPTE_YOUNG @ Present and Young?
movne r2, #0
str r2, [r0] @ hardware version
mov r0, r0
mcr p15, 0, r0, c7, c10, 1 @ Clean D cache line
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mov pc, lr
.ltorg
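The conversion from the Linux PTE to the hardware PTE in cpu_xscale_set_pte relies on an eor that inverts the tested flags, so that a later "tst ... / eq" means "all of these flags were set". The C sketch below models that logic for the USER_CACHE_WRITE_ALLOCATE case. It is illustrative only: the function name is hypothetical, and the LPTE_* values, HPTE_TYPE_SMALL, HPTE_AP_READ and HPTE_AP_WRITE are assumed values, because the listing uses these names without defining them in this section.

```c
#include <stdint.h>

/* From the #define lines in the listing. */
#define HPTE_TYPE_SMALLEXT  0x0003u
#define HPTE_SMALLEXT_TEX_X 0x0040u

/* ASSUMED values: used by the listing but defined elsewhere. */
#define LPTE_PRESENT    0x0001u
#define LPTE_YOUNG      0x0002u
#define LPTE_BUFFERABLE 0x0004u
#define LPTE_CACHEABLE  0x0008u
#define LPTE_USER       0x0010u
#define LPTE_WRITE      0x0020u
#define LPTE_EXEC       0x0040u
#define LPTE_DIRTY      0x0080u
#define HPTE_TYPE_SMALL 0x0002u
#define HPTE_AP_READ    0x0aa0u
#define HPTE_AP_WRITE   0x0550u

/* Hypothetical model (not in the original listing) of the hardware PTE
 * computed from a Linux PTE. */
uint32_t model_hw_pte(uint32_t lpte)
{
    /* bic #0xff0, bic #3: keep bits 31-12 and bits 3-2 only. */
    uint32_t hw = lpte & ~0xff0u & ~0x3u;
    /* eor: a flag bit in inv is now 0 if the flag was set in lpte. */
    uint32_t inv = lpte ^ (LPTE_PRESENT | LPTE_YOUNG | LPTE_WRITE |
                           LPTE_DIRTY | LPTE_BUFFERABLE | LPTE_CACHEABLE);

    if (inv & (LPTE_USER | LPTE_EXEC))        /* user or exec set        */
        hw |= HPTE_AP_READ;
    if (!(inv & (LPTE_WRITE | LPTE_DIRTY)))   /* write and dirty both set */
        hw |= HPTE_AP_WRITE;
    if (inv & (LPTE_CACHEABLE | LPTE_BUFFERABLE)) {
        hw |= HPTE_TYPE_SMALL;                /* not both C and B        */
    } else {
        hw &= ~0x0fc0u;                       /* extended page: one AP field */
        hw |= HPTE_TYPE_SMALLEXT | HPTE_SMALLEXT_TEX_X;
    }
    if (inv & (LPTE_PRESENT | LPTE_YOUNG))    /* not present or not young */
        hw = 0;
    return hw;
}
```

Under the assumed flag values, a present, young, cacheable, bufferable, writable and dirty kernel page at 0x12345000 maps to 0x1234505F, and any entry that is not both present and young maps to 0, which forces a fault on access.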
cpu_manu_name:
.asciz "Intel"
cpu_80200_name:
.asciz "XScale-80200"
cpu_cotulla_name:
.asciz "XScale-Cotulla"
.align
.section ".text.init", #alloc, #execinstr
__xscale_setup:
mov r0, #F_BIT|I_BIT|SVC_MODE
msr cpsr_c, r0
mcr p15, 0, ip, c7, c7, 0 @ invalidate I, D caches & BTB
mcr p15, 0, ip, c7, c10, 4 @ Drain Write (& Fill) Buffer
mcr p15, 0, ip, c8, c7, 0 @ invalidate I, D TLBs
mcr p15, 0, r4, c2, c0, 0 @ load page table pointer
mov r0, #0x1f @ Domains 0, 1 = manager, domain 2 = client
mcr p15, 0, r0, c3, c0, 0 @ load domain access register
mrc p15, 0, r0, c1, c0, 0 @ get control register
bic r0, r0, #0x0200 @ .... ..R. .... ....
bic r0, r0, #0x0082 @ .... .... B... ..A.
orr r0, r0, #0x0005 @ .... .... .... .C.M
orr r0, r0, #0x3900 @ ..VI Z..S .... ....
mov pc, lr
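The register values built in __xscale_setup can be checked on a host with a few lines of C. This is a sketch, not part of the original listing: both function names are hypothetical, and the bit letters in the comments are the ones used in the listing's own comments. The domain helper relies on the ARM domain access control layout of two bits per domain, where 0b01 is client access and 0b11 is manager access.

```c
#include <stdint.h>

/* Hypothetical model of the bic/bic/orr/orr sequence in __xscale_setup. */
uint32_t model_control_value(uint32_t cr)
{
    cr &= ~0x0200u;   /* clear R (bit 9)                          */
    cr &= ~0x0082u;   /* clear B (bit 7) and A (bit 1)            */
    cr |=  0x0005u;   /* set C (bit 2) and M (bit 0)              */
    cr |=  0x3900u;   /* set V, I, Z (bits 13-11) and S (bit 8)   */
    return cr;
}

/* Access field for domain n in a domain access control value:
 * two bits per domain, domain 0 in the lowest bits. */
unsigned domain_access(uint32_t dacr, unsigned n)
{
    return (dacr >> (2u * n)) & 0x3u;
}
```

Starting from a control value of 0, the sequence produces 0x3905. Decoding the domain value 0x1f gives 3 for domains 0 and 1, 1 for domain 2, and 0 for domain 3.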
.text
/*
* Purpose : Function pointers used to access above functions - all calls
* come through these
*/
.type xscale_processor_functions, #object
ENTRY(xscale_processor_functions)
.word cpu_xscale_data_abort
.word cpu_xscale_check_bugs
.word cpu_xscale_proc_init
.word cpu_xscale_proc_fin
.word cpu_xscale_reset
.word cpu_xscale_do_idle
/* cache */
.word cpu_xscale_cache_clean_invalidate_all
.word cpu_xscale_cache_clean_invalidate_range
.word cpu_xscale_flush_ram_page
/* dcache */
.word cpu_xscale_dcache_invalidate_range
.word cpu_xscale_dcache_clean_range
.word cpu_xscale_dcache_clean_page
.word cpu_xscale_dcache_clean_entry
/* icache */
.word cpu_xscale_icache_invalidate_range
.word cpu_xscale_icache_invalidate_page
/* tlb */
.word cpu_xscale_tlb_invalidate_all
.word cpu_xscale_tlb_invalidate_range
.word cpu_xscale_tlb_invalidate_page
/* pgtable */
.word cpu_xscale_set_pgd
.word cpu_xscale_set_pmd
.word cpu_xscale_set_pte
.size xscale_processor_functions, . - xscale_processor_functions
.type cpu_80200_info, #object
cpu_80200_info:
.long cpu_manu_name
.long cpu_80200_name
.size cpu_80200_info, . - cpu_80200_info
.type cpu_cotulla_info, #object
cpu_cotulla_info:
.long cpu_manu_name
.long cpu_cotulla_name
.size cpu_cotulla_info, . - cpu_cotulla_info
.type cpu_arch_name, #object
cpu_arch_name:
.asciz "armv5"
.size cpu_arch_name, . - cpu_arch_name
.type cpu_elf_name, #object
cpu_elf_name:
.asciz "v5"
.size cpu_elf_name, . - cpu_elf_name
.align
.section ".proc.info", #alloc, #execinstr
.type __80200_proc_info,#object
__80200_proc_info:
.long 0x69052000
.long 0xfffffff0
.long 0x00000c0e
b __xscale_setup
.long cpu_arch_name
.long cpu_elf_name
.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP
.long cpu_80200_info
.long xscale_processor_functions
.size __80200_proc_info, . - __80200_proc_info
.type __cotulla_proc_info,#object
__cotulla_proc_info:
.long 0x69052100
.long 0xfffffff0
.long 0x00000c0e
b __xscale_setup
.long cpu_arch_name
.long cpu_elf_name
.long HWCAP_SWP|HWCAP_HALF|HWCAP_THUMB|HWCAP_FAST_MULT|HWCAP_EDSP
.long cpu_cotulla_info
.long xscale_processor_functions
.size __cotulla_proc_info, . - __cotulla_proc_info
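Each proc_info record above begins with two words that look like a CPU ID value followed by a mask. Assuming that is how they are used (the lookup code itself is not part of this listing), processor selection can be sketched in C as below. The structure, table and function names are hypothetical and reduced to the fields needed for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record: only the fields used for matching. */
struct proc_entry {
    uint32_t value;      /* first .long: expected ID bits   */
    uint32_t mask;       /* second .long: ID bits to compare */
    const char *name;    /* CPU name string                  */
};

static const struct proc_entry entries[] = {
    { 0x69052000u, 0xfffffff0u, "XScale-80200"   },
    { 0x69052100u, 0xfffffff0u, "XScale-Cotulla" },
};

/* Returns the name of the first record whose masked ID matches,
 * or NULL if no record matches. */
const char *model_lookup_cpu(uint32_t id)
{
    for (size_t i = 0; i < sizeof entries / sizeof entries[0]; i++)
        if ((id & entries[i].mask) == entries[i].value)
            return entries[i].name;
    return NULL;
}
```

With a mask of 0xfffffff0, only the low four bits of the ID are ignored, so an ID of 0x69052004 selects the 80200 record and 0x69052105 selects the Cotulla record.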