ADSP-21065L SHARC User's Manual; Chapter 2, Computation
Total Page:16
File Type:pdf, Size:1020Kb
&20387$7,2181,76 Figure 2-0. Table 2-0. Listing 2-0. The processor’s computation units provide the numeric processing power for performing DSP algorithms, performing operations on both fixed-point and floating-point numbers. Each computation unit executes instructions in a single cycle. The processor contains three computation units: • An arithmetic/logic unit (ALU) Performs a standard set of arithmetic and logic operations in both fixed-point and floating-point formats. • A multiplier Performs floating-point and fixed-point multiplication as well as fixed-point dual multiply/add or multiply/subtract operations. •A shifter Performs logical and arithmetic shifts, bit manipulation, field deposit and extraction operations on 32-bit operands and can derive exponents as well. ADSP-21065L SHARC User’s Manual 2-1 PM Data Bus DM Data Bus Register File Multiplier Shifter ALU 16 × 40-bit MR2 MR1 MR0 Figure 2-1. Computation units block diagram The computation units are architecturally arranged in parallel, as shown in Figure 2-1. The output from any computation unit can be input to any computation unit on the next cycle. The computation units store input operands and results locally in a ten-port register file. The Register File is accessible to the processor’s pro- gram memory data (PMD) bus and its data memory data (DMD) bus. Both of these buses transfer data between the computation units and internal memory, external memory, or other parts of the processor. This chapter covers these topics: • Data formats • Register File data storage and transfers • ALU architecture and operations 2-2 ADSP-21065L SHARC User’s Manual &RPSXWDWLRQ8QLWV • Multiplier architecture and operations • Shifter architecture and operations • Multifunction operations ADSP-21065L SHARC User’s Manual 2-3 'DWD)RUPDWV 'DWD)RUPDWV The processor’s computation units operate on a variety of data formats and support two rounding modes: • IEEE 754/854 standard for single-precision floating-point format • Extended-precision floating-point format • Short word (16-bit) floating-point format • 32-bit fixed-point format • Round-toward-nearest and round-toward-zero rounding modes The processor also provides exception handling for floating-point operations. 6LQJOH3UHFLVLRQ)ORDWLQJ3RLQW)RUPDW The processor’s Multiplier and ALU units support the single-precision, floating-point format specified in the IEEE 754/854 standard, as described in Appendix C, Numeric Formats. The processor is IEEE 754/854 compatible for single-precision, floating-point operations in all respects, except that: • The processor does not provide inexact flags. • NAN (Not-A-Number) inputs generate an invalid exception and return a quiet NAN (all 1s). • The processor flushes denormal operands to 0 when they are input to a computation unit and do not generate an underflow exception. It flushes to 0 any denormal or underflow result from an arithmetic operation and generates an underflow exception. 2-4 ADSP-21065L SHARC User’s Manual &RPSXWDWLRQ8QLWV • The processor supports round-to-nearest and round-toward-zero modes, but does not support rounding to +Infinity or to –Infinity. The processor also supports a 40-bit extended precision, floating-point mode, which includes eight additional LSBs of the mantissa and is compli- ant with the 754/854 standards. However, results in this format are more precise than the IEEE single-precision standard specifies. ([WHQGHG3UHFLVLRQ)/RDWLQJ3RLQW Floating-point data can be either 32- or 40-bits wide. The RND32 bit in the MODE1 register determines the width: RND32=0 Selects extended precision, floating-point format (eight bits of exponent and thirty-two bits of mantissa). RND32=1 Selects normal IEEE precision (eight bits of exponent and twenty-four bits of mantissa). The computation unit sets the eight LSBs of floating-point inputs to 0s before performing the operation. It rounds the mantissa of a result to twenty-three bits (not including the hidden bit) and sets the eight LSBs of the 40-bit result to 0s to form a 32-bit number that is equivalent to the IEEE standard result. 6KRUW:RUG)ORDWLQJ3RLQW)RUPDW The processor supports a 16-bit, floating-point data type and provides conversion instructions for it. The short float data format has an 11-bit mantissa with a 4-bit exponent and a sign bit. The 16-bit floating-point numbers reside in the lower sixteen bits of the 32-bit floating-point field. Two shifter instructions, FPACK and FUNPACK, perform the packing and unpacking conversions between 32-bit and 16-bit floating-point ADSP-21065L SHARC User’s Manual 2-5 'DWD)RUPDWV words. FPACK converts a 32-bit IEEE floating-point number to a 16-bit floating-point number. FUNPACK converts the 16-bit floating-point numbers back to 32-bit IEEE floating-point. Both instructions execute in a single cycle. The short float type supports gradual underflow. This type sacrifices pre- cision for dynamic range. When packing a number that would have underflowed, the Shifter sets the exponent to 0 and right-shifts the man- tissa (including the hidden 1) the appropriate amount. The packed result is a denormal, which applications can unpack into a normal IEEE float- ing-point number. ([FHSWLRQ+DQGOLQJIRU)/RDWLQJ3RLQW2SHUDWLRQV Both the Multiplier and ALU provide exception information when execut- ing floating-point operations. Each unit updates overflow, underflow, and invalid operation flags in the arithmetic status (ASTAT) register and in the sticky status (STKY) register. An underflow, overflow, or invalid oper- ation from any computation unit also generates a maskable interrupt. So, applications have three ways to handle floating-point exceptions: •Interrupts When your application must correct all exceptions as they occur, use an interrupt service routine to handle the exception condition immediately. • ASTAT register When your application needs to monitor a particular floating-point operation, test the exception flags in the ASTAT register that per- tain to a particular arithmetic operation after the processor has per- formed the operation. 2-6 ADSP-21065L SHARC User’s Manual &RPSXWDWLRQ8QLWV • STKY register When exception handling is noncritical, examine the exception flags in the STKY register at the end of a series of operations. If any flags are set, some of the results are incorrect. )L[HG3RLQW)RUPDW The processor always represents fixed-point numbers in 32-bit, left-justi- fied (occupy the thirty-two MSBs) format in its 40-bit data fields. You can treat these numbers as fractions or integers and as unsigned or twos-complement. Each computation unit has its own restrictions on how you can mix these formats in a given operation. The computation units read 32-bit operands from 40-bit registers, ignor- ing the eight LSBs, and write 32-bit results, zero-filling the eight LSBs. 5RXQGLQJ0RGHV The processor supports two modes of rounding. Both modes follow the IEEE 754 standard definitions. • Round-Toward-Zero If the processor cannot represent exactly the result before rounding in the destination format, it rounds the result to the number that is nearer to 0. This method is equivalent to truncation. ADSP-21065L SHARC User’s Manual 2-7 'DWD)RUPDWV • Round-Toward-Nearest If the processor cannot represent exactly the result before rounding in the destination format, it rounds the result to the number that is nearer to the result before rounding. If the result before rounding is exactly halfway between two num- bers in the destination format (differing by an LSB), the processor rounds the result to the number that has an LSB equal to 0. Statistically, rounding up occurs as often as rounding down, so this method has no large sample bias. Because the maximum floating-point value is one LSB less than the value that represents Infinity, in this mode, a result that is halfway between the maximum floating-point value and Infinity rounds to Infinity. 2-8 ADSP-21065L SHARC User’s Manual &RPSXWDWLRQ8QLWV 5HJLVWHU)LOH The Register File provides the interface between the processor’s internal data buses and its computation units. It also provides local storage for operands and results. The Register File has these structural and functional characteristics: • Consists of sixteen primary registers and sixteen alternate (second- ary) registers. • All of the individual data registers are forty bits wide. • 32-bit data from the computation units is always left-justified. • On register reads, the processor ignores the eight LSBs, and on reg- ister writes, it writes the eight LSBs with zeros (0). Accesses of the Register File have these characteristics: • Program memory data accesses and data memory accesses occur on the PM Data bus and DM Data bus, respectively. • One PM Data bus and/or one DM Data bus access can occur in one cycle. • Transfers between the Register File and the 40-bit DM Data bus are always forty bits wide. • The Register File transfers data to and from the 48-bit PM Data bus in the most significant forty bits, writing zeros (0) in the lower eight bits on transfers to the PM Data bus. ADSP-21065L SHARC User’s Manual 2-9 5HJLVWHU)LOH • If the same location in the Register File is specified as both the source of an operand and the destination of a result or memory fetch, the read occurs in the first half of the cycle, and the write occurs in the second half. This enables the processor to use the old data as the operand before it updates the location with the resulting new data. • If writes to the same location take place in the same cycle, only the write with higher precedence actually occurs. The source of the write data determines the precedence. In order of precedence, the sources for write data are: • Data memory or universal register • Program memory • ALU • Multiplier •Shifter ,QGLYLGXDO'DWD5HJLVWHUV In assembly language source code, the individual registers of the Register File carry a prefix.