Revision 1.12 February 3, 2016 Document Number: MD00868 Architecture Module Architecture Volume IV-j: The MIPS64® SIMD IV-j: Volume MIPS® Architecture for Programmers Programmers for Architecture MIPS®

, Revision 1.12 any warranty warranty ofkind. any any chitecture Module chitecture nd is supplied ‘as is’, without ‘as is supplied nd Volume IV-j: The MIPS64® SIMD Ar ect to change notice without a MIPS® Architecture for Programmers Programmers MIPS® Architecture for 02, Built with ARCH FPU_PStags: FPU_PSandARCH te: nB1. te: . This publication contains proprietary information which is subj which information proprietary contains publication This . Public Templa le, Revision 1.12 Architecture Modu Architecture ogrammers Volume ogrammers IV-j: The MIPS64® SIMD 3Pr MIPS® Architecture for

Chapter 1

About This Book

The MIPS® Architecture for Programmers Volume IV-j: The MIPS64® SIMD Architecture Module comes as part of a multi-volume set.

• Volume I-A describes conventions used throughout the document set, and provides an introduction to the MIPS64® Architecture

• Volume I-B describes conventions used throughout the document set, and provides an introduction to the microMIPS64™ Architecture

• Volume II-A provides detailed descriptions of each instruction in the MIPS64® instruction set

• Volume II-B provides detailed descriptions of each instruction in the microMIPS64™ instruction set

• Volume III describes the MIPS64® and microMIPS64™ Privileged Resource Architecture which defines and governs the behavior of the privileged resources included in a MIPS® implementation

• Volume IV-a describes the MIPS16e™ Application-Specific Extension to the MIPS64® Architecture. Beginning with Release 3 of the Architecture, microMIPS is the preferred solution for smaller code size.

• Volume IV-b describes the MDMX™ Application-Specific Extension to the MIPS64® Architecture and microMIPS64™. With Release 5 of the Architecture, MDMX is deprecated. MDMX and MSA can not be imple- mented at the same time.

• Volume IV-c describes the MIPS-3D® Application-Specific Extension to the MIPS® Architecture

• Volume IV-d describes the SmartMIPS®Application-Specific Extension to the MIPS32® Architecture and the microMIPS32™ Architecture and is not applicable to the MIPS64® document set nor the microMIPS64™ docu- ment set.

• Volume IV-e describes the MIPS® DSP Module to the MIPS® Architecture

• Volume IV-f describes the MIPS® MT Module to the MIPS® Architecture

• Volume IV-h describes the MIPS® MCU Application-Specific Extension to the MIPS® Architecture

• Volume IV-i describes the MIPS® Virtualization Module to the MIPS® Architecture

• Volume IV-j describes the MIPS® SIMD Architecture Module to the MIPS® Architecture

1.1 Typographical Conventions

This section describes the use of italic, bold and courier fonts in this book.

MIPS® Architecture for Programmers Volume IV-j: The MIPS64® SIMD Architecture Module, Revision 1.12 12

13 , , D S bits, which are are which bits, register of code and instruction indicates numbers 5 through 5 numbers indicates operations executed in user operations executed rated ornot. Ifa result gener- is 5..1 5..1 ctive (for instance, address bits used by bits address instance, (for ctive arbitrary exceptions. 1.2 UNPREDICTABLE and UNDEFINED results mustnot data source depend on any itecture Module, Revision 1.12 Revision Module, itecture floating point instruction formats, as such UNPREDICTABLE uncached with the CP0 usable bit set in the ). Unpriv- register). Statusthe set in usable bit CP0the with on the screen, on and for examples the struction. Software can never dependon resultsthat are struction. can never Software and the current processor mode current processor the operations may cause operations behavior, as defined below. defined as UNDEFINED behavior, are used throughout this book to describe the behavior of the pro- the of behavior the to describe book this throughout are used om a hardware perspective (for instance, (for perspective om a hardware cached and UNPREDICTABLE al state that is only accessible in Kernel Mode or Debug Mode or in Mode Debug Mode or in Kernel only accessible is al state that operationsmaya resultcause to be gene orresults operations. UNDEFINED , that are important from a software perspe software a from are important that , defined processor mode. For example, example, For processor mode. UNPREDICTABLE behavior or operations. Conversely, both privileged and unprivileged andunprivileged both privileged operations.or UNDEFINED behavior Conversely, and . which is inaccessible in inaccessible is which important are fr that Volume IV-j: The MIPS64® SIMD Arch and UNDEFINED registers operations mustnot read, write,or modify the contents of memoryinternal or state which , fields behavior or operations can occur only as the result of executing instructions in instructions executing of result as the only can occur operations or behavior UNDEFINED results or operations have several implementation restrictions: several orresults operations have . UNPREDICTABLE results may vary from processor implementation maytoresults implementation,vary instruction toinstruction, fields UNPREDICTABLE and , bits emphasis bits UNPREDICTABLE UNPREDICTABLE fixed-width font is used for text that is displayed fonttext used for is fixed-width PS another mode must not access memory or intern memory or not access mode must (memory or internal state) (memory UNPREDICTABLE current in the is inaccessible software, and programmable fields and registers), and various andsoftware, programmableand registers), various and fields and not programmable but accessible only to hardware) programmable but not 1 • UNPREDICTABLE • Implementationsof operationsgenerating • as such types, the memory access used for is ated, it is a privileged mode (i.e., in Kernel Mode or Debug Mode, or or Mode, Debug or Mode Kernel in (i.e., mode privileged a cessor in certain cases. in certain cases. cessor cause never can software ileged can cause software pseudocode. UNPREDICTABLE •instance, For ellipsis. by an indicated is range the of numbers; ranges for used is • used to emphasize UNPREDICTABLE is UNPREDICTABLE functionas timea of or on the same implementationor in The terms The Courier •is used for for used •is •that is being term a represents •is used for for used •is •for used is 1.2.1 UNPREDICTABLE 1.1.3 Courier Text 1.1.2 Bold Text 1.1.1 Italic Text MIPS® Architecture for Programmers Programmers MIPS® Architecture for 1.2 UNPREDICTABLE

14 is less y value results in a value operations or behavior behavior or operations . Table 1.1 Table UNDEFINED cimal value 100, 2#100 represents the the represents 2#100 100, value cimal ation (rightmost bit is 0) is used. If bit 0) is used. ation is (rightmost e binary value 100 (decimal 4). e value binary 1.3Pseudocode Special Symbols Notation in x itecture Module, Revision 1.12 Revision Module, itecture The assertion of any of the reset signals must restore the must signals of the reset any of assertion The Meaning results mustnot (memory data source depend or on any . Little-endian bit not . x t processor mode 16#100 represents the hexadecimal value 100 (decimal 256). If the "b#" (decimal 100 256). value hexadecimal the represents 16#100 ent in which execution continue. UNDEFINEDcan no longer opera- execution in which ent arithmetic: addition, subtraction UNSTABLE . For instance 10#100 represents the de instance represents 10#100 For . of bit string z . For instance 0x100 represents the hexadecimal value 100 (decimal 256). value the hexadecimal instance represents 0x100 For 16. b th instance represents 0b100 For 2. copies of the single-bit value single-bit copies the value of y ble in the curren ble in through through n in base n in base n in base y Volume IV-j: The MIPS64® SIMD Arch operations notmust haltor hang theprocessor , this expression is an empty (zero length) bit string. is empty (zero length) an this expression , z values, software may software dependon that a sampling the fact values, UNSTABLE of an -bit string formed by -bit y Table 1.1Table Statements Operation Instruction in Used Symbols 2’s complement or floating point complement or floating 2’s Tests for equality and inequality for equality Tests concatenation string Bit than binary value 100 (decimal 4), and value binary is 10. base default the is omitted, prefix A bits of Selection Assignment   y processor to an operational state inaccessi state) which is internal operations or behavior mustnot UNDEFINED operationscause the processoror behavior to hang (that statefromis, enter a which the processor). down powering than other is no exit there UNPREDICTABLE  , , y z x  0xn constant value A 0bn constant value A b#n constant value A   x Symbol

legal transient value that was correct transientat somevalue pointlegal in time prior tothe sampling. one implementation restriction: have values UNSTABLE • Implementationsof operationsgenerating Inthis book, algorithmic descriptions of an operation languageare described as pseudocodenotationin a high-level Specialsymbols usedthein resembling pseudocode notationPascal. are listed in tions or behavior tionsmayor behavior causedata loss. has one implementation UNDEFINED operationsor behavior restriction: • as a functiontime mayof onresultsthe vary values sameor implementation instruction.or Unlike UNSTABLE UNPREDICTABLE may vary fromnothing mayvary tocreating an environm • fromprocessorimplementation UNDEFINED operationsmayorvary implementation,to behavior instructionto instruction.implementationor same the on time of function as a or instruction, 1.2.3 UNSTABLE 1.2.2 UNDEFINED 1.3 Notation in Pseudocode Symbols Special MIPS® Architecture for Programmers Programmers MIPS® Architecture for

15 neral-purpose regis- neral-purpose riptions), and the riptions), endi- an may be computed may as be computed an available in User mode only, in User mode only, available . Big-Endian). User mode, this  In opies of the CPU ge CPU the opies of x. COC[1] register. Thus, BigEndianCPU be may com- register. Big-Endian). Specifies the of Big-Endian).  Specifies . 1.3Pseudocode Special Symbols Notation in , x] , register s, register Status is always zero. In Release 2 of the Architecture, Architecture, the 2 of Release In zero. is always itecture Module, Revision 1.12 Revision Module, itecture ory pseudocode ory function desc register. Thus, ReverseEndi register. CSS x instructions.This feature is Little-Endian, Little-Endian, 1  Meaning GPR[0] Status bit in the s Little-Endian, 1 Little-Endian, into the corresponding 32-bit GPR number GPR 32-bit corresponding the into RE has the same value as value has the same x refers to GPR set to GPR refers bit of the bit of ltiplication (both used for either) (both ltiplication x SGPR[ SRSCtl x x, select x RE FCC[0]

and subsequent c releases, multiple subsequent and x . The content of . of the CPU general-purpose registers CPU general-purpose the of unit 1), general register 1), register unit general SGPR[s,x] LoadMemory and StoreMem LoadMemory in at zeros left-hand-side) in zeros at right-hand-side) in zeros itched setting the by endianness of load and store of and endianness load store condition signal condition , , control register , general register , register general ss-than comparison ss-than MIPS16e GPR number MIPS16e onfigured chip at reset (0 onfigured z z 2, general register general 2, register z Volume IV-j: The MIPS64® SIMD Arch and User mode). RE (SR and is implemented by setting the the setting implemented and by is puted as (BigEndianMem ReverseEndian). puted XOR endianness be may sw Translation of the of the Translation mode execution. and Supervisor anness Kernel of the memory interface (see memory interface the unit Coprocessor Bitwise logical OR logical Bitwise Logical Shift left (shift Floating division point le complement 2’s comparison greater-than complement 2’s or equal comparison n less-tha complement 2’s or comparison equal complement greater-than 2’s is GPR[x] for a short-hand notation CPU general-purpose register x register CPU general-purpose Coprocessor unit Coprocessor ters be may implemented. register operand Floating Point Floating Point condition CC. Floating code condition Point Floating Point (Coprocessor Floating (Coprocessor Point unit Coprocessor 2’s complement or floating mu point complement or floating 2’s 2’s complement integer division integer complement 2’s Logical Shift right (shift Table 1.1Table (Continued) Statements Operation in Instruction Used Symbols  v      or << >> di notinversion Bitwise norxorNOR logical Bitwise XOR logical Bitwise and ANDlogical Bitwise && Logical AND (non-Bitwise) *, modmodulo complement 2’s Xlat[x] FPR[x] FPR[x] GPR[x] COC[z] Symbol CCR[z,x] GPRLEN length in bits (32 or 64) The FCC[CC] SGPR[s,x]the Architecture Release 2 of In CPR[z,x,s] CP2CPR[x]unit Coprocessor CP2CCR[x] unit control register 2, Coprocessor ReverseEndianthe to reverse Signal BigEndianCPU(0 for load store and instructions endianness The BigEndianMem as c Endian mode MIPS® Architecture for Programmers Programmers MIPS® Architecture for

16 is LLbit bytes. 36 during the during = 2 PC tten in sections of that pseudocode pseudocode that of PABITS es of statements for dif- of statements es exception return instruc- exception en the processor stores a the processor stores en ined by assign- determined is me lable until after the next next until after the lable ograms must not depend on a on must not depend ograms appears to occur “at the same same the “at occur to appears ted, all effects of the current the of ted, all effects description that writes the result result the writes that description 1 during an instruction time by any by an instruction time during any gment, the size of the segment is segment the of size the gment, + e would be 2 would e I ion theor microMIPSbase architec- symbol PABITS. As such, if 36 phys- As such, PABITS. symbol ion operation is wri operation ion PC , in which the effect effect the which in , I cur either earlier or later — that is, during the during the is, that — later or earlier either cur gns thegns target addressto the which mode the processor is executing, as fol- mode processor which the is executing, tions as a label. It indicates the instruction time time instruction the indicates It label. a as tions ng 32-bit MIPS instructions 32-bit MIPS ng or microMIPS MIIPS16e ng 1.3Pseudocode Special Symbols Notation in ons that provide atomic read-modify-write. read-modify-write. atomic provide that ons ss otherwise indica sible indirectly, such as wh as such indirectly, sible cal address spac address cal Meaning itecture Module, Revision 1.12 Revision Module, itecture a result that is not avai ring the next instruction ti instruction the next ring by either 2 (in the case of a 16-bit MIPS16e instruc- 2 MIPS16e (in case of a 16-bit by the either during other CPU operation, operation, CPU other during cleared, is It store. tional ndirectly, such as when the processor stores the restart the stores processor such as when the ndirectly, for the following instruction. Within one pseudocode one pseudocode Within instruction. for the following der. However, between sequenc However, der. s all of which are significant during a memory refer- memory a during significant are which of all s I ere is no defined order. Pr order. ere is no defined Meaning If no value is assigned to is assigned value no If instruction time of the current instruction. No label is equivalent to a a to is equivalent No label instruction. current of the time instruction longer be atomic. In particular, it is cleared by cleared is it In particular, atomic. be longer address bits are implemented in a se bits are implemented address the portion of the instruction operation instruction operation the portion of the ion. When this happens, the instruct ts implemented is represented by the represented is implemented ts instructions ts implemented in a segment of the address space is represented by the sym- by the space is represented address of the in a segment implemented ts 1. ented, size the physi the of the MIPS16e Application Specific Extens Specific the Application MIPS16e ruction time. A taken branch assi branch A taken time. ruction + I 1 The is processor executi 0 The is processor executi description lines and func description lines and Operation tion between tion such sections. eudocode statements labeled eudocode value. During the instruction time of an instruction, this is the address of the instruc- the of address is the this instruction, an of time instruction the During value. s a single-bit register that determines in in determines that register a single-bit s i

Encoding during an instruction time. an instruction during ode bytes. PC . Sometimes effects of an instruction appear to oc to appear instruction an of effects Sometimes . state used to specify operation for instructi for operation specify to state used 40 I Volume IV-j: The MIPS64® SIMD Arch l = 2 virtua Program Program SEGBITS ferent instructions that occur “at the same time,” th time,” same the “at occur that instructions ferent sequence, the effects of the statements take place in or in place takestatements the of effects thesequence, du occurs that instruction of the address The word. tion particular order evalua of order particular to value a ing pseudocodestatement, itis automatically incremented tion) or 4 before the next inst next before the or 4 tion) time” as the effect of ps of the as effect time” The The instruction time of the instruction in the branch delay slot. delay branch the in instruction the of time instruction i visible only is value PC the Architecture, the MIPS In ence. struction, or into a Coprocessor 0 register on an on register 0 a Coprocessor into or instruction, branch-and-link or a jump-and-link on GPR a into address contains a 64-bit full addres PC value The exception. The effect of pseudocode statements for the current instruction labelled labelled instruction current the for statements pseudocode of effect The ISA M the tures, instruction. Such an instruction has has instruction an Such instruction. in a section labeled register appears to occur. For example, instruction have an may example, For appears to occur. labeled with the instruction time, relative to the current instruction instruction current the to relative time, instruction the with labeled instruction time of another instruct another of time instruction set when a linked load occurs and is tested by the condi is tested by the and load occurs a linked when set no location would to the store a when tions. to occurs as a prefix This during which the pseudocode appears to “execute.” Unle to “execute.” the pseudocode appears which during instruction appear to occur during the during to occur appear instruction of label time 2 Bit of Bit In the MIPS Architecture, the ISA Mode value is only vi is value Mode ISA the Architecture, MIPS the In branch-and-link or a GPR on a jump-and-link into Mode and the ISA PC of the upper bits of value combined on exception. an Coprocessor register a 0 into or instruction, lows: ical bits were implem address bol SEGBITS. As such, bol As if 40 SEGBITS. virtual Table 1.1Table (Continued) Statements Operation in Instruction Used Symbols :, :, I PC : I-n I+n LLbit PABITSaddress bi number of physical The Symbol SEGBITS of virtual address bi number The ISA ModeISA processors implement In that MIPS® Architecture for Programmers Programmers MIPS® Architecture for

17 . data types are stored in data [email protected] state. That is, the isvalue 1.4Information More For pseudocode function—the pseudocode function—the register. ed in the in delay slot ed a branch of static register. If this bit is is a 0, the bit pro- this If register. e type of exception and the argument argument the and exception type of e Status Status gisters (FPRs). In MIPS32 Release 1, the FPU the FPU Release 1, In MIPS32 (FPRs). gisters is always a 0. MIPS64 implementations have a implementations have a 0. MIPS64 is always ment, send Email to to Email send ment, itecture Module, Revision 1.12 Revision Module, itecture 64-bit FPRs in which 64-bit 64-bit l this does not return from t is a 1, the processor operates with 32 64-bit operates 32 FPRs. with 64-bit t processor a 1, the is Meaning state theof instruction,not the FP32RegistersMode ta types are stored in even-odd pairs of FPRs. In MIPS64, (and option- (and In MIPS64, FPRs. pairs of even-odd in types are stored ta dynamic the address was execut the Counter address was Program using the exception parameter as th parameter exception the using iscomputed from theFR bit in the chitecture or this docu or this chitecture an instruction whose PC immediately follows a branch or jump, but which which but jump, or branch a follows immediately PC whose instruction an is computed from the FR bit in the the in bit FR from the computed is r the FPU has 32-bit or 64-bit floating point re floating point has 32-bit or 64-bit the FPU r Volume IV-j: The MIPS64® SIMD Arch compatibility mode in which the processor references the FPRs as if it were a MIPS32 implementation. In implementation. MIPS32 a were it if FPRs as the references processor the which in mode compatibility such a case FP32RegisterMode has 32 32-bit FPRs in which 64-bit da 64-bit which FPRs in 32-bit 32 has MIPSr3) the and FPU has 32 Release2 MIPS32 in ally FPR. any implementations, 1 MIPS32 Release In FPRs. If this bi 32-bit 32 as if it had cessor operates of FP32RegistersMode value The at whether the instruction Indicates or jump. This condition reflects the reflects jump. This condition or to occurs jump or if a branch false the in delay slot a of or jump. branch not executed is signaled, be to an exception Causes of the is call. signaled the at point exception parameter as an exception-specific argument). Contro argument). as an exception-specific parameter Table 1.1Table (Continued) Statements Operation in Instruction Used Symbols http://www mips.com laySlot Symbol Various MIPS processorRISC manuals and additional information aboutproducts MIPS be foundcan at the MIPS Various URL: ArMIPS64® questions on the comments or For tion, argument) tion, FP32RegistersModewhethe Indicates InstructionInBranchDe- SignalException(excep- MIPS® Architecture for Programmers Programmers MIPS® Architecture for 1.4 Information More For

are listed in alphabetical are itecture Module, Revision 1.12 Revision Module, itecture 18 re are descriptionslistedbelow: of thefields instruction descriptions,which Volume IV-j: The MIPS64® SIMD Arch e Instruction Fields e Instruction shows an example instruction. Following the figu instruction. Following an example shows “InstructionFields” onpage 19 “InstructionDescriptive NameandMnemonic” on 20 “Format Field” on page 20 page on Field” “Purpose 21 “Description Field” on page 21 “RestrictionsField”on page 21 “Operation Field” on page 22 “Exceptions Field” on page 22 “Programming Notes and Implementation Notes Fields” on page 23 Figure2.1 This chapter provides a detailed This chapter guide provides to understanding the order inthetables chapter. at the beginningof the next • • • • • • • • • MIPS® Architecture for Programmers Programmers MIPS® Architecture for 2.1 th Understanding Guide to the Instruction Set to the Instruction Guide Chapter 2 Chapter

). 20 ). Figure in Figure Figure 2.2 Figure rd Figure 2.2 0 , and, . rt e Instruction Fields e Instruction , ADD MIPS32 allinstructions inpre- rs field. For example, the example, For field. ADD 100000 fmt in uppercase characters. The The characters. uppercase in 6 5 to be zero (bits 10:6 in 10:6 (bits zero be to instruction was originally defined are defined originally was instruction ate fields. The architectural level at architectural level The ate fields. UNPREDICTABLE al assembler formats are valid for the for assembler are valid formats al for each instruction, as shown in shown as instruction, each for 0 00000 2.1 Understanding th architecture levels include levels architecture itecture Module, Revision 1.12 Revision Module, itecture 11 10 extended, the architecture levels at which it was at which it was levels architecture the extended, rd ecture level. Floating point operations on formatted data formatted onoperations point Floating level. ecture fields that are required are that fields Instruction Fields Instruction assemblerinstruction printed 16 15 names are listed in uppercase (SPECIAL and ADD(SPECIAL and in in uppercase names are listed rt sembler mnemonic for each valid value of the value valid each mnemonic for sembler opcode es, the operation of the processor is the operation es, and the architecture level at which the at which level architecture the and shown in register formin at the topregister of shown the instructiondescription. The follow- as the lowercase names of the appropri names of the as the lowercase mnemonic are printed as page headings printed are mnemonic 21 20 ons are backwards compatible. The origin compatible. The ons are backwards rs Figure 2.2Figure Example of Figure 2.4Figure Example of Instruction Format ptive Name and Mnemonic Volume IV-j: The MIPS64® SIMD Arch ADD fd,rs,rt ADD 26 25 field. If the instruction definition was later instructionIf the was definition field. ). The MIPS architecture levels are inclusive; higher inclusive; are MIPS architecture levels ). The Figure 2.3Figure Name and Mnemonic Descriptive of Instruction Example 6 5555 6 Format: 000000 SPECIAL instruction lists both ADD.S and ADD.D. ). Add Word 31 Constant values in a field are shown in binary below thesymbolicvalue. hexadecimal in binaryor below are shown ina field Constant values 2.2 If such fields are set to non-zero valu fields such If C.cond fmt . variable parts, the operands, are shown operands, are shown parts, the variable which the instruction was first defined, for example “MIPS32” is shown at the rightside of“MIPS32” shown the page. example is for which defined, the instruction first was archit format for each can be more than one assembler There ADD fmt show an assembly format withactual as format the an assembly show The assembler formats for the instruction for the instruction formats assembler The inthe Format given instructi to Extensions levels. vious architecture. extended with literalpartsassembler format The the of is shown The instruction descriptive name and instruction descriptive The an example, (for extension of order in their shown are definition extended the for formats the assembler and extended see 2.3 Fields encoding the instruction word are instruction Fields word the encoding ingrules are followed: •and the of constant fields The values •names the instructionused in are listeddescription with the lowercase ( fields Allvariable •unused are named not are zeros but that contain Fields 2.1.3 Field Format 2.1.2 Instruction Descri MIPS® Architecture for Programmers Programmers MIPS® Architecture for

21 Control / Control Description . “CP1 register “CP1 register . fs e Instruction Fields e Instruction ” is thefloating” point to produce a 32-bit produceto rs ) ) FCSR 2.1 Understanding th fd. “ ADD fmt ) itecture Module, Revision 1.12 Revision Module, itecture section. ” is CPU general-purpose register specified by the specified register ” is CPUgeneral-purpose ations in the formats (once in the formats ations vari explain help to comments rt ADD.fmt the instruction in text, tables, and figures. This description This description and figures. tables, instructionin text, the Operation instruction are used in thearithmetic or logical operation. ) rt of theassembler format. Integer Overflow exception occurs. exception Overflow Integer DADD is added to the 32-bit value in GPR in 32-bit value the added to is rt GPR[rs] + GPR[rt] GPR[rs] specified by the instruction field field by the instruction specified  Volume IV-j: The MIPS64® SIMD Arch Figure 2.5 Example of Instruction Purpose GPR[rd] ” is the floating pointby registeroperand the instruction specified field Figure 2.6Figure Description of Instruction Example . Add Word Add rd ). These comments are not a pa comments are not ). These . “FPR fs . GPR register is not modified notis and an modified register rt field documentspossible field restrictionsany that the instruction. mayMost affect restrictions into fall Description: GPR in value word 32-bit The • the 32-bitresult is signed-extended and placedinto the addition If doesnot overflow, Purpose: then trap. occurs, If an overflow 32-bit integers. add To result. • destinationthe complement arithmeticoverflow, the addition If results in32-bit 2’s field gives a short a descriptionthe ofof use the instruction. gives field C.cond.fmt lid values for instruction fields (for example, see floating see point(for example, for instruction fields lid values register. a Restrictions Status fd” is the coprocessor 1general register instruction field instruction field This section uses acronyms for register descriptions. “GPR register This for section uses acronyms The body of the section is a description of the operation of of the operation a description is section the of body The complements the high-level languagecomplements description inthe high-level the The heading. The main purpose is to show how fields in the in fields how is to show main purpose heading. The six categories: theone of following •V Ifa one-line symbolic description ofthe instruction it is feasible, appears immediately therightto ofthe again, see again, The Purpose The assembler format lines sometimes include parenthetical include sometimes lines format assembler The • ALIGNMENTrequirements for memory see LW addresses (for example, •see of operands (for example, values Valid •see floating operand example, pointformats (for Valid 2.1.6 Restrictions Field 2.1.5 Description Field 2.1.4 Purpose Field MIPS® Architecture for Programmers Programmers MIPS® Architecture for

22 section. equal), e Instruction Fields e Instruction ) te in itself because many many itself because in te 63..31 Exceptions nts avoid pipeline hazards pipeline avoid nts 31..0 ). MUL ||GPR[rt] s exceptions that can be caused by asyn- that can be caused exceptions s e instructions because the relationship relationship the because e instructions 31 2.1 Understanding th itecture Module, Revision 1.12 Revision Module, itecture section; it is not comple is section; it These ordering constrai ordering These for more information ontheformal notation used ) ) + (GPR[rt] ) + 31..0 ) ror and for load stor a Error exception may be caused by the operationexception Error a Bus a of Description SC 31..0 LL/ ent exceptions that are not present in the not present that are ent exceptions . UNPREDICTABLE rdware interlocks (for example, see example, (for interlocks rdware e in the pseudocode orare omittedfor legibility. ||GPR[rs] then that can be caused by Operationthat of theinstruction. It omitsthat exceptions does not contain sign-extended 32-bit(bits values 31 31 for instance, TLB Refill, and also omit and instance, TLB Refill, for guarantee correct execution. guarantee rs temp 

or GPR Volume IV-j: The MIPS64® SIMD Arch Figure 2.8 Example of Instruction Operation Figure 2.9Figure Exception of Instruction Example rt 32 is section does not list Bus Er section does is Figure 2.7 Example of Instruction Restrictions SignalException(IntegerOverflow)  sign_extend(temp GPR[rd] UNPREDICTABLE else if temp endif if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then or NotWordValue(GPR[rt]) if NotWordValue(GPR[rs]) endif  (GPR[rs] temp field lists the exceptions field field describes the operation of the instruction as pseudocode in a high-level language notationresem- high-level describespseudocodea the operationof the instructionin as field Exceptions: Exceptions: Integer Overflow Operation: Restrictions: If either GPR then the result of the operation the result is then Exceptions for whichha some processors notdo have Operation 2.2 “OperationSection Notationand Functions”on page 23 • see (for example, types access memory Valid The An instructionmay implementation-dependcause can be caused by the instruction fetch, be caused by the instruction fetch, can suchas an Interrupt. Although events chronousexternal instruction, or th store load dependentare uponthe implemen-between load andstore instructions Buserror indications, and external Error, like tation. The •to necessary of instructions Order bling Pascal. This formal bling description Pascal. complements the of the restrictions are either difficult to includ the restrictions are either difficult of See here. 2.1.8 Exceptions Field 2.1.7 Operation Field MIPS® Architecture for Programmers Programmers MIPS® Architecture for

23 havior is abstracted into abstracted is havior make the pseudocodemoremake bed in the previous chapter. Specific Specific chapter. previous in the bed entors, respectively, but that is not nec- is not that but respectively, entors, unctions inthis are defined section, and 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture the coprocessor itself. This be itself. coprocessor the ns. These are used either to are ns. used These section are executed sequentially (except as constrained as sequentially (except are executed section ocessor does with a word or doubleword suppliedocessor does with to itor doubleword word anda Operations section uses a high-level language notation to describesectionthehigh-level uses operationper- a ols used in the pseudocode are descri are ols the pseudocode used in Implementation Notes Fields Notes Implementation that is useful for programmers and implem and programmers useful for is that Operation in the pseudocode descriptio Volume IV-j: The MIPS64® SIMD Arch Figure Figure 2.10 Notes Programming of Instruction Example Programming Notes: Programming ADDU arithmeticperforms the same does nottrap onoverflow. operation but sections contain material material contain sections “Coprocessor General“Coprocessor RegisterFunctions” Access onpage 23 “Memory Operation Functions” on page 25 “Floating PointFunctions” onpage 28 “Miscellaneous Functions” on page 31 “InstructionExecution Ordering” on page 23 “PseudocodeFunctions” onpage 23 Notes the functions described in thissection. include the following: • • • • between coprocessor and doublewords instructionswords to exchange have CP0, for except , Defined What a copr the system. rest of and the general registers how a coprocessor supplies a word or doubleword is defined by defined is doubleword or word a supplies a coprocessor how pseudocode functions are described below. This sectionpresents information abouttopics: the following • • entation-specific behavior, or both.f These behavior, implementation-specific to abstract readable, by conditionaland loop constructs). There are several functions used several There are Each of the high-level language statements in the in statements language high-level of the Each formed by each instruction. Special symb Inan instruction description, the essary to describe instructionthe anddoesnot belong inthedescription sections. The 2.2.2.1 Coprocessor General Register Access Functions 2.2.2 Pseudocode Functions 2.2.1 Ordering Instruction Execution 2.1.9 Notes and Programming MIPS® Architecture for Programmers Programmers MIPS® Architecture for 2.2Notation and Functions Section Operation

24 to supply the contents of the low- of contents to supply the d be tostore contentsthe of mem- lythe contentswordof in the low-order 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture to supplya of word data during a storeopera- word to supply a doubleword of datato duringsupply a store doubleword a dou- z z r-specific. Thetypical action bewould to store r-specific. thecon- . rt rt. ssor-specific. The typical action would be would The typical action ssor-specific. ocessor-specific. action woul The typical ocessor-specific. . rt specific. The typical action would be to supp to be The typical action would specific. Volume IV-j: The MIPS64® SIMD Arch rt. Figure 2.12Figure Function COP_LD Pseudocode Figure 2.11Figure Function Pseudocode COP_LW Figure Figure 2.13 Function COP_SW Pseudocode z: number unit coprocessor The specifier register general Coprocessor rt: value word : 32-bit dataword */ action Coprocessor-dependent /* z: number unit coprocessor The specifier register general Coprocessor rt: coprocessor. the to supplied value : 64-bit doubleword memdouble */ action Coprocessor-dependent /* z: number unit coprocessor The specifier register general Coprocessor rt: coprocessor to the supplied value 32-bit word : A memword */ action Coprocessor-dependent /* COP_SW (z, rt) COP_SW (z,  dataword COP_SW endfunction COP_LD (z, rt, memdouble) (z, COP_LD COP_LD endfunction COP_LW (z, rt, memword) (z, COP_LW COP_LW endfunction bleword operation. The actionis coprocebleword tion.The action is coprocessor- coprocessor general register in coprocessor general register order doubleword COP_SD COP_SDThe the actionbyfunction defines taken coprocessor COP_SW COP_SWThe the actionbyfunction defines taken coprocessor COP_LD the action COP_LDby coprocessor z when suppliedfrom The taken memoryfunction defines with a doubleword operation. Theactionduringa load is coprocesso doubleword word in coprocessor general register register general coprocessor in word tents of memdouble in coprocessorgeneral register COP_LW bythecoprocessor actiontaken whenfunction z defines suppliedfrom memory withword a during a COP_LW The loadword operation. The action copris MIPS® Architecture for Programmers Programmers MIPS® Architecture for

25 Access- be determined directly ), find thecorresponding find ), IorD by are determined directly the CCA rd, or doubleword is the smallest byte is the smallest or doubleword rd, not present in the TLB or the desired the in desired the TLB or not present ) used to resolve the vir- reference. If the ) resolve used to . The. byteswithin the addressed unitof address and its cacheability and coherency andits cacheabilityaddress and coherency 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture Table 2.1 processors) that are used can processors) field of instruction */ of instruction field function e mapped address spaces then the TLB or fixed mapping MMU or fixed then the TLB spaces e mapped address the specified Coprocessor operation. Coprocessor specified the type; if the required translation is translation is the required type; if address spaces, the physical address and address spaces, the physical address ed to resolve the memory reference. memory the resolve to ed tle-endian), the address of a halfword, wo of a halfword, address the tle-endian), fails and an exception is taken. and an exception fails translates a virtualtranslates a address toa physical , and whetheris Instructionstothereference or Data ( value to coprocessor z */ to coprocessor value cop_fun thebig-endianobject. For ordering this little-endian isthefor a byte; most-significant Volume IV-j: The MIPS64® SIMD Arch Figure 2.14Figure Function Pseudocode COP_SD vAddr AddressTranslation (vAddr, IorD, LorS) (vAddr, AddressTranslation ) and the cacheability and coherency attribute (CCA attribute coherency and the cacheability ) and and the two or three low-order bits of the address. bits low-order or three two and the : function from Coprocessor Figure 2.16Figure Function Pseudocode AddressTranslation COP_SD (z, rt) (z, COP_SD Figure 2.15Figure Function Pseudocode CoprocessorOperation : */ address physical pAddr : caches*/ to access method used Attribute,the Cacheability&Coherency : */ unit number Coprocessor AccessLength /* pAddr /* CCA /* Transmit /* Transmit the /* z /* cop_fun z: number unit coprocessor The specifier register general Coprocessor rt: value doubleword 64-bit : datadouble */ action Coprocessor-dependent /* field. The valid constant names and values are shown in are shown values and names constant valid The field. pseudocode for load and store operations, the following functions summarize the handling of virtual of handling summarizethe functions following the operations, and store for load pseudocodeOperation (pAddr, CCA)  CCA) (pAddr, endfunction CoprocessorOperation endfunction CoprocessorOperation (z, cop_fun) CoprocessorOperation datadouble  datadouble endfunction COP_SD endfunction access permitted, is not thefunction determines the physical address and access address the physical determines virtual address. If the virtual address If is in one of th the virtual address address. virtual Length ordering thisis thebyte. least-significant the In inthe is passed storedloadedor to thedata be item of The size memory. physical and theaccess of addresses physical address ( tual address is in the unmapped address one of is tual memory (word for 32-bit processors or doubleword for 64-bit 32-bitfor memory processors or doubleword (word address of the bytes that form that the bytes of address fromthe us describing the mechanism attribute, the virtual address Given AddressTranslation function AddressTranslation The Regardless of byte ordering (big- or lit (big- ordering byte of Regardless CoprocessorOperation CoprocessorOperationThe function performs 2.2.2.2 Functions Operation Memory MIPS® Architecture for Programmers Programmers MIPS® Architecture for

26 size ) and AccessLen- CCA . The data is is data The . pAddr and the and the pAddr hy (data caches and main mem- (data hy reference. At a minimum, this reference. At a , an implementation-specific an implementation-specific cache, om memory and marked as valid within as valid marked andmemory om g at physical location at physical g contains the data for an aligned, data for an aligned, the contains MemElem 2.2 Functions and Notation Section Operation ). The ). itecture Module, Revision 1.12 Revision Module, itecture need to be passed to the processor. If the memory memory the If processor. the to be passed to need CCA ). The low-order 2 (orMemElem 3) bitsoflow-order ). the addressThe using the memory hierarc the using rs, a doubleword for 64-bitfor rs, processors),a doubleword though only the data should be stored; only these bytes in memory will actu- will memory in bytes these only stored; be should data MemElem memory bytes, startinmemorybytes, in both the Cacheability and Coherency Attribute ( Attribute both the Cacheability and Coherency in pAddr but the data is not present in present the data is not but ed into the cache to satisfy a load a satisfy cache to ed into the MemElem cached AccessLength , only the referenced bytes are read fr read only the referenced bytes are , ility and Coherency Attribute ( Attribute and Coherency ility memory need be valid. The low-order two (or three) bits of bits (or three) two low-order The valid. be need memory uncached ads a value from memory. a value ads rally aligned memory element ( rally ores a value to memory. value ores a : Length, in bytes, of access */ access bytes, of in : Length, Volume IV-j: The MIPS64® SIMD Arch into the physical location the physical into and memory and resolve the reference */ the reference resolve memory and and width is the same size as the CPU general-purpose register, */ register, CPU general-purpose as the same size is the width */ boundary, 32- or 64-bit on a aligned 64 bits, 32 or */ respectively. Figure 2.17Figure Function Pseudocode LoadMemory Figure 2.18Figure Function Pseudocode StoreMemory : */ The alignment. a natural width with a fixed in is returned Data indicate which of the bytes within the bytes of whichindicate : */ address virtual : : */ address physical */ address virtual : */ or Data for Instructions is access whether Indicates : DATA */ or for INSTRUCTION is access whether Indicates : */ STORE or for LOAD is access whether Indicates :*/ caches access used to Cacheability&CoherencyAttribute=method ) to find the contents of contentsthe to find ) block of memory is read and load memory is read and block of IorD IorD /*/* vAddr */ the reference resolve and memory and /* /* AccessLength /* vAddr /* IorD /* MemElem /* /* pAddr /* /* /* /* CCA /* LorS MMU */ the appropriate for description address translation See the /* */ mechanism translation exact the book for of this Volume III type in /* AccessLength StoreMemory (CCA, AccessLength, MemElem, pAddr, vAddr) pAddr, MemElem, (CCA, AccessLength, StoreMemory endfunction LoadMemory endfunction LoadMemory (CCA, AccessLength, pAddr, vAddr, IorD) vAddr, pAddr, AccessLength, (CCA,  LoadMemory MemElem endfunction AddressTranslation endfunction alignment field indicate which of the bytes within the within bytes the of which indicate field block is the entire memory element. the block entire memory is and the memory element. If the access type is type access the memory element. If the ory) as specified by the Cacheab as specified ory) memory elementfor 32-bit(a word processofixed-width to that are actually stored bytes gth ally be changed. ally and the and type of the reference is access StoreMemory StoreMemory function st The stored is data specified The LoadMemory LoadMemoryThe function lo cache and main memory as specified uses This action the access ( access the natureturned in a fixed-width MIPS® Architecture for Programmers Programmers MIPS® Architecture for

27 itecturally visible state. itecturally 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture tion-specific action is taken. The action taken may taken The action action is taken. tion-specific the program or alter arch 10 bytes (16 2 bits) (8 bits) byte 1 cy Attribute, the method used to access */ to used method the cy Attribute, nd stores to synchronize shared memory. synchronize shared to nd stores on for which an implementa on an for which : Length, in bytes, of access */ access bytes, of in : Length, Volume IV-j: The MIPS64® SIMD Arch Figure 2.19Figure Function Pseudocode Prefetch ches data from memory. from data ches AccessLength Name Value Meaning caches and memory and resolve the reference. */ reference. the resolve and memory and caches The width is the same size as the CPU general */ general the CPU size as the same width is The */ 8 bytes, 4 or either register, purpose */ For a boundary. 4- or 8-byte on a aligned be*/ that will bytes only the store, partial-memory-element be valid.*/ must stored BYTE DOUBLEWORDSEPTIBYTESEXTIBYTEQUINTIBYTE 7WORD bits) bytes (64 8 TRIPLEBYTE 6HALFWORD 5 bytes (56 7 bits) 4 bytes (48 6 bits) bytes (40 5 bits) 2 3 bytes (24 3 bits) bytes (32 bits) 4 : : */ memory element. of a alignment width and in the Data Table 2.1Table Loads/Stores for Specifications AccessLength : */ address physical : */ address virtual : : */ address physical */ address virtual : DATA */ is for access that Indicates : */ of the data use the possible that indicates hint : Cacheability&Coheren :access */ used to method the Attribute, Cacheability&Coherency lists the data access lengths and their labels for loads and stores. their and lengths access lists the data /* CCA /* vAddr /* DATA /* hint /*/* pAddr */ the reference. resolve and and memory caches /* CCA /* /* AccessLength /* MemElem /* /* /* /* /* /* pAddr /* vAddr endfunction Prefetch endfunction Prefetch (CCA, pAddr, vAddr, DATA, hint) vAddr, pAddr, (CCA, Prefetch endfunction StoreMemory endfunction SyncOperation a orders loads SyncOperation function The Table 2.1 Prefetch is an advisory instructi an advisory Prefetch is the not meaning of change must performance but increase Prefetch prefet Prefetch function The MIPS® Architecture for Programmers Programmers MIPS® Architecture for

28 occur in the same order for all for all order same the occur in stype ther than unformatted contents from a from unformatted contents than ther 31..0 31..0 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture FPR[fpr] contents loaded or moved to CP1 registers are inter- registers CP1 to contents loadedmoved or 

31..0 FPR[fpr]  1]

 32 0) then 0) 0)   hronizable loads and stores indicated by loads and stores hronizable FPR contains a value in some format, ra some in value a contains FPR 0) then 

0 Volume IV-j: The MIPS64® SIMD Arch Figure 2.21Figure Function Pseudocode ValueFPR valueFPR  UNPREDICTABLE valueFPR  FPR[fpr valueFPR Figure 2.20Figure Function Pseudocode SyncOperation valueFPR  UNPREDICTABLE valueFPR if (fpr endif  FPR[fpr] valueFPR else : */ perform. to ordering of load/store Type if (FP32RegistersMode if (FP32RegistersMode else endif if (FP32RegistersMode valueFPR  UNPREDICTABLE valueFPR D, UNINTERPRETED_DOUBLEWORD: PS, OB, QH: L, S, W, UNINTERPRETED_WORD: /* value:/* FPR */ from the value formattted The /* fpr:fmt:/* The FPR number */ /*of: */ one the data, format of The /*/*/*PS, */ W, L, S, D, */ the datatype that to indicate are used values The UNINTERPRETED /* OB, QH, */ SDC1 */ SWC1 and in for example, as, is not known /* */ UNINTERPRETED_WORD, */ UNINTERPRETED_DOUBLEWORD fmt of case /* stype */ the to complete operation implementation-dependent Perform /* */ operation /* required synchronization ValueFPR(fpr, fmt)  ValueFPR(fpr, value SyncOperation(stype) SyncOperation endfunction The pseudocode shown in below specifies how the unformatted how specifies inbelow pseudocode The shown If an value. formatted a to form preted load(uninterpreted), to interpretinthat it is valid notformat the value tointerpret format).(but itin different a ValueFPR function returns from a formatted the floating value point registers. ValueFPR The processors. This action makes the effects of the sync of the the effects makes action This 2.2.2.3 Functions Point Floating MIPS® Architecture for Programmers Programmers MIPS® Architecture for

29 63..32 31..0 2.2 Functions and Notation Section Operation value value itecture Module, Revision 1.12 Revision Module, itecture  

31..0 32 32 oding representing a formatted value intois stored CP1 oding representingformatted a value ry representation is visible to store or move-from instruc-ry representation is visible orto store move-from value 

32 0) then 0)   UNPREDICTABLE  UNPREDICTABLE PR[fpr] 1]  0) then 

lue from the StoreFPR(), it is not valid to interpret the value with ValueFPR() in a lue fromtheStoreFPR(), to interpret with ValueFPR() it is not valid the value 0 Volume IV-j: The MIPS64® SIMD Arch Figure 2.22Figure Function Pseudocode StoreFPR UNPREDICTABLE  FPR[fpr] FPR[fpr F  valueFPR if (fpr UNPREDICTABLE  value FPR[fpr] endif  value FPR[fpr] else else else endif else endif if (FP32RegistersMode if (FP32RegistersMode if (FP32RegistersMode FPR[fpr]  UNPREDICTABLE FPR[fpr] endif  UNPREDICTABLE valueFPR L, PS, OB, QH: L, D, UNINTERPRETED_DOUBLEWORD: S, W, UNINTERPRETED_WORD: DEFAULT: endcase /* fpr:fmt:/* The FPR number */ /*of: */ one the data, format of The /*/*/*PS, */ W, L, S, D, value:/* */ FPR the into to be stored value formattted The OB, QH, */ */ UNINTERPRETED_WORD, */ the datatype that to indicate are used values The UNINTERPRETED /* */ UNINTERPRETED_DOUBLEWORD LDC1 */ LWC1 and in for example, as, is not known /* fmt of case endcase StoreFPR (fpr, fmt, value) fmt, (fpr, StoreFPR endfunction ValueFPR endfunction registers by a computational or move operation.by computationala Thisregisters bina move or va a an FPR receives tions. Once format. different StoreFPR The pseudocode shown below specifies the way a binary specifies enc pseudocodebelow The shown MIPS® Architecture for Programmers Programmers MIPS® Architecture for

30 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture 23+cc..0 22..0 0)) ) then  e to a specific floating point condition code. floating point a specific e to ) 11..7 23 24+cc || tf || FCSR || tf || tf || FCSR tf || || and FCSR 1) or 31..24 31..25+cc Volume IV-j: The MIPS64® SIMD Arch  16..12 17 Figure 2.24 Figure Function Pseudocode FPConditionCode Figure 2.23Figure Function Pseudocode CheckFPException Figure 2.25Figure Function Pseudocode SetFPConditionCode ((FCSR FPConditionCode  FCSR FPConditionCode FCSR  FCSR FCSR  FCSR FCSR FPConditionCode  FCSR FPConditionCode SignalException(FloatingPointException) FPConditionCode(cc) /* tf:/* code */ condition the specified value of The cc:/* */ 0..7 the range in code number Condition The cc = 0 then if if cc = 0 then if else else endif if ( (FCSR endif  endfunction StoreFPR endfunction SetFPConditionCode(cc, tf) SetFPConditionCode(cc, endif SetFPConditionCode endfunction endfunction FPConditionCode endfunction tf CheckFPException() a 1 */ is field Cause of the the E bit if is signaled exception point A floating /* */ field Cause the bit in or if any enable) have no Operations (Unimplemented /* */ both 1 are field Enable in the bit corresponding and the /* endfunction CheckFPException endfunction SetFPConditionCode SetFPConditionCode valu The functionwrites a new FPConditionCode floatingFPConditionCodepointThe specific of a condition functionreturns value the code. tingan enabled floa andconditionally for pointchecks exception. signalstheexception below shown pseudocode The CheckFPException MIPS® Architecture for Programmers Programmers MIPS® Architecture for

31 ion pseudocode never sees a sees a pseudocode never ion ion pseudocode never sees a pseudocode never ion sees a pseudocode never ion 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture fect of the instruction, but also inhibiting all exceptions all exceptions inhibiting also but of the instruction, fect n in question. For branch-likelyinstructions, nullification For question.in n ction. The instruction operat The instruction ction. ction. The instruction operat The instruction ction. operat The instruction ction. ntException Pseudocode Function Pseudocode ntException ot of the branch likely instruction. of the branch likely ot signals an exception condition. signals exception an Volume IV-j: The MIPS64® SIMD Arch :*/ exists. that condition exception The Figure Figure 2.26 Function Pseudocode SignalException Figure 2.27Figure SignalDebugBreakpoi Figure Figure 2.28 Function Pseudocode SignalDebugModeBreakpointException /* Exception argument:/* */ if any argument, A exception-dependent SignalDebugModeBreakpointException() SignalDebugModeBreakpointException endfunction SignalDebugBreakpointException() SignalDebugBreakpointException endfunction SignalException(Exception, argument) SignalException(Exception, SignalException endfunction returnfrom this function call. returnfrom this function call. returnfrom this function call. This section lists miscellaneous functions not covered in previous sections. This sectioninprevious miscellaneouslists functions not covered SignalException SignalException function The This actionresults in anthat exception aborts theinstru detected during fetch, decode, or execution of the instructio of the or execution fetch, decode, duringdetected sl delay the in instruction the kills NullifyCurrentInstruction NullifyCurrentInstructionThe functionthe currentnullifies instruction. ef functionalthe only not aborted, inhibiting is instruction The SignalDebugModeBreakpointException SignalDebugModeBreakpointExceptionThe function signalsa condition thatMode fromcauses entry into Debug Mode generated (i.e., an exception Mode). alreadywhile running Debug in Debug This actionresults in anthat exception aborts theinstru SignalDebugBreakpointException SignalDebugBreakpointExceptionfunctionsignalsThe condition a Mode thatentryfromnon- causes into Debug Mode. Debug This actionresults in anthat exception aborts theinstru 2.2.2.4Functions Miscellaneous MIPS® Architecture for Programmers Programmers MIPS® Architecture for

32 lue contains a valid word lue containsvalid a 2.2 Functions and Notation Section Operation itecture Module, Revision 1.12 Revision Module, itecture ) i termines whethertermines the 64-bit va 32 r the PC-relative instructions ASE. The in the MIPS16e the r PC-relative ) 31 || 0 is executed in a jump delayslot.isA executed jump immedi-delay slotalways (value 

(31-i)..0 vAddr 63..32 Volume IV-j: The MIPS64® SIMD Arch Figure 2.32Figure Function Pseudocode PolyMult Figure 2.31Figure Function Pseudocode NotWordValue Figure 2.30Figure Function Pseudocode JumpDelaySlot = 1 then :Virtual address */ address :Virtual i Figure 2.29Figure Function PseudoCode NullifyCurrentInstruction temp xor (y  temp temp NotWordValue(value)  if x endif temp  0 temp .. 31 i in 0 for endfor  temp PolyMult /* vAddr /* result:/* /**/ value; word sign-extended not a correct value is if the True value:/*  value NotWordValue */ checked to be value register A 64-bit */ otherwise False PolyMult(x, y) PolyMult(x, endfunction NotWordValue endfunction result result endfunction PolyMult endfunction JumpDelaySlot(vAddr) JumpDelaySlot(vAddr) endfunction JumpDelaySlot endfunction NullifyCurrentInstruction() NullifyCurrentInstruction endfunction (32-bit) value. Such a value has bits 63..32Such equal a value to bit(32-bit)31. value. PolyMult PolyMultThe function multiplies two binary polynomialcoefficients. function returns TRUE if the instructionthe if at function returns TRUE NotWordValue that de boolean value returns a function NotWordValue The JumpDelaySlot JumpDelaySlotThe functionis used inthe pseudocode fo ately follows a JR, JAL, JALR, or JALX instruction. MIPS® Architecture for Programmers Programmers MIPS® Architecture for

33 fs, ft, imme- fs, ft, s of specific instructions. For For instructions. s of specific function subfields. op and 2.3 Notation Subfield Function and Op in an instruction format (such as an in itecture Module, Revision 1.12 Revision Module, itecture can have constant 5- or 6-bit values. When reference is is reference When values. 5- or 6-bit constant have can riable subfield in the formatriablesubfield the chapters describing the CPU, FPU, MDMX, and MIPS16e the CPU, FPU, MDMX, describing chapters the for a description of the the of description a for function structions. Such an alias is always lowercase it refers to a since lowercase structions. is Such an alias always and op The instructionThe name(such as ADD, andinupper- SUB, soon) is shown instruction, all variable subfields subfields all variable instruction, Volume IV-j: The MIPS64® SIMD Arch ADD. In other cases, a single field has both fixed and variable subfields, so the name con- so the name subfields, variable and both fixed has single ADD.field a In other cases, in the format for load and store in load and store the format for in function= rs=base rs=base , and so on) are shown in lowercase. in lowercase. are shown and so on) , “Op and Function Subfield Notation” on page on Notation” Subfield Function and “Op 33 COP1 and tains both upper- and lowercase characters. lowercase and both upper- tains made to these instructions, uppercase mnemonics instance,instructions, uppercase are For to these used. made inthefloating point ADD instruction, op= In the detailed description of each FPU the detailed description In diate In some instructions, the instruction subfields subfields instruction the instructions, some In case. va used for a is sometimes an alias of clarity, the sake For example, example, subfield. variable I, in Volume in are given for mnemonics encodings Bit instructions. See MIPS® Architecture for Programmers Programmers MIPS® Architecture for 2.4 Instructions FPU 2.3 Notation Subfield and Function Op

ta. In the case of a float- of case the In ta. chitecture, but rather in but chitecture, full 32 vector registers set registers vector full 32 e physical vector registers to the registers vector physical e lected, simpleSIMD instruction set re than 150 new instructions operat- re than 150 new ftware-programmable solution in the the in solution ftware-programmable ons. This functionality is of growing Thisfunctionalityons. growing is of MD Architecture (MSA). ith leveraging generic compiler support. generic ith leveraging itecture Module, Revision 1.12 Revision Module, itecture 34 is ideal for these applications. these ideal for is ts 128-bit wide vector registers sharedwith the64-bitregisters 128-bit ts wide vector rise applications. and64-bitfloating-point data. 16-bit floating-point stor- in thatlends itselfwell toSIMD. These assembly for any specific ar specific assembly for any than 32 physical vector registers per hardware threadhardware per con- registers than 32 physical vector gisters as needed, up to the needed, up to gisters as pport within high-level languages,enabling andsimple pport withinfast high-level to/from 32-bitfloating-point da ger, 16-and 32-bit fixed- point, 32-16-andor and 64-bit32-bitfixed- float- ger, eature extraction in video, eature imageandextraction video processing, ures of ures the MIPS64® SI rict adherence to RISC (Reduced Instruction Set Computer) design princi- design Computer) Set Instruction (Reduced RISC to adherence rict d MIPS architecture with a set of mo ectronics and enterp and ectronics pending requests. The actual mapping of th requests. The actual mapping of pending s with a designed the MSA carefully se nt in terms of speed, area, and power speed, area, and power of terms nt in also hardware-efficie but er-friendly, parallel processing of vector operati parallelprocessingof vector system flexibility, and the MSA systemflexibility, (MSA) module adds new instructions tomoduletheindustry-standard(MSA) adds new Release MIPS 5 intensive w applications in conjunction intensive e current release, e MSAcurrent release, implemen Volume IV-j: The MIPS64® SIMD Arch -2008.for 32-bitAll standard operations are provided TM A wideapplicationsrange of – including data mining,f human-computer interaction, and others – have some built- human-computer interaction,and–othershave compute-intensive software packagescompute-intensive will not writtenbe in high-level languages using high-level operationsdataon types. vector st with implemented was module MSA The thatis not onlyprogrammer- and compil consumption. The simpleinstructions are also easytosu ples. From the beginning, MIPS From thearchitectbeginning, ples. code. existing of code, as wellas leverage of new development feat purpose and key the This chapter describes The MSA complements the well-establishe MSA complements The The MIPS® Architecture The SIMD efficient allow (“R5”) architecture that consumer el range of a importance across In consumer electronics, while dedicated,while non-programmableconsumer electronics, In hardware aids theCPU and GPU by handling so a adding toward trend a recognized is there codecs, multimedia heavy-duty CPU to handle emerging applications by thetoa smallIn CPU handlethisor emerging dedicated numberof hardware. functionsnot covered increased SIMDcan provide way, instruc- defined narrowly on focusing Rather than nsion. exte SIMDmultimedia another just not is MSA the However, optimized writtencode tionsthat manually must inhave assemblylanguage inorder to be utilized, the MSA is compute- designed to accelerate age format is supported through conversion instructions age formatis supported through conversion defined by the architecture. When the hardware runs out of physical vector registers, the OS re-schedules the runningthe OS re-schedules the registers, vector physical of out runs hardware the When architecture. the by defined the accommodate to or processes threads text. The contexts have access to as many vector re vector to as many access have thread contexts The text. ing on 32 vector registers of 8-, 16-,8-, 32-, and 64-bitof registers inte ingon 32 vector ing-pointdata elements. Inth widefloating-point unit (FPU) registers. Inmulti-threaded implementations,for fewer MSAallows hardware thread contexts is managedthread contexts by the hardware. hardware MSAThe floating-point implementationis compliant with theIEEE Standardfor Floating-Point Arithmetic 754 MIPS® Architecture for Programmers Programmers MIPS® Architecture for 3.1 Overview The MIPS64® SIMD Architecture The MIPS64® Chapter 3 Chapter

35 Section 3.2 MSA Software Detection for MSA, which means that bit), doubleword (64-bit). Cor- (64-bit). bit), doubleword (MIPS DigitalMedia eXtension):a t the need for software emulation for emulation t the need for software s. MSA and FPU can notbeboth pres- , is used to enable the MSA instructions. the MSA to enable used is , bit (CP0 Register 16, bit bit Select 3, (CP0 Register 28) as

not zero element(s) or vector value. value. not vector or zero element(s) number of elements indexed from0tonumber n, of elements indexed itecture Module, Revision 1.12 Revision Module, itecture MSAP Figure 3-2 Figure ecture, Release 5 or later. 5 or Release ecture, Config3 both MSA and the scalar floating-point unit (FPU) are pres- unitfloating-point scalar the and MSA both halfword (16-bit), halfword (32- word not set causes not a MSAset causes DisabledException, see . The reset state of the MSAEnstate bit is zero. The reset . element is precisely identified withou identified precisely is element tend 64-bit and share the FPU register Branch instructions test for zero instructions or test for Branch pioneered by MIPS and its earlier MDMX its and by MIPS pioneered entation of theMIPS32 Archit Volume IV-j: The MIPS64® SIMD Arch . MSAP bit is fixed bytheimplementation hardware MSAP. bit is and is read-only fixed thefor software. The soft- Figure 3-2Figure Bit MSA 5) Enable Select 16, (CP0 Register Config5 Figure 3-1 MSAEn MSAEn bit (CP0 Register 16, Select 5, bit 27), shown in 27), shown bit 5, Select 16, Register (CP0 bit MSAEn Figure 3-1Figure Bit Present MSA 3) Implementation Select 16, (CP0 Register Config3 MSAP Executing a MSA instruction whenExecuting MSAEn bit is 3.5.1 “Handling the MSA Disabled Exception” Config5 ware may determineware MSAifthe is implementedby MSAchecking if theMSAP bitattempt is set. Any to execute instructions mustInstruction causea Reserved Exception if the MSAPbit is not set. responding to the associated data format, a vector register consists of a of consists register dataa vector format, associated responding to the ent, unless the FPU has 64-bitfloating-point registers. four dataformats: byte (8-bit), have register vector MSA The MSA operates on 32 128-bit wide vector registers. If registers. vector wide 128-bit MSA operates on 32 The ex registers ent, the 128-bit MSAvector simple, yet very efficient instructionset. Theopcodes allocated to MDMXreused are efficient yetsimple, very MDMXis timeat the deprecated the release of MSA. of implem compliant requires a MSA in shown The presence of MSA implementation is indicated by the indicated is MSA implementation presence of The all vector elements. all vector compare andFor branch,MSA no uses global condition flags: compare instructions write resultsthe ele- per vector bit values. one zero or all all ment as same principles the on is built MSA ing-point exception, each faulting vector vector each faulting exception, ing-point 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 MIPS® Architecture for Programmers Programmers MIPS® Architecture for 3.3 Registers Vector MSA 3.2 MSA Software Detection

36 0 ele- th LSB LSB [0] 5 0 [0] 61 [1] 3.3 MSA Registers Vector ta formats where [n] refers to [n] refers where formats ta 11 [0] 23 32 31 LSB MSB [2] 0 and the most significant bitthe 0 of n and the most significant 73 [1] 84 itecture Module, Revision 1.12 Revision Module, itecture [3] 34 out for elements of all four da all four of elements for out 46 64 63 0 64 63 LSB MSB LSB MSB [4] 96 ficant and Least Significant Byte. Significant and Least Most Significant the element’s for stand 07 [2] element is the vector register bit register the vector element is th [5] show the vector register lay register vector the show Volume IV-j: The MIPS64® SIMD Arch 5 8 Figure3-3 Register Byte ElementsVector MSA Figure 3-5Figure Elements Word Register MSA Vector 69 [1] Figure 3-4Figure Elements Halfword Register MSA Vector 96 95 Figure Figure 3-6 Elements Doubleword Register Vector MSA Figure 3-6 Figure LSB MSB [6] elements. th 1 9 1 through 21 [3] 1 vector element and MSB and LSB element and vector th [7] the n Figure 3-3 Figure where the least significant bit of the 0 bit of the least significant where ment is the vector ment register bit the127.is vector theWhen floating-pointboth are present, FPU andMSA reg- onthecorrespondingare mapped MSAregisters vector isters as the 0 71 [15] [14] [13] [12] [11] [10] [9] [8] [7] [6] [5] [4] [3] [2] [1] [0] MSB LSB MSB LSB MSB LSB MSB LSB MSB LSB MSB LSB MSB LSB MSB LSB 2 3.3.1 Layout Registers MSB MSB 1 2101121140998887776655444333221187 0 127120119112111104103969588878079727164635655484740393231242316158 127 127 MIPS® Architecture for Programmers Programmers MIPS® Architecture for

) 37 . Figure3-9 register (CP0 Regis- (CP0 register ), and 8 rows ( ), and 8 rows 3.3 MSA Registers Vector Config dress. The byte order The of each byte dress. for a MSA vector consisting for of a MSA vector Figure 3-8 nal byte array, with as many rows as rows as many with nal byte array, e BE bit in the CP0 the in bit BE e itecture Module, Revision 1.12 Revision Module, itecture element at the lowest byte ad byte element at the lowest th , the 1-row array is reduced to the vector shown in Figure 3-3 shown to the vector array is reduced the 1-row , ), there are 4 rows for word ( word for are 4 rows ), there [9][7] [8] [5] [6] [3] [4] [1] [2] [0] [15] [14] [13] [12] [11] [10] shows representation the memory shows 15870 ention as indicated by th as ention [7][6][5][4] [3][2][1][0] Figure 3-7 [15] [14][11] [13] [10] [12] [9] [8] 31 24 23 16 15 8 7 0 Table 3.1 structions SLD and SLDI is a 2-dimensio is SLDand SLDI structions Volume IV-j: The MIPS64® SIMD Arch memory starting fromthe 0 [7][6][5][4][3][2][1][0] [15] [14] [13] [12] [11] [10] [9] [8] 63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0 Figure 3-7 Register as Byte 2-Row MSAArray Vector Figure 3-9 Register as Byte 8-Row MSAArray Vector Figure 3-8 Register as Byte 4-Row MSAArray Vector MSA vectors are stored in are stored vectors MSA For halfword, the byte array has 2 rows ( the 2 rows byte array has halfword, For for doubleword data format. for doubleword ter 16,ter Select 0, bit example, 15).For elementsword inboth big- andlittle-endian mode. element follows the big- or little-endianconv or the big- follows element bytes indatathe format. byte integer formatdata For The vector register layout for slide in register vector The MIPS® Architecture for Programmers Programmers MIPS® Architecture for

38 in 32-bit mode.i.e. bit cilitate register data shar- data register cilitate 3.3 MSA Registers Vector (CP1 Control 31, Register 2 1 0 3 6 5 4 7 8 9 10 11 14 13 12 15 clearwill generate theReserved ABS2008 FR Address Offset Address Big-Endian Byte Big-Endian FPU is required to use 64-bit floating-point64-bit use to is required FPU (CP Register 12, Select0, bit26)12, is notset. (CP Register FCSR Status FR itecture Module, Revision 1.12 Revision Module, itecture and r 1) is usable and operates usable is r 1) Status e MSA vector registers. To fa To registers. vector MSA e set and 1 2 3 0 5 6 7 4 8 9 CU1 NAN2008 10 11 13 14 15 12 Status FCSR Address OffsetAddress Little-Endian Byte Little-Endian (CP1 Control Register 0, bit 22) are set. are 22) bit 0, Register Control (CP1 nt instructions and vector instructions, theinstructions, and vector instructions nt Volume IV-j: The MIPS64® SIMD Arch Byte [0] Byte / LSB [0] Byte [1] Byte [2] Byte / MSB [3] Byte [0] Byte / LSB [0] Byte [1] Byte [2] Byte / MSB [3] Byte [0] Byte / LSB [0] Byte [1] Byte [2] Byte [3] Byte / MSB [3] Byte [0] Byte / LSB [0] Byte [1] Byte [2] Byte / MSB [3] F64 Table 3.1Table MemoryRepresentation Vector Word FIR -2008,theread-only bitsi.e. t Registers Mapping t Registers TM and [0] [1] [2] [3] Word Vector Element Vector Word Word Word Word Word Register 12, Select(CP 0, bit is 29) and bitset MSAP CU1 Any attempt to execute MSA instructions with with instructions MSA execute to attempt Any Instruction exception. Instruction Config3 bits18and are set. 19) Status Arithmetic754 The scalar floating-point unit (FPU) registers are mapped on th are mapped registers (FPU)unit scalar floating-point The ing between scalar floating-poi scalar between ing operating in 64-bit mode.More specifically: registers •MSA If andFPU areboth present, then theFPU mustimplement 64-bit floating pointbitsi.e. registers, •instructions are not enabled while the FPU (Coprocesso MSA •be compliantFPU must then the with the IEEE StandardFPUare both present, MSAFloating-Point and for If 3.3.2 Floating-Poin MIPS® Architecture for Programmers Programmers MIPS® Architecture for

39 to V 0 0 and the V Figure 3-11 Figure MSAP and and set) set) FR to the floating-point FR Unchanged set) Config3 Word value V V = 0, …, 31, writes writes …, 31, 0, = to the element with r FR l remaining elements areremaining elements l V Figure 3-10 Figure 3.4 MSA Control Registers value of the element with of element the value = 0, …, 31, returns the value of value the = 0, …, 31, returns , where , r r ting-point registers are defined as as are defined registers ting-point 32 31 32 31 where Doubleword value value Doubleword V UNPREDICTABLE. r, = 0,…, 31, writes r word element 0, and al word itecture Module, Revision 1.12 Revision Module, itecture after writing 32-bita value . UNPREDICTABLE Word value r = 0, …, 31, returns the the returns 31, …, 0, = , where r r preserves . . r r r, where 64 63 64 6364 63 0 r, ations for the FPU/MSA mapped floa mapped the FPU/MSA for ations to the high part of the floating-point register register floating-point of the part high the to shows the vector register register vector the shows and manage the MSA state and resources. Two dedicated instructions are pro- are dedicated instructions Two and manage the MSA state and resources. V . The element’s format is word for for 32-bitformatis word (single precisionfloating-point)read or . Theelement’s and all remaining elements are and all remaining r 1 in the vector register register in the vector 1 r to the floating-point register V after writinga 32-bit (single precision floating-point)and64-bit a (doubleprecision e floating-point register register floating-point e r Volume IV-j: The MIPS64® SIMD Arch from the high part of the floating-point register floating-pointregister of the part high the from the to floating-point register . Figure 3-12 V value renders all floating-point and vector registers renders all floating-pointregisters value and vector 96 95 96 95 FR is set, the read and write oper and write read the set, is UNPREDICTABLE FR Status . r Figure 3-10Figure (Status Register MSA Vector on the Effect Write Word FPU Status Figure 3-12Figure (Status Register MSA on the Vector Write Effect FPU High Word register register the word element with index 1 in the vector register register in the vector 1 index with element word the UNPREDICTABLE floating-point) value show the vector register register vector the show index 0 in the vector register register in the vector 0 index the word element with index index element with the word index 0 in register theindex vector double for 64-bit(double precisionfloating-point) read. Figure 3-11Figure (Status Register Vector the MSA on Effect Write FPU Doubleword Changing the Changing • of value 32-bit write operation A • A write operation of value The control registers are used to record used are registers control The Control MSAregis- vided thisFrom for andpurpose:CFCMSA CTCMSA(Copy Control MSA To register) (Copy bit implementation the is registers control MSA the outside residing information only The ter). When •operation 32-bit read A follows: •th A read operation from UNPREDICTABLE UNPREDICTABLE UNPREDICTABLE UNPREDICTABLE UNPREDICTABLE 127 127 127 MIPS® Architecture for Programmers Programmers MIPS® Architecture for 3.4 Registers Control MSA

40 0 fields. tion specifyingthe registers mask registers MSAIR registers mask registers 3.4 MSA Control Registers 8 7 (written) vector describes the the describes . contains informa quested vector quested vector following sections forthefollowing complete descrip- register instance. register Figure3-14 ; itecture Module, Revision 1.12 Revision Module, itecture MSAIR MSAIR Read/Write Description WRP ProcessorID Revision for a summary and the summary for a = 0 ileged Read Only Implementation (MSAIR, MSA Control Register 0) (MSAIR, MSA Control ivileged Read/Writeand status Control ) is a 32-bit read-only register thatis a ) 32-bitread-only register WRP 18 17 16 15 Table 3.2 Table Section 3.2 “MSASoftware Detection” MSAIR MSAIR shows the format of the shows using CFCMSA (Copy From using Controlinstruction.CFCMSA (Copy If the MSA register) Table 3.2Table Registers MSA Control Figure 3-13Figure MSAIR Register Format = 1 ifMSA is implemented , user mode accessible Volume IV-j: The MIPS64® SIMD Arch Access Mode WRP MSAIR discussed in discussed Figure 3-13 Privileged Reserved Read/Write index register vector Unmapping Privileged Reserved Read/Writeindex register vector Mapping PrivilegedPrivileged ReservedPrivileged ReservedPrivileged Only Read Reservedmask Read/Write registers vector Available Reserved mask registers vector Saved Read/Write Modified Only Read Re 0 Required

User mode accessible, not priv User mode accessible, not MSAIR MSAEn 25 24 23 Not privileged

00000000000000 0 7 6 5 2 3 4 Config5 The software can read the can read software The multi-threading moduleshare oneallis present, thread contexts Compliance Level: Access Mode: enable bit enable There are 8 MSA control registers. See There MSA controlare 8 registers. tion. identification of MSA.identification The MSA Implementation Register ( MSA The Implementation Register Name Index MSAIR 31 3.4.1 MSA Register Implementation MSAMap MSACSR 1 User pr mode accessible, not MSASave MSAAccess MSAUnmap MSAModify MSARequest MIPS® Architecture for Programmers Programmers MIPS® Architecture for

41 register register 2 1 0 fields. MSACSR Flags RM 3.4 MSA Control Registers A Control and Status Register MSACSR 7 6 that controls the operation of the operation the that controls R Preset Required R Preset Required R0 0 Reserved Enables Write ResetState Compliance Read/ A Control Register 1) A Control describes the itecture Module, Revision 1.12 Revision Module, itecture 12 11 Figure3-16 ; er Field Descriptions , CP1 Control Register 31) and MS Control Register , CP1 e is present, each thread context has its own own its has each thread context e is present, ) is a 32-bit read/write register register a 32-bit read/write ) is EVZOU I VZOU I VZOU I MSACSR Meaning using CFCMSA and CTCMSA (Copy From and To Control MSA usingCFCMSAFrom and CTCMSA (Copy and To ads as zero and must be writ- as zero and ads itioning MSA allows for mul- for MSA allows itioning MSACSR Description MSACSR implemented. mented. us Register (MSACSR, MS us Register (MSACSR, Figure 3-15Figure Format Register MSACSR lti-threading modul ifMSA is implemented , user mode accessible Volume IV-j: The MIPS64® SIMD Arch 0 not partitioning registers 1 Vector imple- partitioning registers Vector Figure 3-14Figure Regist MSAIR shows the format of the the format shows FS 0 Impl 0 NX Cause Encoding Required

ten as ten zero. Using vector registers part registers vector Using tithreaded implementations with fewer than 32 physical 32 physical than fewer with implementations tithreaded context. thread hardware per registers vector 25 24 23 22 21 20 19 18 17 Not privileged

) are closely related in their purpose. However, each serves a different functional unit and can exist inde- functional unit and can exist a different each serves ) related in their purpose. are closely However, Figure 3-15 0 00000000 Fields MSACSR Compliance Level: Access Mode: register) instructions. If the mu If instructions. register) The software can read and write the and write can read software The The MSAThe ( Control and Status Register MSA unit. unit. MSA instance. Floating PointControl (FCSR andStatus Register ( pendentlyof the other. 0 31:17 for use; re future Reserved Rev 7:0number Revision WRP 16Partitioning. Registers Vector Name Bits ProcID 15:8 Processor numberID R Preset Required 31 3.4.2 and Stat MSA Control MIPS® Architecture for Programmers Programmers MIPS® Architecture for

42 Optional Reserved Reserved 0 0 3.4 MSA Control Registers Undefined Optional R0 0 Reserved R0 R0 R/W 0 Write Reset State Compliance Read/ itecture Module, Revision 1.12 Revision Module, itecture . ndent features. R/W ster Field Descriptions Field ster Section Section ation Exception Meaning s as zero and must be writ- as zero and s ads as zero and must be writ- as zero and ads ads as zero and must be writ- as zero and ads emented,reads aszero and Description and Exception Signaling” and l implementation depe non-zero results not altered. non-zero are Oper Unimplemented as needed. be signaled may result zero with of non-zero tiny and the same sign. No Unimplemented signaled. is Exception Operation Volume IV-j: The MIPS64® SIMD Arch 0 and tiny subnormal values Input 1 subnormal input value every Replace Figure 3-16Figure Regi MSACSR Encoding ten as ten zero. 3.5.4 “Flushto Zero writes are writes ignored. non-zero result is and tiny subnormal value input Every See sign. same the of zero with replaced ten as ten zero. ten as ten zero. Fields 0 31:25 for use; re future Reserved 0 23for Reserved future use; read 0 20:19 for use; re future Reserved FS 24to zero. If not impl Flush Impl 22:21 contro to Available Name Bits MIPS® Architecture for Programmers Programmers MIPS® Architecture for

43 floating-point floating-point floating-point 3.4 MSA Control Registers R/W Undefinedfor Required R/W 0for Required R/W Undefinedfor Required Write Reset State Compliance Read/ itecture Module, Revision 1.12 Revision Module, itecture instruction instruction floating-point mode.

) with the least sig- least the with ) Cause bit are set either set bit are Cause Meaning destination register is not register destination trap. All thedestination ition arises for any of the the of any for ition arises it is set to 1 if the corre- the if to 1 is set it Section 3.5.2Section “Handling not a exception is taken is taken not a exception all operations in a vector operations all in a vector or floating-point or exception that conditions exception ng point exceptions not do point ng exceptions operation in vector float- in vector operation ception condition that does condition ception Unimplemented Operation Operation Unimplemented The exception conditions conditions The exception updated for updated all for ading the field. ading Cause point exception point Description for the of each bit. for meaning for the of each bit. meaning Volume IV-j: The MIPS64® SIMD Arch Table 3.3 Table 3.3 01mode Normal exception mode exception Non-trapping Encoding can be determined by re be determined by can to Refer In normal exception mode, the exception normal In written and the floating point exceptions set the Cause written the point exceptions and floating bits and trap. operations which the mode, non-trapping exception In floati normally signal would not do the bits and write Cause execution the condition arises during exception sponding instruction floating-point vector in the operation any of and set is to 0 otherwise. vect preceding by the caused register’s elements are set either to the calculated results results calculated to the either set are elements register’s signal an exception, normally if would operation the or, (see signaling NaN values to of during execution arise the A b instruction. floating-point bits These control whether or the MSA Floating Point Exception” Point Floating MSA the type exception specific bits recording 6 the nificant Cause the as format same the in element that for detected The Flags bits are field. IEEE the bits indicate These when an IEEE cond exception operation with an IEEE ex with operation the (i.e., exception point in a MSA floating result not Enablebit isoff). when both an is taken The exception conditions. five corresponding the bit and Enable of any the during execution to MSACSR a value by moving or instruction ing-point Cause that Note representations. alternative its of one or corresponding Operation) has no E (Unimplemented bit Enablebit; thenon-IEEE Exception is defined by MIPS as always enabled. MIPS as always by Exception defined is to Refer Fields NX 18 floating Non-trapping Cause 17:12bits. Cause Name Bits Enable 11:7 Enable bits. MIPS® Architecture for Programmers Programmers MIPS® Architecture for

44 floating-point floating-point ). 3.4 MSA Control Registers Table 3.6 Table usefully definable In result. definable usefully Exception not enabled, the is R/W Undefined for Required R/W 0for Required ion is not enabled, the overflowed overflowed the is not enabled, ion Write Reset State Compliance Exception is not enabled, the Exception Read/ ). itecture Module, Revision 1.12 Revision Module, itecture y if an exact infinite result is defined for an for is result defined y infinite if an exact the destination format’s largest finite number is number finite largest format’s destination the Table 3.6 Table Bit Meaning signed according to the operation (see operation according to the signed for the operation to be performed. operation to be performed. the for signaled if and only if there is no signaled if if there is only and i.e. when the Divide Zero by i.e. when the Divide i.e. when the Invalid Operation Operation Invalid the when i.e. ion (i.e., the Enable bit is bit the Enable (i.e., ion does not result in a MSA MSA a does not result in ags field are set, while the ags field ithmetic operations ithmetic that xplicitly reset by software. reset xplicitly point operation an point raises ) is delivered to the destination. In addition, the Inexact bit in the the in bit Inexact the In addition, destination. to the is delivered ) ounding mode used for most ndling, i.e. when the Overflow Except Overflow the when i.e. ndling, lt is a quiet NaN (see (see NaN quiet is a lt Description on (i.e.,on the Enable thebit isoff), for the of each bit. for meaning of the for the encodings meaning of in the Cause field. the Cause in Table 3.6 s any exception conditions that have that have conditions exception s any w Volume IV-j: The MIPS64® SIMD Arch Table 3.3 Table 3.4 Table 3.3Table Bit Definitions and Flag Cause, Enable, This field sho This field occurred for all operations in the vector floating-point floating-point in the vector for all operations occurred by reset last was flag the since completed instructions When a floating- software. point floating excepti Fl bit(s) in the corresponding except point floating in a result reset is never field Flags bits.This the do not update on) be e and must hardware by to Refer indicates the r This field specific operations use a e (som operations point floating mode). rounding to Refer IEEE exception condition that condition IEEE exception Ar unchanged. remain others this field. this This bit exists only only exists bit This Exception Operation is Invalid The The Divide by Zero Exception is signaled if and onl Exception is signaled if and Zero by Divide The operands. on finite operation handling, exception Under default correctly is an infinity result default default floating-point resu floating-point default these cases the operands are invalid invalid are operands the cases these handling, exception Under default exceeded in magnitude by what would have been the rounded floating-point result were the expo- result were the been rounded the floating-point have would in magnitude by what exceeded unbounded. range nent ha exception default Under result (see rounded is set. field Cause The Overflow Exception is signaled if if and only Overflow The E Unimplemented Operation. Z by Zero. Divide V Operation. Invalid O Overflow. Bit Name Fields RM 1:0Mode. Rounding Flags 6:2 Flag bits. Name Bits MIPS® Architecture for Programmers Programmers MIPS® Architecture for

45 MSAAccess. Reserved bit is set. The reset state bit

Wn 3.4 MSA Control Registers otherwise , register instance. register set) shows the format of the format the shows fying which of the 32 architecturally the fying which of tion is not enabled, rounded the not enabled, is tion MSAAccess WRP itecture Module, Revision 1.12 Revision Module, itecture MSAAccess MSAIR Figure 3-17 Figure ). lue. When two representable areequally values ags field is set. Such an underflow condition has no has condition underflow is set. Such an field ags Coprocessor enabled 0 is least significant bit is zero (that is, even) is, (that is zero bit significant least xponent range and precision unbounded then the -- and precision unbounded range xponent Bit Meaning Table 3.6 exact bit in the Cause field is set. field Cause bit in the exact unded result is exact inexact. or result is exact unded ss, MSA Control Register 2) ss, MSA Control i.e. when the Inexact Exception is not enabled, the rounded rounded the enabled, is not Exception Inexact the when i.e. i.e. when the Underflow Excep i.e. when the Underflow tion is signaled when a tiny non-zero result is detected after is detected result non-zero when a tiny signaled is tion ) is a 32-bit read-only register speci a ) 32-bit read-only register is ) is delivered to the destination and: to the destination delivered is ) for vector registers partitioning registers for vector (i.e. Table 3.4 Table Definitions Modes Rounding Volume IV-j: The MIPS64® SIMD Arch MSAAccess = 0, …, 31, is available and can be used only if used only be can and available …, 31, is 0, = n using CFCMSA (Copy From Control MSA register) instruction.the MSAAccessIf From usingControlregister) MSA CFCMSA (Copy Table 3.6 Table present, each thread context has its own its own has eachthread context present, , accessible only when access to only when access , accessible Table 3.3Table Definitions Bit and Flag Cause, Enable, Required

, where , observable effect under default handling. under default effect observable n Rounds the result to the nearest representable va representable nearest the to result the Rounds whose to the value result is rounded the near, result. the than magnitude in greater not but to closest value the to result the Rounds result. the than less not but to closest value the to result the Rounds result. the than greater not but to closest value the to result the Rounds •In the is inexact, result the rounded If (see destination to the is delivered result result (see result •Fl bit in the no exact, result is rounded the If from differs is, it that -- inexact is operation of an result rounded if the otherwise, stated Unless e were both been computed have what would signaled. be is Exception Inexact handling, exception Under default Under default exception handling, handling, exception Under default If enabled, the Underflow Excep Underflow the enabled, If whether the ro of regardless rounding Privileged

I Inexact. 0even. to to nearest / ties Round 1zero. toward Round 2/ plus infinity. positive towards Round 3infinity. / minus negative towards Round U Underflow. RM Field RM Field Bit Name Encoding Meaning Vector register W register Vector defined vector registers W0, …, W31 are available to the software. software. to the …, W31 are available W0, registers vector defined Access Mode: ( MSAThe Access register Compliance Level: of the MSA Access register is zero. register Access the MSA of the can read software The multi-threadingis module 3.4.3 MSA Access Register (MSAAcce MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. 46 n 0 0 W W 1 1 W W MSAModify 2 2 W W 3 3 W W 32 architecturally 32 Reserved Reserved 4 MSASave register 4 W shows the format of the format shows W has to be saved on on be saved to has n , the softwarethe n, writes 5 5 W W 3.4 MSA Control Registers is mapped to an available n is mapped to an available 6 6 W W otherwise otherwise . W 7 7 W , , W Figure3-18 8 8 W W set) set) shows the format of the format the shows 9 9 fying whichof the W W g which of the 32 architecturally defined defined architecturally 32 the of which g WRP WRP MSAMap W 10 W 10 n to W 11 W 11 itecture Module, Revision 1.12 Revision Module, itecture Figure 3-13 Figure MSAIR MSAIR W 12 W 12 n = 0,…, 31, thenregister W W 13 W 13 Coprocessor enabled 0 is Coprocessor enabled 0 is W 14 W 14 cleared. an already mapped vector register W register vector an already mapped W 15 W 15 me can not exceed the number of physical registers imple- of the number physical registers me can not exceed Wn with the value corresponding with tothecurrent the value context. W 16 W 16 e is present, each thread context has its own own its has each thread context e is present, are set, where set, where are W 17 W 17 dify, MSA Control Register 4) MSA Control dify, Wn W 18 W 18 W 19 W 19 MSAAccess is set. To freeup To is set. W 20 W using CFCMSA and CTCMSA (Copy From and To Control MSA MSASave usingCFCMSAFrom and CTCMSA (Copy and To 20 ) is a 32-bit read/write register specifyin register read/write a 32-bit is ) MSASave Wn W 21 W 21 Figure3-18 Register MSASave Format lti-threading modul (MSASave, MSA Control Register 3) MSA Control (MSASave, for vector registers partitioning registers for vector (i.e. for vector registers partitioning registers for vector (i.e. Figure Figure 3-17 Format Register MSAAccess W 22 W 22 Volume IV-j: The MIPS64® SIMD Arch ) is a 32-bit read/write register speci read/writeregister 32-bit is a MSAModify) Modify register is zero. register Modify and bit bit and W 23 W , accessible only when access to only when access , accessible , accessible only when access to only when access , accessible 23 MSASave Wn W 24 W 24 MSAAccess Required Required

isn unmapped and W 25 W 25 . W . The reset state of the MSA Save register is zero. register of the MSA Save state reset The Privileged Privileged W 26 W 26

W 27 W 27 MSAAccess W 28 W 28 W 29 W 29 MSASave. MSAUnmap W 30 W 30 the The software can read and write the and write can read software The register) instructions. If the mu If instructions. register) instance. Ifboth bit to behalf of the previous software and restoredcontext behalf theof previous The total number of vector registers mapped at any ti mappedat any registers vector total number of The mented. Access Mode: ( MSA The Modify register (written). been modified have W0, …, W31 registers vector defined MSA reset of the state The Access Mode: ( register MSASave The . after a software context not been saved W31 …, W0, have registers vector Compliance Level: Compliance Level: = 0, …, 31, the software writes0, , n Wn …, 31, the software = register to vector get access To physical register and physical register 313029282726252423222120191817161514131211109876543210 W 31 313029282726252423222120191817161514131211109876543210 W 31 3.4.5 MSA Modify Register (MSAMo 3.4.4 Register MSA Save MIPS® Architecture for Programmers Programmers MIPS® Architecture for

47 0 0 W W Figure Figure register register 1 1 W W 2 2 W W 3 3 W W shows the format of the format shows Reserved Reserved MSAModify 4 4 W W 5 5 sters the instruction will W W 3.4 MSA Control Registers 6 6 W W otherwise otherwise struction completes. The update update completes. The struction 7 7 Figure3-13 , , W W register instance. register g theg bits for the current MSAor 8 8 W W set) set) fyingwhich of the 32 architecturally fields. 9 9 is set. is W W WRP WRP Wn ing a vector register to be mapped. register vector ing a W W 10 10 register. MSAModify register. W W 11 11 ontrol Register 5) ontrol itecture Module, Revision 1.12 Revision Module, itecture MSARequest MSAMap MSAIR MSAIR ction with all vector regi all ction vector with W W 12 12 MSASave W W 13 13 Coprocessor enabled 0 is Coprocessor enabled 0 is W W 14 14 FPU instruction has requested access to but are not yet are to but FPU instruction has requested access ontrol Register 6) ontrol W W 15 15 describes the W W is always cleared before settin cleared always is 16 16 e is present, each thread context has its own own its has each thread context e is present, using CFCMSA and CTCMSA (Copy From and To Control MSA usingCFCMSA From and CTCMSA (Copy To and W W 17 17 n. the execution of each MSA or FPU in MSA or FPU of each the execution W W 18 18 ) is a 32-bit read-only speci register using CFCMSA (Copy Fromusing (Copy Control instruction.CFCMSA MSA register) the If W W 19 19 . Figure3-22 W W MSAModify 20 20 MSARequest ) is a 32-bit read/write register specify read/writeregister 32-bit is a ) = 0, …31, then the software has been granted access to and has modified register register modified has and to has been granted access software the = 0, …31, then n er (MSARequest, MSA C W W 21 21 lti-threading lti-threading modul (MSAMap, MSA C Figure 3-19Figure Format Register MSAModify for vector registers partitioning registers for vector (i.e. for vector registers partitioning registers for vector (i.e. MSAMap W W 22 22 Figure 3-20Figure Format Register MSARequest is clear, or are not yet saved, i.e. yet saved, or are not clear, is Volume IV-j: The MIPS64® SIMD Arch MSARequest MSARequest Wn present, each thread context has its own its own has eachthread context present, W W , accessible only when access to only when access , accessible , accessible only when access to only when access , accessible 23 23 e software cleared bit cleared software e MSAMap W W 24 24 is set, where is set, Required Required

W W 25 25 . The reset state of the MSA Request register is zero. of the MSA Request register state reset . The Wn Privileged Privileged W W 26 26 is set by the hardware for each or FPUMSA instru for the hardware by is set

MSAAcces is updated by the hardware when hardware the by is updated operation, i.e. hardware updates never clear any bits in clear any or hardware operation,i.e.updates never W W 27 27 W W 28 28 MSAModify W W 29 29 MSARequest n thesince timelast th W W 30 30 FPU instruction. MSARequest or write mode. either read in access Access Mode: MSA ( The register Map the format of the 3-21 shows Access Mode: ( register MSA Request The W31W0, …, MSA the current or registers vector defined i.e. available, W Compliance Level: Compliance Level: The software can read and write the the and write can read software The register) instructions. If the mu If instructions. register) the The software can read the can read software The multi-threadingis module is a logical is a instance. MSAModify If bit If 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 W W 31 31 3.4.7 MSA Map Register 3.4.6 MSA Request Regist MIPS® Architecture for Programmers Programmers MIPS® Architecture for

48 . The The n. register register n to one of n n register . Reserved MSAUnmap Wn MSAMap register to be unmapped. register fields. 3.4 MSA Control Registers otherwise , to unmap W register vector 87 543210 87 543210 MSAAccess set) MSAUnmap R0 0 Reserved WRP Write Reset State Compliance Read/ itecture Module, Revision 1.12 Revision Module, itecture MSAIR gister specifying a vector vector gister specifying a describes the describes each thread context has its own own its has context each thread Coprocessor enabled 0 is thehardware is instructed . the hardware is instructed to map vector register W register vector to map is instructed hardware the me can not exceed the number of physical registers imple- of the number physical registers me can not exceed Figure 3-24 e is present, each thread context has its own own its has each context thread e is present, Wn . using CFCMSA and CTCMSA (Copy From and To Control MSA From using and To CFCMSA and CTCMSA (Copy ap, MSA Control Register 7) ap, MSA Control using CFCMSA and CTCMSA (Copy From and To Control MSA reg- MSA Control To and From (Copy CTCMSA and CFCMSA using 0 0 ads as zero and must be writ- as zero and ads 18171615 18171615 MSAMap, MSAMap, MSAUnmap, ) is a 32-bit read/write re read/write 32-bit a is ) Description MSAAccess MSAUnmap MSAUnmap MSAMap Figure 3-21Figure Format Register MSAMap lti-threading modul threading module module is present, threading 000000000000000000000000000 000000000000000000000000000 Figure 3-23Figure Format Register MSAUnmap for vector registers partitioning registers for vector (i.e. Volume IV-j: The MIPS64® SIMD Arch MSAUnmap , accessible only when access to only when access , accessible Figure 3-22Figure Field Descriptions Register MSAMap Required

ten as ten zero. 252423 252423 Privileged n …,= 0, 31, is writtento n …, 0, = 31, is writtento

n, n, shows the format of the the format shows Fields register) instructions. If the mu If instructions. register) The software can read and write the read and write can software The instance. value When unmapping by clearing is confirmed The total number of vector registers mapped at any ti mappedat any registers vector total number of The mented. Access Mode: ( MSA The Unmapregister Figure3-23 Compliance Level: The software can read and write the can read and write software The ister) If the multi- instructions. instance. value When successful The by settingmapping physical registers. isconfirmed the available n 4:0 index.register Vector R/W 0 Required 0 31:5 for use; re future Reserved Name Bits 31 31 3.4.8 MSA Unmap Register (MSAUnm MIPS® Architecture for Programmers Programmers MIPS® Architecture for

49 CP0 Cause 3.5 Exceptions (CP Register 12, Register (CP . FR tion vector with ExcCode ExcCode with vector tion Status tion uses the common excep- the tion common uses MSACSR ): Section 3.2 Section Software “MSA ssed by the instruction is either not is ssed by the instruction R0 0 Reserved Table 3.5 Table Write Reset State Compliance Read/ MIPS SIMD Architecture on cores SIMD imple- Architecture on cores MIPS CP0 register set to 0x0e.reason for CP0register The exact itecture Module, Revision 1.12 Revision Module, itecture . Cause ): 3.5 Table ception uses the common excep the common uses ception re context switch. This excep This switch. context re the common exception vector with ExcCode field in withExcCode field vector exception the common (CP0 Register 16, Select 3, bit 28) is not set, or if the usable FPU oper- the set, or if is not 28) 16, Select 3, bit (CP0 Register CP0 register CP0 setto register 0x15. ctions are enabled are documented in ads as zero and must be writ- as zero and ads (CP Register 12, Select 0, bit 29) is set and bit and set is 29) 0, bit Select12, (CP Register ions are enabled. bits of theMSA Control Register and Status (CP0 Register 16, Select 5, bit 27) is not set or, when vector registers parti- when registers 16,vector (CP0 Register bitSelect 5, not27) is set or, are considered of to be part the are MSAP set), if any MSA vector register acce register MSA vector any set), if CU1 Cause Description or with ExcCode field in withExcCode field or WRP Cause MSAEn Config3 Status Table 3.18 Volume IV-j: The MIPS64® SIMD Arch MSAIR , if CFCMSAor CTCMSAinstructions attemptMSA control toreador write privileged Config5 , if MSA instruct MSA if , , if bit if , , a data dependent signaledexception by theMSA floating point instruction. This exception Figure 3-24Figure Descriptions Field Register MSAUnmap ExcCode field in field ExcCode ten as ten zero. , if bit , if MSAinstructions arenot enabled. Section 3.3.2 “Floating-PointRegisters Mapping” CP0 register set to 0x0b and CE field set to0indicate to set to0x0b Coprocessor 0. andCE field Cause register CP0 and field in field taking is in thethisexception register setto 0x0a. register Unusable Coprocessor enabled. This ex access 0 Coprocessor without registers MSA Disabled Select bit0, is26) not set. This exception uses MSA Disabled MSA Floating Point vect exception the common uses tion vector withtion vector available or needs to be saved/restored due toa softwa or needs to be saved/restored available Reserved Instruction Reserved tioningenabled is (i.e. ates in 32-bit mode, i.e. bit 32-biti.e. bit mode,in ates Reserved Instruction Reserved Fields All MSA reserved opcodes in MSA reserved All (see menting exceptions opcodesMSA. These willthe generate following • • • • • MSA instructions can generate the following exceptions (see exceptions following the generate can instructions MSA • The conditionsThe underwhich theMSA instru Detection” n 4:0 index.register Vector R/W 0 Required 0 31:5 for use; re future Reserved Name Bits 3.5 Exceptions MIPS® Architecture for Programmers Programmers MIPS® Architecture for

50 bit. No 3.5 Exceptions MSAEn non-trapping . Config5 MSACSR on which elements are at fault on at fault which elements are MSASave ) to record the specific exception exception the specific record ) to context. The OS can map or The context. MSA Disabled Exception when vec- when Exception Disabled MSA , the OS knows the current software the current software knows the OS , , and Cause NaN values, where the least significant 6 significant the least where values, NaN Table 3.3 MSAEn , EVZOU I is caused by some vector registers not being registers is caused bysomevector itecture Module, Revision 1.12 Revision Module, itecture 6543210 is set), the MSA Disabled Exception could be sig- be could Exception Disabled the MSA set), is Config5 , MSAAccess WRP led if at least one vector element causes an exception an exception element causes led if at least one vector Figure3-25 ed) for the current software software for the current ed) precise case this indication in MSAIR for an example of handling the for an example MSARequest ill save/restore MSA registers on context switch.MSAregisters illsave/restore Exception can be determined by checking the Cause field (see field Cause Enable bitfield are set to signalingto set are bitfield Enable t. Theotherelements will be set to the calculated results based on their oper- Mnemonic Description Signaling Bits NaN Signaling MSACSR Disabled Exception MSACSR Volume IV-j: The MIPS64® SIMD Arch bit is set. In this instance, the exception exception the instance, In this set. is bit Enable bitfield. There is no is no There bitfield. Enable registers by examining byexamining registers Table 3.5 Table Values (ExcCode) Code Exception MSA MSAEn … Figure Figure 3-25 NX when is set Elements Faulting for Format Output MSACSR Config5 Appendix A, “Vector Registers Partitioning” Registers A, “Vector Appendix 10111421 0x0a 0x0b 0x0e 0x15 RI CpU MSAFPEexception Instruction Reserved MSADisUnusable exception Coprocessor Point exception Floating MSA exception Disabled MSA Exception CodeException Value Decimal Hexadecimal tor partitioning registers is implemented. See ready (either not available or in or saved/restor need to be ready(either notavailable these vector save/restore context uses MSA resources and therefore it w and therefore MSA resources uses context implemented(i.e. is partitioning registers vector the If naled even if naledeven Innormal operation mode, floating point are signa exceptions enabled by the The exact reason for taking a MSA Disabled taking a for reason exact The MSA instruction can be executed ifMSAinstruction this bit not is Byexecuted can be set. setting and thecorresponding handling exception causes. The exception routine shouldsetthe signal normally would which elements All instruction. point MSA floating the re-execute and NX bit mode exception according to the exception an as the format same the have bits or exceptions detected for that elemen detected for exceptions or ands. 3.5.2 Exception Handling Point the MSA Floating 3.5.1 Handling the MSA MIPS® Architecture for Programmers Programmers MIPS® Architecture for

51 ) ) ) ) ) with with with with with with with with Table 3.7 Table 3.7 Table 3.7 Table 3.7 Table 3.7 3.5 Exceptions apply unchangedand Flags are not updated. are not updated. Flags Enable bitfield is not is Enable bitfield

Default Value, Default Value, s. If there excep- s. are enabled If Flags bits are updated after all by-case in the instruction’s the instruction’s in by-case MSACSR Figure 3.27 MSACSR Enabled Exception, and NX set and Cause bits and setting the destination’s destination’s the setting bits and Cause The default signaling NaN (see signaling default The in Figure 3-25 the of format shown Cause bit V set. The default signaling NaN (see signaling default The in Figure 3-25 the of format shown in Figure 3-25 the of format shown Cause Z bit set. bit Z Cause NaN (see signaling default The Cause bit U set. The default signaling NaN (see signaling default The in Figure 3-25 the of format shown signaling NaN (see default The in Figure 3-25 the of format shown Cause I Cause bit set. I Cause bit O set. MSACSR are preserved for enabled exceptions when NX when enabled exceptions for are preserved gnaled and the gnaled itecture Module, Revision 1.12 Revision Module, itecture Figure 3.26 MSACSR ting point exception will be taken, not even the always the tingalways point willexception not be taken, even Enable bit is 0, the floating point resultisvalue a default Table 3.6 Flags update and exception signaling condition. all elements the instruction operates on. It is assumed assumed is on. It operates elements the instruction all are explicitly documentedcase- d as a quiet NaN. ception will be si be will ception Cause contains no enabled exception the rounded result based the ), or one of the signaling signaling of the or one ), MSACSR MSACSR is the rounded is result based the MSACSR rgest finite number with the number finite rgest Default Value, Default Value, executing the instruction. The instruction. the executing Disabled Exception Disabled Table 3.7 Table MSACSR describes the shows the process of updating the process shows the and the default values from and values the default The default value depends on the rounding rounding the depends on value default The The default value is value default The The default value is either the default quiet quiet default the is either value default The (see NaN NaN operands propagate NaN The default value is the properly signed infin- is the value default The ity. on the rounding mode. the rounding on on the rounding mode. by an over- caused If mode. the on rounding enabled, exception overflow the without flow result. overflowed is the value default the mode, as shown below. mode, as shown For positive overflow values, positive infinity. infinity. positive values, overflow positive For the format’s values, overflow negative For number. finite negative smallest larg- the format’s values, overflow positive For val- overflow negative For number. est finite infinity. minus ues, sign of the overflow value. of the overflow sign Volume IV-j: The MIPS64® SIMD Arch i.e. the corresponding corresponding i.e. the . Figure 3.26 Figure Cause, a MSA floating-point ex MSA floating-point a Cause, Table 3.6 Table Exceptions Point Floating for Values Default Figure3-25 positive negative Table 3.6 Cause cleared bits are all before Round towards Round Round towards Round Round to nearestto Round value. overflow sign of the with the infinity An Round toward zerotoward Round la format’s The MSACSR changed and stillis used togenerate theappropriate if a floating ofresults.the NX default Regardless point value, exceptionnot is enabled, in shown as The pseudocode in pseudocode The Whenthenon-trapping mode exception bit noNX is set, floa enabled Unimplemented OperationException. Note thatby setting the NX bit, the value. This process is invoked element-by-element for This process is invoked value. MSACSR the elements have been processed and been the elements have tions in tions Figure 3.27 pseudocodeThe in Figure both the format in disabledbit is For set. theexceptions, default values description section. For instructions withFor non floating-point results, the pseudocode in Zero Invalid Inexact Overflow Divide by by Divide Operation Underflow value default The Exception Mode Rounding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

52 3.5 Exceptions has at0xNN least Signaling NaN Signaling . Byte Byte . 1 itecture Module, Revision 1.12 Revision Module, itecture */ the result gets destination 0x7CNN 0x7F80 00NN 0x7F80 Figure 3-25 Update Pseudocode Update Cause | c Quiet NaN Quiet Cause reason for generating the signaling NaN value. the for signaling reason generating Table 3.7Table NaN Encodings Default | E /* Unimplemented (E) is always enabled */ enabled is always (E) Unimplemented | E /* Volume IV-j: The MIPS64® SIMD Arch Cause Figure 3.26Figure MSACSR Enable 0x7E00 0x7FC0 0000 0x7FC0 0000 0000 0000 0x7FF8 00NN 0000 0000 0x7FF0  MSACSR and (enable & U) = 0 and (c & I) = 0 then I) = 0 (c & and & U) = 0 (enable and

and (enable & O) = 0 then & O) 0 and (enable 0 r d case */ exceptions for disabled value the default gets Cause 16-bit 32-bit 64-bit Format c | I c ^ U (bit E is 0x20, O is 0x04, U is 0x02, and I is 0x01) and I is 0x02, 0x04, U O is E is 0x20, (bit exception one bit set showing the set showing one bit v  destination enabled, are not Current exceptions /* v  /* Operation completed successfully, completed Operation /* 1. All signaling NaN values have the format shown in the shown format have NaN 1. values All signaling c:I bitfield O, U, E, V, Z, exception(s) element current d: exception disabled a case of used in be to value default e: a non-trapping set, i.e. of NX in case to be used value NaN signaling r: an exception without completed operation the if value result v: element to destination written to be value MSACSR Updated else if c = 0 then if c  c  */ exceptions all current Cause with MSACSR the update exceptions, No enabled /* MSACSR enable  MSACSR enable Input Output endif ) */ 3.3 Table (see is not enabled (U) Underflow when Underflow Clear Exact /* (c & U)  if ) */ 3.3 (see Table enabled is not (O) Overflow (I) when Set Inexact /* (c & O)  if endif enable  c & cause if cause = 0 then MIPS® Architecture for Programmers Programmers MIPS® Architecture for

53 -2008. TM 3.5 Exceptions in the instruction’s descrip- in the instruction’s tion and as such, they take precedence precedence take tion such, and as they ). is always enabled */ enabled is always itecture Module, Revision 1.12 Revision Module, itecture pected result is documented documented pected result is Table 3.6 gating operations with one NaN operand produce a NaN with a NaN with produce one NaN operand with operations gating ersions to integer/fixed-point or compares) NaNtheor or oper- ersionstointeger/fixed-point ) Cause left to rightas described by theinstruction format. e IEEE Standard for Floating-Point Arithmetic 754 Cause | c | E)) = 0 then /* Unimplemented (bit E 0x20) (bit /* Unimplemented = 0 then | E)) Update and Exception Signaling Pseudocode Signaling and Exception Update | MSACSR | Cause ys signal the Invalid Operation excep Operation Invalid signal the ys Enable Flags s are enabled */ s are enabled Flags Volume IV-j: The MIPS64® SIMD Arch  MSACSR

 MSACSR & (MSACSR = 0 then

Cause NX ((e >> 6) << 6) | c 6) << 6) ((e >> */ is not written destination destination gets the signaling NaN value for non-trapping exception */ exception non-trapping for NaN value signaling gets the destination Cause Flags /* Exceptions will trap, update MSACSR Cause with all current exceptions, current with all Cause MSACSR update will trap, Exceptions /* MSACSR MSACSR Flags are not updated */ updated are not Flags MSACSR /* No trap on exceptions, element not recorded in MSACSR Cause, in MSACSR recorded not element exceptions, No trap on /* v  Figure 3.27 MSACSR endif /* No enabled exceptions, update the MSACSR Flags with all exceptions */ all exceptions Flags with MSACSR the update exceptions, No enabled /* MSACSR /* Trap on the exceptions recorded in MSACSR Cause, MSACSR in recorded the exceptions Trap on /* MSACSR SignalException(MSAFPE, if MSACSR /* Current exception Current /* else endif else if (MSACSR else endif mantissa.In cases wherethe source signaling the same width,NaN and destinationall mantissa quietNaN have MSA propagates NaN operands as specified by th by specified as NaN operands propagates MSA If the destination format is floating-point, all NaN propa all is floating-point, format the destination If three the operands are the input NaN. When two or NaN,payload of NaN the the payload of resulting identical to is the payloadone of of theinput NaNsselected from NaN quiet default the generating in used NaN operand signaling the to select apply rules propagation NaN above The (see is disabled Operatioexception n when the Invalid value tion section. fromare generated by: Quietinput signalingNaN values NaNvalues • Copying the signalingNaNto the signquiet NaNvalue sign • Copying the mostbits of thesignificant signaling NaN mantissato the mostbits of thesignificant quiet NaN Note that signaling NaN operands alwa NaNoperands that signaling Note all quiet NaN operands. over theIf destinationformat is notfloating-point (e.g. conv ands are notpropagated (e.g. min maxor operations),the ex 3.5.3 NaN Propagation MIPS® Architecture for Programmers Programmers MIPS® Architecture for

54 FS bit FS -2008, TM df MSACSR 3.6 Instruction Syntax exact Exception to be sig- be Exception to exact cant bits of the destination destination the of bits cant replace every tiny non-zero tiny every replace urce, the least significant bits of the the of bits significant the least urce, the software emulation finalize the emulation finalize the software FS bit could be set to be could FS bit itecture Module, Revision 1.12 Revision Module, itecture df bnormalinput operands non-zeroresults. or computetiny ons except comparisons causes In causes comparisons ons except MSACSR . Flushing of tiny non-zero results causes Inexact andnon-zeroUnder- Flushing . ofresultscauses Inexact tiny 1 metic floating-point instructions are never flushed. meticfloating-point instructions are never the operands or the results. In addition, when results. In addition, the or the operands is a valid index value for elements ofdata value format index n a is valid ctions except the approximatections except reciprocals. n, where the destination is narrower than the so the than narrower is destination the coulditself be or the vector a byte, halfword, word, doubleword, t index fort index the data format (GPRs),$0,…, $31e.g. FS bit should be zero, the Underflow Exception be enabled and a trap handler be handler trap a and be enabled Exception Underflow be zero, the should bit FS implemented Operation Exception and let implemented Operation Exception and normal after rounding flushed are normal after rounding to zero. Volume IV-j: The MIPS64® SIMD Arch element of index element index of MSACSR and Exception Signaling and Exception : destination, source, and target vector registers, e.g. $w0, …, $w31 registers, :destination, vector source, and target wt FS bit thechangesUnimplementedof the behavior OperationException. All the other floating point , and : vector register register : vector : function/instructionname, e.g. ADDS_S or adds_sfor signed saturated add : general purpose registers purposers: general registers , ws , : destination data format, which which format, data destination : : immediate value valid as a bi as valid : immediate value MSACSR mantissa are set to zero. In cases where set to zero. In cases are mantissa input mantissa are ignored. bits are copied. In cases where the destination is wider than the source, the least signifi the least the source, is wider than the destination where cases In copied. are bits wd rd df func flow Exceptions to be signaledflow all for instru naled. flushed. ws[n] m •and to the mostmantissa the maximum significant bitto 1. value field exponent Setting quietthe NaN’s • • • The MSAThe assembly syntaxelements: language codingthe uses following • Some MSA floatingpoint instructions might not handlesu or desired, needed not is emulation software If operation. Such instructions may signal the Un Such may instructions • Flushing of subnormal inputoperands in all instructi Should the alternate handling exception ofthe IEEEattributes Standard for Floating-Point Arithmetic754 •floating-point comparisons, For Exception the Inexact is not signaledwhen subnormal input operands are •and 16-bitfloating-point inputs to values non arith provided toof carry provided the alternateout handling exception the execution attributes. Section 8 be desired, the desired, 8 be Section result and subnormal input operand with zeroof the same sign. The exceptions are signaled according to the new values of values new the to according signaled are exceptions • • is set: • non-zeroresults are detected before rounding Tiny Tiny non-zero results that would have been have would results that non-zero Tiny 1 3.5.4 to Zero Flush MIPS® Architecture for Programmers Programmers MIPS® Architecture for 3.6 Syntax Instruction

55

ws 3.6 Instruction Syntax d vector register sizes are sizes register vector d the destination data format units. By convention, in the units. By convention, element in the vector register register element in the vector th ) dataformat. .B .H .V .D le of the size of the data format. of of the data the size le .W . thatNote the data formatis the abbreviation Table 3.8 2 in data format df select the n itecture Module, Revision 1.12 Revision Module, itecture Abbreviation s10 = 0, 1 n for various data formatsan for various = 0, …, 7 0, …, = n 3 0, …, = n = 0, …, 15 = 0, …, n i8 , u5 func.df rd,ws[n] s10 and Vector Byte Word Byte, 8-bit Byte, Word, 32-bit Word, Halfword Data Format Halfword16-bit Doubleword are in bytes and have tobe multipare inbytes and have ents use the same word (“.W” in same word the use ents Data Format Element Index Doubleword, 64-bit Doubleword, is appended tothe instruction name Table 3.8Table Abbreviations Format Data Table 3.9Table Values Element Index Valid func.df wd,ws[n] Volume IV-j: The MIPS64® SIMD Arch . The valid element index values index element . The valid df Table 3.8 Table ore instructions 10-bita take signedoffset ta format abbreviations are case insensitive. case are abbreviations format ta d operand across all destination vector elements. destination vector all d operand across element is being used as a fixe . The vector -bit unsigned or signed value, e.g. -bit unsigned or signed value, N : -bit value where the sign is not relevant, e.g. signthe where is not relevant, -bitvalue Table 3.9 Table N , sN : iN uN assembly language syntax all offsets offsets all language syntax assembly The vector load and st load vector The MSAinstructions of the form • immediate, One of element operands. or threeor two register, instructionsMSA have abbreviations shown in shown abbreviations • shown in shown based on thedataformat same regardless of the instruction’s assumed data type. For example all integer, fixed-point, and floating-point fixed-point, assumed all integer, data type. example For of the instruction’s same regardless instructions operating on 32-bit elem Instructions names da and Instructions 2 3.6.2 Offsets Load/Store 3.6.1 Element Selection Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

56 and , Figure3-29 3.6 Instruction Syntax Figure3-28 0 0 0 itecture Module, Revision 1.12 Revision Module, itecture bitfield to the word offset value (i.e. 3 = 12 / 4). 12 / 3 = (i.e. value offset the word to bitfield cal source, target, and destination data types.target, cal source, s10 0 E 64 63 64 63 64 63 31 e initialized to the word values shown in shown initializede values tothe word . ent-by-element identi with the load word vector instruction vector load word the Figure 3-30Figure GPR $2 Value Source Figure 3-30 have the resulting values of destination $w4, $w5, $w6, vectors and$w7the resulting after exe- values have abcd ABCD Figure 3-29Figure Values $w2 Vector Source Figure 3-28Figure Values $w1 Vector Source a + A + a + B b + C c + D d Volume IV-j: The MIPS64® SIMD Arch 127 127 127 Figure 3-31Figure Instruction ADDV.W for $w5 Value Vector Destination through Figure 3-34 addv.w $w5,$w1,$w2 addv.w $w6,$2 fill.w $w7,$w1,17 addvi.w $w8,$w2[2] splati.w ld.w $w5,12($1) ld.w Figure3-31 cutingsequence the instructions:of wordadditions following and move Regular MSA instructions operate elem MSA instructions Regular GPR $2 is initialized as shown in GPR$2initialized is as shown Let us assume vector registers $w1and $w2 ar registers us assume vector Let For example, the offset by indicated the offset example, For format data word the size of the by 12) (i.e. offset byte the divides assembler The bytes. 12 rather but words, 12 not is (i.e.4), theand generates LD.Wmachine instruction by setting 3.6.3 Instruction Examples MIPS® Architecture for Programmers Programmers MIPS® Architecture for

57 formats twice as Section 3.3 “MSA . 3.7 Instruction Encoding Table 3.12 Table 0 0 0 0 and ting results on data Table 3.11 itecture Module, Revision 1.12 Revision Module, itecture c * C + d D + * C c * e data format of the destination. The data format of the format The of of data data the destination. format e 64 63 64 63 64 63 64 63 is coded by a 2-bit field as shown in Tableinstructions For coded3.10. is by a 2-bitas shown field

and half the width, inthis i.e. word case. halfword, word or doubleword data formats (see doubleword or word halfword, df jacent odd/even source elements genera source odd/even jacent a A * B + b * EEEE BBBB a + 17a + 17 b + 17 c + 17 d Volume IV-j: The MIPS64® SIMD Arch 127 127 127 127 for the destination layout instruction,of such an dot product: i.e. thesigned doubleword ). Internally, the data format Internally, ). Figure 3-32Figure Instruction FILL.W for Value $w6 Vector Destination Figure 3-35Figure Instruction DOTP_S for Value $w9 Vector Destination Figure 3-33Figure Instruction ADDVI.W for Value $w7 Vector Destination Figure 3-34Figure Instruction SPLAT.W for Value $w8 Vector Destination Figure 3-35 dotp_s.d $w9,$w1,$w2 dotp_s.d wide. See Vector Registers” operating onlyon data formats,two the internal incoding is shown Most of theMost of MSAinstructions operate onbyte, source operands is inferred as being also signed inferred as is operands source Note that the actual instruction, e.g. DOTP_S.D, specifies th specifies e.g. DOTP_S.D, that instruction, the actual Note Other MSA instructions operate on ad Other MSAinstructions operate on 3.7.1 Encoding Data Format and Index 3.7 Encoding Instruction MIPS® Architecture for Programmers Programmers MIPS® Architecture for

58 df/n as Table 3.14. 3.7 Instruction Encoding ructionswith data formats other de up to 32 byte elements. Selecting elements. byte to 32 up de are coded as shown in shown coded as are df/m itecture Module, Revision 1.12 Revision Module, itecture Reserved Instruction Exception. Instruction Reserved both formatdata and elementina 6-bitindex field Reserved 01 01 01 ) are used for coding certain inst are used ) Bit 0 Bit 0 Bit 0 Word Doubleword Halfword Word Table 3.8 give the element index value. the elementvalue. index give give the bit index value. bit index the give 01 Byte Word Halfword Doubleword df df df n m Bit 1 ements while the byte data format can co can format data byte the while ements Byte Halfword Word Doubleword 00nnnn 100nnn 1100nn 11100n 01nnnn 101nnn 1101nn 11101n Bits 6…0 Bits Bits 5…0 Bits Bits 5…0 Bits Doubleword Word Halfword Byte 0mmmmmm 10mmmmm 110mmmm 1110mmm an 0, 1, …, 15generates a 1 1 Volume IV-j: The MIPS64® SIMD Arch df/n df/n df/m Table 3.10Table Field Encoding Format Data Two-bit 1. Bits marked as marked Bits 1. 1. Bits marked as Bits marked 1. Table 3.14Table Field Encoding Index Bit and Format Data Table 3.11Table Field Encoding Format Data Halfword/Word . All invalid index values or data formats will generate a Reserved Instruction Exception.dataFor formats willor Reserved generate a values index All. invalid Table 3.12Encoding Field Table Format Data Word/Doubleword Table 3.13Table Field Encoding Index Element and Format Data Table 3.13 Table than byte, halfword, word, or doubleword. or word, halfword, byte, than Ifan instruction bita position, specifies thedata format andbit index The combinations marked Vector (“.V” in Vector combinations marked The shown in shown el has 16 byte register a vector example, any vector byte element other th byte vector any MSA instructions using a specific vector element code vector a specific instructions using MSA MIPS® Architecture for Programmers Programmers MIPS® Architecture for

59 ) ). MSA MSA ). ) coded in bits in bits coded ) ) codedis inbits ) codedinbits ) iscoded) in bits ) is codedinbits Table 3.16 ) is in coded bits ) Table 3.13 Table 3.7 Instruction Encoding Table 3.8 Table Table 3.14 Table 3.11 Table 3.8 ). Table 3.17 value, where the data format is either implicit is either data format the where value, itecture Module, Revision 1.12 Revision Module, itecture ement index and data format df/n ( format data and index ement ate, df where the data format (Table 3.8 COP1opcode 17(see minoropcodes in the MSA major opcode 30 (see immediate bitand data index formatdf/m ( specific instruction formats as follows: as instruction formats specific ) in bits1…0, and theoperation is codedinbit 25andthe minoropcode bits Table 3.8 Table field encodings in the encodings in field Volume IV-j: The MIPS64® SIMD Arch ) coded inbit 21 ) in bits22…21, and the operation codedis inbits 25…23 ):instructions with a 16-bit immediate, where the data format is either implicit or explicitly ): 2-register instructions with a 10-bit immediate a 10-bit with instructions 2-register ): ): instructions with an immediate el an immediate with instructions ): ): 3-register instructions3-register ): with implicit data formats dependingonthe operationscoded in bits ): 3-register ):3-register floating-pointoperations codedor fixed-point inbits 25…22with dataformat df ): 2-register ):2-register floating-point operationscoded in bits25…17 with data format3.11 df (Table ): with an instructions ):instructions with a 10-bitimmedi ): 2-register operations codedin ): bits 25…182-register with dataformat( df ): 3-register operations codedin ): bits 25…233-register with dataformat( df ): instructionswith an 8-bitand immediateeither value implicit data format or data format df ): instructionswith a 5-bit whereimmediatethe data format value, df ( Table 3.8 Table 3.12 , ) coded inbits 25…24 Figure 3-47 Figure 3-44 Figure Figure 3-41 Figure Figure 3-43 Figure 3-42 Figure 3-46 Figure 3-38 Figure 3-39 Figure 3-45 Figure 3-40 Figure3-36 Figure3-37 Table3.11 Table 3.8 Table ( 21…16, thewhere operationis coded in bits25…22 25…21 explicitly codedas or df ( 5…2 codedin bit 16 codedas ( df 17…16 22…21andthe operationin bits 25…23 ( 22…21 22…21and the operationin bits 25…23 22…16, thewhere operationis coded in bits25…23 •3RF ( •3RF •( ( MI10 •2R • Branch ( •2RF ( •2RF •3R ( •3R Each allocated minor opcode is associated allocated Each •I8 ( •I8 •ELM ( •ELM branch instructions use 10 rs All MSA instructions except branches use 40 branches use except MSA instructions All •VEC ( •VEC •BIT ( ( •BIT •I10 •I5 ( •I5 3.7.2 Instruction Formats MIPS® Architecture for Programmers Programmers MIPS® Architecture for

60 minor opcode minor minor opcode minor 3.7 Instruction Encoding wd wd itecture Module, Revision 1.12 Revision Module, itecture ws ws 8556 Figure 3-37Figure I5 Instruction Format Figure 3-36Figure I8 Format Instruction Figure 3-40 Figure Format 3R Instruction Figure Figure 3-39 Format Instruction I10 Figure 3-38 Figure Format BIT Instruction Volume IV-j: The MIPS64® SIMD Arch 2 df i8 ws wdopcode minor operation df wt operation df i10 wdopcode minor operation df/m ws wdopcode minor operation df i5 6325556 3156 632105 63 7 5 5 6 6325556 6 MSA MSA MSA MSA 011110 011110 011110 011110 MSA 011110 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 MIPS® Architecture for Programmers Programmers MIPS® Architecture for

61 minor opcode minor minor opcode minor minor opcode minor 3.7 Instruction Encoding wd wd wd itecture Module, Revision 1.12 Revision Module, itecture 556 ws ws df ws df/n s10 rs wdopcode minor df Figure 3-45 Figure Format 2R Instruction Figure3-42 3RF Instruction Format Figure 3-43 Figure Format VEC Instruction Figure Figure 3-41 Format ELM Instruction Figure 3-44Figure Format Instruction MI10 Volume IV-j: The MIPS64® SIMD Arch operation operation wt ws wdopcode minor operation operation df wt 6105532 6 5555 6 682 646 55 6 6415556 MSA MSA MSA MSA MSA 011110 011110 011110 011110 011110 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 MIPS® Architecture for Programmers Programmers MIPS® Architecture for

62 3.7 Instruction Encoding field the are listed top- along field 6 1 s16 opcode t three bits designating the row, and the and the row, thedesignating bits three t describesthe meaningthe of symbols field this tableencodes.Bits31…29 the of field itecture Module, Revision 1.12 Revision Module, itecture Table 3.15 opcode 556 encoding for the MSA instructions. See Volumes I andof I II encoding for the MSA instructions.See Volumes lumns of thetable.Bits 28…26the of nd binary values are given, with the firs with are given, binary values nd Figure3-46 2RF Instruction Format Figure 3-47Figure Format Instruction Branch operation df ws wdopcode minor Volume IV-j: The MIPS64® SIMD Arch operation df wt value for the instruction labelled EX1 is 33 (decimal, row and column), or 011011 (binary). Sim- (binary). 011011 or column), and row (decimal, 33 is EX1 labelled instruction the for value value for 64 (decimal),EX2 is or 110100value (binary). opcode shows a sample encoding shows tableand the instruction opcode field are listed in the left-most left-most co listed in the are field 6325 691 MSA COP1 010001 011110 last threebits designating the column. encoding is found(bits 31…29) at the intersection An instruction’s and a columnFor row of (bits28…26) value. the instance, ilarly, the ilarly, most rows of the table. Both decimal a rows most This chapter describes the bit encoding tables used for the MSA. the for used bit encoding tables the This chapter describes used in thetables.These tables only list theinstruction thismulti-volume setfor a full encoding ofall instructions. 3.48 Figure opcode 313029282726252423222120191817161514131211109876543210 313029282726252423222120191817161514131211109876543210 3.7.3 Instruction Bit Encoding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

64 BNZ.V encoding or encoding  3.7 Instruction Encoding e partner in selecting partner in selecting e MSA SPECIAL2 encoded with this value, exe- with this value, encoded ration or field codes. ration field or essor to access is not which ss is allowed) or a Coprocessor or a Coprocessor is ss allowed) licensed MIPS partners. To avoid avoid MIPS partners. To licensed oding is not implemented, execut- is not implemented, oding itecture Module, Revision 1.12 Revision Module, itecture chnologies will will th chnologies assist The partner is not required to consult with MIPS with consult required to is not The partner Instruction Exception. If the encoding is If the encoding imple- Exception. Instruction t an EJTAG instruction support and implementation t an EJTAG d avoid using these ope d avoid Meaning Field MSABranch Instructions for rs BZ.V oding is used. If no instruction is instruction no is used. If oding th this symbol are available to available are symbol this th instruction encoding for a coproc for encoding instruction emented, executing such an instruction must cause a Reserved a Reserved must cause such an instruction executing emented, ion encoding as shown in the table. as shown ion encoding eachimplementation. theIf enc coding for coding to which a coprocessor for acce ction definitions, MIPS Te ction definitions, marked with this symbol are reserved for MIPS Application Specific Specific MIPS Application for reserved are symbol this with marked future a from removed be will and obsolete are symbol with this marked of the implementations 2 for Release valid are symbol with this marked Encoding of of Encoding COP1  Volume IV-j: The MIPS64® SIMD Arch COP1 Table 3.16Table Field Opcode of the MIPS64 Encoding 26 coprocessor instruction en instruction coprocessor cuting such an instructionmust cause Instruction a Reserved Exception ( Unusable Exception (coprocessor (coprocessor Exception Unusable ing such an instruction must cause a Reserved instruct the must match it mented, of this encoding for optional is this encoding of allowed). with this symbol represen codes Field marked multiple instru conflicting partner. the by encoding if requested appropriate enc these of one when Technologies Operation or field codes marked wi marked codes or field Operation Operation or field codes or field Operation codes or field Operation Extensions. If the ASE is If the ASE not Extensions. impl Exception. Instruction codes or field Operation shoul Software ISA. of the MIPS64 revision architecture. such Executing an instruction in a Release implementation1 must cause a Reserved Exception. Instruction Table 3.15Table Tables Encoding Instruction the in Used Symbols 21 … 01234567 … 000 001 010 011 100 101 110 111 0 1 2 34567 000 001 010 011 100 101 110 111 bits 28 Table 3.17Table MIPS64  bits 23     29 … 24 Symbol … opcode 3011 4100 5101 6110 7111 0000 1001 2010 rs

bits 31 210 3 11 BZ.B BZ.H BZ.W BZ.D BNZ.B BNZ.H BNZ.W BNZ.D 000 101 bits 25 MIPS® Architecture for Programmers Programmers MIPS® Architecture for

65 2 Reserved 3.7 Instruction Encoding 1 exception or the exception 1 itecture Module, Revision 1.12 Revision Module, itecture data data format Bits 1…0 Bits 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D Minor Opcode Field . MSA LD ST . Table 3.8 81000 91001 Bits 5…2 Bits operation 1. See 1. MSA Disabled generate the MSA will and opcodes rved Section 3.5 “Exceptions” Volume IV-j: The MIPS64® SIMD Arch Table 3.18Table EncodingMIPS of 0 … 01234567 000 001 010 011 100 101 110 111 Table 3.19Table Formats Instruction MI10 for Field of Operation Encoding Bits 2 exception as specified in as specified exception 3 … minor Instruction 0 000 I8 I8 I8 * * * I5 I5 1001* * BIT BIT 23 010 *4 0115101******** 3R 100 3R6110******** * 3R MI107111******** 3R 3R MI10 ELM 3R MI10 3RF 3R MI10 3RF MI10 3R 3RF MI10 3R * MI10 VEC/2R/2RF * MI10 * * Bits 5 Bits 1. The opcodes marked ‘*’ MSA rese are ‘*’ 1. opcodes marked The 2. Includes I10 MIPS® Architecture for Programmers Programmers MIPS® Architecture for

66 3.7 Instruction Encoding 1 data data format 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D itecture Module, Revision 1.12 Revision Module, itecture Bits 22…21 Bits 2 CLTI_U 67 . 000110 000111 MAXI_U Bits 5…0 Bits Table 3.8 11*7111 * 1 001 SUBVI2 010 * MAXI_S3011 CLTI_S 00DV CEQI 0000ADDVI 4 100 MINI_S5 CLEI_S 101 MINI_U *6110 CLEI_U LDI operation Bits 25…23 Bits 1. See 1. I10 instruction format. 2. Volume IV-j: The MIPS64® SIMD Arch Table 3.20Table Format I5 Instruction for Field of Operation Encoding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

67 3.7 Instruction Encoding * * 1 itecture Module, Revision 1.12 Revision Module, itecture data data * format 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D Bits 17…16 Bits * 012 000000 000001 000010 Bits 5…0 0123 00 01 10 11 operation Bits 20…18 Bits 1 001 PCNT 0 000 FILL Bits 22…21 1XR. *01 00 * 2 01 ANDI.B311XORI.B 10 ORI.B BMNZI.B NORI.B BMZI.B SHF.B BSELI.B SHF.H SHF.W operation Bits 25…24 Bits Volume IV-j: The MIPS64® SIMD Arch 0 0001 0012 AND.V 0103 BMNZ.V 0114 OR.V 1005 * BMZ.V 1016 * 110 NOR.V * BSEL.V 2R format * * XOR.V format 2RF * * * * * * * * * * * * * 7 111 * operation Bits 25…23 Bits Table 3.21Table Format I8 Instruction for Field of Operation Encoding Table 3.23 Table Formats Instruction 2R for Field Operation of Encoding Table 3.22 Table Formats Instruction VEC/2R/2RF for Field Operation of Encoding

MIPS® Architecture for Programmers Programmers MIPS® Architecture for

68 3.7 Instruction Encoding itecture Module, Revision 1.12 Revision Module, itecture 1 data data 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D Bit Bit 16 format 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D r 2R Instruction Formats (Continued) Formats r 2R Instruction . 010 NLOC Table 3.8 0 0000 FCLASS 1 0001 FTRUNC_S 2 0010 FTRUNC_U 30011FSQRT 40100FRSQRT 50101FRCP 6 011070111FLOG2 FRINT 8 1000 FEXUPL 91001FEXUPR operation Bits 20…17 Bits 3 011 NLZC 2 4…7 100…111 * 1. See 1. Volume IV-j: The MIPS64® SIMD Arch Table 3.24 Table Formats Instruction 2RF for Field Operation of Encoding Table 3.23Table Field fo Operation of Encoding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

69 1 data data format 00 .B 01 .H 10 .W 11 .D 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D 00 .B 01 .H 10 .W 11 .D Bits 22…21 Bits 3.7 Instruction Encoding *00.B ILVL SLD VSHF SPLAT SRAR PCKEV SRLR PCKOD * * * * * * DOTP_S DOTP_U DPADD_S DPADD_U DPSUB_S HADD_S itecture Module, Revision 1.12 Revision Module, itecture 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D 0.W 1.D . Table 3.12 11 101112 FFQR 1100 FTINT_S 13 1101 FTINT_U 14 1110 FFINT_S 15 1111 FFINT_U 10 1010 FFQL 1. See 1. Volume IV-j: The MIPS64® SIMD Arch Table 3.25Table Format 3R Instruction for Field of Operation Encoding 13 14 15 16 17 18 19 20 21 001101 001110 001111 010000 010001 010010 010011 010100 010101 Bits 5…0 n Table 3.24Table (Continued) Formats 2RF Instruction Field for of Operation Encoding Bits 4 100 BSET MIN_S CLE_S AVE_S ASUB_S DIV_S 3 011 BCLR MAX_U CLT_U ADDS_U SUBSUU_S * 1 001 SRA SUBV * ADDS_A SUBS_U MADDV 2 010 SRL MAX_S CLT_S ADDS_S SUBSUS_U MSUBV 0 000 SLL ADDV CEQ ADD_A SUBS_S MULV 25…23 operatio

MIPS® Architecture for Programmers Programmers MIPS® Architecture for

70 01 .H 10 .W 11 .D 01 .H 10 .W 11 .D 01 .H 10 .W 11 .D 3.7 Instruction Encoding *00.B *00.B *00.B HSUB_S HSUB_U ILVR 1 * DPSUB_U HADD_U itecture Module, Revision 1.12 Revision Module, itecture Bits 21…16 Bits data format 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D 111110 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D Field for ELM Instruction Format Instruction ELM Field for * * * * * * * SLDI SPLATI MOVE.V 111110 COPY_S COPY_U CTCMSA 111110 CFCMSA 111110 operation Bits 25…22 Bits 3 0011 1 0001 2 0010 0 0000 Volume IV-j: The MIPS64® SIMD Arch . Table 3.26Table Operation of Encoding Table 3.8 Table Table 3.25Table (Continued) Format 3R Instruction Field for Operation of Encoding 7 111 BINSR MIN_A * AVER_U * MOD_U * ILVOD 6 110 BINSL MAX_A * AVER_S * MOD_S * ILVEV 5 101 BNEG MIN_U CLE_U AVE_U ASUB_U DIV_U 1. See MIPS® Architecture for Programmers Programmers MIPS® Architecture for

71 3.7 Instruction Encoding 1 1 0 data data Bit 21 Bit format .D.H 1 0 .H 0 .D 1 .H 0 .D 1 .D 1 .D 1 .W 1 .W 1 .W 0 .W 1 .W 0 .W 0 .W 0 .W 0 * * * FCNE FCOR itecture Module, Revision 1.12 Revision Module, itecture MUL_Q FCUNE MSUB_Q MADD_Q 00nnnn 100nnn 1100nn 11100n 11110n 111110 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D 111110 111111 00nnnn100nnn .B 1100nn .H 11100n .W 11110n .D 111110 111111 .D .D .H .D .D .D .D .D .W .W .W .W .W .W .W .W Field for 3RF Instruction Format 3RF Instruction Field for * * * * for ELM Instruction Format (Continued) Format Instruction ELM for * INSVE FDIV INSERT FADD FSUB FMUL FEXP2 FEXDO FMSUB FMADD . .D .D .D .D .D .D .D .D .D .W .W .W .W .W .W .W .W .W 26 27 28 Table 3.13 011010 011011 011100 4 0100 5 0101 Bits 5…0 6…15 0110…1111 1. See Volume IV-j: The MIPS64® SIMD Arch 5 0101 FCULT 4 0100 FCLT 8 1000 FSAF 7 0111 FCULE 6 0110 FCLE 3 0011 FCUEQ 1 00012 FCUN 0010 FCEQ 0 0000 FCAF operation Bits 25…22 Bits Table 3.27Table of Operation Encoding Table 3.26Table Field of Operation Encoding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

72 3.7 Instruction Encoding 0 .H 0 .H 0 .H 0 .W 0 .W 0 .W 0 1 * itecture Module, Revision 1.12 Revision Module, itecture FSUNE Bits 22…16 Bits MULR_Q data format data MSUBR_Q MADDR_Q. 1110mmm .B 1110mmm .B 1110mmm .B 1110mmm .B 1110mmm .B 110mmmm .H 110mmmm .H 110mmmm .H 110mmmm .H 110mmmm .H 10mmmmm .W 10mmmmm .W 10mmmmm .W 10mmmmm .W 10mmmmm .W 0mmmmmm .D 0mmmmmm .D 0mmmmmm .D 0mmmmmm .D 0mmmmmm .D .H .W .W .W .W *FSNE *FSOR FTQ FMIN FMAX . FMIN_A FMAX_A 910 .D.D .D .D .W 1 .W 1 .D.D.D .W .D .D.D 1 .D 1 .W 1 .D 1 .D .D 1 .W .W .W .W .W .W .W 001001 001010 Bits 5…0 Bits Table 3.12 Table and 4 100 BSETI * 3 011 BCLRI SRLRI 2 010 SRLI SRARI 1 001 SRAI SAT_U 0 000 SLLI SAT_S operation Bits 25…23 Bits Volume IV-j: The MIPS64® SIMD Arch Table 3.11 Table 9 1001 FSUN 14 111015 FSLE 1111 FSULE 11 1011 FSUEQ 12 110013 FSLT 1101 FSULT 10 1010 FSEQ 1. See Table 3.28Table Format BIT Instruction Field for of Operation Encoding Table 3.27 Table (Continued) Format Instruction 3RF for Field Operation of Encoding

MIPS® Architecture for Programmers Programmers MIPS® Architecture for

73 3.7 Instruction Encoding itecture Module, Revision 1.12 Revision Module, itecture 1110mmm .B 1110mmm .B 1110mmm .B 110mmmm .H 110mmmm .H 110mmmm .H 10mmmmm .W 10mmmmm .W 10mmmmm .W 0mmmmmm .D 0mmmmmm .D 0mmmmmm .D * * . Table 3.14 7 111 BINSRI 10ISI* 6110BINSLI 5 101 BNEGI Volume IV-j: The MIPS64® SIMD Arch 1. See See 1. Table 3.28Table (Continued) Format Instruction BIT for Field Operation of Encoding MIPS® Architecture for Programmers Programmers MIPS® Architecture for

), Table 4.5 ), bitwise ), load/store and move load/store), and move Table 4.1 PS base architecture. The corre- The architecture. base PS Table 4.8 Table ), floating-point), ( compare operand across all destination vector ele- destination vector all operand across for the for bitwise logical operations. ypical SIMD manner. Few instructions handle instructions Few manner. SIMD ypical Table 4.4 Table itecture Module, Revision 1.12 Revision Module, itecture 74 allinstruction descriptionsis set to128, i.e. the MSA diate value or a specific vector element selected by an element selected vector or a specific value diate ), branch and compare ( ), Instruction Description Instruction tegory and provide individual instructionindividual descriptionsand tegory provide futurereleases of the base architecturespecifications. ) are integral part of the MI part are integral ) Table 4.7 Table ) andnon arithmetic ( of Signed and Unsigned Subtract Signed Unsigned of and Table 4.11 Table ists of integer, fixed-point, and floating-point instructions, all encoded all encoded instructions, and floating-point fixed-point, of integer, ists ted Add ted Values Absolute signed Add Saturated signed Add Horizontal ). or element is being used as a fixed a fixed being used as is element or Table 4.3 ), fixed-point ( fixed-point ), Signed and Un Signed Signed and Un Signed Add and Satura Add Signed and Unsigned Average and Unsigned Signed Absolute Value Value Absolute Signed and Unsigned Average with Rounding Average and Unsigned Signed Add e vector element by vector element in a t in a element by vector element vector e Volume IV-j: The MIPS64® SIMD Arch Table 4.6 Table Table 4.1 Table Integer MSA Arithmetic Instructions ),floating-point arithmetic( Table 4.10 (Table and element permute ), Mnemonic Table 4.2 Table Table 4.9 Table vector register width inbits.register vector arranged in alphabetical order. Theconstant WRLEN used in arrangedinalphabetical order. ments. sections list MSA instructionstwo groupedbynext ca The immediate index. The immediate or vect immediate The immediate index. the elements don’t make sense, e.g. make elements don’t the because bit vectors as the operands certaininstructions, For thesource operand could be animme ( ( floating-pointconversions MSAinstruction setimplements of instructions:categories the following arithmeticinteger ( in the MSA major opcode space. MSA major in the instructions operat MSA Most The MIPS64® SIMD Architecture (MSA) cons MIPS64® SIMD Architecture The ( and DLSA ( LSA instructions add left-shift The spondingdocumentation pages will be incorporatedin the 4.1.1 Category Instruction Set Summary by HADD_S,HADD_U ASUB_S,ASUB_U AVE_S, AVE_U AVE_S, AVER_U AVER_S, ADDS_S, ADDS_U ADDV, ADDVI ADDV, ADD_A,ADDS_A MIPS® Architecture for Programmers Programmers MIPS® Architecture for 4.1 Instruction Set Descriptions The MIPS64® SIMD Architecture Instruction Set Instruction SIMD Architecture The MIPS64® Chapter 4 Chapter

75 4.1 Set Descriptions Instruction itecture Module, Revision 1.12 Revision Module, itecture Instruction Description Instruction Instruction Description Instruction gned Subtract from Unsigned from Subtract gned d Unsigned Subtract d Unsigned signed Saturated Subtract Saturated signed Subtract Horizontal signed Unsigned Product Subtract Dot Unsigned igned and Unsigned AddProduct Dot and Unsigned igned S Multiply-Subtract Logical And Logical If Not Zero Bit Move Subtract Signed and Un Signed Leading One Bits Count Leading Signed and Signed Multiply-Add Signed and Un Signed Saturate Signed Signed and Unsigned Product Dot and Unsigned Signed Maximum of AbsoluteMinimum and Values Bit Clear If Zero Bit Move Bit Select Bit Set Multiply signed Remainder (Modulo) and Unsigned Signed Unsigned Saturated Si Signed and Unsigned Saturate and Unsigned Signed Bit Negate Divide Table 4.2Table Instructions Bitwise MSA Volume IV-j: The MIPS64® SIMD Arch _U Maximum and Unsigned Signed RI Right Left and Bit Insert Table 4.1Table (Continued) MSA Integer Instructions Arithmetic ADD_U Mnemonic Mnemonic BCLR, BCLRI BCLR, BINSLI, BINSR, BINS BINSL, BMNZIBMNZ, BMZI BMZ, BNEGI BNEG, BSELI BSEL, BSETI BSET, NLOC DOTP_S, DOTP_U DOTP_S, DP DPADD_S, SUBV, SUBVI SUBV, AND,ANDI DIV_S,DIV_U MADDV MAX_A,MIN_A MAX_S,MAXI_S, MAX_U,MAXI MINI_UMINI_S, MIN_U, MIN_S, MSUBV Maximum and Unsigned Signed MULV MOD_U MOD_S, SAT_U SAT_S, SUBS_S, SUBS_U SUBSUS_U HSUB_S, HSUB_U HSUB_S, SUBSUU_S DPSUB_S, DPSUB_U DPSUB_S, MIPS® Architecture for Programmers Programmers MIPS® Architecture for

76 4.1 Set Descriptions Instruction itecture Module, Revision 1.12 Revision Module, itecture Instruction Description Instruction Instruction Description Instruction m Minimumand of Absolute Values ting-Point Reciprocal ting-Point Base 2 Exponentiation Base 2 gated Or Ne Floating-Point Fused Multiply-Add and Multiply-Subtract and Fused Multiply-Add Floating-Point Floating-Point Maximu Floating-Point Rounding Shift Right Arithmetic Right Shift Rounding Floating-Point Multiplication Floating-Point Shift Right Arithmetic Right Shift Logical Exclusive Or Exclusive Logical Floating-Point Division Floating-Point Floating-Point Logical Logical Floating-Point Round to Integer Floating-Point of Root Square Approximate Reciprocal Floating-Point Subtraction Floating-Point Floating-Point Addition Floating-Point Base Logarithm 2 Floating-Point Logical Or Logical Logical Right Shift Rounding Population (Bits Set to 1) Count Population (Bits Leading Zero Bits Count Zero Leading Approximate Floa Root Square Floating-Point Shift Logical Right Floating-Point Maximum Minimum and Floating-Point Shift Left Volume IV-j: The MIPS64® SIMD Arch Table 4.2Table (Continued) MSA Bitwise Instructions Table 4.3Table Instructions Arithmetic MSA Floating-Point Mnemonic Mnemonic FRINT FRSQRT FSQRT FSUB NLZC NOR, NORI FADD FDIV FEXP2 FLOG2 FMSUB FMADD, FMIN FMAX, FMIN_A FMAX_A, FMUL FRCP PCNT OR, ORI SLL,SLLI SRAI SRA, SRARI SRAR, SRLISRL, SRLRI SRLR, XOR, XORI MIPS® Architecture for Programmers Programmers MIPS® Architecture for

77 4.1 Set Descriptions Instruction itecture Module, Revision 1.12 Revision Module, itecture Instruction Description Instruction Instruction Description Instruction mpare Less Than or Equal Less Than or mpare Floating-Point Quiet Compare Unordered or Not Equal or Unordered Compare Quiet Floating-Point Floating-Point Quiet Compare Not Equal Not Compare Quiet Floating-Point Than Less or Unordered Compare Quiet Floating-Point Than Less Compare Signaling Floating-Point Floating-Point Quiet Compare Always False Always Compare Quiet Floating-Point Unordered Compare Quiet Floating-Point Ordered Compare Quiet Floating-Point or Equal Than Less or Unordered Compare Quiet Floating-Point Equal or or Less Than Unordered Compare Signaling Floating-Point Floating-Point Quiet Compare Unordered or Equal or Unordered Compare Quiet Floating-Point Equal Compare Signaling Floating-Point Co Signaling Floating-Point Floating-Point Quiet Compare Less Than Less Compare Quiet Floating-Point Unordered Compare Signaling Floating-Point Ordered Compare Signaling Floating-Point Equal Not Compare Signaling Floating-Point Floating-Point Quiet Compare Equal Compare Quiet Floating-Point or Equal Than Less Compare Quiet Floating-Point False Always Compare Signaling Floating-Point or Not Equal Unordered Compare Signaling Floating-Point Floating-Point Signaling Compare Unordered or Equal Unordered Compare Signaling Floating-Point or Less Than Unordered Compare Signaling Floating-Point Floating-Point ClassFloating-Point Mask Volume IV-j: The MIPS64® SIMD Arch Table 4.5Table Instructions Compare Floating-Point MSA Table 4.4 Table Instructions Non Arithmetic Floating-Point MSA Mnemonic Mnemonic FCAF FCUN FCOR FCEQ FCUNE FCUEQ FCNE FCLT FCULT FCLE FCULE FSAF FSUN FSOR FSEQ FSUNE FSUEQ FSNE FSLT FSULT FSLE FSULE FCLASS MIPS® Architecture for Programmers Programmers MIPS® Architecture for

78 4.1 Set Descriptions Instruction onvert Interchange Interchange Format onvert itecture Module, Revision 1.12 Revision Module, itecture nd with Rounding nd Instruction Description Instruction Description Instruction Instruction Description Instruction nd Convert to Signed and Unsigned Integer to Signed and Unsigned Convert nd nd Subtract without and with Rounding without and Subtract nd and Add and without and with Rounding tiply without a tiply Right-Half Floating-Point Convert from Fixed-Point Floating-Point Convert Right-Half Right-Half Floating-Point Up-C Right-Half Less-Than-or-Equal Signed and Unsigned Signed Less-Than-or-Equal Fixed-Point Multiply Fixed-Point a Multiply Fixed-Point Fixed-Point Mul Fixed-Point Floating-Point Truncate a Truncate Floating-Point Floating-Point Convert from Signed and Integer Unsigned from and Signed Convert Floating-Point Floating-Point Down-Convert Interchange Format Down-Convert Floating-Point Convert to Signed and Unsigned Integer and to Signed Round and Convert Floating-Point Branch If Not Zero Branch Left-Halfand Floating-Point Round and Convert to Fixed-Point and Convert Round Floating-Point Branch If Zero Branch Left-Half and Left-Half Compare Equal Volume IV-j: The MIPS64® SIMD Arch Table 4.7Table Instructions MSA Fixed-Point Table 4.8Table Instructions Compare and MSA Branch Table 4.6Table Instructions Conversion MSA Floating-Point Mnemonic Mnemonic Mnemonic FTINT_S, FTINT_U FTINT_S, MADD_Q, MADDR_QMADD_Q, MSUBR_Q MSUB_Q, BNZ BZ CEQI CEQ, CLE_S, CLEI_S, CLE_U, CLEI_U Compare MUL_Q, MULR_Q CLTI_U CLT_U, CLTI_S, CLT_S, Compare Signed Less-Than and Unsigned FEXDO FEXUPL, FEXUPR FFINT_S, FFINT_U FFINT_S, FFQR FFQL, FTRUNC_S, FTRUNC_U FTRUNC_S, FTQ MIPS® Architecture for Programmers Programmers MIPS® Architecture for

79 4.1 Set Descriptions Instruction itecture Module, Revision 1.12 Revision Module, itecture address calculation. Instruction Description Instruction Description Instruction Instruction Description Instruction add or add load/store to GPR Signed and Unsigned GPR Signed and to Insert GPR and Vector element 0 to Vector Element element 0 to Vector GPR and Vector Insert Vector shuffle Vector Load Immediate Load Interleave Even, Odd Even, Interleave Set Shuffle Copy from and copy to MSA Control Register to MSA Control copy from and Copy element Copy Interleave the Left, Right the Left, Interleave left-shift Double LoadVector Move to Vector Vector from GPR Fill Vector Replicate Vector Element ReplicateVector Store Vector Pack Even and Odd Elements Even Pack Element Slide calculation. address load/store or add Left-shift Volume IV-j: The MIPS64® SIMD Arch Table 4.11Table Instructions Base Architecture st of Instructions st of Table 4.10Table Instructions Permute MSA Element Table 4.9Table Instructions and Move MSA Load/Store Mnemonic Mnemonic Mnemonic 4.1.2 Alphabetical Li ILVEV, ILVOD ILVEV, ILVR ILVL, PCKOD PCKEV, SHF SLD, SLDI VSHF LSA DLSA CFCMSA, CTCMSA CFCMSA, LD LDI MOVE SPLATI SPLAT, FILL INSVE INSERT, COPY_U COPY_S, ST MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA . The 0 ws ADD_A.df , 16) , 32) , 64) 3R 010000 , 8) 16i+15..16i 32i+31..32i 64i+63..64i 6 5 es es of the elements in v ector 8i+7..8i wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 80 , 16) + abs(WR[wt] , 32) + abs(WR[wt] , 64) + abs(WR[wt] , ws . df , 8) + abs(WR[wt] 16i+15..16i 32i+31..32i 64i+63..64i 16 15 are added to the absolute valu 8i+7..8i wt abs(WR[ws] abs(WR[ws] abs(WR[ws]    25556  abs(WR[ws] Volume IV-j: The MIPS64® SIMD Arch . n-1...0 n-1..0 wd using the absolute values. absolute the using absolute_value(ws[i]) + absolute_value(wt[i]) absolute_value(ws[i])  000 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 1 then = n-1 26 25 23 22 21 20 wd[i] return -tt return WR[wd] return tt return WR[wd] WR[wd] WR[wd] ADD_A.df ADD_A.df wd,ws,wt ADD_A.B wd,ws,wt ADD_A.H wd,ws,wt ADD_A.W wd,ws,wt ADD_A.D Vector Add Absolute Values Vector endfor else endif endfor endfor endfor if tt 63 MSA ADD_A.H .. WRLEN/16-1 i in 0 for endfunction abs endfunction ADD_A.B .. WRLEN/8-1 i in 0 for ADD_A.W .. WRLEN/32-1 i in 0 for ADD_A.D .. WRLEN/64-1 i in 0 for n) abs(tt, function 011110 Description: Vector addition to vector vector addition to Vector Purpose: The absolute values of the elements in v ector Format: Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The result writtenis tovector 31 Vector Add Absolute Values MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ADD_A.df itecture Module, Revision 1.12 Revision Module, itecture 81 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Add Absolute Values MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA . The 0 ws ADDS_A.df 3R 010000 , 16) , 32) , 64) 6 5 , 8) es es of the elements in v ector wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 82 ws , WR[wt] . df 16i+15..16i 32i+31..32i 64i+63..64i then 8i+7..8i 16 15 are added to the absolute valu n-b+1 wt . 0 

wd adds_a(WR[ws] adds_a(WR[ws] adds_a(WR[ws] n-1..b-1    25556  adds_a(WR[ws] Volume IV-j: The MIPS64® SIMD Arch n-1...0 n-1..0 saturate_signed(absolute_value(ws[i]) + absolute_value(wt[i])) saturate_signed(absolute_value(ws[i])  001 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 1 then = 0 and tt = n-1 n-1 26 25 23 22 21 20 wd[i] return -tt return WR[wd] WR[wd] return tt return WR[wd] WR[wd] ADDS_A.df ADDS_A.df wd,ws,wt ADDS_A.B wd,ws,wt ADDS_A.H wd,ws,wt ADDS_A.W wd,ws,wt ADDS_A.D Vector Saturated Add of Absolute Values Vector endfor endfor endfor if tt else endif if tt endfor 63 MSA ADDS_A.W .. WRLEN/32-1 i in 0 for ADDS_A.D .. WRLEN/64-1 i in 0 for n) abs(tt, function endfunction abs endfunction b) n, sat_s(tt, function ADDS_A.H .. WRLEN/16-1 i in 0 for ADDS_A.B .. WRLEN/8-1 i in 0 for 011110 Vector saturated addition of absolutetovector values. Vector Description: Purpose: The absolute values of the elements in v ector Format: saturated signedresult is written to vector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Saturated Add of Absolute Values Absolute of Add Saturated Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ADDS_A.df itecture Module, Revision 1.12 Revision Module, itecture 83 then n-b+1 1 

b-1 b-1 n-1..b-1 || 0 || 1 Volume IV-j: The MIPS64® SIMD Arch n-b+1 n-b+1 = 1 and tt = n-1 (0 || abs(ts, n)) + (0 || abs(tt, n)) abs(tt, (0 || + n)) (0 || abs(ts, return 0 return return tt return return 1 return endif t  n) n+1, sat_s(t, return endif if tt else endfunction sat_s endfunction tt, n) adds_a(ts, function adds_a endfunction Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector Saturated Add of Absolute Values Absolute of Add Saturated Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ADDS_S.df . 3R wd 010000 , 16) , 32) , 64) 6 5 , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 . Signed arithmetic is performed and overflows , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 84 ws ws , WR[wt] . df 16i+15..16i 32i+31..32i 64i+63..64i then then 8i+7..8i 16 15 n-b+1 n-b+1 1 0  

adds_s(WR[ws] adds_s(WR[ws] adds_s(WR[ws] b-1 b-1 n-1..b-1 n-1..b-1    25556 ting the result as signed value. signed as result the ting  adds_s(WR[ws] || 1 || 0 Volume IV-j: The MIPS64® SIMD Arch are added to the ele ments in vector wt n-b+1 n-b+1 saturate_signed(signed(ws[i]) + signed(wt[i])) saturate_signed(signed(ws[i])  010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 0 and tt = 1 and tt = n-1 n-1 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] return 1 return tt return WR[wd] WR[wd] return 0 return ADDS_S.df ADDS_S.df wd,ws,wt ADDS_S.B wd,ws,wt ADDS_S.H wd,ws,wt ADDS_S.W wd,ws,wt ADDS_S.D Vector Signed SaturatedAdd Signedof Values Vector endfor endfor endfor if tt endif if tt else endif endfor 63 MSA ADDS_S.W .. WRLEN/32-1 i in 0 for ADDS_S.D .. WRLEN/64-1 i in 0 for b) n, sat_s(tt, function endfunction sat_s endfunction ADDS_S.H .. WRLEN/16-1 i in 0 for ADDS_S.B .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Vector addition to vector satura vector addition to Vector Description: Purpose: The in v elements ector Format: and/orclamp smallestbeforeto the largest writing representable signed the resultvalues tovector 31 Vector Signed Saturated Add of Signed Values Signed Add of Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ADDS_S.df itecture Module, Revision 1.12 Revision Module, itecture 85 || tt) n-1 Volume IV-j: The MIPS64® SIMD Arch || ts) + (tt || ts) + n-1 (ts t  return sat_s(t, n+1, n) n+1, sat_s(t, return function adds_s(ts, tt, n) adds_s(ts, function endfunction adds_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Signed Saturated Add of Signed Values Signed Add of Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ADDS_U.df 3R 010000 , 16) , 32) , 64) 6 5 . wd , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 86 . Unsigned arithmetic is performed and overflows ws ws , WR[wt] . df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 value before writing value theresult tovector adds_u(WR[ws] adds_u(WR[ws] adds_u(WR[ws] b    25556 then  adds_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || 1 n-b are added the to elements in vector 0 n-b wt saturate_unsigned(unsigned(ws[i]) + unsigned(wt[i])) saturate_unsigned(unsigned(ws[i])   011 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n-1..b 26 25 23 22 21 20 wd[i] (0 || ts) + (0 || tt) + (0 (0 || ts) return 0 return WR[wd] WR[wd] return tt return WR[wd] WR[wd] ADDS_U.df ADDS_U.df wd,ws,wt ADDS_U.B wd,ws,wt ADDS_U.H wd,ws,wt ADDS_U.W wd,ws,wt ADDS_U.D Vector Unsigned Saturated Add of UnsignedValues Vector endfor endfor endfor if tt else endif t  endfor 63 MSA ADDS_U.W .. WRLEN/32-1 i in 0 for ADDS_U.D .. WRLEN/64-1 i in 0 for b) n, sat_u(tt, function endfunction sat_u endfunction n) tt, adds_u(ts, function ADDS_U.H .. WRLEN/16-1 i in 0 for ADDS_U.B .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: clamp to the largest representable unsigned representable clamp to the largest Vector addition saturatingtovector theresult as unsigned value. Vector Description: Purpose: The elements in v ector Format: 31 Vector Unsigned Saturated Add of Unsigned Values Unsigned of Add Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ADDS_U.df itecture Module, Revision 1.12 Revision Module, itecture 87 Volume IV-j: The MIPS64® SIMD Arch return sat_u(t, n+1, n) n+1, sat_u(t, return endfunction adds_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Unsigned Saturated Add of Unsigned Values Unsigned of Add Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ADDV.df 3R . 001110 wd 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 88 . The result. The is written to vector 8i+7..8i ws ws . df + WR[wt] + WR[wt] + WR[wt] + WR[wt] 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch are added to the elements in vector are wt  ws[i] + wt[i] 000 df wt 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] ADDV.df ADDV.df wd,ws,wt ADDV.B wd,ws,wt ADDV.H wd,ws,wt ADDV.W wd,ws,wt ADDV.D Vector Add Vector endfor endfor endfor endfor 63 MSA ADDV.W .. WRLEN/32-1 i in 0 for ADDV.D .. WRLEN/64-1 i in 0 for ADDV.B .. WRLEN/8-1 i in 0 for ADDV.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector addition to vector. addition to Vector Description: Purpose: vector elements in The Format: The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Add Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA . 0 wd ADDVI.df I5 000110 6 5 wd ws. The resultis written to vector 11 10 itecture Module, Revision 1.12 Revision Module, itecture 89 ws . df + t + t + t + t 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i is added to the elements in vector in elements added to the is WR[ws] WR[ws] WR[ws] u5    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 4..0 4..0 4..0 4..0  ws[i] + u5 000 df u5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || u5 || || u5 || u5 || || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 WR[wd] WR[wd] WR[wd] WR[wd]     ADDVI.df ADDVI.df wd,ws,u5 ADDVI.B wd,ws,u5 ADDVI.H wd,ws,u5 ADDVI.W wd,ws,u5 ADDVI.D Immediate Add endfor endfor endfor endfor for i in 0 .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for 63 MSA ADDVI.W t ADDVI.D t ADDVI.H t ADDVI.B t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate addition to vector. Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: 5-bitThe immediate unsigned value Format: data format in integer values are and results operands The 31 Immediate Add Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 AND.V VEC 011110 6 5 wd in a bitwise logical operation.AND The wt 11 10 itecture Module, Revision 1.12 Revision Module, itecture 90 ws 16 15 corresponding bit of vector wt 21 20 Volume IV-j: The MIPS64® SIMD Arch . wd 00000 is combined with the ws ws AND wt  ws 26 25 wd AND.V AND.V wd,ws,wt AND.V Vector Logical Logical And Vector 6 5555 6 MSA WR[ws] and WR[wt] and  WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each bit of vector Format: Purpose: logical and. by vector Vector Description: The operands and results are bit vector values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector 31 Vector LogicalAnd MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 ANDI.B I8 000000 6 5 wd in a bitwise logical AND operation. The i8 11 10 itecture Module, Revision 1.12 Revision Module, itecture 91 5 5 6 ws 7..0 16 15 ..8i and i8 ..8i and i8 8i+7 ws is combined with the 8-bit immediate Volume IV-j: The MIPS64® SIMD Arch .  WR[ws] wd ws[i] AND i8 AND ws[i]  00 8i+7..8i 26 25 24 23 wd[i] ANDI.B ANDI.B wd,ws,i8 ANDI.B Immediate Logical And Immediate Logical WR[wd] 62 8 MSA endfor for i in 0 .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer byte data format. in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each byte of element vector Format: Purpose: logical and. vector Immediate by Description: 31 Immediate Logical And Logical Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ASUB_S.df 3R 010001 , 16) , 32) , 64) . The absolute value of the 6 5 ws , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 92 ws , WR[wt] . signed elements in v ector df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 es taking the absolute value of the results. of the taking the absolute value es || tt) n-1 asub_s(WR[ws] asub_s(WR[ws] asub_s(WR[ws] . are subtracted from the    wd 25556 wt  asub_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch n-1..0 n-1..0 absolute_value(signed(ws[i]) - signed(wt[i])) absolute_value(signed(ws[i])  || ts) - (tt || ts) - 100 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n-1 = 0 then 26 25 23 22 21 20 wd[i] n (ts return t return WR[wd] WR[wd] return (-t) return WR[wd] WR[wd] ASUB_S.df ASUB_S.df wd,ws,wt ASUB_S.B wd,ws,wt ASUB_S.H wd,ws,wt ASUB_S.W wd,ws,wt ASUB_S.D Vector Absolute Values of Signed Subtract Absolute Values Vector endfor endfor endfor t  else endfor if t 63 MSA ASUB_S.W .. WRLEN/32-1 i in 0 for ASUB_S.D .. WRLEN/64-1 i in 0 for n) tt, asub_s(ts, function endfunction asub_s endfunction ASUB_S.H .. WRLEN/16-1 i in 0 for ASUB_S.B .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Vector subtraction from vector of signed subtractionof valu from vector Vector Description: Purpose: The signed elements in v ector Format: signedresult is written vector to 31 Vector Absolute Values Signed of Subtract MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ASUB_S.df itecture Module, Revision 1.12 Revision Module, itecture 93 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Absolute Values Signed of Subtract MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ASUB_U.df 3R 010001 . The absolute value of the The absolute value . , 16) , 32) , 64) ws 6 5 , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 94 ws , WR[wt] . unsigned elements in vector unsigned df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 asub_u(WR[ws] asub_u(WR[ws] asub_u(WR[ws] . are subtracted from the are subtracted from the    wt wd 25556 of unsigned values taking the absolute value of the results. taking the absolute unsignedofvalue values  asub_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch n-1..0 n-1..0 absolute_value(unsigned(ws[i]) - unsigned(wt[i]))  absolute_value(unsigned(ws[i]) 101 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 0 then 26 25 23 22 21 20 wd[i] n (0 || ts) - (0 || tt) - (0 (0 || ts) return t return WR[wd] WR[wd] return (-t) return WR[wd] WR[wd] ASUB_U.df ASUB_U.df wd,ws,wt ASUB_U.B wd,ws,wt ASUB_U.H wd,ws,wt ASUB_U.W wd,ws,wt ASUB_U.D Vector Absolute Values of Unsigned Subtract Absolute Values Vector endfor endfor endfor t  if t else endfor 63 MSA ASUB_U.W .. WRLEN/32-1 i in 0 for ASUB_U.D .. WRLEN/64-1 i in 0 for n) tt, asub_u(ts, function endfunction asub_s endfunction ASUB_U.H .. WRLEN/16-1 i in 0 for ASUB_U.B .. WRLEN/8-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The Description: Vector subtraction from vector vector subtraction from Vector Purpose: in vector unsigned elements The Format: signedresult is written vector to 31 Vector Absolute Values of Unsigned Subtract Unsigned of Values Absolute Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ASUB_U.df itecture Module, Revision 1.12 Revision Module, itecture 95 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Absolute Values of Unsigned Subtract Unsigned of Values Absolute Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 AVE_S.df 3R 010000 , 16) , 32) , 64) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 96 , WR[wt] , WR[wt] , WR[wt] . The addition is done signed with full precision, i.e. ws ws . , WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 || tt) n-1 ave_s(WR[ws] ave_s(WR[ws] ave_s(WR[ws]    25556  ave_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch are added to the elements in vector wt (ws[i] + wt[i]) / 2 wt[i]) / (ws[i] +  || ts) + (tt || ts) + 100 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n..1 n-1 . wd 26 25 23 22 21 20 wd[i] (ts WR[wd] WR[wd] WR[wd] WR[wd] AVE_S.df AVE_S.df wd,ws,wt AVE_S.B wd,ws,wt AVE_S.H wd,ws,wt AVE_S.W wd,ws,wt AVE_S.D Vector Signed Average Vector endfor endfor endfor t  endfor t return 63 MSA AVE_S.W .. WRLEN/32-1 i in 0 for AVE_S.D .. WRLEN/64-1 i in 0 for n) tt, ave_s(ts, function endfunction ave_s endfunction AVE_S.B .. WRLEN/8-1 i in 0 for AVE_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector average values. signed using the average Vector Description: Purpose: The elements in vector Format: the result has one extra bit. Signed division by 2 (or arithmetic shift right by one bit) is performed before writing the result to vector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Signed Average Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 AVE_U.df 3R 010000 , 16) , 32) , 64) 6 5 wd , 8) 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 97 , WR[wt] , WR[wt] , WR[wt] . The addition is done unsigned with full precision, ws ws . , WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 ave_u(WR[ws] ave_u(WR[ws] ave_u(WR[ws]    25556  ave_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch are added to the elements in vector wt (ws[i] + wt[i]) / 2 wt[i]) / (ws[i] +  . 101 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n..1 wd 26 25 23 22 21 20 wd[i] (0 || ts) + (0 || tt) + (0 (0 || ts) WR[wd] WR[wd] WR[wd] WR[wd] AVE_U.df AVE_U.df wd,ws,wt AVE_U.B wd,ws,wt AVE_U.H wd,ws,wt AVE_U.W wd,ws,wt AVE_U.D Vector Unsigned Average Vector endfor endfor endfor t  t return endfor 63 MSA AVE_U.W .. WRLEN/32-1 i in 0 for AVE_U.D .. WRLEN/64-1 i in 0 for n) tt, ave_u(ts, function endfunction ave_u endfunction AVE_U.B .. WRLEN/8-1 i in 0 for AVE_U.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector average using the unsigned values. average Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The elements in vector Format: i.e. the result has one extra bit. Unsigned division by 2 (or logical shift right by one bit) is performed before writing the result to vector data format in integer values are and results operands The 31 Vector UnsignedAverage MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 AVER_S.df 3R 010000 , 16) , 32) , 64) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 98 . The addition of the elements plus 1 (for rounding) is rounding) plus 1 (for of the elements addition The . ws , WR[wt] ws . bit. Signed division by 2 (or arithmetic shift right by one df 16i+15..16i 32i+31..32i 64i+63..64i . 8i+7..8i wd 16 15 || tt) + 1 || tt) + n-1 aver_s(WR[ws] aver_s(WR[ws] aver_s(WR[ws]    25556  aver_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch ing the signed values. ing the are added to the elements in vector elements are added to the wt (ws[i] + wt[i] + 1) / 2 wt[i] + (ws[i] +  || ts) + (tt || ts) + 110 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n..1 n-1 26 25 23 22 21 20 wd[i] (ts WR[wd] WR[wd] WR[wd] WR[wd] AVER_S.df AVER_S.df wd,ws,wt AVER_S.B wd,ws,wt AVER_S.H wd,ws,wt AVER_S.W wd,ws,wt AVER_S.D Vector Signed Average Rounded Signed Average Vector endfor endfor endfor t  endfor t return 63 MSA AVER_S.W .. WRLEN/32-1 i in 0 for AVER_S.D .. WRLEN/64-1 i in 0 for n) tt, ave_s(ts, function endfunction aver_s endfunction AVER_S.B .. WRLEN/8-1 i in 0 for AVER_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector average rounded us rounded average Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: elements in vector The Format: data format in integer values are and results operands The bit) is performed beforewriting theresult tovector done signed with full precision, i.e. the result has one extra 31 Vector Signed Average Rounded Average Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 AVER_U.df 3R 010000 , 16) , 32) , 64) 6 5 , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 99 . The addition of the elements plus 1 (for rounding) is rounding) plus 1 (for of the elements addition The . ws , WR[wt] ws . . df 16i+15..16i 32i+31..32i 64i+63..64i wd 8i+7..8i 16 15 aver_u(WR[ws] aver_u(WR[ws] aver_u(WR[ws]    25556  aver_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch ing the unsigned values. ing the are added to the elements in vector elements are added to the wt (ws[i] + wt[i] + 1) / 2 wt[i] + (ws[i] +  111 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n..1 26 25 23 22 21 20 wd[i] (0 || ts) + (0 || tt) + 1 || tt) + + (0 (0 || ts) WR[wd] WR[wd] WR[wd] WR[wd] AVER_U.df AVER_U.df wd,ws,wt AVER_U.B wd,ws,wt AVER_U.H wd,ws,wt AVER_U.W wd,ws,wt AVER_U.D Vector Unsigned Average Rounded Unsigned Average Vector endfor endfor endfor t  t return endfor 63 MSA AVER_U.W .. WRLEN/32-1 i in 0 for AVER_U.D .. WRLEN/64-1 i in 0 for n) tt, ave_u(ts, function endfunction aver_u endfunction AVER_U.H .. WRLEN/16-1 i in 0 for AVER_U.B .. WRLEN/8-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector average rounded us rounded average Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: elements in vector The Format: data format in integer values are and results operands The done unsigned with full precision, i.e. the result has one extra bit. Unsigned division by 2 (or logical shift right by one bit) is performedbefore writing theresult tovector 31 Vector Unsigned Average Rounded Average Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BCLR.df I modulo the size 3R wt 001101 6 5 ) ) ) t t t wd ) t || 0 || 1 || 0 || 1 || 0 || 1 || 11 10 15-t 31-t 63-t itecture Module, Revision 1.12 Revision Module, itecture 100 || 0 || 1 || ws 7-t . df and (1 and (1 and (1 . wd . The bit position by is the elementsgiven in and (1 and 16 15 ws 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i in each element of vector tion clear in each element. bit_clear(ws[i], wt[i])  bit_clear(ws[i], 011 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] t  t  WR[wd] t  WR[wd] t  WR[wd] WR[wd] BCLR.df BCLR.df wd,ws,wt BCLR.B wd,ws,wt BCLR.H wd,ws,wt BCLR.W wd,ws,wt BCLR.D Vector Bit Clear Vector endfor endfor endfor endfor 63 MSA BCLR.W .. WRLEN/32-1 i in 0 for BCLR.D .. WRLEN/64-1 i in 0 for BCLR.H .. WRLEN/16-1 i in 0 for BCLR.B .. WRLEN/8-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: of the element in bits. The result is written to vector result is The bits. the element in of data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector selected bit posi selected bit Vector Description: Purpose: Clear (set to 0) one bit Format: 31 Vector Bit Clear Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BCLRI.df I BIT modulo the size of 001001 m 6 5 ) ) ) t t t wd ) t || 0 || 1 || 0 || 1 || 0 || 1 || 11 10 15-t 31-t 63-t itecture Module, Revision 1.12 Revision Module, itecture 101 || 0 || 1 || ws 7-t . df and (1 and (1 and (1 . and (1 and . The bit position is given by the immediate . Thebitposition is given 16 15 wd ws 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i df/m WR[ws] WR[ws] WR[ws]     WR[ws] Volume IV-j: The MIPS64® SIMD Arch ion clear in each element. clear ion bit_clear(ws[i], m)  bit_clear(ws[i], 011 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     BCLRI.df BCLRI.df wd,ws,m BCLRI.B wd,ws,m BCLRI.H wd,ws,m BCLRI.W wd,ws,m BCLRI.D Immediate Bit Clear Immediate Bit endfor .. WRLEN/32-1 i in 0 for endfor .. WRLEN/64-1 i in 0 for endfor endfor .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for 63 7 5 5 6 MSA BCLRI.W t BCLRI.D t BCLRI.H t BCLRI.B t 011110 Restrictions: No data-dependentare possible.exceptions Operation: the element inbits. The result is written tovector data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate selected bit posit Purpose: Description: each element of vector one bit in 0) (set to Clear Format: 31 Immediate Bit Clear Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BINSL.df I 3R 001101 6 5 while preserving the least sig- modulo the size of the element in of the element size the modulo wd wt wd 16i+15-t-1..16i 32i+31-t-1..32i 64i+63-t-1..64i 8i+7-t-1..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 102 || WR[wd] || WR[wd] || WR[wd] ws vectorto elements in . ws df || WR[wd] 16 15 16i+15..16i+15-t 32i+31..32i+31-t 64i+63..64i+63-t 8i+7..8i+7-t WR[ws] WR[ws] WR[ws]    25556 ts in each element of vector  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i bit_insert_left(wd[i], ws[i], wt[i]) ws[i], bit_insert_left(wd[i],  110 df wt 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] t  WR[wd] t  WR[wd] t  t  BINSL.df BINSL.df wd,ws,wt BINSL.B wd,ws,wt BINSL.H wd,ws,wt BINSL.W wd,ws,wt BINSL.D Vector Bit Insert Left endfor endfor endfor endfor 63 MSA BINSL.D .. WRLEN/64-1 i in 0 for BINSL.W .. WRLEN/32-1 i in 0 for BINSL.B .. WRLEN/8-1 i in 0 for BINSL.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector selectedleft most while bits copy preserving destinationright bits. Vector Description: Purpose: (left) bi most Copy significant Format: nificant (right) bits. The number of bits to copy is given by the elements in vector in elements by the is given copy to of bits number The bits. (right) nificant bitsplus 1. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Bit Insert Left Insert Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BINSLI.df I BIT the element in bits 001001 6 5 while preserving the least sig- wd wd modulo the size of 16i+15-t-1..16i 32i+31-t-1..32i 64i+63-t-1..64i m 8i+7-t-1..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 103 || WR[wd] || WR[wd] || WR[wd] ws vectorto elements in . ws df || WR[wd] 16 15 16i+15..16i+15-t 32i+31..32i+31-t 64i+63..64i+63-t 8i+7..8i+7-t to copy is by given the immediate df/m whilepreserving destination rightbits. WR[ws] WR[ws] WR[ws]    ts in each element of vector  WR[ws] Volume IV-j: The MIPS64® SIMD Arch bit_insert_left(wd[i], ws[i], m) ws[i], bit_insert_left(wd[i],  110 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     BINSLI.df BINSLI.df wd,ws,m BINSLI.B wd,ws,m BINSLI.H wd,ws,m BINSLI.W wd,ws,m BINSLI.D Immediate BitInsert Left endfor endfor endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for .. WRLEN/16-1 i in 0 for 63 7 5 5 6 MSA BINSLI.D t BINSLI.W t BINSLI.B t BINSLI.H t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate selected left most bits copy Immediate bits selected left most copy Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: (left) bi most Copy significant Format: plus1. data format in integer values are and results operands The nificant nificant (right) bits. The number of bits 31 Immediate Bit Insert Left Insert Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BINSR.df I 3R 001101 6 5 while preserving the most sig- the while preserving wd modulo the size of the element in wt wd 16i+t..16i 32i+t..32i 64i+t..64i 8i+t..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 104 || WR[ws] || WR[ws] || WR[ws] || ws to elements in vector . ws df || WR[ws] 16 15 16i+15..16i+t+1 32i+31..32i+t+1 64i+63..64i+t+1 8i+7..8i+t+1 copy is given by the elements in vector is given copy WR[wd] WR[wd] WR[wd] whilepreserving destination left bits.    25556 its in each element of vector its  WR[wd] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i bit_insert_right(wd[i], ws[i], wt[i]) ws[i], bit_insert_right(wd[i],  111 df wt 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] t  WR[wd] t  WR[wd] t  t  BINSR.df BINSR.df wd,ws,wt BINSR.B wd,ws,wt BINSR.H wd,ws,wt BINSR.W wd,ws,wt BINSR.D Vector Bit Insert Right Vector endfor endfor endfor endfor 63 MSA BINSR.D .. WRLEN/64-1 i in 0 for BINSR.W .. WRLEN/32-1 i in 0 for BINSR.B .. WRLEN/8-1 i in 0 for BINSR.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: bitsplus 1. data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector selected right most bits copy bits copy most selected right Vector Description: Purpose: (right) b least significant Copy Format: (left) bits. nificant The number of bits to 31 Vector Bit Insert Right Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BINSRI.df I BIT 001001 6 5 while preserving the most sig- the while preserving wd wd 16i+t..16i 32i+t..32i 64i+t..64i modulo the size of the element in bits plus element in bits size of the the modulo m 8i+t..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 105 || WR[ws] || WR[ws] || WR[ws] || ws to elements in vector . ws df || WR[ws] 16 15 16i+15..16i+t+1 32i+31..32i+t+1 64i+63..64i+t+1 8i+7..8i+7+t+1 df/m to copy is given by the immediate the by given is copy to WR[wd] WR[wd] WR[wd]    its in each element of vector its  WR[wd] Volume IV-j: The MIPS64® SIMD Arch bit_insert_right(wd[i], ws[i], m) ws[i], bit_insert_right(wd[i],  111 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     BINSRI.df BINSRI.df wd,ws,m BINSRI.B wd,ws,m BINSRI.H wd,ws,m BINSRI.W wd,ws,m BINSRI.D Immediate Bit Insert Right Insert Immediate Bit endfor endfor endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for .. WRLEN/16-1 i in 0 for 63 7 5 5 6 MSA BINSRI.D t BINSRI.W t BINSRI.B t BINSRI.H t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate rightselected most whilebits copy preserving destination leftbits. Description: Purpose: (right) b least significant Copy Format: bits of The number bits.(left) nificant 1. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Bit Insert Right Insert Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

wt MSA 0 BMNZ.V I VEC 011110 bits from target vector 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 106 ws for which the corresponding ws 16 15 for which the corresponding target bits for which are 0. the correspondingtarget wt 21 20 all bits from source v ector Volume IV-j: The MIPS64® SIMD Arch wd 00100 (ws AND wt) OR (wd AND NOT wt) AND OR (wd AND wt)  (ws 26 25 wd BMNZ.V BMNZ.V wd,ws,wt BMNZ.V Vector Bit Move IfNot Zero Bit Move Vector 6 5555 6 MSA (WR[ws] and WR[wt]) or (WR[wd] and not WR[wt]) and or (WR[wd] WR[wt]) and  (WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Copy Copy to destination vector Format: Purpose: mask-based bitsoncopy the condition maskbeing set. Vector Description: are 1 and leaves unchangedall destinationare 1 and leaves bits values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Bit Move If Not Zero Not If Move Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA are 1 are i8 0 BMNZI.B I I8 000001 ) 7..0 6 5 wd and not i8 and 11 10 itecture Module, Revision 1.12 Revision Module, itecture 107 8i+7..8i 5 5 6 ws for which the corresponding bits from immediatefrom correspondingbits the which for ws 16 15 ) or (WR[wd] 7..0 r whichthe correspondingimmediate bits are 0. i8 and i8 and values in integer bytedata format.values all bits from source vector bits from source all Volume IV-j: The MIPS64® SIMD Arch 8i+7..8i wd (ws[i] AND i8) OR (wd[i] AND NOT i8) AND NOT (wd[i] i8) OR AND (ws[i]  00 26 25 24 23 wd[i] BMNZI.B BMNZI.B wd,ws,i8 BMNZI.B Immediate Bit Move If NotImmediate Zero BitMove 62 8 MSA WR[wd]  (WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Copy to destination vector destination to Copy Format: Purpose: Immediate mask-based bitscopy on the condition beingmask set. Description: The operands and results are vector vector are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: and leaves unchangedall destinationand leaves bits fo 31 Immediate Bit Move If Zero If Not Move Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

wt MSA 0 BMZ.V I VEC 011110 bits from target vector 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 108 ws for which the corresponding ws 16 15 for which the corresponding target bits for which are 1. the correspondingtarget wt 21 20 conditionthe maskbeing clear. all bits from source v ector Volume IV-j: The MIPS64® SIMD Arch wd 00101 (ws AND NOT wt) OR (wd AND wt) (wd wt) OR AND NOT  (ws 26 25 wd BMZ.V BMZ.V wd,ws,wt BMZ.V Vector Bit Move If Zero 6 5555 6 MSA (WR[ws] and not WR[wt]) or (WR[wd] and WR[wt]) (WR[wd] or not WR[wt]) and  (WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Copy Copy to destination vector Format: Purpose: on bits copy mask-based Vector Description: are 0 and leaves unchangedall destinationare 0 and leaves bits values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Bit Move If Zero If Move Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA are 0 are i8 0 IBMZI.B I8 000001 6 5 wd ) 7..0 11 10 itecture Module, Revision 1.12 Revision Module, itecture 109 5 5 6 ws for which the corresponding bits from immediatefrom correspondingbits the which for ws 16 15 ) or (WR[wd] and i8 (WR[wd] ) or r whichthe correspondingimmediate bits are 1. 7..0 i8 values in integer bytedata format.values all bits from source vector bits from source all Volume IV-j: The MIPS64® SIMD Arch wd (ws[i] AND NOT i8) OR (wd[i] AND i8) AND OR (wd[i] NOT i8) AND (ws[i]  01 26 25 24 23 wd[i] BMZI.B BMZI.B wd,ws,i8 BMZI.B Immediate Bit Move If ZeroImmediate BitMove 62 8 MSA (WR[ws] and not i8 and  (WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Copy to destination vector destination to Copy Format: Purpose: Immediate mask-basedbitsonthe copy condition mask being clear. Description: The operands and results are vector vector are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: and leaves unchangedall destinationand leaves bits fo 31 Immediate Bit Move If Zero If Move Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 modulo the modulo BNEG.df I wt 3R 001101 6 5 ) ) ) t t t wd ) t || 1 || 0 || 1 || 0 || 1 || 0 || 11 10 15-t 31-t 63-t itecture Module, Revision 1.12 Revision Module, itecture 110 || 1 || 0 || ws 7-t . . df xor (0 xor (0 xor (0 . The bit position is given by the elements in the elements by is given position bit The . wd ws xor (0 xor 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch t in each element of vector in each t 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i n negate in each element. n negate bit_negate(ws[i], wt[i])  bit_negate(ws[i], 101 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] t  t  WR[wd] t  WR[wd] t  WR[wd] WR[wd] BNEG.df BNEG.df wd,ws,wt BNEG.B wd,ws,wt BNEG.H wd,ws,wt BNEG.W wd,ws,wt BNEG.D Vector Bit Negate Vector endfor endfor endfor endfor 63 MSA BNEG.W .. WRLEN/32-1 i in 0 for BNEG.D .. WRLEN/64-1 i in 0 for BNEG.H .. WRLEN/16-1 i in 0 for BNEG.B .. WRLEN/8-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Description: Vector selected bit positio selected bit Vector Purpose: bi one (complement) Negate Format: size ofsize the elementin bits. The result is writtentovector data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Bit Negate Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 modulo the BNEGI.df I m BIT 001001 6 5 ) ) ) t t t wd ) t || 1 || 0 || 1 || 0 || 1 || 0 || 11 10 15-t 31-t 63-t itecture Module, Revision 1.12 Revision Module, itecture 111 || 1 || 0 || ws 7-t . . . The bit positionby the is given immediate df xor (0 xor (0 xor (0 wd ws xor (0 xor 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i df/m WR[ws] WR[ws] WR[ws]     WR[ws] n negate in each element. in each negate n Volume IV-j: The MIPS64® SIMD Arch bit_negate(ws[i], m)  bit_negate(ws[i], 101 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     BNEGI.df BNEGI.df wd,ws,m BNEGI.B wd,ws,m BNEGI.H wd,ws,m BNEGI.W wd,ws,m BNEGI.D Immediate Bit Negate Immediate Bit endfor .. WRLEN/32-1 i in 0 for endfor .. WRLEN/64-1 i in 0 for endfor endfor .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for 63 7 5 5 6 MSA BNEGI.W t BNEGI.D t BNEGI.H t BNEGI.B t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate selected bit positio Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: (complement) one bit in Negate each element of vector Format: ofsize the elementin bits. The result is writtentovector data format in integer values are and results operands The 31 Immediate Bit Negate Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BNZ.df I 6 || 0^^2 1 s16 9..0 lay slot of a branch or of jump. lay a branch slot itecture Module, Revision 1.12 Revision Module, itecture 112 || offset GPRLEN-12 ) 9 16 15 if a branch is placed in the de in branch is placed if a is a PC word offset, i.e. signedis a PC word count offset, of32-bit instructions, from thePC (offset s16 0 for all i, s16) all i, 0 for s16) all i, 0 for s16) all i, 0 for are not zero. are not    0 for all i, s16) 0 for wt  l Elements AreZero l Elements Not l destination elements are not zero. are not elements l destination Volume IV-j: The MIPS64® SIMD Arch 0 for all i then branch PC-relative s16 branch PC-relative i then 0 for all  16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i UNPREDICTABLE 111 df wt 26 25 23 22 21 20 if wt[i] target_offset  target_offset I: + target_offset PC  PC I+1: BNZ.df BNZ.df wt,s16 BNZ.B wt,s16 BNZ.H wt,s16 BNZ.W wt,s16 BNZ.D Immediate Branch If Al Immediate Branch If if cond then 6325 BNZ.H branch(WR[wt] BNZ.W branch(WR[wt] BNZ.D branch(WR[wt] offset) branch(cond, function endif branch endfunction BNZ.B branch(WR[wt] COP1 010001 Immediate PC offset branch if al branch if offset Immediate PC Description: Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved of thedelayof slot. Restrictions: operation is Processor The branch instruction has a delay slot. The Purpose: if in all elements branch PC-relative Format: 31 Immediate Branch If All Elements Are Not Zero Not Are Elements All If Branch Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 BNZ.V I 6 || 0^^2 1 s16 not zero regardless of the data format. data the of regardless zero not 9..0 lay slot of a branch or of jump. lay a branch slot itecture Module, Revision 1.12 Revision Module, itecture 113 || offset GPRLEN-12 ) One Element of Any Format Is Not Zero) Is Not Format Any Element of One 9 16 15 if a branch is placed in the de in branch is placed if a is a PC word offset, i.e. signedis a PC word count offset, of32-bit instructions, from thePC wt is not zero, i.e at least one element is not zero, i.e at least one element is (offset s16 wt 21 20 destination vector is not zero. not is vector destination 0, s16) Volume IV-j: The MIPS64® SIMD Arch  01111 0 then branch PC-relative s16 PC-relative branch 0 then UNPREDICTABLE  26 25 if wt target_offset  target_offset I: PC + target_offset PC  PC I+1: BNZ.V BNZ.V wt,s16 BNZ.V Immediate Branch If Not Zero (AtNot Immediate Branch If Zero Least if cond then 655 function branch(cond, offset) branch(cond, function endif branch endfunction branch(WR[wt] branch(WR[wt] COP1 010001 Operation: The branch instruction has a delay slot. The thedelayof slot. Restrictions: operation is Processor Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate PC offset branch if offset Immediate PC Description: PC-relative branch if at least one bit in at least if branch PC-relative Format: Purpose: 31 Immediate Branch If Not Zero (At Least One Element of Any Format Is Zero) Is Not of Any Format Element One Least (At Zero Not If Branch Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 IBSEL.V VEC 011110 6 5 based on the corresponding bit wd wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 114 ws . wt into destination vector wt 16 15 and ws wt 21 20 , if , 1 copies thebit from ws Volume IV-j: The MIPS64® SIMD Arch from two source vectors selected from vectors by thesource two bit mask value 00110 from vthe source ectors (ws AND NOT wd) OR (wt AND wd) (wt wd) OR AND NOT  (ws 26 25 wd BSEL.V BSEL.V wd,ws,wt BSEL.V Vector Bit Select Vector 6 5555 6 :if 0 copies thebit from MSA (WR[ws] and not WR[wd]) or (WR[wt] and WR[wd]) (WR[wt] or not WR[wd]) and  (WR[ws] WR[wd] 011110 wd Restrictions: values. vector bit are and results operands The Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Description: Selectively copy bits Format: Purpose: bits copy mask-based Vector in 31 Vector Bit Select Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 BSELI.B I the based on wd I8 ) 000001 8i+7..8i 6 5 wd into destination vector i8. and WR[wd] ws 7..0 11 10 itecture Module, Revision 1.12 Revision Module, itecture 115 5 5 6 ws ) or (i8 source vand ector i8 8i+7..8i 16 15 , if 1 copies the bit from ws i8 and not WR[wd] and Volume IV-j: The MIPS64® SIMD Arch the the 8-bit immediate  : if 0 copiesthebit from 8i+7..8i wd 10 (ws AND NOT wd) OR (i8 AND wd) (i8 wd) OR AND NOT  (ws 8i+7..8i 26 25 24 23 wd (WR[ws] BSELI.B BSELI.B wd,ws,i8 BSELI.B Immediate Bit Select Immediate Bit WR[wd] 62 8 MSA endfor for i in 0 .. WRLEN/8-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Selectively Selectively copy bits from Format: Purpose: mask value by the bit selected source vectors from two bits copy Immediate mask-based Description: correspondingbit in Restrictions: values. vector bit are and results operands The Operation: 31 Immediate Bit Select Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BSET.df I 3R 001101 modulo the size of the wt 6 5 ) ) ) t t t wd ) t || 1 || 0 || 1 || 0 || 1 || 0 || 11 10 itecture Module, Revision 1.12 Revision Module, itecture 116 15-t 31-t 63-t || 1 || 0 || ws . 7-t df or (0 or (0 or (0 or (0 or 16 15 . 16i+15..16i 32i+31..32i 64i+63..64i wd . The bit position is given by the elements in 8i+7..8i ws WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i tion set in each element. in each tion set bit_set(ws[i],  bit_set(ws[i], wt[i]) 100 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] t  t  WR[wd] t  WR[wd] WR[wd] WR[wd] t  BSET.df BSET.df wd,ws,wt BSET.B wd,ws,wt BSET.H wd,ws,wt BSET.W wd,ws,wt BSET.D Vector Bit Set Vector endfor endfor endfor endfor 63 MSA BSET_S.W .. WRLEN/32-1 i in 0 for BSET_S.D .. WRLEN/64-1 i in 0 for BSET_S.B .. WRLEN/8-1 i in 0 for BSET_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector selected bit posi selected bit Vector Description: Purpose: Set to 1 one bit in each element of v ector Format: element in bits. The result writtenis tovector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Bit Set Bit Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BSETI.df I BIT 001001 . The result is written to m 6 5 ) ) ) t t t wd ) t || 1 || 0 || 1 || 0 || 1 || 0 || 11 10 itecture Module, Revision 1.12 Revision Module, itecture 117 15-t 31-t 63-t || 1 || 0 || ws . 7-t df or (0 or (0 or (0 or (0 or 16 15 16i+15..16i 32i+31..32i 64i+63..64i . The bit position is given by the immediate 8i+7..8i ws df/m WR[ws] WR[ws] WR[ws]     WR[ws] Volume IV-j: The MIPS64® SIMD Arch ition set in each set element. in ition bit_set(ws[i],  bit_set(ws[i], m) 100 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     BSETI.df BSETI.df wd,ws,m BSETI.B wd,ws,m BSETI.H wd,ws,m BSETI.W wd,ws,m BSETI.D Immediate Bit Set Immediate Bit . endfor .. WRLEN/32-1 i in 0 for endfor .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor .. WRLEN/16-1 i in 0 for wd 63 7 5 5 6 MSA BSETI_S.W t BSETI_S.D t BSETI_S.B t BSETI_S.H t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate selected bit pos Description: Purpose: Set to 1 one bit in each element of vector Format: Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The vector 31 Immediate Bit Set Bit Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 BZ.df I 6 || 0^^2 1 s16 9..0 lay slot of a branch or of jump. lay a branch slot itecture Module, Revision 1.12 Revision Module, itecture 118 || offset GPRLEN-12 ) 9 16 15 is zero. is if a branch is placed in the de in branch is placed if a is a PC word offset, i.e. signedis a PC word count offset, of32-bit instructions, from thePC wt = 0, s16) = 0, s16) = 0, s16) = (offset s16 = 0, s16) Least One Element Is Zero 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i Volume IV-j: The MIPS64® SIMD Arch UNPREDICTABLE 110 df wt 26 25 23 22 21 20 if wt[i] = 0 for some i then branch PC-relative s16 PC-relative branch some i then = 0 for if wt[i] branch(WR[wt] branch(WR[wt]  target_offset I: branch(WR[wt] branch(WR[wt] + target_offset PC  PC I+1: BZ.df BZ.df wt,s16 BZ.B wt,s16 BZ.H wt,s16 BZ.W wt,s16 BZ.D Immediate Branch If At Immediate Branch If endfor endfor endfor if cond then endfor 6325 BZ.W .. WRLEN/32-1 i in 0 for BZ.D .. WRLEN/64-1 i in 0 for offset) branch(cond, function BZ.H .. WRLEN/16-1 i in 0 for endif branch endfunction BZ.B .. WRLEN/8-1 i in 0 for COP1 010001 Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved of thedelayof slot. Restrictions: operation is Processor least one destination element is zero. is element one destination at least branch if offset Immediate PC Description: branch instruction has a delay slot. The Purpose: if at least one element in branch PC-relative Format: 31 Immediate Branch If At Least One Element Is Is Zero Element One Least At If Branch Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 BZ.V I 6 || 0^^2 1 s16 9..0 lay slot of a branch or of jump. lay a branch slot itecture Module, Revision 1.12 Revision Module, itecture 119 || offset e zero regardless of the data format. the data of regardless e zero GPRLEN-12 ) 9 16 15 if a branch is placed in the de in branch is placed if a is a PC word offset, i.e. signedis a PC word count offset, of32-bit instructions, from thePC wt (offset s16 ro (All Elements of Any Format Are Zero) Are Format of Any (All Elements ro 21 20 Volume IV-j: The MIPS64® SIMD Arch bits are zero, i.e. all elements ar wt 01011 UNPREDICTABLE 26 25 if wt = 0 then branch PC-relative s16 PC-relative then branch if wt = 0 PC + target_offset PC  PC I+1: target_offset  target_offset I: BZ.V BZ.V wt,s16 BZ.V Immediate Branch If Ze Immediate Branch If if cond then 655 endif branch endfunction = 0, s16) branch(WR[wt] offset) branch(cond, function COP1 010001 Operation: The branch instruction has a delay slot. The thedelayof slot. Restrictions: operation is Processor Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved PC-relative branch if all branch PC-relative Format: Purpose: zero. is vector destination branch if offset Immediate PC Description: 31 Immediate Branch If Zero (All Elements of Any Format Are Zero) of Any Format Elements (All Zero If Branch Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CEQ.df I 3R 001111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 120 elements are equal, otherwise set all bits to 0. to all bits set elements are equal, otherwise ws . wt df and ws 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i all destination bits are set, otherwise clear. destination bits are all = WR[wt] 64 = WR[wt] 16 = WR[wt] 32 c c c = WR[wt] 8    25556  c Volume IV-j: The MIPS64® SIMD Arch 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i (ws[i] = wt[i]) (ws[i] =  000 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i elements if the corresponding elements if WR[ws] WR[ws] WR[ws] WR[ws] wd 26 25 23 22 21 20 wd[i] WR[wd] c  c  WR[wd] c  WR[wd] c  WR[wd] CEQ.df CEQ.df wd,ws,wt CEQ.B wd,ws,wt CEQ.H wd,ws,wt CEQ.W wd,ws,wt CEQ.D Vector Compare Equal Vector endfor endfor endfor endfor 63 MSA CEQ.D .. WRLEN/64-1 i in 0 for CEQ.W .. WRLEN/32-1 i in 0 for CEQ.B .. WRLEN/8-1 i in 0 for CEQ.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector to vector compare for equality; if true vector to Vector Description: Purpose: Setall bits to1 in Format: 31 Vector Compare Equal MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CEQI.df I I5 are equal, other- s5 000111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 121 ws . df element and the 5-bit signed immediate ws 16 15 = t 32 = t 64 = t 16 c c c = t 8    4..0 4..0 4..0 25556 4..0  c Volume IV-j: The MIPS64® SIMD Arch 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i are for equality;if trueall destinationbits otherwiseare set, clear. || s5 || s5 || s5 || s5 (ws[i] = s5) (ws[i] =  elements if the corresponding 59 11 27 3 000 df s5 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i ) ) ) ) 4 4 4 4 wd WR[ws] WR[ws] WR[ws] WR[ws] 26 25 23 22 21 20 wd[i] (s5 (s5 (s5 (s5 WR[wd] c  c  c  c  WR[wd] WR[wd] WR[wd]     CEQI.df CEQI.df wd,ws,s5 CEQI.B wd,ws,s5 CEQI.H wd,ws,s5 CEQI.W wd,ws,s5 CEQI.D Immediate Compare Equal Immediate Compare for i in 0 .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/32-1 i in 0 for endfor for i in 0 .. WRLEN/16-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor 63 MSA CEQI.D t CEQI.W t CEQI.B t CEQI.H t 011110 wise set all bits to0. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Immediate to vector comp vector Immediate to Purpose: Description: Set all bits to 1 in Format: 31 Immediate Compare Equal Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CEQI.df I itecture Module, Revision 1.12 Revision Module, itecture 122 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Compare Equal Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 CFCMSA I ELM 011001 6 5 56 rd . rd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 123 cs copiedis to GPR cs 16 15 specifies a reserved register or a register that exist. does not or a register register reserved a cs specifies 05 0001111110 Volume IV-j: The MIPS64® SIMD Arch = 1 then = 1 WRP  signed(cs) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 26 25 rd if not IsCoprocessorEnabled(0) then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSAAccess, GPR[rd] then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSASave, GPR[rd] then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSAModify, GPR[rd] then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSARequest, GPR[rd] then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSAMap, GPR[rd] then not IsCoprocessorEnabled(0) if endif 64)  sign_extend(MSAUnmap, GPR[rd] CFCMSA CFCMSA rd,cs CFCMSA GPR Copy fromControl MSA Register Copy GPR sign_extend(MSAIR, 64)  sign_extend(MSAIR, GPR[rd] 64)  sign_extend(MSACSR, GPR[rd] if cs = 2 then if 3 then cs = elseif 4 then cs = elseif 5 then cs = elseif 6 then cs = elseif 7 then cs = elseif else 61 MSA if cs = 0 then if 1 then cs = elseif MSAIR elseif 011110 The sign extended contentofcontrol The MSA signregister extended Format: Purpose: copied from MSA controlregister. value GPR Description: Restrictions: if read The operation returns ZERO Operation: 31 GPR Copy from MSA MSA Register Copy from GPR Control MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CFCMSA I itecture Module, Revision 1.12 Revision Module, itecture 124 Volume IV-j: The MIPS64® SIMD Arch GPR[rd] = 0 = GPR[rd] endif GPR[rd] = 0 = GPR[rd] if else end Exceptions: Exceptions: Exception, Disabled Exception.MSA Instruction Coprocessor Reserved 0 UnusableException. GPR Copy from MSA MSA Register Copy from GPR Control MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CLE_S.df I elements, other- 3R wt 001111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 125 ws . df elements are signed less than or equal to 16i+15..16i 32i+31..32i 64i+63..64i ws 16 15 if trueall destinationbits are set, otherwiseclear. 8i+7..8i <= WR[wt] 64 <= WR[wt] 16 <= WR[wt] 32 c c c <= WR[wt] 8    25556  c Volume IV-j: The MIPS64® SIMD Arch 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i (ws[i] <= wt[i]) (ws[i] <=  100 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i if the corresponding elements wd WR[ws] WR[ws] WR[ws] WR[ws] 26 25 23 22 21 20 wd[i] WR[wd] c  WR[wd] c  WR[wd] c  WR[wd] c  CLE_S.df CLE_S.df wd,ws,wt CLE_S.B wd,ws,wt CLE_S.H wd,ws,wt CLE_S.W wd,ws,wt CLE_S.D Vector Compare Signed Less Than or Equal Than or Compare Less Signed Vector endfor endfor endfor endfor 63 MSA CLE_S.D .. WRLEN/64-1 i in 0 for CLE_S.W .. WRLEN/32-1 i in 0 for CLE_S.B .. WRLEN/8-1 i in 0 for CLE_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Description: Vector to vector compare for signed less or equal; vector to Vector Restrictions: No data-dependentare possible.exceptions Operation: Purpose: Set all bits into 1 Format: wise set all bits to0. data format in integer values are and results operands The 31 Vector Compare Signed Less Than or Equal or Than Less Signed Compare Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CLE_U.df I elements, other- elements, 3R wt 001111 ) 6 5 ) ) wd 32i+31..32i less than or equal to equal or than less ) 16i+15..16i 64i+63..64i 11 10 8i+7..8i itecture Module, Revision 1.12 Revision Module, itecture 126 ws . df true all destination bits are set, otherwise clear. otherwise are set, destination bits true all elements are unsigned elements are ws 16 15 ) <= (0 || WR[wt] ) <= (0 || WR[wt] ) <= (0 ) <= (0 || WR[wt] (0 || ) <= 64 16 32 c c c 8 64i+63..64i 8i+7..8i 16i+15..16i    25556  c Volume IV-j: The MIPS64® SIMD Arch (ws[i] <= wt[i]) (ws[i] <=  101 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i elements if the corresponding if the elements (0 || WR[ws] (0 || WR[ws] (0 || WR[wt] <= (0 || WR[ws]__32i+31..32i_)_ (0 || WR[ws] wd 26 25 23 22 21 20 wd[i] c  c  WR[wd] c  c  WR[wd] WR[wd] WR[wd] CLE_U.df CLE_U.df wd,ws,wt CLE_U.B wd,ws,wt CLE_U.H wd,ws,wt CLE_U.W wd,ws,wt CLE_U.D Vector Compare Unsigned Less Than or Equal Vector endfor endfor endfor endfor 63 MSA CLE_U.D .. WRLEN/64-1 i in 0 for CLE_U.W .. WRLEN/32-1 i in 0 for CLE_U.B .. WRLEN/8-1 i in 0 for CLE_U.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector to vector less compare for unsignedor equal; if vector to Vector Restrictions: No data-dependentare possible.exceptions Operation: Purpose: 1 in to bits all Set Format: Description: wise set all bits to0. data format in integer values are and results operands The 31 Vector Compare Unsigned Less Than or Equal Than Less Unsigned Compare Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for , s5 MSA MSA MSA MSA 0 CLEI_S.df I I5 000111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 127 stination bits are set, otherwise clear. otherwise are set, stination bits ws . df element is less than or equal to the 5-bit signed immediate ws 16 15 <= t 32 <= t 64 <= t 16 c c c <= t 8    4..0 4..0 25556 4..0  c for signedless equal;or trueif allde Volume IV-j: The MIPS64® SIMD Arch 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i || s5__4.. 0__ || s5__4.. || s5 || s5 || s5 (ws[i] <= s5) (ws[i] <=  59 11 27 3 100 df s5 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i elements if the corresponding ) ) ) ) 4 4 4 4 WR[ws] WR[ws] WR[ws] WR[ws] wd 26 25 23 22 21 20 wd[i] (s5 (s5 (s5 (s5 c  WR[wd] c  c  c  WR[wd] WR[wd] WR[wd]     CLEI_S.df CLEI_S.df wd,ws,s5 CLEI_S.B wd,ws,s5 CLEI_S.H wd,ws,s5 CLEI_S.W wd,ws,s5 CLEI_S.D Immediate Compare Signed Less Than or Equal Immediate Compare for i in 0 .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/32-1 i in 0 for endfor for i in 0 .. WRLEN/16-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor 63 MSA CLEI_S.D t CLEI_S.W t CLEI_S.B t CLEI_S.H t 011110 otherwise set all bitsto 0. data format in integer values are and results operands The Immediate to vector compare vector Immediate to Restrictions: No data-dependentare possible.exceptions Operation: Purpose: Set all bits to 1 in Format: Description: 31 Immediate Compare Signed Less Than or Equal or Than Less Signed Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CLEI_S.df I itecture Module, Revision 1.12 Revision Module, itecture 128 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Compare Signed Less Than or Equal or Than Less Signed Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CLEI_U.df I I5 000111 6 5 wd 11 10 stination bits are set, otherwise clear. are set, stination bits itecture Module, Revision 1.12 Revision Module, itecture 129 ws . df element is unsigned less than or equal to the 5-bit unsigned ws 16 15 ) <= (0 || t) ) <= (0 ) <= (0 || t) (0 || ) <= <= (0 || t) <= (0 32 || t) <= (0 64 16 c c c 8 8i+7..8i 16i+15..16i    25556  c for unsignedor less equal; if true all de Volume IV-j: The MIPS64® SIMD Arch 32i+31..32i 64i+63..64i 4..0 4..0 4..0 4..0 (ws[i] <= u5) (ws[i] <=  elements if the corresponding 101 df u5 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i wd || u5 || || u5 || u5 || WR[ws] WR[ws] (0 || WR[ws] (0 || WR[ws] || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 WR[wd] c  c  c  c  WR[wd] WR[wd] WR[wd]     CLEI_U.df CLEI_U.df wd,ws,u5 CLEI_U.B wd,ws,u5 CLEI_U.H wd,ws,u5 CLEI_U.W wd,ws,u5 CLEI_U.D Immediate Compare Unsigned Less Than or Equal Unsigned Less Than or Immediate Compare u5, otherwise set all bitsto 0. for i in 0 .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/32-1 i in 0 for endfor for i in 0 .. WRLEN/16-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor 63 MSA CLEI_U.D t CLEI_U.W t CLEI_U.B t CLEI_U.H t 011110 Description: Immediate to vector compare vector Immediate to Purpose: Set all bits to 1 in Format: immediate data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Compare Unsigned Less Than or Equal Than or Less Unsigned Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CLEI_U.df I itecture Module, Revision 1.12 Revision Module, itecture 130 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Compare Unsigned Less Than or Equal Than or Less Unsigned Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CLT_S.df I 3R 001111 elements, otherwise set all 6 5 wt wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 131 ion bits are set, otherwise clear. set, otherwise are ion bits ws . df elements are signed less than ws 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i < WR[wt] 64 < WR[wt] 16 < WR[wt] 32 c c c < WR[wt] 8 than;less iftrue alldestinat    25556  c Volume IV-j: The MIPS64® SIMD Arch 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i (ws[i] < wt[i]) (ws[i] <  elements if the corresponding 010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i wd WR[ws] WR[ws] WR[ws] WR[ws] 26 25 23 22 21 20 wd[i] WR[wd] c  WR[wd] c  WR[wd] c  WR[wd] c  CLT_S.df CLT_S.df wd,ws,wt CLT_S.B wd,ws,wt CLT_S.H wd,ws,wt CLT_S.W wd,ws,wt CLT_S.D Vector Compare Signed Less Than Compare Less Signed Vector endfor endfor endfor endfor 63 MSA CLT_S.D .. WRLEN/64-1 i in 0 for CLT_S.W .. WRLEN/32-1 i in 0 for CLT_S.B .. WRLEN/8-1 i in 0 for CLT_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Description: Vector to vector compare for signed vector to Vector Purpose: Set all bits to 1 i n Format: bitsto 0. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Compare Signed Less Than Less Signed Compare Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 CLT_U.df I 3R 001111 elements, otherwise set all ) 6 5 wt ) ) wd 32i+31..32i ) 16i+15..16i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 132 8i+7..8i ws . df elements are unsigned less than ws 16 15 true all destination bits otherwiseare set, clear. ) < (0 || WR[wt] ) < (0 || WR[wt] ) < (0 || ) < (0 || WR[wt] ) < 64 16 32 c c c 8 64i+63..64i 8i+7..8i 16i+15..16i    25556  c Volume IV-j: The MIPS64® SIMD Arch (ws[i] < wt[i]) (ws[i] <  011 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i elements if the corresponding (0 || WR[ws] (0 || WR[ws] || WR[wt] < (0 (0 || WR[ws]__32i+31..32i_)_ (0 || WR[ws] wd 26 25 23 22 21 20 wd[i] c  c  c  WR[wd] c  WR[wd] WR[wd] WR[wd] CLT_U.df CLT_U.df wd,ws,wt CLT_U.B wd,ws,wt CLT_U.H wd,ws,wt CLT_U.W wd,ws,wt CLT_U.D Vector Compare Less Than Unsigned Vector endfor endfor endfor endfor 63 MSA CLT_U.D .. WRLEN/64-1 i in 0 for CLT_U.W .. WRLEN/32-1 i in 0 for CLT_U.B .. WRLEN/8-1 i in 0 for CLT_U.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved bitsto 0. data format in integer values are and results operands The Vector to vector compare for unsigned less than; if if less compare for unsignedthan; vector to Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: Set all bits to 1 in Format: 31 Vector Compare Unsigned Less Than Less Unsigned Compare Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 , otherwise s5 CLTI_S.df I I5 000111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 133 ws . df element is less than the 5-bit signed immediate ws 16 15 < t 32 < t 64 < t 16 c c c < t 8 e corresponding    4..0 4..0 25556 4..0  c Volume IV-j: The MIPS64® SIMD Arch e for signedless than; if true all destination bits otherwise are set, clear. 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i || s5__4.. 0__ || s5__4.. || s5 || s5 || s5 (ws[i] < s5) (ws[i] <  59 11 27 3 010 df s5 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i elements if th ) ) ) ) 4 4 4 4 wd WR[ws] WR[ws] WR[ws] WR[ws] 26 25 23 22 21 20 wd[i] (s5 (s5 (s5 (s5 c  WR[wd] c  c  c  WR[wd] WR[wd] WR[wd]     CLTI_S.df CLTI_S.df wd,ws,s5 CLTI_S.B wd,ws,s5 CLTI_S.H wd,ws,s5 CLTI_S.W wd,ws,s5 CLTI_S.D Immediate Compare Signed Less Than Immediate Compare for i in 0 .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/32-1 i in 0 for endfor for i in 0 .. WRLEN/16-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor 63 MSA CLTI_S.D t CLTI_S.W t CLTI_S.B t CLTI_S.H t 011110 set all bits to0. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Immediate to vector compar vector Immediate to Description: Purpose: Set all bits to 1 in Format: 31 Immediate Compare Signed Less Than Less Signed Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CLTI_S.df I itecture Module, Revision 1.12 Revision Module, itecture 134 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Compare Signed Less Than Less Signed Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for , u5 MSA MSA MSA MSA 0 CLTI_U.df I I5 000111 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 135 ws . df element is unsigned less than the 5-bit unsigned immediateunsigned 5-bit the less than is unsigned element ws 16 15 ) < (0 || t) ) < (0 || ) < (0 || t) ) < < (0 || t) < (0 32 || t) < (0 64 16 c c c 8 8i+7..8i 16i+15..16i    25556  c Volume IV-j: The MIPS64® SIMD Arch e for unsignedthan; less if trueall destinationbits are set, otherwiseclear. 32i+31..32i 64i+63..64i 4..0 4..0 4..0 4..0 (ws[i] < u5) (ws[i] <  011 df u5 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i elements if the corresponding || u5 || || u5 || u5 || WR[ws] WR[ws] (0 || WR[ws] (0 || WR[ws] wd || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 WR[wd] c  c  c  c  WR[wd] WR[wd] WR[wd]     CLTI_U.df CLTI_U.df wd,ws,u5 CLTI_U.B wd,ws,u5 CLTI_U.H wd,ws,u5 CLTI_U.W wd,ws,u5 CLTI_U.D Immediate Compare Unsigned Less Than Immediate Compare for i in 0 .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/32-1 i in 0 for endfor for i in 0 .. WRLEN/16-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for endfor 63 MSA CLTI_U.D t CLTI_U.W t CLTI_U.B t CLTI_U.H t 011110 otherwise set all bitsto 0. data format in integer values are and results operands The Immediate to vector compar vector Immediate to Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: 1 in to all bits Set Format: 31 Immediate Compare Unsigned Less Than Less Unsigned Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CLTI_U.df I itecture Module, Revision 1.12 Revision Module, itecture 136 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Compare Unsigned Less Than Less Unsigned Compare Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA 0 MIPS64 MSA COPY_S.df I ELM 011001 6 5 rd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 137 . rd ws , 64) , 64) , 64) 16 15 8n+7..8n 16n+15..16n 32n+31..32n n-1..0 df/n || tt and copy the result to the GPR result and copy ws 64n+63..64n 22 21 Volume IV-j: The MIPS64® SIMD Arch GPRLEN-n ) 0010 n-1 of vector n of  sign_extend(WR[ws]  sign_extend(WR[ws]  WR[ws]  sign_extend(WR[ws]  signed(ws[n]) 26 25 rd COPY_S.df COPY_S.df rd,ws[n] COPY_S.B rd,ws[n] COPY_S.H rd,ws[n] COPY_S.W rd,ws[n] COPY_S.D Element Copy to GPR Signed to GPR Element Copy return (tt return 646556 MSA COPY_S.H GPR[rd] COPY_S.W GPR[rd] COPY_S.D GPR[rd] function sign_extend(tt, n) sign_extend(tt, function sign_extend endfunction COPY_S.B GPR[rd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Element value sign extended to GPR. and copied extended sign Element value Description: Purpose: elementSign-extend Format: Restrictions: No data-dependentare possible.exceptions Operation: 31 Element Copy to GPR to Copy Signed Element MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MIPS64 MSA COPY_U.df I ELM 011001 6 5 rd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 138 . rd ws , 64)) , 64) , 64)) 16 15 8n+7..8n 16n+15..16n 32n+31..32n df/n n-1..0 and copy the result to the GPR result and copy ws 22 21 Volume IV-j: The MIPS64® SIMD Arch || tt || 0011 of vector n of  zero_extend(WR[ws]  zero_extend(WR[ws] GPRLEN-n  zero_extend(WR[ws]  unsigned(ws[n]) 26 25 rd COPY_U.df COPY_U.df rd,ws[n] COPY_U.B rd,ws[n] COPY_U.H rd,ws[n] COPY_U.W Element Copy to GPR Unsigned to GPR Element Copy return 0 return 646556 MSA COPY_U.H GPR[rd] COPY_U.W GPR[rd] n) zero_extend(tt, function endfunction zero_extend endfunction COPY_U.B GPR[rd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Zero-extend element Zero-extend Format: Purpose: and copied to GPR. extended zero Element value Description: Restrictions: No data-dependentare possible.exceptions Operation: 31 Element Copy to GPR GPR Copy to Unsigned Element MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 CTCMSA I ELM 011001 . cd 6 5 56 cd thatdoes notor is not exist writable. 11 10 itecture Module, Revision 1.12 Revision Module, itecture 139 rs 0 then causes the appropriate e xception if any Cause bit and its )  is copied to MSA control register copied to MSA control is rs 16 15 MSACSR Enables 31..0 The register is written before the ception ex occurs and the EPC register con- 31..0 31..0 specifies a reserved register or a register or register a reserved specifies 05 31..0 cd 31..0 0000111110 Volume IV-j: The MIPS64® SIMD Arch tion, MSA Disabled Exception, MSA Floating Point Exception. Coprocessor 0 Unusable and (1 || MSACSR and gnificant 31 bitsofgnificant GPR = 1 then = 1 Cause WRP  rs SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, SignalException(CoprocessorUnusableException, 0) SignalException(CoprocessorUnusableException, 26 25 cd SignalException(MSAFloatingPointException) if not IsCoprocessorEnabled(0) then not IsCoprocessorEnabled(0) if endif  GPR[rs] MSAMap then not IsCoprocessorEnabled(0) if endif  GPR[rs] MSAUnmap if not IsCoprocessorEnabled(0) then not IsCoprocessorEnabled(0) if endif  GPR[rs] MSAModify if not IsCoprocessorEnabled(0) then not IsCoprocessorEnabled(0) if endif  GPR[rs] MSASave CTCMSA CTCMSA cd,rs CTCMSA GPR Copy to MSA Control Register Copy GPR endif endif elseif cd = 6 then cd = elseif 7 then cd = elseif elseif cd = 4 then cd = elseif if MSACSR MSACSR  GPR[rs] MSACSR if cd = 3 then if 61 MSA elseif MSAIR elseif endif if cd = 1 then if 011110 Operation: Exceptions: Exceptions: Reserved Instruction Excep corresponding Enable bit set.are both tains theaddress theof CTCMSA instruction. Restrictions: writeattemptThe is IGNORED if Writing to the MSA Control and Status Register The content of the least si content of the The Format: Purpose: copied to MSA control register. value GPR Description: 31 GPR Copy to MSA Control Register Control MSA Copy to GPR MIPS® Architecture for Programmers Programmers MIPS® Architecture for

CTCMSA I itecture Module, Revision 1.12 Revision Module, itecture 140 Volume IV-j: The MIPS64® SIMD Arch Exception. GPR Copy to MSA Control Register Control MSA Copy to GPR MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 DIV_S.df I 3R 010010 . The result is written to wt . 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 141 UNPREDICTABLE ws . df div WR[wt] div WR[wt] div WR[wt] div WR[wt] div 16 15 8i+7..8i 16i+15..16i 32i+31..32i 64i+63..64i are divided by signed integer elements in vector is zero, the result value is is zero, the result value ws wt WR[ws] WR[ws] WR[ws]  WR[ws]    25556 Volume IV-j: The MIPS64® SIMD Arch 8i+7..8i ws[i] div wt[i] ws[i] div  100 df wt 64i+63..64i 16i+15..16i 32i+31..32i WR[wd] 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] DIV_S.df DIV_S.df wd,ws,wt DIV_S.B wd,ws,wt DIV_S.H wd,ws,wt DIV_S.W wd,ws,wt DIV_S.D Vector Signed Divide Vector . If a divisor element vector element divisor . If a endfor endfor endfor endfor wd 63 MSA DIV_S.D .. WRLEN/64-1 i in 0 for DIV_S.W .. WRLEN/32-1 i in 0 for DIV_S.B .. WRLEN/8-1 i in 0 for DIV_S.H .. WRLEN/16-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved vector Vector signed divide. Vector Description: Purpose: The signed integer elements in vector Format: 31 Vector Signed Divide Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 DIV_U.df I 3R . The result is writ- 010010 wt . 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i UNPREDICTABLE 8i+7..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 142 ws . df udiv WR[wt] udiv WR[wt] udiv WR[wt] udiv udiv WR[wt] udiv 16 15 8i+7..8i is zero, the result value is result value the zero, is 16i+15..16i 32i+31..32i 64i+63..64i wt are divided by unsigned integer elements in vector ws WR[ws] WR[ws] WR[ws]  WR[ws]    25556 Volume IV-j: The MIPS64® SIMD Arch 8i+7..8i ws[i] udiv wt[i] ws[i] udiv  101 df wt 64i+63..64i 16i+15..16i 32i+31..32i WR[wd] . If a divisor element vector If a divisor . 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] wd DIV_U.df DIV_U.df wd,ws,wt DIV_U.B wd,ws,wt DIV_U.H wd,ws,wt DIV_U.W wd,ws,wt DIV_U.D Vector Unsigned Divide Vector endfor endfor endfor endfor 63 MSA DIV_U.D .. WRLEN/64-1 i in 0 for DIV_U.W .. WRLEN/32-1 i in 0 for DIV_U.B .. WRLEN/8-1 i in 0 for DIV_U.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector unsigned divide. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The unsigned elements in vectorinteger Format: data format in integer values are and results operands The ten tovector 31 Vector Unsigned Divide Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 DLSA I DLSA 010101 . rd 26 000 sa sult is placed into GPR sult is 108765 1110 itecture Module, Revision 1.12 Revision Module, itecture 143 = 1 then rd MSAP 1615 ) + GPR[rt] s rt and the 64-bit and arithmetic re the rt is shifted left, inserting zeros into the emptied bits; the 64-bit doubleword aled if access to 64-bit operations is not enabled MSA implementationor is || 0 || rs 2120 ccurs under any circumstances. under any ccurs (63-s)..0 Volume IV-j: The MIPS64® SIMD Arch 63..0 (GPR[rs] << (sa + 1)) + GPR[rt] (sa + 1)) << (GPR[rs] rs  2625 GPR[rd] sa + 1 DLSA DLSA rd,rs,rt,sa DLSA Doubleword Left Shift Doubleword Add s   (GPR[rs] temp GPR[rd]  temp GPR[rd] SignalException(ReservedInstruction) 6 5553 else endif if Are64bitOperationsEnabled() and Config3 Are64bitOperationsEnabled() if 000000 SPECIAL Exceptions: Exceptions: Exception. Instruction Reserved The 64-bit doubleword value in GPR Format: Purpose: ofnumber bits andaddthe toresult another doubleword. by a fixed doubleword left-shift a To Description: result addedis to thein GPR64-bit value o exception No Overflow Integer Restrictions: A Instruction Reserved Exception is sign present. not Operation: 31 Doubleword Left Shift Add Shift Left Doubleword MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 DOTP_S.df I 3R producing a result 010011 ws , 8) , 16) , 32) 6 5 ven elements are added and stored to and stored elements are added ven wd 16i+15..16i 32i+31..32i 64i+63..64i teger teger elements in vector 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 144 . The results are values in integer The data format. results are values df ws 16i+15..16i 32i+31..32i 64i+63..64i 16 15 , n) , n) 2n-1..n are multiplied by signed in multiplication results of adjacent odd/e n-1..0

wt dotp_s(WR[ws] dotp_s(WR[ws] dotp_s(WR[ws] , tt , tt    r data format halfthesize of n-1..0 n-1..0 25556 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 || ts || tt n n ) ) 000 df wt 16i+15..16i 32i+31..32i 64i+63..64i 2n-1..0 2n-1..0 n-1 n-1 mulx_s(ts mulx_s(ts 26 25 23 22 21 20 signed(ws[2i+1]) * signed(wt[2i+1]) + signed(ws[2i]) * + signed(ws[2i]) * signed(wt[2i+1])  signed(ws[2i+1]) wd[2i]) (wd[2i+1], (ts s * t p1 + p0 (tt WR[wd] WR[wd] WR[wd] DOTP_S.df DOTP_S.df wd,ws,wt DOTP_S.H wd,ws,wt DOTP_S.W wd,ws,wt DOTP_S.D Vector Signed Dot Product Vector endfor endfor endfor s  p0  p  p return p1  p  p return t  63 MSA DOTP_S.W .. WRLEN/32-1 i in 0 for DOTP_S.D .. WRLEN/64-1 i in 0 for n) tt, mulx_s(ts, function endfunction dotp_s endfunction endfunction mulx_s endfunction n) tt, dotp_s(ts, function DOTP_S.H .. WRLEN/16-1 i in 0 for 011110 signed(wt[2i]) the destination. in intege are values operands The Restrictions: No data-dependentare possible.exceptions Operation: The signed integer elements in vector Format: Purpose: signed Vector dot product (multiply and pairwisethen add the adjacent multiplication results) to double width ele- ments. Description: The input the operands. size of the twice 31 Vector Vector Product Dot Signed MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DOTP_S.df I itecture Module, Revision 1.12 Revision Module, itecture 145 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Vector Product Dot Signed MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 producing a DOTP_U.df I ws 3R 010011 , 8) , 16) , 32) results) results) to double width ele- 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i acent acent odd/even elements are added and 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 146 . The results are values in integer The data format. results are values df ws dd the adjacent multiplication 16i+15..16i 32i+31..32i 64i+63..64i 16 15 , n) , n) are multiplied by unsigned integer elements in vector wt 2n-1..n n-1..0 unsigned(ws[2i+1]) * unsigned(wt[2i+1]) + unsigned(wt[2i+1]) *  unsigned(ws[2i+1]) ands. The multiplication results of adj dotp_u(WR[ws] dotp_u(WR[ws] dotp_u(WR[ws] , tt , tt    r data format halfthesize of

25556 ultiply then and pairwise a Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 n-1..0 n-1..0 001 df wt 16i+15..16i 32i+31..32i 64i+63..64i 2n-1..0 2n-1..0 || ts || tt n n mulx_u(ts mulx_u(ts 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], 0 0 s * t p1 + p0 WR[wd] WR[wd] WR[wd] DOTP_U.df DOTP_U.df wd,ws,wt DOTP_U.H wd,ws,wt DOTP_U.W wd,ws,wt DOTP_U.D Vector Unsigned Dot Product Vector t  endfor endfor endfor s  p0  p  p return p1  p  p return 63 MSA DOTP_U.W .. WRLEN/32-1 i in 0 for DOTP_U.D .. WRLEN/64-1 i in 0 for n) tt, mulx_u(ts, function endfunction dotp_u endfunction endfunction mulx_s endfunction n) tt, dotp_u(ts, function DOTP_U.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The unsigned integer elements in v ector Format: Purpose: unsigned dot product (m Vector ments. Description: unsigned(wt[2i]) * unsigned(ws[2i]) result twice the size of the input oper stored to the destination. in intege are values operands The 31 Vector Unsigned Dot Product Dot Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DOTP_U.df I itecture Module, Revision 1.12 Revision Module, itecture 147 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Unsigned Dot Product Dot Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 DPADD_S.df I 3R , 8) , 16) , 32) producing a result 010011 ws are added to the integer added are 6 5 16i+15..16i 32i+31..32i 64i+63..64i lication results) and add to double wd , WR[wt] , WR[wt] , WR[wt] teger teger elements in vector 11 10 itecture Module, Revision 1.12 Revision Module, itecture 148 . The results are values in integer The data format. results are values 16i+15..16i 32i+31..32i 64i+63..64i df ws 16 15 , n) , n) + dotp_s(WR[ws] + dotp_s(WR[ws] + dotp_s(WR[ws] + 2n-1..n are multiplied by signed in jacent odd/even elements odd/even results of adjacent multiplication n-1..0 (wd[2i+1], wd[2i]) + signed(ws[2i+1]) * signed(ws[2i+1]) + wd[2i])  (wd[2i+1],

wt

, tt , tt    r data format halfthesize of n-1..0 n-1..0 25556 64i+63..64i 16i+15..16i 32i+31..32i Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 || ts || tt n n ) ) WR[wd] WR[wd] WR[wd] . 010 df wt 64i+63..64i 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 wd mulx_s(ts mulx_s(ts 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], s * t (ts (tt WR[wd] WR[wd] WR[wd] DPADD_S.df DPADD_S.df wd,ws,wt DPADD_S.H wd,ws,wt DPADD_S.W wd,ws,wt DPADD_S.D Vector Signed Dot Product andAdd Vector p0  p  p return p1  endfor endfor endfor s  t  63 MSA endfunction mulx_s endfunction n) tt, dotp_s(ts, function DPADD_S.H .. WRLEN/16-1 i in 0 for DPADD_S.W .. WRLEN/32-1 i in 0 for DPADD_S.D .. WRLEN/64-1 i in 0 for n) tt, mulx_s(ts, function 011110 elements in vector in intege are values operands The Restrictions: No data-dependentare possible.exceptions Operation: The signed integer elements in vector Format: Purpose: Vector signed dot p roduct (multiply and then adpairwise d the width elements. muadjacent ltip Description: signed(wt[2i]) * signed(ws[2i]) + signed(wt[2i+1]) input operands. The the the size of twice 31 Vector Signed Dot Product Add and Product Dot Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DPADD_S.df I itecture Module, Revision 1.12 Revision Module, itecture 149 Volume IV-j: The MIPS64® SIMD Arch 2n-1..0 p1 + p0 p  p return endfunction dotp_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Signed Dot Product Add and Product Dot Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 producing a ws are added to the DPADD_U.df I 3R , 8) , 16) , 32) 010011 6 5 16i+15..16i 32i+31..32i 64i+63..64i t odd/even t elements odd/even wd , WR[wt] , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 150 . The results are values in integer The data format. results are values 16i+15..16i 32i+31..32i 64i+63..64i df ws dd the adjacent multiplication results) and add to double 16 15 , n) , n) are multiplied by unsigned integer elements in vector + dotp_u(WR[ws] + dotp_u(WR[ws] + dotp_u(WR[ws] + wt 2n-1..n n-1..0 (wd[2i+1], wd[2i]) + unsigned(ws[2i+1]) * unsigned(ws[2i+1]) + wd[2i])  (wd[2i+1], ds. The multiplication results of adjacen

, tt , tt    r data format halfthesize of

25556 64i+63..64i 16i+15..16i 32i+31..32i . Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 wd n-1..0 n-1..0 WR[wd] WR[wd] WR[wd] 011 df wt 64i+63..64i 16i+15..16i 32i+31..32i 2n-1..0 || ts || tt n n mulx_u(ts mulx_u(ts 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], 0 s * t 0 WR[wd] WR[wd] WR[wd] DPADD_U.df DPADD_U.df wd,ws,wt DPADD_U.H wd,ws,wt DPADD_U.W wd,ws,wt DPADD_U.D Vector Unsigned Dot Product and Add Vector t  p0  p  p return p1  endfor endfor endfor s  63 MSA endfunction mulx_s endfunction n) tt, dotp_u(ts, function DPADD_U.H .. WRLEN/16-1 i in 0 for DPADD_U.W .. WRLEN/32-1 i in 0 for DPADD_U.D .. WRLEN/64-1 i in 0 for n) tt, mulx_u(ts, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands are values in intege are values operands The The unsigned integer elements in v ector Format: Purpose: unsigned Vector dot product (multiply and then pairwise a width results. Description: * unsigned(wt[2i]) + unsigned(ws[2i]) unsigned(wt[2i+1]) result twice the size of the input operan in vector elements integer 31 Vector Unsigned Dot Productand Add MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DPADD_U.df I itecture Module, Revision 1.12 Revision Module, itecture 151 Volume IV-j: The MIPS64® SIMD Arch 2n-1..0 p1 + p0 p  p return endfunction dotp_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Unsigned Dot Productand Add MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 DPSUB_S.df I 3R , 8) , 16) , 32) 010011 producing a signed ws lts) and subtract from dou- 6 5 16i+15..16i 32i+31..32i 64i+63..64i wd of adjacent odd/even is elements sub- , WR[wt] , WR[wt] , WR[wt] 11 10 integer integer elements in vector itecture Module, Revision 1.12 Revision Module, itecture 152 . The results are values in integer The data format. results are values 16i+15..16i 32i+31..32i 64i+63..64i df ws the adjacent multiplication resu 16 15 , n) , n) to a signedresult. wd - dotp_s(WR[ws] - dotp_s(WR[ws] - dotp_s(WR[ws] - 2n-1..n are multiplied by signed n-1..0 (wd[2i+1], wd[2i]) - (signed(ws[2i+1]) * (signed(ws[2i+1]) - wd[2i])  (wd[2i+1],

wt ands. The sum of multiplication results , tt , tt    r data format halfthesize of n-1..0 n-1..0 25556 64i+63..64i 16i+15..16i 32i+31..32i Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 || ts || tt n n ) ) WR[wd] WR[wd] WR[wd] 100 df wt 64i+63..64i 16i+15..16i 32i+31..32i 2n-1..0 elements in vector n-1 n-1 mulx_s(ts mulx_s(ts 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], s * t (ts (tt WR[wd] WR[wd] WR[wd] DPSUB_S.df DPSUB_S.df wd,ws,wt DPSUB_S.H wd,ws,wt DPSUB_S.W wd,ws,wt DPSUB_S.D Vector Signed Dot Product andSubtract Vector p0  p  p return p1  endfor endfor endfor s  t  63 MSA endfunction mulx_s endfunction n) tt, dotp_s(ts, function DPSUB_S.H .. WRLEN/16-1 i in 0 for DPSUB_S.W .. WRLEN/32-1 i in 0 for DPSUB_S.D .. WRLEN/64-1 i in 0 for n) tt, mulx_s(ts, function 011110 ble width elements. Description: The signed integer Format: Purpose: signed dot product (multiply and then pairwise add Vector signed(wt[2i])) * signed(ws[2i]) + signed(wt[2i+1]) result twice input opethe size of the r vector in elements the integer tracted from Restrictions: No data-dependentare possible.exceptions Operation: The operands are values in intege are values operands The 31 Vector Signed Dot Product Subtract and Dot Product Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DPSUB_S.df I itecture Module, Revision 1.12 Revision Module, itecture 153 Volume IV-j: The MIPS64® SIMD Arch 2n-1..0 p1 + p0 p  p return endfunction dotp_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Signed Dot Product Subtract and Dot Product Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 DPSUB_U.df I producing a pos- producing 3R , 8) , 16) , 32) 010011 ws lts of adjacent odd/even ele- lts of adjacent odd/even 6 5 16i+15..16i 32i+31..32i 64i+63..64i wd , WR[wt] , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 154 . The results are values in integer The data format. results are values 16i+15..16i 32i+31..32i 64i+63..64i df ws e multiplicationsum of e resu to a signed result. to a wd 16 15 , n) , n) are multiplied by unsigned integer elements in vector inelements integer unsignedbymultipliedare - dotp_u(WR[ws] - dotp_u(WR[ws] - dotp_u(WR[ws] - 2n-1..n n-1..0 wt (wd[2i+1], wd[2i]) - (unsigned(ws[2i+1]) * (unsigned(ws[2i+1]) - wd[2i])  (wd[2i+1],

, tt , tt    r data format halfthesize of ultiply and then pairwise add the adjacent multiplication results) and subtract from

25556 64i+63..64i 16i+15..16i 32i+31..32i Volume IV-j: The MIPS64® SIMD Arch 2n-1..n n-1..0 e integer elements in vector integer e n-1..0 n-1..0 WR[wd] WR[wd] WR[wd] 101 df wt 64i+63..64i 16i+15..16i 32i+31..32i 2n-1..0 || ts || tt n n mulx_u(ts mulx_u(ts 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], 0 s * t 0 WR[wd] WR[wd] WR[wd] DPSUB_U.df DPSUB_U.df wd,ws,wt DPSUB_U.H wd,ws,wt DPSUB_U.W wd,ws,wt DPSUB_U.D Vector Unsigned Dot Product and Subtract Vector t  p0  p  p return p1  endfor endfor endfor s  63 MSA endfunction mulx_s endfunction n) tt, dotp_u(ts, function DPSUB_U.H .. WRLEN/16-1 i in 0 for DPSUB_U.W .. WRLEN/32-1 i in 0 for DPSUB_U.D .. WRLEN/64-1 i in 0 for n) tt, mulx_u(ts, function 011110 ments is subtracted from th from is subtracted ments in intege are values operands The Restrictions: No data-dependentare possible.exceptions Operation: The unsigned integer elements in vector elements unsigned integer The Format: Purpose: unsigned Vector dot (m product double width elements. Description: * unsigned(wt[2i])) + unsigned(ws[2i]) unsigned(wt[2i+1]) unsigned result twice the size of the input operands. Th itive, 31 Vector Unsigned Dot Productand Subtract MIPS® Architecture for Programmers Programmers MIPS® Architecture for

DPSUB_U.df I itecture Module, Revision 1.12 Revision Module, itecture 155 Volume IV-j: The MIPS64® SIMD Arch 2n-1..0 p1 + p0 p  p return endfunction dotp_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Unsigned Dot Productand Subtract MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FADD.df I 3RF 011011 . resultThe writtenis to ws , 32) , 64) -2008. 6 5 TM wd 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 156 . , WR[wt] , WR[wt] df ws by the I EEE Standard for Floating-Point Arithmetic 754 32i+31..32i 64i+63..64i 16 15 ed Exception,MSA Floating PointException. are added to the floating-point elements in vector elements in added to the floating-point are wt AddFP(WR[ws] AddFP(WR[ws] 15556 es ines floating-point data format   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0000 df wt  ws[i] + wt[i] 32i+31..32i 64i+63..64i 26 25 wd[i] WR[wd] WR[wd] FADD.df FADD.df wd,ws,wt FADD.W wd,ws,wt FADD.D Vector Floating-Point Addition Vector . endfor endfor */ operation. add defined Implementation /* wd 64 MSA FADD.D .. WRLEN/64-1 i in 0 for n) ts, AddFP(tt, function AddFP endfunction FADD.W .. WRLEN/32-1 i in 0 for 011110 The addThe by theoperation defined IEEE Standard is for Floating-Point Arithmetic 754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved vector The floating-point elements in vector floating-point elements The Format: Purpose: floating-point addition. Vector Description: The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Floating-Point Addition MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 IFCAF.df . df 3RF 011010 , 32) , 64) 6 5 32i+31..32i 64i+63..64i wd , WR[wt] , WR[wt] 11 10 signal Invalid Operation exception. signal Invalid itecture Module, Revision 1.12 Revision Module, itecture 157 tination bits are clear. tination wt erands are flushed based on the flush-to-zero bit FS in on based flushed are erands or ws 32i+31..32i 64i+63..64i ws . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 df 16 15 always false; all all des false; always ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to QuietFALSE(WR[ws] QuietFALSE(WR[ws] MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0000 df wt quietFalse(ws[i], wt[i])  quietFalse(ws[i], 32i+31..32i 64i+63..64i elements. Signaling NaNelements in elements. Signaling wd 26 25 wd[i] WR[wd] WR[wd] FCAF.df FCAF.df wd,ws,wt FCAF.W wd,ws,wt FCAF.D Vector Floating-Point False QuietCompare Always Vector endfor endfor */ NaN test signaling defined Implementation /* 0 return 64 MSA FCAF.W .. WRLEN/32-1 i in 0 for FCAF.D .. WRLEN/64-1 i in 0 for n) ts, QuietFALSE(tt, function QuietFALSE endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The MSA Control and Status Register Setall bits to0 in Format: Purpose: floating-pointtovector quiet compare Vector 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified Description: 31 Vector Floating-Point QuietAlways Compare False MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FCEQ.df I . df -2008. 3RF 011010 TM 6 5 wd , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 158 erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are ordered and equal, other- ws 32i+31..32i 64i+63..64i wt ;if trueall destinationbits otherwiseare set, clear. . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 df ws 16 15 , WR[wt] , WR[wt] ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32 64 32i+31..32i 64i+63..64i c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0010 df wt (ws[i] =(quiet) wt[i]) =(quiet) (ws[i]  elements if the corresponding 32i+31..32i 64i+63..64i wd EqualFP(WR[ws] EqualFP(WR[ws] 26 25 wd[i] c  c  WR[wd] WR[wd] FCEQ.df FCEQ.df wd,ws,wt FCEQ.W wd,ws,wt FCEQ.D Vector Floating-Point QuietCompare Equal Vector endfor */ operation. compare equal quiet defined Implementation /* endfor 64 MSA function EqualFP(tt, ts, n) ts, EqualFP(tt, function EqualFP endfunction FCEQ.D .. WRLEN/64-1 i in 0 for FCEQ.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Description: Set Set all bits to 1 in Format: Purpose: floating-pointtovector quiet compare for equality Vector wise set all bits to0. quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Equal Compare Quiet Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 . FCLASS.df I df 2RF 011110 o (bit 5). Bits 6, 7, 8, 9 6 5 56 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 159 5 Negative, Negative, Infinite, Subnormal, Quiet NaN, or Signaling N values: signaling NaN (bit 0) and quiet NaN (bit 1). Bits l (bit 3), subnormal (bit zer 4), and ed by the flush-to-zero bit FS in MSA Control and Status . The results are values in integer data format data integer in results are values . The df 1 df ws , 32) , 64) 9..0 9..0 17 16 15 || c || c || 22 54 32i+31..32i 64i+63..64i 0 0 a bit mask reflecting the floating-point class of the correspo nding element of   wd 110010000 Volume IV-j: The MIPS64® SIMD Arch  class(ws[i]) 32i+31..32i 64i+63..64i ClassFP(WR[ws] ClassFP(WR[ws] . 26 25 wd[i] c  c  WR[wd] WR[wd] FCLASS.df FCLASS.df wd,ws FCLASS.W wd,ws FCLASS.D Vector Floating-Point Class Mask Vector . endfor */ operation. class defined Implementation /* endfor MSACSR ws 69 MSA function ClassFP(tt, n) ClassFP(tt, function ClassFP endfunction FCLASS.D .. WRLEN/64-1 i in 0 for FCLASS.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The mask has 10 bits as follows. Bits 0 and 1 indicate Na vector 2, 3, 4, 5 classify negative values: infinity (bit 2), norma Store Store in each element of v ector Format: Purpose: Vector floating-point class shown as a bit for mask Zero, NaN. Description: classify positive values:infinity (bit 6), normal(bit values:infinity 7),subnormal (bit 8),andclassify positive zero (bit 9). The input values and generated bit masks are not affect Register in floating-point operandsdata formatThe are values Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Floating-Point ClassMask MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FCLE.df I . df -2008. 3RF 011010 TM 6 5 wd , 32) , 64) , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 160 erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i floating-point elements are ordered eitherand less than or . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 ws df 16 15 , WR[wt] , WR[wt] , WR[wt] , WR[wt] 32 64 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i (c | d) (c | d) MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0110 df wt (ws[i] <=(quiet) wt[i])  (ws[i] <=(quiet) elements if the corresponding 64i+63..64i 32i+31..32i wd LessFP(WR[ws] LessFP(WR[ws] EqualFP(WR[ws] EqualFP(WR[ws] 26 25 wd[i] c  c  d  d  WR[wd] WR[wd] FCLE.df FCLE.df wd,ws,wt FCLE.W wd,ws,wt FCLE.D Vector Floating-Point CompareQuiet Less or Equal Vector floating-pointelements, otherwiseset allbits to 0. endfor */ operation. than compare less quiet defined Implementation /* */ operation. compare equal quiet defined Implementation /* endfor wt 64 MSA function LessThanFP(tt, ts, n) ts, LessThanFP(tt, function LessThanFP endfunction n) ts, EqualFP(tt, function EqualFP endfunction FCLE.D .. WRLEN/64-1 i in 0 for FCLE.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits to 1 in Format: Purpose: floating-pointto vector quiet comparefor thanless or equal; if true all destinationbits are set, otherwise clear. Vector Description: The quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 equal to equal MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Less or Equal or Less Compare Quiet Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 floating- wt FCLT.df I . df -2008. 3RF 011010 TM 6 5 wd ordered are and less than , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 161 erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws 32i+31..32i 64i+63..64i . The results are values in integer data format data integer in results are values . The floating-point elements by the I EEE Standard for Floating-Point Arithmetic 754 df ws 16 15 , WR[wt] , WR[wt] ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32 64 32i+31..32i 64i+63..64i c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0100 df wt (ws[i] <(quiet) wt[i]) <(quiet) (ws[i]  32i+31..32i 64i+63..64i if the corresponding elements wd LessFP(WR[ws] LessFP(WR[ws] 26 25 wd[i] c  c  WR[wd] WR[wd] FCLT.df FCLT.df wd,ws,wt FCLT.W wd,ws,wt FCLT.D Vector Floating-Point QuietCompare Less Than Vector endfor */ operation. than compare less quiet defined Implementation /* endfor 64 MSA function LessThanFP(tt, ts, n) ts, LessThanFP(tt, function LessThanFP endfunction FCLT.D .. WRLEN/64-1 i in 0 for FCLT.W .. WRLEN/32-1 i in 0 for 011110 point elements, otherwise set all bits to 0. to all bits otherwise set point elements, quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits to 1 in Format: Purpose: floating-pointtovector quiet compareclear. otherwise forare set, thandestination bits less true all ; if Vector Description: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Less Than Less Compare Quiet Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FCNE.df I . df -2008. 3RF 011100 TM 6 5 wd , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 162 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are ordered andnot equal, oth- ws wt l; if true all destination bits are set, otherwise clear. set, otherwise are destination bits true l; all if . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 , WR[wt] , WR[wt] df ws 16 15 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 32 64 (quiet) wt[i]) (quiet) c c  MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0011 df wt  (ws[i] 32i+31..32i 64i+63..64i elementsthe if corresponding NotEqualFP(WR[ws] NotEqualFP(WR[ws] wd 26 25 wd[i] c  c  WR[wd] WR[wd] FCNE.df FCNE.df wd,ws,wt FCNE.W wd,ws,wt FCNE.D Vector Floating-Point QuietCompare NotEqual Vector endfor */ operation. equal compare not quiet defined Implementation /* endfor 64 MSA function NotEqualFP(tt, ts, n) ts, NotEqualFP(tt, function NotEqualFP endfunction FCNE.D .. WRLEN/64-1 i in 0 for FCNE.W .. WRLEN/32-1 i in 0 for 011110 erwiseset allbits to 0. quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Description: Setall bits to 1 in Format: Purpose: floating-pointtovector quiet compare for notequa Vector MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet CompareNot Equal MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FCOR.df I . df -2008. 3RF 011100 TM ordered, i.e. both elements 6 5 wd , 64) , , 32) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 163 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands 32i+31..32i floating-point elements are ws wt . The results are values in integer data format data integer in results are values . The and if true all destination bits are set, otherwise clear. by the I EEE Standard for Floating-Point Arithmetic 754 df , WR[wt] ws , WR[wt] 16 15 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 64i+63..64i 32i+31..32i 32 64 c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0001 df wt ws[i] !?(quiet) wt[i] !?(quiet) ws[i]  32i+31..32i 64i+63..64i elements if the corresponding OrderedFP(WR[ws] OrderedFP(WR[ws] wd 26 25 wd[i] c  c  WR[wd] WR[wd] FCOR.df FCOR.df wd,ws,wt FCOR.W wd,ws,wt FCOR.D Vector Floating-Point QuietCompare Ordered Vector endfor */ operation. compare ordered quiet defined Implementation /* endfor 64 MSA function OrderedFP(tt, ts, n) OrderedFP(tt, function OrderedFP endfunction FCOR.D .. WRLEN/64-1 i in 0 for FCOR.W .. WRLEN/32-1 i in 0 for 011110 are not NaN values, otherwise setall are bits tonotNaN0. values, quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits in to 1 bits all Set Format: Purpose: floating-pointtovector quiet compare ordered; Vector Description: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Ordered Compare Quiet Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 . FCUEQ.df I df -2008. 3RF 011010 TM 6 5 wd , 32) , 64) true all destination bits are set, otherwise , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 164 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are unordered or equal, other- ws 32i+31..32i 64i+63..64i wt , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 df ws 16 15 , WR[wt] , WR[wt] 32 64 ed Exception,MSA Floating PointException. 32i+31..32i 64i+63..64i . In case of a floating-point exception, the default result has all bits set to 64i+63..64i 32i+31..32i (c | d) (c | d) MSACSR 15556   t unordered quiet compare for or equality; if 22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0011 df wt (ws[i] =?(quiet) wt[i])  (ws[i] =?(quiet) 64i+63..64i 32i+31..32i elements if the corresponding wd EqualFP(WR[ws] EqualFP(WR[ws] UnorderedFP(WR[ws] UnorderedFP(WR[ws] 26 25 wd[i] d  d  c  c  WR[wd] WR[wd] FCUEQ.df FCUEQ.df wd,ws,wt FCUEQ.W wd,ws,wt FCUEQ.D Vector Floating-Point CompareQuiet Unordered or Equal Vector endfor */ operation. compare unordered quiet defined Implementation /* */ operation. compare equal quiet defined Implementation /* endfor 64 MSA function UnorderedFP(tt, ts, n) ts, UnorderedFP(tt, function UnorderedFP endfunction n) ts, EqualFP(tt, function EqualFP endfunction FCUEQ.W .. WRLEN/32-1 i in 0 for FCUEQ.D .. WRLEN/64-1 i in 0 for 011110 wise set all bits to0. quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits to 1 in Format: Purpose: to vector Vector floating-poin clear. Description: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 1. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Unordered or Equal or Unordered Compare Quiet Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 . FCULE.df I df -2008. 3RF 011010 TM 6 5 wd , 32) , 64) , 32) , 64) , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 165 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i floating-point elements are unordered or less than or equal , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 df ws 32 64 16 15 , WR[wt] , WR[wt] , WR[wt] , WR[wt] are Unordered or Less or Equal 32i+31..32i 64i+63..64i . In case of a floating-point exception, the default result has all bits set to 64i+63..64i 32i+31..32i 64i+63..64i 32i+31..32i (c | d | e) (c | d (c | d | e) (c | d MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0111 df wt (ws[i] <=?(quiet) wt[i]) <=?(quiet) (ws[i]  elements if the corresponding 64i+63..64i 32i+31..32i wd LessFP(WR[ws] LessFP(WR[ws] UnorderedFP(WR[ws] UnorderedFP(WR[ws] EqualFP(WR[ws] EqualFP(WR[ws] 26 25 wd[i] d  d  c  c  e  e  WR[wd] WR[wd] FCULE.df FCULE.df wd,ws,wt FCULE.W wd,ws,wt FCULE.D Vector Floating-Point QuietComp Vector endfor */ operation. compare unordered quiet defined Implementation /* */ operation. than compare less quiet defined Implementation /* endfor 64 floating-point elements,otherwiseall set bits to0. MSA function UnorderedFP(tt, ts, n) ts, UnorderedFP(tt, function UnorderedFP endfunction n) ts, LessThanFP(tt, function LessThanFP endfunction FCULE.W .. WRLEN/32-1 i in 0 for FCULE.D .. WRLEN/64-1 i in 0 for 011110 wt The quietThe compare operation bythe IEEEis defined Standard for Floating-Point Arithmetic754 to Set all bits to 1 in Format: Purpose: to Vector vector floating-point quiet compare for unordered or less than or equal; if true all destination bits are set, clear. otherwise Description: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 1. in floating-point operandsdata formatThe are values 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Quiet Compare Unordered or Less or Equal or Less Unordered Quiet Compare Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

FCULE.df I itecture Module, Revision 1.12 Revision Module, itecture 166 iet equal compare operation. */ operation. compare iet equal ed Exception,MSA Floating PointException. Volume IV-j: The MIPS64® SIMD Arch /* Implementation defined qu defined Implementation /* function EqualFP(tt, ts, n) EqualFP(tt, function endfunction EqualFP endfunction Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Vector Floating-Point Quiet Compare Unordered or Less or Equal or Less Unordered Quiet Compare Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 floating- wt FCULT.df I . df -2008. 3RF 011010 TM 6 5 wd , 32) , 64) , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 167 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands less than; if true all destination bits are set, otherwise ws 32i+31..32i 64i+63..64i , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 floating-point elementsare unordered or lessthan df ws 16 15 , WR[wt] , WR[wt] 32 ed Exception,MSA Floating PointException. 32i+31..32i 64i+63..64i . In case of a floating-point exception, the default result has all bits set to 64 32i+31..32i 64i+63..64i c (c | d) MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0101 df wt (ws[i]

MSA MSA 0 FFQL.df I ). 31 2RF . or 2 011110 wd 15 6 5 56 input data is half the size of the out- wd , 16) , 32) by the IEEE Standard for Floating-Point . The results are floating-point values in data 11 10 df itecture Module, Revision 1.12 Revision Module, itecture 177 5 point doubling the element element width. doubling the point 1 df ws floating-point.The result writtenis tovector 16i+15+WRLEN/2..16i+WRLEN/2 32i+31+WRLEN/2..32i+WRLEN/2 17 16 15 are up-converted to are up-converted floating-point data format, i.e. from 16-bit Q15 to ws t conversion t operations conversion are defined ert fromFixed-Point Left f f g-point exceptions are possible because the point data format half the size of   format conversion to floating- format conversion 110011010 Volume IV-j: The MIPS64® SIMD Arch ) and then the resulting floating-point value is scaled down (divided by2 (divided andis scaled down ) thentheresulting floating-point value  from_q(left_half(ws)[i]) 32i+31..32i 64i+63..64i 31 FromFixPointFP(WR[ws] FromFixPointFP(WR[ws] -2008. No floatin or 2 or TM 26 25 wd[i] 15 f  WR[wd] WR[wd] f  FFQL.df FFQL.df wd,ws FFQL.W wd,ws FFQL.D Vector Floating-Point Conv Vector . endfor endfor */ conversion. to floating-point fixed-point defined Implementation /* df 69 MSA FFQL.W .. WRLEN/32-1 i in 0 for FFQL.D .. WRLEN/64-1 i in 0 for n) FromFixPointFP(tt, function FromFixPointFP endfunction 011110 The scaling and integer to floating-poin Arithmetic 754 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The left half elementsfixed-point in vector Format: Purpose: Vector left fix-point elements elements left fix-point Vector Description: 32-bitfloating-point, fromor 32-bitQ31 to64-bit The fixed-point Q15 or Q31 value is first converted to floating-point as a 16-bit or 32-bit integer (as though it was scaled up by2 put. The operands are values in fixed- Restrictions: No data-dependentare possible.exceptions Operation: format 31 Vector Floating-Point Convertfrom Fixed-PointLeft MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 FFQR.df I ). 31 2RF . or 2 011110 wd 15 6 5 56 input data is half the size of the out- wd by the IEEE Standard for Floating-Point . The results are floating-point values in data 11 10 df itecture Module, Revision 1.12 Revision Module, itecture 178 5 , 16) , 32) 1 df ws floating-point.The result writtenis tovector 16i+15..16i 32i+31..32i 17 16 15 are up-converted to floating-point data format, i.e. from 16-bit Q15 to Q15 16-bit from i.e. data format, to floating-point up-converted are ws t conversion t operations conversion are defined ert from Fixed-Point ert fromRightFixed-Point f f g-point exceptions are possible because the point data format half the size of   format conversion tofloating-point doubling formattheelement conversion width. 110011011 Volume IV-j: The MIPS64® SIMD Arch ) and then the resulting floating-point value is scaled down (divided by2 (divided andis scaled down ) thentheresulting floating-point value  from_q(right_half(ws)[i]); int elements in vector elements int 32i+31..32i 64i+63..64i 31 FromFixPointFP(WR[ws] FromFixPointFP(WR[wt] -2008. No floatin or 2 or TM 26 25 wd[i] 15 WR[wd] WR[ws] f  f  FFQR.df FFQR.df wd,ws FFQR.W wd,ws FFQR.D Vector Floating-Point Conv Vector . endfor endfor */ conversion. to floating-point fixed-point defined Implementation /* df 69 MSA FFQR.W .. WRLEN/32-1 i in 0 for FFQR.D .. WRLEN/64-1 i in 0 for n) FromFixPointFP(tt, function FromFixPointFP endfunction 011110 The scaling and integer to floating-poin Arithmetic 754 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The right half fixed-po half right The Format: Purpose: Vector right fix-point elements right fix-point Vector Description: put. The operands are values in fixed- Restrictions: No data-dependentare possible.exceptions Operation: format 32-bitfloating-point, fromor 32-bitQ31 to64-bit The fixed-point Q15 or Q31 value is first converted to floating-point as a 16-bit or 32-bit integer (as though it was scaled up by2 31 Vector Floating-Point Convertfrom Fixed-PointRight MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA 0 FILL.df I MIPS64 MSA 2R 011110 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 179 556 . If the source GPR is wider than the destination data format, the df rs wd 15..0 31..0 63..0 7..0 18 17 16 15 GPR[rs] GPR[rs] GPR[rs]     GPR[rs] Volume IV-j: The MIPS64® SIMD Arch 11000000  rs 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i in vectorto all elements value rs 26 25 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] FILL.df FILL.df wd,rs FILL.B wd,rs FILL.H wd,rs FILL.W wd,rs FILL.D Vector Fill from GPR Vector endfor endfor endfor endfor 682 MSA FILL.W .. WRLEN/32-1 i in 0 for FILL.D .. WRLEN/64-1 i in 0 for FILL.B .. WRLEN/8-1 i in 0 for FILL.H .. WRLEN/16-1 i in 0 for 011110 destination'selements will be set to the least significant bits of the GPR. Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector elements GPR. replicated from elements Vector Description: Purpose: Replicate GPR Format: 31 Vector Fill fromGPR MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FLOG2.df I 2RF 011110 6 5 56 wd are written as floating-point values to ws . df 11 10 itecture Module, Revision 1.12 Revision Module, itecture 180 5 by the I EEE Standard for Floating-Point Arithmetic 754 1 df ws as defined by the IEEE Standard for Floating-Point Arithmetic , 32) , 64) 17 16 15 ed Exception,MSA Floating PointException. logB() 32i+31..32i 64i+63..64i l f   110010111 Volume IV-j: The MIPS64® SIMD Arch results are values infloating-point results dataformat are values  log2(ws[i]) wd 32i+31..32i 64i+63..64i . Log2FP(WR[ws] Log2FP(WR[ws] wd 26 25 wd[i] l  WR[wd] WR[wd] f  FLOG2.df FLOG2.df wd,ws FLOG2.W wd,ws FLOG2.D Vector Floating-Point LogarithmBase 2 Vector endfor endfor */ base 2 operation. logarithm defined Implementation /* operands and operands -2008. 69 MSA ws FLOG2.W .. WRLEN/32-1 i in 0 for FLOG2.D .. WRLEN/64-1 i in 0 for n) Log2FP(tt, function Log2FP endfunction 011110 TM Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The signed integral base 2 e xponents of floating-point elements in vector Format: Purpose: floating-point base 2 logarithm. Vector Description: elements vector This operation is the homogeneous base 2 754 The 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Base 2 Logarithm 2 Base Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 , 64) , 32) FMADD.df I 3RF 011011 -2008. The multipli--2008.The 32i+31..32i 64i+63..64i TM to the floating- are added ws 6 5 , WR[wt] , WR[wt] wd 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 181 . df ws , WR[ws] , WR[ws] by the I EEE Standard for Floating-Point Arithmetic 754 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. multiplied by floating-point elements in vector gnals Invalid Operation exception. If the Invalid Operation exception is dis- wt 15556 es ines floating-point data format   22 21 20 Volume IV-j: The MIPS64® SIMD Arch . The operation is fused, i.e. computed as if with unbounded range and precision, rounding precision, and range unbounded with if as i.e. computed is fused, operation The . wd 0100 df wt wd[i] + ws[i] * wt[i] + wd[i]  MultiplyAddFP(WR[wd] MultiplyAddFP(WR[wd] 32i+31..32i 64i+63..64i 26 25 wd[i] WR[wd] endfor WR[wd] FMADD.df FMADD.df wd,ws,wt FMADD.W wd,ws,wt FMADD.D Vector Floating-Point Multiply-Add Vector /* Implementation defined multiply add operation. */ operation. add multiply defined Implementation /* endfor 64 MSA FMADD.W .. WRLEN/32-1 i in 0 for function MultiplyAddFP(td, tt, ts, n) tt, MultiplyAddFP(td, function MultiplyAddFP endfunction FMADD.D .. WRLEN/64-1 i in 0 for 011110 cation between an infinity and a zero si abled, the result is the defaultquiet NaN. valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point elements in vector Format: Purpose: floating-point multiply-add Vector Description: point elements in vector point only once to thedestination format. 754 Arithmetic for Floating-Point Standard IEEE the by defined is operation add multiply The 2008. Operation: 31 Vector Floating-Point Multiply-Add MIPS® Architecture for Programmers Programmers MIPS® Architecture for . - - wd TM TM MSA MSA 0 FMAX.df I 3RF 011011 are writtento vector are wt , 32) , 64) 6 5 wd and vector 32i+31..32i 64i+63..64i ws 11 10 itecture Module, Revision 1.12 Revision Module, itecture 182 . , WR[wt] , WR[wt] df ws by the I EEE Standard for Floating-Point Arithmetic 754 32i+31..32i 64i+63..64i 16 15 ed Exception,MSA Floating PointException. floating-point elements in vector in elements floating-point MaxFP(WR[ws] MaxFP(WR[ws] 15556   es ines floating-point data format 22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1110 df wt max(ws[i],  max(ws[i], wt[i]) 32i+31..32i 64i+63..64i 26 25 wd[i] WR[wd] WR[wd] FMAX.df FMAX.df wd,ws,wt FMAX.W wd,ws,wt FMAX.D Vector Floating-Point Maximum Vector endfor endfor */ argument. largest the returns defined, Implementation /* 64 MSA FMAX.D .. WRLEN/64-1 i in 0 for n) ts, MaxFP(tt, function MaxFP endfunction FMAX.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The largest value is defined by the maxNum operation in the IEEE Standard for Floating-Point Arithmetic 754 The largest values between corresponding between values largest The Format: Purpose: floating-point maximum. Vector Description: 2008. valu are and results operands The 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Maximum MIPS® Architecture for Programmers Programmers MIPS® Architecture for - ws TM MSA MSA 0 FMAX_A.df I , 32) , 64) 3RF 011011 for Floating-Point Arithme- Floating-Point for 32i+31..32i 64i+63..64i 6 5 wd , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 183 . 32i+31..32i 64i+63..64i df ws operation in the IEEE Standard the IEEE in operation betweencorresponding floating-pointelementsvector in by the I EEE Standard for Floating-Point Arithmetic 754 16 15 ed Exception,MSA Floating PointException. . MaxAbsoluteFP(WR[ws] MaxAbsoluteFP(WR[ws] wd 15556 es ines floating-point data format   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1111 df wt absolute_value(ws[i]) > absolute_value(wt[i])? ws[i]: wt[i] ws[i]: > absolute_value(wt[i])? absolute_value(ws[i])  32i+31..32i 64i+63..64i absolute value. For equal absolute values, returns the largest the returns values, absolute equal For value. absolute argument.*/ 26 25 wd[i] WR[wd] WR[wd] are written tovector FMAX_A.df FMAX_A.df wd,ws,wt FMAX_A.W wd,ws,wt FMAX_A.D Vector Floating-Point MaximumBased onAbsolute Values Vector wt -2008. endfor endfor largest with argument the returns defined, Implementation /* TM 64 MSA FMAX_A.W .. WRLEN/32-1 i in 0 for FMAX_A.D .. WRLEN/64-1 i in 0 for n) ts, MaxAbsoluteFP(tt, function MaxAbsoluteFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved and vector The largest absolute value is defined by the maxNumMag the by is defined value absolute largest The The value with the largest magnitude, i.e. absolute value, i.e. absolute magnitude, largest withthe value The Format: Purpose: floating-point maximumbased on the magnitude,i.e. absolutevalues. Vector Description: tic 754 The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Floating-Point Maximum Based on Absolute Values on Absolute Based Maximum Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - - TM TM MSA MSA 0 IFMIN.df are written to 3RF wt 011011 , 32) , 64) 6 5 and v ector ws wd 32i+31..32i 64i+63..64i rd for Floating-Point Arithmetic 754 11 10 itecture Module, Revision 1.12 Revision Module, itecture 184 . , WR[wt] , WR[wt] df ws by the I EEE Standard for Floating-Point Arithmetic 754 32i+31..32i 64i+63..64i 16 15 ed Exception,MSA Floating PointException. ng floating-point elements in v ector MinFP(WR[ws] MinFP(WR[ws] e minNum operation in the IEEE Standa 15556   es ines floating-point data format 22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1100 df wt min(ws[i],  min(ws[i], wt[i]) 32i+31..32i 64i+63..64i 26 25 wd[i] WR[wd] WR[wd] FMIN.df FMIN.df wd,ws,wt FMIN.W wd,ws,wt FMIN.D Vector Floating-Point Minimum Vector . endfor endfor */ argument. smallest the returns defined, Implementation /* wd 64 MSA FMIN.D .. WRLEN/64-1 i in 0 for n) ts, MinFP(tt, function MinFP endfunction FMIN.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The smallest value between correspondi Format: Purpose: floating-point minimum. Vector Description: 2008. valu are and results operands The 2008. Operation: Restrictions: Data-dependent exceptions are poss ible as s pecified The valuesmallest is defined by th vector 31 Vector Floating-Point Minimum MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FMIN_A.df I , 32) , 64) 3RF 011011 floating-point elements in 32i+31..32i 64i+63..64i 6 5 wd , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 185 . 32i+31..32i 64i+63..64i df ws by the I EEE Standard for Floating-Point Arithmetic 754 16 15 . ed Exception,MSA Floating PointException. wd MinAbsoluteFP(WR[ws] MinAbsoluteFP(WR[ws] 15556 es ines floating-point data format   22 21 20 Volume IV-j: The MIPS64® SIMD Arch are written to vector are written to 1101 df wt absolute_value(ws[i]) < absolute_value(wt[i])? ws[i]: wt[i] ws[i]: < absolute_value(wt[i])? absolute_value(ws[i])  32i+31..32i 64i+63..64i wt absolute value. For equal absolute values, returns the smallest the returns values, absolute equal For value. absolute argument.*/ 26 25 wd[i] WR[wd] WR[wd] -2008. FMIN_A.df FMIN_A.df wd,ws,wt FMIN_A.W wd,ws,wt FMIN_A.D Vector Floating-Point Minimum Basedon Absolute Values Vector TM and vector endfor endfor smallest with argument the returns defined, Implementation /* ws 64 MSA FMIN_A.W .. WRLEN/32-1 i in 0 for FMIN_A.D .. WRLEN/64-1 i in 0 for n) ts, MinAbsoluteFP(tt, function MinAbsoluteFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The value with the smallest magnitude, i.e. absol ute lue, va between corres ponding Format: Purpose: floating-point minimum based on the magnitude, absolutei.e. values. Vector Description: vector The smallest absolute value is defined by the minNumMag operation in the IEEE rd Standa for Floating-Point Arith- metic 754 The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Floating-Point Minimum Based on Absolute Values on Absolute Based Minimum Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 , 64) , 32) -2008. The FMSUB.df I TM 3RF 011011 32i+31..32i 64i+63..64i are subtracted from the ws 6 5 e Invalid Operation exceptione Invalid , WR[wt] , WR[wt] wd 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 186 . oint elements in vector df ws , WR[ws] , WR[ws] lid Operation exception. If th by the I EEE Standard for Floating-Point Arithmetic 754 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. multiplied by floating-p wt . The. operation i.e. is computed fused, as if with unbounded rangeand precision, defined theby IEEE Standard for Floating-Point Arithmetic 754 15556 es ines floating-point data format   wd 22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0101 df wt wd[i] - ws[i] * wt[i] - wd[i]  MultiplySubFP(WR[wd] MultiplySubFP(WR[wd] 32i+31..32i 64i+63..64i 26 25 wd[i] WR[wd] WR[wd] FMSUB.df FMSUB.df wd,ws,wt FMSUB.W wd,ws,wt FMSUB.D Vector Floating-Point Multiply-Sub Vector endfor */ operation. subtract multiply defined Implementation /* endfor 64 MSA FMSUB.W .. WRLEN/32-1 i in 0 for function MultiplySubFP(td, tt, ts, n) tt, MultiplySubFP(td, function MultiplySubFP endfunction FMSUB.D .. WRLEN/64-1 i in 0 for 011110 rounding only once tothedestination format. The multiply subtract operation is Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point elements in vector Format: Purpose: floating-point multiply-sub Vector Description: floating-point in vector elements multiplication between an infinity and a zero signals Inva is disabled, theresult is the default quietNaN. valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Floating-Point Multiply-Sub MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FMUL.df I 3RF -2008. . The result is writ- The result . 011011 , 32) , 64) TM ws 6 5 32i+31..32i 64i+63..64i wd , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 187 . df ws 32i+31..32i 64i+63..64i by the I EEE Standard for Floating-Point Arithmetic 754 16 15 ed Exception,MSA Floating PointException. are multiplied by thefloating-point elementsin vector wt MultiplyFP(WR[ws] MultiplyFP(WR[ws] 15556   es ines floating-point data format 22 21 20 Volume IV-j: The MIPS64® SIMD Arch ion is defined by the IEEE Standard for Floating-Point Arithmetic 754 Arithmetic Floating-Point for Standard by the IEEE ion is defined 0010 df wt  ws[i] * wt[i] 32i+31..32i 64i+63..64i . 26 25 wd[i] WR[wd] WR[wd] wd FMUL.df FMUL.df wd,ws,wt FMUL.W wd,ws,wt FMUL.D Vector Floating-Point Multiplication Vector endfor endfor */ operation. multiplication defined Implementation /* 64 MSA FMUL.W .. WRLEN/32-1 i in 0 for FMUL.D .. WRLEN/64-1 i in 0 for n) ts, MultiplyFP(tt, function MultiplyFP endfunction 011110 ten tovector multiplication operat The Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point elements in vector elements floating-point The Format: Purpose: floating-point multiplication. Vector Description: 2008. Operation: The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Multiplication MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FRCP.df I 2RF 011110 low. low. The result is w ritten to 6 5 e IEEE Standard for Floating- for Standard IEEE e 56 wd , 32) , 64) , ce the approximatedthe ce from resultthe may differ 11 10 itecture Module, Revision 1.12 Revision Module, itecture 188 . df 32i+31..32i 64i+63..64i 5 led based on the IEEE Standard for Floating-Point Arithme-Floating-Point for Standard IEEE based on the led as specifare calculated ied be -2008 foroperation. the divide by the I EEE Standard for Floating-Point Arithmetic 754 ws TM l all the exceptions specified by specified all the exceptions signal The compliant reciprocals . 1 df ws 17 16 15 ed Exception,MSA Floating PointException. MSACSR be approximate. The approximation fromdiffers the compliant reciprocal rep- ReciprocalFP(WR[ws] ReciprocalFP(WR[ws] es ines floating-point data format   te reciprocal operations are allowed to not signal the Overflow or Underflow excep- or Underflow te toreciprocal notsignaloperations theare allowed Overflow eciprocal is Inexact or if there is a chan eciprocal is Inexact 110010101 Volume IV-j: The MIPS64® SIMD Arch ng-Point Arithmetic 754 Arithmetic ng-Point one unit in the least significant place. Approximate reciprocal operations signal the Inex- Approximateoperations place. reciprocal significant the least in unit one -2008 defined divide operation is affected by the rounding mode bits andRM flush-to-zero 1.0 / ws[i] 1.0  64i+63..64i 32i+31..32i TM 26 25 wd[i] WR[wd] WR[wd] FRCP.df FRCP.df wd,ws FRCP.W wd,ws FRCP.D Vector Approximate Floating-Point Reciprocal Vector -2008 defined divide operation. divide -2008 defined . endfor endfor */ operation. Reciprocal defined Implementation /* TM wd 69 MSA FRCP.W .. WRLEN/32-1 i in 0 for FRCP.D .. WRLEN/64-1 i in 0 for n) ts, ReciprocalFP(tt, function ReciprocalFP endfunction 011110 bit FS in MSA Control and Status Register FS in MSA Control bit the IEEE Standard for Floati Standard the IEEE Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved tic 754 The reciprocals of fl oating-point elements in vector r compliant if the act exception Format: Purpose: floating-point reciprocal. Vector Description: The reciprocal operation is to allowed than by no more resentation compliant Approxima reciprocal. are signa Zero exceptions by and divide The Invalid tions. The compliant reciprocal operation is defined as 1.0 divided by element value, where th where value, by element 1.0 divided as defined is operation compliant reciprocal The vector Point Arithmetic 754 The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Approximate Floating-Point Reciprocal Floating-Point Approximate Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA -2008, 0 TM FRINT.df I 2RF 011110 . The result is written to int Arithmetic 754 6 5 MSACSR 56 wd rd for Floating-Po 11 10 itecture Module, Revision 1.12 Revision Module, itecture 189 . df 5 by the I EEE Standard for Floating-Point Arithmetic 754 , 32) , 64) 1 df ws 17 16 15 ed Exception,MSA Floating PointException. 32i+31..32i 64i+63..64i are rounded to an integral valued floating-point number in the same format ws f f   es ines floating-point data format 110010110 Volume IV-j: The MIPS64® SIMD Arch  round_int(ws[i]) 32i+31..32i 64i+63..64i RoundIntFP(WR[ws] RoundIntFP(WR[ws] 26 25 wd[i] WR[wd] WR[wd] f  f  FRINT.df FRINT.df wd,ws FRINT.W wd,ws FRINT.D Vector Floating-Point Roundto Integer Vector . endfor endfor */ operation. to integer round defined Implementation /* wd 69 MSA FRINT.W .. WRLEN/32-1 i in 0 for FRINT.D .. WRLEN/64-1 i in 0 for n) RoundIntFP(tt, function RoundIntFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The round to integer operation is exact as defined by the IEEE Standa The floating-point elements in vector Format: Purpose: floating-point roundto integer. Vector Description: based on the rounding mode bits vector RM in MSA Control and Status Register 2008. Operation: i.e. the Inexact exception is signaled if the result does not have theis signaled sameas exception thei.e. thetheif input Inexact numericalresult doesoperand. notvalue have valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Round to Integer to Round Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for . - TM MSA MSA 0 MSACSR FRSQRT.df I 2RF 011110 on differs from the compli- from the differs on EEE Standard for Floating- gnificant gnificant place. Approximate 6 5 56 wd mpliant reciprocal of the square root. are calculated as specif ied below. The ws , 32) , 64) -2008 defined divide andoperations root square divide -2008 defined 11 10 itecture Module, Revision 1.12 Revision Module, itecture 190 . TM df two units in the least si 5 exception exception if the compliant reciprocal of the square root is 32i+31..32i 64i+63..64i e exceptions specified by the I be approximate. The approximati be by the I EEE Standard for Floating-Point Arithmetic 754 1 df ws 17 16 15 tReciprocal of Square Root root operation is defined as 1.0 di vided by the square root of the element oximated result may differ from the co of floating-point elements in vector f f bits RM and flush-to-zero bit FS in MSA Control and Status Register es ines floating-point data format   for Floating-Point Arithmetic 754 Floating-Point for 110010100 Volume IV-j: The MIPS64® SIMD Arch . wd -2008 for the divide rootsoperations.and square -2008 for the divide 1.0 / sqrt(ws[i]) 1.0  32i+31..32i 64i+63..64i TM SquareRootReciprocalFP(WR[ws] SquareRootReciprocalFP(WR[ws] 26 25 wd[i] f  f  WR[wd] WR[wd] FRSQRT.df FRSQRT.df wd,ws FRSQRT.W wd,ws FRSQRT.D Vector Approximate Floating-Poin Vector endfor */ operation. reciprocal root square defined Implementation /* endfor -2008 defined divide operation. divide -2008 defined 69 MSA FRSQRT.W .. WRLEN/32-1 i in 0 for function SquareRootReciprocalFP(tt, ts, n) SquareRootReciprocalFP(tt, function SquareRootReciprocalFP endfunction FRSQRT.D .. WRLEN/64-1 i in 0 for 011110 TM The compliant reciprocal of the square result writtenis tovector Standard the IEEE where value, The Invalid and divide by Zero exceptions are signaled based on the IEEE Stand ard for Floating-Point Arithmetic 754 ant reciprocal of the square root representation by no more than reciprocal of the square root operations signal the Inexact Inexact or if there is a chance the appr are affected by the rounding mode The compliant reciprocals of the square roots signal all th PointArithmetic 754 The reciprocals of the square roots Format: Purpose: Vector floating-point reciprocal of square root. Vector Description: The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified The reciprocal of the square root operation is allowed to reciprocalThe rootoperation of the square is allowed 2008. Operation: 31 Vector Approximate Floating-Point Reciprocal of Square Root of Square Reciprocal Floating-Point Approximate Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

FRSQRT.df I itecture Module, Revision 1.12 Revision Module, itecture 191 ed Exception,MSA Floating PointException. Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Exception, Disabl MSA Instruction Reserved Vector Approximate Floating-Point Reciprocal of Square Root of Square Reciprocal Floating-Point Approximate Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSAF.df I . df , 32) , 64) 3RF 011010 32i+31..32i 64i+63..64i 6 5 wd , WR[wt] , WR[wt] signal Invalid Operation exception. signal Invalid wt or ws 11 10 itecture Module, Revision 1.12 Revision Module, itecture 192 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 df 16 15 e always false; all destination bits are clear. bits are all destination false; always e ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to SignalingFALSE(WR[ws] SignalingFALSE(WR[ws] MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1000 df wt signalingFalse(ws[i], wt[i])  signalingFalse(ws[i], 32i+31..32i 64i+63..64i elements. Signaling and quiet NaN elements in quiet NaN and elements. Signaling wd 26 25 wd[i] WR[wd] WR[wd] FSAF.df FSAF.df wd,ws,wt FSAF.W wd,ws,wt FSAF.D Vector Floating-Point SignalingFalse CompareAlways Vector endfor endfor */ NaN test and quiet signaling defined Implementation /* 0 return 64 MSA FSAF.W .. WRLEN/32-1 i in 0 for FSAF.D .. WRLEN/64-1 i in 0 for ts, n) SignalingFALSE(tt, function SignalingFALSE endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Setall bits to0 in Format: Purpose: floating-pointtovector signalingcompar Vector Description: 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Always False Always Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSEQ.df I . -2008. df TM 3RF 011010 6 5 ts are set, otherwise clear. otherwise are set, ts wd , 32) , 64) 11 10 itecture Module, Revision 1.12 Revision Module, itecture 193 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elementsare equal,otherwise set bitsall ws y; trueif alldestination bi wt . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 and , WR[wt] , WR[wt] df ws 16 15 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 32 64 c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1010 df wt (ws[i] =(signaling) wt[i]) =(signaling) (ws[i]  32i+31..32i 64i+63..64i elements if the corresponding EqualSigFP(WR[ws] EqualSigFP(WR[ws] wd 26 25 wd[i] c  c  WR[wd] WR[wd] FSEQ.df FSEQ.df wd,ws,wt FSEQ.W wd,ws,wt FSEQ.D Vector Floating-Point Signaling CompareEqual Vector endfor */ operation. equal compare signaling defined Implementation /* endfor 64 MSA function EqualSigFP(tt, ts, n) ts, EqualSigFP(tt, function EqualSigFP endfunction FSEQ.D .. WRLEN/64-1 i in 0 for FSEQ.W .. WRLEN/32-1 i in 0 for 011110 Description: to 0. signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Setall bitsto 1 in Format: Purpose: floating-pointvector to signalingcompare equalitfor Vector 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Equal Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 floating- FSLE.df I wt . -2008. df TM 3RF 011010 6 5 all destination bits are set, otherwise wd , 32) , 64) , 32) , 64) , 11 10 itecture Module, Revision 1.12 Revision Module, itecture 194 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws . The results are values in integer data format data integer in results are values . The floating-point elements are less than or equal to by the I EEE Standard for Floating-Point Arithmetic 754 , WR[wt] , WR[wt] df , WR[wt] , WR[wt] ws 16 15 32 64 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i compare for less than or equal; if true (c | d) (c | d) MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1110 df wt (ws[i] <=(signaling) wt[i]) <=(signaling) (ws[i]  elements if the corresponding 64i+63..64i 32i+31..32i wd LessSigFP(WR[ws] LessSigFP(WR[ws] EqualSigFP(WR[ws] EqualSigFP(WR[ws] 26 25 wd[i] c  c  d  WR[wd] d  WR[wd] FSLE.df FSLE.df wd,ws,wt FSLE.W wd,ws,wt FSLE.D Vector Floating-Point Signaling EqualLess or Compare Vector endfor */ operation. compare less than signaling defined Implementation /* */ operation. equal compare signaling defined Implementation /* endfor 64 MSA function LessThanSigFP(tt, ts, n) ts, LessThanSigFP(tt, function LessThanSigFP endfunction n) ts, EqualSigFP(tt, function EqualSigFP endfunction FSLE.W .. WRLEN/32-1 i in 0 for FSLE.D .. WRLEN/64-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved clear. Description: 0. to all bits otherwise set point elements, signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 Set all bits to 1 in Format: Purpose: to vector Vector floating-point signaling 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Less or Equal or Less Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSLT.df I . -2008. df TM 3RF floating-point ele- 011010 wt 6 5 wd , 32) , 64) , 11 10 itecture Module, Revision 1.12 Revision Module, itecture 195 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws an; set, if true all destination bitsotherwise clear. are floating-point elements are less than . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 ws df , WR[wt] , WR[wt] 16 15 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 32 64 c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1100 df wt (ws[i] <(signaling) wt[i]) <(signaling) (ws[i]  elements if the corresponding 32i+31..32i 64i+63..64i wd LessSigFP(WR[ws] LessSigFP(WR[ws] 26 25 wd[i] c  c  WR[wd] WR[wd] FSLT.df FSLT.df wd,ws,wt FSLT.W wd,ws,wt FSLT.D Vector Floating-Point Signaling Less Than Compare Vector endfor */ operation. compare less than signaling defined Implementation /* endfor 64 MSA function LessThanSigFP(tt, ts, n) ts, LessThanSigFP(tt, function LessThanSigFP endfunction FSLT.D .. WRLEN/64-1 i in 0 for FSLT.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Description: Set Set all bits to 1 in Format: Purpose: floating-pointvector to signalingcompare lessfor th Vector ments, otherwiseset allbits to 0. signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Less Than Less Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSNE.df I . -2008. df TM 3RF 011100 6 5 , 32) , 64) wd 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 196 erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are not equal, otherwise set all l; set, if true all destination bitsotherwise clear. are ws wt , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 df ws 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32 64 (signaling) wt[i]) (signaling) c c  MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1011 df wt  (ws[i] 32i+31..32i 64i+63..64i elements if the corresponding NotEqualSigFP(WR[ws] NotEqualSigFP(WR[ws] wd 26 25 wd[i] c  c  WR[wd] WR[wd] FSNE.df FSNE.df wd,ws,wt FSNE.W wd,ws,wt FSNE.D Vector Floating-Point Signaling CompareNot Equal Vector endfor */ operation. compare not equal signaling defined Implementation /* endfor 64 MSA function NotEqualSigFP(tt, ts, n) ts, NotEqualSigFP(tt, function NotEqualSigFP endfunction FSNE.D .. WRLEN/64-1 i in 0 for FSNE.W .. WRLEN/32-1 i in 0 for 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved bitsto 0. signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 Set all in bits to 1 Format: Purpose: floating-pointvector to signalingcompare not for equa Vector Description: 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Not Equal Not Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSOR.df I . -2008. df TM 3RF 011100 ordered, i.e. both elements 6 5 , 32) , 64) wd 11 10 32i+31..32i 64i+63..64i itecture Module, Revision 1.12 Revision Module, itecture 197 erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are ws wt if truealldestination bits are set, otherwiseclear. , WR[wt] , WR[wt] , . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 df ws 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32 64 c c MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1001 df wt ws[i] !?(signaling) wt[i] !?(signaling) ws[i]  32i+31..32i 64i+63..64i elements if the corresponding OrderedSigFP(WR[ws] OrderedSigFP(WR[ws] wd 26 25 wd[i] c  WR[wd] c  WR[wd] FSOR.df FSOR.df wd,ws,wt FSOR.W wd,ws,wt FSOR.D Vector Floating-Point Signaling Ordered Compare Vector endfor operation. */ ordered compare /* Implementation defined signaling endfor 64 MSA function OrderedSigFP(tt, ts, n) ts, OrderedSigFP(tt, function OrderedSigFP endfunction FSOR.D .. WRLEN/64-1 i in 0 for FSOR.W .. WRLEN/32-1 i in 0 for 011110 are not NaN values, otherwise setall are bits tonotNaN0. values, signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits in to 1 bits all Set Format: Purpose: floating-pointvector to signalingcompareordered; Vector Description: 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 0. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Ordered Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSQRT.df I 2RF 011110 -2008. TM 6 5 56 . wd wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 198 . df 5 , 32) , 64) , written to vector are by the I EEE Standard for Floating-Point Arithmetic 754 ws 1 df ws 17 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. IEEE Standard for Floating-Point Arithmetic 754 Arithmetic Floating-Point for Standard IEEE f f es ines floating-point data format   110010011 Volume IV-j: The MIPS64® SIMD Arch  sqrt(ws[i]) 32i+31..32i 64i+63..64i SquareRootFP(WR[ws] SquareRootFP(WR[ws] 26 25 wd[i] f  WR[wd] WR[wd] f  FSQRT.df FSQRT.df wd,ws FSQRT.W wd,ws FSQRT.D Vector Floating-Point Square Root Vector endfor endfor */ operation. root square defined Implementation /* 69 MSA FSQRT.W .. WRLEN/32-1 i in 0 for FSQRT.D .. WRLEN/64-1 i in 0 for n) ts, SquareRootFP(tt, function SquareRootFP endfunction 011110 The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The squareThe roots of floating-point elements in vector Format: Purpose: floating-point square root. Vector Description: The square root operation is defined by the is defined root operation square The 2008. Operation: 31 Vector Floating-Point Square Root Square Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSUB.df I . The result is ws 3RF 011011 , 32) , 64) -2008. TM 6 5 32i+31..32i 64i+63..64i wd , WR[wt] , WR[wt] ting-point elements in vector 11 10 itecture Module, Revision 1.12 Revision Module, itecture 199 . df ws 32i+31..32i 64i+63..64i by the I EEE Standard for Floating-Point Arithmetic 754 16 15 Standard for Floating-Point Arithmetic754 ed Exception,MSA Floating PointException. are subtracted from the floa wt SubtractFP(WR[ws] SubtractFP(WR[ws] 15556 es ines floating-point data format   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 0001 df wt  ws[i] - wt[i] 32i+31..32i 64i+63..64i . wd 26 25 wd[i] WR[wd] WR[wd] FSUB.df FSUB.df wd,ws,wt FSUB.W wd,ws,wt FSUB.D Vector Floating-Point Subtraction Vector endfor endfor */ operation. subtract defined Implementation /* 64 MSA FSUB.W .. WRLEN/32-1 i in 0 for FSUB.D .. WRLEN/64-1 i in 0 for n) ts, SubtractFP(tt, function SubtractFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The subtract operation is defined byIEEE the subtractThe operation is defined The floating-point elements in vector Format: Purpose: floating-point subtraction. Vector Description: writtento vector The operands and results are valu are and results operands The Restrictions: Data-dependent exceptions are poss ible as s pecified 2008. Operation: 31 Vector Floating-Point Subtraction Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 . FSUEQ.df -2008. I df TM 3RF 011010 6 5 , 32) , 64) ue all destination bits are set, other- wd , 32) , 64) 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 200 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands floating-point elements are unordered or equal, other- ws wt , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The and by the I EEE Standard for Floating-Point Arithmetic 754 , WR[wt] , WR[wt] df ws 16 15 32i+31..32i 64i+63..64i 32 64 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i g for compare unordered or equality; if tr (c | d) (c | d) MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1011 df wt (ws[i] =?(signaling) wt[i]) =?(signaling) (ws[i]  64i+63..64i 32i+31..32i elements if the corresponding wd UnorderedSigFP(WR[ws] EqualSigFP(WR[ws] UnorderedSigFP(WR[ws] EqualSigFP(WR[ws] 26 25 wd[i] c  d  c  d  WR[wd] WR[wd] FSUEQ.df FSUEQ.df wd,ws,wt FSUEQ.W wd,ws,wt FSUEQ.D Vector Floating-Point Signaling CompareUnordered or Equal Vector endfor */ operation. compare unordered signaling defined Implementation /* */ operation. equal compare signaling defined Implementation /* endfor 64 MSA function UnorderedSigFP(tt, ts, n) UnorderedSigFP(tt, function UnorderedSigFP endfunction n) ts, EqualSigFP(tt, function EqualSigFP endfunction FSUEQ.W .. WRLEN/32-1 i in 0 for FSUEQ.D .. WRLEN/64-1 i in 0 for 011110 wise set all bits to0. signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Set all bits to 1 in Format: Purpose: to vector floating-point Vector signalin wise clear. Description: 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 1. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Unordered or Equal or Unordered Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 FSULE.df I . -2008. df TM 3RF 011010 6 5 , 32) , 64) wd , 32) , 64) , 32) , 64) , 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 201 32i+31..32i 64i+63..64i 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands ws , WR[wt] , WR[wt] floating-point elements are unordered or less than or equal . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 , WR[wt] , WR[wt] df ws , WR[wt] 32 , WR[wt] 64 16 15 32i+31..32i 64i+63..64i mpare Unorderedmpare Equal or Less or . In case of a floating-point exception, the default result has all bits set to 32i+31..32i 64i+63..64i 64i+63..64i 32i+31..32i (c | d | e) (c | d (c | d | e) (c | d MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1111 df wt (ws[i] <=?(signaling) wt[i]) <=?(signaling) (ws[i]  elements if the corresponding 64i+63..64i 32i+31..32i wd UnorderedSigFP(WR[ws] LessSigFP(WR[ws] UnorderedSigFP(WR[ws] LessSigFP(WR[ws] EqualSigFP(WR[ws] EqualSigFP(WR[ws] 26 25 wd[i] c  d  c  d  e  WR[wd] e  WR[wd] FSULE.df FSULE.df wd,ws,wt FSULE.W wd,ws,wt FSULE.D Vector Floating-Point Signaling Co Vector endfor */ operation. compare unordered signaling defined Implementation /* */ operation. compare less than signaling defined Implementation /* endfor 64 floating-point elements,otherwiseall set bits to0. MSA function UnorderedSigFP(tt, ts, n) UnorderedSigFP(tt, function UnorderedSigFP endfunction n) ts, LessThanSigFP(tt, function LessThanSigFP endfunction FSULE.W .. WRLEN/32-1 i in 0 for FSULE.D .. WRLEN/64-1 i in 0 for 011110 wt The signalingThe IEEE Standardby the operationcompare Floating-Point for is defined Arithmetic 754 to Vector to Vector vector floating-point signaling compare for unordered or less than or equal; if true all destination bits are otherwise clear. set, Description: Set all bits to 1 in Format: Purpose: 2008. Operation: MSA Control and Status Register The Inexact Exception is not signaled when subnormal input op inputsubnormal when signaled is not Exception Inexact The 1. in floating-point operandsdata formatThe are values Restrictions: Data-dependent exceptions are poss ible as s pecified 31 Vector Floating-Point Signaling Compare Unordered or Less or Equal or Less Unordered Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

FSULE.df I itecture Module, Revision 1.12 Revision Module, itecture 202 gnaling equal compare operation. */ operation. compare equal gnaling ed Exception,MSA Floating PointException. Volume IV-j: The MIPS64® SIMD Arch /* Implementation defined si defined Implementation /* function EqualSigFP(tt, ts, n) ts, EqualSigFP(tt, function endfunction EqualSigFP endfunction Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved Vector Floating-Point Signaling Compare Unordered or Less or Equal or Less Unordered Compare Signaling Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for - TM MSA MSA 0 floating- wt FSULT.df I . -2008. df TM 3RF 011010 6 5 , 32) , 64) wd , 32) , 64) , 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 203 32i+31..32i 64i+63..64i erands are flushed based on the flush-to-zero bit FS in on based flushed are erands d or less than; if true all destination bits other-are set, ws , WR[wt] , WR[wt] . The results are values in integer data format data integer in results are values . The by the I EEE Standard for Floating-Point Arithmetic 754 floating-point elementsare unordered or lessthan df , WR[wt] , WR[wt] ws 16 15 32i+31..32i 64i+63..64i 32 64 ed Exception,MSA Floating PointException. . In case of a floating-point exception, the default result has all bits set to 64i+63..64i 32i+31..32i (c | d) (c | d) MSACSR 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch 1101 df wt (ws[i]

MSA MSA 0 . FTINT_S.df I df 2RF 011110 6 5 . wd 56 wd have the same numerical value as the 11 10 itecture Module, Revision 1.12 Revision Module, itecture 206 5 outside the range of the destination format signal the , 32) , 64) . The results are values in integer data format data integer in results are values . The df 1 df ws . The result writtenis to vector 17 16 15 32i+31..32i 64i+63..64i ed Exception,MSA Floating PointException. MSACSR f f   are rounded and converted to signed integer values based on the rounding mode bits mode rounding the based on values integer signed to and converted are rounded ws 110011100 Volume IV-j: The MIPS64® SIMD Arch  to_int_s(ws[i]) 32i+31..32i 64i+63..64i ToIntSignedFP(WR[ws] ToIntSignedFP(WR[ws] integer conversion. */ conversion. integer 26 25 wd[i] WR[wd] WR[wd] f  f  -2008, i.e. the Inexact tion excep is signaled if the result does not FTINT_S.df FTINT_S.df wd,ws FTINT_S.W wd,ws FTINT_S.D Vector Floating-Point Convert to SignedInteger Floating-Point Convert Vector TM endfor endfor signed and rounding floating-point defined Implementation /* 69 MSA FTINT_S.W .. WRLEN/32-1 i in 0 for FTINT_S.D .. WRLEN/64-1 i in 0 for n) ToIntSignedFP(tt, function ToIntSignedFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point elements in elements floating-point The Format: Purpose: tosigned integer. floating-point convert Vector Description: Invalid Operation Invalid exception. For positive numeric operands outside the range, the default result is the largest signed The defaultinteger value. numericresult operandsfor negative outside the range is the smallest signed integer value. default resultThe for NaN operands is zero. in floating-point operandsdata formatThe are values input operand. Inthisresult case, the defaultis the rounded result. NaN values and numeric operands converting to an integer RM in MSA ControlandStatusRegister The floating-point operationto integer conversion is exact as defined by the IEEE Standard for Floating-Point Arith- metic 754 Restrictions: are possible. Data-dependent exceptions Operation: 31 Vector Floating-Point Convert to Signed Integer Signed to Convert Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 . df FTINT_U.df I 2RF 011110 . 6 5 wd 56 wd have the same numerical value as the 11 10 itecture Module, Revision 1.12 Revision Module, itecture 207 de the range, the default result is the largest unsigned the largest is result the default range, the de 5 outside the range of the destination format signal the , 32) , 64) . The results are values in integer data formatin integer The results. are values . The . result is writtentovector df 1 df ws 32i+31..32i 64i+63..64i 17 16 15 MSACSR ed Exception,MSA Floating PointException. f f are rounded to and converted unsigned basedinteger values on the rounding mode   ws 110011101 Volume IV-j: The MIPS64® SIMD Arch  to_int_u(ws[i]) 32i+31..32i 64i+63..64i ToIntUnsignedFP(WR[ws] ToIntUnsignedFP(WR[ws] integer conversion. */ conversion. integer 26 25 wd[i] WR[wd] WR[wd] f  f  -2008, i.e. the Inexact tion excep is signaled if the result does not FTINT_U.df FTINT_U.df wd,ws FTINT_U.W wd,ws FTINT_U.D Vector Floating-Point Round and Convert to UnsignedInteger Floating-Point Roundand Convert Vector TM endfor endfor unsigned and rounding floating-point defined Implementation /* 69 MSA FTINT_U.W .. WRLEN/32-1 i in 0 for FTINT_U.D .. WRLEN/64-1 i in 0 for n) ToIntUnsignedFP(tt, function ToIntUnsignedFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point operationto integer conversion is exact as defined by the IEEE Standard for Floating-Point Arith- metic 754 bitsRM inMSA Control and Status Register The elements in floating-point Format: Purpose: to unsigned integer. floating-point roundand convert Vector Description: integer value. The default result for negative numeric operands is zero. The resultdefault for NaN operands is zero. resultnegative Thedefault for value. integer in floating_pointoperands The are data formatvalues Invalid Operation exception. For positive numeric operands outsi numeric positive For exception. Operation Invalid input operand. Inthisresult case, the defaultis the rounded result. NaN values and numeric operands converting to an integer Restrictions: are possible. Data-dependent exceptions Operation: 31 Vector Floating-Point Round and Convert to Unsigned Integer to Unsigned Convert and Round Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 IFTQ.df 3RF 011011 ) and then rounded and 31 in data format half the size the half format data in 6 5 or 2 15 t representation, t i.e. from representation, 64-bit wd point to 16-bit Q15 fixed-point repre- Q15 point to fixed-point 16-bit 11 10 itecture Module, Revision 1.12 Revision Module, itecture 208 ws In thisthe case, result default is the roundedresult. , 64) , 32) , 32) , 64) ations ations are defined by the IEEE Standard for Floating-Point . The results are fixed-point values fixed-point are results The . ion is exact, i.e. the Inexact exception is signaled if the result q q df 16 15   are down-converted to a fixed-poin are down-converted 64i+63..64i 64i+63..64i 32i+31..32i 32i+31..32i wt The default result for NaN operands is zero. for NaN result The default and representation, or from 32-bit floating- ws to_q(ws[i]); right_half(wd)[i]  to_q(wt[i]) right_half(wd)[i]  to_q(ws[i]); r r 15556   22 21 20 Volume IV-j: The MIPS64® SIMD Arch floating-pointformatdata 1010 df wt 16i+15+WRLEN/2..16i+WRLEN/2 16i+15..16i 32i+31+WRLEN/2..32i+WRLEN/2 32i+31..32i ToFixPointFP((WR[wt] ToFixPointFP((WR[wt] ToFixPointFP((WR[ws] ToFixPointFP((WR[ws] -2008. The operat integer conversion . The . resultingis the Q15value or Q31 representation. TM 26 25 left_half(wd)[i] left_half(wd)[i] WR[wd] WR[wd] WR[wd] r  WR[wd] r  q  q  FTQ.df FTQ.df wd,ws,wt FTQ.H wd,ws,wt FTQ.W Vector Floating-Point Convert to Fixed-Point Floating-Point Convert Vector endfor endfor MSACSR 64 MSA . FTQ.H .. WRLEN/32-1 i in 0 for FTQ.W .. WRLEN/64-1 i in 0 for 011110 df sentation. The floating-point data inside the range fixed-point is first scaled up (multiplied by 2 floating-point to 32-bit Q31 fixed-point does not have the same numerical value as the as input the same numericaloperand. value does nothave in values operands are The NaN values signal the Invalid Operation exception. Numeric operands converting to fixed-pointoperands outside numeric positive For values exceptions. the Inexact outside and Overflow the signal the format the destination of range the range, the default result is the largest fixed-point value. The default result for numeric negative operands outside value. fixed-point the the range is smallest of Restrictions: are possible. Data-dependent exceptions Operation: The floating-point elements in vectors Format: Purpose: fromfloating-point. formatfix-point conversion Vector Description: converted converted to a 16-bit Register or 32 -bit integer based on the roThe scaling and floating-point to integer undingoper conversion mode bits RM in MSA Control and St atus Arithmetic 754 31 Vector Floating-Point Convert to Fixed-Point to Convert Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

IFTQ.df itecture Module, Revision 1.12 Revision Module, itecture 209 ed Exception,MSA Floating PointException. Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Exception, Disabl MSA Instruction Reserved Vector Floating-Point Convert to Fixed-Point to Convert Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 . wd . df 2RF FTRUNC_S.df I 011110 6 5 56 wd have the same numerical value as the to signed integer values. The rounding 11 10 itecture Module, Revision 1.12 Revision Module, itecture 210 , 32) , 64) 5 outside the range of the destination format signal the are not used. The result is to vector written is are not used. The result . The results are values in integer data format data integer in results are values . The df 32i+31..32i 64i+63..64i 1 MSACSR df ws 17 16 15 ed Exception,MSA Floating PointException. f f   ws are truncated, i.e. rounded toward zero, 110010001 Volume IV-j: The MIPS64® SIMD Arch  truncate_to_int_s(ws[i]) 32i+31..32i 64i+63..64i TruncToIntSignedFP(WR[ws] TruncToIntSignedFP(WR[ws] integer conversion. */ conversion. integer 26 25 wd[i] WR[wd] WR[wd] f  f  -2008, i.e. the Inexact tion excep is signaled if the result does not FTRUNC_S.df FTRUNC_S.df wd,ws FTRUNC_S.W wd,ws FTRUNC_S.D Vector Floating-Point Truncate and Convert Signedto Integer and Floating-PointConvert Truncate Vector TM endfor endfor and signed truncation floating-point defined Implementation /* 69 MSA FTRUNC_S.W .. WRLEN/32-1 i in 0 for FTRUNC_S.D .. WRLEN/64-1 i in 0 for n) TruncToIntSignedFP(tt, function TruncToIntSignedFP endfunction 011110 The floating-point operationto integer conversion is exact as defined by the IEEE Standard for Floating-Point Arith- metic 754 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved mode bits RM in MSA Controland Status Register The floating-point elements in Format: Purpose: to signedinteger. floating-point truncateandconvert Vector Description: Invalid Operation Invalid exception. For positive numeric operands outside the range, the default result is the largest signed The defaultinteger value. numericresult operandsfor negative outside the range is the smallest signed integer value. default resultThe for NaN operands is zero. in floating-point operandsdata formatThe are values input operand. Inthisresult case, the defaultis the rounded result. NaN values and numeric operands converting to an integer Restrictions: are possible. Data-dependent exceptions Operation: 31 Vector Floating-Point Truncate and Convert to Signed Integer Signed to Convert and Truncate Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 . wd . df 2RF FTRUNC_U.df I 011110 6 5 56 wd have the same numerical value as the to unsigned integer values. The rounding 11 10 , 32) , 64) itecture Module, Revision 1.12 Revision Module, itecture 211 de the range, the default result is the largest unsigned the largest is result the default range, the de 5 outside the range of the destination format signal the are not used. The result is to vector written is are not used. The result ds is zero. The default result for NaN operands is zero. operands is result for NaN default zero. The ds is 32i+31..32i 64i+63..64i . The results are values in integer data formatin integer The results. are values df 1 MSACSR df ws 17 16 15 ed Exception,MSA Floating PointException. f f are truncated, i.e. rounded toward zero,   ws 110010010 Volume IV-j: The MIPS64® SIMD Arch  truncate_to_int_u(ws[i]) 32i+31..32i 64i+63..64i TruncToIntUnsignedFP(WR[ws] TruncToIntUnsignedFP(WR[ws] integer conversion. */ conversion. integer 26 25 wd[i] WR[wd] WR[wd] f  f  -2008, i.e. the Inexact tion excep is signaled if the result does not FTRUNC_U.df FTRUNC_U.df wd,ws FTRUNC_U.W wd,ws FTRUNC_U.D Vector Floating-Point Truncate and Convert Unsignedto Integer and Floating-PointConvert Truncate Vector TM endfor endfor and unsigned truncation floating-point defined Implementation /* 69 MSA FTRUNC_U.W .. WRLEN/32-1 i in 0 for FTRUNC_U.D .. WRLEN/64-1 i in 0 for n) TruncToIntUnsignedFP(tt, function TruncToIntUnsignedFP endfunction 011110 Exceptions: Exceptions: Exception, MSA Disabl Instruction Reserved The floating-point elements in Format: Purpose: to unsigned integer. floating-point truncateandconvert Vector Description: The floating-point operationto integer conversion is exact as defined by the IEEE Standard for Floating-Point Arith- metic 754 integer value. The default value for negative numericoperan for negative The default value value. integer in floating_pointoperands The are data formatvalues Invalid Operation exception. For positive numeric operands outsi numeric positive For exception. Operation Invalid input operand. Inthisresult case, the defaultis the rounded result. NaN values and numeric operands converting to an integer mode bits RM in MSA Controland Status Register Restrictions: are possible. Data-dependent exceptions Operation: 31 Vector Floating-Point Truncate and Convert to Unsigned Integer Unsigned to Convert and Truncate Floating-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 producing a wt HADD_S.df I 3R 010101 , 8) , 16) , 32) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i todouble width elements . d even d elements in even vector ) wd 11 10 , WR[wt] , WR[wt] , WR[wt] n-1..0 itecture Module, Revision 1.12 Revision Module, itecture 212 . The results are values in integer The data format. results are values df ws || tt || n ) 16i+15..16i 32i+31..32i 64i+63..64i n-1 with the even elements with the even 16 15 are added to the sign-extende ) + ((tt ws signed(ws[2i+1]) + signed(wt[2i]) +  signed(ws[2i+1]) hadd_s(WR[ws] hadd_s(WR[ws] hadd_s(WR[ws] 2n-1..n    r data format halfthesize of 25556 Volume IV-j: The MIPS64® SIMD Arch || ts n ) 100 df wt 32i+31..32i 64i+63..64i 16i+15..16i 2n-1 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], ((ts WR[wd] WR[wd] WR[wd] HADD_S.df HADD_S.df wd,ws,wt HADD_S.H wd,ws,wt HADD_S.W wd,ws,wt HADD_S.D Vector Signed Horizontal Add Vector endfor endfor endfor t  return t return 63 MSA HADD_S.W .. WRLEN/32-1 i in 0 for HADD_S.D .. WRLEN/64-1 i in 0 for n) tt, hadd_s(ts, function endfunction hadd_s endfunction HADD_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Restrictions: No data-dependentare possible.exceptions Operation: The sign-extended odd elements in vector Format: Purpose: odd elements the add and pairwise sign extend Vector Description: result twice thesize ofthe input operands.The result is writtentovector in intege are values operands The 31 Vector Vector Add Horizontal Signed MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df 0 producing a wt HADD_U.df I 3R 010101 , 8) , 16) , 32) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i todouble width elements . d even d elements even in vector wd 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 213 . The results are values in integer The data format. results are values df ws ) 16i+15..16i 32i+31..32i 64i+63..64i with the even elements with the even n-1..0 16 15 are added to the zero-extende || tt ws n unsigned(ws[2i+1]) + unsigned(wt[2i]) +  unsigned(ws[2i+1]) hadd_u(WR[ws] hadd_u(WR[ws] hadd_u(WR[ws] ) + (0    r data format halfthesize of 25556 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n 101 df wt 32i+31..32i 64i+63..64i 16i+15..16i || ts || n 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], (0 WR[wd] WR[wd] WR[wd] HADD_U.df HADD_U.df wd,ws,wt HADD_U.H wd,ws,wt HADD_U.W wd,ws,wt HADD_U.D Vector Unsigned Horizontal Add Vector endfor endfor endfor t  t return 63 MSA HADD_U.W .. WRLEN/32-1 i in 0 for HADD_U.D .. WRLEN/64-1 i in 0 for n) tt, hadd_u(ts, function hadd_u endfunction HADD_U.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: result twice thesize ofthe input operands.The result is writtentovector in intege are values operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The zero-extended The zero-extended odd elements in vector Format: Purpose: odd elements the add and pairwise zero extend Vector Description: 31 Vector Unsigned Horizontal Add Horizontal Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

pro- MSA MSA MSA . df wt 0 HSUB_S.df I 3R 010101 . , 8) , 16) , 32) wd 6 5 ts to double width elements wd 16i+15..16i 32i+31..32i 64i+63..64i tended even elements in vector ) 11 10 , WR[wt] , WR[wt] , WR[wt] n-1..0 itecture Module, Revision 1.12 Revision Module, itecture 214 . The results are values in integer The data format. results are values df ws || tt || n ) 16i+15..16i 32i+31..32i 64i+63..64i n-1 ements from the odd elemen ements the from 16 15 t operands. The result is written to vector result is written The t operands. are subtracted from the sign-ex wt ) - ((tt signed(ws[2i+1]) - signed(wt[2i]) -  signed(ws[2i+1]) hsub_s(WR[ws] hsub_s(WR[ws] hsub_s(WR[ws] 2n-1..n    r data format halfthesize of 25556 Volume IV-j: The MIPS64® SIMD Arch || ts n ) 110 df wt 32i+31..32i 64i+63..64i 16i+15..16i 2n-1 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], ((ts WR[wd] WR[wd] WR[wd] HSUB_S.df HSUB_S.df wd,ws,wt HSUB_S.H wd,ws,wt HSUB_S.W wd,ws,wt HSUB_S.D Vector Signed Horizontal Subtract Vector endfor endfor endfor t  return t return 63 MSA HSUB_S.W .. WRLEN/32-1 i in 0 for HSUB_S.D .. WRLEN/64-1 i in 0 for n) tt, hsub_s(ts, function endfunction hsub_s endfunction HSUB_S.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Restrictions: No data-dependentare possible.exceptions Operation: The sign-extended odd elements in vector Format: Purpose: el and pairwisesign subtract extend theeven Vector Description: ducing a signed result twice the size of ducing of the inpu a signed the result size twice in intege are values operands The 31 Vector Signed Horizontal Subtract Horizontal Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA . df ws pro- 0 HSUB_U.df I 3R 010101 . , 8) , 16) , 32) wd 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i tended even tended elements even in vector dd elements to double width elements 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 215 . The results are values in integer The data format. results are values df ws ) 16i+15..16i 32i+31..32i 64i+63..64i n-1..0 16 15 t operands. The result is written to vector result is written The t operands. are subtracted from the zero-ex || tt wt n unsigned(ws[2i+1]) - unsigned(wt[2i]) -  unsigned(ws[2i+1]) act the even elements from the o elements from the even act the hsub_u(WR[ws] hsub_u(WR[ws] hsub_u(WR[ws] ) - (0    r data format halfthesize of 25556 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n 111 df wt 32i+31..32i 64i+63..64i 16i+15..16i || ts || n 26 25 23 22 21 20 (wd[2i+1], wd[2i]) wd[2i]) (wd[2i+1], (0 WR[wd] WR[wd] WR[wd] HSUB_U.df HSUB_U.df wd,ws,wt HSUB_U.H wd,ws,wt HSUB_U.W wd,ws,wt HSUB_U.D Vector Unsigned Horizontal Subtract Vector endfor endfor endfor t  t return 63 MSA HSUB_U.W .. WRLEN/32-1 i in 0 for HSUB_U.D .. WRLEN/64-1 i in 0 for n) tt, hsub_u(ts, function hsub_u endfunction HSUB_U.H .. WRLEN/16-1 i in 0 for 011110 Description: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved ducing a signed result twice the size of ducing of the inpu a signed the result size twice in intege are values operands The Restrictions: No data-dependentare possible.exceptions Operation: The zero-extended odd elements in vector Format: Purpose: and pairwise zero subtrextend Vector 31 Vector Unsigned Horizontal Subtract Horizontal Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ILVEV.df I 3R with one element 010100 ws 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 216 alternating one element from ws . wd df 16 15  ws[2i] 64j+63..64j 64j+63..64j 16j+15..16j 16j+15..16j 32j+31..32j 32j+31..32j 8j+7..8j 8j+7..8j are copied to vector WR[wt] WR[ws] WR[wt] WR[ws] WR[wt] WR[ws] wt       25556  WR[wt]  WR[ws] and Volume IV-j: The MIPS64® SIMD Arch ws wt[2i]; wd[2i+1] wd[2i+1]  wt[2i]; 110 df wt 32j+31..32j 32k+31..32k 64j+63..64j 64k+63..64k 8j+7..8j 8k+7..8k 16j+15..16j 16k+15..16k 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 26 25 23 22 21 20 wd[2i] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] ILVEV.df ILVEV.df wd,ws,wt ILVEV.B wd,ws,wt ILVEV.H wd,ws,wt ILVEV.W wd,ws,wt ILVEV.D Vector Interleave Even Interleave Vector endfor endfor endfor . 63 wt MSA ILVEV.D .. WRLEN/128-1 i in 0 for ILVEV.W .. WRLEN/64-1 i in 0 for ILVEV.H .. WRLEN/32-1 i in 0 for ILVEV.B .. WRLEN/16-1 i in 0 for 011110 from data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Vector even elements interleave. even Vector Description: Purpose: Even v elements in ectors Format: 31 Vector Interleave Even Interleave Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ILVEV.df I itecture Module, Revision 1.12 Revision Module, itecture 217 Volume IV-j: The MIPS64® SIMD Arch endfor Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Interleave Even Interleave Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ILVL.df I 3R with one element 010100 ws 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 218 alternating one element from  left_half(ws)[i] wd ws . df 16 15 64i+63+WRLEN/2..64i+WRLEN/2 64i+63+WRLEN/2..64i+WRLEN/2 16i+15+WRLEN/2..16i+WRLEN/2 16i+15+WRLEN/2..16i+WRLEN/2 32i+31+WRLEN/2..32i+WRLEN/2 32i+31+WRLEN/2..32i+WRLEN/2 8i+7+WRLEN/2..8i+WRLEN/2 8i+7+WRLEN/2..8i+WRLEN/2 are copied to vector are wt WR[wt] WR[ws] WR[wt] WR[ws] WR[wt] WR[ws] and       ws 25556  WR[wt]  WR[ws] Volume IV-j: The MIPS64® SIMD Arch left_half(wt)[i]; wd[2i+1] wd[2i+1]  left_half(wt)[i]; 100 df wt 32j+31..32j 32k+31..32k 64j+63..64j 64k+63..64k 8j+7..8j 8k+7..8k 16j+15..16j 16k+15..16k 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 26 25 23 22 21 20 wd[2i] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] ILVL.df ILVL.df wd,ws,wt ILVL.B wd,ws,wt ILVL.H wd,ws,wt ILVL.W wd,ws,wt ILVL.D Vector Interleave Left Interleave Vector endfor endfor endfor . 63 wt MSA ILVL.D .. WRLEN/128-1 i in 0 for ILVL.W .. WRLEN/64-1 i in 0 for ILVL.H .. WRLEN/32-1 i in 0 for ILVL.B .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: from data format in integer values are and results operands The Vector left interleave. elements Vector Description: Purpose: vectors in elements half left The Format: 31 Vector Interleave Left MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ILVL.df I itecture Module, Revision 1.12 Revision Module, itecture 219 Volume IV-j: The MIPS64® SIMD Arch endfor Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Interleave Left MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ILVOD.df I 3R with one element 010100 ws 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 220 alternating one element from ws . wd df  ws[2i+1] 16 15 64k+63..64k 64k+63..64k 16k+15..16k 16k+15..16k 32k+31..32k 32k+31..32k 8k+7..8k 8k+7..8k WR[wt] WR[ws] WR[wt] WR[ws] WR[wt] WR[ws] are copied to v ector wt       25556  WR[wt]  WR[ws] and Volume IV-j: The MIPS64® SIMD Arch ws wt[2i+1]; wd[2i+1] wd[2i+1]  wt[2i+1]; 111 df wt 32j+31..32j 32k+31..32k 64j+63..64j 64k+63..64k 8j+7..8j 8k+7..8k 16j+15..16j 16k+15..16k 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 26 25 23 22 21 20 wd[2i] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] ILVOD.df ILVOD.df wd,ws,wt ILVOD.B wd,ws,wt ILVOD.H wd,ws,wt ILVOD.W wd,ws,wt ILVOD.D Vector Interleave Odd Interleave Vector endfor endfor endfor . 63 wt MSA ILVOD.D .. WRLEN/128-1 i in 0 for ILVOD.W .. WRLEN/64-1 i in 0 for ILVOD.H .. WRLEN/32-1 i in 0 for ILVOD.B .. WRLEN/16-1 i in 0 for 011110 Vector odd elements interleave. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: Odd elements in v ectors Format: from data format in integer values are and results operands The 31 Vector Interleave Odd MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ILVOD.df I itecture Module, Revision 1.12 Revision Module, itecture 221 Volume IV-j: The MIPS64® SIMD Arch endfor Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Interleave Odd MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ILVR.df I with one ele- ws 3R 010100 6 5 wd alternating one element from 11 10 itecture Module, Revision 1.12 Revision Module, itecture 222 wd  right_half(ws)[i] ws . df 16 15 64i+63..64i 64i+63..64i 16i+15..16i 16i+15..16i 32i+31..32i 32i+31..32i are copied to vector 8i+7..8i 8i+7..8i wt and WR[wt] WR[ws] WR[wt] WR[ws] WR[wt] WR[ws] ws       25556  WR[wt]  WR[ws] Volume IV-j: The MIPS64® SIMD Arch right_half(wt)[i]; wd[2i+1] wd[2i+1]  right_half(wt)[i]; 101 df wt 32j+31..32j 32k+31..32k 64j+63..64j 64k+63..64k 8j+7..8j 8k+7..8k 16j+15..16j 16k+15..16k 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 2 * i 1 2 * i + 26 25 23 22 21 20 wd[2i] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] WR[wd] j  k  WR[wd] . ILVR.df ILVR.df wd,ws,wt ILVR.B wd,ws,wt ILVR.H wd,ws,wt ILVR.W wd,ws,wt ILVR.D Vector Interleave Right Interleave Vector wt endfor endfor endfor 63 MSA ILVR.D .. WRLEN/128-1 i in 0 for ILVR.W .. WRLEN/64-1 i in 0 for ILVR.H .. WRLEN/32-1 i in 0 for ILVR.B .. WRLEN/16-1 i in 0 for 011110 Vector right elements interleave. right elements Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The right half in vectorselements Format: ment from data format in integer values are and results operands The 31 Vector Interleave Right MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ILVR.df I itecture Module, Revision 1.12 Revision Module, itecture 223 Volume IV-j: The MIPS64® SIMD Arch endfor Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Interleave Right MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA 0 MIPS64 MSA INSERT.df I ELM 011001 6 5 wd are unchanged. If the source GPR is wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 224 rs . 16 15 df 15..0 31..0 63..0 df/n 7..0 value. All other elements in vector rs GPR[rs] GPR[rs] GPR[rs] 22 21    Volume IV-j: The MIPS64® SIMD Arch to GPR  GPR[rs] wd 0100  rs 16n+15..16n 32n+31..32n 64n+63..64n 8n+7..8n 26 25 wd[n] in vector n INSERT.df INSERT.df wd[n],rs INSERT.B wd[n],rs INSERT.H wd[n],rs INSERT.W wd[n],rs INSERT.D GPR Insert Element GPR 646556 MSA INSERT.W WR[wd] INSERT.D WR[wd] INSERT.B WR[wd] INSERT.H WR[wd] 011110 Restrictions: No data-dependentare possible.exceptions Operation: wider thanthedestination data format, the destination's elementswill be set to the leastbitsof significant the GPR. in data format values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved GPR value copied to vector element. copied to vector value GPR Description: Purpose: Set element Format: 31 GPR Insert Element GPRInsert MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 INSVE.df I ELM 011001 are unchanged. wd 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 225 ws . value. All other elements in vector value. 16 15 df ws 15..0 31..0 63..0 df/n 7..0 WR[ws] WR[ws] WR[ws] 22 21    Volume IV-j: The MIPS64® SIMD Arch to element 0 in vector element 0 in to  WR[ws] wd 0101  ws[0] 16n+15..16n 32n+31..32n 64n+63..64n 8n+7..8n 26 25 wd[n] in vector n INSVE.df INSVE.df wd[n],ws[0] INSVE.B wd[n],ws[0] INSVE.H wd[n],ws[0] INSVE.W wd[n],ws[0] INSVE.D Element Insert Element 646556 MSA INSVE.D WR[wd] INSVE.B WR[wd] INSVE.H WR[wd] INSVE.W WR[wd] 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in data format values are and results operands The Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Element value copied to vector element. vector copied to Element value Description: Purpose: element Set Format: 31 Element Insert Element Insert Element MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. df MSA MSA MSA MSA LD.df I 2 and the rs df 2 1 0 4 1000 MI10 6 5 rs and the 10-bit signed immediate 5 ruction is atomic at the element level wd mic operation issued in no particular 11 10 . itecture Module, Revision 1.12 Revision Module, itecture 226 df rs e in bytes and have to be multiple of e the size of in bytes and the have data to form the effective memory location address. memory to form the effective rs , a, WRLEN/64) , a, WRLEN/16) , 16 15 , a, WRLEN/8) , a, WRLEN/32) , a, ry location addressed by the base WRLEN-1..0 ts, i.e. each element load is an ato bitfield value dividing the byte offset sizebythe of the data format the byteoffset dividing value bitfield element aligned, the vector load inst WRLEN-1..0 s10 as format elements of data WRLEN-1..0 WRLEN-1..0 05 wd have noalignment restrictions. have s10 units is added to the base the added to unitsis df Volume IV-j: The MIPS64® SIMD Arch element's vector position. vector element's memory[rs + (s10 + i) * sizeof(wd[i])] + (s10 memory[rs  address a. */ address 26 25 wd[i] rs + s10 * 2 rs + s10 * 8 rs + s10 rs + s10 * 4 rs + s10 rs + s10 / 8 bytes at the ef fective memo     LD.df LD.df wd,s10(rs) LD.B wd,s10(rs) LD.H wd,s10(rs) LD.W wd,s10(rs) LD.D Vector Load are fetched and placed in . The. assembler determines the LoadHalfwordVector(WR[wd] LoadDoublewordVector(WR[wd] LoadWordVector(WR[wd] LoadByteVector(WR[wd] /* Implementation defined load ts vector of n bytes from virtual n bytes of ts vector load defined Implementation /* from n halfwords of ts vector load defined Implementation /* offset in data format in offset df 61 s10 MSA WRLEN s10 LD.H a LD.D a LD.W a LD.B a function LoadByteVector(ts, a, n) LoadByteVector(ts, function LoadByteVector endfunction a, n) LoadHalfwordVector(ts, function 011110 ent from base register plus offset memory address, load plus offset entelement-by-elem frombase register Vector Description: Purpose: The Format: with ordering among elemen no guaranteed Restrictions: Address-dependentexceptions are possible. Operation: offset The format effective memory location address address memory location effective memory location address If the effective is the order to with respect in the assembly language By ar convention, syntax all offsets 31 Vector Load MIPS® Architecture for Programmers Programmers MIPS® Architecture for

LD.df I Error Exceptions. itecture Module, Revision 1.12 Revision Module, itecture 227 or A Disabled Exception. DataA access Disabled Exception. TLB Address and Volume IV-j: The MIPS64® SIMD Arch virtual address a. */ address virtual address a. */ address a. */ address virtual /* Implementation defined load ts vector of n words from virtual n words of ts vector load defined Implementation /* from n doublewords of ts vector load defined Implementation /* endfunction LoadHalfwordVect endfunction function LoadWordVector(ts, a, n) LoadWordVector(ts, function LoadWordVector endfunction n) a, LoadDoublewordVector(ts, function LoadDoublewordVector endfunction Exceptions: Exceptions: Exception, MS Instruction Reserved Vector Load MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 LDI.df I I10 000111 6 5 the least significant 8 bits of s10 5 6 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 228 0 s10 elements. For byte elements, only wd t t t 9..0 9..0 9..0    oss all destination elements.  t Volume IV-j: The MIPS64® SIMD Arch || s10 || s10 || s10 54 6 22  s10 ) ) ) 110 df 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i 9 9 9 7..0 26 25 23 22 21 20 wd[i] s10 (s10 (s10 (s10 WR[wd] WR[wd] WR[wd] WR[wd]     LDI.df LDI.df wd,s10 LDI.B wd,s10 LDI.H wd,s10 LDI.W wd,s10 LDI.D Immediate Load for i in 0 .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for .. WRLEN/32-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for endfor endfor endfor endfor 6321 MSA LDI.B t LDI.W t LDI.D t LDI.H t 011110 will be used. Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate value replicated acr replicated Immediate value Description: Purpose: The signed immediate s10 is replicated in all Format: 31 Immediate Load Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA LSA I 0 UNPRE- LSA . 000101 rd 26 000 sa does not have to rs does be a not properly sign-extendedhave 108765 1110 itecture Module, Revision 1.12 Revision Module, itecture 229 rd 63..31 equal), then the result of the operation is operation of the result the thenequal),63..31 sign-extended and placed into GPR sign-extended 1615 ) + GPR[rt] s rt ed if MSA implementation is not present. is not MSA implementation ed if || 0 || (temp31..0) 2120 is shifted left, inserting zeros into the emptied bits; the 32-bit word result is added to added is result word 32-bit the bits;emptied the into zeros insertingleft, shifted is ccurs under any circumstances. under any ccurs rs (31-s)..0 Volume IV-j: The MIPS64® SIMD Arch (GPR[rs] << (sa + 1)) + GPR[rt] (sa + 1)) << (GPR[rs] and the 32-bit arithmetic result is rs  rt = 1 then = 1 MSAP 2625 GPR[rd] sa + 1 . LSA LSA rd,rs,rt,sa LSA Left Shift Add does not contain sign-extended 32-bit values (bits 32-bitvalues sign-extendedcontain not does GPR[rd]  sign_extend GPR[rd] SignalException(ReservedInstruction) UNPREDICTABLE s  temp  (GPR[rs] temp rt 6 5553 else endif if NotWordValue(GPR[rt]) then if NotWordValue(GPR[rt]) endif if Config3 000000 SPECIAL Exceptions: Exceptions: Exception. Instruction Reserved Notes: Programming nearlyUnlike all other word operations, the LSA input operand GPR desti- a 64-bit into sign-extended is always The result word result.32-bit sign-extended a valid produce to value word nation register. the 32-bit value inGPRthe 32-bit value The 32-bit word value in GPR GPR in value word 32-bit The Format: Purpose: number bits of and add theresult toanother word. byfixed a word left-shift a To Description: o exception No Overflow Integer Restrictions: signal is Exception Instruction A Reserved GPR If DICTABLE Operation: 31 Left Shift Add Shift Left MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MADD_Q.df I 3RF 011100 , 16) , 32) addedare to the fixed-point 6 5 ws 16i+15..16i 32i+31..32i . Truncation to fixed-point data df wd , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 230 . 16i+15..16i 32i+31..32i df ws . then then , WR[ws] , WR[ws] wd 16 15 fixed-pointby elements in vector wt n-b+1 n-b+1 1 0  

16i+15..16i 32i+31..32i

b-1 b-1 n-1..b-1 n-1..b-1 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 || 1 || 0 22 21 20 Volume IV-j: The MIPS64® SIMD Arch int in vectorelements || ts || tt n-b+1 n-b+1 n n 0101 df wt saturate(wd[i] + ws[i]  saturate(wd[i] * wt[i]) ) ) q_madd(WR[wd] q_madd(WR[wd] . The multiplication result is not saturated, i.e. exact (-1) * (-1) = 1 is added to the destination. 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 wd = 0 and tt = 1 and tt = n-1 n-1 26 25 wd[i] s * t (ts (tt return tt return return 1 return WR[wd] return 0 return WR[wd] MADD_Q.df MADD_Q.df wd,ws,wt MADD_Q.H wd,ws,wt MADD_Q.W Vector Fixed-Point Multiply and Add Vector is performed lastat the very stage, after saturation. endif else endif if tt p  if tt return p return endfor endfor s  t  df 64 MSA endfunction mulx_s endfunction b) n, sat_s(tt, function MADD_Q.H .. WRLEN/16-1 i in 0 for MADD_Q.W .. WRLEN/32-1 i in 0 for n) tt, mulx_s(ts, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: format valu are and results operands The The saturated fixed-point results are stored back to back are stored results saturated fixed-point The Internally, Internally, the multiplication and addition operate on data double the size of elements in vector The products of fixed-po Format: Purpose: multiply and add. fixed-point Vector Description: 31 Vector Fixed-Point Multiply and Add and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MADD_Q.df I itecture Module, Revision 1.12 Revision Module, itecture 231 2n-1..0 ) + p n-1 n) || 0 , n+1, n) , n+1, n-1..0 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n-1 || td n-1..0 n-1 sat_s(d mulx_s(ts, tt, n) tt, mulx_s(ts, (td d  p  d  return d return endfunction sat_s endfunction ts, tt, q_madd(td, function endfunction q_madd endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Fixed-Point Multiply and Add and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MADDR_Q.df I 3RF , 16) , 32) 011100 . Truncation to fixed-point df addedare to the fixed-point 6 5 16i+15..16i 32i+31..32i ws wd , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 232 . . wd 16i+15..16i 32i+31..32i df ws , WR[ws] , WR[ws] then then 16 15 fixed-pointby elements in vector wt n-b+1 n-b+1 0 1  

16i+15..16i 32i+31..32i

b-1 b-1 int results are stored back to int n-1..b-1 n-1..b-1 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 || 1 || 0 22 21 20 Volume IV-j: The MIPS64® SIMD Arch int in vectorelements || ts || tt n-b+1 n-b+1 n n 1101 df wt saturate(round(wd[i] + ws[i] * wt[i])) + ws[i] saturate(round(wd[i]  ) ) q_maddr(WR[wd] q_maddr(WR[wd] . The multiplication result is not saturated, i.e. exact (-1) * (-1) = 1 is added to the destination. 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 wd = 0 and tt = 1 and tt = n-1 n-1 26 25 wd[i] is performedlast at thestage,very after saturation. s * t (ts (tt return tt return return 0 return 1 return WR[wd] WR[wd] df MADDR_Q.df MADDR_Q.df wd,ws,wt MADDR_Q.H wd,ws,wt MADDR_Q.W Vector Fixed-Point MultiplyFixed-Point and Add Rounded Vector else if tt endif if tt p  return p return endfor endfor s  t  64 MSA endfunction mulx_s endfunction b) n, sat_s(tt, function MADDR_Q.H .. WRLEN/16-1 i in 0 for MADDR_Q.W .. WRLEN/32-1 i in 0 for n) tt, mulx_s(ts, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: Internally, the Internally, multiplication, addition, and rounding operate on data double the size of data format roundingThe done is by adding bitthat 1to theis going mostsignificant to be discarded at truncation. valu are and results operands The The rounded and saturated fixed-po saturated and rounded The elements in vector The products of fixed-po Format: Purpose: multiply add rounded.and fixed-point Vector Description: 31 Vector Fixed-Point Multiply Rounded Add and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MADDR_Q.df I itecture Module, Revision 1.12 Revision Module, itecture 233 2n-1..0 ) + p n-1 || 0 , n+1, n) , n+1, ) n-1..0 n-2 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n-1 || td n-1..0 maddr(td, ts, tt, n) ts, tt, maddr(td, n-1 sat_s(d mulx_s(ts, tt, n) tt, mulx_s(ts, (td || 0 d + (1 endif d  p  d  d  return d return endfunction sat_s endfunction function q_ function endfunction q_maddr endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Fixed-Point Multiply Rounded Add and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MADDV.df I 3R 010010 6 5 and added to the integer elements in elements integer and added to the wd 16i+15..16i 32i+31..32i 64i+63..64i ws 8i+7..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 234 * WR[wt] * WR[wt] * WR[wt] ws . * WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 + WR[ws] + WR[ws] + WR[ws] + + WR[ws]    are multiplied by integer elements in vector elementsin multipliedare integer by 25556 wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i  Volume IV-j: The MIPS64® SIMD Arch wd[i] + ws[i] * wt[i] wd[i] +  WR[wd] WR[wd] WR[wd] WR[wd] 001 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] MADDV.df MADDV.df wd,ws,wt MADDV.B wd,ws,wt MADDV.H wd,ws,wt MADDV.W wd,ws,wt MADDV.D Vector Multiply andAdd Vector . The most half significant of the multiplication resultis discarded. endfor endfor endfor endfor wd 63 MSA MADDV.D .. WRLEN/64-1 i in 0 for MADDV.W .. WRLEN/32-1 i in 0 for MADDV.B .. WRLEN/8-1 i in 0 for MADDV.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The vector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector multiply and add. Vector Description: Purpose: in vector elements integer The Format: 31 Vector Multiplyand Add MIPS® Architecture for Programmers Programmers MIPS® Architecture for

and MSA MSA MSA MSA ws 0 MAX_A.df I 3R 001110 , 16) , 32) , 64) 6 5 signed in elements vector wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 235 , WR[wt] , WR[wt] , WR[wt] ws . , WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i value, value, between corresponding 8i+7..8i 16 15 max_a(WR[ws] max_a(WR[ws] max_a(WR[ws] .    25556 sed on the absolute values. the sed absolute on wd  max_a(WR[ws] Volume IV-j: The MIPS64® SIMD Arch n-1...0 n-1..0 absolute_value(ws[i]) > absolute_value(wt[i])? ws[i]: wt[i] ws[i]: > absolute_value(wt[i])? absolute_value(ws[i])  110 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 1 then = n-1 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] -tt return return tt return WR[wd] WR[wd] MAX_A.df MAX_A.df wd,ws,wt MAX_A.B wd,ws,wt MAX_A.H wd,ws,wt MAX_A.W wd,ws,wt MAX_A.D Vector Maximum Based on AbsoluteValues Vector are written to vector are written to endfor endfor endfor if tt else endif endfor wt 63 MSA MAX_A.W .. WRLEN/32-1 i in 0 for MAX_A.D .. WRLEN/64-1 i in 0 for n) abs(tt, function endfunction abs endfunction MAX_A.B .. WRLEN/8-1 i in 0 for MAX_A.H .. WRLEN/16-1 i in 0 for 011110 Vector and vector maximum ba maximum and vector Vector Description: Purpose: The value with the largest magnitude, i.e. absolute Format: vector absolute value. has the largest representable value minimum negative The data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Maximum Based on Absolute Values Absolute on Based Maximum Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MAX_A.df I itecture Module, Revision 1.12 Revision Module, itecture 236 Volume IV-j: The MIPS64® SIMD Arch 0 || abs(tt, n) 0 || abs(tt, 0 || abs(ts, n) 0 || abs(ts, return ts return tt return t < s then t  s  if else endif function max_a(ts, tt, n) tt, max_a(ts, function endfunction max_a endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Maximum Based on Absolute Values Absolute on Based Maximum Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. MSA MSA MSA MSA wd 0 MAX_S.df I 3R 001110 , 16) , 32) , 64) are written to vector to are written 6 5 ws wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 237 , WR[wt] , WR[wt] , WR[wt] ws . , WR[wt] df and signed elements in vector and signed 16i+15..16i 32i+31..32i 64i+63..64i wt 8i+7..8i 16 15 max_s(WR[ws] max_s(WR[ws] max_s(WR[ws]    25556  max_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch max(ws[i],  max(ws[i], wt[i]) 010 df wt 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i || tt || ts || n-1 n-1 26 25 23 22 21 20 wd[i] tt ts WR[wd] WR[wd] return ts return tt return WR[wd] WR[wd] MAX_S.df MAX_S.df wd,ws,wt MAX_S.B wd,ws,wt MAX_S.H wd,ws,wt MAX_S.W wd,ws,wt MAX_S.D Vector Signed Maximum Vector endfor endfor endfor t  s  t < s then if else endif endfor 63 MSA MAX_S.W .. WRLEN/32-1 i in 0 for MAX_S.D .. WRLEN/64-1 i in 0 for n) tt, max_s(ts, function endfunction max_s endfunction MAX_S.B .. WRLEN/8-1 i in 0 for MAX_S.H .. WRLEN/16-1 i in 0 for 011110 Vector and vector signed maximum. signed and vector Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: signed elements in vector between Maximum values Format: data format in integer values are and results operands The 31 Vector Signed Maximum Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MAX_S.df I itecture Module, Revision 1.12 Revision Module, itecture 238 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Signed Maximum Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MAX_U.df I are written to 3R ws 001110 , 16) , 32) , 64) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 239 , WR[wt] , WR[wt] , WR[wt] and uns igned elements in v ector ws . , WR[wt] wt df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 max_u(WR[ws] max_u(WR[ws] max_u(WR[ws]    25556  max_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch max(ws[i],  max(ws[i], wt[i]) 011 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] 0 || tt 0 || ts WR[wd] WR[wd] ts return tt return WR[wd] WR[wd] MAX_U.df MAX_U.df wd,ws,wt MAX_U.B wd,ws,wt MAX_U.H wd,ws,wt MAX_U.W wd,ws,wt MAX_U.D Vector Unsigned Maximum Vector . endfor endfor endfor t  s  t < s then if else endif endfor wd 63 MSA MAX_U.W .. WRLEN/32-1 i in 0 for MAX_U.D .. WRLEN/64-1 i in 0 for n) tt, max_u(ts, function max_u endfunction MAX_U.B .. WRLEN/8-1 i in 0 for MAX_U.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The vector Vector and vector unsigned maximum.andvector Vector Description: Purpose: Maximum values between signed un in v elements ector Format: 31 Vector UnsignedMaximum MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MAX_U.df I itecture Module, Revision 1.12 Revision Module, itecture 240 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector UnsignedMaximum MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. MSA MSA MSA MSA wd 0 MAXI_S.df I I5 000110 are written to vector are s5 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 241 , t, 16) , t, 32) , t, 64) ws . , t, 8) df and the 5-bit signed immediate signed 5-bit the and 16i+15..16i 32i+31..32i 64i+63..64i ws 8i+7..8i 16 15 max_s(WR[ws] max_s(WR[ws] max_s(WR[ws]    4..0 4..0 4..0 25556 4..0  max_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || s5 || s5 || s5 || s5 max(ws[i],  max(ws[i], s5) 59 11 27 3 010 df s5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || tt || ts || ) ) ) ) 4 4 4 4 n-1 n-1 26 25 23 22 21 20 wd[i] (s5 ts (s5 (s5 (s5 tt WR[wd] WR[wd] ts return tt return WR[wd] WR[wd]     MAXI_S.df MAXI_S.df wd,ws,s5 MAXI_S.B wd,ws,s5 MAXI_S.H wd,ws,s5 MAXI_S.W wd,ws,s5 MAXI_S.D Immediate Signed Maximum Signed Immediate endfor s  t < s then if else for i in 0 .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for endfor endfor endfor t  63 MSA MAXI_S.H t MAXI_S.B t MAXI_S.W t MAXI_S.D t n) tt, max_s(ts, function 011110 Immediate signed and vector maximum. Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: signed elements in vector between Maximum values Format: data format in integer values are and results operands The 31 Immediate Signed Maximum Signed Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MAXI_S.df I itecture Module, Revision 1.12 Revision Module, itecture 242 Volume IV-j: The MIPS64® SIMD Arch endif endfunction max_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Immediate Signed Maximum Signed Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MAXI_U.df I are written to I5 u5 000110 6 5 signed immediate wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 243 , t, 16) , t, 32) , t, 64) ws and the 5-bit un . , t, 8) ws df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 max_u(WR[ws] max_u(WR[ws] max_u(WR[ws]    25556  max_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch 4..0 4..0 4..0 4..0 max(ws[i],  max(ws[i], u5) 011 df u5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || u5 || || u5 || u5 || || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 0 || tt 0 || ts WR[wd] WR[wd] WR[wd] WR[wd] return ts return     MAXI_U.df MAXI_U.df wd,ws,u5 MAXI_U.B wd,ws,u5 MAXI_U.H wd,ws,u5 MAXI_U.W wd,ws,u5 MAXI_U.D Immediate Unsigned Maximum . endfor for i in 0 .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/32-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for endfor endfor endfor t  s  t < s then if else wd 63 MSA MAXI_U.H t MAXI_U.B t MAXI_U.W t MAXI_U.D t n) tt, max_u(ts, function 011110 Immediate unsigned and vector maximum. Description: Purpose: Maximum values between unsigned elements in v ector Format: Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The vector 31 Immediate Unsigned Maximum Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MAXI_U.df I itecture Module, Revision 1.12 Revision Module, itecture 244 Volume IV-j: The MIPS64® SIMD Arch return tt return endif endfunction max_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Immediate Unsigned Maximum Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

and MSA MSA MSA MSA ws 0 MIN_A.df I 3R 001110 , 16) , 32) , 64) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 245 , WR[wt] , WR[wt] , WR[wt] ws . , WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i value, value, between corresponding signed elements in vector 8i+7..8i 16 15 min_a(WR[ws] min_a(WR[ws] min_a(WR[ws] .    25556 sed onsed the absolute values. wd  min_a(WR[ws] Volume IV-j: The MIPS64® SIMD Arch absolute_value(ws[i]) < absolute_value(wt[i])? ws[i]: wt[i] ws[i]: < absolute_value(wt[i])? absolute_value(ws[i])  111 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] 0 || abs(tt, n) 0 || abs(tt, n) 0 || abs(ts, WR[wd] WR[wd] ts return tt return WR[wd] WR[wd] MIN_A.df MIN_A.df wd,ws,wt MIN_A.B wd,ws,wt MIN_A.H wd,ws,wt MIN_A.W wd,ws,wt MIN_A.D Vector Minimum Based on Absolute Value Vector are written to vector are written to endfor endfor endfor t  s  t > s then if else endif endfor wt 63 MSA MIN_A.W .. WRLEN/32-1 i in 0 for MIN_A.D .. WRLEN/64-1 i in 0 for n) tt, min_a(ts, function MIN_A.B .. WRLEN/8-1 i in 0 for MIN_A.H .. WRLEN/16-1 i in 0 for 011110 Vector and vector minimum ba and vector Vector Description: Purpose: The value with the smallest magnitude, i.e. absolute Format: vector absolute value. has the largest representable value minimum negative The data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Minimum Based on Absolute Value on Absolute Based Minimum Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MIN_A.df I itecture Module, Revision 1.12 Revision Module, itecture 246 Volume IV-j: The MIPS64® SIMD Arch endfunction min_a endfunction Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Minimum Based on Absolute Value on Absolute Based Minimum Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. MSA MSA MSA MSA wd 0 MIN_S.df I 3R 001110 , 16) , 32) , 64) are written to vector are written 6 5 ws wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 247 , WR[wt] , WR[wt] , WR[wt] ws . , WR[wt] df in vector and signed elements 16i+15..16i 32i+31..32i 64i+63..64i wt 8i+7..8i 16 15 min_s(WR[ws] min_s(WR[ws] min_s(WR[ws]    25556  min_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch gned elements in vector gned min(ws[i],  min(ws[i], wt[i]) 100 df wt 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i || tt || ts || n-1 n-1 26 25 23 22 21 20 wd[i] tt ts WR[wd] WR[wd] return ts return tt return WR[wd] WR[wd] MIN_S.df MIN_S.df wd,ws,wt MIN_S.B wd,ws,wt MIN_S.H wd,ws,wt MIN_S.W wd,ws,wt MIN_S.D Vector Signed Minimum Vector endfor endfor endfor t  s  t > s then if else endif endfor 63 MSA MIN_S.W .. WRLEN/32-1 i in 0 for MIN_S.D .. WRLEN/64-1 i in 0 for n) tt, min_s(ts, function endfunction min_s endfunction MIN_S.B .. WRLEN/8-1 i in 0 for MIN_S.H .. WRLEN/16-1 i in 0 for 011110 Vector and vector signed minimum. signed and vector Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: betweenMinimum si values Format: data format in integer values are and results operands The 31 Vector Signed Minimum Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MIN_S.df I itecture Module, Revision 1.12 Revision Module, itecture 248 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Signed Minimum Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MIN_U.df I are w ritten to 3R ws 001110 , 16) , 32) , 64) 6 5 wd 16i+15..16i 32i+31..32i 64i+63..64i ..8i, 8) ..8i, 8i+7 11 10 itecture Module, Revision 1.12 Revision Module, itecture 249 , WR[wt] , WR[wt] , WR[wt] ws and unsigne d elements in v ector . , WR[wt] wt df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 min_u(WR[ws] min_u(WR[ws] min_u(WR[ws]    25556  min_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch min(ws[i],  min(ws[i], wt[i]) 101 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] 0 || tt 0 || ts WR[wd] WR[wd] ts return tt return WR[wd] WR[wd] MIN_U.df MIN_U.df wd,ws,wt MIN_U.B wd,ws,wt MIN_U.H wd,ws,wt MIN_U.W wd,ws,wt MIN_U.D Vector Unsigned Minimum Vector . endfor endfor endfor t  s  t > s then if else endif endfor wd 63 MSA MIN_U.W .. WRLEN/32-1 i in 0 for MIN_U.D .. WRLEN/64-1 i in 0 for n) tt, min_u(ts, function min_u endfunction MIN_U.B .. WRLEN/8-1 i in 0 for MIN_U.H .. WRLEN/16-1 i in 0 for 011110 Vector and vector unsigned minimum.andvector Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: Minimum values between unsi gned elements in vector data format in integer values are and results operands The Format: vector 31 Vector Unsigned Minimum Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MIN_U.df I itecture Module, Revision 1.12 Revision Module, itecture 250 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Unsigned Minimum Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. MSA MSA MSA MSA wd 0 MINI_S.df I I5 000110 are written to vector to are written 6 5 s5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 251 , t, 16) , t, 32) , t, 64) ws . , t, 8) df and the 5-bit signed immediate 16i+15..16i 32i+31..32i 64i+63..64i ws 8i+7..8i 16 15 min_s(WR[ws] min_s(WR[ws] min_s(WR[ws]    4..0 4..0 4..0 25556 4..0  min_s(WR[ws] Volume IV-j: The MIPS64® SIMD Arch gned elements in vector gned || s5 || s5 || s5 || s5 min(ws[i],  min(ws[i], s5) 59 11 27 3 100 df s5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || tt || ts || ) ) ) ) 4 4 4 4 n-1 n-1 26 25 23 22 21 20 wd[i] (s5 ts (s5 (s5 (s5 tt WR[wd] WR[wd] ts return tt return WR[wd] WR[wd]     MINI_S.df MINI_S.df wd,ws,s5 MINI_S.B wd,ws,s5 MINI_S.H wd,ws,s5 MINI_S.W wd,ws,s5 MINI_S.D Immediate SignedMinimum endfor s  t > s then if else for i in 0 .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for endfor endfor endfor t  63 MSA MINI_S.H t MINI_S.B t MINI_S.W t MINI_S.D t n) tt, min_s(ts, function 011110 The operands and results are values in integer data format in integer values are and results operands The Immediate signed and vector minimum. Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: betweenMinimum si values Format: 31 Immediate Signed Signed Minimum Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MINI_S.df I itecture Module, Revision 1.12 Revision Module, itecture 252 Volume IV-j: The MIPS64® SIMD Arch endif endfunction min_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Immediate Signed Signed Minimum Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MINI_U.df I are written to I5 u5 000110 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 253 , t, 16) , t, 32) , t, 64) ws and the 5-bit unsigned immediate . , t, 8) ws df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 min_u(WR[ws] min_u(WR[ws] min_u(WR[ws]    25556 signed signed elements in vector  min_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch 4..0 4..0 4..0 4..0 min(ws[i],  min(ws[i], u5) 101 df u5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || u5 || || u5 || u5 || || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 0 || tt 0 || ts WR[wd] WR[wd] WR[wd] WR[wd] return ts return     MINI_U.df MINI_U.df wd,ws,u5 MINI_U.B wd,ws,u5 MINI_U.H wd,ws,u5 MINI_U.W wd,ws,u5 MINI_U.D Immediate Unsigned Minimum . endfor for i in 0 .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/32-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for endfor endfor endfor t  s  t > s then if else wd 63 MSA MINI_U.H t MINI_U.B t MINI_U.W t MINI_U.D t n) tt, min_u(ts, function 011110 Immediate unsigned and vector minimum. Description: Purpose: Minimum values between un Format: Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The vector 31 Immediate Unsigned Minimum Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MINI_U.df I itecture Module, Revision 1.12 Revision Module, itecture 254 Volume IV-j: The MIPS64® SIMD Arch return tt return endif endfunction min_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Immediate Unsigned Minimum Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 UNPRE- MOD_S.df I 3R 010010 . The remainder of the wt 6 5 the result valueis zero, is wd wt 16i+15..16i 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 255 8i+7..8i ws . df mod WR[wt] mod WR[wt] mod WR[wt] mod WR[wt] mod 16 15 . divisorIf a element vector 16i+15..16i 32i+31..32i 64i+63..64i wd 8i+7..8i are divided by signed integer elements in vector ws WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch ws[i] mod wt[i] ws[i] mod  110 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] . MOD_S.df MOD_S.df wd,ws,wt MOD_S.B wd,ws,wt MOD_S.H wd,ws,wt MOD_S.W wd,ws,wt MOD_S.D Vector Signed Modulo Vector endfor endfor endfor endfor 63 MSA MOD_S.D .. WRLEN/64-1 i in 0 for MOD_S.W .. WRLEN/32-1 i in 0 for MOD_S.B .. WRLEN/8-1 i in 0 for MOD_S.H .. WRLEN/16-1 i in 0 for 011110 DICTABLE data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved same sign as the dividend is written to vector Vector signed remainder (modulo). Vector Description: Purpose: The signed integer elements in vector Format: 31 Vector Signed Modulo Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MOD_U.df I 3R . . The remainder is 010010 wt 6 5 wd UNPREDICTABLE 16i+15..16i 32i+31..32i 64i+63..64i 11 10 8i+7..8i itecture Module, Revision 1.12 Revision Module, itecture 256 ws . df umod WR[wt] umod WR[wt] umod WR[wt] umod umod WR[wt] umod 16 15 is zero, the result value is value result zero, the is wt 16i+15..16i 32i+31..32i 64i+63..64i are divided by unsigned integer elements in vector 8i+7..8i ws WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch ws[i] umod wt[i] ws[i] umod  111 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i . If a divisor element vector a divisor . If wd 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] MOD_U.df MOD_U.df wd,ws,wt MOD_U.B wd,ws,wt MOD_U.H wd,ws,wt MOD_U.W wd,ws,wt MOD_U.D Vector Unsigned Modulo Vector endfor endfor endfor endfor 63 MSA MOD_U.D .. WRLEN/64-1 i in 0 for MOD_U.W .. WRLEN/32-1 i in 0 for MOD_U.B .. WRLEN/8-1 i in 0 for MOD_U.H .. WRLEN/16-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector unsigned remainder(modulo). Vector Description: Purpose: The unsigned integer elements in vector Format: writtento vector 31 Vector Unsigned Modulo Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 MOVE.V I ELM 011001 6 5 56 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 257 ws 16 15 . wd 05 to vector ws 0010111110 Volume IV-j: The MIPS64® SIMD Arch  ws  WR[ws] 26 25 wd MOVE.V MOVE.V wd,ws MOVE.V Vector Move Vector 61 MSA WR[wd] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Copy all WRLEN bits in vector in all WRLEN bits Copy Format: Purpose: move. to vector Vector Description: The operand and result are bit vector values. result are bit vector and operand The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Move Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MSUB_Q.df I 3RF 011100 , 16) , 32) 6 5 are subtracted from the fixed- 16i+15..16i 32i+31..32i . Truncation to fixed-point data ws df wd , WR[wt] , WR[wt] exact (-1) * (-1) = 1 is subtracted from the 11 10 itecture Module, Revision 1.12 Revision Module, itecture 258 . . wd 16i+15..16i 32i+31..32i df ws then then , WR[ws] , WR[ws] 16 15 by fixed-point elements in vector n-b+1 n-b+1 wt 1 0  

16i+15..16i 32i+31..32i

b-1 b-1 n-1..b-1 n-1..b-1 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 point results are storedback to || 1 || 0 22 21 20 Volume IV-j: The MIPS64® SIMD Arch . The multiplication result is not saturated, i.e. on and subtraction operate on data double the size of || ts || tt wd n-b+1 n-b+1 n n 0110 df wt saturate(wd[i] - ws[i]  saturate(wd[i] * wt[i]) ) ) q_msub(WR[wd] q_msub(WR[wd] 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 = 0 and tt = 1 and tt = n-1 n-1 26 25 wd[i] s * t (ts (tt return tt return return 1 return WR[wd] return 0 return WR[wd] MSUB_Q.df MSUB_Q.df wd,ws,wt MSUB_Q.H wd,ws,wt MSUB_Q.W Vector Fixed-Point MultiplyFixed-Point and Subtract Vector is performed lastat the very stage, after saturation. endif else endif if tt p  if tt return p return endfor endfor s  t  df 64 MSA endfunction mulx_s endfunction b) n, sat_s(tt, function MSUB_Q.H .. WRLEN/16-1 i in 0 for MSUB_Q.W .. WRLEN/32-1 i in 0 for n) tt, mulx_s(ts, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: point elements in vector The product of fixed-point elements in vector Format: Purpose: multiply and subtract.fixed-point Vector Description: destination.The saturated fixed- Internally, the multiplicati format valu are and results operands The 31 Vector Fixed-Point Multiply and Subtract and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSUB_Q.df I itecture Module, Revision 1.12 Revision Module, itecture 259 2n-1..0 ) - p n-1 n) || 0 , n+1, n) , n+1, n-1..0 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n-1 || td n-1..0 n-1 sat_s(d mulx_s(ts, tt, n) tt, mulx_s(ts, (td d  p  d  return d return endfunction sat_s endfunction ts, tt, q_msub(td, function endfunction q_msub endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Fixed-Point Multiply and Subtract and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MSUBR_Q.df I 3RF , 16) , 32) 011100 . Truncation to fixed- df 6 5 16i+15..16i 32i+31..32i are subtracted from the fixed- ws wd . wd , WR[wt] , WR[wt] exact (-1) * (-1) = 1 is subtracted from the 11 10 itecture Module, Revision 1.12 Revision Module, itecture 260 . 16i+15..16i 32i+31..32i df ws , WR[ws] , WR[ws] then then 16 15 by fixed-point elements in vector by fixed-point n-b+1 n-b+1 wt d rounding operate on data double the size of 0 1  

16i+15..16i 32i+31..32i

b-1 b-1 n-1..b-1 n-1..b-1 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 || 1 || 0 22 21 20 Volume IV-j: The MIPS64® SIMD Arch . The multiplication result is not saturated, i.e. || ts || tt int elements in vector wd n-b+1 n-b+1 n n 1110 df wt saturate(round(wd[i] - ws[i] * wt[i])) - ws[i] saturate(round(wd[i]  ) ) q_msubr(WR[wd] q_msubr(WR[wd] 16i+15..16i 32i+31..32i is performed last stage,afteratthevery saturation. 2n-1..0 n-1 n-1 = 0 and tt = 1 and tt = df n-1 n-1 26 25 wd[i] s * t (ts (tt return tt return return 0 return 1 return WR[wd] WR[wd] MSUBR_Q.df MSUBR_Q.df wd,ws,wt MSUBR_Q.H wd,ws,wt MSUBR_Q.W Vector Fixed-Point MultiplyFixed-Point and Subtract Rounded Vector else if tt endif if tt p  return p return endfor endfor s  t  64 MSA endfunction mulx_s endfunction b) n, sat_s(tt, function MSUBR_Q.H .. WRLEN/16-1 i in 0 for MSUBR_Q.W .. WRLEN/32-1 i in 0 for n) tt, mulx_s(ts, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: The products of fixed-po Format: Purpose: multiply and subtractfixed-point rounded. Vector Description: point elements in vector point data format The roundingThe done is by adding bitthat 1to theis going mostsignificant to be discarded at truncation. valu are and results operands The destination. The rounded and saturated fixed-point are storedbackdestination.toresults The rounded and saturated fixed-point Internally, the multiplication, subtraction, an 31 Vector Fixed-Point Multiply and Subtract Rounded Subtract and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSUBR_Q.df I itecture Module, Revision 1.12 Revision Module, itecture 261 2n-1..0 ) - p n-1 || 0 , n+1, n) , n+1, ) n-1..0 n-2 Volume IV-j: The MIPS64® SIMD Arch 2n-1..n-1 || td n-1..0 msubr(td, ts, tt, n) ts, tt, msubr(td, n-1 sat_s(d mulx_s(ts, tt, n) tt, mulx_s(ts, (td || 0 d + (1 endif d  p  d  d  return d return endfunction sat_s endfunction function q_ function endfunction q_msubr endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Fixed-Point Multiply and Subtract Rounded Subtract and Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 MSUBV.df I 3R 010010 6 5 and subtracted from ele- the integer wd 16i+15..16i 32i+31..32i 64i+63..64i ws 8i+7..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 262 * WR[wt] * WR[wt] * WR[wt] ws . * WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 - WR[ws] - WR[ws] - WR[ws] - - WR[ws]    are multiplied by integer elements in vector 25556 wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i  Volume IV-j: The MIPS64® SIMD Arch wd[i] - ws[i] * wt[i] wd[i] -  WR[wd] WR[wd] WR[wd] WR[wd] 010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i . mostThe significant halfof multiplicationthe resultis discarded. wd 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] MSUBV.df MSUBV.df wd,ws,wt MSUBV.B wd,ws,wt MSUBV.H wd,ws,wt MSUBV.W wd,ws,wt MSUBV.D Vector Multiply andSubtract Vector endfor endfor endfor endfor 63 MSA MSUBV.D .. WRLEN/64-1 i in 0 for MSUBV.W .. WRLEN/32-1 i in 0 for MSUBV.B .. WRLEN/8-1 i in 0 for MSUBV.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector multiply and subtract. Vector Description: Purpose: in vector elements The integer Format: in vector ments 31 Vector Multiplyand Subtract MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MUL_Q.df I 3RF 011100 . The result is written to , 16) , 32) ws 6 5 wd 16i+15..16i 32i+31..32i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 263 , WR[wt] , WR[wt] . df ws then 16i+15..16i 32i+31..32i n-1 16 15 multiplied by f ixed-point elements in vector

wt q_mul(WR[ws] q_mul(WR[ws] 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 and tt = 1 || 0 tt = 1 and n-1 22 21 20 Volume IV-j: The MIPS64® SIMD Arch n-1 || ts || tt 2n-2..n-1 n n 0100 df wt  ws[i] * wt[i] ) ) 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 mulx_s(ts, tt, n) tt, mulx_s(ts, 26 25 wd[i] (ts s * t (tt WR[wd] p  p return WR[wd] return 0 || 1 0 || return MUL_Q.df MUL_Q.df wd,ws,wt MUL_Q.H wd,ws,wt MUL_Q.W Vector Fixed-Point Multiply Vector . endfor endfor s  else endif p  0 ts = 1 || if return p return t  wd 64 MSA MUL_Q.W .. WRLEN/32-1 i in 0 for tt, n) mulx_s(ts, function endfunction q_mul endfunction endfunction mulx_s endfunction n) tt, q_mul(ts, function MUL_Q.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: Fixed-point multiplication for 16-bit Q15 and 32-bit Q31 is a regular signed multiplication followed by one bit shift left withsaturation. Onlythe mosthalf significant of the result is preserved. valu are and results operands The vector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The fixed-point elements in vector Format: Purpose: multiplication. fixed-point Vector Description: 31 Vector Fixed-Point Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA 0 MULR_Q.df I 3RF 011100 , 16) , 32) 6 5 . The rounded result is written ws wd 16i+15..16i 32i+31..32i 11 10 , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 264 . df ws then 16i+15..16i 32i+31..32i n-1 16 15

) multiplied by fixed-point elements in vector q_mulr(WR[ws] q_mulr(WR[ws] wt n-2 15556 es in fixed-point data format in es fixed-point   n-1..0 n-1..0 and tt = 1 || 0 tt = 1 and n-1 22 21 20 Volume IV-j: The MIPS64® SIMD Arch n-1 || ts || tt 2n-2..n-1 n n 1100 df wt round(ws[i] * wt[i]) round(ws[i]  ) ) 16i+15..16i 32i+31..32i 2n-1..0 n-1 n-1 mulx_s(ts, tt, n) tt, mulx_s(ts, || 0 p + (1 26 25 wd[i] (ts s * t (tt return p return WR[wd] p  p  WR[wd] return 0 || 1 0 || return . MULR_Q.df MULR_Q.df wd,ws,wt MULR_Q.H wd,ws,wt MULR_Q.W Vector Fixed-Point Multiply Rounded Vector wd endfor endfor s  else p  0 ts = 1 || if return p return t  64 MSA MULR_Q.W .. WRLEN/32-1 i in 0 for n) tt, mulx_s(ts, function endfunction q_mulr endfunction endfunction mulx_s endfunction n) tt, q_mulr(ts, function MULR_Q.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The fixed-point The elements in fixed-point vector Format: Purpose: rounded. multiply fixed-point Vector Description: to vector Fixed-point multiplication for 16-bit Q15 and 32-bit Q31 is a regular signed multiplication followed by one bit shift left withsaturation. Onlythe mosthalf significant of the result is preserved. The rounding is done by adding 1 to the mostbit significant that is going to be discarded prior to shifting left the full multiplicationresult. valu are and results operands The 31 Vector Fixed-Point Multiply Rounded Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MULR_Q.df I itecture Module, Revision 1.12 Revision Module, itecture 265 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Fixed-Point Multiply Rounded Multiply Fixed-Point Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for . wd MSA MSA MSA MSA 0 MULV.df I 3R 010010 6 5 . . The result is written to vector wd ws 16i+15..16i 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 266 8i+7..8i ws . df * WR[wt] * WR[wt] * WR[wt] * WR[wt] 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i WR[ws] WR[ws] WR[ws] are multiplied by integer elements in vector    25556 wt  WR[ws] Volume IV-j: The MIPS64® SIMD Arch  ws[i] * wt[i] 000 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] MULV.df MULV.df wd,ws,wt MULV.B wd,ws,wt MULV.H wd,ws,wt MULV.W wd,ws,wt MULV.D Vector Multiply Vector endfor endfor endfor endfor 63 MSA MULV.D .. WRLEN/64-1 i in 0 for MULV.W .. WRLEN/32-1 i in 0 for MULV.B .. WRLEN/8-1 i in 0 for MULV.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector multiply. Vector Description: Purpose: The integer elements in vector Format: The most significant mosthalfThe of significant the multiplication resultis discarded. data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Multiply MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 NLOC.df I 2R 011110 . wd 6 5 , 16) , 32) , 64) wd , 8) 16i+15..16i 32i+31..32i 64i+63..64i 11 10 8i+7..8i itecture Module, Revision 1.12 Revision Module, itecture 267 556 . df is stored to the elements in vector the elements in to is stored ws df ws 18 17 16 15 leading_one_count(WR[ws] leading_one_count(WR[ws] leading_one_count(WR[ws]     leading_one_count(WR[ws] Volume IV-j: The MIPS64® SIMD Arch 11000010  leading_one_count(ws[i]) 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i = 0 then = 0 i return z return z  z + 1 26 25 wd[i] 0 else endif WR[wd] WR[wd] WR[wd] if tt WR[wd] NLOC.df NLOC.df wd,ws NLOC.B wd,ws NLOC.H wd,ws NLOC.W wd,ws NLOC.D Vector Leading Ones Count Vector endfor endfor endfor z  i in n-1..0 for endfor 682 MSA endfunction leading_one_count endfunction NLOC.W .. WRLEN/32-1 i in 0 for NLOC.D .. WRLEN/64-1 i in 0 for n) leading_one_count(tt, function NLOC.B .. WRLEN/8-1 i in 0 for NLOC.H .. WRLEN/16-1 i in 0 for 011110 Vector element count of leading element bits set to 1. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: elements in vector leading ones for number of The Format: data format in integer values are and results operands The 31 Vector Leading Ones Count Ones Leading Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

NLOC.df I itecture Module, Revision 1.12 Revision Module, itecture 268 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Leading Ones Count Ones Leading Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 INLZC.df 2R 011110 . wd 6 5 , 16) , 32) , 64) wd , 8) 16i+15..16i 32i+31..32i 64i+63..64i 11 10 8i+7..8i itecture Module, Revision 1.12 Revision Module, itecture 269 556 . df is stored to the elements in the vector stored elements to is ws df ws 18 17 16 15 leading_zero_count(WR[ws] leading_zero_count(WR[ws] leading_zero_count(WR[ws]     leading_zero_count(WR[ws] Volume IV-j: The MIPS64® SIMD Arch 11000011  leading_zero_count(ws[i]) 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i = 1 then = 1 i return z return z  z + 1 26 25 wd[i] 0 else endif WR[wd] WR[wd] WR[wd] if tt WR[wd] NLZC.df NLZC.df wd,ws NLZC.B wd,ws NLZC.H wd,ws NLZC.W wd,ws NLZC.D Vector Leading Zeros Count Vector endfor endfor endfor z  i in n-1..0 for endfor 682 MSA endfunction leading_zero_count endfunction NLZC.W .. WRLEN/32-1 i in 0 for NLZC.D .. WRLEN/64-1 i in 0 for n) leading_zero_count(tt, function NLZC.B .. WRLEN/8-1 i in 0 for NLZC.H .. WRLEN/16-1 i in 0 for 011110 Vector element count of leading element bits set to 0. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: vector elements in for leading zeroes number of The Format: data format in integer values are and results operands The 31 Vector Leading Zeros Count Zeros Leading Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

INLZC.df itecture Module, Revision 1.12 Revision Module, itecture 270 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Leading Zeros Count Zeros Leading Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 NOR.V I VEC 011110 6 5 wd in a bi twise logical NOR operation. The wt 11 10 itecture Module, Revision 1.12 Revision Module, itecture 271 ws 16 15 wt 21 20 Volume IV-j: The MIPS64® SIMD Arch . wd 00010 is combined with the corresponding bit of vector ws ws NOR wt  ws 26 25 wd NOR.V NOR.V wd,ws,wt NOR.V Vector Logical Negated Or Logical Negated Vector 6 5555 6 MSA WR[ws] nor WR[wt] nor  WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each bit of vector Format: Purpose: or. negated logical vector by Vector Description: The operands and results are bit vector values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector 31 Vector Logical Negated Or Negated Logical Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 NORI.B I I8 000000 6 5 wd in a bitwis e logical NOR operation. The i8 11 10 itecture Module, Revision 1.12 Revision Module, itecture 272 5 5 6 ws 7..0 16 15 ..8i nor i8 ..8i nor i8 8i+7 is combined with the 8-bit immediate ws Volume IV-j: The MIPS64® SIMD Arch .  WR[ws] wd ws[i] NOR i8 NOR ws[i]  10 8i+7..8i 26 25 24 23 wd[i] NORI.B NORI.B wd,ws,i8 NORI.B Immediate Or LogicalNegated WR[wd] 62 8 MSA endfor for i in 0 .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer byte data format. in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each element byte of vector Format: Purpose: or. Immediatelogical negated byvector Description: 31 Immediate Logical Negated Or Negated Logical Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 OR.V I VEC 011110 6 5 wd in a bit wise logical OR operation. The wt 11 10 itecture Module, Revision 1.12 Revision Module, itecture 273 ws 16 15 rresponding bit of vector wt 21 20 Volume IV-j: The MIPS64® SIMD Arch . wd 00001 is combined with the co ws ws OR wt  ws 26 25 wd OR.V OR.V wd,ws,wt OR.V Vector Logical Logical Or Vector 6 5555 6 MSA WR[ws] or WR[wt] or  WR[ws] WR[wd] 011110 The operands and results are bit vector values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each bit of vector Format: Purpose: logical or. by vector Vector Description: 31 Vector Logical Or Logical Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 ORI.B I I8 000000 The result operation. 6 5 wd i8 logical OR in a bitwise 11 10 itecture Module, Revision 1.12 Revision Module, itecture 274 5 5 6 ws 7..0 16 15 ..8i or i8 ..8i or i8 8i+7 is combined with the 8-bit immediate Volume IV-j: The MIPS64® SIMD Arch ws  WR[ws] ws[i] OR i8 OR ws[i]  . 01 wd 8i+7..8i 26 25 24 23 wd[i] ORI.B ORI.B wd,ws,i8 ORI.B Immediate Logical Or Immediate Logical WR[wd] 62 8 MSA endfor for i in 0 .. WRLEN/8-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each element of vector byte Format: Purpose: logical or. vector Immediate by Description: is written tovector The operands and results are values in integer byte data format. in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Logical Or Logical Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 PCKEV.df I 3R are copied to the 010100 wt 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 275 and even elements in vector wd 64j+63..64j 16j+15..16j 32j+31..32j ws 8j+7..8j . df WR[ws] WR[ws] WR[ws] 16 15     WR[ws] 64j+63..64j 16j+15..16j 32j+31..32j 8j+7..8j ws[2i]; right_half(wd)[i]  wt[2i] right_half(wd)[i]  ws[2i]; WR[wt] WR[wt] WR[wt]    25556  WR[wt] Volume IV-j: The MIPS64® SIMD Arch are copied to the left half of vector ws . 010 df wt 32i+31+WRLEN/2..32j+WRLEN/2 32i+31..32i 64i+63+WRLEN/2..64j+WRLEN/2 64i+63..64i 8i+7+WRLEN/2..8i+WRLEN/2 8i+7..8i 16i+15+WRLEN/2..16j+WRLEN/2 16i+15..16i wd 2 * i 2 * i 2 * i 2 * i 26 25 23 22 21 20 left_half(wd)[i] left_half(wd)[i] WR[wd] WR[wd] j  WR[wd] WR[wd] j  WR[wd] WR[wd] j  WR[wd] j  WR[wd] PCKEV.df PCKEV.df wd,ws,wt PCKEV.B wd,ws,wt PCKEV.H wd,ws,wt PCKEV.W wd,ws,wt PCKEV.D Vector Pack Even Pack Vector endfor endfor endfor endfor 63 MSA PCKEV.D .. WRLEN/128-1 i in 0 for PCKEV.W .. WRLEN/64-1 i in 0 for PCKEV.B .. WRLEN/16-1 i in 0 for PCKEV.H .. WRLEN/32-1 i in 0 for 011110 Vector even elements copy. even Vector Description: Purpose: Even elements in vector Format: right halfvector of The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Pack Even Pack Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

PCKEV.df I itecture Module, Revision 1.12 Revision Module, itecture 276 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Pack Even Pack Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 PCKOD.df I 3R 010100 are copied to the right to are copied wt 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 277 and odd elements in vector odd and 64k+63..64k 16k+15..16k 32k+31..32k ws wd 8k+7..8k . df WR[ws] WR[ws] WR[ws] 16 15     WR[ws] 64k+63..64k 16k+15..16k 32k+31..32k 8k+7..8k ws[2i+1]; right_half(wd)[i]  wt[2i+1] right_half(wd)[i]  ws[2i+1]; WR[wt] WR[wt] WR[wt]    25556  WR[wt] Volume IV-j: The MIPS64® SIMD Arch are copied to the left half of vector left to the copied are ws 011 df wt 32i+31+WRLEN/2..32i+WRLEN/2 32i+31..32i 64i+63+WRLEN/2..64i+WRLEN/2 64i+63..64i 8i+7+WRLEN/2..8i+WRLEN/2 8i+7..8i 16i+15+WRLEN/2..16i+WRLEN/2 16i+15..16i 2 * i + 1 2 * i + 1 2 * i + 1 2 * i + 1 2 * i + . 26 25 23 22 21 20 left_half(wd)[i] left_half(wd)[i] wd WR[wd] WR[wd] k  WR[wd] WR[wd] k  WR[wd] WR[wd] k  WR[wd] k  WR[wd] PCKOD.df PCKOD.df wd,ws,wt PCKOD.B wd,ws,wt PCKOD.H wd,ws,wt PCKOD.W wd,ws,wt PCKOD.D Vector Pack Odd Pack Vector endfor endfor endfor endfor 63 MSA PCKOD.D .. WRLEN/128-1 i in 0 for PCKOD.W .. WRLEN/64-1 i in 0 for PCKOD.B .. WRLEN/16-1 i in 0 for PCKOD.H .. WRLEN/32-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: half of half vector Vector odd copy. elements Vector Description: Purpose: vector in elements Odd Format: 31 Vector Pack Odd Pack Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

PCKOD.df I itecture Module, Revision 1.12 Revision Module, itecture 278 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Pack Odd Pack Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 PCNT.df I 2R 011110 . 6 5 wd , 16) , 32) , 64) wd , 8) 16i+15..16i 32i+31..32i 64i+63..64i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 279 8i+7..8i 556 . df is stored to the elements in vector elements stored to the is ws df ws 18 17 16 15 population_count(WR[ws] population_count(WR[ws] population_count(WR[ws]     population_count(WR[ws] Volume IV-j: The MIPS64® SIMD Arch 11000001  population_count(ws[i]) 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i = 1 then = 1 i z  z + 1 26 25 wd[i] 0 endif WR[wd] WR[wd] WR[wd] if tt WR[wd] PCNT.df PCNT.df wd,ws PCNT.B wd,ws PCNT.H wd,ws PCNT.W wd,ws PCNT.D Vector Population Count Vector endfor endfor endfor z  i in n-1..0 for endfor 682 MSA endfunction population_count endfunction PCNT.B .. WRLEN/8-1 i in 0 for PCNT.W .. WRLEN/32-1 i in 0 for PCNT.D .. WRLEN/64-1 i in 0 for n) population_count(tt, function PCNT.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector element count of all bits 1. set to all count bits of element Vector Description: Purpose: bits number set to 1 for elements of in vector The Format: The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Population Count Population Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SAT_S.df I BIT 001010 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 280 , 16, m+1) , 32, m+1) , 64, m+1) bits without changing the data width. The result is result The the data width. changing bits without m+1 ws . , 8, m+1) df 16i+15..16i 32i+31..32i 64i+63..64i then then 8i+7..8i 16 15 n-b+1 n-b+1 1 0  

df/m sat_s(WR[ws] sat_s(WR[ws] sat_s(WR[ws] b-1 b-1 n-1..b-1 n-1..b-1     sat_s(WR[ws] || 1 || 0 saturationof signedvalues. are saturated to signed values of values to signed are saturated Volume IV-j: The MIPS64® SIMD Arch ws n-b+1 n-b+1 saturate_signed(ws[i], m+1)  saturate_signed(ws[i], 000 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i . = 0 and tt = 1 and tt = wd n-1 n-1 26 25 23 22 wd[i] WR[wd] return 1 return tt return WR[wd] return 0 return WR[wd] WR[wd] SAT_S.df SAT_S.df wd,ws,m SAT_S.B wd,ws,m SAT_S.H wd,ws,m SAT_S.W wd,ws,m SAT_S.D Immediate SignedSaturate endfor endif if tt else endif endfor endfor endfor if tt 63 7 5 5 6 MSA SAT_S.H .. WRLEN/16-1 i in 0 for endfunction sat_s endfunction SAT_S.B .. WRLEN/8-1 i in 0 for SAT_S.W .. WRLEN/32-1 i in 0 for SAT_S.D .. WRLEN/64-1 i in 0 for b) n, sat_s(tt, function 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: writtento vector Description: Immediate selected bit width Purpose: vector in elements Signed Format: 31 Immediate Signed Signed Saturate Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SAT_S.df I itecture Module, Revision 1.12 Revision Module, itecture 281 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Signed Signed Saturate Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SAT_U.df I BIT 001010 6 5 wd 11 10 m+1 bits without changing the data width. The itecture Module, Revision 1.12 Revision Module, itecture 282 , 16, m+1) , 32, m+1) , 64, m+1) ws . , 8, m+1) df 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i 16 15 df/m sat_u(WR[ws] sat_u(WR[ws] sat_u(WR[ws] b    then  sat_u(WR[ws] ws are saturated to unsigned values of Volume IV-j: The MIPS64® SIMD Arch . || 1 n-b wd 0 n-b saturate_unsigned(ws[i], m+1)  saturate_unsigned(ws[i],  001 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n-1..b 26 25 23 22 wd[i] return 0 return WR[wd] return tt return WR[wd] WR[wd] WR[wd] SAT_U.df SAT_U.df wd,ws,m SAT_U.B wd,ws,m SAT_U.H wd,ws,m SAT_U.W wd,ws,m SAT_U.D Immediate Unsigned Saturate endfor else endif endfor endfor endfor if tt 63 7 5 5 6 MSA SAT_U.H .. WRLEN/16-1 i in 0 for endfunction sat_u endfunction SAT_U.B .. WRLEN/8-1 i in 0 for SAT_U.W .. WRLEN/32-1 i in 0 for SAT_U.D .. WRLEN/64-1 i in 0 for b) n, sat_u(tt, function 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The result writtenis tovector Immediate bitselected width saturationof unsigned values. Description: Purpose: Unsigned elements in vector Format: 31 Immediate Unsigned Saturate Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SAT_U.df I itecture Module, Revision 1.12 Revision Module, itecture 283 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Unsigned Saturate Unsigned Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA 0 ISHF.df I8 000010 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 284 5 5 6 ws data format. All sets are shuf fled in the same w ay: the df , 2, where i is 0, 3. 1, wd 16 15 16k+15..16k 32k+31..32k 8k+7..8k i8 WR[ws] WR[ws]   2j+1..2j 2j+1..2j 2j+1..2j  WR[ws] Volume IV-j: The MIPS64® SIMD Arch based 4 element setcopy shuffle_set(ws, i, i8) i, shuffle_set(ws,  is copied over the element i in over copied is 32i+31..32i 8i+7..8i 16i+15..16i df ws i % 4 i8 i - j + i % 4 i8 i - j + i % 4 i8 i - j + in 26 25 24 23 wd[i] WR[wd] j  k  WR[wd] j  k  WR[wd] j  k  SHF.df SHF.df wd,ws,i8 SHF.B wd,ws,i8 SHF.H wd,ws,i8 SHF.W Immediate Set Shuffle Elements Shuffle Immediate Set 2i+1..2i endfor endfor endfor i8 62 8 MSA SHF.W .. WRLEN/32-1 i in 0 for SHF.B .. WRLEN/8-1 i in 0 for SHF.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved The set shuffle instruction works on 4-element sets in Format: Purpose: value- Immediate control element Description: The operands and results are values in byte data format. in byte values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Set Shuffle Elements Set Shuffle Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ws wd MSA MSA MSA MSA indi- df and 0 ISLD.df wd 3R 010100 6 5 wd . as byte elements, with data format as byte elements, with wd es) stored row-wise, with as many rows as ws 11 10 itecture Module, Revision 1.12 Revision Module, itecture 285 in the order they appear in the syntax, i.e. first they order in the and wd ws ) and then slide it to the left over the concatenation of concatenation the over to the left it slide and then ws t+s-1..t 16 15 of vector registers vector of 8j+7..8j . The resultis written to vector v rt  || WR[ws] are concatenated horizontally are concatenated 8j+7..8j ws 25556  v . and t+s-1..t Volume IV-j: The MIPS64® SIMD Arch df slide left source array. left source slide contain 2-dimensional byte arrays (rectangl wd t+8i+7..t+8i ws slide(wd, ws, rt) slide(wd,  manipulate the content manipulate 000 df rt 8i+7..8i and i + n (WR[wd] j  i + n WR[wd] wd 26 25 23 22 21 20 wd[i] destination vector. elements in the GPR[rt] % (WRLEN/32) GPR[rt] WRLEN/4 GPR[rt] % (WRLEN/8) GPR[rt] || WR[ws] WR[wd] % (WRLEN/16) GPR[rt] WRLEN/2 for i in 0 .. s/8-1 i in 0 .. for endfor j  WR[wd] s * k t = v  df . Place a new destination rectangle over rectangle destination new a Place .    SLD.df SLD.df wd,ws[rt] SLD.B wd,ws[rt] SLD.H wd,ws[rt] SLD.W wd,ws[rt] SLD.D GPRColumns Slide ws endfor s  endfor v  .. WRLEN/8-1 i in 0 for s  1 k in 0, for value is interpreted modulo the number of columns in destination rectangle, or equivalently, the number of 63 rt MSA SLD.W n SLD.H n SLD.B n 011110 data format Restrictions: No data-dependentare possible.exceptions Operation: GPR and then and cating the 2-dimensional byte array layout. rectangles source two The in GPR by thenumber columns given of GPR number of columns columns to number of GPR Purpose: Description: Vector registers Format: bytes indatainteger format The slide instructionsslide The 31 GPR Slide Columns MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ISLD.df itecture Module, Revision 1.12 Revision Module, itecture 286 ) ) t+s-1..t t+s-1..t 8j+7..8j 8j+7..8j v v   || WR[ws] || WR[ws] t+s-1..t t+s-1..t Volume IV-j: The MIPS64® SIMD Arch t+8i+7..t+8i t+8i+7..t+8i (WR[wd] (WR[wd] j  i + n WR[wd] j  i + n WR[wd] GPR[rt] % (WRLEN/64) GPR[rt] WRLEN/8 t = s * k t = for i in 0 .. s/8-1 .. i in 0 for s/8-1 i in 0 .. for v  endfor endfor s * k t = v   for k in 0, .., 3 k in 0, for endfor endfor s  .., 7 k in 0, for SLD.D n Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved GPR Slide Columns MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ws wd MSA MSA MSA MSA indi- df and 0 wd SLDI.df I ELM 011001 6 5 wd as byte elements, with data format as byte elements, with es) stored row-wise, with as many rows as ws 11 10 itecture Module, Revision 1.12 Revision Module, itecture 287 in the order they appear in the syntax, i.e. first they order in the and wd ws ) ) and then slide it to the left over the concatenation of concatenation the over to the left it slide and then ws t+s-1..t t+s-1..t 16 15 . of vector registers vector of 8j+7..8j wd v df/n  || WR[ws] || WR[ws] are concatenated horizontally are concatenated 8j+7..8j ws to slide left source array. slide left to  v . 22 21 and t+s-1..t t+s-1..t Volume IV-j: The MIPS64® SIMD Arch df contain 2-dimensional byte arrays (rectangl wd t+8i+7..t+8i ws 0000 slide(wd, ws, n) slide(wd,  manipulate the content manipulate 8i+7..8i and (WR[wd] (WR[wd] i + n j  i + n j  i + n WR[wd] wd 26 25 wd[i] WRLEN/4 WRLEN/2 WR[wd] || WR[ws] WR[wd] for i in 0 .. s/8-1 i in 0 .. for for i in 0 .. s/8-1 i in 0 .. for endfor s * k t = v  t = s * k t = v  j  WR[wd] . Place a new destination rectangle over rectangle destination new a Place .    SLDI.df SLDI.df wd,ws[n] SLDI.B wd,ws[n] SLDI.H wd,ws[n] SLDI.W wd,ws[n] SLDI.D Immediate ColumnsSlide ws endfor .., 3 k in 0, for endfor 1 k in 0, for for i in 0 .. WRLEN/8-1 i in 0 for 646556 MSA SLDI.W s SLDI.H s SLDI.B v 011110 n columns.result The is written to vector and then and cating the 2-dimensional byte array layout. rectangles source two The by Restrictions: No data-dependentare possible.exceptions Operation: Immediate number of columns of columns Immediate number Description: Purpose: Vector registers Format: bytes indatainteger format The slide instructionsslide The 31 Immediate Columns Slide Columns Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SLDI.df I itecture Module, Revision 1.12 Revision Module, itecture 288 ) t+s-1..t 8j+7..8j 8j+7..8j v v   || WR[ws] t+s-1..t Volume IV-j: The MIPS64® SIMD Arch t+8i+7..t+8i t+8i+7..t+8i (WR[wd] WR[wd] j  i + n WR[wd] WRLEN/8 for i in 0 .. s/8-1 i in 0 .. for endfor s * k t = v  endfor  endfor .., 7 k in 0, for endfor SLDI.D s Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate Columns Slide Columns Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SLL.df I 3R 001101 specify modulo the size of 6 5 wt wd t t t 11 10 itecture Module, Revision 1.12 Revision Module, itecture 289 t || 0 || 0 || 0 || ws . df of the elements in bits vector || 0 . 16 15 wd 16i+16-t-1..16i 32i+32-t-1..32i 64i+64-t-1..64i 8i+8-t-1..8i WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i are shifted left by the number ws ws[i] << wt[i] ws[i] <<  000 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] WR[wd] t  WR[wd] t  WR[wd] t  WR[wd] t  SLL.df SLL.df wd,ws,wt SLL.B wd,ws,wt SLL.H wd,ws,wt SLL.W wd,ws,wt SLL.D Vector Shift Left Vector endfor endfor endfor endfor 63 MSA SLL.D .. WRLEN/64-1 i in 0 for SLL.W .. WRLEN/32-1 i in 0 for SLL.B .. WRLEN/8-1 i in 0 for SLL.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector bit count shiftleft. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The elements in vector Format: the element inbits. The result is written tovector data format in integer values are and results operands The 31 Vector Shift Left Shift Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SLLI.df I BIT 001001 6 5 . wd wd t t t 11 10 itecture Module, Revision 1.12 Revision Module, itecture 290 t || 0 || 0 || 0 || ws . df || 0 16 15 bits. Theresult is writtento vector 16i+16-t-1..16i 32i+32-t-1..32i 64i+64-t-1..64i m 8i+8-t-1..8i df/m WR[ws] WR[ws] WR[ws]     WR[ws] Volume IV-j: The MIPS64® SIMD Arch ws are shiftedby left ws[i] << m ws[i] <<  000 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     SLLI.df SLLI.df wd,ws,m SLLI.B wd,ws,m SLLI.H wd,ws,m SLLI.W wd,ws,m SLLI.D Immediate Shift Left endfor endfor endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for .. WRLEN/16-1 i in 0 for 63 7 5 5 6 MSA SLLI.D t SLLI.W t SLLI.B t SLLI.H t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate bitcount shift left. Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: vector elements in The Format: data format in integer values are and results operands The 31 Immediate Shift Left Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SPLAT.df I 3R 010100 6 5 . wd wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 291 elements in the destination vector. elements df ws to all elements in vector rt . 16 15 df 64n+63..64n 16n+15..16n 32n+31..32n 8n+7..8n WR[ws] WR[ws] WR[ws]    in all destinationelements. 25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch  ws[rt] 001 df rt 32i+31..32i 64i+63..64i 8i+7..8i 16i+15..16i element with index given by GPR element given withindex ws 26 25 23 22 21 20 wd[i] GPR[rt] % (WRLEN/16) GPR[rt] % (WRLEN/32) GPR[rt] % (WRLEN/64) GPR[rt] GPR[rt] % (WRLEN/8) GPR[rt] WR[wd] WR[wd] WR[wd] WR[wd]     SPLAT.df SPLAT.df wd,ws[rt] SPLAT.B wd,ws[rt] SPLAT.H wd,ws[rt] SPLAT.W wd,ws[rt] SPLAT.D GPR Element Splat Element GPR endfor endfor endfor .. WRLEN/16-1 i in 0 for endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for value is interpreted modulo the number of data format data of number the modulo interpreted is value 63 rt MSA SPLAT.D n SPLAT.H n SPLAT.W n SPLAT.B n 011110 The operands and results are values in data format values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved GPR selected element replicated selected GPR Description: Purpose: Replicate vector Format: GPR 31 GPR Element Splat MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SPLATI.df I ELM 011001 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 292 ws . wd . 16 15 df 16n+15..16n 32n+31..32n 64n+63..64n 8n+7..8n df/n WR[ws] WR[ws] WR[ws] ed in all destination elements.    to all elements in vector to all  WR[ws] ws 22 21 Volume IV-j: The MIPS64® SIMD Arch 0001  ws[n] 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n in vector 26 25 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] SPLATI.df SPLATI.df wd,ws[n] SPLATI.B wd,ws[n] SPLATI.H wd,ws[n] SPLATI.W wd,ws[n] SPLATI.D Immediate Element Splat endfor endfor endfor endfor 646556 MSA SPLATI.D .. WRLEN/64-1 i in 0 for SPLATI.W .. WRLEN/32-1 i in 0 for SPLATI.B .. WRLEN/8-1 i in 0 for SPLATI.H .. WRLEN/16-1 i in 0 for 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in data format values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate selected element replicat Immediate selected element Description: Purpose: Replicate element Format: 31 Immediate Element Splat Element Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRA.df I specify modulo 3R wt 001101 6 5 wd 16i+15..16i+t 32i+31..32i+t 64i+63..64i+t 11 10 itecture Module, Revision 1.12 Revision Module, itecture 293 8i+7..8i+t . mber of bits vector the elements in mber of bits ws wd . df || WR[ws] || WR[ws] || WR[ws] t t t ) ) ) || WR[ws] 16 15 t ) 16i+15 32i+31 64i+63 8i+7 (WR[ws] (WR[ws] (WR[ws]    25556 Theresult is writtento vector  (WR[ws] Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i ws are shifted right arithmetic by the nu ws[i] >> wt[i] ws[i] >>  001 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] WR[wd] t  WR[wd] t  WR[wd] t  WR[wd] t  SRA.df SRA.df wd,ws,wt SRA.B wd,ws,wt SRA.H wd,ws,wt SRA.W wd,ws,wt SRA.D Vector Shift Right Arithmetic Vector endfor endfor endfor endfor 63 MSA SRA.D .. WRLEN/64-1 i in 0 for SRA.W .. WRLEN/32-1 i in 0 for SRA.B .. WRLEN/8-1 i in 0 for SRA.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector bit count shiftright arithmetic. Vector Description: Purpose: The elements in vector Format: in bits. element the size of the The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Vector Arithmetic Right Shift MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRAI.df I BIT 001001 . wd 6 5 wd 16i+15..16i+t 32i+31..32i+t 64i+63..64i+t 11 10 itecture Module, Revision 1.12 Revision Module, itecture 294 8i+7..8i+t ws . bits.The result is written tovector df m || WR[ws] || WR[ws] || WR[ws] t t t ) ) ) || WR[ws] 16 15 t ) 16i+15 32i+31 64i+63 8i+7 df/m (WR[ws] (WR[ws] (WR[ws]     (WR[ws] Volume IV-j: The MIPS64® SIMD Arch ws are shifted by arithmetic right ws[i] >> m ws[i] >>  001 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     SRAI.df SRAI.df wd,ws,m SRAI.B wd,ws,m SRAI.H wd,ws,m SRAI.W wd,ws,m SRAI.D Immediate Shift RightArithmetic endfor endfor endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for .. WRLEN/16-1 i in 0 for 63 7 5 5 6 MSA SRAI.D t SRAI.W t SRAI.B t SRAI.H t 011110 Restrictions: No data-dependentare possible.exceptions Operation: The operands and results are values in integer data format in integer values are and results operands The Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate bitcount shift rightarithmetic. Description: Purpose: vector elements in The Format: 31 Immediate Shift Right Arithmetic Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRAR.df I specify modulo 3R wt 010101 6 5 , 16) , 32) , 64) wd , 8) 16i+3..16i 32i+4..32i 64i+5..64i 8i+2..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 295 , WR[wt] , WR[wt] , WR[wt] , mber of bits vector the elements in mber of bits ws . , WR[wt] df n-1 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i ) + ts b-1..n significant discarded bit is added to the shifted value (for rounding) and the srar(WR[ws] srar(WR[ws] srar(WR[ws] || ts ||    n 25556 )  srar(WR[ws] Volume IV-j: The MIPS64® SIMD Arch . b-1 wd ws are shifted right arithmetic by the nu ws[i] >>(rounded) wt[i] ws[i] >>(rounded)  001 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] ts return ((ts return WR[wd] WR[wd] SRAR.df SRAR.df wd,ws,wt SRAR.B wd,ws,wt SRAR.H wd,ws,wt SRAR.W wd,ws,wt SRAR.D Vector Shift Right ArithmeticRounded Vector endfor endfor endfor n = 0 then if else endif endfor 63 MSA SRAR.W .. WRLEN/32-1 i in 0 for SRAR.D .. WRLEN/64-1 i in 0 for b) n, srar(ts, function endfunction srar endfunction SRAR.B .. WRLEN/8-1 i in 0 for SRAR.H .. WRLEN/16-1 i in 0 for 011110 Vector bit count shiftright arithmetic with rounding Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The elements in vector data format in integer values are and results operands The Format: the size of the element in bits. The most result writtenis tovector 31 Vector Shift Right Arithmetic Rounded Arithmetic Right Shift Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SRAR.df I itecture Module, Revision 1.12 Revision Module, itecture 296 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Shift Right Arithmetic Rounded Arithmetic Right Shift Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRARI.df I BIT 001010 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 297 , m, 16) , m, 32) , m, 64) , . ws wd bits. The most significant discarded bit is added to the . m , m, 8) df n-1 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i ) + ts b-1..n df/m srar(WR[ws] srar(WR[ws] srar(WR[ws] || ts ||    n )  srar(WR[ws] Volume IV-j: The MIPS64® SIMD Arch b-1 are shifted right arithmetic by ws ws[i] >>(rounded) m ws[i] >>(rounded)  010 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] WR[wd] WR[wd] ts return ((ts return WR[wd] WR[wd] SRARI.df SRARI.df wd,ws,m SRARI.B wd,ws,m SRARI.H wd,ws,m SRARI.W wd,ws,m SRARI.D Immediate Shift RightArithmetic Rounded endfor endfor endfor n = 0 then if else endfor endif 63 7 5 5 6 MSA SRARI.W .. WRLEN/32-1 i in 0 for SRARI.D .. WRLEN/64-1 i in 0 for b) n, srar(ts, function SRARI.H .. WRLEN/16-1 i in 0 for endfunction srar endfunction SRARI.B .. WRLEN/8-1 i in 0 for 011110 Immediate bitcount shift rightarithmetic withrounding Description: (for rounding)shifted value and the resultis written vector to Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The in velements ector Format: data format in integer values are and results operands The 31 Immediate Shift Right Arithmetic Rounded Arithmetic Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SRARI.df I itecture Module, Revision 1.12 Revision Module, itecture 298 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Shift Right Arithmetic Rounded Arithmetic Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 ISRL.df 3R 001101 specify modulo the wt 6 5 wd 64i+63..64i+t 11 10 itecture Module, Revision 1.12 Revision Module, itecture 299 ws ber of bits the elements in vector elements in the bits of ber . . df wd || WR[ws] t 16i+15..16i+t 32i+31..32i+t ) 16 15 8i+7..8i+t 64i+63 || WR[ws] || WR[ws] t t || WR[ws] (WR[ws] 0 0 t    25556  0 Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 8i+2..8i 16i+3..16i 32i+4..32i are shifted right logicalbythe num are shifted right ws ws[i] >> wt[i] ws[i] >>  010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i WR[wt] WR[wt] WR[wt] WR[wt] 26 25 23 22 21 20 wd[i] WR[wd] t  WR[wd] t  WR[wd] t  WR[wd] t  SRL.df SRL.df wd,ws,wt SRL.B wd,ws,wt SRL.H wd,ws,wt SRL.W wd,ws,wt SRL.D Vector Shift Right Logical Vector endfor endfor endfor endfor 63 MSA SRL.D .. WRLEN/64-1 i in 0 for SRL.W .. WRLEN/32-1 i in 0 for SRL.B .. WRLEN/8-1 i in 0 for SRL.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector bit count shiftright logical. Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: vector in elements The Format: ofsize the elementin bits. The result is writtentovector data format in integer values are and results operands The 31 Vector Vector Logical Right Shift MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRLI.df I BIT 001001 . wd 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 300 ws . df bits.The result written is tovector 16i+15..16i+t 32i+31..32i+t 64i+63..64i+t m 16 15 8i+7..8i+t || WR[ws] || WR[ws] || WR[ws] t t t df/m || WR[ws] 0 0 0 t     0 Volume IV-j: The MIPS64® SIMD Arch ws are shifted logical by right ws[i] >> m ws[i] >>  010 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] m m m m WR[wd] WR[wd] WR[wd] WR[wd]     SRLI.df SRLI.df wd,ws,m SRLI.B wd,ws,m SRLI.H wd,ws,m SRLI.W wd,ws,m SRLI.D Immediate Shift RightLogical endfor endfor endfor .. WRLEN/32-1 i in 0 for .. WRLEN/64-1 i in 0 for endfor for i in 0 .. WRLEN/8-1 i in 0 for .. WRLEN/16-1 i in 0 for 63 7 5 5 6 MSA SRLI.D t SRLI.W t SRLI.B t SRLI.H t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate bitcount shift rightlogical. Description: Purpose: vector elements in The Format: The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Shift Right Logical Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRLR.df I 3R 010101 specify modulo the wt 6 5 , 16) , 32) , 64) wd , 8) 16i+3..16i 32i+4..32i 64i+5..64i 8i+2..8i 11 10 itecture Module, Revision 1.12 Revision Module, itecture 301 , WR[wt] , WR[wt] , WR[wt] , ws ber of bits the elements in vector elements in the bits of ber . t is added to the shifted value (for rounding) and the result rounding) (for value shifted the to added is t , WR[wt] df 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i n-1 ) + ts srlr(WR[ws] srlr(WR[ws] srlr(WR[ws] b-1..n    25556  srlr(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || ts are shifted right logicalbythe num are shifted right n ws ws[i] >>(rounded) wt[i] ws[i] >>(rounded)  . 010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i wd 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] ts return (0 return WR[wd] WR[wd] SRLR.df SRLR.df wd,ws,wt SRLR.B wd,ws,wt SRLR.H wd,ws,wt SRLR.W wd,ws,wt SRLR.D Vector Shift Right LogicalRounded Vector endfor endfor endfor n = 0 then if else endif endfor 63 MSA SRLR.W .. WRLEN/32-1 i in 0 for SRLR.D .. WRLEN/64-1 i in 0 for b) n, srlr(ts, function endfunction srlr endfunction SRLR.H .. WRLEN/16-1 i in 0 for SRLR.B .. WRLEN/8-1 i in 0 for 011110 Vector bit count shiftright logical with rounding Vector Description: Purpose: vector in elements The Format: discarded bi significant The most in bits. the element of size is written tovector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Shift Right Logical Rounded Logical Right Shift Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SRLR.df I itecture Module, Revision 1.12 Revision Module, itecture 302 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Vector Shift Right Logical Rounded Logical Right Shift Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SRLRI.df I BIT 001010 6 5 wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 303 , m, 16) , m, 32) , m, 64) , ws . , m, 8) df . bits. The most significant discarded bit is added to the shifted to the discardedis added bit significant mostThe bits. wd m 16i+15..16i 32i+31..32i 64i+63..64i 16 15 8i+7..8i n-1 ) + ts df/m srlr(WR[ws] srlr(WR[ws] srlr(WR[ws] b-1..n     srlr(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || ts are shifted right logical by logical right shifted are n ws ws[i] >>(rounded) m ws[i] >>(rounded)  011 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i 26 25 23 22 wd[i] WR[wd] WR[wd] ts return (0 return WR[wd] WR[wd] SRLRI.df SRLRI.df wd,ws,m SRLRI.B wd,ws,m SRLRI.H wd,ws,m SRLRI.W wd,ws,m SRLRI.D Immediate Shift RightLogical Rounded endfor endfor endfor n = 0 then if else endfor endif 63 7 5 5 6 MSA SRLRI.W .. WRLEN/32-1 i in 0 for SRLRI.D .. WRLEN/64-1 i in 0 for b) n, srlr(ts, function SRLRI.H .. WRLEN/16-1 i in 0 for endfunction srlr endfunction SRLRI.B .. WRLEN/8-1 i in 0 for 011110 Immediate bitcount shift rightlogical withrounding Description: Purpose: elements in vector The Format: (forrounding) value andthe resultis written to vector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Immediate Shift Right Logical Rounded Logical Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SRLRI.df I itecture Module, Revision 1.12 Revision Module, itecture 304 Volume IV-j: The MIPS64® SIMD Arch Exceptions: Exceptions: Disabled Exception. MSA Exception, Instruction Reserved Immediate Shift Right Logical Rounded Logical Right Shift Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

. df MSA MSA MSA MSA ST.df I 2 and the rs df 2 1 0 4 1001 MI10 omic at the element level 6 5 5 wd omic operation issued in no particular at the effective memory location addressed effective the at df 11 10 or store instruction is at itecture Module, Revision 1.12 Revision Module, itecture 305  wd[i] rs e in bytes and have to be multiple of e the size of in bytes and the have data , a, WRLEN/64) to form the effective memory location address. memory to form the effective rs . , a, WRLEN/16) s10 16 15 , a, WRLEN/8) , a, WRLEN/32) , a, WRLEN-1..0 bitfield value dividing the byte offset sizebythe of the data format the byteoffset dividing value bitfield ts, i.e. each element store is an at WRLEN-1..0 register plus offset memory address.plus offset register s10 WRLEN-1..0 WRLEN-1..0 05 have noalignment restrictions. have s10 are stored as elements of data format of elements as are stored wd units is added to the base the added to unitsis df Volume IV-j: The MIPS64® SIMD Arch element's vector position. vector element's address a. */ address 26 25 memory[rs + s10 + i * sizeof(wd[i])] i * sizeof(wd[i])] + s10 + memory[rs rs + s10 * 2 rs + s10 * 8 rs + s10 rs + s10 rs + s10 * 4 rs + s10 / 8 bytes in vector 8 bytes / and the 10-bit signed immediate offset 10-bit and the rs     ST.df ST.df wd,s10(rs) ST.B wd,s10(rs) ST.H wd,s10(rs) ST.W wd,s10(rs) ST.D Vector Store Vector . The. assembler determines the StoreHalfwordVector(WR[wd] StoreDoublewordVector(WR[wd] StoreByteVector(WR[wd] StoreWordVector(WR[wd] virtual tt to vector n byte store defined Implementation /* to virtual tt vector n halfword store defined Implementation /* offset in data format in offset df 61 MSA WRLEN s10 ST.H a ST.D a ST.B a ST.W a a, n) StoreByteVector(tt, function StoreByteVector endfunction n) a, StoreHalfwordVector(tt, function 011110 Restrictions: Address-dependentexceptions are possible. Operation: Vector store to base element-by-element Vector Description: Purpose: The Format: The with no guaranteed ordering among elemen the order to with respect in the assembly language By ar convention, syntax all offsets format effective memory location address address memory location effective memory location address is element aligned, the vect If the effective by thebase 31 Vector Store Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

ST.df I Error Exceptions. itecture Module, Revision 1.12 Revision Module, itecture 306 tor A Disabled Exception. DataA access Disabled Exception. TLB Address and Volume IV-j: The MIPS64® SIMD Arch address a. */ a. address address a. */ address a. */ address /* Implementation defined store n word vector tt to virtual tt to vector n word store defined Implementation /* tt to virtual vector n doubleword store defined Implementation /* endfunction StoreHalfwordVec endfunction function StoreWordVector(tt, a, n) a, StoreWordVector(tt, function StoreWordVector endfunction n) a, StoreDoublewordVector(tt, function StoreDoublewordVector endfunction Exceptions: Exceptions: Exception, MS Instruction Reserved Vector Store Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 . wd SUBS_S.df I 3R 010001 , 16) , 32) , 64) 6 5 , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i writing the result to vector . Signed arithmetic is performed and o ver- 8i+7..8i ws 11 10 , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 307 ws , WR[wt] . df 16i+15..16i 32i+31..32i 64i+63..64i then then 8i+7..8i ble signed values before ble signed values 16 15 n-b+1 n-b+1 1 0  

subs_s(WR[ws] subs_s(WR[ws] subs_s(WR[ws] b-1 b-1 rating the result as signed value. n-1..b-1 n-1..b-1    25556  subs_s(WR[ws] || 1 || 0 Volume IV-j: The MIPS64® SIMD Arch are subtracted the elemefrom nts in vector btract of Signed Values Signed of btract n-b+1 n-b+1 wt saturate_signed(signed(ws[i]) - signed(wt[i])) saturate_signed(signed(ws[i])  000 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 0 and tt = 1 and tt = n-1 n-1 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] return 1 return tt return WR[wd] WR[wd] return 0 return SUBS_S.df SUBS_S.df wd,ws,wt SUBS_S.B wd,ws,wt SUBS_S.H wd,ws,wt SUBS_S.W wd,ws,wt SUBS_S.D Vector Signed SaturatedSubtract of Signed Values Vector endfor endfor endfor if tt endif if tt else endif endfor 63 MSA SUBS_S.W .. WRLEN/32-1 i in 0 for SUBS_S.D .. WRLEN/64-1 i in 0 for b) n, sat_s(tt, function endfunction sat_s endfunction SUBS_S.H .. WRLEN/16-1 i in 0 for SUBS_S.B .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: flows clamp to the largest and/or smallest representa and/or smallest largest clamp to the flows Vector subtraction from vector satu vector subtraction from Vector Description: Purpose: The elements in vector Format: 31 Vector Signed Saturated Su Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SUBS_S.df I itecture Module, Revision 1.12 Revision Module, itecture 308 || tt) n-1 Volume IV-j: The MIPS64® SIMD Arch btract of Signed Values Signed of btract || ts) - (tt || ts) - n-1 (ts t  return sat_s(t, n+1, n) n+1, sat_s(t, return function subs_s(ts, tt, n) subs_s(ts, function endfunction subs_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Signed Saturated Su Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SUBS_U.df I 3R 010001 , 16) , 32) , 64) 6 5 , 8) wd 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i . Unsigned arithmetic is performed and under- is performed and arithmetic Unsigned . 11 10 ws , WR[wt] , WR[wt] , WR[wt] itecture Module, Revision 1.12 Revision Module, itecture 309 ws , WR[wt] . df 16i+15..16i 32i+31..32i 64i+63..64i . 8i+7..8i wd 16 15 subs_u(WR[ws] subs_u(WR[ws] subs_u(WR[ws] rating the result as unsignedvalue. b    25556 then  subs_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || 1 n-b are subtracted from the elements in vector the elements from are subtracted 0 n-b wt saturate_unsigned(unsigned(ws[i]) - unsigned(wt[i])) saturate_unsigned(unsigned(ws[i])   001 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n-1..b 26 25 23 22 21 20 wd[i] (0 || ts) - (0 || tt) - (0 (0 || ts) return 0 return WR[wd] WR[wd] return tt return WR[wd] WR[wd] SUBS_U.df SUBS_U.df wd,ws,wt SUBS_U.B wd,ws,wt SUBS_U.H wd,ws,wt SUBS_U.W wd,ws,wt SUBS_U.D Vector Unsigned Saturated SubtractUnsigned of Values Vector endfor endfor endfor if tt else endif t  endfor 63 MSA SUBS_U.W .. WRLEN/32-1 i in 0 for SUBS_U.D .. WRLEN/64-1 i in 0 for b) n, sat_u(tt, function endfunction sat_u endfunction n) tt, subs_u(ts, function SUBS_U.H .. WRLEN/16-1 i in 0 for SUBS_U.B .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: Vector subtraction from vector satu vector subtraction from Vector Description: Purpose: in vector elements The Format: to vector the result writing clamp to 0 before flows 31 Vector Unsigned Saturated Subtract of Unsigned Values of Unsigned Subtract Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SUBS_U.df I itecture Module, Revision 1.12 Revision Module, itecture 310 Volume IV-j: The MIPS64® SIMD Arch = 0 n return sat_u(t, n+1, n) n+1, sat_u(t, return 0 return if t else endfunction subs_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Unsigned Saturated Subtract of Unsigned Values of Unsigned Subtract Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 3R SUBSUS_U.df I 010001 . The signe d is result , 16) , 32) , 64) ws 6 5 , 8) 16i+15..16i 32i+31..32i 64i+63..64i wd 8i+7..8i , WR[wt] , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 311 signed elements in v ector , WR[wt] ws . 16i+15..16i 32i+31..32i 64i+63..64i df 8i+7..8i 16 15 values saturating the resultsas unsignedvalues values. act of Signed from Unsigned Signed from act of . wd || tt) subsus_u(WR[ws] subsus_u(WR[ws] subsus_u(WR[ws] n-1 are subtracted from the un b    wt 25556 then  subsus_u(WR[ws] Volume IV-j: The MIPS64® SIMD Arch || 1 n-b 0 n-b saturate_unsigned(unsigned(ws[i]) - signed(wt[i])) saturate_unsigned(unsigned(ws[i])   010 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i n-1..b 26 25 23 22 21 20 wd[i] (0 || ts) - (tt (0 || ts) WR[wd] WR[wd] 0 return return tt return WR[wd] WR[wd] SUBSUS_U.df SUBSUS_U.df wd,ws,wt SUBSUS_U.B wd,ws,wt SUBSUS_U.H wd,ws,wt SUBSUS_U.W wd,ws,wt SUBSUS_U.D Vector Unsigned Saturated Subtr Vector endfor endfor endfor if tt else endif t  endfor 63 MSA SUBSUS_U.W .. WRLEN/32-1 i in 0 for SUBSUS_U.D .. WRLEN/64-1 i in 0 for b) n, sat_u(tt, function endfunction sat_u endfunction tt, n) subsus_u(ts, function SUBSUS_U.H .. WRLEN/16-1 i in 0 for SUBSUS_U.B .. WRLEN/8-1 i in 0 for 011110 Purpose: The signed elements in v ector Format: from unsigned subtraction of signed values Vector Description: unsigned saturated andwritten to vector The operands and results are values in integer data format in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: 31 Vector Unsigned Saturated Subtract of Signed Unsigned of Signed from Subtract Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SUBSUS_U.df I itecture Module, Revision 1.12 Revision Module, itecture 312 Volume IV-j: The MIPS64® SIMD Arch = 0 n return sat_u(t, n+1, n) n+1, sat_u(t, return 0 return if t else endfunction subsus_u endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Unsigned Saturated Subtract of Signed Unsigned of Signed from Subtract Saturated Unsigned Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 3R SUBSUU_S.df I 010001 . The si gned result is , 16) , 32) , 64) ws 6 5 , 8) 16i+15..16i 32i+31..32i 64i+63..64i wd 8i+7..8i , WR[wt] , WR[wt] , WR[wt] 11 10 itecture Module, Revision 1.12 Revision Module, itecture 313 , WR[wt] unsigned in velements ector ws . 16i+15..16i 32i+31..32i 64i+63..64i df 8i+7..8i then then 16 15 n-b+1 n-b+1 1 0  

. btract of Unsigned Values btract wd subsuu_s(WR[ws] subsuu_s(WR[ws] subsuu_s(WR[ws] are subtracted from the b-1 b-1 wt n-1..b-1 n-1..b-1    25556 of unsigned values saturating the results as signed values. as signed the results saturating unsigned values of  subsuu_s(WR[ws] || 1 || 0 Volume IV-j: The MIPS64® SIMD Arch btract of Unsigned Values Unsigned of btract n-b+1 n-b+1 saturate_signed(unsigned(ws[i]) - unsigned(wt[i])) - saturate_signed(unsigned(ws[i])  011 df wt 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i = 0 and tt = 1 and tt = n-1 n-1 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] return 1 return tt return WR[wd] WR[wd] return 0 return SUBSUU_S.df SUBSUU_S.df wd,ws,wt SUBSUU_S.B wd,ws,wt SUBSUU_S.H wd,ws,wt SUBSUU_S.W wd,ws,wt SUBSUU_S.D Vector Signed SaturatedSu Vector endfor endfor endfor if tt endif if tt else endif endfor 63 MSA SUBSUU_S.W .. WRLEN/32-1 i in 0 for SUBSUU_S.D .. WRLEN/64-1 i in 0 for b) n, sat_s(tt, function endfunction sat_s endfunction SUBSUU_S.H .. WRLEN/16-1 i in 0 for SUBSUU_S.B .. WRLEN/8-1 i in 0 for 011110 Vector subtraction from vector subtraction from Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The unsigned elements in vector Format: signedsaturated and writtento vector data format in integer values are and results operands The 31 Vector Signed Saturated Su Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

SUBSUU_S.df I itecture Module, Revision 1.12 Revision Module, itecture 314 Volume IV-j: The MIPS64® SIMD Arch btract of Unsigned Values Unsigned of btract s(t, n+1, n) s(t, n+1, (0 || ts) - (0 || tt) - (0 (0 || ts) t  return sat_ return function subsuu_s(ts, tt, n) subsuu_s(ts, function endfunction subsuu_s endfunction Exceptions: Exceptions: Exception. MSA Disabled Exception, Instruction Reserved Vector Signed Saturated Su Saturated Signed Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 . SUBV.df I wd 3R 001110 6 5 wd . The resultis written to vector 16i+15..16i 32i+31..32i 64i+63..64i 11 10 ws itecture Module, Revision 1.12 Revision Module, itecture 315 8i+7..8i ws . df - WR[wt] - WR[wt] - WR[wt] - WR[wt] 16 15 the elements in vector in elements the 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch are subtracted from subtracted are wt  ws[i] - wt[i] 001 df wt 64i+63..64i 16i+15..16i 32i+31..32i 8i+7..8i 26 25 23 22 21 20 wd[i] WR[wd] WR[wd] WR[wd] WR[wd] SUBV.df SUBV.df wd,ws,wt SUBV.B wd,ws,wt SUBV.H wd,ws,wt SUBV.W wd,ws,wt SUBV.D Vector Subtract Vector endfor endfor endfor endfor 63 MSA SUBV.W .. WRLEN/32-1 i in 0 for SUBV.D .. WRLEN/64-1 i in 0 for SUBV.B .. WRLEN/8-1 i in 0 for SUBV.H .. WRLEN/16-1 i in 0 for 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector subtraction from vector. subtraction from Vector Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: vector elements in The Format: data format in integer values are and results operands The 31 Vector Subtract Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA MSA MSA MSA 0 SUBVI.df I I5 000110 . The r esult is written to 6 5 ws wd 11 10 itecture Module, Revision 1.12 Revision Module, itecture 316 the elements in v ector ws . df - t - t - t - t 16 15 16i+15..16i 32i+31..32i 64i+63..64i 8i+7..8i is subtracted from u5 WR[ws] WR[ws] WR[ws]    25556  WR[ws] Volume IV-j: The MIPS64® SIMD Arch 4..0 4..0 4..0 4..0  ws[i] - u5 001 df u5 64i+63..64i 8i+7..8i 16i+15..16i 32i+31..32i || u5 || || u5 || u5 || || u5 59 11 27 3 26 25 23 22 21 20 wd[i] 0 0 0 0 WR[wd] WR[wd] WR[wd] WR[wd]     SUBVI.df SUBVI.df wd,ws,u5 SUBVI.B wd,ws,u5 SUBVI.H wd,ws,u5 SUBVI.W wd,ws,u5 SUBVI.D Immediate Subtract . endfor endfor endfor endfor .. WRLEN/64-1 i in 0 for for i in 0 .. WRLEN/32-1 i in 0 for for i in 0 .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for wd 63 MSA SUBVI.W t SUBVI.D t SUBVI.H t SUBVI.B t 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Immediate subtraction from vector. Immediate subtraction from Description: Restrictions: No data-dependentare possible.exceptions Operation: Purpose: The 5-bit immediate unsigned value data format in integer values are and results operands The Format: vector 31 Immediate Subtract Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

wt , into MSA MSA MSA MSA ws wt 0 VSHF.df I and ws 3R 010101 6 5 wd concatenation of vectors 11 10 itecture Module, Revision 1.12 Revision Module, itecture 317 ws . . df wd 16 15 data elements from the ntrol preserving thevector inputdata vectors. 16k+15..16k bit 6 or bit 7 is 1, there will the but rather be destination no copy, element 0 v 0 then 8k+7..8k 0 then mod (WRLEN/8) mod (WRLEN/16)     mod (WRLEN/4)

 0  v ding control element in ding control elements modulo the number of elements in the concatenated vectors 25556 wd Volume IV-j: The MIPS64® SIMD Arch 16i+5..16i 32i+5..32i 8i+5..8i 16i+15..16i 16i+15..16i 8i+7..8i 8i+7..8i 16i+7..16i+6 8i+7..8i+6 000 df wt vector_shuffle(control(wd), ws, wt)  vector_shuffle(control(wd), WR[wd] WR[wd] WR[wd] WR[wd] WR[wd] WR[wd] WR[wd] 26 25 23 22 21 20 wd WR[ws] || WR[wt] WR[ws] WR[ws] || WR[wt] WR[ws] WR[ws] || WR[wt] WR[ws] if WR[wd] endif k  endif k  k  else else if WR[wd]    VSHF.df VSHF.df wd,ws,wt VSHF.B wd,ws,wt VSHF.H wd,ws,wt VSHF.W wd,ws,wt VSHF.D Vector Data Preserving Shuffle Vector based on the correspon the on based endfor .. WRLEN/32-1 i in 0 for endfor .. WRLEN/16-1 i in 0 for for i in 0 .. WRLEN/8-1 i in 0 for wd 63 MSA VSHF.W v VSHF.H v VSHF.B v 011110 Restrictions: No data-dependentare possible.exceptions Operation: specify the index of the source element. If 0. to set is data format in integer values are and results operands The vector The least significant 6 bits in Vector elements selective copy based on the co on the based copy selective elements Vector Purpose: The vector shuffle instructions selectively copy Format: Description: 31 Vector Data Preserving Shuffle Preserving Data Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

VSHF.df I itecture Module, Revision 1.12 Revision Module, itecture 318 32k+31..32k 64k+63..64k 0 v 0 v 0 then 0 then mod (WRLEN/32)       Volume IV-j: The MIPS64® SIMD Arch 64i+5..64i 32i+31..32i 32i+31..32i 64i+63..64i 64i+63..64i 64i+7..64i+6 32i+7..32i+6 WR[wd] WR[wd] WR[wd] WR[wd] WR[wd] WR[ws] || WR[wt] WR[ws] if WR[wd] endif endif k  else else if WR[wd]  endfor endfor .. WRLEN/64-1 i in 0 for VSHF.D v Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Vector Data Preserving Shuffle Preserving Data Vector MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 XOR.V VEC 011110 6 5 wd in a bi twise logical XOR operation. The wt 11 10 itecture Module, Revision 1.12 Revision Module, itecture 319 ws 16 15 wt 21 20 Volume IV-j: The MIPS64® SIMD Arch . wd 00011 is combined with the corresponding bit of vector ws ws XOR wt  ws 26 25 wd XOR.V XOR.V wd,ws,wt XOR.V Vector Logical Exclusive Or Logical Exclusive Vector 6 5555 6 MSA WR[ws] xor WR[wt] xor  WR[ws] WR[wd] 011110 Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each bit of vector Format: Purpose: or. logical exclusive vector by Vector Description: The operands and results are bit vector values. vector bit are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector 31 Vector Logical Exclusive Or MIPS® Architecture for Programmers Programmers MIPS® Architecture for

MSA 0 XORI.B I I8 000000 6 5 wd in a bitwis e logical XOR operation. The i8 11 10 itecture Module, Revision 1.12 Revision Module, itecture 320 5 5 6 ws 7..0 16 15 ..8i xor i8 ..8i xor i8 8i+7 is combined with the 8-bit immediate ws Volume IV-j: The MIPS64® SIMD Arch .  WR[ws] wd ws[i] XOR i8 XOR ws[i]  11 8i+7..8i 26 25 24 23 wd[i] XORI.B XORI.B wd,ws,i8 XORI.B Immediate Logical Exclusive Or Exclusive Immediate Logical WR[wd] 62 8 MSA endfor for i in 0 .. WRLEN/8-1 i in 0 for 011110 The operands and results are values in integer byte data format. in integer values are and results operands The Restrictions: No data-dependentare possible.exceptions Operation: result writtenis tovector Exceptions: Exceptions: MSA Disabled Exception. Exception, Instruction Reserved Each element byte of vector Format: Purpose: or. logical exclusive vector Immediate by Description: 31 Immediate Logical Exclusive Or Exclusive Logical Immediate MIPS® Architecture for Programmers Programmers MIPS® Architecture for

, …, , 0 . The success- The . . MSAMap sters and the MSA Access Dis- Access MSA and the sters , the hardware finds a free physical physical free a finds the hardware , . When the hardware runs out of physi- out runs hardware the When . rdware/software interface used for vec- used interface rdware/software Section 3.5 “Exceptions” , and 64 physical vector registers pv registers 64 physical vector and , 3 and MSAMap W5 W0 W2 N/A N/A , …, tc …, , Register itecture Module, Revision 1.12 Revision Module, itecture 321 0 Architecture Architecture an 32 physical vector registers per hardware thread per hardware registers vector 32 physical an in the 4 hardware thread contexts. Hypothetically, the inthe4 hardware thread Hypothetically, contexts. lies on the MSAon control regi lies 3 3 0 tc tc tc none none Hardware Hardware Thread Context . to as many vector registers as needed as registers vector many as to 2 4 0 1 esented in the following sections. The ha in the following esented 63 Table A.1 ……… pv pv pv pv pv the running threads or processes tofor theaccommodate pending requests. Section 3.4 “MSA Control Registers” Register Physical Physical Volume IV-j: The MIPS64® SIMD Arch . For example, on writing 1 to 1 on writing example, For MSAAccess. . Each hardware thread context has its own set of MSA controlregisters. its has own thread context . Each hardware 63 Table A.1Table Internal) (Hardware Mapping Register Vector Context Physical-to-Thread The OS grants a vector register hardwareto a OSregister thread by writingtoThe grants vector context a index the register The hardware maintainshardware The look-up a table of the thewitharchitecturally mapping to any of the 64physicalregisters …, W31 usableW0, from registers with 32 vector defined look-up in tablecould shown be as pv Let’s assume an implementation with 4 hardware thread contexts tc contexts thread 4 hardware withimplementation assume an Let’s MSA allows for multi-threadedMSAallows implementations th with fewer the OS re-schedulescal registers, mapping The actual switching. context on software registers the vector restoring and for saving OS is responsible The of thephysicalto registers the thread managedis contexts by the hardware itselftoandthe soft- it is totally invisible ware. pr process is of the this An overview context. The thread contexts have access access have contexts The thread context. tor register allocation and software context switching re allocation context and software tor register in described all Exception, abled ful mapping is confirmed in confirmed is mapping ful MIPS® Architecture for Programmers Programmers MIPS® Architecture for A.1 Mapping Registers Vector Vector Registers Partitioning Registers Vector Appendix A Appendix

322 , …, …, , 0 already Table 0 bits set). are

W1 ). , … Two look-up tables , … Two 12 , s , Saved Registers on Context Switch MSAAccess 11 Registers , s Table A.4 (Register List) ). Now that the context tc that the context Now ). and 10

control register changes from changes control register W2 , …, s , and 64 physical vector registers pv registers vector and 64 physical , 0 3 1 Table A.2 Table using vector register W2.runningother The register using vector 0 W5 W0 W2 register W1 saved from a previous run. from a previous W1saved register N/A W1 Saved Saved , …, tc , Register itecture Module, Revision 1.12 Revision Module, itecture Registers 0 (Hex Mask) (Hex MSAAccess MSAAccess Architecture Architecture MSAAccess

0 does not change. To confirm the avail- confirm MSAAccess notdoes change.To ware context mapping and previously saved vector registers registers vector saved previously and mapping context ware A.2 Saving/Restoring Vector 3 3 0 0 tc tc tc tc none waiting 0x00000002 W1 on thread tc context running onrunning onrunning 0x00000000 0x00000000 none none Hardware Hardware 10 ster usage for each software thread ( software for each usage ster Thread Context MSAAccess. Registers on Context Switch on Context Registers tion, i.e. 4 thread contexts tc contexts 4 thread i.e. tion, 1 2 4 0 0 3 63 bit set) to 0x00000006 (now bitset) to 0x00000006(now

……… tc tc pv pv pv pv N/A pv W2 Register Physical Physical Hardware Hardware , and updates its internal look-up table (see updates its internal and , Volume IV-j: The MIPS64® SIMD Arch is waiting to scheduledis waiting be andhas vector 0 Thread Context Status 1. Updated entry. 1. Updated 12 show softwarethread s show Table A.3Table (OS Internal) Table Mapping Context MSAAccess using W0 and W5. The hardware view of this configuration has been presented above in of has been presented thisconfiguration using above W0 and W5.view The hardware 3 , thread s , 10 11 12 s s s Table A.4 Thread Software Software on tc on 11 and , theOS switching managesthecontext for a set of software threads, s Table A.3 In Table . 63 ) and the second with the vector regi vector ) A.3 with Tableand the second the 0x00000004(only using W2 is being granted access to vector register is W1, the tc is W1, register vector access to granted W2being is using thread is s Using the above hardware implementa hardware the above Using pv A.1 register, maps it to W1 for tc for W1 to it maps register, are used for this purpose: one with the status of the soft the the status of one with are purpose: used for this ( ability, theOS should backread and check ability, If the hardware runs out of physical vector registers to map, theregisters runs outof the physical vector If hardware Table A.3 Table A.2 Table Internal) (Hardware Mapping Register Vector Context Physical-to-Thread Updated MIPS® Architecture for Programmers Programmers MIPS® Architecture for A.2 Vector Saving/Restoring

. 12 10 with- 0 on tc or in the s 12 , Revision 1.12 is clear. Then, is clear. W1 /W2 belongs to s /W2tobelongs 0 , and clearing the clearing , and , which allows for sav- , which allows 10 . Restoring W1 requires W1 Restoring . 1 W1 MSAAccess W2 Saved to start running s Registers Table A.3 Table (Register List) MSASave it is known that tc that known is it . Exception. The exception is signaled exception Exception. The when this register is requested. is this register when and MSAAccess s vector register W1, the hardware sets hardware W1, the register vector s (Register List) spect of the saved value. In this context, the context, In this value. saved the of spect 12 W2 Saved Saved Table A.4 Table all the bits set in either in set bits the all Registers MSAMap (Hex Mask) (Hex . What the OS does is the OS does . What 0 MSASave 0x000000040x00000021 W2 W5 W0, (Hex Mask) (Hex on tc on MSAAccess MSASave 12 . Because W1 has been written, the hardware will set hardware the has been written, W1 Because . er already saved for s er already saved waiting 0x00000004 W1 is set, and W1 is not available i.e. MSAAccess andW1is set, not is available and s running onrunning onrunning 0x00000000 0x00000002 none W1 10 s Volume IV-j: Module The MIPS64® SIMD Architecture W2 0 3 tc tc has two bits set: , but setting in setting but , according to the Saved Registers Mask in Registers according to the Saved Hardware Hardware MSASave 12 Thread Context is set. From the register usage is set. From the register and signals the MSA Access Disabled Disabled signals the MSA Access and 3 0 MSASave tc tc runs writes vector register W2 and read register vector runs writes N/A W2 W2 0 MSASave Hardware Hardware /tc 12 MSAAccess Thread Context Status

and restoring W1 regist restoring and 10 11 0 because the restored W1 is not changed with re with is not changed W1 restored because the s s 10 Table A.4Table Internal) (OS Table Usage Register show the software context mapping / saved registers and the vector register usage look-upregister and the vector registers mappingcontext the software / saved show Thread W1 MSASave Software . MSARequest . Table A.5Table (OS Internal) Table Mapping Context Updated , 10 11 12 W1 s s s W2 Table A.6 W1 Thread Software Software MSAModify and Saved Registers Mask bit W1 is still relevant and shouldRegisters Maskas bit set. be preserved W1is still Saved relevant ers Partitioning 12 MSAModify MSASave a vector load followed by clearing clearing by followed load vector a s Saving W2 requires a vector store followed by setting bit 2 in Saved Registers Mask Registers storeby setting offollowed s W2 requires a vector bit2 Saving inSaved the OS actions:willtake the following •W2 because Save MSARequest If the first MSA instruction s instruction MSA first the If saved registers mask. Therefore registers saved Table A.5 tables after these updates. after these tables Let’s suppose there is context switch between s suppose there is context Let’s because W2 needs to be saved, i.e. because be saved, W2 needs to • for W1 bywritingregister 1 to physical vector Requesta new • Clear ing W2 register used by s by used register W2 ing out changing the current tc the changing out •used by s W1 the previous Restore 323Programmer MIPS® Architecture for VectorRegist

1

324 , the , 1 Table A.6 Table / when re-sched- when 3 Table A.5 software thread starts with with starts thread software me, while the other threads while me, is the next free register, pv free register, next the is 1 1 . Then, the hardware will mark will mark hardware the . Then, used by the new software thread are thread software by the new used ector registers per thread context thread context per registers ector the same thread context (intra re-alloca- context thread same the ing the architecture register index to index architecture register ing the W1, W2 W1, SectionA.2 “Saving/RestoringVector vector register W0 on tc register vector MSAAccess (Register List) MSAUnmap A.3 Registers Vector Physical Re-allocating itecture Module, Revision 1.12 Revision Module, itecture or registers for longer ti for longer registers or control register. If the new If the new control register. , as free. In a different thread context, let’s say tc say let’s context, thread different free. In a as , 3 (between different hardware thread contexts), threadmandateshardware contexts), (betweendifferent e particular mapping is — it can always access the same access — e it can always particular mapping is 0x000000210x00000006 W5 W0, (Hex Mask) (Hex MSAAccess p them by writing the corre- them by writing ed unmap for re-allocation and , and writes 0 to and writes , 11 MSASave . software decides to free up decides software 12 used for W0/tc used ware threads with a lighter usage pattern. witha lighter threads ware 3 0 tc tc , it is guaranteed all vector registers registers , it is guaranteed all vector A implementation withthan less 32 v this procedure is presented above in above presented procedure is this Hardware Hardware read might use lots of vect of use lots might read from one software thread to another on another to threadsoftware one from ead context/architecture register by writ by register ead context/architecture Thread Context Table A.2 Table changed to s to changed 10 . To exemplify, let’s start from the configuration shown in startfrom theshown configuration let’s exemplify, To . MSAAccess . Volume IV-j: The MIPS64® SIMD Arch 11 12 s s rdware thread context. rdware (hardware view). If the view). (hardware Thread Software MSAUnmap Table A.6Table Internal) (OS Table Register Usage Updated for W9. 1. Updated entry, s 1. Updated entry, 1 Table A.2 Table being identicalto . It is not relevant if the software knows what th knows software if the . not It is relevant 1. entry. Updated , then it saves W0, marks W0 as saved for s as W0saved W0, marks then it saves , 11 , i.e. the hypothetical mapping in mapping hypothetical i.e. the , 1 pv software could now map a new vector register, decidespve.g. W9, and if the , vector map a new could now software will be used by tc Inter-thread contexts physical vector registers re-allocation registers physical vector contexts Inter-thread properly saved/restored. An example of An example properly saved/restored. Registerson Context Switch” intend all the registers to save thread context the owner sponding indexes to spondingindexes depends the actual register usage at run-time and the OS scheduling strategy. dependsthe actual run-timeusage at register and theOS schedulingstrategy. th one software a typical application, In The performance of a multithreaded MS performance of The MSAMap and view) (OS A physical register is A mappedphysical register to a thr uling s MSASave register fromregister thesame ha re-allocation registers vector Physical the in bits corresponding the setting by done is tion) TheOS could schedule themost threaddemanding on the same thread context, software few. sporadicallyvery use soft for the another context while time-sharing MIPS® Architecture for Programmers Programmers MIPS® Architecture for A.4 Register Allocation Vector Heuristic for A.3 Registers Vector Re-allocating Physical

ignment restrictions. operates float- with on the vector registers. vector on the odify and Request control regis- and odify 2008/ABS2008 requirements. 2008/ABS2008 nusable exception for privileged MSA for privileged nusable exception ption pseudocode clarification. ption e instructions FCGT, FSGT, FCGE, FSGE. FCGE, FSGT, FCGT, instructions e e context of vector registers partitioning. registers vector of context e instructions with no al instructions and FPU is usable S) for both subnormal input operands and tiny and tiny operands input both subnormal for S) (FCAF, FSAF, FCUEQ, FSUEQ, FCULT, FCUEQ, FSUEQ, FCULT, FSAF, (FCAF, h to zero to in compare instructions. h de the 128-bit wide vector registers. vector the wide de 128-bit after changing FR from 0 to 1 and from 1 to 0. from 0 and to 1 FR from changing after rounding (SRAR, SRARI. SRLR, SRLRI) and hor- (SRAR, SRARI. rounding Description ify that the output is always a signaling NaN value a signaling NaN value ify is that output always the itecture Module, Revision 1.12 Revision Module, itecture 325 on for out of range numeric operands. for out on FRCP, and FRSQRT instructions. FRSQRT and FRCP, gh read/write operations read/write operations gh on detected when NX is set. NX is when detected on e Release 5 and FPU Release 5 and NAN e UNPREDICTABLE MSAEn bit and MSA Access, Save, M MSAEn Access, Save, and MSA bit udocode update. udocode NaN definition, non-trapping exce NaN definition, S Architecture Release 5. Release Architecture S control registers ing-point registers in 32-bit mode. registers ing-point ters zero. is excepti floating-point any for non-zero results. non-zero add/sub izontal (HADD_S,HADD_U, HSUB_S,HSUB_U). FCULE,FSULE, FSUN,FCOR, FSOR,FCUNE, FSUNE). FSULT, MSUBR_Q. MSUB_Q, •typo fix. LDX/STX pseudocode • clarification. description FLOG2 • for GPR-based instructions. 64-bit fix Typo • for elements Reserved df/nvalues outsi • constant to be 128. WRLEN Specified •opcode H/W 3RF table vs. W/D typo fixed. • NaN propagation rule. Specified •* 0. for infinity Invalid FMADD/FMSUB signals • CTCMSA/CFCMSA Coprocessor 0 signal U • when executed be instruction can not MSA • excepti Overflow the signals FTQ • bits. control flush to zero new Specified • 0. to from 1 0 and to 1 FR of from changing effects the Clarified • instructions: INSVE, Added new •format. instruction VECS5 unused Removed •load/store for address calculation Clarified •(F one bit with to zero is controlled Flush • Explicit MIPS Architectur •flus subnormal input operands Clarified • are FPR registers •to SUBSUS_U. SUBUS_S andto SUBSUU_S, SUBSS_U to INSERT, INSV Renamed • truncation. to integer floating-point for FTRUNC_U) instructions (FTRUNC_S, New •shift right with instructions for New •compar floating point redundant Eliminated •changes Opcode for FCNE, FSNE, MUL_Q, MULR_Q, MADD_Q,MADDR_Q, • th access in registers floating-point Defined • compare point instructions floating New •pse Load/store Volume IV-j: The MIPS64® SIMD Arch Date May 31, 2013May • description to spec NX mode Fixed 1.001.01 201212, December • 8, 2013 February MIP • Signaling 1.024, March 2013 • state for Reset 1.038, March 2013 •of hi FPR the effect Specified 1.04 Revision MIPS® Architecture for Programmers Programmers MIPS® Architecture for Revision History Revision Appendix B Appendix

ructions. as this case isno ned saturation pseudo- saturation ned gnal Inexact only for finite only for finite Inexact gnal SA instructions. SA and do not guarantee any order- guarantee any do not and after writing scalar float- writing after ess is element aligned. xt. ructions. Updated ELM Instruction ructions. For- Updated ELM Instruction ng for approximate procal inst reci for approximate ng ng Underflow, Overflow, and Inexact signal- Inexact and Overflow, Underflow, ng SA descriptions. SA add/sub NaN propagation propagation NaN add/sub gisters in VSHF.W and COPY_S/U pseudo- and gisters in VSHF.W sub and signed-to-unsig sub and t values where the sign is not relevant. is not sign the where values t SUB added to the Set Summary. Instruction rounded result based on the rounding mode. result on the rounding based rounded exact for values outside the range. for values exact ed control registers behavior. control registers ed floating-point compare instructions. floating-point Description itecture Module, Revision 1.12 Revision Module, itecture 326 hift add LSA and DL XUPR FCLASS. to non arithmetic instruction and te MSA Disabled exception. lues in data format units. format in data lues ption and FMAX_A pseudocode. ption UNPREDICTABLE ctions FRCP and FRSQRT si ctions FRCP and FRSQRT atomic at the element level level element the at atomic MIPS logo and legal te legal MIPS logo and er layout and data format. layout and er ral left-to-right rule. ral guaranteed only if the addr only if the guaranteed EE 2008 maxNum/maxNumMag and minNum/minNumMag in 2008 maxNum/maxNumMag EE in the LSA and DL and LSA the sa in e instructions and FM instructions and e describing the NaN propagation rules. the propagation NaN describing . MSACSR pseudocode.Flags update MSACSR www.wavecomp.ai with u2 ing-pointvalues restrictions. Flush to zero (FS) does not apply to 16-bit float data used by format conversion instruc- used by format conversion data float 16-bit to apply not does (FS) to zero Flush tions FEXUPL, and FEXDO, FE ing among elements. among ing ing. code. mattable accordingly. FMAX/FMAX_A FMIN/FMIN_A.and numerical operands. numerical code. exception from the gene from exception •FFQL/FFQR scaling typo. Fixed •Replaced •Replaced •atomicity is Load/store • MSA opcodes genera Reserved • bytes. in is LD/ST offset the for syntax assembler the that Specified •alignment any have address LD/ST effective calculated nor the address base the Neither • • instructions are Load/store •zero. as be written must and as zero read R0: as fields reserved Defined • SLD/SLDI regist Clarified •are MIPS64 inst COPY_S/U.Dand INSERT.D • to the ordered Added “ordered” text •selection. for bit in mulx_s/u pseudocode fixed Typo • 2B. class to document AFP MIPS32 MSA Changed •is the for Underflow value default The • instru Approximate reciprocal • regardi clarifications FRSQRT and FRCP • as 8-bi i8 immediates defined Explicitly • destination re source and for fixed Typos •offsets. s10 descriptions show LD/ST •signali the exception flush-to-zero Specified •comply Reciprocalwith instructionsthe IEEE rules. FRCP FRSQRT and •bits are register vector Higher •CTCMSA/CFCMSA reserv Specified • instructions. LDX/STX load/store indexed Removed •left-s base Introduced architecture •descri FEXUPL in fixed Typos •typo fixed. FCLASS pseudocode •In and signals both the Overflow FTQ •changed. opcode LDI •10-bit va are offsets Load/store •16 bits. are offsets Branch •rules. quiet to NaN conversion Added signaling •point add/ multiply for fixed Corrections • for multiply text superfluous the Deleted •IE referenced Explicit Copyright © Wave Computing, Inc. All rights reserved. Volume IV-j: The MIPS64® SIMD Arch Date April 2014 8, June 21, 2013 June •to change update Template 1.05 1.12 3, 2016 February •MSA64. from COPY_U.D removed 1.06 August 6, 2013 •immediat Missing 1.11 1.07 October 2, 2013 • in fixed Typo 1.09 20, December 2013 • formats. in the instruction some typos Fixed 1.10 7, 2014 February •the text Expanded Revision MIPS® Architecture for Programmers Programmers MIPS® Architecture for