IBM zSeries Mainframes

~ Development IBM Corporation Charles F. Webb Evolution of z/Architecture z 1960s S/360: 24-bit address z 1970s S/370: virtual address/DAT z 1980s 370/XA: 31-bit address, new I/O z 1990s ESA/390: multiple address spaces z 1998 added IEEE 754 Standard z 2000 z/Architecture: 64-bit address Uni Performance

Relative Performance 200 CMOS-99 = 9672-G6 CMOS-98 = 9672-G5 BIPOLAR 100

CMOS 50 CMOS-97 = 9672-G4 CMOS-96 = 9672-G3 30

20 CMOS-95 = 9672-R2/R3

CMOS-94 = 9672-R1 10

9221-170 5 9221-150

3 Performance x 2 Performance x 2 2 every 4 to 5 years every 12-24 months 9370-50 1 Year 70 75 80 85 90 95 2000 ProcessorProcessor OrganizationOrganization

Inst Tags BTB Decode Tags 256K Buffer 256K I- Inst D-Cache TLB TLB Fetch Inst AGen Queue ST Buf

DAT Operand GRs FPRs Engine Compare Buffer

Checkpoint Co- State FXU FPU Processor Execution Execution

Data Address Inst/Cntl CBus CP Pipeline IF,Dec,Agen,C1,C2,E1,WR Decode / Op Fetch Execute/Commit I Fetch I REG Operands Instr Ifetch request Decode Exec ADDR B, X, D Instr Result Cache access Op Req Stage I Queue Data Address Check Cache I Buff Stage Data Chkpt Op Buff CP Chip - 17.9mm x 9.9mm – 47 million transistors

BTBBTB II UnitUnit AA 2k2k xx 44 FXUFXU AA FPUFPU AA

II CacheCache DD CacheCache DD CacheCache ControlsControls ControlsControls II CacheCache 256256 KBKB RR UnitUnit 256256 KBKB

CompressionCompression FXUFXU BB FPUFPU BB TranslatorTranslator II UnitUnit BB Binodal L2 System

STI STI MBA1 PU00 PU01 PU02 PU03 PU04 PU05 PU06 PU07 PU08 PU09 MBA0

16 byte bidi

SCD2,3 SCD0,1 Between SCs SCC0 2 uni 16 byte 4 busses CRP0 CRP1 To MEM 16 byte bidi SCD6,7 SCC1 SCD4,5

MBA3 PU0A PU0B PU0C PU0D PU0E PU0F PU10 PU11 PU12 PU13 MBA2 STI STI Millicode z Licensed Internal Code layer for complex functions – System/control ops, interruptions, service operations, etc. z Variant of z/Architecture ISA – Unique GRs and ARs – Includes all hardwired z/Architecture ops – Modest set of millicode-only ops – Access to all processor state via R-Unit z Millicode mode entered under hardware control – Mode-changing branch with minimal context z Uses same instruction pipeline as normal code – Minimal unique hardware z Enables architectural and design flexibility – New ops and features, workarounds – Full CISC support with manageable complexity R-Unit z Focal point for hardware fault checking – Mirrored unit comparators and other checkers z Buffers entire processor architected state – GRs, FPRs, ARs, CRs, PSW – Millicode CRs, SysRegs, Timing facility, etc. z Maintains CP checkpoint for recovery – Processor state protected via ECC or equivalent – Granularity: every HW instruction (regular or millicode) z Provides R/W access to processor state – Millicode special ops plus a few hardwired z/Arch ops – State mapped into 256 x 64-bit register space R-Unit CBus-A IU/EU-A CBus-B IU/EU-B

ECC Gen ECC Gen Error Check Check Checkers Recovery Control Buffer Buffer

EU-A Write Timing System System Operations EU-B Address Checkpoint Facility Registers State- IU-A Registers Read Inst Addr Async IU-B Address PSW Interrupts Processor State

EU-A EU-B Fault Checking z Combination of checking schemes used – Mirrored units: complex logic and dataflow – Parity check: byte-coherent dataflow, BTB, etc. – Functional / state checks: cache controls, co-processor – ECC / duplicate parity: checkpoint state in R-Unit z All processor state updates sent to R-Unit – Checked on hardware-instruction granularity z Results committed to checkpoint only if clean – All mirrored compares equal – No faults detected anywhere in processor z Target: near-100% detection of hardware faults – Both hard/permanent and soft/transient varieties CP Chip - Checking Strategy by Unit

BTBBTB II UnitUnit AA 2k2k xx 44 FXUFXU AA FPUFPU AA

II CacheCache DD CacheCache DD CacheCache ControlsControls ControlsControls II CacheCache 256256 KBKB RR UnitUnit 256256 KBKB

CompressionCompression FXUFXU BB FPUFPU BB TranslatorTranslator II UnitUnit BB zSeries RAS Priorities 1. Ensure data integrity • Requires ~100% error detection • zSeries is industry leader 2. Keep applications on the air • Whenever #1 is not compromised • Requires fine-grained recovery • zSeries is industry leader 3. Repair on-line • Primarily 2nd level packaging constraint ¾ Crash and Re-boot is not good enough! Fault Recovery

I-Unit Check all state updates I-Unit (unchecked) Cache Preserve known good state (parity) (mirror) If error Stop state updates Refresh from saved state Restart CPU E-Unit If error persists E-Unit (mirror) Extract saved state (SE) (unchecked) Load into spare CPU R-Unit Start spare CPU (ECC on saved state)

Address Cache data Instructions Results / state updates Saved state data R-Unit CBus-A IU/EU-A CBus-B IU/EU-B

ECC Gen ECC Gen Error Check Check Checkers Recovery Control Buffer Buffer

EU-A Write Timing System System Operations EU-B Address Checkpoint Facility Registers State- IU-A Registers Read Inst Addr Async IU-B Address PSW Interrupts Processor State

EU-A EU-B Dynamic CPU Sparing Operating CPU Service Processor Spare CPU Check all state updates Extract saved state from CPU Wait in idle loop until needed Preserve known good state CPU state Load CPU state from buffer If error Adjust CPU numbers Special CPU instruction Stop state updates Check for special conditions Refresh from saved state Store CPU state in memory Replace R-Unit contents Restart CPU Signal Spare CPU Refresh CPU state If error persists Restart CPU with new state Signal service processor

System I-Unit I-Unit Cache I-Unit Cache I-Unit (unchecked) (unchecked) (mirror) Memory (mirror) (parity) State (parity) Buffer

E-Unit E-Unit E-Unit E-Unit (unchecked) (unchecked) (mirror) (mirror) R-Unit R-Unit (ECC on saved (ECC on saved state) Service state) Processor Other RAS Features z CP Arrays (caches / tags / TLBs / BTB) – Data stored through to L2 to get ECC protection – Line and set deletion for persistent array faults z L2 and Memory – ECC on arrays and busses – Retry on failing commands – DRAM chip sparing z I/O Subsystem – Multiple paths to devices – Multiple identical hubs / channels – Retry on failing commands z Power / Cooling / Service – N+1 redundancy Conclusion z Custom CISC z Durable Design Point z Industry-Leading RAS z More to Come Want to know more?

z C.F.Webb & J.S.Liptay, “A High-frequency Custom CMOS S/390 Microprocessor”, IBM Journal of R&D, July/September 1997 z T.J.Slegel et al., “IBM’s S/390 G5 Microprocessor”, IEEE Micro, March/April 1999 z C.F.Webb, “ S/390 Microprocessor Design”, IBM Journal of R&D, November 2000 z E.M.Schwarz et al., “The of the IBM eServer z900”, IBM Journal of R&D, July/September 2002 z K.E.Plambeck et al., “Development and Attributes of z/Architecture”, IBM Journal of R&D, July/September 2002 z Questions? cfw@us..com Thanks for Listening!