The first space-qualified Klessydra RISC-V microcontroller to be launched on a satellite
presenters: LUIGI BLASI, FRANCESCO VIGLI
Digital System Lab at Sapienza University of Rome

Mauro Olivieri, Associate Professor
Francesco Menichelli, Assistant Professor
Antonio Mastrandrea, Research Fellow
Francesco Lannutti, collaborator @Synopsys
Stefano Sordillo, master thesis student
Abdallah Cheikh, PhD candidate
Giulia Stazi, PhD cand. @UTC inc.
Luigi Blasi, PhD cand. @DSI Gmbh
Francesco Vigli, collaborator @ELT Spa

Outline
• Motivation
• Space environment issues
• Architectural fault tolerance
• Klessydra Fx3x core family
• Results and perspective
Design and evaluation of fault-tolerant architectures for RISC-V processors cores qualified for space (12/06/2019)

Motivation
• Nanosatellites (CubeSat, PicoSat, PocketCube, etc.) allow academic institutions and small companies to afford space mission research.
• The low production volume demands the use of COTS components, yet in extremely severe operating conditions.
• Specialized microarchitecture design, along with the exploitation of remotely configurable devices, is therefore very interesting.
• So, what about sending a RISC-V microcontroller into space?
Klessydra RISC-V Cores: a PULPino compatible core family

[Figure: the PULP platform family. PULPino feat. ZeroRiscy core; PULPino feat. RI5CY core; PULPino feat. RI5CY-FPU core; PULPissimo feat. RI5CY-FPU core; PULP feat. RI5CY-FPU multi-core; PULP feat. RI5CY-FPU multi-core, multi-cluster; PULP feat. ARIANE Linux compatible core. The Klessydra cores below are added to the PULPino branch.]

PULPino feat. Klessydra S0 core
• Starting point
• M mode v1.10
• RV32I user ISA
• single hart

PULPino feat. Klessydra T0 cores
• M mode v1.10
• RV32I user ISA
• Atomic ext. (partial)
• multiple PC & CSR
• multiple interleaved harts

PULPino feat. Klessydra T1 cores
• “edge computing” core
• extends T0 microarchitecture
• RV32IM
• + configurable multiple scratchpad memories
• + configurable vector unit
• Extended ISA

PULPino feat. Klessydra Fx core
• “space-qualified” core
• T0 microarchitecture
• + configurable HW/SW fault-tolerance support
Single event effects and protection techniques

• SET: Single Event Transient
• SEU: Single Event Upset
• MBU/MCU: Multiple Bit (Cell) Upset
• SEFI: Single Event Functional Interrupt
[Figure: spatial redundancy (MODULE 1, MODULE 2, ..., MODULE N feeding a CHECKER) and temporal redundancy (TASK1 repeated along the time axis, with a check point).]

Spatial redundancy
• Double Modular Redundancy (DMR) with error detection
• Triple Modular Redundancy (TMR) with error correction

Temporal redundancy
• Repetition with error detection techniques
• Check-point and recovery
• Task level or cycle level
• Hardware and/or software

Miscellaneous
• Watch-dog timer as a particular case
• EDAC protection for memories
• Fail-safe FSMs
List of explored techniques (still ongoing work…)

• full Triple-Modular-Redundancy on hart’s register file, CSRs, and on pipeline registers;
• partial Triple-Modular-Redundancy on register subsets;
• pipeline doubling with lockstep execution;
• delayed shadow thread with single pipeline;
• delayed shadow thread with double pipeline;
• checkpoint setting and status recovery;
• watchdog timer and watchdog hart;
• memory ECC protection.
Klessydra core multi-hart pipeline scheme

[Figure: multi-hart pipeline scheme. The instruction processing pipeline (pipeline stage logic, each stage carrying a PC value) is connected to the program memory, the data memory and the data register files. The pipeline control section contains the PC update logic, the CSR logic and the CSR register files, together with a Harc counter and a MUX that selects the actual PC.]

• Hardware support for interleaved multi-threading (barrel processor)
• Inter-hart software interrupts supported
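The interleaved multi-threading mentioned above can be illustrated with a minimal Python sketch. It assumes a plain round-robin counter that selects one hart per cycle and a fixed 4-byte instruction size; the function name `run_barrel` and the counting direction are illustrative assumptions, not details taken from the Klessydra RTL.

```python
def run_barrel(start_pcs, cycles):
    """Toy barrel scheduler: one instruction is fetched per cycle,
    and the hart counter rotates over all harts in round-robin order.

    start_pcs: initial program counter of each hart (hypothetical values).
    Returns the list of (hart, pc) fetches and the final PCs.
    """
    pcs = list(start_pcs)          # one private PC per hart
    hart_counter = 0               # selects the hart that fetches this cycle
    issued = []
    for _ in range(cycles):
        issued.append((hart_counter, pcs[hart_counter]))
        pcs[hart_counter] += 4     # assume sequential 32-bit instructions
        hart_counter = (hart_counter + 1) % len(pcs)
    return issued, pcs


if __name__ == "__main__":
    fetches, final_pcs = run_barrel([0x000, 0x100, 0x200], cycles=6)
    for hart, pc in fetches:
        print(f"hart {hart} fetches at pc=0x{pc:03x}")
```

Because consecutive pipeline slots belong to different harts, each hart keeps its own PC and CSR state, which is why the scheme replicates the PC update logic and CSR files per hart.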
Design case: Klessydra F03X – full TMR

• Klessydra T03X core with registers protected by TMR

[Figure: TMR register. The combinational logic drives three D flip-flops that share the same CLK; a voter selects the majority of their outputs.]

• Fail-safe and fault-tolerant FSMs
• TMR registers synthesized either as structural modules or functions
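A TMR-protected register can be modelled in a few lines of Python. This is only a behavioural sketch: the class `TmrRegister` and the helper `flip_bit` are hypothetical names introduced here, and the bitwise majority function is the textbook voter, which may differ from the voter actually synthesized in the F03X core.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority of three integer words."""
    return (a & b) | (b & c) | (a & c)


class TmrRegister:
    """Behavioural model of a register stored in three redundant copies."""

    def __init__(self, value=0, width=32):
        self.mask = (1 << width) - 1
        self.copies = [value & self.mask] * 3

    def write(self, value):
        # The same combinational result is loaded into all three copies.
        self.copies = [value & self.mask] * 3

    def read(self):
        # The voter masks any upset confined to a single copy.
        return majority_vote(*self.copies)

    def flip_bit(self, copy, bit):
        # Test helper emulating a single event upset in one copy.
        self.copies[copy] ^= 1 << bit


if __name__ == "__main__":
    reg = TmrRegister(0xDEADBEEF)
    reg.flip_bit(copy=1, bit=7)
    print(hex(reg.read()))
```

The voter corrects any number of flipped bits as long as, for every bit position, at most one copy is wrong; two copies upset at the same bit position defeat it.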
Microarchitecture

[Figure: Klessydra F03X microarchitecture. Per-hart CSR units (H0-CSR, H1-CSR, H2-CSR) hold MSTATUS, MIP, MEPC, MTVEC, MCAUSE, MESTATUS and the counters and performance registers as TMR registers with voters; the PC update logic uses 32-bit TMR registers with voters and a MUX; CSR_done is a TMR flip-flop; input and output voting is applied on the unit interfaces. The pipeline (Fetch, Decode, Execute, Write Back) uses per-hart register files (H0, H1, H2: 31 x 32 bit reg) with voting. A Debug Unit is connected to the PC, CSR and pipeline units.]

Design case: Klessydra F13X - Double Pipeline
• Double Modular Redundancy error detection system
• Protected checkpoint restoring on fault occurrence
• Dedicated hardware unit to support checkpoints (Checkpoint & Restore Unit)
• Restoring and checkpoints managed by 4 pseudo–instructions
  – System controlled by dedicated CSR accesses
• Additional features for the execution of dependent threads
• Program Counter and CSR Unit fully protected with TMR technique
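The combination of DMR error detection and checkpoint restoring listed above can be sketched as a small behavioural model in Python. Everything named here (`run_dmr`, the `checkpoint_every` interval, the way a fault is injected into the second pipeline) is an assumption made for illustration; the source does not describe the Checkpoint & Restore Unit or its pseudo-instructions at this level of detail.

```python
def run_dmr(program, initial_state, checkpoint_every=2, faults=()):
    """Execute `program` on two redundant pipelines with checkpoint restore.

    program: list of functions mapping a register dict to a new register dict.
    faults:  step indices at which one transient upset hits pipeline B
             (each fault fires once).
    Returns (final_state, executed_steps, restores).
    """
    state = dict(initial_state)
    checkpoint = (0, dict(state))        # (step index, saved registers)
    pending = set(faults)
    step = executed = restores = 0
    while step < len(program):
        op = program[step]
        out_a = op(dict(state))          # pipeline A
        out_b = op(dict(state))          # pipeline B, same inputs
        executed += 1
        if step in pending:              # emulate a single event upset
            pending.discard(step)
            victim = sorted(out_b)[0]
            out_b[victim] ^= 1
        if out_a != out_b:               # DMR can detect, not correct
            step, saved = checkpoint
            state = dict(saved)          # roll back to the last checkpoint
            restores += 1
            continue
        state = out_a                    # results agree: commit
        step += 1
        if step % checkpoint_every == 0:
            checkpoint = (step, dict(state))
    return state, executed, restores


if __name__ == "__main__":
    prog = [lambda s: {**s, "x": s["x"] + 1}] * 6
    print(run_dmr(prog, {"x": 0}))
    print(run_dmr(prog, {"x": 0}, faults={3}))
```

In this model every detected mismatch costs the re-execution of all steps since the last checkpoint, so the total step count depends on where the faults land relative to the checkpoints.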
Microarchitecture

[Figure: Klessydra F13X microarchitecture. The PC update logic and the CSR unit (H0-CSR, H1-CSR, H2-CSR) are built from TMR registers with voters. Two pipelines (IF, ID, IE, WB with pipeline registers), each with its own register file (Regfile 3 x (31 x 32 bit)) and CRU, are supervised by the error control and Dpipe control blocks. Further labels: CRU reg Update, CSTATUS - TMR, Debug Unit.]
Design case: Klessydra F23X – Shadow Thread and Double Pipeline

[Figure: Instruction Decode, Instruction Execution and Instruction WB stages of the two pipelines, with Hart X, Hart X+1 and the Shadow hart alternating between them through pipeline registers, followed by the SCU voter.]

• Time-space redundancy hybrid technique
• 3 interleaved harts alternated on 2 pipelines
• Error correction through 3 results majority vote
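The "3 results majority vote" can be sketched in Python as follows. The sketch only models the voting step: the same operation is evaluated three times, one evaluation may be corrupted by a transient, and the value produced at least twice is committed. The mapping of the three executions onto two pipelines is not modelled, and whether the real SCU voter compares whole words or individual bits is not stated in the slides; `execute_redundant` and `vote_results` are hypothetical names.

```python
from collections import Counter


def vote_results(results):
    """Return the value produced by at least two of the three executions."""
    value, count = Counter(results).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority among redundant results")
    return value


def execute_redundant(op, operands, upset=None):
    """Run `op` three times and vote on the results.

    upset: optional (execution_index, bit) pair; that execution's result
           gets one bit flipped, emulating a single event transient.
    """
    results = []
    for slot in range(3):
        result = op(*operands)
        if upset is not None and upset[0] == slot:
            result ^= 1 << upset[1]
        results.append(result)
    return vote_results(results)


if __name__ == "__main__":
    add = lambda a, b: a + b
    print(execute_redundant(add, (2, 3), upset=(1, 4)))
```

Unlike plain DMR, a single corrupted execution is outvoted without any rollback, which is consistent with correction rather than detection being listed for this design case.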
Microarchitecture

[Figure: Klessydra F23X microarchitecture. The PC update logic and the CSR unit (H0-CSR, H1-CSR, H2-CSR) are built from TMR registers with voters and driven by PC_control and CSR_control. Two pipelines (IF, ID, IE, WB with pipeline registers), each with its own register file (Regfile 3 x (31 x 32 bit)), SRU and MUX, are coordinated by the shadow voting and shadow sync control block. A Debug Unit is connected to the PC, CSR and pipeline units.]
Functional tests

Official RISC-V test routines and Klessydra-specific test routines were executed to compare functional correctness and performance between the Fx3x cores and the non-fault-tolerant T03x core. Reported execution latency in cycles:

                 F23X      F13X      F03X      T03X
testALU          146 679   123 413   123 131   123 135
testCSR           77 396    63 380    63 098    63 098
testIRQ          337 613   316 792   316 383   316 383
testException     51 425    50 418    43 949    43 949
sw_irq_test        3 508     2 436     2 158     2 158
WFI_test           3 534     2 397     2 119     2 119
barrier_test       4 032     2 691     2 415     2 415
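The latency cost of each fault-tolerant core relative to T03X follows directly from the cycle counts in the table above. The short Python sketch below only restates those numbers and computes the relative overhead; the dictionary layout and the function name `overhead` are illustrative choices.

```python
# Cycle counts copied from the functional test table above.
CYCLES = {
    "testALU":       {"F23X": 146679, "F13X": 123413, "F03X": 123131, "T03X": 123135},
    "testCSR":       {"F23X": 77396,  "F13X": 63380,  "F03X": 63098,  "T03X": 63098},
    "testIRQ":       {"F23X": 337613, "F13X": 316792, "F03X": 316383, "T03X": 316383},
    "testException": {"F23X": 51425,  "F13X": 50418,  "F03X": 43949,  "T03X": 43949},
    "sw_irq_test":   {"F23X": 3508,   "F13X": 2436,   "F03X": 2158,   "T03X": 2158},
    "WFI_test":      {"F23X": 3534,   "F13X": 2397,   "F03X": 2119,   "T03X": 2119},
    "barrier_test":  {"F23X": 4032,   "F13X": 2691,   "F03X": 2415,   "T03X": 2415},
}


def overhead(test, core, baseline="T03X"):
    """Relative extra latency of `core` versus `baseline` (0.0 means equal)."""
    return CYCLES[test][core] / CYCLES[test][baseline] - 1.0


if __name__ == "__main__":
    for test in CYCLES:
        row = ", ".join(
            f"{core} {overhead(test, core):+.1%}" for core in ("F23X", "F13X", "F03X")
        )
        print(f"{test:14s} {row}")
```

In these fault-free runs F03X matches T03X on every test except testALU, where the two counts differ by four cycles, while F13X and F23X add latency on all of them.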
Hardware resource utilization (preliminary)

• FPGA: Xilinx Artix 7 a35
• Synthesis and implementation with Vivado 2018.2
• Vivado synthesis "keep equivalent registers" option enabled

         F23X    F13X    F03X    T03X
LUT      25138   19377   21117   8804
FF        9299    9401   14863   5401
IO         269     269     269    269
BUFG        12      12      12     10
LATCH     1533    1676    1079    547
Fault injection test results

• Simulated error injection over all the internal registers of the architecture.
• Random single event upsets for different upset rates.
• Worst-case fault injection rate for all architectures: 1 upset/bit every 1 ms.

Total cycle count to complete test programs:

                  F23X      F13X (min / max)     F03X      T03X
FI_testALU_FX3X   146 679   123 441 / 187 609    123 131   FAIL
FI_testCSR_FX3X    77 398    63 618 / 121 748     63 098   FAIL
FI_testIRQ_FX3X   337 641   380 035 / 756 406    316 383   FAIL
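A random upset campaign over a register bank can be imitated with a small Monte Carlo model in Python. This is not the authors' simulation flow: the function `inject_upsets`, the per-bit flip probability per injection interval, and the assumption that TMR copies are re-aligned after every interval are all illustrative.

```python
import random


def inject_upsets(num_regs, width, upset_prob, tmr, rng):
    """Apply one injection interval to a bank of registers.

    Every stored bit flips independently with probability `upset_prob`.
    Returns how many registers would be read back with a wrong value.
    """
    copies = 3 if tmr else 1
    corrupted = 0
    for _ in range(num_regs):
        flips = [0] * copies                 # error mask of each stored copy
        for c in range(copies):
            for bit in range(width):
                if rng.random() < upset_prob:
                    flips[c] ^= 1 << bit
        if tmr:
            # A bit is wrong after voting only if two copies flipped it.
            error = (flips[0] & flips[1]) | (flips[1] & flips[2]) | (flips[0] & flips[2])
        else:
            error = flips[0]
        corrupted += error != 0
    return corrupted


if __name__ == "__main__":
    rng = random.Random(2019)
    for tmr in (False, True):
        bad = inject_upsets(num_regs=31, width=32, upset_prob=0.01, tmr=tmr, rng=rng)
        print("TMR" if tmr else "plain", "corrupted registers:", bad)
```

With an unprotected bank any single flip corrupts a register, which matches the kind of failure reported for T03X, whereas the voted bank needs two coincident flips on the same bit position.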
Conclusions and perspectives

• Ongoing work:
  – Time-redundancy techniques: shadow thread on single pipeline with check point recovery
  – Watchdog timer techniques
  – Mixed techniques
• Hardware testing for all versions
• The final choice of the core version(s) to be included in the final prototype is expected in early September
• The space-qualified PULPino-Klessydra RISC-V platform, prototyped on a 4 cm x 3 cm TE0714 FPGA board, is expected to fly in a sun-synchronous Low-Earth-Orbit PocketQube satellite by the end of 2019.
• Stay tuned to read the #tweets it will send to Earth!
Thank you for your kind attention
contact: [email protected]