European Patent Office
Office européen des brevets

EP 0 508 604 B1

(12) EUROPEAN PATENT SPECIFICATION

(45) Date of publication and mention of the grant of the patent: 17.11.1999 Bulletin 1999/46

(51) Int. Cl.6: G06F 3/06, G06F 13/42, G06F 11/10

(21) Application number: 92302098.6

(22) Date of filing: 11.03.1992

(54) Disk array controller for data storage system
     Speicherplattenanordnungsteuerungsvorrichtung für eine Datenspeicherungsanordnung
     Dispositif de commande d'un réseau de disques pour un système de stockage de données

(84) Designated Contracting States: DE FR GB

(30) Priority: 13.03.1991 US 668660

(43) Date of publication of application: 14.10.1992 Bulletin 1992/42

(73) Proprietors:
     • NCR International, Inc., Dayton, Ohio 45479 (US)
     • HYUNDAI ELECTRONICS AMERICA, Milpitas, California 95035 (US)
     • Symbios, Inc., Fort Collins, Colorado 80525 (US)

(72) Inventors:
     • Jibbe, Mahmoud K., Wichita, KS 67208 (US)
     • McCombs, Craig C., Wichita, KS 67213 (US)
     • Thompson, Kenneth J., Wichita, KS 67226 (US)

(74) Representative: Gill, David Alan et al, W.P. Thompson & Co., Celcon House, 289-293 High Holborn, London WC1V 7HU (GB)

(56) References cited: EP-A-0 294 287, EP-A-0 320 107, EP-A-0 369 707, US-A-4 899 342, US-A-4 914 656

Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Printed by Jouve, 75001 PARIS (FR)

Description

[0001] This invention relates to data storage systems of the kind including host bus means adapted to be connected to a host device, and disk drive means adapted to be connected to a plurality of disk drive devices.

[0002] Recent and continuing increases in computer processing power and speed, in the speed and capacity of primary memory, and in the size and complexity of computer software have resulted in the need for faster operating, larger capacity secondary memory storage devices; magnetic disks forming the most common external or secondary memory storage means utilized in present day computer systems. Unfortunately, the rate of improvement in the performance of large magnetic disks has not kept pace with processor and main memory performance improvements. However, significant secondary memory storage performance and cost improvements may be obtained by the replacement of single large expensive disk drives with a multiplicity of small, inexpensive disk drives interconnected in a parallel array, which to the host appears as a single large fast disk.

[0003] Several disk array design alternatives were presented in an article titled "A Case for Redundant Arrays of Inexpensive Disks (RAID)" by David A. Patterson, Garth Gibson and Randy H. Katz; University of California Report No. UCB/CSD 87/391, December 1987. This article discusses disk arrays and the improvements in performance, reliability, power consumption and scalability that disk arrays provide in comparison to single large magnetic disks.

[0004] Five disk array arrangements, referred to as RAID levels, are described in the article. The first level RAID comprises N disks for storing data and N additional "mirror" disks for storing copies of the information written to the data disks. RAID level 1 write functions require that data be written to two disks, the second "mirror" disk receiving the same information provided to the first disk. When data is read, it can be read from either disk.

[0005] RAID level 3 systems comprise one or more groups of N+1 disks. Within each group, N disks are used to store data, and the additional disk is utilized to store parity information. During RAID level 3 write functions, each block of data is divided into N portions for storage among the N data disks. The corresponding parity information is written to a dedicated parity disk. When data is read, all N data disks must be accessed. The parity disk is used to reconstruct information in the event of a disk failure.

[0006] RAID level 4 systems are also comprised of one or more groups of N+1 disks wherein N disks are used to store data, and the additional disk is utilized to store parity information. RAID level 4 systems differ from RAID level 3 systems in that data to be saved is divided into larger portions, consisting of one or many blocks of data, for storage among the disks. Writes still require access to two disks, i.e., one of the N data disks and the parity disk. In a similar fashion, read operations typically need only access a single one of the N data disks, unless the data to be read exceeds the block length stored on each disk. As with RAID level 3 systems, the parity disk is used to reconstruct information in the event of a disk failure.

[0007] RAID level 5 is similar to RAID level 4 except that parity information, in addition to the data, is distributed across the N+1 disks in each group. Although each group contains N+1 disks, each disk includes some blocks for storing data and some blocks for storing parity information. Where parity information is stored is controlled by an algorithm implemented by the user. As in RAID level 4 systems, RAID level 5 writes require access to at least two disks; however, no longer does every write to a group require access to the same dedicated parity disk, as in RAID level 4 systems. This feature provides the opportunity to perform concurrent write operations.

[0008] In a known disk array system the host operates as the RAID controller and performs the parity generation and checking. Having the host compute parity is expensive in host processing overhead. Furthermore, the known system includes a fixed data path structure interconnecting the plurality of disk drives with the host system. Rearrangement of the disk array system to accommodate different quantities of disk drives or different RAID configurations is not easily accomplished.

[0009] US-A-4,914,656 discloses a data storage system including a host bus means adapted to be connected to a host device, disk drive means adapted to be connected to a plurality of disk drive devices, selectively controllable coupling means connected to the host bus means and to a plurality of buses and which includes connecting means adapted to connect the host bus means to a first group of buses. EP-A-0,369,707 discloses a data storage system likewise including a host bus means for connection to a host device, disk drive means for connection to a plurality of disk drive devices and selectively controllable coupling means connected to the host bus means and the disk drive means.

[0010] EP-A-0,294,287 and EP-A-0,320,107 relate to data storage systems of the type defined within the present application but all of these aforementioned systems are disadvantageously restricted having regard to their adaptability and versatility for controlling data flow between a host and a disk array.

[0011] It is an object of the present invention to provide a data storage system of the kind specified which may be readily adapted to support various disk drive array configurations.

[0012] Therefore, according to the present invention, there is provided a data storage system, including host bus means adapted to be connected to a host device, disk drive means adapted to be connected to a plurality of disk drive devices, selectively controllable coupling means connected to said host bus means and to a plurality of array buses coupled to said disk drive means,

first connecting means adapted to connect said host bus means to a first group of selected array buses characterized in that said first connecting means includes a plurality of register means each associated with a respective array bus and connected to said host bus means for receiving data therefrom, each register means having a respective bus driver connected thereto and also connected to the associated array bus, the bus drivers being selectively controllable to connect said host bus means to said first group of selected array buses, and in that said coupling means includes parity generation means, second connecting means adapted to connect a second group of selected array buses to the input of said parity generation means, and third connecting means adapted to connect the output of said parity generation means to a third group of selected array buses.

[0013] One embodiment of the invention will now be described by way of example, with reference to the accompanying drawings, in which:-

Figure 1 is a block diagram of a disk array controller incorporating the data and parity routing architecture of the present invention;
Figure 2 is a functional block diagram of the SCSI Array Data Path Chip (ADP) shown in Figure 1 which incorporates the data and parity routing architecture of the present invention;
Figure 3 is a block diagram of the DMA FIFO and routing blocks shown in Figure 2, representing a preferred embodiment of the present invention;
Figures 4A through 4F provide a more detailed block diagram of the architecture shown in Figure 3; and
Figures 5 through 18 illustrate some of the various configurations of the circuit of Figure 3. For example, Figure 8 shows the circuit configuration for performing a RAID level 3 write operation.

[0014] Referring now to Figure 1, there is seen a block diagram of a SCSI (small computer system interface) disk array controller incorporating the data and parity routing architecture of the present invention. The controller includes a host SCSI adapter 14 to interface between the host system data bus 12 and SCSI data bus 16. Parity generation and data and parity routing are performed within SCSI Array Data Path Chip (ADP) 10. ADP chip 10 interconnects SCSI data bus 16 with six additional data busses, identified by reference numerals 21 through 26. SCSI bus interface chips 31 through 35 interface between respective SCSI data busses 21 through 25 and corresponding external disk drive data busses 41 through 45. Bus 26 connects ADP chip 10 with a 64 kilobyte static random access memory (SRAM) 36. ADP chip 10, SCSI adapter 14 and SCSI bus interface chips 31 through 35 all operate under the control of a dedicated microprocessor 51.

[0015] All data busses shown, with the exception of bus 16, include nine data lines, one of which is a parity bit line. Bus 16 includes 18 data lines, two of which are bus parity lines. Additional control, acknowledge and message lines, not shown, are also included with the data busses.

[0016] SCSI adapter 14, SCSI bus interface chips 31 through 35, SRAM 36 and microprocessor 51 are commercially available items. For example, SCSI adapter 14 may be a Fast SCSI 2 chip, SCSI bus interface chips 31 through 35 may be NCR 53C96 chips and microprocessor 51 could be a Motorola MC68020, 16 megahertz microprocessor. Also residing on the microprocessor bus are a one megabyte DRAM, a 128 kilobyte EPROM, an eight kilobyte EEPROM, a 68901 multifunction peripheral controller and various registers.

[0017] SCSI data path chip 10 is an application specific integrated circuit device capable of handling all data routing, data multiplexing and demultiplexing, and parity generation and checking aspects of RAID levels 1, 3 and 5. Data multiplexing of non-redundant data among five channels is also accommodated. ADP chip 10 handles the transfer of data between host SCSI adapter 14 and SCSI bus interface chips 31 through 35. The ADP chip also handles the movement of data to and from the 64 kilobyte SRAM during read/modify/write operations of RAID level 5.

[0018] Figure 2 is a functional block diagram of the SCSI Array Data Path Chip (ADP) shown in Figure 1. The ADP chip consists of the following internal functional blocks: control logic block 60 including diagnostic module 62, configuration control and status register module 64, interrupt module 66 and interface module 68; DMA FIFO block 70; and data and parity routing block 100.

[0019] The main function of control block 60 is to configure and to test the data path and parity path described herein. Microprocessor interface module 68 contains basic microprocessor read and write cycle control logic providing communication and data transfer between microprocessor 51 (shown in Figure 1) and control and status registers within the control block. This interface utilizes a multiplexed address and data bus standard wherein microprocessor read and write cycles consist of an address phase followed by a data phase. During the address phase a specific register is selected by the microprocessor. During the following data phase, data is written to or read from the addressed register.

[0020] Configuration control and status register module 64 includes numerous eight-bit registers under the control of microprocessor interface module 68 as outlined above. The contents of the control registers determine the configuration of the data and parity paths. Status registers provide configuration and interrupt information to the microprocessor.

[0021] Diagnostic module 62 includes input and output data registers for the host and each array channel. The input registers are loaded by the processor and provide data to host bus 16, array busses 21 through 25 and SRAM bus 26 to emulate host or array transfers.
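The two-phase register access described above (an address phase that selects a register, followed by a data phase that reads or writes it) can be pictured with a small software model. This is only an illustrative sketch: the class `RegisterBank`, the helper functions and the register count are hypothetical names and values chosen for the example, not part of the specification, which describes hardware and gives no register addresses.

```python
class RegisterBank:
    """Hypothetical model of eight-bit registers behind a multiplexed address/data bus."""

    def __init__(self, count=8):
        self._regs = [0] * count   # register count is an assumption for the sketch
        self._selected = None      # address latched during the address phase

    def address_phase(self, addr):
        """First half of a bus cycle: select one register."""
        if not 0 <= addr < len(self._regs):
            raise ValueError(f"no register at address {addr}")
        self._selected = addr

    def data_phase(self, value=None):
        """Second half of a bus cycle: write `value`, or read when value is None."""
        if self._selected is None:
            raise RuntimeError("data phase without a preceding address phase")
        addr, self._selected = self._selected, None
        if value is None:
            return self._regs[addr]
        self._regs[addr] = value & 0xFF  # registers are eight bits wide
        return None


def write_register(bank, addr, value):
    """One complete write cycle: address phase, then data phase."""
    bank.address_phase(addr)
    bank.data_phase(value)


def read_register(bank, addr):
    """One complete read cycle: address phase, then data phase."""
    bank.address_phase(addr)
    return bank.data_phase()
```

In this model a control byte would be loaded with `write_register(bank, addr, value)` and a status byte fetched with `read_register(bank, addr)`; a data phase attempted without a preceding address phase is rejected, mirroring the fixed ordering of the two phases.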

The output registers, which are loaded with data routed through the various data buses, are read by the microprocessor to verify the proper operation of selected data and parity path configurations.

[0022] Interrupt module 66 contains the control logic necessary to implement masking and grouping of ADP chip and disk controller channel interrupt signals. Any combination of five channel interrupt signals received from SCSI bus interface chips 31 through 35 and three internally generated ADP parity error interrupt signals can be combined together to generate interrupt signals for the microprocessor.

[0023] The function of DMA FIFO block 70 is to hold data received from the host until the array is ready to accept the data and to convert the data from eighteen bit bus 16 to nine bit busses 18. During read operations DMA FIFO block 70 holds the data received from the disk array until the host system is ready to accept the data and converts the data from nine bit busses 18 to eighteen bit bus 16.

[0024] Data and parity routing block 100 contains the steering logic for configuring the data and parity paths between the host, the disk array and SRAM 36 in response to control bytes placed into the control registers contained in module 64. Block 100 also includes logic for generating and checking parity, verifying data, broadcasting data, monitoring data transfers, and reconstructing data lost due to a single disk drive failure.

[0025] Figure 3 is a block diagram of DMA FIFO block 70 and routing block 100 shown in Figure 2, representing a preferred embodiment of the present invention. The data routing architecture includes five data channels, each channel providing a data path between SCSI adapter 14 and a corresponding one of SCSI bus interface chips 31 through 35. The first data channel includes a double register 101 connected between nine-bit host busses 16U and 16L and array bus 21. Busses 16U and 16L transfer sixteen bits of data and two bits of parity information between external SCSI adapter 14 and double register 101. Array bus 21 connects double register 101 with SCSI bus interface chip 31. Additional taps to array bus 21 will be discussed below. Data channels two through five are similarly constructed, connecting host busses 16U and 16L through respective double registers 102 through 105 and array busses 22 through 25 with SCSI bus interface chips 32 through 35.

[0026] Each one of double registers 101 through 105 includes an internal two-to-one multiplexer for selecting as input either one of host busses 16U or 16L when data is to be transferred from busses 16U and 16L to the array busses. Double registers 101 and 102 also each include an internal two-to-one multiplexer for selecting as input either the associated array bus or a parity bus 130 when data is to be transferred to host busses 16U and 16L.

[0027] An exclusive-OR circuit 110 is employed to generate parity information from data placed on array busses 21 through 25. A tap is provided between each one of array busses 21 through 25 and a first input of a respective two-to-one multiplexer, the output of which forms an input to exclusive-OR circuit 110. The second inputs of each one of the multiplexers, identified by reference numerals 111 through 115, are connected to ground.

[0028] The output of exclusive-OR circuit 110 is provided via nine-bit bus 130 to double registers 101 and 102 and tristate buffers 121, 122, 123, 124, 125 and 126. Each one of tristate buffers 121 through 126, when enabled, provides the parity information generated by exclusive-OR circuit 110 to a respective one of busses 21 through 26. Bus 26 connects the output of tristate buffer 126 with SRAM 36 and one input of a two-to-one multiplexer 116. The output of multiplexer 116 is input to exclusive-OR circuit 110 to provide the architecture with parity checking and data reconstruction capabilities, to be discussed in greater detail below.

[0029] Also shown in Figure 3 are six transceivers, identified by reference numerals 141 through 146, for connecting respective busses 21 through 26 with processor bus 53 for transferring diagnostic information between the array channels and the processor. An additional transceiver 150 connects host busses 16U and 16L with processor bus 53. Transceivers 141 through 146 and 150 allow diagnostic operations by the controller processor.

[0030] Figures 4A through 4F provide additional detail concerning the construction and operation of double registers 101 through 105, transceivers 141 through 146 and transceiver 150, and control of data to and from SCSI bus interface chips 31 through 35 and SRAM 36.

[0031] Double register 101 is seen to include a multiplexer 203 for selecting data from either of busses 16U or 16L for placement into registers 205 and 207, a latch 209 and a buffer/driver 211 for placing the register contents onto array bus 21. Double register 101 further includes a multiplexer 213 for selecting data from either of busses 21 or 130 for placement into registers 215 and 217 and a buffer/driver 219 for placing the content of registers 215 and 217 onto bus 16L. Double register 102 is identical in construction to double register 101 for transferring data from either of busses 16U or 16L to bus 22, and from either of busses 22 or 130 to bus 16U. Double registers 103 and 104 are similar to double registers 101 and 102, respectively, except that neither includes a multiplexer corresponding to multiplexer 213 or a connection to bus 130. Double register 105 includes only those elements described above which are necessary to transfer information from busses 16U and 16L to bus 25.

[0032] Each one of transceivers 141 through 146 includes a register 223 connected to receive and store data residing on the associated array or SRAM bus, a buffer/driver 225 connected to provide the contents of register 223 to processor bus 53, a register 227 connected to receive and store data from processor bus 53, and a buffer/driver 229 connected to provide the contents of

register 227 to the associated array or SRAM bus. Transceiver 150 includes registers and drivers for storing and transferring data between processor bus 53 and each one of host busses 16U and 16L.

[0033] Array busses 21 through 25 each include a pair of parallel-connected buffer/drivers 233 and 235 for controlling the direction of data flow between the array bus and its associated SCSI bus interface chip. SRAM bus 26 is similarly connected with SRAM 36.

[0034] Control lines, not shown, are provided between control registers residing in configuration control and status module 64, discussed above, and double registers 101 through 105, multiplexers 111 through 116, tri-state buffers 121 through 126, and transceivers 141 through 146 and 150. Five data path control registers are responsible for configuration of the data path.

[0035] Data Path Control Register 1 controls data flow through double registers 101 through 105 during array write operations. Each control register bit controls the operation of one of registers 101 through 105 as identified below.

Data Path Control Register 1
    Bit 0    enables register 101
    Bit 1    enables register 102
    Bit 2    enables register 103
    Bit 3    enables register 104
    Bit 4    enables register 105
    Bit 5    not used
    Bit 6    not used
    Bit 7    not used

[0036] When any one of control bits 0 through 4 is set high, data obtained from the host is transferred onto the internal array bus corresponding to the enabled register. The data obtained from the host is thereafter available to the SCSI bus interface chip connected to the array bus, the parity generating logic and to the diagnostic circuitry.

[0037] Data Path Control Register 2 controls the operation of multiplexers 111 through 116, thereby governing the input to exclusive-OR circuit 110. Each register bit controls the operation of one of multiplexers 111 through 116 as identified below.

Data Path Control Register 2
    Bit 0    selects input to MUX 111
    Bit 1    selects input to MUX 112
    Bit 2    selects input to MUX 113
    Bit 3    selects input to MUX 114
    Bit 4    selects input to MUX 115
    Bit 5    selects input to MUX 116
    Bit 6    not used
    Bit 7    not used

[0038] In the table above, a logic one stored in one or more of register bits 0 through 5 sets the corresponding multiplexer to provide data from its associated array bus to exclusive-OR circuit 110. A logic zero stored in one or more of register bits 0 through 5 selects the corresponding multiplexer input connected to ground.

[0039] Data path control register 3 enables and disables the flow of data to or from SCSI bus interface chips 31 through 35 and SRAM 36. The functions of the individual register bits are described below.

Data Path Control Register 3
    Bit 0    enable SCSI DMA I/O on channel 1
    Bit 1    enable SCSI DMA I/O on channel 2
    Bit 2    enable SCSI DMA I/O on channel 3
    Bit 3    enable SCSI DMA I/O on channel 4
    Bit 4    enable SCSI DMA I/O on channel 5
    Bit 5    enable data I/O with SRAM
    Bit 6    not used
    Bit 7    not used

[0040] A logic one stored in any one of bits 0 through 5 enables the corresponding channel. Data flow direction to or from SCSI bus interface chips 31 through 35 and SRAM 36 is determined by data path control register 5.

[0041] Data path control register 4, through associated chip hardware, enables specific RAID operations as described below.

Data Path Control Register 4
    Bit 0    array data transfer direction
    Bit 1    RAID level 1 enable
    Bit 2    RAID level 4 or 5 enable
    Bit 3    RAID level 3 enable
    Bit 4    drive configuration 2+1 enable
    Bit 5    drive configuration 4+1 enable
    Bit 6    array parity checking enable
    Bit 7    not used

[0042] The direction of data flow through the register banks 101 through 105 is determined from the setting of bit 0. This bit is set high for host to array write operations and set low for array to host read operations. Bits 1, 2 and 3 enable the specific RAID operations when set high. Bits 4 and 5 enable the specific RAID level 3 operations when set high. Control bit 6 is set high to enable parity checking.

[0043] Data path control register 5 controls the data flow direction to or from SCSI bus interface chips 31 through 35 and SRAM 36.

Data Path Control Register 5
    Bit 0    array direction channel 1

Data Path Control Register 5 (continued)
    Bit 1    array direction channel 2
    Bit 2    array direction channel 3
    Bit 3    array direction channel 4
    Bit 4    array direction channel 5
    Bit 5    external SRAM direction
    Bit 6    not used
    Bit 7    not used

[0044] A logic one stored in any one of register bits 0 through 5 specifies that data is to be transferred from the array channel bus to its connected SCSI bus interface chip or SRAM. A logic 0 sets the data transfer direction from the SCSI bus interface chip or SRAM to the corresponding array channel bus. The control bits of register 3 must be set to enable SCSI bus interface chips 31 through 35 or SRAM 36 to perform a data transfer operation.

[0045] The five data path control registers discussed above enable or disable double registers 101 through 105, multiplexers 111 through 116, tri-state buffers 121 through 126, and transceivers 141 through 146 and 150 to configure the architecture to perform the various RAID levels 1, 3, 4 and 5 read and write operations which are described below with reference to Figures 5 through 14.

[0046] Figure 5 shows the data path architecture configured to perform RAID level 1 write operations on channels 1 and 2. The contents of data path control registers 1 through 5, expressed in hexadecimal notation, are: control register 1 - 03hex, control register 2 - 00hex, control register 3 - 03hex, control register 4 - 03hex, and control register 5 - 03hex. Double registers 101 and 102 are thereby enabled to provide data received from the host via busses 16U and 16L to busses 21 and 22. Data is transferred first from bus 16L to busses 21 and 22, and then from bus 16U to busses 21 and 22. SCSI bus interface chips 31 and 32 are enabled to transfer data from busses 21 and 22 to their associated array disks, each disk receiving the same data. Active busses (busses 16U, 16L, 21 and 22) are drawn with greater line weight in Figure 5. Arrows are provided next to the active busses to indicate the direction of data flow on the busses.

[0047] The RAID level 1 read configuration is shown in Figure 6. As both channel 1 and channel 2 array disks store the same information only one of the disks need be accessed for a RAID level 1 read. To read data from the channel 1 disk, data path control register contents are set as follows: control register 1 - 00hex, control register 2 - 01hex, control register 3 - 01hex, control register 4 - 02hex, and control register 5 - 00hex. SCSI bus interface chip 31 is enabled to provide data from the channel 1 disk to bus 21. Multiplexer 111 is enabled to provide the data on bus 21 through parity generation circuit 110 to bus 130. Since multiplexers 112 through 116 are disabled the data provided to multiplexer 111 passes through the parity generator unaltered. Double registers 101 and 102 are enabled to provide data received from bus 130 to busses 16U and 16L, respectively. Bus 16U is provided with upper byte information while bus 16L transmits lower byte information.

[0048] Figure 7 shows the data path architecture configured to reconstruct the RAID level 1 channel 1 disk from the channel 2 disk. Data path control register contents are set as follows: control register 1 - 00hex, control register 2 - 02hex, control register 3 - 03hex, control register 4 - 03hex, and control register 5 - 01hex. SCSI bus interface chip 32 is enabled to provide data to bus 22, multiplexer 112 is enabled to provide bus 22 data to bus 130, tristate buffer 121 is enabled to place bus 130 data onto bus 21, and SCSI bus interface chip 31 is enabled to write data received from bus 21 onto the channel 1 disk. The architecture could similarly be configured to reconstruct the channel 2 disk from the channel 1 disk.

[0049] Figure 8 shows the data path architecture configured to permit RAID level 3, 4+1 (four data disks and one parity disk) write operations. The contents of the data path control registers are: control register 1 - 0Fhex, control register 2 - 0Fhex, control register 3 - 1Fhex, control register 4 - 29hex, and control register 5 - 1Fhex. Double registers 101 through 104 are enabled to provide data from the host to busses 21 through 24, respectively. Multiplexers 111 through 114 are enabled to provide data from busses 21 through 24 to parity generator 110, the output of which is provided via bus 130, enabled tri-state buffer 125, and bus 25 to bus interface chip 35. Bus interface chips 31 through 35 are enabled to provide data from busses 21 through 25 to their corresponding array disks.

[0050] The RAID level 3 read configuration is illustrated in Figure 9 where SCSI bus interface chips 31 through 34 are enabled to provide data from their corresponding array disks to double registers 101 through 104 via busses 21 through 25. Double registers 101 through 104 are enabled to provide the data received from busses 21 through 24 to busses 16U and 16L. The data path control registers are set as follows to configure the data paths: control register 1 - 00hex, control register 2 - 1Fhex, control register 3 - 1Fhex, control register 4 - 68hex, and control register 5 - 00hex. Parity checking is performed by exclusive-ORing the parity information stored on the channel 5 disk with the data read from the channel 1 through 4 disks. The output of parity generator 110 should be 00hex if the stored data is retrieved from the array without error.

[0051] Figure 10 shows the data path architecture configured to reconstruct the RAID level 3 channel 2 disk from the remaining data and parity disks. Data path control register contents are set as follows: control register 1 - 00hex, control register 2 - 1Dhex, control register 3 - 1Fhex, control register 4 - 29hex, and control register 5 - 02hex. SCSI bus interface chips 31, 33, 34 and 35 and multiplexers 111, 113, 114 and 115 are enabled to provide data and parity information from the channel 1, 3, 4 and 5 array disks to parity generator 110. Parity generator 110 combines the received data and parity information to regenerate channel 2 data. The output of the parity generator is provided to the channel 2 disk through bus 130, enabled tristate device 122, bus 22 and enabled bus interface chip 32. Alternatively, the architecture could be configured to provide the reconstructed data directly to the host during read operations.

[0052] RAID level 5 write operations involve both a read and a write procedure. Figures 11 and 12 illustrate a RAID level 5 write involving the channel 1 and 2 array disks, wherein data is to be written to the channel 2 array disk and parity information is to be updated on the channel 1 disk. The data paths are first configured as shown in Figure 11 to read information from the channel 1 and 2 disks. This information is provided through multiplexers 111 and 112 to parity generator 110 and the result stored in external SRAM 36 via bus 130, enabled tri-state buffer 126 and bus 26. The data path control registers contain the following codes: control register 1 - 00hex, control register 2 - 03hex, control register 3 - 23hex, control register 4 - 04hex, and control register 5 - 20hex.

[0053] New data and parity information is then written to the channel 1 and 2 array disks as shown in Figure 12. Double register 102 is enabled to receive the new data from the host and provide the data to the channel 2 disk via bus 22 and enabled SCSI bus interface chip 32. The new data is also provided to parity generator 110 through multiplexer 112. The information previously written into SRAM 36 is also provided to parity generator 110 through multiplexer 116. The output of the parity generator is the new parity which is provided through enabled tri-state buffer 121, bus 21 and enabled SCSI bus interface chip 31 to the channel 1 disk. The contents of the data path control registers for the second portion of the RAID level 5 write operation are as follows: control register 1 - 02hex, control register 2 - 22hex, control register 3 - 23hex, control register 4 - 05hex, and control register 5 - 03hex.

[0054] Figure 13 illustrates a RAID level 5 read operation. In this example the information to be read resides on the channel 5 array disk. The data path control registers are set to enable SCSI bus interface chip 35, multiplexer 115, and double registers 101 and 102 to transfer data from the channel 5 disk to the host. Control register contents are: control register 1 - 00hex, control register 2 - 10hex, control register 3 - 10hex, control register 4 - 04hex, and control register 5 - 00hex.

[0055] RAID level 5 disk reconstruction is similar to RAID level 3 disk reconstruction discussed above in connection with Figure 10. Figure 14 depicts a configuration for reconstructing data on channel 2. The data path control register contents are: control register 1 - 00hex, control register 2 - 1Dhex, control register 3 - 1Fhex, control register 4 - 25hex, and control register 5 - 02hex.

[0056] The data path architecture may also be configured to perform data verification, data broadcasting, and diagnostic operations. Additional control registers are provided to control transceivers 141 through 146 and transceiver 150 to configure the architecture to perform these additional operations. Figures 15 through 18 illustrate a few of the data verification, data broadcasting, and diagnostic operations which can be accomplished.

[0057] Figure 15 shows the data path architecture configured to verify that information stored on channel 1 and 2 array disks in accordance with RAID level 1 is correct. Channels 1 and 2 are enabled to receive data from SCSI bus interface chips 31 and 32. Multiplexers 111 and 112 are enabled to provide this data to the exclusive-OR circuit 110. The output of circuit 110, which should be 00hex if the channel 1 and 2 disks contain duplicate information, is provided through bus 130, enabled tri-state buffer 125, bus 25, enabled transceiver 145 and bus 53 to the processor for evaluation.

[0058] Figure 16 illustrates the data path architecture configured to verify that information stored in the array in accordance with RAID levels 3 or 5 is correct. Channels 1 through 5 are enabled to receive data from SCSI bus interface chips 31 through 35. Multiplexers 111 through 115 are enabled to provide this data to the exclusive-OR circuit 110. The output of circuit 110, which should be 00hex if the data and parity information are in agreement, is provided through bus 130, enabled tri-state buffer 126, bus 26 and enabled transceiver 146 to the processor for evaluation.

[0059] Figures 17 and 18 show the data path architecture configured for data broadcasting wherein the same data is written to each of the disks in the array. Data broadcasting may be performed to initialize all the array disks with a known data pattern. In Figure 17 data is received by double register 101 from the processor via bus 53 and enabled transceiver 150. In Figure 18 data is provided to double register 101 by the host system. In both cases double register 101 is enabled to provide the received data to SCSI bus interface chip 31 via bus 21. Multiplexer 111 is enabled to provide bus 21 data to exclusive-OR circuit 110. Multiplexers 112 through 116 remain disabled so that the output of circuit 110 is equivalent to the data received from bus 21. Tristate devices 122 through 125 are enabled to provide the output of circuit 110 to SCSI bus interface chips 32 through 35.

[0060] It can thus be seen that there has been provided by the present invention a versatile data path and parity path architecture for controlling data flow between a host system and disk array. Those skilled in the art will appreciate that the invention is not limited to the details of the foregoing embodiments. For example, the architecture need not be limited to five channels. In addition, configurations other than those described above are possible. Processors, processor interfaces, and bus in-

7 13 EP 0 508 604 B1 14 terfaces other than the types shown in the Figures and characterized in that said third connecting means discussed above can be employed. For example, the includes respective bus drivers (121 -1 25) associat- datapath architecture can be constructed with ESDI, IPI ed with said array buses (21-25), each bus driver or EISA devices rather than SCSI devices. (121-1 25) having an input connected to the output 5 of said exclusive-OR circuit (110), an output con- nected to the associated array bus (21-25) and a Claims control signal input.

1. A data storage system, including host bus means 6. A data storage system according to any one of (16) adapted to be connected to a host device, disk 10 Claims 2 to 5, characterized by temporary storage drive means (31 -35, 41 -45) adapted to be connect- means (36), fifth connecting means (126) adapted ed to a plurality of disk drive devices, selectively to selectively connect the output of said exclusive- controllable coupling means (1 0) connected to said OR circuit (110) with said temporary storage means host bus means (16) and to a plurality of array buses (36), and sixth connecting means (116) adapted to (21-25) coupled to said disk drive means (31-35, is selectively connect said temporary storage means 41-45), first connecting means (205, 207, 211) (36) to an input of said exclusive-OR circuit (110). adapted to connect said host bus means (16) to a first group of selected array buses (21 -25) charac- 7. A data storage system according to Claim 6, char- terized in that said first connecting means includes acterized by a diagnostic bus (53), seventh con- a plurality of register means (205, 207) each asso- 20 necting means (141-145) adapted to selectively ciated with a respective array bus (21 -25) and con- connect said diagnostic bus (53) with a fourth group nected to said host bus means (16) for receiving da- of selected array buses (21-25), and eighth con- ta therefrom, each register means (205, 207) hav- necting means (1 50) adapted to selectively connect ing a respective bus driver (211) connected thereto said diagnostic bus (53) to said host bus means and also connected to the associated array bus 25 (16). (21 -25), the bus drivers (211) being selectively con- trollable to connect said host bus means (1 6) to said 8. 
A data storage system according to any one of the first group of selected array buses (21-25), and in preceding claims, characterized by a plurality of that said coupling means (10) includes parity gen- controllable interface devices (31035) adapted to eration means (110), second connecting means 30 connect said array buses (21 -25) to respective disk (111-115) adapted to connect a second group of se- drive buses (41-45). lected array buses (21 -25) to the input of said parity generation means (110), and third connecting means (121 -1 25) adapted to connect the output of Patentanspriiche said parity generation means (110) to a third group 35 of selected array buses (21-25). 1. Datenspeichersystem, umfassend einen Host-Bus (1 6), der so ausgestaltet ist, dal3 er an ein Host-Ge- 2. A data storage system according to Claim 1 , char- rat angeschlossen werden kann, Laufwerke (31 -35, acterized in that said parity generation means in- 41-45), die an eine Mehrzahl von Laufwerkgeraten cludes an exclusive-OR circuit (110). 40 angeschlossen werden konnen, selektiv steuerba- re Kopplungsmittel (10), die mit dem genannten 3. A data storage system according to Claim 2, char- Host-Bus (16) und mit einer Mehrzahl von Array- acterized in that said coupling means (10) includes Bussen (21, 25) verbunden sind, die mit den ge- fourth connecting means (213-219) adapted to se- nannten Laufwerken (31-35, 41-45) verbunden lectively connect the output of said exclusive-OR 45 sind, ein erstes Verbindungsmittel (205, 207, 211), circuit (110) to said host bus means (16). das so ausgestaltet ist, dal3 es den genannten Host- Bus (16) mit einer ersten Gruppe von ausgewahlten 4. 
A data storage means according to Claim 2 or 3, Array-Bussen (21-25) verbindet, dadurch gekenn- characterized in that said second connecting zeichnet, dal3 das genannte erste Verbindungsmit- means includes respective multiplexers (111-115) 50 tel eine Mehrzahl von Registern (205, 207) auf- associated with said array buses (21 -25), each said weist, die jeweils mit einem jeweiligen Array-Bus multiplexer (111-115) having a first input connected (21 -25) assoziiert und mit dem genannten Host-Bus to its associated array bus (21-25), a second input (1 6) verbunden sind, urn Daten von diesem zu emp- connected to a reference voltage source, an output fangen, wobei jedes Register (205, 207) einen je- connected to an input of said exclusive-OR circuit ss weiligen Bustreiber (211) hat, der mit diesem sowie (110) and a control signal input. mit dem assoziierten Array-Bus (21-25) verbunden ist, wobei die Bustreiber (211) selektiv steuerbar 5. A data storage system according to Claim 2, 3 or 4, sind, urn den genannten Host-Bus (16) mit der ge-

8 15 EP 0 508 604 B1 16

nannten ersten Gruppe ausgewahlter Array-Busse temporare Speichermittel (36) selektiv mit einem (21-25) zu verbinden, und dadurch, da(3 das ge- Eingang der genannten Exklusiv-ODER-Schaltung nannte Kopplungsmittel (10) ein Paritatserzeu- (110) verbindet. gungsmittel (110), zweite Verbindungsmittel (111-115), die so ausgestaltet sind, dal3 sie eine 5 7. Datenspeichersystem nach Anspruch 6, gekenn- zweite Gruppe von ausgewahlten Array-Bussen zeichnet durch einen Diagnosebus (53), siebte Ver- (21-25) mit dem Eingang des genannten Paritats- bindungsmittel (141 -1 45), die so ausgestaltet sind, erzeugungsmittels (110) verbinden, und dritte Ver- dal3 sie den genannten Diagnosebus (53) selektiv bindungsmittel (121 -1 25) umfaBt, die so ausgestal- mit einer vierten Gruppe von ausgewahlten Array- tet sind, dal3 sie den Ausgang des genannten Pari- 10 Bussen (21-25) verbinden, und ein achtes Verbin- tatserzeugungsmittels (110) mit einer dritten Grup- dungsmittel (150), das so ausgestaltet ist, dal3 es pe von ausgewahlten Array-Bussen (21 -25) verbin- den genannten Diagnosebus (53) selektiv mit dem den. genannten Host-Bus (16) verbindet.

2. Datenspeichersystem nach Anspruch 1, dadurch is 8. Datenspeichersystem nach einem der vorherigen gekennzeichnet, dal3 das genannte Paritatserzeu- Anspruche, gekennzeichnet durch eine Mehrzahl gungsmittel eine Exklusiv-ODER-Schaltung (110) von steuerbaren Schnittstellengeraten (31035), die beinhaltet. so ausgestaltet sind, dal3 sie die genannten Array- Busse (21-25) mit jeweiligen Laufwerksbussen 3. Datenspeichersystem nach Anspruch 2, dadurch 20 (41-45) verbinden. gekennzeichnet, dal3 das genannte Kopplungsmit- tel (10) vierte Verbindungsmittel (213-219) auf- weist, die so ausgestaltet sind, dal3 sie den Aus- Revendications gang der genannten Exklusiv-ODER-Schaltung (110) mit dem genannten Host-Bus (16) verbinden. 25 1. Systeme de stockage de donnees, incluant un moyen de bus hote (16) adapte pour etre connecte 4. Datenspeichersystem nach Anspruch 2 oder 3, da- a un dispositif hote, des moyens d'unites de disques durch gekennzeichnet, dal3 die genannten zweiten (31-35, 41-45) adaptes pour etre connectes a une Verbindungsmittel jeweilige Multiplexer (111-115) pluralite de dispositifs d'unites de disques, un aufweisen, die mit den genannten Array-Bussen 30 moyen de couplage selectivement controlable (10) (21-25) assoziiert sind, wobei jeder genannte Mul- connecte audit moyen de bus hote (1 6) et a une plu- tiplexer (111-115) einen ersten Eingang, der mit sei- ralite de bus de grappes (21-25) couples auxdits nem assoziierten Array-Bus (21-25) verbunden ist, moyens d'unites de disques (31 -35, 41 -45), un pre- einen zweiten Eingang, der mit einer Referenz- mier moyen de connexion (205, 207, 211) adapte spannungsquelle verbunden ist, einen Ausgang, 35 pour connecter ledit moyen de bus hote (16) a un der mit einem Eingang der genannten Exklusiv- premier groupe de bus de grappes selectionnes ODER-Schaltung (110) verbunden ist, und einen (21-25), caracterise en ce que ledit premier moyen Steuersignaleingang aufweist. 
de connexion inclut une pluralite de moyens de re- gistre (205, 207) associes chacun a un bus de grap- 5. Datenspeichersystem nach Anspruch 2, 3 oder 4, 40 pe respectif (21-25) et connectes audit moyen de dadurch gekennzeichnet, dal3 die genannten drit- bus hote (16) pour recevoir des donnees de celui- ten Verbindungsmittel jeweilige Bustreiber ci, chaque moyen de registre (205, 207) ayant un (121 -1 25) aufweisen, die mit den genannten Array- pilote de bus respectif (211) connecte a ceux-ci et Bussen (21 -25) assoziiert sind, wobei jeder Bustrei- connecte egalement au bus de grappe associe ber (121-1 25) einen Eingang, der mit dem Ausgang 45 (21 -25), les pilotes de bus (211) etant selectivement der genannten Exklusiv-ODER-Schaltung (110) controlable pour connecter ledit moyen de bus hote verbunden ist, einen Ausgang, der mit dem assozi- (1 6) audit premier groupe de bus de grappes selec- ierten Array-Bus (21-25) verbunden ist, und einen tionnes (21-25), et en ce que ledit moyen de cou- Steuersignaleingang aufweist. plage (10) inclut un moyen de generation de parite so (110), un deuxieme moyen de connexion (111-115) 6. Datenspeichersystem nach einem der Anspruche 2 adapte pour connecter un deuxieme groupe de bus bis 5, gekennzeichnet durch ein temporares Spei- de grappes selectionnes (21-25) a I'entree dudit chermittel (36), ein f unftes Verbindungsmittel (1 26), moyen de generation de parite (110), et un troisie- das so ausgestaltet ist, dal3 es den Ausgang der me moyen de connexion (121-125) adapte pour genannten Exklusiv-ODER-Schaltung (110) selek- 55 connecter la sortie dudit moyen de generation de tiv mit dem genannten temporaren Speichermittel parite (110) a un troisieme groupe de bus de grap- (36) verbindet, und ein sechstes Verbindungsmittel pes selectionnes (21-25). (116), das so ausgestaltet ist, da!3 es das genannte

9 17 EP 0 508 604 B1

2. Systeme de stockage de donnees selon la reven- respectifs (41-45). dication 1 , caracterise en ce que ledit moyen de ge- neration de parite inclut un circuit OU exclusif (110).

3. Systeme de stockage de donnees selon la reven- s dication 2, caracterise en ce que ledit moyen de couplage (10) inclut un quatrieme moyen de con- nexion (213-219) adapte pour connecter selective- ment la sortie dudit circuit OU exclusif (110) audit moyen de bus hote (16). 10

4. Systeme de stockage de donnees selon la reven- dication 2 ou 3, caracterise en ce que ledit deuxie- me moyen de connexion inclut des multiplexeurs respectifs (111-115) associes auxdits bus de grap- 15 pes (21-25), chaque dit multiplexeur (111-115) ayant une premiere entree connectee a son bus de grappe associe (21-25), une deuxieme entree con- nectee a une source de tension de reference, une sortie connectee a une entree dudit circuit OU ex- 20 clusif (110) et une entree de signal de controle.

5. Systeme de stockage de donnees selon la reven- dication 2, 3 ou 4, caracterise en ce que ledit troi- sieme moyen de connexion inclut des pilotes de bus 25 respectifs (121-125) associes auxdits bus de grap- pes (21-25), chaque pilote de bus (121-125) ayant une entree connectee a la sortie dudit circuit OU exclusif (1 1 0), une sortie connectee au bus de grap- pes associe (21 -25) et une entree de signal de con- 30 trole.

6. Systeme de stockage de donnees selon I'une quel- conque des revendications 2 a 5, caracterise par un moyen de stockage provisoire (36), un cinquieme 35 moyen de connexion (126) adapte pour connecter selectivement la sortie dudit circuit OU exclusif (110) audit moyen de stockage provisoire (36), et un sixieme moyen de connexion (116) adapte pour connecter selectivement ledit moyen de stockage 40 provisoire (36) a une entree dudit circuit OU exclusif (110).

7. Systeme de stockage de donnees selon la reven- dication 6, caracterise par un bus de diagnostic 45 (53), un septieme moyen de connexion (141-145) adapte pour connecter selectivement ledit bus de diagnostic (53) a un quatrieme groupe de bus de grappes selectionnes (21-25), et un huitieme moyen de connexion (150) adapte pour connecter so selectivement ledit bus de diagnostic (53) audit moyen de bus hote (16).

8. Systeme de stockage de donnees selon I'une quel- conque des revendications precedentes, caracteri- 55 se par une pluralite de dispositifs d'interface contro- lables (31035) adaptes pour connecter lesdits bus de grappes (21-25) a des bus d'unites de disques
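The exclusive-OR operations described in paragraphs [0052] to [0058] can be illustrated in software. The following is a minimal sketch only, assuming byte-wise parity over equal-length blocks; the specification describes a hardware data path, and the function names (xor_blocks, raid5_write_parity, reconstruct, verify) are hypothetical names introduced here for illustration, not terms from the specification.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise exclusive-OR of equal-length blocks (the role played by circuit 110)."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def raid5_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Two-step parity update modelled on the Figure 11 / Figure 12 sequence."""
    # Step 1: XOR the old data with the old parity and hold the result
    # (in the hardware, this intermediate value is stored in SRAM 36).
    intermediate = xor_blocks(old_data, old_parity)
    # Step 2: XOR the new data with the stored intermediate to get the new parity.
    return xor_blocks(new_data, intermediate)


def reconstruct(surviving_blocks: list[bytes]) -> bytes:
    """Regenerate a missing block from the surviving data and parity blocks."""
    return xor_blocks(*surviving_blocks)


def verify(blocks: list[bytes]) -> bool:
    """True when the XOR of all data and parity blocks is all zeros (00hex)."""
    return not any(xor_blocks(*blocks))


# Hypothetical stripe: parity on channel 1, data on channels 2 to 5.
data = [bytes([0x12, 0x34]), bytes([0xAB, 0xCD]), bytes([0x00, 0xFF]), bytes([0x55, 0xAA])]
parity = xor_blocks(*data)

# Write new data to channel 2 and update the parity on channel 1.
new_channel2 = bytes([0xDE, 0xAD])
new_parity = raid5_write_parity(data[0], parity, new_channel2)
updated = [new_channel2] + data[1:]
```

With these definitions, verify([new_parity] + updated) holds after the write, and reconstruct applied to the parity block plus any three of the four data blocks returns the fourth. The read-modify-write form touches only the target data disk and the parity disk, which is why the description reads channels 1 and 2 only.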
