US005465375A
United States Patent [19]          [11] Patent Number: 5,465,375
Thepaut et al.                     [45] Date of Patent: Nov. 7, 1995

[54] MULTIPROCESSOR SYSTEM WITH CASCADED MODULES COMBINING PROCESSORS THROUGH A PROGRAMMABLE LOGIC CELL ARRAY

[75] Inventors: André Thepaut; Gerald Ouvradou, both of Plouzane, France

[73] Assignee: France Telecom, Paris, France

[21] Appl. No.: 4,582

[22] Filed: Jan. 14, 1993

[30] Foreign Application Priority Data: Jan. 14, 1992 [FR] France 92-00312

[51] Int. Cl.: G06F 15/16

[58] Field of Search: 395/200, 325, 800; 370/53, 85.9; 364/137

[56] References Cited

U.S. PATENT DOCUMENTS
4,200,930   4/1980   Rawlings et al.   395/200
4,443,850   4/1984   Fini              395/275
4,663,706   5/1987   Allen et al.      355/200
4,720,780   1/1988   Dolecek           395/800
4,816,993   3/1989   Takahashi et al.  395/250
5,086,498   2/1992   Tanaka et al.     395/200
5,165,023   11/1992  Gifford           395/325
5,291,611   3/1994   Davis et al.      395/800

FOREIGN PATENT DOCUMENTS
433142   12/1990   European Pat. Off.

OTHER PUBLICATIONS
S. Y. Kung, "Parallel Architectures for Artificial Neural Nets", IEEE, 1988, pp. 163-174.
S. Y. Kung et al., "Parallel Architectures for Artificial Neural Nets", IEEE International Conference on Neural Networks, San Diego, Calif., Jul. 24-27, 1988, 8 pages.

Primary Examiner: Krisna Lim
Attorney, Agent, or Firm: Jacobson, Price, Holman & Stern

[57] ABSTRACT

In a multiprocessor data processing system, modules are cascaded by means of intermodule buses. Each module comprises a data processing unit, a first memory, a logic cell array programmable into four input/output interfaces, a second memory and a specialized processing unit such as a digital signal processor (DSP). A first interface, the first memory and the data processing unit are interconnected by a module bus. A fourth interface, the second memory and the specialized processing unit are interconnected by another module bus. A feedback bus connects the second and third interfaces in the last and first modules for constituting a ring. Such a system is particularly intended for image recognition, such as digitalized handwritten digits for postal distribution.

2 Claims, 7 Drawing Sheets

[Cover drawing: block diagram of the system — host computer, interprocessor communication network RC, and cascaded modules each comprising a processor 20_i, a memory 21_i, a logic cell array 22_i (interfaces 221_i to 224_i, central circuit 220_i), a second memory 23_i, a digital signal processor 24_i, module buses 25_i and 26_i, intermodular buses and the feedback bus BR.]

------U.S. Patent Nov. 7, 1995 Sheet 1 of 7 5,465,375

FIG. 1 (PRIOR ART)


FIG. 3 (PRIOR ART)

[FIG. 3 labels: communication network, module bus BM_i, memory ME_i, processing/switching, intermodular bus BIM_{i-1}.]

FIG. 2 (PRIOR ART)
[labels: neurons; hidden layers 2 to (N-1)]




FIG. 8

FIG. 8 algorithms (one table per unit):

ALGORITHM 22_i (i>1):
  RECEIVE V_1 TO V_{i-1}
  TRANSMIT V_1 TO V_i TO 22_{i-1}
  ALERT 20_i

ALGORITHM 24_i (i>1):
  COMPUTE V = Σ_j W_ij · e_j
  WRITE V IN 224_i
  WARN 20_1

ALGORITHM 22_1:
  RECEIVE V_1 TO V_I
  TRANSMIT V_1 TO V_I TO 23_1
  ALERT 24_1

ALGORITHM 20_i (i>1):
  DIRECT MEMORY ACCESS TO 23_i

ALGORITHM 24_1:
  COMPUTE S_j = f_sig(V_j)
  ALERT 20_1

ALGORITHM 20_1:
  ACKNOWLEDGE
  WAIT FOR WARNINGS FROM ALL THE 24_i
  AUTHORIZE DIRECT MEMORY ACCESS TO 20_i
  IF RECOGNITION NOT ENDED THEN ALGORITHM 1, STEP (1)
  ELSE MEMORIZE RESULT: READ MEMORY 23_1; TRANSMIT RESULT TO 1
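The control flow of the FIG. 8 tables can be sketched in software. The following Python sketch is our own illustration, not the patented hardware: the module count, layer sizes, weight values and function names are assumptions chosen for the 16×16-pixel, 40-neuron configuration described later in the specification. It simulates step 1 (each DSP 24_i computing the potentials of its four second-layer neurons), step 2 (the potentials circulating around the ring of intermodular buses back to module 2_1) and the final sigmoid step:

```python
import math

I = 10            # number of cascaded modules (first preferred embodiment)
J = 256           # pixels in the 16x16 input block (layer 1)

# Hypothetical stand-ins: each module's DSP 24_i holds the weight rows of its
# quadruplet of four second-layer neurons (values here are arbitrary).
weights = [[[0.001 * (m + p - j % 5) for j in range(J)] for p in range(4)]
           for m in range(I)]
e = [1.0 if k % 7 == 0 else 0.0 for k in range(J)]   # input vector e_1..e_J

def dsp_potentials(m):
    """Step 1: DSP 24_m computes V_p = sum_k W_{k,p} * e_k for its 4 neurons."""
    return [sum(w * x for w, x in zip(row, e)) for row in weights[m]]

# Step 2: the potentials travel the ring (intermodular buses BI, then the
# feedback bus BR) until module 2_1 has received V_1 to V_I.
ring_payload = []
for m in range(I):
    ring_payload.extend(dsp_potentials(m))

# Steps 3-5: DSP 24_1 applies the tanh-based sigmoid to every potential.
outputs = [math.tanh(v) for v in ring_payload]
```

The ten quadruplets together yield the 40 second-layer output levels; the same loop structure would be rerun for the 40-to-10 second processing.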

WAIT FOR NEXT VECTOR TO PROCESS

MULTIPROCESSOR SYSTEM WITH CASCADED MODULES COMBINING PROCESSORS THROUGH A PROGRAMMABLE LOGIC CELL ARRAY

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to multiprocessor data processing systems in general.

2. Description of the Prior Art

The increasingly greater computational throughput requirements in data processing systems for applications such as image processing or scientific computation have led computer designers to introduce new processor architectures: parallel architectures. Three basic principles are used for introducing this parallelism in the new architectures. The distinction is made between:

segmented (or pipeline) architectures: this consists in breaking a task down into plural steps and in performing these steps independently by different processors. Every time an intermediary result is obtained after performance of a step, it is transmitted to the next processor, and so on. When a step is completed, the processor in charge of performing it is freed and thus becomes available to process new data. Presupposing the respective durations of performance of the different steps to be substantially equal, the period required to obtain the final results is then the duration of performance of one step, and not the duration of performance of the task;

array processor architectures or SIMD (Single Instruction, Multiple Data Stream) architectures. In this type of architecture, the increase in computational throughput is obtained by having the same instruction performed by a large number of identical processing units. This type of architecture is particularly well suited to vectorial processing; and

multiprocessor architectures or MIMD (Multiple Instruction, Multiple Data Stream) architectures. In such an architecture, several processors perform respective streams of instructions independently of one another. Communication between the processors is ensured either by a common memory and/or by a network interconnecting the processors.

Pending European Patent Application No. 433,142 filed Dec. 6, 1990 discloses an architecture of a multiprocessor data processing system in which the bus is shared between plural processor stages and is interfaced in each stage by a programmable LCA (Logic Cell Array) configurated into plural input/output means and a switching means. The main advantage of such an architecture is to dispense each processor from bus request and management tasks, the latter being carried out in the logic cell array associated with the processor. Nonetheless, this architecture is not optimal for the multiprocessor approach to scientific computation applications. Each processor is in fact entrusted with all the tasks to be performed (excepting management of the bus). Numerous multiprocessor applications require considerable computational means, and a single unspecialized processor per stage restricts performances.

OBJECTS OF THE INVENTION

The main object of this invention is to remedy the preceding disadvantages.

Another object of this invention is to provide a data processing system optimizing the multiprocessor approach for each stage of the above-mentioned architecture.

SUMMARY OF THE INVENTION

Accordingly, there is provided a multiprocessor data processing system embodying the invention including a plurality of cascaded modules. Each of the cascaded modules comprises a data processing unit connected to other data processing units in immediately adjacent downstream and upstream modules by way of a communication network. Each of the cascaded modules further comprises:

a first memory,
an additional processing unit,
a second memory,
a programmable logic cell array. The programmable logic cell array is configurable into first, second, third and fourth input/output interfaces for temporarily memorizing data into memorized data, and into a central processing and switching circuit for processing the memorized data into processed data and switching the processed data towards one of the input/output interfaces.

Each cascaded module further comprises:

a first module bus for interconnecting the data processing unit, the first memory and the first input/output interface, and
a second module bus for interconnecting the additional processing unit, the second memory and the fourth input/output interface.

The second and third input/output interfaces in each of the modules are interconnected to the third input/output interface in the immediately adjacent downstream module and the second interface in the immediately adjacent upstream module by two intermodular buses, respectively.

According to another embodiment, given that, on the one hand, the processing and switching means is configurated once and for all for a given application and, on the other hand, that several successive multiprocessor processings can be carried out by the processing units on a same data stream, the data already processed according to a first processing must be redistributed to the different modules for a next processing. In this case, the second and third input/output interfaces respectively in the programmable logic cell arrays of the last and first modules of the plurality of cascaded modules are connected by way of a feedback bus.

The invention also relates to a data processing method implemented in a multiprocessor data processing system embodying the invention. The method comprises:

a first step consisting in loading a respective set of weights into the second memory of each of the cascaded modules via the communication network, and the input data into the first memory of the first module, and

at least one set of second and third steps,

the second step consisting in carrying out partial processings on the input data in the additional processing unit of each cascaded module as a function of the respective set of matrix multiplication weights in order to determine partial data, and

the third step consisting in downloading the partial data to any one of the programmable logic cell arrays or any one of the first and second memories in the cascaded modules via the intermodular buses and the feedback bus.

BRIEF DESCRIPTION OF THE DRAWINGS

Further features and advantages of the invention will be apparent from the following particular description of two preferred embodiments of this invention with reference to the corresponding accompanying drawings in which:

FIG. 1 is a modelized diagram of an artificial neural network;
FIG. 2 is a diagram of a layered architecture of the modelized representation in FIG. 1;
FIG. 3 is a block diagram of a multiprocessor data processing system with reconfigurable active bus according to the prior art;
FIGS. 4A and 4B are two respective block diagrams of two embodiments of a data processing system with specialized coprocessor embodying the invention;
FIG. 5 is a diagram of feature maps obtained for successive processings in the layers of an artificial neural network;
FIG. 6 is a diagram of connections associated with synaptic weights between two adjacent layers of an artificial neural network;
FIG. 7 is a loading diagram of synaptic weights relating to two successive layers in a data processing system according to a preferred embodiment of the invention; and
FIG. 8 is a diagram of algorithms relating to the processing of the connections between two successive layers used in the system according to FIG. 7.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The multiprocessor data processing system embodying the invention is described hereinafter for a particular embodiment concerning artificial neural networks.

A very general model representing a multilayer neural network is represented in FIG. 1: a certain number of elementary units ..., N_{j-1}, N_j, N_{j+1}, ..., called neurons and defined by their respective outputs ..., S_{j-1}, S_j, S_{j+1}, ..., constitute the nodes of the network. Each neuron N_j is activated by a "potential" V_j defined by the equation:

V_j = Σ_i W_ij · S_i

in which S_i represents an output level of a neuron N_i "connected" to the neuron N_j and W_ij designates a synaptic weight of the connection between the neurons N_i and N_j. With this potential V_j is associated the output level S_j corresponding to the neuron N_j, defined by the relation:

S_j = f(V_j)

in which f is a non-linear function.

In practice and by analogy with the human brain, these neurons are not organized anarchically, but are grouped in layers in the form of "columns", connections between two adjacent layers being assigned to a particular function, as shown in FIG. 2. This figure represents N layers of superimposed neurons comprising two end layers LAYER 1 and LAYER N, and (N-2) hidden layers LAYER 2 to LAYER (N-1) included between the two end layers. The end layer LAYER 1 is commonly called the "retina" or "input layer" and receives an input vector whereas the end layer LAYER N, or output layer, produces a corresponding output vector. In this representation, each neuron of a given layer n is connected to each of the neurons of the immediately adjacent upper layer (n+1), the integer n lying between 1 and N-1. As specified with reference to FIG. 1, a respective synaptic weight W is attributed to each of these connections.

In practice, and by way of an example, the neural network can be used for recognition of digits such as 0, 1, 2, ..., 8, 9. In this case, the input vector is a block of digital pixels of a digitized image of a given digit written by any person whomsoever. To each connection between neurons is attributed a respective synaptic weight W deduced during a learning phase of the network. These synaptic weights correspond to values of coefficients of a multiplication matrix applied to pixels of the image. The output layer LAYER N produces an output vector which is a binary information identifying the "recognized" digit. Outputs of neurons of a respective layer produce a feature map which has "filtered" features of the feature map produced from the outputs of the neurons of the lower adjacent layer. Each step in the implementation of this model for the multiprocessor data processing system embodying the invention will be described in greater detail further on.

A multiprocessor data processing system according to the prior art, as described in pending European Patent Application No. 433,142 filed Dec. 6, 1990, is shown in FIG. 3. The multiprocessor system comprises a plurality of modules in cascade, of which two adjacent modules M_i and M_{i+1} are represented in FIG. 3. Each of the modules M_i, M_{i+1} includes a processor PR_i, PR_{i+1}, called a transputer, a RAM memory ME_i, ME_{i+1} and a programmable logic cell array LCA_i, LCA_{i+1}. The respective processors of the various modules are interconnected by means of an interprocessor communication network RC. This communication network RC notably ensures the transfer of monitoring/control information between processors. For a given module M_i, the processor PR_i, the memory ME_i and the logic cell array LCA_i are interconnected by means of a respective module bus BM_i. This module bus BM_i is composed of three specialized elementary buses which are a data bus, an address bus and a control bus, and interconnects the processor, the memory and a first input/output interface in the logic cell array LCA_i. Programmable logic cell arrays (LCA) are known to those skilled in the art and are constituted by configurable logic, combinational and sequential circuits. The configuration of the programmable logic cell array (LCA) is set up by the module processor PR_i.

According to the above-mentioned architecture, the programmable logic cell array is configurated into three input/output interfaces and a central data processing and switching circuit (hereinafter, the central circuit). The input/output interfaces notably carry out temporary data storage functions. The central circuit ensures data switching functions between the interfaces, and elementary processing functions (data format modification, encoding, precomputed functions), e.g. in pipeline mode. The first interface of the logic cell array constitutes the interface between the module bus BM_i and the central circuit whereas the second and third interfaces respectively interface the central circuit with two intermodular buses BIM_{i-1} and BIM_i. The buses BIM_{i-1} and BIM_i are then respectively connected with a third interface in the logic cell array of an immediately adjacent downstream module M_{i-1} and a second interface in the logic array of an immediately adjacent upstream module M_{i+1}.

The introduction of a programmable logic cell array in each module of such a multiprocessor architecture is particularly interesting in that it induces a fine grain of parallelism between modules while assigning the low-level tasks (access to the intermodular bus, elementary functions) to the logic array LCA.

FIG. 4A shows the first embodiment of a data processing system according to the invention for the carrying out of an artificial neural network within the scope, e.g., of recognition of digits included between 0 and 9. The system comprises I modules 2_1 to 2_I in a cascaded architecture.
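The FIG. 1 neuron model above reduces to a weighted sum followed by a non-linearity. A minimal Python sketch of the two relations (our own illustration; the weight and output-level values are arbitrary, and tanh stands in for the unspecified non-linear f):

```python
import math

def potential(weights_j, s):
    """V_j = sum_i W_ij * S_i  (FIG. 1 model)."""
    return sum(w * si for w, si in zip(weights_j, s))

def output_level(v):
    """S_j = f(V_j), with f a non-linear function (here a tanh sigmoid)."""
    return math.tanh(v)

s_prev = [0.0, 1.0, 1.0]      # output levels S_i of the lower layer (arbitrary)
w_j = [0.5, -0.25, 0.75]      # synaptic weights W_ij of neuron N_j (arbitrary)
v_j = potential(w_j, s_prev)  # -> 0.5
s_j = output_level(v_j)       # -> tanh(0.5)
```

Iterating this pair of relations layer by layer yields the feature maps discussed below.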
Each module 2_i, i being an integer varying between 1 and I, comprises a data processing unit in the form of a processor 20_i, a first RAM type memory 21_i, a programmable logic cell array 22_i, a second memory 23_i, and a digital signal processor constituting a coprocessor or dedicated specialized processing unit 24_i. Within the module 2_i, the processor 20_i, the memory 21_i and an input/output interface 221_i of the programmable logic cell array 22_i are interconnected by means of a common module bus 25_i. Typically, this common bus is constituted by the three elementary buses, i.e., address bus, data bus and control bus.

By comparison with the foregoing description in reference to the prior art according to FIG. 3, the programmable logic cell array 22_i is programmed into four input/output interfaces 221_i, 222_i, 223_i and 224_i and a central data processing and switching circuit 220_i. According to this embodiment, the input/output interfaces 221_i to 224_i principally constitute temporary storage means or buffer means. The central circuit 220_i is configurated to switch data from and to the input/output interfaces 221_i to 224_i and to conduct elementary processing of the data received through the input/output interfaces.

The first input/output interface 221_i is connected to the module bus 25_i. This input/output interface 221_i is e.g. used for:

temporary storage of data transmitted by the processor 20_i, in order to free the latter for other tasks; and

direct memory access (DMA) to the memory 21_i, connected to processor 20_i by means of the logic cell array 22_i.

The second and third interfaces 222_i and 223_i in the logic cell array 22_i of the ith module 2_i are respectively connected to a third input/output interface 223_{i+1} of an immediately adjacent downstream module 2_{i+1} and a second input/output interface 222_{i-1} of an immediately adjacent upstream module 2_{i-1}. These connections are respectively made by means of two intermodular buses BI_{i,i+1} and BI_{i-1,i}. The I programmable logic cell arrays are thus cascaded by means of intermodular buses BI_{1,2}, ..., BI_{I-1,I}. According to this first preferred embodiment, the third input/output interface 223_1 of the first module 2_1 and the second interface 222_I of the Ith module 2_I are connected by a feedback bus BR.

The intermodular buses BI_{1,2} to BI_{I-1,I} in series with the feedback bus BR thus constitute a ring. The second and third interfaces 222_i and 223_i can e.g. be used during a transmission of data between processors 20 of non-adjacent modules and thus confer high-speed communication node functions upon the logic cell arrays 22.

In each module 2_i, the digital signal processor 24_i, the second memory 23_i and the fourth input/output interface 224_i are interconnected by means of a common bus 26_i.

With reference to FIGS. 5, 6, 7 and 8, the operation of the data processing system embodying the invention will now be described for the preferred embodiment concerning artificial neural networks.

FIG. 5 shows typical results obtained within the scope of artificial neural networks for the recognition of handwritten digits included between 0 and 9. Such an application can e.g. concern the recognition of zipcodes for "automated" postal distribution. In the diagram in FIG. 5, the vertical ordinate axis relates to the numbers of neuron layers in an architecture such as that presented in reference to FIG. 2. In this diagram, the number of neuron layers is presupposed equal to 5. The input vector is an input block having (28 by 28) pixels representing any digit whatsoever, 0 in this instance, written by a person and digitized. The network is composed of 4,634 neurons.

Each neuron in the input layer (LAYER 1), called retina, respectively receives a pixel of the input vector. The first hidden layer LAYER 2 is divided into 4 sub-layers of (24 by 24) neurons. Each neuron of each sub-layer receives (5 by 5) neighboring pixels of the input block after multiplication by a line matrix of respective synaptic weights. It is recalled that these synaptic weights are used for processing into matrix multiplication coefficients. Four blocks of (24 by 24) pixels are thus supplied by the respective outputs of the four sub-layers of neurons of LAYER 2.

The synaptic weights applied between the outputs of the neurons of the layer LAYER 1 and the four sub-layers of the second layer LAYER 2 relate to specific processings on the image of (28 by 28) input block pixels. Respective synaptic weights between the four sub-layers of LAYER 2 and four sub-layers of LAYER 3 relate to averaging and subsampling-by-two processings. Respective outputs of the neurons of the four sub-layers of LAYER 3 thus produce four image blocks of (12 by 12) pixels.

Details of LAYER 3 and LAYER 4 will not be provided. It should nevertheless be remarked that the role of each layer consists in extracting fundamental features from the digitalized (28 by 28) pixel block of a handwritten digit. As shown in FIG. 5, an output layer of 10 neurons produces ten pixels in black and white, the rank of the sole white pixel produced by one of the ten neurons being representative of the "recognized" input digit subsequent to the various "digital filtering" steps respectively performed by the neuron layers.

In reference to FIGS. 6, 7 and 8, the installation of an artificial neural network in the multiprocessor data processing system embodying the invention, as shown in FIG. 4A, will now be described. According to this preferred embodiment, the neural network comprises three layers of 256, 40 and 10 neurons respectively. The neurons of the first layer, called input layer, and of the second layer each set up connections (each assigned to a respective synaptic weight) respectively with each of the neurons of the immediately adjacent upper layer, i.e., the second layer and the third layer, called output layer. The input vector is a block of (16×16)=256 pixels of a digitalized image of a handwritten digit included between 0 and 9.

As shown in FIG. 6, all the connections assigned to respective synaptic weights between two adjacent layers respectively having J and J' neurons are fully defined by a single rectangular matrix of size (J×J'). Each weight W_{j,j'} of the rectangular matrix, j being included between 1 and J, and j' between 1 and J', corresponds to a value of the synaptic weight of the connection between a neuron of rank j and a neuron of rank j' of the two adjacent layers respectively.

In compliance with the preferred embodiment, two respective matrices of (J×J')=(256×40) and (J×J')=(40×10) synaptic weights between the first and second layers and between the second and third layers are then used, i.e., a total of 10,640 weights or connections.

For indicative purposes, these synaptic weights for particular embodiments (recognition of digits, ...) are obtained during a learning phase by a gradient back-propagation algorithm. Summarily, this algorithm performs the recognition computations for synaptic weights given initially. The results of these computations are compared to expected recognition results. The weights are modified taking this comparison into account. After several iterations, the synaptic weights converge towards optimal recognition values. This learning phase is generally very costly as regards time.

According to the first preferred embodiment, the data processing system embodying the invention (FIG. 4A) comprises I=10 modules 2_1 to 2_10. In a first step, as shown schematically in FIG.
7, each module 2_i, i lying between 1 and 10, is assigned to the processing relating to all the connections between the input layer and respectively one of the ten quadruplets of neurons in the second layer (4×10=40). The matrix computations:

V_p = Σ_k W_{k,p} · e_k

where k varies between 1 and J=(16×16) and p varies between 1 and J'=40, are carried out by the same digital signal processor for four set values of the index p, and therefore in relation to the four neurons of a respective quadruplet. One advantage of the invention is that these matrix multiplications are performed by the digital signal processors 24_1 to 24_10.

Further to this first processing (connections between the first and second layers), each digital signal processor 24_1 to 24_10 is assigned to the processing of the matrix multiplications relating to the connections between the neurons of the second layer and a respective neuron of the third layer, called output layer (1×10=10). The utilisation of a digital signal processor or specialized coprocessor 24_i frees the processor 20_i, which can perform other tasks.

In reference to FIGS. 8 and 4A, the implementation and operation of an artificial neural network in the multiprocessor data processing system embodying the invention will now be described.

Prior to the operation of the system as an artificial neural network in the recognition mode, the system is initiated at the initiative of a master computer 1 connected to the first processor 20_1. This initiation is established by the computer 1 via the interprocessor communication network RC. The initiation comprises:

with regard to each processor 20_1 to 20_10:

loading of an operating program in the respective memory 21_1 to 21_10 via the bus 25_1 to 25_10,

configurating of the associated logic cell array 22_1 to 22_10,

loading of programs (matrix multiplication, ...) relating to the operation of the digital signal processor 24_1 to 24_10 in the associated memory 23_1 to 23_10 via the network 22_1 to 22_10, and

loading of a respective set of synaptic weights such as previously described and relating to the digital signal processor 24_1 to 24_10, in the associated memory 23_1 to 23_10,

as well as the loading of the first input vector constituting input data e_1 to e_J to be recognized by processing (block of 16 by 16 pixels), in the memory 21_1 of the processor 20_1 of the first module 2_1.

In the case of a sequential processing of plural input data vectors to be recognized, the latter are memorized as they become available in the memory 21_1 connected to the processor of the first module 2_1. Each of the input vectors is e.g. supplied by the master computer 1 subsequent to a preprocessing (linear processing to normalize the 16 by 16 size of the initial blocks supplied by a video camera) on the initial blocks via the communication network RC.

The diagram of algorithms relating to a first processing of connections between first and second layers of neurons in FIG. 8 enables the operation of the multiprocessor data processing system embodying the invention as an artificial neural network to be grasped. Each of the "tables" in FIG. 8 relates to an algorithm performed by one or more of the processors 20_i, digital signal processors 24_i, or logic cell arrays 22_i of the system.

It has been seen previously that the input data vector is initially loaded in the memory 21_1. The processor 20_1 reads this vector in the memory 21_1 and writes it in the first input/output interface 221_1 of the programmable logic cell array 22_1 of the first module 2_1. The central processing and switching circuit 220_1 of array 22_1 then switches this vector towards the second interface 222_1, and this second interface 222_1 retransmits it to the third interface 223_2 of the logic cell array 22_2 of the second module 2_2, and so on and so forth. The vector is thus broadcast in the ring BI_{1,2} to BI_{I-1,I} successively to the immediately adjacent modules above. Each of the central means of the arrays switches and takes, via the fourth interface 224_i, the set of (16 by 16) pixels of the vector towards the memory 23_i associated with its digital signal processor 24_i. The data input vector used by each module 2_i in the configuration previously described is thus memorized in the respective memory 23_i associated with the digital signal processor 24_i.

The first step (step 1) of the algorithm relating to each of the digital signal processors 24_i, with i lying between 1 and I, consists in computing the potentials V relating to the neurons attributed to the module 2_i, then in writing the potentials thus computed in the input/output interface 224_i of the logic cell array 22_i of the same module 2_i. Each central circuit 220_i of the logic cell arrays, configurated for this purpose, gradually transmits the results of the potential computations V_1 to V_I to the next logic cell array 22_{i+1} until all the results V_1 to V_I have been received by the third input/output interface 223_1 of the logic cell array of the first module 2_1 via the feedback bus BR (step 2).

Then the input/output interface 223_1 of the first module 2_1 writes the potential computation results received in the memory 23_1 and alerts the digital signal processor 24_1 (step 3). The processor 24_1 computes the value of the sigmoid function (non-linear f function based on hyperbolic tangent functions defined from the model initially presented in the specification) for each "pixel" or neuron potential V produced by the processors 24_i (step 5) for obtaining the output levels of all the neurons in the second layer for this first processing. At the same time, as each digital signal processor 24_i writes the computed potentials specific to the four neurons of the second layer which it simulates in this first processing in the associated memory 23_i, the processors 20_1 to 20_I then read the neuron potentials respectively memorized in the memories 23_1 to 23_I when all the potentials have been computed (step 2).

According to the embodiment, two processings relating to the first and second layers and to the second and third layers are provided. In this way the outputs of the neurons in the second layer of the chosen configuration memorized in the memory 23_i are reprocessed by the digital signal processors 24_1 to 24_I for new potential computations after broadcasting in the ring of the neuron outputs as computed during the first processing.

According to the second embodiment shown in FIG. 4B, the addition of an additional module 2_0 to the initial modules 2_1 to 2_I is proposed upstream of the latter. This additional module comprises a processor 20_0, a memory 21_0 and a programmable logic cell array 22_0. This module is provided in order to directly inject the pixel images to be processed into an input/output interface 223_0 of the programmable logic cell array 22_0. This injection enables an increase of the flow of images to be processed since the images then do not transit via the master computer 1 and, furthermore, do not require utilisation of the communication network RC. A data (images) acquiring system, such as a video camera or a scanner (not shown), is then directly connected to the third input/output interface 223_0 of the programmable array 22_0 of the additional module 2_0 via a bus BI_0. The memory 21_0, the processor 20_0 and a first input/output interface 221_0 of the programmable logic cell array 22_0 in the additional module 2_0 are interconnected in identical manner to the interconnections in the other modules 2_1 to 2_I, by means of a bus 25_0. The images to be processed which are injected via a bus in the third input/output interface 223_0 of the programmable logic cell array 22_0 can undergo a first preprocessing (16 by 16 formatting) in a processing and switching circuit 220_0 of the logic cell array 22_0 by programming of the latter. The second interface 222_0 is connected to the third interface 223_1 of the first 2_1 of the modules in cascade 2_1 to 2_I via an additional intermodular bus BI_0.

For identical reasons to those of the first embodiment, a feedback bus BR can also be provided. The latter interconnects the second input/output interface 222_I of the programmable logic cell array 22_I of the last module 2_I to the fourth input/output interface 224_0 of the logic cell array 22_0 of the additional module.

For indicative purposes, a recognition of a digitized handwritten digit in (16 by 16) pixels by the data processing system embodying the invention simulating 10,640 connections requires 175 µs. Durations of the order of a tenth of a millisecond are usually required for conventional systems.

What we claim is:

1. An input data processing method implemented in a multiprocessor data processing system,

said multiprocessor data processing system comprising a plurality of cascaded modules, each of said cascaded modules comprising:

a data processing unit connected to other data processing units in immediately adjacent downstream and upstream modules by means of a communication network,

a first memory for storing data,

an additional processing unit,

a second memory for storing data associated with said additional processing unit,

a programmable logic cell array configurable into first, second, third and fourth input/output interfacing means for temporarily memorizing data into memorized data, and into a central processing and switching means for processing said memorized data into processed data and switching said processed data towards one of said input/output interfacing means,

a first module bus for interconnecting said data processing unit, said first memory and said first input/output interfacing means,

a second module bus for interconnecting said additional processing unit, said second memory and said fourth input/output interfacing means,

said second and third input/output interfacing means in each of said cascaded modules being interconnected to the third input/output interfacing means in the immediately adjacent downstream module and the second input/output interfacing means in the immediately adjacent upstream module by two intermodular buses, respectively,

said second and third input/output interfacing means respectively in said programmable logic cell array of a last module and a first module in said plurality of said cascaded modules being connected by means of a feedback bus,

said input data processing method comprising:

a first step further consisting in loading a respective set of matrix multiplication weights into said second memory of each of said cascaded modules via said communication network and said input data into said first memory of said first module, and

at least one set of second and third steps in each of said cascaded modules,

said second step consisting in carrying out partial processings on said input data in said additional processing unit of said each cascaded module as a function of said respective set of matrix multiplication weights for determining partial data, and

said third step consisting in downloading said partial data to any one of said programmable logic cell arrays or any one of said first and second memories in said cascaded modules via said intermodular buses and said feedback bus.

2. An input data processing method implemented in a multiprocessor data processing system,

said multiprocessor data processing system comprising a plurality of cascaded modules, each of said cascaded modules comprising:

a data processing unit connected to other data processing units in immediately adjacent downstream and upstream modules by means of a communication network,

a first memory for storing data,

an additional processing unit,

a second memory for storing data associated with said additional processing unit,

a programmable logic cell array configurable into first, second, third and fourth input/output interfacing means for temporarily memorizing data into memorized data, and into a central processing and switching means for processing said memorized data into processed data and switching said processed data towards one of said input/output interfacing means,

a first module bus for interconnecting said data processing unit, said first memory and said first input/output interfacing means,

a second module bus for interconnecting said additional processing unit, said second memory and said fourth input/output interfacing
means, rized data, and into a central processing and switch said second and third input/output interfacing means in ing means for processing said memorized data into said each of said cascaded modules being intercon processed data and switching said processed data nected to the third input/output interfacing means in towards one of said input/output interfacing means, 55 said immediately adjacent downstream module and a first module bus for interconnecting said data pro the second input/output interfacing means in said cessing unit, said first memory and said first input/ immediately adjacent upstream module by two inter output interfacing means, and modular buses, respectively, and a second module bus for interconnecting said additional an additional module, a data acquiring means and an processing unit, said second memory and said fourth 60 additional intermodular bus, input/output interfacing means, said additional module including a data processing unit, said second and third input/output interfacing means in a first memory and a programmable logic cell array said each of said cascaded modules being intercon configurable into first, second, third and fourth input/ nected to the third input/output interfacing means in output interfacing means, said immediately adjacent upstream module and the 65 said acquiring means being interconnected with said second input/output interfacing means in said imme third input/output interfacing means in said program diately adjacent downstream module, by two inter mable logic cell array of said additional module, 5,465,375 11 12 said additional intermodular bus interconnecting said data into said first memory of said first module, second input/output interfacing means in said pro and grammable logic cell array of said additional module at least one set of second and third steps in each of and said third input/output interfacing means in a said cascaded modules, first of said plurality of said cascaded modules, said second step 
consisting in carrying out partial wherein said second input/output interfacing means in processings on said input data in said additional said programmable logic cell array of the last module processing unit of said each cascaded module as a in said plurality of said cascaded modules and said function of said respective set of matrix multipli fourth input/output interfacing means in said addi cation weights for determining partial data, and tional module are connected by means of a feedback 10 said third step consisting in downloading said partial bus, data to any one of said programmable logic cell said input data processing method comprising: arrays or any one of said first and second memo a first step further consisting in loading a respective ries in said cascaded modules via said intermodu set of matrix multiplication weights into said lar buses and said feedback bus. second memory of each of said cascaded modules 15 via said communication network and said input ck k k 3 k
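The non-linear activation discussed in the description, a sigmoid f built from hyperbolic tangent functions and applied to each neuron potential V, can be sketched as follows. This is a minimal illustration rather than the patented DSP implementation; the slope parameter `beta` and the example potentials are assumptions, since the exact model of the specification is not reproduced in this excerpt.

```python
import math

def f(v, beta=1.0):
    """Sigmoid activation based on the hyperbolic tangent.

    beta (the slope) is an illustrative parameter only; the patent's
    specification defines its own exact non-linear model.
    """
    return math.tanh(beta * v)

# Each digital signal processor applies f to the potentials of the
# four second-layer neurons it simulates (example values only):
potentials = [0.3, -1.2, 2.5, 0.0]
outputs = [f(v) for v in potentials]
```

Any bounded, monotonic squashing function would serve the same role; tanh is used here because the specification names hyperbolic tangent functions as the basis of its model.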
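The claimed method, a first step loading a respective slice of the weight matrix into each cascaded module, a second step carrying out partial processings locally, and a third step circulating the partial data over the intermodular buses and the feedback bus, can be sketched in software. This is a sketch under stated assumptions: the module count, layer sizes, helper names (`partial_processing`, `ring_layer`) and the tanh activation are illustrative, not the patent's actual firmware.

```python
import math

def partial_processing(weight_slice, x):
    """Second step: one module's DSP computes the potentials of the
    neurons it simulates from its respective set of matrix
    multiplication weights, then applies the non-linear function."""
    return [math.tanh(sum(w * e for w, e in zip(row, x)))
            for row in weight_slice]

def ring_layer(weight_slices, x):
    """Third step (software analogue): the partial data produced by
    each cascaded module are collected, as if broadcast over the
    intermodular buses and the feedback bus of the ring."""
    outputs = []
    for weight_slice in weight_slices:   # one weight slice per module
        outputs.extend(partial_processing(weight_slice, x))
    return outputs

# Illustrative sizes (assumptions): 256 inputs (a 16 by 16 pixel
# block) and 10 modules each simulating 4 second-layer neurons.
x = [0.5] * 256
slices = [[[0.01] * 256 for _ in range(4)] for _ in range(10)]
y = ring_layer(slices, x)   # 40 second-layer neuron outputs
```

In the hardware described above the concatenation is implicit: each DSP writes its four potentials into its local memory, and the ring broadcast makes every module's outputs visible to the others for the next layer's computation.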