Data Compression Method and System
Europäisches Patentamt / European Patent Office / Office européen des brevets

EUROPEAN PATENT APPLICATION

Publication number: 0 678 986 A1
Application number: 95106020.1
Int. Cl.: H03M 7/42, G06F 5/00
Date of filing: 21.04.95
Priority: 22.04.94 JP 107837/94
Date of publication of application: 25.10.95 Bulletin 95/43
Designated Contracting States: DE FR GB
Applicant: SETA CO., LTD., 35-1, Nishi-Kamata 7-chome, Ohta-ku, Tokyo 144 (JP)
Inventor: Watanabe, Hiroyuki, c/o Seta Co., Ltd., 35-1, Nishi-Kamata 7-chome, Ohta-ku, Tokyo 144 (JP)
Representative: Prüfer, Lutz H. et al, PRÜFER & PARTNER, Patentanwälte, Harthauser Strasse 25d, D-81545 München (DE)

Data compression method and system.

A lossless type data compression method employing a dictionary system is suitable for a character generator of a game machine and so forth. Working data strings are generated from an original data stream. Two sequential working data strings are combined to form a combined string. A dictionary is generated by registering the combined strings having an occurrence frequency higher than a given value, each with a dictionary number. The combined strings in the data stream are replaced with the dictionary numbers corresponding to the combined strings in the dictionary.

[FIG. 1: block diagram. Labels: RAM (original data stream); RAM (data stream); register (number of cycles to repeat data compression); CPU; RAM (dictionary data); ROM (data compression program); compressed data.]

The present invention relates generally to a data compression method and system. More specifically, the invention relates to a lossless type data compression method employing a dictionary system, suitable for a character generator of a game machine and so forth.

Conventionally, various data compression methods have been developed to reduce the necessary capacity of storage devices in data processing systems and to improve data transmission efficiency. Data compression methods may generally be divided into a lossy type and a lossless type, from the viewpoint of whether the coding is reversible.

The lossy type data compression method is a non-reversible coding system. JPEG (Joint Photographic Experts Group), MPEG (Moving Picture Experts Group), H.261 for PMS (Picture-phone Meeting Service) or picture telephone, and so forth are internationally standardized systems of this lossy type. Lossy type data compression has the advantage of a high compression rate, approximately 1/50 to 1/1000, at the cost of a loss of information.

On the other hand, the lossless type data compression method is a reversible coding system. This type generally attains a compression rate of approximately 1/2 and thus cannot achieve the high compression rate of the lossy type. However, the lossless type is advantageous in that data can be encoded and decoded without losing any of the original data. Run length coding, Huffman coding, arithmetic coding, the LZ (Lempel-Ziv) system and so forth are typical standardized systems among the lossless type data compression methods.

The run length coding system is the simplest lossless type coding system. It utilizes the fact that the probability of appearance differs depending upon the value of the run length. Therefore, by assigning a shorter code to a run length having a higher probability, data compression is achieved. This coding system has been employed in CD-I (Compact Disc-Interactive), Video for Windows (trademark of Microsoft) and so forth.

The Huffman coding system is a data compression system primarily used in the field of image processing. MH (Modified Huffman) coding of the G3 facsimile standard and so forth are applications of Huffman coding.

It should be noted that JPEG, MPEG and H.261 also employ Huffman coding. However, since these methods use the DCT (Discrete Cosine Transform) in a preparatory process, they are classified as the lossy type.

The arithmetic coding system is used in JBIG (Joint Bi-level Image Experts Group), which is the next-generation coding system for facsimile. JBIG can handle a redundant data stream that cannot be handled by the Huffman coding system, by unitary compression based on the probability of occurrence of strings. Thus, JBIG realizes optimal data compression in view of information entropy.

In general, the LZ system is a data compression system which performs data compression by detecting repetition of strings. The LZ system is applied to data compression tools for personal computers, to cartridge tape recording apparatus for data backup, and to other products.

The LZ system generally includes the LZ77 system (Ziv, J. and Lempel, A., "A Universal Algorithm for Sequential Data Compression", IEEE Transactions on Information Theory, vol. IT-23, No. 3, pp. 337-343, May 1977) and the LZ78 system (Ziv, J. and Lempel, A., "Compression of Individual Sequences via Variable-Rate Coding", IEEE Transactions on Information Theory, vol. IT-24, No. 5, pp. 530-536, September 1978). The former LZ77 system is also disclosed in U.S. Patent Nos. 5,003,307 and 5,016,009. The latter LZ78 system is also disclosed in U.S. Patent Nos. 4,558,302 and 4,814,746. The algorithms of the LZ77 and LZ78 systems have in common that the string currently being processed is compared with already processed strings, and the longest matching string is obtained through the comparison. However, LZ77 stores the processed strings in a buffer and handles them as if the buffer were a window sliding along the input data stream. The LZ78 system, on the other hand, assigns dedicated codes to processed strings and registers the codes in the manner of a dictionary.

As a result, comparing the LZ77 and LZ78 systems in terms of function, the LZ77 system is superior to the LZ78 system in compression rate, and the LZ78 system is superior to the LZ77 system in data processing speed.

Meanwhile, in the field of game machines, the requirement for high level image expression is progressively growing. In commercial game machines, image expression utilizing three-dimensional CG (computer graphics) has been employed. The trend is extending to home use television game machines and multimedia systems. Thus, development of data processing systems capable of such high level image expression is in progress.

More complex image expression increases the amount of data. Therefore, demand is growing for a lossless type data compression method that has a high compression rate and is capable of high speed encoding and decoding.

Particularly, in the case of the game machine, unless the display screen reacts to the operation of a button on a control pad by a user within 1/60 to 1/30 seconds (corresponding to the display period of one or two fields of a color television signal), the game becomes less interesting. Therefore, it is essential to achieve both a high compression rate and high speed encoding and decoding.

In this sense, the above-mentioned LZ system (particularly LZ78) and the LHA system, in which the LZ system and the Huffman coding system are combined, may be regarded as suitable data compression methods.

However, in the case of the LZ system, since the individual strings to be compressed have variable length and the algorithm for compression and decompression is complicated, the data processing procedure may contain a large number of steps and the hardware construction may become complicated.

On the other hand, the character generator of the game machine handles relatively small data blocks. With the conventional data compression methods, compression of such blocks becomes impossible or at least insufficient.

Also, the data blocks of the character generator in the game machine tend to have a high probability of occurrence of the same string, and tend not to cause significant variation in the number of data blocks corresponding to the variation mode of the characters.

Therefore, it is an object of the present invention to provide a lossless type data compression method which permits high speed compression and decompression and can realize a high compression rate irrespective of the size of data blocks.

order in descending order from the working string having the largest occurrence frequency, and having an occurrence frequency greater than or equal to 3;

performing a fourth process step for registering, in the second storage means, compression dictionary data of (S + 1) bits consisting of a dictionary number and a compression identifier bit, corresponding to each of the combined strings detected by the third process step; and

a fifth process step for replacing each combined string in the working data stream that matches one of the combined strings registered in the second storage means with the compression dictionary data corresponding to the matching combined string,

repeating the third to fifth process steps R times, taking the data stream replaced through the fifth process step as the working data stream, and outputting, after the R repetitions, the data stream stored in the first storage means and all combined strings and compression dictionary data stored in the second storage means as compressed data.

In the alternative, when R = 1, the second process step is performed without adding the non-compression identifier bit, so that the working data stream takes the original strings as the working strings, and the fourth process step is performed by formulating the compression dictionary data solely with the S-bit dictionary number.

In the foregoing construction, the number of repetition cycles of the third to fifth steps is des-