
SCIENCE OF THE STROKE SEQUENCE OF

Takeshi SHIMOMURA Technical College of Osaka Prefecture Neyagawa-shi, Osaka 572, JAPAN

Summary

The stroke sequence of kanji has been investigated chiefly from the viewpoint of energy in the dynamics of writing, together with the informational viewpoint of memorization in learning. Sample characters include all the kanji for daily use, and also katakana. The results show that the standard sequence fundamentally follows the law of energy minimization in writing movements. The degree of satisfaction is highest for vertical succession among practical writing conditions. The standard sequence is considered to originate from human adaptation to circumstances. For characters with more strokes, it is found that, once the contribution of easier memorization through compression of the information content of the sequence is added, the law is satisfied by selecting the sequence of sub-systems in place of individual strokes. These results indicate that the stroke sequence greatly affects the human ability of kanji processing.

1. Introduction

In general, a standard stroke sequence is traditionally designated for each kanji (and also for its offspring letters such as katakana), and both in [...] and in primary education, writing characters according to the standard is imposed. But the unknown origin of the standard prevents its observance from being justified. Recently the stroke sequence has also found applications in engineering, such as kanji recognition. In practice, however, the existence of some variant sequences poses problems. In view of this situation, clarifying the scientific basis of the standard sequence will be of considerable value to the above fields, and will also serve as an aid to explicating the intrinsic nature of the character in linguistics.

In this report the considerations from the viewpoint of energy in the dynamics of writing are extended, and the study of the contribution to easy memorization, the possibility of which was already suggested by the author as an additional factor for characters with more strokes 1), is developed, aiming at a total study of the traditional standard sequence.

2. Hypothesis on the Stroke Sequence

In writing kanji, if the shape alone mattered, the direction and the sequence of strokes could vary, and a great number of ways of writing produced by their combinations would be possible. What is the reason that one specified sequence has been selected as the standard? From daily experience, it seems to the author that following the standard sequence makes "easy, rapid and beautiful" writing possible. Putting aside the factor of beauty for the moment, "ease and rapidness" can be taken to mean that the standard sequence is the one requiring as little energy consumption as possible in writing movements. This study therefore began by building up the hypothesis that the standard sequence satisfies the law of energy minimization, and verifying it by electronic computation for kanji samples with fewer strokes. 1) For kanji with more strokes, in addition to its contribution to writing movements, the standard sequence is assumed to ease the memorization of kanji, i.e. to contribute to reducing information content, which is studied by the use of information theory.

3. Modelling of the Writing Movements and the Rank Distances

If the directions of strokes are conventionally fixed, a kanji with n strokes has n! possible sequences. The energy consumed in writing depends on pen-path length, pen velocity, pen pressure and so on. For simplicity, assuming that velocity and pressure are constant and that the effects of direction and other factors are negligible, the energy is reduced to a function of pen-path length alone. For a single character the pen-path consists of stroke vectors and stroke-with-stroke combining vectors, and in the case of character succession, character-with-character combining vectors are added to this. The sample kanji are the 1850 characters for daily use. The character form employed is the square style, and each stroke is approximated by a straight line. As the circumstances of a character, an appropriate field is assumed, and the situations of singleness and of character succession in three different directions are introduced by setting the boundary conditions.

The process of verifying the hypothesis is to calculate the energy consumption of all the possible sequences for normalized standard kanji samples and to examine whether the energy consumption of the standard sequence is among the lowest.

The results of electronic computation show that, as a whole, the hypothesis is fairly well satisfied. As an example, the degree of satisfaction is shown in Tab. 1 by the rank distance D for samples of not more than 6 strokes, where D is defined as

$$ D = \frac{k - 1}{n! - 1} \times 100 \ (\%) $$

when the standard sequence is the k-th from the lowest in energy. Among these, the relationship between D and the cumulative number of characters N (%) for 6-stroke characters, for example, is shown in Fig. 1. As for katakana, perfect satisfaction, D = 0, holds for more than 60% of all the samples in the vertically downward succession condition, and the anisotropy is quite small.

In Tab. 1, all the samples of not more than two strokes are independent of singleness or succession, and of the direction of succession, and are completely optimized, a fact that suits their nature as the most fundamental constituents, together with katakana, of the kanji system. Although some small spreading of D and some anisotropy appear as the number of strokes per character increases, the whole trend can still be seen to support the hypothesis. It is noticeable here that the human ability of selection through cumulative experience is splendidly high: in spite of the number of possible sequences n! increasing abruptly with the number of strokes, perfect satisfaction is found in quite a few samples. (For 5-stroke samples, for instance, for which n! = 120, the optimum holds in more than 20% of them!)

Some difference in the degree of satisfaction between different environments is also observed. The degree is highest for vertical succession among practical writing conditions, which corresponds well to the traditional calligraphic modes of kanji in past China and Japan.

The history of the standard sequence has been rather stable, in connection with the stable writing modes of the past, although some examples of changed sequences exist. Inspection of their D's indicates that the transitions are mostly towards the lowest energy. New phenomena now observed about the sequence are mostly related to the recent change in circumstances. These facts confirm that the standard sequence can be regarded as an example of human adaptation to circumstances.

These characteristics of the standard sequence agree with the general features of natural language as a social custom. (Though the above considerations were performed under the condition of fixed directions of stroke vectors, the direction itself is shown to follow the law of energy minimization by introducing direction dependence into the energy consumption per path length.)

4. The Structure of Kanji and Memorization

The small spreading of D with increasing strokes per character suggests that an additional factor exists. In this respect, a possible relevance to the facilitation of memorization, i.e. the effect of reducing information content, was merely pointed out in the previous report. 1) Here this factor is examined, in relation to the kanji structure, with the aid of information theory.

Information for writing kanji is assumed to be input to and output from the memory device of the cerebrum as a symbol string of kanji-forming stroke vectors. In the following calculations, only direction is encoded for simplicity, putting aside position and magnitude. Direction is quantized into 8 different directions according to traditional calligraphy. (The case simplified to 5 directions is also considered.)

At the first stage, the amount of information is calculated from the statistics of direction occurrence frequency, multi-grams, etc. over all the samples, and the possibility of data compression is analysed by the theories of Markov processes and encoding.

According to the results, the mean information content per stroke given by the transition probabilities of a Markov chain is found to be only about 10% less than that given by frequency statistics. A great reduction of the amount of information is therefore unpromising as long as a stroke is the string element, even if the transition probabilities are learned.

The next consideration is the compressibility of data by dividing the stroke string into sections, i.e. reduction by forming a kind of supersymbol, 2) viewed from the transinformation between strokes. At the same time, as a graded structure is observed in the stroke symbol string, the transinformation content between compound events and the possibility of compression are also calculated by setting each previous section as an encoding element. A close coupling relation is found between them; for example, the transinformation between compound elements of 3 strokes is about ten times as much as in the case of sectioning. These results suggest a possibility of compression to about a half or less.

Next, the reduction of information content for all the samples by selecting suitable sub-systems, such as the traditional radicals, as compound elements is examined. In this respect, however, besides our calculation, a similar study for another purpose has already been reported. 3) Although some different viewpoints are included, it is thought sufficient to cite that study here for this estimation. By their data, the whole amount of information can be reduced to as little as 40% of the amount in the case where the individual stroke is the element.

In view of these facts on information, the satisfaction of the law of energy minimization in the dynamics of writing is examined again in units of sub-systems. That is, by expressing each sub-system in terms of a cumulative vector, calculations similar to those of the previous section are performed. The results prove that the law is well satisfied: for instance, of about 450 samples composed of two sub-systems, each of which is itself a member of the samples, more than 99% are optimized in the vertical succession condition.

The system of kanji is, by origin, of a graded structure, and most sub-systems such as radicals have important symbolic functions, such as phonetic value and meaningful element. Therefore, for characters of compound structure, a stroke sequence determined by selecting the sequence of sub-systems has the effect of bringing these symbolic functions fully into play, by which the generation of additional redundancy in the human processor can be expected.

By the above considerations, it is clarified that for a simple character with fewer strokes the standard stroke sequence is determined by energy minimization in writing, and that a compound character with more strokes is of multiplex structure, where energy minimization is satisfied by selecting the sequence of memorization-facilitating sub-systems whose stroke sequences have previously been decided by energy minimization.

5. Conclusion

This investigation leads to the following conclusion: the traditional standard stroke sequence of kanji is thought to be a result of human experience tending toward the optimization of writing and of memorization in learning. The sequence, therefore, greatly affects the human ability of linguistic activities using kanji.

6. Acknowledgements

The author would like to express his cordial thanks both to Emerit. Prof. E. Sugata and to Prof. Y. Inuishi of Osaka University for their useful advice and encouragement throughout this investigation. Sincere appreciation is also expressed to Prof. S. Ijichi of our college for his support and facilities, and to my students for their experimental assistance.

References

1) T. Shimomura: A Scientific Approach to the Stroke Sequence of Chinese Characters, Trans. I.E.C. Japan, 58-D, 12, 756 (1975)
2) F. von Cube: Ueber ein Verfahren der mechanischen Didaktik, Gr.K.G. [...], 1 (1961)
3) T. Sakai, M. Nagao, and H. Terai: A Description of [...] Using Sub-patterns, Johoshori (Journ. I.P.S. Japan), 10, 5, 285 (1969)

Tab. 1 Average Rank Distance D (%)

                                         conditions
number of   number of                          succession
strokes     samples     singleness   diagonal   vertical   horizontal

   1            1           0           0          0           0
   2            5           0           0          0           0
   3           21          16.2        13.3       15.2        17.1
   4           34          18.5        11.0       15.1        20.2
   5           58          12.5         7.5       10.1        13.8
   6           73          10.4         5.3        8.7        11.7
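The rank distance D reported in Tab. 1 can be reproduced in outline with a brute-force sketch. It assumes, as in Section 3, that energy is proportional to pen-path length and that each stroke is a straight segment with a fixed direction. The coordinates, the optional `entry` and `exit` points standing in for the character-with-character combining vectors, and the rule that ties are counted in favour of the standard sequence are all assumptions made here for illustration.

```python
from itertools import permutations
from math import dist


def path_length(strokes, order, entry=None, exit=None):
    """Pen-path length: stroke vectors plus the combining vectors between them."""
    total = 0.0
    pos = entry  # where the pen comes from (None for a single character)
    for i in order:
        start, end = strokes[i]
        if pos is not None:
            total += dist(pos, start)  # combining vector (pen lifted)
        total += dist(start, end)      # stroke vector (pen down)
        pos = end
    if exit is not None:
        total += dist(pos, exit)       # move toward the next character
    return total


def rank_distance(strokes, standard, entry=None, exit=None):
    """D = (k - 1) / (n! - 1) * 100, with k the rank of the standard
    sequence counted from the lowest energy (ties share the best rank)."""
    energies = [path_length(strokes, p, entry, exit)
                for p in permutations(range(len(strokes)))]
    e_std = path_length(strokes, standard, entry, exit)
    k = 1 + sum(1 for e in energies if e < e_std - 1e-12)
    if len(energies) == 1:
        return 0.0  # a one-stroke character has only one sequence
    return 100.0 * (k - 1) / (len(energies) - 1)


# Hypothetical coordinates for a three-bar character (y grows downward).
san = [((0, 0), (1, 0)), ((0, 0.5), (1, 0.5)), ((0, 1), (1, 1))]
print(rank_distance(san, (0, 1, 2)))  # top-to-bottom order -> 0.0
```

Enumerating all n! sequences is feasible for the range covered by Tab. 1, since 6 strokes give only 720 sequences.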

[Figure: cumulative number of characters N (%), 0 to 100, plotted against rank distance D (%), 0 to 100, with one curve for each condition: singleness, and diagonal, vertical and horizontal directions of succession.]

Fig. 1 Cumulative Number of Characters and Rank Distance for 6-stroke Characters
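A curve of the kind shown in Fig. 1 can be derived from a list of per-character rank distances. The following is a minimal sketch; the function name and the sample values are hypothetical and do not reproduce the paper's data.

```python
def cumulative_percent(d_values, thresholds):
    """For each threshold t, the percentage N of characters with D <= t."""
    n = len(d_values)
    return [100.0 * sum(1 for d in d_values if d <= t) / n for t in thresholds]


# Hypothetical rank distances (%) for four characters.
print(cumulative_percent([0, 0, 10, 50], [0, 25, 50, 100]))
```

A curve that rises steeply near D = 0 means that most standard sequences lie at or close to the energy minimum.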
