COMPUTER-BASED LEARNING OF BASIC JAPANESE Teiji Furugori—The University of Electro-Communications, Japan

Abstract: Described here is a computer-based system of learning Japanese. It offers three types of problems in learning the language; basic sentential pattern practices, counting drills, and - mixture writing exercises. The system is generative. first generate Japanese sentences using the English alphabet, .., phonetic representation. We then create from the sentences a sequence of problems for the students of Japanese practice. In the current implementation, the kana and kanji writing problem seems to serve the educational purpose most effectively, as it is crucial to master Japanese and, at the same time, seems to be the most fun, as the students like it best.

Keywords: Language learning, Japanese, System, Sentence generation, Kana/kanji characters

INTRODUCTION auxiliary verbs, and adjectivals, how these forms are combined to express Japanese is said to be a difficult certain modalities (three modes of language to learn. However, its expressing the same thing: humble, basic sentential structures are simple polite, and honorific ways), and and pronunciations are rather easy, multiple ways of writing the same as it has fewer phonemes than most linguistic sound sequence: kana, other languages. The difficulties lie kanji, and their mixed forms. in its highly inflected forms of verbs,

CALICO Journal, Volume 8 Number 2 7 This paper describes a language or she quits the exercise. Below, learning system for students of we show kinds of problems the Japanese. The system tries to serve system covers at this writing. educational aims1 by offering drills to practice basic sentential patterns Sentence Patterns. (in word inflections and counting The system generates a basic problems), and the standard sentence, and asks the student to Japanese writings in kana and kanji transform it to a specified form. For mixed forms. instance, it may generate the following sequence: OVERVIEW OF THE SYSTEM watashi gakkou iku. The control structure of the system is (I got to school.) simple. It first shows a menu from Past: which the student chooses a problem area to study. It then generates a The student is supposed to change sentence and creates a problem, in the present tense to the past: the chosen area, for the student to practice. Figure 1 shows the control watashi wa gakkou ni itta. structure. In the outer loop, the (I went to school.) student practices problems from When the answer is given, the different areas. In the inner loop, the system acknowledges it or gives the system alternates generating a right answer before going t sentence and creating a problem in generate the next problem. the area the student has chosen, until

Start

Menu

End

Sentence Generation

Patterns or Counting Kanji Writing

FIGURE 1 Control Structure

CALICO Journal, Volume 8 Number 2 8 the student is to write them down in Counting Things. his or her notebook. Then, by hitting The system generates a sentence that the return key, the system gives the contains a countable noun, and then right answer, which for this example creates a problem in which the is: student is to supply a suffix to the number that expresses the quantity of the countable noun. For example: SENTENCE GENERATION koko ni enpitsu ga 3 ( ) arimasu. Some 30 years ago, the computer (Here are 3 pencils.) seemed to generate sentences at Answer: will.2 However, it turned out that it wasn't really generating, but The student should supply "bon" as transforming sentences, using simple the answer for this problem; "hon" pattern matchings. The original happens to be the suffix for counting sentences were supplied by the user pencils, but the answer to the who wanted to talk to the computer. problem above must be "bon," for How far have we come since then? when "hon" is attached to the Now, we may be able to deal with quantity 3, a phonetic change for sentences in elegant ways in "hon" to "bon" occurs, i.e., the computational grammars, language pronunciation is changed. generation systems, and cognitive science (see footnotes 3, 4 and 5). Writing Practices. But we still have a problem of In this drill, the student tries to learn producing a range of sentences that Japanese writing with kanji and kana are also semantically realistic in mixed forms— difficult, yet an daily life. When in an example a important exercise in learning to grammar is able to produce: write standard Japanese. The system first changes the sentence generated watachi wa depa-to e kaimono ni iku. in English alphabet to kana form and (I go shopping to a department then puts some portions in store.) parenthesis. The student is supposed to change the it will also generate: parenthesized portions to kanji characters: watashi wa honya e kaimono ni iku. (I go shopping to a bookstore.) koko ni hon ga 2satsu arimasu. (Here are two books.) whish is a bit funny, and:

watashi wa yuugijyou e kaimono ni iku. Since the standard keyboard without (I go shopping to a game center.) word processing facility is not capable of inputting kanji characters,

CALICO Journal, Volume 8 Number 2 9 sono hana wa akai. which is unacceptable. But then: (That flower is red.) watashi wa { depa-to Here, yomu is a verb, watashi (1) is an / honya agent, and wa is the agent marker; / yuugijyou } e shuukin ni iku. shinbun (newspaper) is an object, and (I got to a department is the object marker; akai (red) is store/bookstore/game center to an adjectival, hana (flower) is an collect the money.) object, and wa is the object marker. Since the role of a noun phrase is are all acceptable sentences. We expressed explicitly, the order of have not yet seen any elegant noun phrases in a sentence may be solutions for controlling these arbitrary most the time in Japanese. problems. Thus, we can say:

Our system generates sentences from shinbun wo watashi wa yomu. which fresh drill problems are (I read a newspaper.) produced. We use sentential patterns for the sentence generation Sentences in our system are because they are correct and realistic, generated using sentential patterns both semantically and syntactically. for common verbs and adjectivals. In addition to these patterns, we let Sentential Patterns. the system stock other basic patterns Basic Japanese sentences are formed we commonly use in daily life. A with a number of noun phrases, few examples of each type are followed by a verb or an adjectival. shown below.

[NP, ...] V. We use a kind of BNF notation to [NP, ...] ADJ. express the patterns. The elements enclosed by two brackets are Each noun phrase takes a role or case optional, and those enclosed by two associated with the particular state braces are alternative terminals. The or action the verb or adjectival description between two slashes expresses. The cases are marked indicates the restrictions imposed for explicitly by particles or applying the rule. "..." in the right postpositions attached to the nouns. hand of a rule indicates "etc." The Consider the following sentences: non-terminals not rewritten further are word categories in the dictionary watashi wa shinbun o yomu. we use. (I read a newspaper.)

CALICO Journal, Volume 8 Number 2 10 Verb Patterns iku [ wa [ kara] {e | ni} [ de] [ to] [

CALICO Journal, Volume 8 Number 2 11 Adjectival Patterns. muzukashii [ wa [] ]] /second not equal to the first/ muzukashii [ <**1> wa] <**1>:= hon | ji | mondi | toi takai [ <**2> wa [<**2> yori]] <**2>:= | /choose both of <**2> from the same category and the second <**2> not equal to the first/

Basic Patterns.

[ wa] <**3> desu [] <**3>:= | |

<**3> wa desu [] desu [

CALICO Journal, Volume 8 Number 2 12 kare wa eki ni aruite iku. A sentence is generated by picking Yamadasan wa kyoukai ni kinou itta. up one of these patterns and (past) selecting appropriate terminals kanojo wa honya ni kuruma de (words or phrases). The noun Hanakosan to gog 2ji 30pun ni iku. phrases may be commuted. Thus, etc. examples from the first pattern for iku include: Dictionary. We use a dictionary that contains watashi wa gakkou e iku. words (and some phrases) according anata wa gakkou e 3ji ni iku. to the following categories: watashi wa eki ni yamadasan to iku.

:= | | | | | ... := | | | | | | | | | | | | | | | | ... := | | | | ... := |

CALICO Journal, Volume 8 Number 2 13 Each verb entry contains a verb with word, and its kana representation, its end form, followed by its stem, kanji representation, modifiers it may type of inflections, kana take, and number suffix, if it is representation, kanji representation, countable. An adverbial entry and typical modifiers it may take. contains a word, followed by its kana Japanese verbs have five different representation, kanji representation, types of inflecting patterns, each and the tense it indicates, if any. taking six different linking forms to other words (mostly to auxiliary We separated from verbs). For example, the verb iku other nouns just for convenience. changes its forms to: An entry for the person's name and its suffix contains a word, followed watashi wa gakkou e... by its kana representation, kanji (I go to school.) representation, and gender. Figures 2a-2d show a few dictionary entries (a) i-ka, i- in each category. (b) i- (c) i- (end form) PROBLEM GENERATION (d) i-ku (e) i- Word Inflections. (f) i-ke (imperative form) Verbs, adjectivals, and auxiliary verbs are inflectible words in The changes occur going through an Japanese, and take one of the six entire line (ka, ki, ku, ke, and ko) of a forms when combined with other column of the 50 basic sounds (ten words in a sentence. Verbs take five columns by five lines). We classify different inflecting patterns; the words that have this inflecting adjectivals take two; and auxiliary pattern as V5. verbs take ten or , depending upon how they are classified. Any word An adjectival's entry contains an belonging to the inflectables changes adjectival with its end form followed its form in the six different ways that by it kana representation and kanji are shown in Table 1. A difficult representation. Adjectivals have two challenge in forming sentences is to different inflecting patterns. See use auxiliary verbs with the same Table 1, V5 for examples of a verbal and/or different category of words, inflection pattern, and A1 for an and to inflect the inflectables adjectival inflection pattern. We properly. note here that there are verbs of V5 that change forms through other Once a basic sentence is generated, columns in the 50 basic sounds. we can add one or more auxiliary verbs to the verb or the adjectival in Nouns in Japanese would not change it and change its modalities. A verb forms. A noun entry contains a having the inflection pattern of V5

CALICO Journal, Volume 8 Number 2 14 Col. 1 2 3 4 5 6 Example V5 ka ka ki ku ku ke k, ko i-ku ga ga gi gu gu ge ge, go oyu-gu sa su se, so o-su ta tsu te, to -tsu na ni nu ne ne, no shi-nu ba ba bi bu bu be be, bo to-bu ma mi mu me, no-mu ra ru re, no-ru wa, a wa i u u e e, o suku-u A1 karo kat, ku i i kere tadashi-i TABLE 1 Inflection Patterns of Verbs and Adjectivals

CALICO Journal, Volume 8 Number 2 15 CALICO Journal, Volume 8 Number 2 16 may take the auxiliary verbs, u, nai, Other inflection patterns of verbs tai, reru, ta, and masu, and change the may take the same or different types modalities of the sentence, as is of auxiliary verbs to express the shown below in the case of iku. modalities. We can express other modalities with other auxiliary watashi wa... verbs, and complex modalities with certain combinations and defined (a) iku+u —> iko-u. order of more than one auxiliary (I will go.) (will) verbs. Some examples of complex (b) iku+nai —> ika-nai. modalities are: (I do not go.) (negation) (c) iku+tai —> iki-tai. watishi wa... (I wish to go.) (wish) (d) iku+reru —> ka-reru. (g) iku+masu+nu —> iki-mase-. (I can go.) (can) (I do not go.) (polite negation) (e) iku+ta —> it-ta. (h) iku+nai+ta —> ika-nakat-ta. (I went.) (past tense) (I did not go.) (negation+past) (f) iku+masu —> iki-masu. (i) iku+masu+nu+desu+ta —> (I go.) (polite expression) iki+mase+n+deshi+ta. (I did not go.) (polite+negation+past)

CALICO Journal, Volume 8 Number 2 17 A much more involved example is: The system generates a basic sentence and then chooses a kare wa... modality pattern. All the information needed to connect the (j) iku+tai+ta+desu+u —> iki-takat-ta- words in the pattern and other desho-u. inflectables is found in the dictionary ((I assume) he might have wished and the tables. Take a simple to go.) example:

Table 2 shows a part of modality kono mondai wa muzukashii. patterns we keep for our system; (This problem is difficult.) Figure 3 shows a part of auxiliary (past): verbs, and Table 3 shows examples of inflection forms of two auxiliary Table 2 shows that we should use ta verbs. In Figure 3 the information for the past tense. The dictionary associated with each auxiliary verb is entry for ta tells us that this word is its kana representation and type of connectable to the second form of the word it may take with the specific inflectables. From the entry for inflection form, if the word is muzukashii, we understand that the inflectable (this part is repeatable, if adjectival belongs to the inflection the auxiliary takes more than one type A1. Then, Table 1 indicates that type of words). For example, u takes the second form of the adjectival is verb of V5 and the adjectivals of A1 formed by adding kat to the stem of types, both with the first inflection the adjectival. The result of the form (e.g., iku+u). Other transformation therefore is: unexplained symbols used here represent the following: I kono mondai wa muzukashikatta. (Inflectables), N (Nouns), VS (verbs with inflecting pattern S), S (Stem).

CALICO Journal, Volume 8 Number 2 18 Types and Order of Aux. Verbs Modality 1234 1 will u, you 2 ngation nai, nu 3 wish tai 4 can reru, rareru 5 past ta 6 polite desu, masu 7 6, 2 masu nu 8 2, 5 nai ta 9 6, 2, 5 masu nu desu ta TABLE 2 Modality Patterns

127456 nai nakaro nakat, nai nai nakere naku masu mase, mashi masu masu masure mase masho TABLE 3 Inflection Forms of Auxiliary Verbs

CALICO Journal, Volume 8 Number 2 19 Counting Problems. Suppose we have a basic sentence: Counting in Japanese is troublesome, because we need to add different kyoushitu ni enpitsu ga 3(san)bon aru types of suffixes to the quantifier. (There are 3 pencils in the For instance, in English, we say: classroom.) Here are two pencils. In Japanese, we have to add a suffix to the The system eliminates (san)bon from number. the sentence, and gives the problem as: koko ni 3bon no enpitsu ga aru. kyoushitsu ni enpitsu ga 3() aru. We may use many different suffixes to count things. Figure 4 shows Here, other sentences like: typical ones, each with kana representation and kanji kyoushitsu ni knopyu-ta ga 3dai aru representation. (There are 3 computers in the classroom.) The system generates counting niwa ni tori ga 5wa iru problems using the sentential (There are 5 birds in the garden.) patterns such as: (Top figure on facing page.) are generated by consulting the dictionary and finding that konpyu-ta uses the suffix -dai, and tori uses the suffix -wa. The problem generation is automatic once we get a sentence that includes a specific number associated with the countable noun.

Kana and Kanji Representations. Writing Japanese with kana and kanji mixed forms is essential, but difficult to learn. When the system generates the sentence:

yamada san wa eki ni iku. (Yamada goes to the train station.)

it picks up information from the dictionary: (Bottom figure on the facing page.)

CALICO Journal, Volume 8 Number 2 20 aru [[<***1> ni] <***2> ga []] := | | | | | /choose according to the second variable / <***1>:= heya | kyoushitsu | ... <***2>:= enpitsu | hon | kami | knotpy-ta | kaban | ... iru [[<***3> ni] <***4> ga []] <***3>:= doubutsuen | niwa | yama | soto | ... <***3>:= | | | ... := 1(ip)pon | 2(ni)hon | 3(san)bon | 4(yon)hon | 5(go)hon | 6(rop)pon | 7(nana)hon | 8(hap)pon | 9(kyuu)hon | 10(jyup)pon := 1(hito)tsu | 2(futa)tsu | 3(mit)tsu | 4(yot)tsu | 5(itsu)tsu | 6(mut)tsu | 7(nana)tsu | 8(yst)tsu | 9(kokono)tsu | 10(tou) := nin /when takes 1, 2, or 4, use 1(hito)ri, 2(fut)ri, 4()nin, respectively/ := 1(ip)piki | 2(ni)hiki | 3(san)biki | 4(yon)hiki | 5(go)hiki | 6(rop)piki | 7(nana)hiki | 8(hap)piki | 9(kyuu)hiki | 10(jyup)piki

CALICO Journal, Volume 8 Number 2 21 sono hito kuru. Using this information, the system (That person comes (here).) changes the alphabetical form to (past)+(negation): kinakatta. ***wrong: konokatta***

In the last example, the second inflection form of nai is nakat (see parenthesizing kanji portions. Then Table 2), and nai takes the first it asks the student to write the inflection forms of the inflectables. portions in kanji characters. We kuru is a verb of VK type, and its first impose a restriction here that the inflection form is ko, while not student is to write them down on a exemplified in Table 1. sheet of paper, since, as was mentioned before, the standard Counting. keyboard would not allow us to input kanji characters. heya ni enpitsu ga 3( ) aru. (There are 3 pencils in the room.) EXAMPLES answer: sanbon We show a sequence of examples ***right*** taken from drill sessions. They are rather simple, as we want to give heya ni tsukue ga 2( ) aru. problems that may be understood (There are 2 desks in the room.) within the sentential patterns and answer: nidai the information supplied in Figures, ***right*** Tables, and the dictionary in this paper. niwa ni gakusei ga 2( ) iru. (Two students are in the garden.) Inflections. answer: ninin ***wrong: futari*** watashi wa gakkou ni iku. (I go to school.) heya ni kaban ga 3( ) aru. (polite): ikimasu (There are two bags in the room.) ***right*** answer: sanko ***wrong: mittsu*** watashi wa gakkou ni iku. (I go to school.) A thing may be counted in more (negation): ikanai than one way. For instance, tsu may ***right*** be used in the counting of bags in the last example. This is not a real watashi wa 9 ji ni okiru. problem, however, since more than (I get up at 9:00.) one suffix may be included in a (will): okoyou dictionary entry. ***right***

CALICO Journal, Volume 8 Number 2 22 be asked to write the same word in Kana and Kanji Writings. kanji over and over again. We are able to avoid this problem if we keep kyou wa getsuyou bi desu. a history of the kanji words being (It is Monday today.) learned in each session.

CONCLUSION

Currently, we use about 250 verbs and adjectivals for the sentential enpitsu ga 3bon aru. patterns, and slightly more than one (There are 3 pencils.) thousand words and phrases in the dictionary. The program is written in LISP on a microcomputer. The students like to practice kana and kanji mixed writings the most and it turned out that this practice is fun and educational for the Japanese watashi wa eki ni kuruma de iku. students, too. (I want to go to the station by car.) We have found that building a language learning system is not an easy task. When we devise such a system, a lot of effort has to be put into the project, regardless of which watachi wa kyoukai ni kinou itta. direction we take. Our experience (Yamada went to the church tells us that we cannot implement yesterday.) our system too elegantly, for it diminishes practicality. On the other hand, we can not afford to devise an "ad hoc" system, either, for then, no expansions are possible.

A problem nicely solved here is the The practicality and extensiveness of homophonic representations of our system are assured by using different words. For instance, in the sentential patterns and generating last example, kinou may be replaced sentences. With these mechanisms at hand, we can get realistic by ... when no sentences that are in textbooks for restrictions are imposed. It should beginning students of Japanese6 and take a time adverb in this sentential fresh problems to practice. We are pattern, however, and only one kinou also able to make the system more is acceptable, which is versatile by defining new types of problems which work on the We observed that sometimes the sentences generated. student found it very frustrating to

CALICO Journal, Volume 8 Number 2 23 REFERENCES

1 Cameron, K. 1989. Computer-Assisted Language Learning: Program Structure and Principles; Ablex, New Jersy. 2 Weizenbaum, J. 1966. “ELIZA”, CACM, Vol. 9, No. 1. 3 Woods, W.A. 1970. Transition Network Grammars for Natural Language Analysis, CACM, Vol. 13, No. 10. 4 McDonald, D.D. and Bolk, L. (eds.) 1988. Natural Language Generation Systems, Springer-Verlag. 5 Waltz, D.L. and Pollack, J.B. 1985. “Massively Parallel Parsing: A Strongly Interactive Model of Natural Language Interpretation, Cognitive Science, Vol. 9, No. 1. 6 Suzuki, S. and Kawas, I. 1985. Nihongo Shoho (Basic Japanese), The Japan Foundation, Tokyo.

Author's Address

Teiji Furugori Dept. of Computer Science The University of Electro-Communications Chofu, Tokyo, Japan

CALICO Journal, Volume 8 Number 2 24