Design of a Cloud Service for Learning Chinese Pronunciation
JOURNAL OF ELECTRONIC SCIENCE AND TECHNOLOGY, VOL. 12, NO. 1, MARCH 2014

Hsueh-Ting Chu, Wei-Shan Tsai, and Shao-Yu Lee

Abstract: With the rapid growth of cloud computing infrastructure, learning from cloud services has become more and more convenient for people worldwide. In order to integrate cloud computing technology with different e-learning platforms, including various mobile apps, Windows applications, and web-based applications, we developed our Chinese learning system “analytic Chinese helper” with a service-oriented architecture (SOA). Based on the new architecture, we designed and developed a cloud service for the e-learning of the Chinese language on the Internet as a convenient resource for foreign students, especially in the reading of Chinese texts. There are two Chinese phonetic systems: Pinyin and Zhuyin. Pinyin is the official Romanization of Chinese characters, and Zhuyin incorporates additional Bopomofo symbols which transcribe precise sounds of Chinese characters. The proposed analytic Chinese helper provides real-time annotations with Pinyin or Zhuyin symbols, and thereby the annotated articles can be used as e-learning objects in learning Chinese.

Index Terms: Chinese-learning, cloud service, e-learning, Pinyin, Zhuyin.

Manuscript received June 20, 2013; revised August 30, 2013.
H.-T. Chu is with the Department of Computer Science and Information Engineering, Asia University, Taichung 41354 (Corresponding author e-mail: [email protected]).
W.-S. Tsai is with the Department of Computer Science and Information Engineering, Asia University, Taichung 41354 ([email protected]).
S.-Y. Lee is with the Department of Foreign Languages and Literature, Asia University, Taichung 41354 (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://www.intl-jest.com.
Digital Object Identifier: 10.3969/j.issn.1674-862X.2014.01.011

1. Introduction

Recently, the study of the Chinese language has become more and more popular worldwide. For most foreigners, the most convenient way of learning Chinese is e-learning through a variety of Internet services, including websites and online chat rooms[1]−[4]. One of the major challenges in learning Chinese is the pronunciation of Chinese characters, because these characters are not written in a phonetic alphabet[5]. Even for native Chinese students in elementary schools, reading the Chinese articles in textbooks usually requires the help of additional phonetic symbols. There are two Chinese phonetic systems: Pinyin and Zhuyin. Pinyin is currently the official Romanization, which was published by the People's Republic of China (PRC) in 1958[6]. Before 1958, Zhuyin was the standardized phonetic system throughout China for Chinese (Mandarin) pronunciation. The Zhuyin system incorporates 37 Bopomofo symbols (Fig. 1) which transcribe precise sounds of Chinese characters among different Chinese dialects. At present, Zhuyin is still the major phonetic system in Taiwan’s education system for teaching and learning Chinese. At the same time, Pinyin is also used on traffic signs and maps in Taiwan.

Fig. 1. Bopomofo symbols.

Moreover, learning Chinese characters is also difficult for foreigners because there are two different writing systems of Chinese characters: Traditional Chinese and Simplified Chinese. Simplified Chinese characters resulted from the simplification of Traditional Chinese characters by the PRC government in 1956. However, Traditional Chinese characters are still being used in Hong Kong and Taiwan as well as in ancient Chinese books.

Table 1 lists five different Chinese representations. The first row is the sentence in Simplified Chinese and the second row is in Traditional Chinese. Both sentences are difficult for foreign students to pronounce. Therefore, it is helpful to provide additional phonetic annotation along with the sentences. There are three common types of Chinese phonetic annotation (rows 3~5). The first type uses Pinyin symbols with tone marks. Tones, which are variations of pitch within a syllable, are an important part of Chinese. There are basically four tones in Mandarin Chinese. Sometimes, people use the numbers 1~4 for tones (row 4). The last row is the equivalent phonetic representation using the Zhuyin system. In addition to the Bopomofo symbols, the three tone symbols “ˊ, ˇ, ˋ” are used in the Zhuyin system. In summary, the pronunciation of Chinese characters is not easy for foreigners. For this reason, we developed the cloud service to help foreign students learn Chinese through the Internet[7].

Table 1: Different Chinese representations for the sentence “Asia University welcomes you.”
Chinese words/annotations      Transcriptions
Simplified Chinese words       亚洲大学欢迎你
Traditional Chinese words      亞洲大學歡迎你
Pinyin with tone marks         yàzhōu dà xué huān yíng nǐ
Pinyin with tone numbers       ya4zhou1 da4 xue2 huan1 ying2 ni3
Zhuyin with Bopomofo*          ㄧㄚˋ ㄓㄡ ㄉㄚˋ ㄒㄩㄝˊ ㄏㄨㄢ ㄧㄥˊ ㄋㄧˇ
*In the Zhuyin system, the first tone is without a tone symbol.

2. Two Chinese Systems

China has become the second largest economic entity in the world. People around the world learn Chinese for business or cultural reasons. Hundreds of Confucius institutes (Fig. 2) have been opened in dozens of countries for the study of Chinese[8]. Of course, these institutes use Simplified Chinese with the Pinyin phonetic system. However, people will find that there is another Chinese system if they visit Hong Kong or Taiwan. It is the original Chinese system, which uses traditional (non-simplified) characters and additional Bopomofo symbols in learning systems such as the Chinese Language Education Center at Asia University (Fig. 3). Besides, there are ten Chinese centers in different universities around Taiwan (http://english.moe.gov.tw/ct.asp?xItem=9693&CtNode=417). Most of the Chinese centers in Taiwan introduce the Zhuyin phonetic system to students. As a result, it is useful for a foreign student to know both Chinese systems if he/she learns Chinese in Taiwan.

Fig. 2. Website of Confucius Institute (CI), which is supported by the Chinese government. http://www.chinese.cn/

3. Design of Cloud Service for Learning Chinese Pronunciation

In order to integrate today's heterogeneous application types, including web-based applications, Windows-based applications, and various mobile apps as shown in Fig. 4, we use web services to integrate remote access in a service-oriented architecture (SOA)[9],[10]. This is also the common approach of many software as a service (SaaS) cloud services. We consider Chinese e-learning as a set of services built with cloud computing techniques. The cloud services were implemented as Microsoft .Net web services, which are based on the simple object access protocol (SOAP). The web services are deployed on an internet information services (IIS) server. The web-based interface of the analytic Chinese helper (ACH) is developed with PHP (hypertext preprocessor) programs on an Apache server (Fig. 5). The ACH service provides both Pinyin and Zhuyin phonetic annotations for either simplified or traditional Chinese texts. The PHP programs use the SoapClient class to communicate with the SOA services through XML-based messages (Fig. 6 and Fig. 7).

Fig. 4. Concept of education cloud for the access of learning contents from different terminal devices. (Diagram labels: Web app, Mobile app, Win app, Education Cloud.)

Fig. 5. Website of proposed cloud service: analytic Chinese Helper (ACH) http://aiplab.net/ach/.
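The XML-based message exchange described in Section 3 can be illustrated with a small, language-neutral sketch in Python. It is not the authors' implementation. The operation name AchDefinition, the parameter Org, and the result element AchDefinitionResult are taken from Fig. 7; the service namespace URI, the function names, and the sample text are assumptions made only for illustration, since the paper does not give the real namespace.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/ach/"  # hypothetical: the real service namespace is not given

ET.register_namespace("soap", SOAP_NS)


def build_request(operation: str, org_text: str) -> bytes:
    """Build a SOAP 1.1 request whose body carries one <Org> parameter."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    org = ET.SubElement(call, f"{{{SERVICE_NS}}}Org")
    org.text = org_text
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def read_result(response_xml: bytes, operation: str) -> str:
    """Extract the text of <{operation}Result> from a SOAP response, or ''."""
    root = ET.fromstring(response_xml)
    node = root.find(f".//{{{SERVICE_NS}}}{operation}Result")
    return node.text if node is not None and node.text else ""


if __name__ == "__main__":
    # Print the request that a client would POST to the service endpoint.
    print(build_request("AchDefinition", "亞洲大學歡迎你").decode("utf-8"))
```

A client library such as PHP's SoapClient builds and parses these envelopes automatically from the service description, which is why the calling code in Fig. 7 only names the operation and its parameters.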
Fig. 3. Website of the Chinese Language Education Center at Asia University. http://clec.asia.edu.tw

    <?xml version="1.0" encoding="utf-8"?>
    <wsdl:definitions xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/">
      <wsdl:service name="achws">
        <wsdl:port name="achwsSoap" binding="tns:achwsSoap">
          <soap:address location="http://127.0.0.1:1266/ach/achws.asmx"/>
        </wsdl:port>
      </wsdl:service>
    </wsdl:definitions>

Fig. 6. XML file of ACH web services for service calling from heterogeneous applications.

    // Create a SOAP client from the local service description (Fig. 6).
    $client = new SoapClient("ws/achws.xml");
    // The input text is passed to the service as the 'Org' parameter.
    $params = array('Org' => $mytxt);
    switch ($tab)
    {
      case 0:
        // Show the text without calling the service.
        $displaytext = $mytxt;
        break;
      case 1:
        // Call the AchDefinition operation and read its result field.
        $wsresult = $client->__soapCall('AchDefinition', array('parameters' => $params));
        $displaytext = $wsresult->AchDefinitionResult;
        break;
      case 2:
        …
    }

Fig. 7. PHP codes to call the ACH web services.

The proposed cloud service, analytic Chinese helper, provides automatic annotation of the speech sounds of Chinese text to help beginners learn Chinese regardless of whether they use Pinyin or Zhuyin (Fig. 5). The user can paste Chinese text into the website and can then easily switch the interface between tabs. There are four tabs for different annotations. The first annotation provides explanations of Chinese words. A balloon tip will be displayed with the meaning of a Chinese word if the user moves the mouse near the word. The other three tabs are for the pronunciation of Chinese text.

4.1 ST-Mixed Dictionary

The Chinese service is driven by an in-memory Chinese dictionary. The dictionary inherits most of its entries of Chinese words from CC-CEDICT[12]. However, CC-CEDICT is unable to support the conversion between Traditional Chinese and Simplified Chinese vocabularies. We have built the mappings between Traditional Chinese and Simplified Chinese from the Chinese rules in the MediaWiki project[13]. Each Chinese word has one or two entries in the traditional and simplified mixed dictionary (ST-mixed dictionary). If the characters of the Chinese word are different in the traditional and simplified systems, there are two entries. Each entry of the dictionary has predefined attributes such as the Pinyin string, the Zhuyin string, and the filename of the sound file.

4.2 Segmentation Engine

The input Chinese text from the web interface goes to the background SOAP service, which consists of the segmentation and annotation functions. The segmentation engine is implemented by a longest prefix match (LPM) algorithm[14]. The segmented words are then converted into Word IDs of the ST-mixed dictionary.
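The longest prefix match segmentation of Section 4.2, together with the lookup of Word IDs in the ST-mixed dictionary of Section 4.1, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' engine: the entries, Word IDs, and sound filenames are invented toy data (the Pinyin and Zhuyin strings follow Table 1), and a plain hash map with a bounded word length stands in for whatever data structure the real service uses.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Entry:
    word_id: int
    word: str
    pinyin: str
    zhuyin: str
    sound_file: str  # hypothetical filename, for illustration only


# Toy ST-mixed dictionary: a word whose simplified and traditional forms
# differ has two entries; a word with identical forms has one.
ENTRIES = [
    Entry(1, "亚洲", "ya4 zhou1", "ㄧㄚˋ ㄓㄡ", "yazhou.mp3"),
    Entry(2, "亞洲", "ya4 zhou1", "ㄧㄚˋ ㄓㄡ", "yazhou.mp3"),
    Entry(3, "大学", "da4 xue2", "ㄉㄚˋ ㄒㄩㄝˊ", "daxue.mp3"),
    Entry(4, "大學", "da4 xue2", "ㄉㄚˋ ㄒㄩㄝˊ", "daxue.mp3"),
    Entry(5, "大", "da4", "ㄉㄚˋ", "da.mp3"),
    Entry(6, "欢迎", "huan1 ying2", "ㄏㄨㄢ ㄧㄥˊ", "huanying.mp3"),
    Entry(7, "歡迎", "huan1 ying2", "ㄏㄨㄢ ㄧㄥˊ", "huanying.mp3"),
    Entry(8, "你", "ni3", "ㄋㄧˇ", "ni.mp3"),
]
DICTIONARY = {entry.word: entry for entry in ENTRIES}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def segment(text: str) -> List[Tuple[str, Optional[Entry]]]:
    """Greedy longest prefix match: at each position take the longest
    dictionary word that starts there; unknown characters pass through
    one at a time with no entry."""
    result: List[Tuple[str, Optional[Entry]]] = []
    i = 0
    while i < len(text):
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            entry = DICTIONARY.get(candidate)
            if entry is not None:
                result.append((candidate, entry))
                i += length
                break
        else:
            # No dictionary word starts here; keep the character as-is.
            result.append((text[i], None))
            i += 1
    return result


if __name__ == "__main__":
    for token, entry in segment("亞洲大學歡迎你"):
        print(token, entry.word_id if entry else None, entry.pinyin if entry else "")
```

With this toy dictionary, the traditional sentence from Table 1 is split into 亞洲, 大學, 歡迎, and 你, with Word IDs 2, 4, 7, and 8. Note that 大學 is preferred over the shorter entry 大 because the longest candidate is tried first.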