
2021 International Conference on Education, Humanity and Language, Art (EHLA 2021) ISBN: 978-1-60595-137-9

The Digital Protection and Inheritance of Daur Language Under Big Data

Yi-duo BIAN1,a,*

1College of Foreign Languages, Zhaotong University, Zhaotong, ,

[email protected] *corresponding author

Keywords: Daur nationality, Language, Big data, Protection, Inheritance.

Abstract. China is composed of 56 ethnic groups: the Han and 55 ethnic minorities. Minority culture is an important part of Chinese culture, and the languages of the ethnic minorities, which have a long history of development, are an important part of that culture. The Daur are one of the traditional nationalities of the north, with their own distinctive history and a unique culture, so the protection and development of the Daur language has become a prominent topic among cultural protection workers. Against the background of big data, this paper puts forward a scheme for the digital protection and inheritance of the Daur language, which may provide some reference for its protection and inheritance.

1. Introduction

A national language carries the cultural history of a nation. The disappearance of a national language means the disappearance of a national culture from human memory and the extinction of that culture. The Daur language is an endangered language. Its inheritance requires not only cultural awareness but also the joint efforts of researchers to inherit and protect it step by step. The Yimin Daur Association provides conditions for the protection of the language and culture, and modern technology makes it possible to preserve the Daur language over the long term. In recent years, with economic development and the popularization of Putonghua, the decline in the use of Daur among the younger generation has become worrying. It is therefore necessary to record and protect the Daur language in its original form, so as to contribute to the inheritance of the national language and culture.

2. The Analysis of the Research Status and Development Trend of Daur Language at Home and Abroad

The name Daur comes from the name of a son of Qishou Khan, the ancestor of the Qidan (Khitan) nationality. From the general history of the Daur nationality in China, we can see that there is a kinship between Daur language and . Since the 20th century, scholars who study the sound and meaning of the small Khitan characters have interpreted a certain number of Khitan words. The results suggest that the Daur language belongs to the Khitan branch. Unfortunately, the Daur language has no surviving written form and is passed on only by word of mouth. The history of world civilization is the history of the development of nations, and the Daur nationality has had a strong influence in that historical evolution. The protection and inheritance of the Daur language is therefore particularly important, and it is also the focus of Daur culture research.

3. The Necessity of Daur Language Research

Daur is one of the 56 ethnic groups in China, mainly distributed in Molidawa Daur Autonomous Banner of Inner Autonomous , meilisdaur of City, , and Ewenki Autonomous Banner; a small number live in of Autonomous Region, Province, etc. Daur has its own language, which belongs to the Altaic language family. Due to historical reasons, its original script has been lost. Over hundreds of years of communication and contact with other nationalities, the Daur language has passed through the following historical stages.

(1) Monolingual. In the late Ming and early , Daur society was self-sufficient, and the Daur people used their own language to communicate.
(2) Da-Man bilingual. From the middle to the end of the Qing Dynasty, under the rule of the Qing government, the two ethnic groups had extensive cultural contact, and bilingualism in their two languages emerged.
(3) Da-Meng bilingual. In the early Republic of China, the Daur and the Mongolians were classified as one nation, so the languages of the two nationalities came to show a high degree of similarity.
(4) Multilingual. During the period of , Japan carried out Japanese education in schools at all levels in the Daur region, so a Da-Japanese-Han-Meng multilingual teaching mode took shape.
(5) Da-Chinese bilingual. After the founding of new China, many Daur people learned to use Chinese and became bilingual and bicultural.
(6) Non-native language. Now, with the popularity of Chinese, some people can no longer speak Daur. In particular, many Daur people born in the 1980s, 1990s and 2000s cannot speak the Daur language and have become monolingual (Chinese) speakers.

With the development of the social economy, population mobility is frequent. People move away from ethnic areas, live in scattered communities and intermarry with other ethnic groups, and the language environment begins to change. As a result, the proportion of native language users in the total population decreases, the scope of native language use shrinks, and the number of monolingual (Chinese) speakers increases. Sorting out and protecting the Daur language has therefore become an urgent issue[1].

4. Protect the Language of the Daur Nationality under the Background of Big Data

The Daur nationality has set up an association for ethnic studies, which has carried out many meaningful activities to protect ethnic culture since its establishment. Examples include regular song and dance competitions, Daur costume design, friendship, mutual-help and public welfare activities among ethnic compatriots, national festivals, and other activities that publicize and teach national culture. The association has become a base for, and an important part of, the protection of Daur culture. At present, the Daur language is endangered. Apart from ethnic settlement patterns, scattered living, intermarriage and changes in the language environment, the most important reason is that the Daur language has not been protected and inherited by modern means. In the context of big data, information is everywhere and digital survival has become a feature of the times; the use of modern scientific and technological means makes the long-term preservation and inheritance of the Daur language possible[2-3]. In the process of language collation, we can use literature review, interviews, questionnaires, observation and other research methods. Based on an investigation and analysis of the current situation of the Daur language, this paper reveals the main problems in its inheritance, puts forward an implementation plan for establishing a Daur language corpus under big data, and explores the inheritance of Daur culture under big data.

4.1. Research on Daur Language

Ewenki Autonomous Banner, founded in 1958, is one of the three ethnic autonomous banners in China. It is located in the east of Autonomous Region, the west of Daxinganling and the southeast of grassland. It borders city in the east, Zhalantun city and Keyouqian banner of Xing'an League in the south, xinbalhuzuo banner in the west, and Hailaer district and chenbalhu banner in the north. The banner governs 4 towns, 1 township and 5 sumu, and its seat is Bayantohai town. There are 25 ethnic groups with a total population of 144000. Among them, the minority population is 58843, accounting for 40.8% of the total population. There are 11193 Ewenki, accounting for 7.8% of the total population; 27809 Mongolian, accounting for 19.3%; and 14239 Daur, accounting for 9.9%. Following the division of Daur bilingual use into zones, this paper investigates the use of the Daur language in Ewenki Banner.

Bilingual formation zone. It is mainly distributed in Bamu and Daur village. The Daur people here are very proficient in the Daur language and relatively poor in Chinese. Bilingual use is mainly limited to primary school teachers, students and individual village cadres.

Bilingual development zone. More than half of the Daur people in Ewenki Autonomous Banner are distributed in Bayantala Daur township. 20% of them live in remote mountainous areas with inconvenient transportation, and about 30% of the Daur people are distributed in township centers with convenient transportation. Most of them are proficient in both Daur and Chinese; their level of Daur is high, but their Chinese is average.

Bilingual mature zone. About one third of the Daur people in Ewenki Autonomous Banner are distributed in Bayantohai town. Daur people in this kind of area can use both Daur and Chinese very skillfully: Daur in daily family life and Chinese in communication with outsiders.

Bilingual decline zone. About 5% of the Daur people in Mo Autonomous Banner live in the towns where the Han people live: Dayan Town, Yiminhe town and Honghuaerji town. Because they have long lived directly alongside the Han people and are few in number, their daily language of communication is mainly Chinese, and the scope in which Daur is used is very narrow. Bilingual use in such areas is trending towards monolingualism in Chinese, and the area has become a zone of declining bilingualism.

To understand the current use of the Daur language in Yimin, this paper conducts a questionnaire survey and key interviews with members of the Daur Association. The questionnaire survey mainly focuses on language mastery. The survey of language acquisition mainly covers acquisition time, acquisition method, degree of mastery and language communication. The main contents of the questionnaire are as follows: the acquisition and mastery of the mother tongue; the ways of contact with the mother tongue; the frequency of mother tongue use on different occasions; and listening and speaking ability. The interviews focus on people who have influenced the development of their own culture, asking about their upbringing, surrounding environment and language attitudes, in order to explore the language choices and subjective attitudes of the members of the Yimin Daur Association in communication and their attitudes towards the language use of future generations.

4.2. The Establishment of Daur Language Corpus[4]

A corpus is a database that records, organizes and annotates the speech of endangered languages. Corpora include speech corpora and text corpora. Because Daur has a spoken language only and no script, it can only be recorded by means of a speech corpus.
(1) Corpus Collection
The collection of corpus includes the collection of speech material and image material. First, we select pronunciation partners whose mother tongue is Daur and who have clear pronunciation and basic cultural knowledge, and collect four kinds of language material from them. The first is pronunciation. The second is vocabulary, including food, clothing, houses, appliances, religion, consciousness, human organs, people and relatives, animals, plants, location and time, and quantity. The third is grammar, including phrases, sentences and word formation. The fourth is long corpus, including ballads and folktales. Each item is recorded three times, and about 10000 samples are collected.

(2) Collation of Corpus
First, the original corpus is sorted out, including format conversion, labeling, noise removal and so on. Then words, sentences and paragraphs are annotated with Praat software. The annotation information includes IPA annotation of the actual pronunciation, Chinese literal translation, free translation and part-of-speech annotation.

(3) Function Design of Corpus
The functions of the Daur corpus are realized through user query and background management. The platform provides retrieval over voice, vocabulary, grammar and long corpus, and retrieval results appear in the form of text and voice. Background management includes two parts: corpus management and website system maintenance.

(4) Technology Processing
Technical processing is an important part of corpus construction and requires cooperation with the technical personnel who develop the data system. It mainly includes data format adjustment, noise processing, Daur IPA tagging, part-of-speech tagging and Chinese literal translation tagging; corpus alignment processing; and corpus tool design.

(5) Main Technical Treatment
When building the corpus, HTML and the Dreamweaver web editor will be used to design the external interface, and a voice editor will be used to process the voice data. The site itself is to be built on ASP.net; at the same time, ADO technology is used to implement retrieval for each template and fuzzy-matching queries.
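The annotation fields and the fuzzy-matching retrieval described for the corpus can be illustrated with a minimal sketch. The paper's own platform is planned on ASP.net and ADO; the Python below is only a stand-in for illustration. The names CorpusEntry, similarity and fuzzy_search, the 0.7 threshold, the file paths and the sample records are all assumptions, and the IPA strings are dummy placeholders rather than real Daur transcriptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class CorpusEntry:
    """One annotated recording. Field names are illustrative, not taken from the paper."""
    entry_id: int
    category: str    # "pronunciation", "vocabulary", "grammar" or "long"
    ipa: str         # IPA annotation of the actual pronunciation
    literal: str     # literal translation
    free: str        # free translation
    pos: str         # part-of-speech annotation
    audio_path: str  # location of the recorded audio (hypothetical layout)


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(entries, query, category=None, threshold=0.7):
    """Return entries whose literal or free translation roughly matches
    the query, best match first. The threshold is an assumed default."""
    scored = []
    for entry in entries:
        if category is not None and entry.category != category:
            continue
        score = max(similarity(query, entry.literal),
                    similarity(query, entry.free))
        if score >= threshold:
            scored.append((score, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


# Placeholder records only: the IPA strings are dummies, not real Daur data.
SAMPLE = [
    CorpusEntry(1, "vocabulary", "<ipa-1>", "water", "water",
                "noun", "audio/0001.wav"),
    CorpusEntry(2, "vocabulary", "<ipa-2>", "winter", "winter",
                "noun", "audio/0002.wav"),
    CorpusEntry(3, "long", "<ipa-3>", "hunter go mountain",
                "The hunter goes to the mountain.",
                "sentence", "audio/0003.wav"),
]

if __name__ == "__main__":
    # A misspelled query still finds the intended vocabulary entry.
    for hit in fuzzy_search(SAMPLE, "watr", category="vocabulary"):
        print(hit.entry_id, hit.literal, hit.audio_path)
```

Each result carries both the text annotations and the audio path, which mirrors the requirement that retrieval results appear as text and voice. A production system would more likely push the matching into the database query than score every record in memory.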

5. The Importance of Big Data to the Inheritance of the Chinese Culture

The excellent traditional culture of the Daur should be given the content of the times. Through the vicissitudes of history, Daur traditional culture has been continuously inherited and developed. It moves with the times and keeps pace with society, transforming itself as the times develop and taking on the connotations of the new era while retaining its traditions.

6. Conclusion

Based on an investigation and analysis of the present situation of the Daur language, this paper reveals the main problems in the process of Daur language inheritance, puts forward an implementation plan for establishing a Daur language corpus under big data, and discusses the necessity and importance of Daur culture inheritance under big data, so as to provide some reference for the protection and inheritance of the Daur language and to make its long-term preservation and transmission possible.

References

[1] The Conservation and Protection of Endangered Languages in China [J]. Fan Junjun. Journal of (Philosophy and Social Sciences). 2018 (10).
[2] The Analysis of the Related Factors of Maintaining the Mother Tongue of the Daur Nationality in Moqi [J]. Ding Shiqing. Heilongjiang Ethnic Series. 2009 (03).
[3] The Current Situation and Analysis of Daur Bilingual Education [J]. Sun Dongfang. Journal of for Nationalities. 2007 (11).
[4] The Construction of Chinese Endangered Language Corpus: A Case Study of Rongshuiai Discourse Corpus [J]. . National Forum. 2015 (5).
