Measuring Synonyms in Erya by Probabilistic Semantics Kam Tang LAU 1, Yan SONG 2, Pang Fei KWOK3 1, College of Professional and Continuing Education, The Hong Kong Polytechnic University, Hong Kong China Email:
[email protected] 2, Microsoft Search Technology Center Asia, Beijing China Email:
[email protected] 3, Department of Chinese and History, City University of Kong Kong, Hong Kong China
[email protected] Abstract Erya is the earliest Chinese dictionary which was edited and circulated before Qin Dynasty (BC 221). It is used for annotating the Confucian Classics. Therefore, the words collected in Erya are from pre-Qin Confucian Classics. It is recognized as one of the most important work of Chinese Philology. Erya later was listed as one of the Shisan Jing (Thirteen Confucian Classics). This shows that Erya is as important as the other Confucian Classics compiled in the pre-Qin Dynasty. Erya is like a synonym thesaurus. It puts all synonyms in parallel and explained by the last word and Ye. (“Ye” is using as verb-to-be “is” in Classical Chinese and placed after the explained word.) There are two main explanation patterns presented in Erya, one is “A, B, C, D Ye.” Another one is “E, F Ye, F, G Ye.” There are many Chinese Philology scholars discussed why the editors didn’t just record “E, F, G Ye”. They explained these phenomena by providing many evidences and examples from the Confucian Classics. They believed although E and F are the synonyms of G, F has shorter distance to G than E.