Islamic Republic of Iran
Iranian National Standardization Organization
INSO 16939-1
1st Revision
1392 (Dec. 2013)
/, -. %)*+ - $ %&' ( 4)/ 5 :1 (12 - 0
6 7
Language resource management - Word segmentation of written texts - Part 1: Basic concepts and general principles
ICS: 01.140.10 8 9 :$ $ '0" ( ( ( ' $ %& $ 3" ! (+( ) (7 8 6 ( 1 ' )' 45 2 3 12 + 0 , 1371 - '+, * ) . - ,; ( 90/6/29 < ( 8 = (; 8 (> (7 ' - ?@ *A B . 3 - > N& 3, 90/7/24 < 206/35838 - +> H I 7 J ( U(2 (T P ( ) J ( ( > 2 J 2 S 8 6 +2 Q7R 8 6 -J P ' )8 (= Y (> ( ( ( (7 \= * B [+6 1> 2 > B ? YZ - WX 8 *$ 8 = )16V@ )+7; W 2 ) W 2_* ) W 2 = ^ > )04 ]P ZP 4* 6 WX 32 1 J 2 3 8 ? 8 S (7 8 6 ( b a@. > ^ P = ` = 8 6 J ) 6 , )**R +7; U2 ) W 2 6 ,1@ 6T 3S J b@ > g f S 8 6 +2 8 e; 04 8d 0 6 cT 8 . > 1 h i (+ ) 7 ; * %H > X YZ 7 +2 (2 ( (, - (> '( Y j 3 ; U %& 8d $&; 8 6 J 2 6 b a@ 6 ) ' . > 1 h i 7 ; ) * %H 7 +2 (2 f ( ( (7 +2 ' 5 " +> 7 - > > 4 k 2 > 7 7 . > - * 6 ^A1 7 J 2(IEC) !(AA= 77+= ' +2) 1(ISO) 77+= ' J 7 8 e; J 7 J (12 5(CAC) l(` b2 (2 (+2 4Y , ; 3 3(OIML) $ > -J 77+= ' J 3S(1@ '(cX J ) 12 m c 8 6 8 J 72 Y > '+j 7 8 6 ' . 2 3= S . > 8W-, 77+= ' 8 6 , S )+7; 8 6 3& n4P ) W 2 _* J 3 +P 8 ) $ - > a@ 'J 3 ; 7 J 8 6 J e 8 )8 *$ p 3J TP& o * 342 J +H g *P ) +; 8S + ( J . + 8 Z ) = ; 8 > * ) B&$ / 12 ^c 8 = o * 8 7 8 (Z X 8 (( ( 8 6o 2 8 ) 12 o * 8 77+= ' 8 6 J n4P T )rJ (X )- (1 (J g S 6 J c J W 2 - 4 1R +H 8 'q+6 . + ( Z= 2 U(2 (6 - [(1 JX )(p3(J 3 ( 3(42 3 ( 8 6 s 6 W 8U+ ) J 3P&( (t B (T Y j k 6 J W ' 7 J )a? ^ (? ) . (2 ( T 6 X A7+; p; 6 X 3P& t 6 W )BJo Y > J P 2 J 8 8 2 B ? ,Z W U7S ; ' )a? ^ (? ) Z= 2 ) 6 A 77+= ' - [ u .3 J ' Q 5 [ J 7 8 6 \p 8
1- International Organization for Standardization
2- International Electrotechnical Commission
3- International Organization of Legal Metrology (Organisation Internationale de Métrologie Légale)
4- Contact point
5- Codex Alimentarius Commission
, '; )1)6
0 /, -. %)*+ - $ %&' ( "
"6 7 4)/ 5 :1 (12
- ' / ( : <)= B& P & JX - [1 k k Z; )8 +P ( > J > k > 2)
:)$ B& P 6 [1 , k + )[ 7; (7[ J + > 2)
( Z4= _P ) :? @A B& ! k > 2 + )+6 (7[ J + > 2)
B& x ! k > 2 8 , ) p (7[ J + > 2)
B& P 6 [1 , k > 2 U, ) c ( S J k > 2 )
B& * ^2 - k > 2 +H S )B 7T (+> A k > 2)
; + - S ) 2 ^2 - k > 2 J, )8 B& (7[ J k > 2 )
v B ' (;
:C57 'A 7 J y >X
v ' S +2 4W a@
J 1 2 _ 6 1
3 U= 0 2
3 Q P&p 3
12 Wz 0p 8 i i 4
18 Wz 0p +; g 5
23 XML Wz 0p a + ( U= ) Q= 3 @
24 2
(6 5- D)E
(2 «(72 g s6 4 :1 3+$ -8 > 8 6 ' Wz 0p- J 0Z 3 » ( |7 +2 k& ' *A - > ' , f 8 6 +2 X b a@ - ! ! )3 SW $ * 91/11/29< >J X 8 U,? (; ( )1371 - ( '+,Z * ) ' $ %& $ 3 . > 1 |7
) (c B (7; )0 ( (J (, (|7 8 6 3S1@ o [6 +6 [+6 n4P 8 '( ^(+A %&( 8 ( 2 8 ,1@ 6 > 6 c T ? BU= 0$ |7 8 6 ( )' .3SW 6 c $ f S +2 T ? B [6 ) > y 6 .2 - 4 |7 8 6 T ? 'cX J - +6
:3 J %> SW $ - 4 ' , 8 2 8lct 0Z
ISO 24614-1:2012, Language resource management - Word segmentation of written texts - Part 1: Basic concepts and general principles
- : + s6 4 ' .3 - $ _ 6 8 > 8 6 J Wz 0p 7 ' .J @ 72 H 6 J 2 U2+ Wz 0p 72 g
« 4 < 2» )g } ; . > 2 3 J 8 6 P ' 0p ) Wz 0p = P - > 3 4 2 c 2 2 « 4» «< 2» 2 s P - > AX - o ,+ by 3 $ ^ 2 3 P A f « 4 < 2» 2 . ' ، ' W P 6 J 8 € P 6 0p 8 Wz WSU))1 - < . . > ^A1 -z ! J a J Wz 0p 8 6 P ) > - 1 7Z$ g } 2 p +6 »: ) > ^A1 6 • ! J :7+ J > > 47R 8 63= P WSU! : ) > %&p ! )(«k S u7c»: ) > m c s )(«- R» : ) 2 -z !)(« 2+ 2 6 J 8 .( « 000 $ »: ) > 8 -z i Z; )(« Z B» g } J - 4 Wz 0p 8 6 P ' 0p )7[ g } ; ) lW 7 S +72 ' 2 i6 ) > - 4 - Wz 0p 8 6 P 8 6J ? 8 8 @ ; 8 6 eS 81 TP& [ ' €8 - z i 8 6 P 8 lW 1 ) 63 > 2 " 8 ' 8 P 2 6 J 8 @ z )i ^} ) 7 S 6z ' 2 6 J 8 .W $ T J 8[ A Wz 0p 8 6 P ' 0p 8 - 2 8 7 ^} ) 7 S 6-z .3
W xi 2 6 J ) 8 -W Z2 2 i 6 J 8 6 -z 0p )' -&; (@ z (J (2 3 3$ ' )[ 8 J .3 - q@ ? 8 -2 )@ z -W .3 4 Wz 0p 8 ) 2+ Z1@ W i 8 6YR
ƒ ( !( (; ( « €8 @» g } 8 .3 S 2 ' 0p 8 , = c 8 eS )g P ' U(? "z 0(p P ; «» «8 @» b@ )3 - > • J - > c € - i 72 36 ! ; X P&p 2Z2 m c )[ J . - > SW T
1- Word segmentation units
2- Collocation
J (P ) > 4 6 J 8 s ; $ .2 7 X U? 0p P ! P .( 2 g P&p 8 2 $
( (6 (J ( 8 X ( ( ( 6 -z 0p 3, 6r ; 6 \1 ) &( g(2 (7+ J ) (J ( f ( 8 68 X' S ' . 2 ^, + J 8W v R( ) (;&H (J )(+ (TS P ) P&p( 3 )3I= „6S ) J =2 ; « c TS P c gU`» > B ; g } ; . 1R Z, > + ;&H T47= 3 + ! P&p + 8 )> 8 6 + 8 X 'S )! "z 0p P ! . 2 X J
% J 0 /, -. %)*+ - $ %&' (
6 7 4)/ 5 :1 (12
$ 6 :' F /1 J ^ 8 6 ^+= Wz 0p 72 g s6 4 ' ' ' J _ 6 (WSU) - z 0p 8 6 P )l@ ? Z 8 - > 8 > 0p 3 J .J
(6 -z ( '( !( 0(p 8 ( ' .3 8 j ,4 )-z )3 J f _ " J , 6 -z P ' 8 + 2 .s> > , Q ! 6 6-z " 6 ^A1 q X J 3 8 j s(6 ( 7( S Y(c ( (2 (2 8 (6 s 6 3$ 8 8 ; $ 'i . 2 - 4 8 lW 1 6 7 S ; $ 0(p. ( (j ) (6 - +(> 6 + ^ > 2 Z; 8 6-z P&p ) 6 3> 2 ) @ - x(i 8 6 J 8 ) 2+ - 4 7 S J 6-z 8J 3, 2 @ z i 6 J 8 P Wz .3 ^ A1 ) 2 - 4 6 ; 8 +72 8 6- c 2 8 -2
(J ( ( ' ) ( 6z ( ' 0p J 2 6J 8 2 8 6 J 8 : 2
: 8 6 Wz 0p . > + 8 6U6 Z 8 7 r 8 +> -z B ? %&p v R U 7 Wz 0p. 3 1 (CAT) !+2 + U + TS P ( CAT) !+2 + 8 6 U P&p 3 8 6 $ 6 W 2 ) > .W
1- Computer-Assisted Translation
1 C (
? 2 . 6 A +72 8 ? "J 3 8 6 W - [ @ 6 }2 a J ? 7+; )' -&;. 6 ? "z … Zp -J > 0p > . +72 8 6J
5- / ",; ) +72 8 ? 8 ? 2 = +72 k 4W ) 4W ' ^ Z 8 6 . J Wz 0p -` 2j; 8 [= †*R )1 1S †*R
: 'G $ . 6 B ? c Q 5 2 0p Wz ' 3(NLP)ZH J r J @ ƒ 8 6 :J Z; NLP 8 6 )8 -S 8 6- J @ - )8 8 6 - 2 U? - )& 8 6 - 2 g2 -
) ' 8 ZH 8 6 -
. J "A@ ; +? > P -
1 H'/; . > J ) +72 o + )-J k 7` Wz 0
( (+72 r +> g + H J 0 "J '. 3 6X 3 8 ) J 0 -J _ " ( ( !( (6 ) (2 ( - 4( 47R 0p 8 6r J NLP 8 2 8 6 i )g P ' . X 3 ) ( 8(W-J ! . 4 +72 ƒ +? A ' ! 8 2 Z 6- z 4 3( '(A+ 8 (2 8 (6 ( (2 3( ( ( '( .J s6 S ? ? )l@ ? ) +; ^ $
1- Stress assignment
2- Prosodic pattern assignment
3- Natural language processing
2 3 'A+ 4W 2 8 2 ! )g } ; . 2 - 4 c US B * 0p 8 6 - >J . 2 0p W U Ai 2 8 6 P ' 8[
IJ % 2
' ( .3( - (> ƒ ( 6 X |7 ' ' 2 3 8 P J U= • ‡ ( (2d ( 2 2 . > |7 ' J yU X .3( ( |7 ' T X 8 8 6T ? 6 P& ) > - > - ƒ 1 8 6 P& T ? 'cX - +6 )3 - > - > 6 X 1 ‡ 2d 2 2 .3 T 6X 8
:3 U= ' 8 J 0 J - 4
2.1- ISO 1087-1:2000, Terminology work - Vocabulary - Part 1: Theory and application
2.2- ISO 1087-2:2000, Terminology work - Vocabulary - Part 2: Computer applications
2.3- ISO 24611, Language resource management - Morpho-syntactic annotation framework (MAF)
2.4- ISO 24612, Language resource management - Linguistic annotation framework (LAF)
2.5- ISO 24613:2008, Language resource management - Lexical markup framework (LMF)
2.6- ISO 12620, Computer applications in terminology - Data categories
2.7- ISO 16642:2003, Computer applications in terminology - Terminological markup framework
2.8- ISO 30042:2008, Systems to manage terminology, knowledge and content - TermBase eXchange (TBX)
K L B M*73
: 2 J Q P&p ) '
1-3 1(0 :6
1- Abbreviation
3 †R1 A B ,4 SW ^A> [ BS ! J _P +72 _lP 2 3 &2 ' . 2 [ISO 1087-1:2000]
2-3 1 . > S j (14-3) 3I= ! (22-3) • ! 3 'A+ 2 (5-3) 2 z !
$ > 6 . + 8 ZH3 ) @ ) 1@ ;S ƒ ' i 6 _ " . > Z6 xi S >
3-3 4 )E . > (22-3) • !,(2-3) i ! ' @ s6 [ISO 24613:2008]
s S 8 [ _ "
4-3 )-: A 8 - z ˆ6 2 J o + ) > k Z$ [ J J J Z; ! X 2 3 -z ^A1 . B ,4 > 8
1- Affix
2- Bound morpheme
3- Circumfix-3 xi X SW S • _ H )- > ^A1 U? 3+$ J 2 3 8
4- Agglutination
4 5-3 1 )+ . P
. X [ z ! ' i ! YS 2 (18-3) 8z ! [ISO 24613:2008]
- 6 ^A1 ; J A ; X )‰ ; . > - 4 ' -z ! ; + «- W» : S:1 Q . 2 « +>- W» )«- W » )«- W » ) +72 J 8
)Unvoiced = . > 7[ J «un, non, ir, a, in » 8 6 1@ g 2 «» 1@ : S:2 Q .inactive =Š )atonic = 3c )irregular=- ; ($ )non-functioning= (Š
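The prefix examples above (`un`, `non`, `ir`, `a`, `in`, as in unvoiced, non-functioning, irregular, atonic, inactive) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function name `split_prefix` and the tiny `STEMS` lexicon are hypothetical and are not defined by this standard. The lexicon check is there because a prefix string alone does not prove that a word contains that affix.

```python
# Illustrative sketch only: PREFIXES comes from the example above,
# STEMS is a hypothetical mini-lexicon invented for this demonstration.
PREFIXES = ("un", "non", "ir", "a", "in")
STEMS = {"voiced", "functioning", "regular", "tonic", "active"}


def split_prefix(word, prefixes=PREFIXES, stems=STEMS):
    """Return (prefix, stem) when the word is prefix + known stem, else (None, word)."""
    w = word.lower()
    # Try longer prefixes first so that "non" is tested before shorter candidates.
    for prefix in sorted(prefixes, key=len, reverse=True):
        if w.startswith(prefix):
            rest = w[len(prefix):].lstrip("-")  # tolerate "non-functioning"
            if rest in stems:
                return prefix, rest
    return None, w


if __name__ == "__main__":
    for sample in ("Unvoiced", "non-functioning", "irregular", "atonic", "inactive", "apple"):
        print(sample, "->", split_prefix(sample))
```

A word such as "apple" begins with the letter `a` but is returned unsplit, because "pple" is not a known stem; this mirrors the point that an affix is only identified relative to a stem.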
6-3 2R)6 . > c (14-3) 3I= i J 2(23-3)8 -z
.3 - > k Z$ 10-3 Q )ISO 24613:2008 J -1 "
- @ (2 ^2 7 8 8 P 2 aR ) 46 ! W > 3U2 'A+ 2 -2 " o (H ( ( 2 ! . > > 6 2 W > 5U2 ) > > ( 2 ' 2) 6 .2 Z; 2+6z : 2 J -W J )68 -z . >
7-3 1 6 R)6
1- Bound morpheme
2- Compound
3- Endocentric
4- Head
5- Exocentric
6- Lexicalization
5 c 7 ^A> )(14-3) 3I= ^$ P 'SW $ s6 2 J -z X 2 3 J -z . > ^A1 )+2 8 6 WW [ISO 24613:2008]
8-3 2… > (22-3) • I o + )(23-3) -z ? 8 3 (23-3) +72 ! ^A> I . a US [ISO 24613:2008]
9-3 3 JX z ! .W $ - 4 (23-3) -z ! ; c 8 c 2 3 (18-3) 8z !
.3 z A ).3 «8» 2 = P )3 JX z ! ! « c» )« c»" > - -z :1 Q
10-3 4:1 4/ s6) s6 4 [ + 2 6 A & (23-3) Wz (24-3) -z g A> i J ! 6 . > Q7R (68 s6) 8 8 6A7+; (5 [ISO 1087-2:2000]
1- Compounding
2- Derivation
3- Free morpheme
4- Homograph
5 - Semantic homography
6 - Syntactic homography
6 11-3 1KV . X (22-3) • (2-3) 2 S j (24-3) -z ^A> ! X 2 8 S . Wz 3 S Q* _1 "
12-3 2: W. X8 .3 (14-3) 3I7A 1 8 - > R 8 $ ^A> [ISO 24613:2008]
^c ; «'S »^A> ) S J )«'S »)« » )«3S »g } ; '+ Wz 8 6^ A> ; +? : Q . > R Wz 8 6^ A> J -W ' J W + 8 -z
13-3
3,8 : W. X8 . > ' ! ' (24-3) Wz ^A> ! 8 (12-3) 8 -z ^c '
.X 8 -z ^c ; «'S » > ? 'c 8 - z ^c ) S «3S » ' -z : Q
.14-3 Q )ISO 30042:2008 19-3 Q )ISO 1087-2:2000J SW - "
1- Inflection
2- Lemma
3 - Lemmatization
7 14-3 1(YJ . A +; 8 2 2 32> 6^ A> J 8 Œ +; 2 ; U 8 P [ISO 24613:2008]
.3 2 … > !? ) > [ I=J 1R 3 'A+ 3I=EF-1 " .3 - > Q « 6 •z J 8 -? J» ; ISO 24613 «^A>»-2 "
15-3 2 6 2 . 2 ^+6z ! ; A J P ! 'c
W» g } ; > 6• z J 8 - ? J)« c» g } ; > •z ! J P 'i- " . 6 ^A1 P&p Z; ! 2 > « 12 3+P ] » + Z; ! P «[ S
16-3 3Z 2 . > J `X 6X f ;&H (12-3) 8 - z 8 6^ c Œ +; 2 6^ c J ,S
17-3 4[. . > - 1 S * (18-3) z ! ! 7 2 p ^A> )« (6U» g } ; ) 6 «,» )« » )« 6» ^ > « » « 6» J 0+ 8 6z ! 8 6•z ) S J :g } «U(» •z ^ (> « (6U» "z ' ( J . ( 8(S (* p( ^A(> ˆ6 «,» 2 (« » « W @» . 6 « » «- @» « W» «- @» 8 6• z f 8 6z ! 2 = P 3 « 6»
1- Lexeme
2- Lexicalization
3- Lexicon
4- Morph
8 18-3 1. P . > _P J 8 - ? J 6v J 8 - ? J 7 2 P 'Ai 2 [ISO 24613:2008]
. z A JX z ! : z ! 8 ; +?J ƒ - "
19-3 2 W. '\ B &A ^ $ , X 2 €ZH 3= P I= ! ! m c J 2 c - > ^A1 63 I= J 8 - ? J J 2 (14-3) I= .3 a@ [ISO 24613:2008]
7+ ! J 1R )%&p ! ) Z; Z2 - z Z2] > 2 Z; ! 8 -z i Z; - " (i (Z; " ( B (+ 8 ( 'R( ' 1+6 .[( >X g $ ^ ^}+= j ! g } ; ) 7+ ! .3 'A++ -z
20-3 3 &A R)6 .3 a@ ^ $ X - 6 ^A1 ; J X 8 2 )(14-3 )3I= i J ^A1 (23-3) 8 -z
' 8 X 2 )3 «» «3c »3I= J ^A1+ Z; 2 ! S J « 3c » : Q .3 - > n4P 2
. 6 + ^A1 Z; 2 ! ) 2 - 4 Wz *; i J P&p -1 " B (, Z g P' . X P Z; ! ; > J c T J 3 'A+ Z; 2-2 " )Z2 Z; ! 8 -z 2 ! ' > 8U + 1+6 ^+; )8J -z XS 8l@ a@
1- Morpheme
2- Multiword expression
3- Phrasal compound
9 (p ' +, a -†c H Wz –1 Wz X . Z; ! Z; 2 ! ' . 2 4
21 -3 2 ] .3 - > A )X J 1R )(23-3) -z + X 2 8 S
22 -3 [ Š 3 3 'A+ P (14-3) 3I= ! ^A> Ai 2 X ^A> 2 3 J P .W $ $ > Z2 )38 @ )4* [ISO 24613:2008]
23 -3 -z . B&2 U J 1R )^$ P WV ! ; ) 2 3 (14-3) I= [ISO 24613:2008]
24 -3 4W. X]0 .3 ' (23-3) -z ! J 8 -S 8 W [ISO 1087-2:2000]
. 6 «2 @»"z J 6^ A>« 2 @» « »)«2 @» > ) S J : Q
1- Lexico-statistics: W $ - 4 Q7R 8 6 J ^ A • 2 3 8 X AA -1
2- Reduplication
3- Agglutinative
4- Word form
10 25 -3 1 -. %)*+ . > (26 -3) Wz 0p 8 6 P J 8 -? J ' 0p XS
26 -3 2(WSU) Wz 0p 8 6 P . > c X P ! ; 2 3 [ ; J 3&; > ! (24-3) -z ^A>
8 lW p sy&; ) c sy&; )8 ; sy&; J 3 'A+ 3 -z ! ^A> 2 3&; > ! - " ^A1 8 ; 'o sy&; J Z2 )H2O +> 8 6 + )i 8 6g A [ $4 sy&; J c .F16 > - >
27 -3 3W. 8 .3 >- z ^7 U? J ^ P (23-3) -z ! 7c c
z !( J o Z j ) 6z !J 8 > 3 'A+ -z 2 @ z )8 - 2 +, x 6 J - " )'> 8 m c 8 ! [ g + H ($ > 8 6 ) } 6 X 2 > )-z ( )X " 6 ^A1 ; ; W 8 6 JĤ 6z ! ) 6 J ' -z ! c .3 ! ! . > - q@
28 -3 42 "z .3 a@ ^ $ & 2 X " 6 ^A1 8 6a R J X 72 8 2 (6-3) 2 !
1- Word segmentation
2- Word segmentation units
3- Word structure
4- Word compound
11 «B +6 » )« 1W- W »: Q
-. %)*+ $ $\ \4
-. %)*+ :$ ^$ 4)/ 5 1 -4
.3 8 j Wz 0p g • ' - > - %> s6 4
«k ($» «-z ^A(>» )«•z » € ,4( ; «3I= » «z !» €; U ; f Z 1 ^A> !( .3( -z ^A(> ! 3I= ! ,4 ^A> .3 •z ! z ! ! ,4 ^A> . 6 1 J 'c ( 8 ( -z ^c ( (XS ](H J (2 (> ^A1 8 -z 8 6 ^c J Œ +; k $ . X 3 -z 8 6^A>
6' . 6 P&p > J 8 6 J Q7R 8 «-z » «z ! » q+6 P&p -1 " .3 6X > J ) > - %> 2 2 P&p [
s6 , X [i 6- z ^A> 8 A 3 J 8 6 P = p > -z ) 2 I7 -z 8W^A> +; 2 )8 I= > -z > -z .2 2 ! k -z 8W ^A> +; 2 )( J W k ) 18 @ >- z 4* >- z 8(W ( ; )3> 2 )2 )… > 8 6 XS ^ > 8 I= > -z .2 s 2 -z .3 A
1- Agglutinative morphology
12 ` V
AI 7 'A 5 7 'A
.__] [.
Z 2 ,8 : W. X8 (YJ W. /X]0
: W. X8
Z 2 P (8 5 AI 7 'A ,)$ ^ & - 1 X]0 ^A> XS ! , … > 2 i «$ > >- z » > - 4 «8 I= >- z » %&p 1 -2 " .3 -z 8W
•; 3 'A+ A . > A A7Z US , J 4 ƒ ^ 1 @ 4* >-z . W T 8 I= >- z S ! U X 2 3 ^= '+6 ) > Wz W^A> » «krap» ) 2 - 4 8 A "z B ,4 2 8 A J 1b ASX )g } ; 2 )8 @ 8 , J 8 .3 « > c > » « krap krap-krap» 2 = P 3 « > c J ( Wz 0p T >- z ; $ J m c ; +? ! ) xi • 6 6 X .3
. - > 'ISO 24614-2 ; $ ' -3 "
1-Afrikaans: 8 SX J
13 . Z 2 ^A>
'0W.
8 I= >-z 8 @ /4* >-z
… > 2 *c 8W ; A ) US (8 @ /4*
/ $ '0W . : -2 X]0
^A> .3 >X 8 6g $^ 6^ }+= j ) P&p ) Z2 ^ > 1(WWE) 8 - z i Z; 8 J + 8 - z 2 . > Z; Z2 Wz Z2 ^ > Z2. 2 ƒ 3 8 ,+ 3 3 $ ^ ; « 4 < 2»)g } ; . > - > SW X 8 6a R ! ! 8 )g P ' .3 4 2 c 2 ! YS ) S * B ,4 ! - > )- o ! «s ? »g } ; . > - > SW X 8 6a R ! ! 8 J Z; 2 ) ,4 2 3 ? ZA s6 « ,4 ? »i W . s 2 3 ? i Z; ! ? )(20 -5 g } 2 - [ ) X P Z ZZ2 !1@
1- Multi Word Expression
14 - 4 P&p Z; P 3 g «s ? » Wz 2 2 i ) 6 ^A1 8 -z . c 'i « ,4 ? » 2 = P )« 3SX s ? » ) >
8 -z i Z;
(MWE)
2 %&p ^}+= j >X 8 6 g $ ^
Wz 2 Z; 2
W. '\ B &A a -3 X]0
8 ; sy&; ^ > sy&; > . > ^A1 8[ sy&; > 6- z g A> J Wz 0p 8 6 P 6 ; ' * 3 sy&; [ sy&; > J c 8 lW p sy&; ) c .3 -z ? 3&; 8 P«! 6»)g } ; . > 2 , X
(WSU) -. %)*+ / M
Wz g A> [ sy&; >
8 lWp sy&; 8 ; sy&; c sy&; > $4 sy&; >
-. %)*+ / M a -4 X]0
15 '6 X)1 -. %)*+ :6 L$ ' 2-4
: W-, J 0 U J m c J -J P Wz 0p ’H k $ -KJ ’B 2 6 ) 6 ) 6 @ ) 6 1@ ^ > ) 6 J =-d ’ 6 -&; ) 8 6z ! J =-e 8 6 - @ k Wz 0p c ' 3, - J > -z 8 *R1- ’4 - > - %> g 3) J . J ! J W + J 8 -A@ -f
8 ((Q(7R 8 (6 U '+A ( ) Q(7R 8 (6 '( Wz 0p WJ J > '“+p T ) (> ( g +; 0p ! ' (3-4 2 ƒ ) €8 6 1 r +> 8 2 $ A J +H (yU 1 o ((6 (Q= J - > 2d 0 ) 2 s6 S 8l@ k $ ; . > - %>
-. %)*+ 3-4
. 2 \1 Wz 0p 5 ^A>
bx ) > 8 lW3 &; A 8 6† c > 0p sy&; ' )' = B c 8 6 - )- (> ( (> P B (c € J "A@ . > 0p 7 8 6 P ISO 24612 ( 8 (6z A J = Œo +P Wz g A> ^ > 2 X s6 S k $ ! ? 8 8 @ ( k ($ Wz 0p ; $ ) J -A@ . > s6 S U Wz 0p ; $ J 8 . > sy&; Wz 0p 8 6 P " 0p ! 0p ^ Z 8 2 6 ^A1 s6 . 6 8 j
16 B c = 8 6- :8 '
Wz 0p 8 6 P Wz 0p S J
J 8 6-A@ 8 ;&H 0 > P B c Wz 0p - >
; $ "& k $ sy&; J 8 Wz 0p m c
-. %)*+ -5 X]0
5 N 4 3> W 3 2 c 1 2 0
5 N 4 3> W 3 2 c 1 2 0
-. %)*+ %)*+ D -6 X]0 17 3 (&; '(= g (} (; () A †c > ' - ! ; 6 )= 0p \p (Wz 0(p (J ( (> P \p (3 - > †R1<0 )1> - ! 6 ^A> « 2» (J )3( - > †R1 <0 )2> " ) > c > -z ! ; « c 2» c '= « N( 3> W» B P .3 «» &; ! "z ! B P . X P , + -z (Wz 0(p (P J ^A(1 (2 (7c c ! <3 )5> " )3 Z; 2 ! (i ( Wz 0p P B . - > †R1<4 )5> «N» <3 )4> «3> W» . 2 !+2 -z 8 ! 6 > > ^ H 3&; '
J 8 (> ( - (> - '( s( •(; (Wz 0(p. ( 2 B c ' 8 Wz 0p Wz 0p P ! ) > J ? 'U[ 8 60 p 2 $ ) > Wz 0p 8 6 P ' J aR ' «2 • AX - o j »7+ . > > 0p c ! W ? 2 )2 0p1« 1 »B 6a R X )0p ; $ J c k 8 8 47R ; $ ) 2 + - 4 6 7 S J 2 i : 6 J ) > 7 S k ( (6 a(R ' J c J 8 > )k $ bx.(W $ - 4 28 lW 1 2 )38 -z i Z; ; «-z » B J P ! ; « AX - o » (c ) [( k $ B 7P u . > 7 )- > SW T -z ƒ ! ; ( ) (> '( ( -z ^c ! ; « AX - o » > B + ^ > 3 'A+ 6 k $ . > ^ > « AX» P « - o »
-. %)*+ A 75 '0 W. 7 1-5
8Ai 2 8 6 P 6 -z 8 J 6 2 3 ' ISO 24614 , ^ . > «z !»B
1- Token
2- Tokenization
3- Multi word expression
18 -. %)*+ M P ,8 &L 7 2-5
B )6 1-2-5
J - W J A : 6 ' Wz 0p P 'c Z 8 J J ^ g J m (c 8 6 J 2 ISO 24614 8 6 aR [ U*R y } .8 2 - W J 8[ 8 P ) > g +; Q7R 8 63 $ 3 'A+ Q7R g .3 - > - \j ) 2 .' A 8 6>
$ W - 7 2-2-5
1 )+ . P X7 -KJ « » g } ; ) 3 Wz 0p P ! ? - [ X ) > ^ -z ! z ! W .(«? » z A ;
2YJ () X7 -d ^ ' -z ! 2 W . 6 + $ -z ! 7c c 8 ; $ J - 4 2 $ )« 4 < 2» sy&; ' )g } ; . 3 Wz 0p P ! Œo +P 6 [ X )J - X • @ < 2»g } ; ) > v + Ui ˆ6 ) - > - o 8 ,+ 3 3 $ ^ . - > 4 < 2 6 2 $«U+ 4 < 2» 34W 2 = P « 4
" /D h$ W. P 'L $ ')$ D)E X$ 2 )g X7 -e
0(p (P !( - ([ X b(@ ) > a @^ $ ` 3 c ! 8 -z ! 2 W ' )' ) > UZ 3 'A+ ) > - Z U= «- R» ! )g } ; . > Wz .3 Wz 0p P ! " 6 ^A1 -z
1- Principle of bound morpheme
2- Principle of lexical integrity
19 1M*7 i 5 X7 -B 0p P ! ! ; - [ X b@ ) > - 4 P&p Wz 8 6^ A> J 8 > W - 4 P&p Z; ! ; «'4W ƒ S » g } ; ) X P Wz (. >
2 X7 -f -z ! )« A1@» )g } 8 .3 Wz 0p P ! b@ ) > J c -z 2 ! W 2 'U[ ^ 8 &; ˆ6 + «31@» 8 &; 2 i )3 S 8 J . > > S - X 3 2 2
$ 6 W - 7 3-2-5
3 X7 -KJ A Wz J 8 > -z ! .3 -z 248 - z ' 8 ! .3 Wz 0p P !
(68 '0 9A ) 5(J G- X7 -d Z; Z2 c 8 6 c 8 6 > ^ ' . > - ^2 ! ; 1 6Ui . T W &$ 6 5 , X W P k $ 8 -z ^c ;
1- Principle of idiomatic use
2- Principle of non-productivity
3 - Principle of frequency
4- Lexicalization
5- Gestalt principle
6- Cognitive science
20 1(8 '0 '0 $ ) '$ : 7 @A X7 -e
7( (` 8 (e; J 8 (6 8 7 8 e; ) k $ 27 8 [= T (P ( X ( - X ( ( - 2 TS P 81 3$ , X . 6 c > ^ > 8 X s6 S p ^ ' . > k - > n4P 7 TS P g } ; ) A Wz ^A1 8 [= ! 7 8 6 + ; A Z; Z2 J .k $ « 2 + ^ » «,> + ? » 8 6 [= S J « A1@» «s ? »
3 $ V2 X7 -B
3( ( b@ ) 6 a6 2 X J ^7 U? ^A1 k $ 8 2 "z g +> W (T 8( (;&H g (Z = 8 eS b7@») Q4R S « S» )g } ; . > -z ! (P !( (; ( X †R(1 ) (> - ?W k $ « S» > W )3 (« & 8 ,+ .3 X Wz 0p
-. jX 6 X8 X7 3-5
? U k $ . > - ?W k $ A - 4 Wz 0p 8 6 P ,+6 )^ . > WJ @ Wz 0p 8 6 P
-. %)*+ k $ J7 4-5 4 $ : : X7 -KJ 2 = - X 3 Wz 0p 8 6 P 7c 8 6 c 3 'A+ Wz 0p . 6 1 Q7R 8 6 2 [ 8 = +P 'U[ 8 60p
1- Cognitive linguistics
2- Prototype theory
3- Principle of language economy
4- Principle of granularity
21 / :' 6 :')G)$ X7 -d
0(p (P (+6 J (1R 6 ( ) 2 c 0 > ^* X 2 6 B + • ! ; «8- 4 - - » :' ! ‰4 « 4 » )g } ; ) 6 ^A1 Wz ( « (4» « 4 » ' ! W g P' )3 Wz 0p P ! c ^2 ! . X P Wz 0p 8 6 P ; c )s >
/R )6 :' 6 :')G)$ X7 -e
(2 - ([ X )0( k ($ ) > [ 2 ! ^ > 2 > 3S 2 ! W (; ( Œo +P 2 - 2 2 - +6 X P Wz 0p P ! ; c W U «AZ(> ^ ; »‰4 > )g } ; . > 8 lW 1 [ 7c Wz 0p P ! X (2 ) (X ( (P ( (Wz 0p P ! ; ^2 ! > B + )' ! .3 Wz 0p P 'U[ U «^ ; » - 2 2
l / :0 %)*+ X7 -B
( ) 6X J Z2 6 > [ sy&; c sy&; 8 ; 8 6 > 7+ J )sy&; > 6 . 6 8 0 J c ^ P ' ! 2 > k P 2 > Wz 0p P ! 0p P ! «1945» 8 ; > )« @ 1945 g B , „ »g } )g } ; 2 m c ' ' «ㄱ» )«3 8 - 2 J 3 3&; '= ㄱ» jS 7+ . 3 Wz . > Wz 0p P !)3 7+ ' ^; S
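A minimal sketch of the idea behind the `1945` example: an unbroken run of digits and letters is kept together as one word segmentation unit, while each punctuation mark stands alone. The regular expression, the function name `segment`, and the English sample sentence are assumptions made for illustration; they are not rules taken from this standard, and whitespace-based splitting does not apply to scripts written without spaces.

```python
import re

# Hypothetical pattern (an assumption, not part of the standard):
#   [^\W_]+  one or more letters/digits, e.g. "1945", "H2O", "F16"
#   [^\w\s]  a single punctuation or symbol character
#   _        an underscore, treated as a symbol of its own
UNIT_PATTERN = re.compile(r"[^\W_]+|[^\w\s]|_")


def segment(text):
    """Split text into candidate units, keeping digit/letter runs intact."""
    return UNIT_PATTERN.findall(text)


if __name__ == "__main__":
    print(segment("born in 1945, flew F16"))
    print(segment("H2O"))
```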
, W 5 B &m X 6 D0E X75-5
'( (. 6 a> @ J 6 6 W $ - 4 WJ 8 - > *$ ' I J c m c 6 J 8 ) > - %> ' J 8[ aR 2 [ +6 )g P $ • 1 6a R ' U Wz 0p - ' 3+6 . 3 J . > - 1 3SW 6 c
22 KJ ()E
( An) 1XML -. %)*+ D
. > ISO 24611 k XML Wz 0p 8 6 P a + J g } '
entry="urn:lexicon:cn:: 8 - 6" />
1- Extensible Markup Language
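As a rough companion to the XML example in this annex, the following sketch reads word segmentation units from a small XML fragment. The wrapper element `maf`, the element names `token` and `wordForm`, the `tokens` attribute, the sample words, and every `urn:example:...` value are assumptions chosen for illustration; they should be checked against ISO 24611 before being relied on. Only the `entry` attribute name is taken from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names and URNs are illustrative assumptions.
SAMPLE = """
<maf>
  <token id="t1">white</token>
  <token id="t2">house</token>
  <token id="t3">policy</token>
  <wordForm tokens="t1 t2" entry="urn:example:lexicon:white_house"/>
  <wordForm tokens="t3" entry="urn:example:lexicon:policy"/>
</maf>
"""


def read_units(xml_text):
    """Return a list of (surface string, lexicon entry) pairs, one per wordForm."""
    root = ET.fromstring(xml_text)
    # Map each token id to its text so word forms can refer to tokens by id.
    tokens = {tok.get("id"): (tok.text or "") for tok in root.iter("token")}
    units = []
    for word_form in root.iter("wordForm"):
        ids = word_form.get("tokens", "").split()
        surface = " ".join(tokens[i] for i in ids)
        units.append((surface, word_form.get("entry")))
    return units


if __name__ == "__main__":
    for surface, entry in read_units(SAMPLE):
        print(surface, "->", entry)
```

Keeping tokens and word forms in separate elements lets one segmentation unit span several tokens, which is how a multiword unit can be recorded without altering the underlying token sequence.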
Bibliography

1. ISO 639-1:2002, Codes for the representation of names of languages - Part 1: Alpha-2 code
2. ISO 639-2:1998, Codes for the representation of names of languages - Part 2: Alpha-3 code
3. ISO 639-3:2007, Codes for the representation of names of languages - Part 3: Alpha-3 code for comprehensive coverage of languages
4. ISO 639-5:2008, Codes for the representation of names of languages - Part 5: Alpha-3 code for language families and groups
5. ISO 704, Terminology work - Principles and methods
6. ISO 860, Terminology work - Harmonization of concepts and terms
7. ISO 1087-1:2000, Terminology work - Vocabulary - Part 1: Theory and application
8. ISO 1087-2:2000, Terminology work - Vocabulary - Part 2: Computer applications
9. ISO 24611, Language resource management - Morpho-syntactic annotation framework (MAF)
10. ISO 24612, Language resource management - Linguistic annotation framework (LAF)
11. ISO 24613:2008, Language resource management - Lexical markup framework (LMF)
12. ISO 12620, Computer applications in terminology - Data categories
13. ISO 16642:2003, Computer applications in terminology - Terminological markup framework
14. ISO 30042:2008, Systems to manage terminology, knowledge and content - TermBase eXchange (TBX)
15. Britannica Online Encyclopedia, http://www.britannica.com
16. ALLEN, J., Natural Language Understanding. 1994, Addison Wesley
17. ARONOFF, M. and REES-MILLER, J., The Handbook of Linguistics. 2001, Blackwell
18. BIBER, D. et al., Corpus Linguistics. 1998, Cambridge University Press
19. BUSSMANN, H., Routledge Dictionary of Language and Linguistics. 1996, Routledge
20. CRYSTAL, D., The Cambridge Encyclopedia of Language. 1997, Cambridge University Press
21. JOHNSON, K. and JOHNSON, H., Encyclopedic Dictionary of Applied Linguistics: A Handbook for Language Teaching. 1999, Blackwell
22. KENNEDY, G., An Introduction to Corpus Linguistics. 1998, Addison Wesley Longman
23. MATTHEWS, P.H., Morphology. 1991, Cambridge University Press
24. PACKARD, J.L., The Morphology of Chinese: A Linguistic and Cognitive Approach. 2000, Cambridge University Press
25. POOLE, S.C., An Introduction to Linguistics. 1999, Macmillan
26. RICHARDS, J. et al., Longman Dictionary of Applied Linguistics. 1985, Longman
27. UNGERER, F. and SCHMID, H.-J., An Introduction to Cognitive Linguistics. 1996, Addison Wesley Longman
28. ZHU, Dexi, Lecture on Grammar. 2003, Commercial Press (written in Chinese)