Towards a Formal Ontology for the Text Encoding Initiative

Umanistica Digitale - ISSN:2532-8816 - n.3, 2018
DOI: http://doi.org/10.6092/issn.2532-8816/8174

Fabio Ciotti
Università di Roma "Tor Vergata"

Abstract. This article presents a preliminary proposal for an ontology to represent the Text Encoding Initiative (TEI) schema. The motivations for, and benefits of, a semantic and machine-readable version of the TEI are manifold: (1) semantic interoperability between text encodings would facilitate cross-corpora querying and integration with Linked Open Data; (2) formalizing the TEI abstract model would bring significant improvements in terms of consistency and soundness. Given the complexity of the TEI schema, a consistent but restricted subset of elements and attributes is considered here. Second, the EARMARK meta-ontology is examined. Finally, some implementation details are provided.

This article presents the rationale and the proposal of a preliminary architecture of a formal ontology of the Text Encoding Initiative markup language. The reasons to have a formal and machine-readable semantics for TEI are manifold. In the first place, it would have a number of pragmatic and technical benefits, like better support for semantic interoperability in text encoding practices, easier cross-corpora query processing, and seamless integration with the Linked Open Data ecosystem. In the second place, it would give a formalized account of the quasi-formal notion of the TEI abstract model, fostering the consistency and soundness of the TEI model.
Given the complexity of the TEI encoding schema, formally specifying such an ontology will be a time-consuming intellectual activity: in a first stage, we propose to limit its scope to a well-defined subdomain of the TEI, and to build it by adopting a pre-existing meta-ontology like EARMARK. The final part of the article gives some preliminary details of this design.

Introduction¹

The Text Encoding Initiative markup language represents one of the most significant achievements of the Digital Humanities field and is now universally accepted as the standard formalism for the creation of textual digital resources in humanistic research and scholarship. One of the reasons for its success is the fact that it is based on the XML metalanguage, a sound and simple standard for data modeling and serialization. There are many theoretical, pragmatic and social reasons for the wide and enduring acceptance of the TEI/XML couplet, notwithstanding the many criticisms and shortcomings. For example:

• XML is relatively easy to learn and use compared to other computer languages, especially if the complexity level of the encoding is low or medium;
• XML encoding affordances are similar to those of traditional textual annotation, a familiar practice to the average humanist;
• XML data format is portable (especially in the editing phase) between different platforms;²
• XML processing leaves the user in control of the editing process and of the resulting visualizations;
• XML introduces data quality control in text processing via its internal syntax and schema-based parsing facilities;
• XML is flexible enough to accommodate a vast range of humanistic users' requirements;
• XML has a good ecosystem of related standards and open source applications.
¹ This article presents the results of a collaborative effort of the author with Francesca Tomasi, Fabio Vitali and Silvio Peroni, in order to develop an OWL 2 ontology to formally define the semantics of the Text Encoding Initiative. Some preliminary steps and the general context of this effort have already been presented at the TEI Conferences in 2014 and 2015 and in an article published in the Journal of the Text Encoding Initiative ([8]; see also ..). The authorial responsibility for the present work is nonetheless to be attributed solely to Fabio Ciotti, with the exception of the section "How: the architecture of the TEI ontology", which has seen the contribution of Silvio Peroni.
² At least as far as this portability is limited to a purely syntactic level, since the limits of XML for semantic portability are precisely one of the reasons that motivate our proposal.

On the other hand, it is worth pointing out that, even if the TEI is an XML-based language, its evolution has somewhat led to a certain level of abstraction from that language and, to some extent, from its underlying tree data model. We must remember, in fact, that it is possible to draw a neat distinction in the usage of the XML language: it can be adopted as a full-fledged formal modeling language, in which case we accept the underlying tree data model as a good way to formally represent the object domain; but it can also be used as a mere syntax facility, a serialization language that is independent from the actual data model we are using to represent our domain (as happens in the XML syntax of languages like RDF and OWL). The TEI, in the course of its evolution, has moved from a modeling-oriented usage of XML to a syntax-oriented usage of XML.
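The distinction between XML as a modeling language and XML as a mere serialization syntax can be made concrete with a minimal Python sketch. It assumes a toy reader that covers only a tiny subset of RDF/XML (property elements with text content and property attributes on rdf:Description); the helper names `tree_shape` and `triples`, the example.org URI and the Dublin Core sample properties are illustrative assumptions, not part of the proposal discussed in this article. The two sample documents have different XML trees, yet they serialize the same set of statements.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ABOUT = f"{{{RDF_NS}}}about"
DESCRIPTION = f"{{{RDF_NS}}}Description"

# Hypothetical sample data: the same two statements, serialized in two ways.
FORM_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>Sample</dc:title>
    <dc:creator>Anonymous</dc:creator>
  </rdf:Description>
</rdf:RDF>"""

FORM_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc"
                   dc:creator="Anonymous" dc:title="Sample"/>
</rdf:RDF>"""


def tree_shape(elem):
    """Describe the XML tree as nested (tag, [child shapes]) tuples."""
    return (elem.tag, [tree_shape(child) for child in elem])


def triples(root):
    """Collect (subject, predicate, literal) triples.

    Toy reader (an assumption of this sketch): it handles only literal-valued
    property elements and property attributes, not full RDF/XML.
    """
    found = set()
    for desc in root.iter(DESCRIPTION):
        subject = desc.get(ABOUT)
        # Property attributes: every attribute outside the rdf: namespace.
        for name, value in desc.attrib.items():
            if not name.startswith(f"{{{RDF_NS}}}"):
                found.add((subject, name, value))
        # Property elements: child elements carrying a text literal.
        for prop in desc:
            found.add((subject, prop.tag, (prop.text or "").strip()))
    return found


if __name__ == "__main__":
    root_a, root_b = ET.fromstring(FORM_A), ET.fromstring(FORM_B)
    print("same XML tree:", tree_shape(root_a) == tree_shape(root_b))
    print("same statements:", triples(root_a) == triples(root_b))
```

In the sketch the tree comparison fails while the statement comparison succeeds: the XML tree is only a carrier, and the data model that matters (here, a set of subject-predicate-object statements) lives at a different level.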
This progressive shift has been determined on the one hand by the need to represent many non-hierarchical features of textuality, and on the other by the quest for a more semantically oriented way of modeling textual features. In fact, the common belief that XML markup expresses semantic information is technically flawed, in that XML per se is only a syntactic language to represent a tree-based data model ([4]; [12]). Starting from the groundbreaking work of Renear, Huitfeldt and Sperberg-McQueen [28], various efforts to formalize the semantic role of markup languages have been made in the past 20 years ([17]; [33]; [30]; [24]; [29]).³ However, none of them has reached a mature state or produced an operational solution, mostly because of the lack of maturity of the enabling formalisms and technologies adopted, and of the lack of support from the community of users. Building on the achievements (and limits) of these previous efforts, we believe that the Semantic Web stack of languages and frameworks could provide a viable and balanced solution to the theoretical and pragmatic requirements for developing a formal semantic component for the TEI.

The rest of this article is devoted to an overall illustration of this proposal, and is divided into three parts that can be conveniently entitled "Why", "What" and "How". Each of them tries to answer some basic questions that give shape to our proposal:

• Why: why do I think that the idea of giving TEI a formal semantics is a good idea, and how could it enhance its expressive power and hence its usefulness for the community?
• What: what in the TEI do I really think can conceivably be formalized in the form of an ontology? How far can we imagine going in this direction?
• How: which are the better technical and formal strategies to build a semantic model of the TEI subset we have identified in the step before?

Why ontologize TEI?

The reasons to have a formal and machine-readable semantics for TEI are manifold.
