Linguistic Analysis of Taxonomy Facet Creation and Validation

By Ashleigh Faith

Key Words, Vol. 21, No. 1, January – March 2013

Ashleigh Faith received a master's degree in Public History and Archiving from Indiana University of Pennsylvania and is pursuing her second master's degree in Library and Information Science from the University of Pittsburgh. She has been indexing in archives, museums and libraries for over five years and has been a taxonomist for almost 2 years. She is the lead taxonomist at SAE International and is reorganizing and reestablishing their taxonomy and indexing practices. Her experience includes automatic indexing practices, traditional indexing practices, bilingual taxonomies, and linguistic studies. She is currently serving in an advisory capacity on the NATO Terminology Program reorganization project. Responses and queries are welcome and should be sent directly to Mrs. Faith via email ([email protected]) and should include your contact information.

The more I talk with other indexers, the more I hear of the frustration they face in creating a well-shaped taxonomy. One of the most prevalent causes of this frustration has many names: associated terms, use for, reference terms, facets, variants. Whatever name you choose, the different facets of taxonomy terms are often difficult to build. Using linguistic mechanisms, I will outline how to identify facets for creating taxonomies linguistically and engineer a linguistic analysis tool that you can use to gauge whether you have fully harnessed your term facets.

Many consider hierarchical and faceted taxonomies separate entities; however, if combined they become a very powerful indexing tool. By creating a hierarchical taxonomy and using frequent facets as narrow terms, or building complex facets into scope notes, taxonomies will not only create a coherent organization of parent terms but will apply a multifaceted support structure on which parent terms may rely. Because a taxonomy is defined by the subject matter it incorporates, deciding on the scope and breadth of your taxonomy and what parent terms to include cannot be quantified here. However, once you decide these key components, linguistic analysis may be introduced.

Figure 1: Facets within a taxonomic hierarchy

Figure 2: Facets within a scope note

Facet and Concept Creation

Whether you are indexing manually or using automatic software, a sound taxonomy is essential for any indexer. Ideally, a taxonomy should always be in a semi-permeable state in order to maintain modernity and validity. No matter the indexing method used, building expansion points within your taxonomy can raise the capabilities of your indexing in both the short and long term. Linguistics is the key to unlocking this potential as well as enabling your taxonomy to grow without undue difficulties.

Lexicology of Terms

Linguistics is defined as the study of language and language structure through many applications, such as morphology, syntax, phonology, semantics, etymology, and lexicology.[1] All of these sub-disciplines of linguistics can be used to analyze and improve your taxonomy. I will be focusing on the morphological aspects of lexicology, the study of the forms, meaning, and use of words. There are differences in the analysis approach: morphology is based on terms and their arrangement, while lexicology is based on terms and their processes. The foremost linguistic analysis method that can be used for facet selection is a cross between the two methodologies. First, to understand the creation of facets you must analyze the morphemes, or the subset of lexemes, meaning the single word analyzed in parts.[2] The second methodology is to create linguistic concepts.

Most often when linguistics is mentioned in conjunction with taxonomy, it is with creating a multilingual vocabulary in mind. While creating a multilingual taxonomy is admirable, and can be achieved with these same linguistic principles, the base language of your taxonomy must be dealt with first. Analysis can be conducted on existing taxonomies or used in the creation of taxonomies. Either scenario will result in empirical evidence for demonstrating the potential of your taxonomy. This introduces depth to your taxonomy and expands users' understanding of applied indexing terms.

Common Lexeme Analysis of Terms

To understand the term facets on a part basis, there are three common lexemes to identify: roots, meaning the main parent term; derivational, meaning the formation of new words from the root word; and desinence of a term, meaning the addition of a suffix. This is a form of conjugation and can also be used when developing multilingual taxonomies.

Figure 3. Example of Derivation Devices and Lexicological Breakdown of Terms

When creating a mapping of derivation devices and lexical breakdown of terms, there are also combinations that can apply affixes to other affixes, such as the example act-ive-ate-ion. Common derivation affixes must follow grammatical rules and are thereby limited in how many facets they can produce. This type of general faceting may also be used to support the progression from a taxonomical-controlled vocabulary to a thesaurus-controlled vocabulary. By identifying these lexemes you will be able to categorize the most common facets of the terms you are analyzing. However, there are more compound applications that can be used for facet analysis.

Complex Lexemes Analysis of Terms

Language is made up of subtle social disparities that create different, more complicated lexeme facets. Again, for individual terms, facet lexicalization devices have been shaped over time by both demographics and societal axioms. Taking English as my example, the language is spoken officially in roughly 75 countries, each of which has its own vocabulary within the language even though it is the same root language.[3] These facets of the language can be considered fractures to the root trunk of the language. For instance, in the United States version of English the front section of your car is called the hood while the [...]

This is an example of a lexical replacement, or the replacement of a word by a new unrelated or seemingly illogical word.[4] Another example of a lexicalization device would be the word entrée at a restaurant, which in the United States signifies the main course while in French it means entrance and, therefore, would be more appropriate to use for an appetizer. This is called "borrowing" as it means to borrow from another language.

The most irksome example of a complex lexeme in the English language is the word like which, as Alexandra D'Arcy's linguistic work on the word like states, is "one of the most ubiquitous and multifunctional" lexemes in the English language.[5] The word like can be considered the epitome of a morphology example and can be categorized into many lexemes. The word like also demonstrates how complicated facet creation and analysis can become without linguistic tools.

These examples cover the different forms and definitions that a word can take on, but there are other factors to take into account as well. Some examples would be common misspellings (amateur versus amature), common alternate spellings (defence or defense), slang (dawg), compounded words (ain't), abbreviations (NO2), abbreviated compounds (LOL), acronyms (NATO), and many others. The root term should always be classified as the preferred term if you are using a thesaurus, and the parent term if using a taxonomy. Facets should be categorized as use for if a thesaurus is being used and as a child term if using a taxonomy. It is not suggested that lexemes be used as parent/preferred terms.

Concept Creation

The second application of linguistics to taxonomy creation and validation is through concepts. When analyzing a taxonomy through concepts, many avenues for future expansion can be created. While all taxonomies should have a process for adding new terms, it is a very different matter when adding new concepts. Concepts serve as links between terms, adding both delineation value and customization to a hierarchy. You must decide what concepts your subject matter and scope will allow because certain terms can be considered concepts; for example, electric vehicles can be a term or it can be a concept of all the parts and engineering that amalgamate to form an electric vehicle. When looking at a hierarchical taxonomy, you can move through the terms in a strict linear fashion while, if concepts are created, you can move through a taxonomy like a ripple where terms pick up similar terms to form connections between subject headings and the hierarchy contained in each. By understanding the links that can be created among terms you will be integrating expansion points for the evolution of your taxonomy.

There are six types of controlled vocabularies: flat lists, synonym rings, taxonomy, thesaurus, ontology, and semantically-linked data. The progression of a controlled vocabulary flows through the six different types in the sequence in which they are listed above. Therefore, if linguistics is used to create facets, whatever form of controlled vocabulary you are using now can evolve to the next level of controlled vocabulary. Each type transcended will improve your indexing as well as your controlled vocabulary.

When adding multidimensional facets to a taxonomy, you are facilitating the possible progression from a taxonomy to an ontology, which is defined as "explicit formal specifications of the terms in a domain and the relationships between them."[6] You are not required to take the leap to the next controlled vocabulary form if you do not wish to, but by taking these linguistic steps in the creation and analysis of your taxonomy you will be building in the materials necessary to make that step if you so choose.
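
As an illustration only, the two rules stated in this article (root terms act as preferred or parent terms with facets attached to them, and controlled vocabularies progress through six types in a fixed sequence) could be sketched in Python roughly as follows. The class and function names (`VocabularyType`, `Term`, `build_lookup`, `next_level`) and the facet-kind labels are hypothetical, not part of any tool described here, and choosing "defense" rather than "defence" as the root is an assumption made purely for the example.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, Iterable, Optional


class VocabularyType(IntEnum):
    """The six controlled-vocabulary types, in the order the article lists them."""
    FLAT_LIST = 1
    SYNONYM_RING = 2
    TAXONOMY = 3
    THESAURUS = 4
    ONTOLOGY = 5
    SEMANTICALLY_LINKED_DATA = 6


def next_level(current: VocabularyType) -> Optional[VocabularyType]:
    """Return the next type in the progression, or None at the last stage."""
    if current is VocabularyType.SEMANTICALLY_LINKED_DATA:
        return None
    return VocabularyType(current + 1)


@dataclass
class Term:
    """A root (preferred/parent) term plus its facets, keyed by variant text."""
    root: str
    # Maps each lower-cased variant to a free-text facet kind (hypothetical labels).
    facets: Dict[str, str] = field(default_factory=dict)

    def add_facet(self, variant: str, kind: str) -> None:
        """Attach a variant (use-for or child term) to this root."""
        self.facets[variant.lower()] = kind


def build_lookup(terms: Iterable[Term]) -> Dict[str, str]:
    """Map every root and every variant to the root it should resolve to."""
    lookup: Dict[str, str] = {}
    for term in terms:
        lookup[term.root.lower()] = term.root
        for variant in term.facets:
            lookup[variant] = term.root
    return lookup


# Example data drawn from the variants mentioned in the article.
amateur = Term("amateur")
amateur.add_facet("amature", "common misspelling")

defense = Term("defense")  # assumption: US spelling chosen as the root
defense.add_facet("defence", "alternate spelling")

lookup = build_lookup([amateur, defense])
print(lookup["amature"])                          # amateur
print(next_level(VocabularyType.TAXONOMY).name)   # THESAURUS
```

Keeping the facet kind alongside each variant is one possible design choice: it lets the same data be exported as use-for references in a thesaurus or as child terms in a taxonomy without re-analyzing the terms.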
