Controlled and Uncontrolled Vocabulary Systems

Hula Kahlal – S3236327

Table of Contents

Executive Summary

Introduction

Advantages and Disadvantages of Controlled vocabularies

Advantages and Disadvantages of Uncontrolled vocabularies

Taxonomies and Thesaurus

Folksonomies

Decision making process on which method to use

Conclusion

References

Executive Summary: This report defines controlled and uncontrolled vocabulary systems, gives examples of each (such as taxonomies, thesauri, folksonomies and the various social bookmarking websites), and describes how they assist organisations. It discusses the advantages and disadvantages of both kinds of system, the usefulness and limitations of taxonomies, thesauri and folksonomies (tagging), and how these systems function in comparison to one another. It then sets out considerations and guidelines for choosing the best system for an organisation’s information, together with the factors affecting the decision-making process.

Introduction: Controlled vocabularies are centralised systems used to identify terms that may not be familiar to the user and to provide descriptions of them, for knowledge storage and retrieval purposes. Adopting a controlled vocabulary system mainly helps to solve language problems. Many languages, including English, have words that take different meanings in different situations. English also has many words that are pronounced the same but spelt differently, which is one reason why people misspell words. Controlled vocabulary systems help to narrow down the possibility of language errors by providing the terms (Intellogist 2009), and help the user to identify these differences in language and choose the most appropriate term for their need.

Types of controlled vocabularies differ in their structure, according to the user’s need. Some systems are ordered alphabetically, and some take a hierarchical structure, where the terms are narrowed down from the broadest concept to the narrowest term related to it (University of Glamorgan n.d.). One type of controlled vocabulary is the taxonomy, which many medium to large organisations use to help their employees and other users understand the culture and terminology of the organisation. The databases of such organisations may contain terms or abbreviations that are unfamiliar to the user. A controlled vocabulary (or taxonomy) makes it easier for the user to return to a term and understand its full meaning by subject or word searching. Since this is the purpose of controlled vocabularies, the terms used should be clear and in simple language.

Uncontrolled vocabularies (social tagging, social bookmarking, etc.) have become very popular in the past few years because they make it easy to return to resources or materials without having to search for them again, and to retrieve, share and organise them with one click. Social bookmarking websites such as Delicious, Twitter, Digg and many others have spread widely around the world and gained popularity for their ease of use.

Main Points:

Advantages and Disadvantages of Controlled vocabulary systems

When searching for a material or piece of information, controlled vocabularies reduce the likelihood of inaccurate results, as their main purpose is to make the terms being searched clearer and more specific. They help to ensure that materials are listed in a consistent and predictable manner, so that the user obtains results that more closely match their needs and spends less time searching, because the user is familiar with what they are searching for. Controlled vocabularies help users to identify the nature of the material they are searching for, so they can decide more quickly whether it is appropriate, and they remove ambiguities resulting from varying usage of different terms. A controlled vocabulary also helps users to identify the context of the terms more easily (Vernau 2005).

On the other hand, adopting any type of controlled vocabulary has some disadvantages worth mentioning. First, there may be human errors within the system, as controlled vocabulary systems are built entirely by humans, especially when deciding which terms to include and under which classification. It may also be time-consuming and costly, as it takes time to agree upon the terms that can be used and to provide full staff training on how the adopted system works and how to use it. Another issue is the need to keep the system updated, as it easily becomes out of date and many new terms may need to be added. Once the system is adopted and staff are trained in how it works, however, a controlled vocabulary is a valuable search tool that brings benefits to the organisation using it.

Advantages and Disadvantages of Uncontrolled vocabulary systems

Adopting uncontrolled vocabularies has some positives that are worth mentioning. Firstly, it gives a person the freedom to store whatever information might be of future use and to organise it in the way they prefer, which is referred to as “democratic control”. It also allows people to share and exchange information more easily and freely, through the use of social tagging websites. It is also flexible, easy to use, and less costly and less time-consuming than controlled vocabularies (Sethearly 2007).

Although there are valuable advantages to using uncontrolled vocabularies, there are also considerable disadvantages that need to be discussed. Uncontrolled vocabularies may create redundancy in terms, as they allow both singular and plural forms as well as inconsistent spelling, punctuation and capitalisation. They are also less reliable, as a person is free to store any information they come across, which may be meaningless and/or inaccurate (Sethearly 2007).
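The redundancy described above can be illustrated with a minimal Python sketch. The sample tags and the `normalise_tag` helper are hypothetical, and the plural handling is a deliberately naive assumption rather than a real stemming algorithm. It shows how several free-form tags collapse into a few terms once case, punctuation and simple plurals are made consistent.

```python
from collections import Counter


def normalise_tag(tag: str) -> str:
    """Lower-case a tag, strip surrounding whitespace/punctuation, and crudely singularise it."""
    cleaned = tag.strip().strip(".,;:!?#").lower()
    # Naive plural rules (an assumption for illustration only).
    if cleaned.endswith("ies") and len(cleaned) > 4:
        return cleaned[:-3] + "y"
    if cleaned.endswith("s") and not cleaned.endswith(("ss", "us")) and len(cleaned) > 3:
        return cleaned[:-1]
    return cleaned


# Hypothetical tags, as different users might type them on a tagging website.
raw_tags = ["Taxonomy", "taxonomy", "Taxonomies", "Bookmarks", "bookmark", "Tagging.", "#tagging"]

# Count how many raw tags map to each normalised term.
counts = Counter(normalise_tag(tag) for tag in raw_tags)

print(len(set(raw_tags)), "distinct raw tags")
print(len(counts), "distinct terms after normalisation")
```

A controlled vocabulary avoids this clean-up step altogether, because only the agreed form of each term can be entered in the first place.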

Taxonomy and Thesaurus

Taxonomy is the practice of classifying ‘things’ based on their relatedness and relationships. It follows a hierarchical structure, where the main purpose is to provide labels for the major topics within a concept and to narrow them down to all the related terms that fall under each label (Metataxis 2007). For example, The Human Body can be classified as a broad concept and narrowed down to the systems of the human body, then to each system’s organs, and so on. This hierarchical structure helps to prevent the repetition of terms and data redundancy, which in turn increases search efficiency and decreases searching time. A taxonomy can be applied to any group of concepts that are related in a hierarchical manner.
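As a rough sketch of the human-body example, a taxonomy can be modelled as nested dictionaries. The specific systems and organs listed, and the `path_to` helper, are illustrative assumptions and not part of any particular taxonomy product. Searching for a term returns the chain of labels from the broadest concept down to that term.

```python
# Illustrative taxonomy: each key is a label, each value holds its narrower labels.
taxonomy = {
    "Human body": {
        "Circulatory system": {"Heart": {}, "Blood vessels": {}},
        "Respiratory system": {"Lungs": {}, "Trachea": {}},
    }
}


def path_to(term, tree, trail=()):
    """Return the labels from the broadest concept down to `term`, or None if absent."""
    for label, children in tree.items():
        current = trail + (label,)
        if label == term:
            return list(current)
        found = path_to(term, children, current)
        if found:
            return found
    return None


print(path_to("Lungs", taxonomy))
```

Because each term appears only once in the hierarchy, a search either finds exactly one path or reports that the term is not part of the vocabulary.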

In recent years, the economy has shifted greatly towards knowledge and knowledge management. Organisations have come to realise that it is essential to keep their information classified and stored efficiently, in order to minimise the time and effort spent searching for information and, consequently, to minimise costs by lowering the number of staff. It is important for an organisation to keep its taxonomy updated according to the type and amount of information it uses and its users’ requirements (TSO n.d.).

A thesaurus is defined as a controlled vocabulary system containing a classified and agreed-upon list of terms, for the purpose of searching and retrieval. The three main characteristics of a thesaurus are:

1. Equivalence: describing the relationship between the synonyms

2. Hierarchy: the links establishing the hierarchical structure

3. Association: the connections between more loosely-related concepts in the thesaurus (Gilchrist 2002).

When using a thesaurus as a search tool, the user enters the keyword they are looking for, and the system matches this keyword to the available terms that have the same or a similar meaning. It is then the user’s decision to choose which term is closest to the one searched. From a user’s point of view, thesaurus systems can be more complicated than search engines. However, a thesaurus is valuable for providing specific results efficiently, and it is also an important tool for term retrieval (University of Glamorgan n.d.).

An example of a thesaurus is the legal thesaurus created by the Legal Information Access Centre (LIAC) of the State Library of NSW, which contains legal terms. It is a two-level list of subject headings with particular emphasis on plain English legal information. The two levels are labelled “Top level (parent) categories”, containing the broad headings such as Accidents and Compensation, Business and Finance, and Biotechnology, and “Second level (child) terms”, containing all the subheadings that fall under each broad heading. The result looks like the following:

Accidents and Compensation
    Emergency services
    Insurance
    Negligence and liability
    Victims compensation
    Workers compensation

Business and Finance
    Banking
    Bankruptcy
    Credit
    Debt
    Electronic commerce
    Unclaimed money

Biotechnology
    Agriculture
    Biotechnology
    Cloning
    Genetic engineering
    Genetically modified foods
    Genetics
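The two-level list above can be sketched in Python as a mapping from each parent category to its child terms. This is only an illustration using the first two categories quoted in this report. The variable and function names are hypothetical and do not describe how the LIAC thesaurus is actually implemented.

```python
# Parent categories mapped to their child terms (taken from the list in this report).
subject_headings = {
    "Accidents and Compensation": [
        "Emergency services",
        "Insurance",
        "Negligence and liability",
        "Victims compensation",
        "Workers compensation",
    ],
    "Business and Finance": [
        "Banking",
        "Bankruptcy",
        "Credit",
        "Debt",
        "Electronic commerce",
        "Unclaimed money",
    ],
}

# Inverted index: child term -> parent category.
parent_of = {
    child: parent
    for parent, children in subject_headings.items()
    for child in children
}


def broader(term):
    """Return the parent category of a child term, or None if the term is not listed."""
    return parent_of.get(term)


def narrower(term):
    """Return the child terms of a parent category (empty list if there are none)."""
    return subject_headings.get(term, [])


print(broader("Debt"))
print(narrower("Accidents and Compensation"))
```

Keeping both directions of the relationship available lets a user move up from a specific term to its heading, or down from a heading to every term filed under it.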

Accidents and Compensation, Business and Finance, and Biotechnology are the BTs (Broader Terms) of this thesaurus, and the subheadings that fall under them are the NTs (Narrower Terms) (WebLaw n.d.). RTs are Related Terms. For example, an added term “Interest rates” would be directly related to Business and Finance. UF stands for “Used For” and was developed to resolve problems of synonymy. For example, the term “Liability” would be a UF entry, because it is treated as a synonym for Debt, both falling under Business and Finance.

Folksonomy

Folksonomy is defined as the decentralised classification system that results from a person tagging any piece of information or URL of interest, and maintaining, storing and organising those tags in the way they prefer for retrieval purposes.

Folksonomies allow internet users to find one another through common interests. Social tagging websites (Delicious.com, YahooBuzz.com and so on) make it easier for users to find their desired resources. They have also increased the popularity of folksonomies by establishing communities of people who share similar insights and who can easily locate information within their topic of interest.

Delicious.com is one of the most popular social bookmarking services. It allows registered users to tag resources and share their tags with other people. Delicious has several advantages that make it more usable and popular than many other tagging websites. First, Delicious allows the user to tag information under any keyword they find relevant, write a note about it, and store it in any folder they would like. Second, Delicious bookmarks are easy to access and are readable by any internet browser and/or RSS reader. Third, it is searchable: looking up any term returns a number of tags within the user’s area of interest.

There are many other social tagging websites, similar to Delicious, with their unique advantages that make them recognisable, such as Twitter, StumbleUpon, Digg, YahooBuzz, etc...

The difference between using a controlled vocabulary system and using a folksonomy is that a folksonomy gives the individual the freedom to tag any piece of information related to their topic of interest, put it under any name they think is relevant, and store it anywhere they like. Controlled vocabularies, by contrast, are more restricted in terms of language (i.e. spelling, punctuation, etc.) and narrow the search terms down to what is available in the system.

Decision-making process on which method to use

The choice of whether a controlled or an uncontrolled vocabulary is more useful for a particular organisation depends on the size of the organisation, the amount of information it deals with, and the degree of relatedness of that information. For example, if the organisation’s information flows hierarchically, it may be better to use a controlled vocabulary system such as a taxonomy, which organises information in a hierarchical structure. If an organisation’s aim is to define any term that might be ambiguous for its users, then a thesaurus would be useful.

Conclusion: Controlled and uncontrolled vocabularies are systems for organising data and information for retrieval and storage purposes. Controlled vocabularies are centralised systems where information is gathered and put in a hierarchical structure. For example, taxonomies follow a hierarchical structure in putting a concept or a set of related concepts together, starting from the broadest idea and narrowing it down to the most detailed term or concept. A thesaurus is another controlled vocabulary system that is simpler than a taxonomy. Its main purpose is to define any term that might be ambiguous to the user. A thesaurus breaks a concept down into two main categories, Broader Terms (BTs) and Narrower Terms (NTs), where the broader concepts fall under the BT section and any narrower term falls under the NT section. The benefit of using such systems is that they cut searching time, as they provide users with specific terms and their proper spelling, and they ensure that the terms are consistent and unambiguous. On the other hand, there are some costs that the organisation needs to pay attention to when adopting controlled vocabularies. Such systems can be costly and time-consuming, as training sessions may need to be conducted. Moreover, such systems need to be kept up to date and double-checked, as there is a greater chance of human error.

Uncontrolled vocabularies are more decentralised, and are more popular and common nowadays, especially on the World Wide Web. Folksonomies, or social tagging, allow individuals to tag and store whatever information they find useful under any term that may be relevant, and to share it with users who have the same interests. This gives users the freedom to save and name information the way they find appropriate, which can also be a disadvantage, because people tend to misspell some terms and omit punctuation, which can be misleading to other users. Organisations should look closely at the advantages and disadvantages of both systems and consider the costs and benefits of adopting either one, according to their needs, their culture, and the volume of information they deal with in their everyday and long-term use.

References:

Educause Learning Initiative, 2005, ‘7 things you need to know about Social Bookmarking’, Educause Learning Initiative, viewed 22 March 2010, http://net.educause.edu/ir/library/pdf/ELI7001.pdf

Gilchrist, A, 2002, ‘Thesauri, taxonomies and ontologies- an etymological note’, CURA Consortium and TFPL Ltd, vol.59, no.1, pp.7-18, viewed 22 March 2010, Emerald

Intellogist, 2009, Controlled Vocabulary, Intellogist, viewed 22 May 2010, http://www.intellogist.com/wiki/Controlled_Vocabulary

Metataxis, 2007, Search vs.Taxonomy?, Metataxis Limited, viewed 22 May 2010, http://www.metataxis.com/exponent-0.96.5-GA/index.php?section=24

Murnane, L, June 2006, ‘Social Bookmarking, Folksonomies and Web 2.0 Tools’, Medford, vol. 14, Iss. 6; pg. 26, 13 pgs, viewed 22 March 2010, ProQuest

Plosker, G, Jan/Feb 2005, ‘Taxonomies: Facts and Opportunities for Information Professionals’ Medford, vol. 29, Iss. 1; pg. 58, 3 pgs, viewed 22 March 2010, ProQuest

Reamy, T, Nov/Dec 2007, ‘Taxonomy Development Advice’, Silver Spring, vol. 21, Iss. 6; pg. 35, 3 pgs, viewed 22 March 2010, ProQuest

Sethearly, 2007, ‘Folksonomy versus Taxonomy’, blog, 15 February, viewed 22 March 2010, http://sethearley.wordpress.com/2007/02/15/folksonomy-versus-taxonomy/

Spiteri, L, 2007, ‘Structure and form of folksonomy tags: The road to the public library catalogue’, School of Information Management, vol.4, no.2, viewed 22 March 2010, http://www.webology.ir/2007/v4n2/a41.html

TSO, n.d., The business benefits of taxonomy, The Information Management Company, viewed 22 May 2010, http://www.the-stationery-office.com/gempdf/TaxonomyV1.pdf

University of Glamorgan, n.d., Chapter 2- Thesauri in information searching, University of Glamorgan, viewed 22 May 2010, http://www.comp.glam.ac.uk/~FACET/dblocks/DBlocksPhD2004_Chapter2_thesauri.pdf

Vernau, J, 2005, ‘How Does Taxonomy Work’, ‘The Business Benefits of Taxonomy’, SchemaLogic, WA, viewed 22 March 2010, http://cm-mitchell.com/PDFs/WP-BusinessBenefitsTaxonomy.pdf

WebLaw, n.d., The WebLaw Thesaurus, WebLaw, viewed 22 May 2010, http://weblaw.edu.au
