Expanding the System Definitions and Configurations (Taxonomy and Data Structure)
Magan Arthur is the principal consultant at ACG, an independent consulting group for end-to-end planning and execution of innovative enterprise content management projects.

Magan Arthur
ACG
60 Canyon Road
Fairfax, CA 94930, USA
Tel: +1 415 462 2979
Fax: +1 415 482 9304
Email: Magan@arthurconsultinggroup.com

© Henry Stewart Publications 1743–6559 (2005) Vol. 1, 4 279–297 JOURNAL OF DIGITAL ASSET MANAGEMENT

Keywords: taxonomy, metadata, data structure, data presentation, metadata templates, polyhierarchical structures

Abstract
This paper is part of a series of enterprise content management (ECM) best practices from ACG, an independent consulting group. The series provides practical tips and expert advice on topics covering planning, implementing, and improving enterprise content management systems and their components. This paper focuses on taxonomy and data structures. It is written from the point of view of the implementation team. It assumes you have some level of experience with the concept of metadata and taxonomies, but it is not an academic study. This paper tries to be hands-on, and intellectual only to the degree necessary to convey certain principles. It will provide links to resources which may also target more academic audiences. The complete ECM Best Practices Series from ACG is available at http://www.arthurconsultinggroup.com.

INTRODUCTION
Taxonomy is the science of describing an object, in our case content or assets. In addition to describing the object, a taxonomy will also place it into a relationship with other content and group the content in logical collections or nodes of a hierarchy.¹

Taxonomy is not a new term, and library science is a good 2000 years old. The current renaissance is due to a growing understanding that file systems are not the right tool to manage and control access to the growing digital content repositories of companies, governments or any organization of even medium size. Stricter rules and regulations, specifically in the USA, require companies to improve the organization of their content, and taxonomy is a way to apply some very old and proven methods to a new form of managing content. While some old wisdom can help with the new challenges, there are various aspects of the new media which are not well covered in the age-old science. This paper will address both areas.

A taxonomy for a larger system will need to describe and group content from various sources in a logical but also useful way. This structure can become a complicated hierarchy with hundreds of nodes. If you plan for a larger system and you do not have a librarian on staff, you should seriously consider securing the services of a consultant. In addition, almost every industry conference (AIIM, Henry Stewart DAM Symposium and others) has dedicated seminar tracks for taxonomy.

This paper starts with a clarification of terms. This is necessary because there are not yet generally accepted standards or even guidelines for the terms used in describing taxonomies or data structures. (Interestingly, you will find that building a larger taxonomy is largely about clarifying terms.)

I will then follow the order also used by Ann Rockley in her outstanding book Managing Enterprise Content², which distinguishes metadata for categorization from metadata for individual items. First I will discuss classification or categorization of content and then provide ideas on how to build a metadata scheme for the individual assets or content (Ann Rockley refers to this as element metadata).

In the last part of this paper I will describe considerations in regard to expanding the system, which will touch on other data structures not commonly included in taxonomy discussions. These data structures include user groups and roles, security, ingestion and download folder structures, as well as other searchable indexes.

CLARIFICATION OF TERMS
Before getting into more detail I would like to clarify a few terms.

Taxonomy is a system of describing an object also through its relationship to other objects. Usually these relationships are expressed in a hierarchy. Administrative data (use, usage rights, status etc.) are usually not considered in these definitions. However, administrative data are very important for your system to function. There is a second, broader meaning of the term taxonomy: any data used to describe and classify content. It has become common to refer to any system used to find and describe digital content as a taxonomy.

Metadata is a wider term which, for the purpose of this paper, shall include any data about an object, both descriptive and administrative in nature (data about data).

Metadata structure is the system of metadata templates that will be used to classify, find and describe the objects of your system.

Collection will refer to any grouping of content, which could be a folder, a collection, or also a job or project.

Object will be any element that can be described with its own set of metadata: individual files, collections, folders, jobs, projects, user groups, users, upload or staging folders and more. One way to think of an object is as a row in the database, with metadata as the columns.

Authoritative term is the term used to describe a node in the classification hierarchy. An authoritative term can have many synonyms or related terms, but it is chosen to represent all these concepts as the most identifiable term in the classification.

Parent/child relationship expresses the hierarchical relationship in a classification. "Mammal" is the parent of "human," and "race" is the child of "human."

Ontology is a term related to taxonomy and usually tries to explain any object from its place in the hierarchy of other objects. I will try to avoid highly academic terms and instead use more descriptive language whenever possible.

TAXONOMY OR HIERARCHICAL STRUCTURE

General considerations
Before you start it should be said that common sense is a very important element in this exercise. The end result should be a structure that is easy to use by end-users, content contributors and administrators alike. A classification system for all content of a large organization is the best case scenario, but it might not be practical to maintain, as it requires ongoing maintenance from staff with specialized skill sets. If your key concern is a useful classification or search system for the daily tasks of the average person, your energy could be better spent on refining or "harmonizing" a number of smaller and more targeted structures managed by tools that are more departmental.

Another important clarification is that your "enterprise taxonomy" is not necessarily tied to a software product (existing or planned). It makes a lot of sense to start with a piece of paper. The following questions can be mapped in a spreadsheet or a simple table.

Building the structure
What constitutes content in your organization and where is it?
As you brainstorm this question you will almost naturally start building a classification in a hierarchical structure (taxonomy). This structure will likely resemble the structure of any content management solutions already in use and/or your existing file systems. In most companies, however, there is no agreed enterprise structure to the file systems or for different content management systems, digital or otherwise. Every department has different, sometimes poorly maintained, file folders.

Independent of any software solution you have or will employ to manage all or part of that content, creating a map of the content in your organization is a very valuable exercise. Figure 1 shows a sample structure.

Marketing
    Product marketing
        Data Sheets
            Product Specification
            Solution Overview
        Power Point Slides
        Product Videos
        Product Shots
        ...
    Trade Shows
        Banner Ad
        Event Specific
            NAB 2005
        ...
HR
    Benefits
        Forms
            401k
            Life
            Disability
            ...
        Info Docs
            401k
            Life
            Disability
            ...

Figure 1: A sample structure

Different users will make different logical associations and search for the same content in different ways. While for the sales team "images" might include anything from photos to logos and graphics, these are very separate categories for the professional designer. In the example in Figure 1, it would make just as much sense to build the hierarchy as shown in Figure 2.

HR
    Benefits
        401k
            Forms
            Info Docs
        Life
            Forms
            Info Docs

Figure 2: Alternative structure

To begin with, these details are not that important. The first goal is simply to identify all the content that is of value for your organization. As with any larger project, it is very important to have a general understanding of the scope and context. Only after that has been established will it make sense to decide in which area more detail and organization will most benefit the organization.

As you think more about your specific situation it will make sense to refine this general map. It is highly recommended that you involve the people who will ultimately use this system when you think about the following issues. This is not simply general good practice; involving the users is essential to capturing both the formal and the informal relationships and flows of your content.

Technology
    Software
        Enterprise Software
            Content Management
                Ann Rockley, "Managing Enterprise Content" 2003, New Riders

Figure 3: Library classification

HR
    Benefits
    Policies
    Procedures
    ...

Figure 4: Simplified structure

last level of the hierarchy tree is a book (see Figure 3).

The ability of technology to display search results intuitively and to refine searches with specific metadata can make it slightly easier for a digital library. To