Jiri Pika

Knowledge Organization in Sciences – As a Classificatory Performance and Classification Design Model for Humanities

Abstract

The paper provides an overview of the development of classification schemes in the natural sciences, with the major control of classification criteria presented in the Linnaean system. Based on natural laws, this system has been accepted worldwide. Whereas the indexing of natural science items follows the logic and systematics of natural laws, a real challenge still exists in the classification of documents originating from human intellectual activity. Items produced as human output are a particular phenomenon and, as such, follow no common rules. This lack of an evident natural law as a basis for a common classification can be compensated for by the practices of facet classification and by the Information Coding Classification (ICC) [1], which advances into the field of classifying literature. Their common feature is to analyse the information content with a set of categorical questions and to express the answers in exact terms, concepts and notations. The ensuing categorizations are certainly both concise and unequivocal: essentially Linnaean, or better!

Introduction

Among the numerous examples of knowledge organization in the sciences, one case is particularly interesting, mainly from the documentary point of view (Umstätter 2009). Ever since the Swedish naturalist Carl Linnaeus was knighted in 1761, becoming Carl von Linné in recognition of his classificatory work, we have learned that the natural arrangement of the objects of our intellectual and physical environment leads to knowledge. Linnaeus thus made a significant contribution to the development of the documentary sciences, without being adequately appreciated in this area. How revolutionary his idea was can be seen from the fact that his work “Systema Naturae” (1735) was listed on the “Index Librorum Prohibitorum” by the pope (Jahn 2000). His influence in the 18th century was so great that J.W.v. Goethe wrote to his friend Carl Friedrich Zelter on 7 November 1816 [2]: This day I have reread Linnaeus and I am shocked by this extraordinary man. I have learned so much from him, but not botany. With the exception of Shakespeare and Spinoza, I know no one among the no longer living who has influenced me more strongly.

Even his opponent Georges-Louis Leclerc, Comte de Buffon, the director of the royal gardens in Paris, had to accept the systematics of Linnaeus on royal behest in 1774. The systematic arrangement of living creatures by Linnaeus came as a result of the increasing travel activities of naturalists and their plant and animal descriptions. To name a few: Andrea Cesalpino described more than 1500 plants in his book “De Plantis” (1583), and Caspar Bauhin described 6000 plant species in his “Pinax Theatri Botanici” (1623). Joseph Pitton de Tournefort (Eléments de botanique ou méthode pour connaître les plantes) characterized nearly 7000 species in 1694, and John Ray described over 18 000 by 1704 in “Historia plantarum” (1686-1704). Organizing this vast amount of information and coping with the various classifications was extremely important at that time, especially for the purpose of and (Hansen 1902).
In particular, the use of different names for the same plant led to dangerous misunderstandings. The rules, which Linné gives in his for choosing the name, are masterful. He points out the absurdity of most of the old names and calls on botanists to choose their Nomina vera with the words: idiotae imposuere nomina absurda. In his “Philosophia Botanica” (1751) Linnaeus characterized other botanists as “Fructistae”, “Corollistae”, “Calycistae” and several other classes of botanists [3] (Hansen 1902), depending on which part of the plant his botanist colleagues used to design their classifications (Linnaeus 1751, Rádl 1905). Whereas other botanists are classed as Fructistae, Corollistae or Calycistae, under the Sexualists [4] stands a solitary, proud “ego”, which is correct, since he is the sole inventor of the “sexual system”, but it bears a strong aftertaste of the most sovereign self-confidence (Hansen 1902). Linnaeus regarded himself [5] as a “Sexualist” because he based his system on the sexual organs of plants. Linnaeus made it clear that sexuality is a ubiquitous phenomenon of nature. This discovery, truly brilliant for its time, can be found in his thesis (1730), an account of plant sexual reproduction: “Praeludia Sponsaliorum Plantarum” (On the prelude to the wedding of plants) [6]. He relied on the knowledge of Rudolph J. Camerarius (1665-1721), professor of medicine and director of the botanical garden in Tübingen, who had demonstrated in his publication (De sexu plantarum epistola, 1694) that plants have sexuality. Nevertheless, it was Nehemiah Grew (1641-1712) who had actually discovered this fact but was not able to prove it. Linnaeus described this phenomenon in highly anthropomorphic terms: he repeatedly talks about the bridal bed and, in connection with the “Polyandria”, asserts that in a flower with 20 stamens and one style there are '20 males or more in the same bed with the female', a state of affairs enjoyed by the poppy (Papaver) and the linden (Tilia).
He opens his dissertation: “In spring, when the bright sun… The actual petals of a flower contribute nothing to generation, serving only as the bridal bed which the great Creator has so gloriously prepared, adorned with such precious bed-curtains, and perfumed with so many sweet scents, in order that the bride-groom and bride may therein celebrate their nuptials with the greater solemnity” [7] (Blunt 1971). The introduction of sexuality as a classification criterion led to the theory of evolution expressed later by Darwin, which found its foundation exactly in this classification. So it is understandable that the “Systema Naturae” was banned by the Pope and placed on the papal Index of Prohibited Books (Index Librorum Prohibitorum).

Linnaeus pointed out that a science is established primarily by its classification system, which arranges the knowledge relations within the specific system. Today we would say: integrated into semiotic networks (Umstätter 2009). Although Linnaeus initially regarded his system as artificial (today it could be called constructivistic), it soon became evident that it was a natural one. His system depicted the natural evolution of living nature, because it applied the sexual kinship of species as a classificatory criterion. Thus he transformed his system from pure constructivism into an evolution model (Umstätter 2009). Linnaeus did not suppose that his classification of the plant kingdom in the book was natural, reflecting the logic of God’s creation. His sexual system, in which species with the same number of stamens were treated in the same group, was convenient but, in his view, artificial. Linnaeus believed in God’s creation and that there were no deeper relationships to be expressed. He is frequently quoted as having said: “Deus creavit, Linnaeus disposuit” (“God created, Linnaeus organized”) [8].

Linnaean taxonomy

In 1727 Linnaeus became aware of a newspaper article reporting on a public lecture on the sexuality of plants by Sébastien Vaillant, a member of the Academy of Sciences and director of the royal garden in Paris. It indicated that the pollen of plants has the same function as sperm. The sexualistic system of Linnaeus (Rádl 1905) became accepted despite the resistance of many botanists, because it was clear, consistent and provocative (Umstätter 2009). Thus, it could not be ignored by the world of experts (Mayr 1982). From then on, newly discovered creatures could be classified and recognized again by a standardized procedure. Another important achievement of Linnaeus is the establishment of the still-in-use standardized binomial nomenclature [9] (Paterlini 2007). In the "" (1737) he determined the rules according to which the genera of plants should be named. The name of a plant should be two-fold: a genus name, equal to the human family name, and a name of a species, as the name in daily life (nomina trivialia). The diagnosis depends on the associations in kinship circle of the respective species… (Jahn 1985). Equally important were the terminology introduced by Linnaeus and his instructions on how to describe plant species. He introduced and defined about 1000 botanical terms in "" by 1736. Crucial for the classification was the clear distinction between significant and insignificant characteristics. Linnaeus regarded characteristics such as color, odor and size as insignificant, because it was obvious to him that these could vary easily even within one species. In contrast, the sexual system was largely a type- or species-consistent categorization. During his life Linnaeus realized ever more clearly that the species he had initially thought to be immutable can hybridize.
Moreover, he observed some adaptation of plants to their environment and towards the end of his life he considered the origin of new species by hybridization to be quite feasible (Mallet 2007).

Cladistics, Knowledge Organization and Phylogenetic Classification

The question of what can be used in a classification as a division criterion for categorization proved crucial in Linnaeus’ work. The key idea in cladistics is to let the classes branch according to their relationship. Whereas the development of many library classifications for the routine indexing of publications requires no cladistic considerations, these are essential in the organization of knowledge, because knowledge develops epigenetically (Umstätter 2009). In the search for alternative approaches to disciplinary classification, Gnoli (2006) reviews and evaluates classification schemes, bibliographic classifications and facet analysis [10] and proposes a model called phylogenetic classification. It integrates both evolutionary order and similarity as its main criteria: “phylogenetic method seems to have some potential to give a significant contribution to the development of more satisfying and generally valid classification schemes”.

Status quo in contemporary classification systems

Contrary to the Linnaean Period, which brought consensus and a worldwide-accepted system for organizing plants and animals, today's world of complex information and documents still seems to be running in a Pre-Linnaean Period, as can be seen:
− The publishing rate is increasing, analogous to the rapidly increasing rate of plant and animal descriptions in the Linnaean Period.
− An analogous “origin of new species by hybridization” is arriving in libraries: hybrid documents in all types of media formats, imposing a need to re-arrange document and library rules.
− The flood of new scientific disciplines, concepts and terminologies keeps rising.
− No agreement regarding one common classification system exists.

On the contrary:
− The increasing publishing rate is accompanied by an increasing number of thesauri and classifications. Dahlberg (1982) quantified their number, which had reached 2261 in 1982. Today’s number is possibly a multiple of that.
− Most classification systems are basically of the “mark and park” type (Slavic 2000), helping to create shelf marks but providing hardly any description of document content.
− A real challenge exists in the classification of documents from human intellectual activity. Human output (Dahlberg 2014) is a particular phenomenon and, as such, follows no common rules: “inanimate objects (Mayr 1982) should be classified by principles different from those used in biology, because they lack any evolutionary history” (Mayr & Bock 2002).
− Classification systems, thesauri and knowledge organization possess different levels of constructivism. Most classifications are constructivistic in their systematic arrangements, as they deal with phenomena outside the natural sciences, i.e. human output, and they use arbitrary methods to accommodate their information about the documents (Dahlberg 2014, 2015).
− Indexing consensus across various cultures - Since not all cultures worldwide understand one item equally and consistently (Tillett 2015), it follows that the indexing of human output may differ or might be biased.

Yet the two following schemes help to categorize human output almost analytically:
− Information Coding Classification: answering the challenge of indexing literature and various kinds of information became a goal for the ICC, a classification system covering almost all of the 6500 existing knowledge domains. “Its conceptualization goes beyond the range of the well known library classification systems, such as DDC, UDC, and LCC by extending into knowledge systems that so far have not afforded to classify literature. ICC actually presents a flexible universal ordering system for both literature and other kinds of information, set out as knowledge fields” [11]. It has nine ontical levels, grouped under three captions [12]: 1. Prolegomena, 2. Life Sciences and 3. Human Output.
− Facet analysis extracts the information treated in the document and expresses the document content in the catalogue. This objectively uniform assessment consists of a sequential query that extracts a set of facts such as subject, place, time and form, thus summarizing the document content. Examples are UDC [13], PMEST [14] or CRG [15] as main schemes for the facet arrangement of concepts. Similarly, Soergel (2009) suggested scrutinizing the document text with a set of categorical questions and expressing the responses in a sequence of exact terms, concepts and notations.
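
The sequential query described for facet analysis can be illustrated with a minimal Python sketch. It is not an implementation of UDC, PMEST or CRG: the class `FacetRecord`, the function `matches`, the fixed facet order and the sample values are all assumptions made for illustration, using only the four facets named above (subject, place, time, form).

```python
from dataclasses import dataclass

# Assumed citation order for this sketch: the four facets named in the text.
FACET_ORDER = ("subject", "place", "time", "form")


@dataclass(frozen=True)
class FacetRecord:
    """Answers to the categorical questions asked of one document (hypothetical)."""
    subject: str
    place: str = ""
    time: str = ""
    form: str = ""

    def as_string(self, sep: str = " : ") -> str:
        """Join the non-empty facets in the fixed order."""
        values = (getattr(self, name) for name in FACET_ORDER)
        return sep.join(v for v in values if v)


def matches(record: FacetRecord, **query: str) -> bool:
    """True if every queried facet equals the record's value (case-insensitive)."""
    return all(getattr(record, k).lower() == v.lower() for k, v in query.items())


# Invented sample document, used only to show the mechanics.
record = FacetRecord(subject="glacier movement", place="Alps",
                     time="Quaternary", form="monograph")
print(record.as_string())
print(matches(record, place="alps", form="monograph"))
```

Because every document answers the same questions in the same order, a search can address one facet at a time, which is what makes the resulting metadata shareable between catalogues.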

Significance of the Linnaean taxonomy for Classification Design

Although Linnaean taxonomy has been challenged by contemporary genomics and DNA sequencing technology, its value as the initial spark for a worldwide-accepted classification remains unequalled. Its integrative impact serves as an example of best practice for establishing one Unified Classification System for sharing uniform metadata among libraries and any other kind of collection. The above-mentioned schemes such as PMEST and SVOPT [16], or the UDC syntactic rules, can perform this kind of classification competently. Their common feature is to analyse the information content with a set of established questions and to convey the answers in a precise arrangement of concepts and notations. In the case of UDC this is expressed with hierarchically expressive notations that are easy to navigate and use. Its complex notations can be deconstructed accurately into simple UDC concepts. Today's aim is to initiate the quest for such a classification system by consensus, conceivably to be carried out by the next generation of KO experts.
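
How a complex notation might be deconstructed into simple concepts can be sketched as follows. The sketch covers only a simplified subset of UDC syntax (main numbers, the connecting signs ":" and "+", place auxiliaries in parentheses and time auxiliaries in quotation marks); the function name `deconstruct`, the sample notation and the treatment of labels are assumptions. Only the label for "628" (Cenozoic) is taken from note [17].

```python
import re

# Simplified token patterns (an assumption of this sketch, not the full UDC grammar):
#   "..."   common auxiliary of time
#   (...)   common auxiliary of place
#   digits and dots: main number
#   : or +  connecting signs
TOKEN = re.compile(
    r'"(?P<time>[^"]+)"|\((?P<place>[^)]+)\)|(?P<main>\d[\d.]*)|(?P<link>[:+])'
)

# Only this label comes from the text (note [17]); others are left unlabelled.
LABELS = {("time", "628"): "Cenozoic"}


def deconstruct(notation: str) -> list:
    """Split a compound notation into (kind, value) pairs, in order of appearance."""
    parts = []
    for m in TOKEN.finditer(notation):
        kind = m.lastgroup  # name of the alternative that matched
        parts.append((kind, m.group(kind)))
    return parts


# Hypothetical compound notation used only to exercise the tokenizer.
for kind, value in deconstruct('55(494)"628"'):
    print(kind, value, LABELS.get((kind, value), "(label not given here)"))
```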

Endorsement - quest for a synthetic classification system by consensus

Everyone should benefit from commonly accepted, comprehensive classification rules! The approach: extracted metadata, arranged in fixed categories, shared in a catalogue. The objective: clearly structured search and finding with high precision / high recall. All the ensuing categorizations are concise and unequivocal - essentially Linnaean, or even better!

Examples of classification dynamics

New concept and hierarchy - Quaternary, the youngest geological period and the one we live in, has been used since 1759 as a concept for the period younger than the Tertiary. As a name, it is contained in the titles of hundreds of books, and it is present in thousands of articles and in the names of numerous Quaternary research groups, working groups, commissions and conferences in almost all countries. A similar concept is Tertiary, used since 1750 to define the second youngest geological period and found in a similar number of titles, documents and organizations. The Tertiary has been divided into the Paleogene and Neogene periods. Currently, based on scientific evidence, both these concepts together with the Quaternary have been joined under one term, Cenozoic ("628" Cenozoic [17]). The use of Tertiary is discontinued, and Quaternary became a subdivision of Cenozoic. This fact needs clear referencing for older, current and future users, as both geological periods are archives of past climate changes, and therefore all the written records hold important information for calculating climate models. Since the development of scientific language, reflected by up-to-date classifications, will never cease to advance, it results in a steady enrichment of well-maintained classifications. These relations can be visualized by “additional referencing”, as discussed by Gnoli et al. (‘Commerce, see also Rhetoric’, 2015) for relationships other than hierarchical, i.e. cross-discipline relationships, by using ‘see also’, as in DDC, LCSH [18] and UDC, which points to related classes in other hierarchies. The NEBIS [19] system applies related terms (RT) for pointing into poly-hierarchies (Pika & Pika 2015). For that reason the newly added concepts, properly interlinked, must enable any query for every particular term in its conceptual environment. Alas: the designers of library management systems should be aware of that too.
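
The referencing needed for the Tertiary / Quaternary / Cenozoic case can be sketched as a small term network with query expansion. This is a hedged illustration only: the dictionary `THESAURUS`, the function `expand` and the relation codes USE, BT and NT (borrowed from common thesaurus practice) are assumptions and do not reproduce the NEBIS or UDC data.

```python
# Illustrative entries built from the example in the text.
THESAURUS = {
    "Cenozoic":   {"NT": ["Paleogene", "Neogene", "Quaternary"]},
    "Quaternary": {"BT": ["Cenozoic"]},
    "Paleogene":  {"BT": ["Cenozoic"]},
    "Neogene":    {"BT": ["Cenozoic"]},
    # Discontinued term: refer the user to the current terms that replace it.
    "Tertiary":   {"USE": ["Paleogene", "Neogene"]},
}


def expand(term: str) -> list:
    """Return the current term(s) for a query term plus their broader terms."""
    entry = THESAURUS.get(term, {})
    current = entry.get("USE", [term])  # follow USE references for old terms
    expanded = []
    for t in current:
        broader = THESAURUS.get(t, {}).get("BT", [])
        for candidate in [t, *broader]:
            if candidate not in expanded:  # keep order, drop duplicates
                expanded.append(candidate)
    return expanded


print(expand("Tertiary"))
print(expand("Quaternary"))
```

A query for the discontinued term still reaches the records indexed under its successors, so older literature titled with "Tertiary" remains findable alongside newer documents.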
Adjustment of redundant concepts - Monitoring the development of science terminology against classification schemes over the past 25-30 years [20] reveals that a vast amount of information has been faultlessly classified and contributes to enhancing the search yield (Pika 2010). Only sporadically were the metadata incorrect or inappropriate, owing to re-labelling or biased indexing.

Labelling, re-labelling - One of the redundant sources of vocabulary enrichment is red tape: some new expressions for current, still valid terms originate from the fact that in many countries grant organisations would not fund scientific projects bearing titles like “climate change” or “plate tectonics” for several consecutive years, as they seem out-dated. Hence a new scientific label, “skin tectonics” instead of plate tectonics, has a much better chance of obtaining further funding. As a consequence, the coining of the new concept of “skin tectonics”, which expresses the same phenomenon as plate tectonics, secured the funding. This particular success enriches the vocabulary; however, the information entropy increases. Though these artefacts are scanty, they must be disambiguated and placed in the correct context.

Bias in point of view leads to different categorization - A “glacier movement”, seen and indexed by a physicist, would frequently be expressed as “mechanics of continua” combined with “ice”. For a glaciologist or an earth scientist, “glacier movement” is a class descriptor in its own right, manifested by the glacier's advancing and retreating due to climatic change. The truth is that both classifications are useful, though it is a costly arrangement. A user from a natural sciences department would surely opt for the second description, whereas a physicist would search under the first formulation.
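
Both situations, the re-labelled term and the two points of view, can be sketched in a few lines of Python. The class `Catalogue`, the mapping `PREFERRED` and the document identifiers are hypothetical names introduced for illustration; the terms themselves are the examples from the text.

```python
from collections import defaultdict

# Re-labelled expressions mapped to the established term (example from the text).
PREFERRED = {"skin tectonics": "plate tectonics"}


def normalise(term):
    """Lower-case a term and replace a re-labelled expression by its preferred form."""
    t = term.strip().lower()
    return PREFERRED.get(t, t)


class Catalogue:
    """Toy index in which one document may carry several alternative descriptions."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, doc_id, *descriptions):
        # Each description is a tuple of terms that are combined, e.g. ("mechanics of continua", "ice").
        for terms in descriptions:
            key = frozenset(normalise(t) for t in terms)
            self._index[key].add(doc_id)

    def search(self, *terms):
        key = frozenset(normalise(t) for t in terms)
        return set(self._index.get(key, set()))


catalogue = Catalogue()
# The same document indexed from the earth scientist's and the physicist's point of view.
catalogue.add("doc-1", ("glacier movement",), ("mechanics of continua", "ice"))
catalogue.add("doc-2", ("skin tectonics",))

print(catalogue.search("glacier movement"))
print(catalogue.search("ice", "mechanics of continua"))
print(catalogue.search("plate tectonics"))
```

The double entry for the first document is the "costly arrangement" mentioned above: it has to be created and maintained twice, but either user finds the document under the formulation they expect.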

Conclusion

It is both challenging and rewarding to understand the way our users perceive. Our task is to adjust to their level of communication, nothing more, nothing less. Cooperatively attentive cognition yields the appropriate meaning for different terms (Umstätter 2009): “From the classificatory point of view, we basically have to realize that the limited human mind can comprehend this world only by its partial generalization. It starts with children who, at an early age, commence arranging the world into "quack quack" and "woof woof". Strictly speaking, it is the Aves and the Mammalia, but a lot of parents believe that these are ducks and dogs. With increasing knowledge, our thesaurus grows truly universal! A genuinely interesting challenge!”

Acknowledgement

At the ISKO 2014 Conference I met a number of interesting people, and some of them, from the Brazilian ISKO Chapter, sparked my curiosity about their country. In October 2015, during the UDC Seminar in Lisbon, I found the culture of the Portuguese language and people very appealing, and this prompted the idea of going to the ISKO 2016 Conference in Rio. At the same time, the work of Ingetraut Dahlberg and Walther Umstätter continues to be equally inspiring and encouraging.

Notes

[1] Information Coding Classification https://de.wikipedia.org/wiki/Information_Coding_Classification
[2] Goethe, J.W.v. 1816:27/7539 http://www.zeno.org/Literatur/M/Goethe,+Johann+Wolfgang/Briefe/1816.
[3] Philosophia Botanica http://www.scientificlatin.org/philbot/pbbibl.html
[4] http://www.scientificlatin.org/philbot/pb31.html
[5] Philosophia Botanica http://www.scientificlatin.org/philbot/pb31.html
[6] Praeludia Sponsaliorum Plantarum, in quibus Physiologia earum explicatur, Sexus demonstratur, modus Generationis detegitur, nec non summa Plantarum cum Animalibus analogia concluditur. Dec. 1729. Original manuscript 1730 preserved in Uppsala University Library, printed in 1908 in: Skrifter af Carl von Linné. Utgifna af Kungl. Svenska Vetenskapsakademien. Band 4, Nr. 1, 1908, S. 1-26
[7] Blunt (1971), p. 244 and p. 34
[8] https://en.wikipedia.org/wiki/Systema_Naturae#cite_note-NG-10
[9] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1973966/
[10] Facet classification (FC)
[11] Information Coding Classification https://de.wikipedia.org/wiki/Information_Coding_Classification
[12] Literally from ICC: 1. Unbelebtes, 2. Belebtes, 3. Produziertes Sein = 1. Inanimate, 2. Animate and 3. Human Production. Or the formal version: 1. Prolegomena, 2. Life Sciences and 3. Human Output.
[13] Universal Decimal Classification - UDC codes can describe any type of document or object to any preferred level of detail https://en.wikipedia.org/wiki/Universal_Decimal_Classification
[14] In Colon classification (CC), facets describe "Personality" (subject), Matter, Energy, Space and Time: PMEST. It is a system of library classification developed by S. R. Ranganathan.
[15] CRG categories: an expansion of Colon classification to 13 categories that describes any desired level of detail - see Vickery’s (1960) list for classifying scientific domains: Substance, Organ, Constituent, Structure, Shape, Property, Object of action (patient, raw material), Action, Operation, Process, Agent, Space, Time.
[16] SVO, SVOPT, SVOMPT - Subject S, Verb V, Object O, Place P, Manner M, Time T. - English Grammar Word Order. https://en.wikipedia.org/wiki/Subject/verb/object
[17] "628" Cenozoic (65.5 MYBP - now) in http://www.udcsummary.info/php/index.php
[18] DDC, LCSH: Dewey Decimal Classification, Library of Congress Subject Headings
[19] NEBIS library network, ETH-Bibliothek, Zurich.
[20] KVK - Karlsruhe Virtual Catalog. https://www.google.ch/#q=kvk

References

Blunt, Wilfrid & Stearn, William T. (1971). The compleat naturalist: a life of Linnaeus. London: Collins.
Blunt, Wilfrid (1994). The Art of Botanical Illustration. Dover Publications.
Dahlberg, Ingetraut (Ed.) (1982). International Classification and Indexing Bibliography (ICIB 1): Classification systems and thesauri 1950-1982. Frankfurt: INDEKS Verlag.
Dahlberg, Ingetraut (2014). Wissensorganisation – Entwicklung, Aufgabe, Anwendung, Zukunft. Würzburg: Ergon Verlag.
Dahlberg, Ingetraut (2015). Warum Universalklassifikation? Lecture at ESZ-Kolloquium 24.10.2015, Darmstadt. (Personal communication from Ingetraut Dahlberg: Bad König)
Gnoli, Claudio (2006). Phylogenetic Classification. Knowledge Organization 33(3):
Gnoli, Claudio, De Santis, Rodrigo, & Pusterla, Laura (2015). Commerce, see also Rhetoric: cross-discipline relationships as authority data for enhanced retrieval. In: The International UDC Seminar entitled "Classification & Authority Control: Expanding Resource Discovery" in National Library of Portugal in Lisbon, on 29-30 October 2015.
Goethe, Johann Wolfgang von (1816). Briefe: 27/7539. [http://www.zeno.org/Literatur/M/Goethe,+Johann+Wolfgang/Briefe/1816]
Hansen, Karl Adolf (1902). Die Entwicklung der Botanik seit Linné. Gießen.
Hjørland, Birger (2012). Facet analysis: The logical approach to knowledge organization. Information Processing & Management, 49(2), March 2013: 545–557.
Jahn, Ilse, Löther, Rolf & Senglaub, Konrad (1985). Geschichte der Biologie, Theorien, Methoden, Institutionen, Kurzbiographien. 2nd ed., 864 pp. Jena: VEB Fischer.
Jahn, Ilse (2000). Geschichte der Biologie. 3rd ed. Heidelberg: Spektrum Akademischer Verlag.
Linnaeus, Carl (1730). Praeludia sponsaliorum plantarum [https://de.wikipedia.org/wiki/Praeludia_Sponsaliorum_Plantarum]
Linnaeus, Carl (1735). Systema Naturae. Lugduni Batavorum [Leiden, the Netherlands] [https://en.wikipedia.org/wiki/Systema_Naturae#cite_note-NG-10]
Linnaeus, Carl (1751). Philosophia Botanica, P. 12, ed. 1, Stockholm & Amsterdam.
Mak, Christian (2011). "Kategorisierung des Datenbestandes der EuropeanaLocal-Österreich anhand der ICC". Bericht des Instituts "Ang. Inf. Forschungsgesellschaft mbh" (AIT), Graz.
Mallet, James (2007). Hybrid speciation. Nature, 446(15).
Mayr, Ernst (1982). The growth of biological thought: diversity, evolution, and inheritance. Cambridge (Mass.), London: Belknap Press.
Mayr, Ernst, & Bock, W. J. (2002). Classifications and other ordering systems. J. Zool. Syst. Evol. Research, 40: 169–194.
Paterlini, Marta (2007). There shall be order. The legacy of Linnaeus in the age of molecular biology. European Molecular Biology Organization Report 2007 Sep; 8(9): 814–816. doi: 10.1038/sj.embor.7401061
Pika, Jiri (2010). Erschließungssysteme in der Schweiz und in der ETH-Bibliothek. KIT Karlsruhe, 23.7.2010 (urn:nbn:de:bsz:ch1-qucosa-64942). 34. Jahrestagung der GfKl.
Pika, Jiri, & Pika-Biolzi, Milena (2015). Multilingual subject access and classification-based browsing through authority control: the experience of the ETH-Bibliothek, Zürich. In: The International UDC Seminar entitled "Classification & Authority Control: Expanding Resource Discovery" in National Library of Portugal in Lisbon, on 29-30 October 2015.
Rádl, Emanuel (1905). Geschichte der biologischen Theorien seit dem Ende des 17. Jahrhunderts. Leipzig, 2 vols., 1905-09.
Slavic, Aida (2016). "mark and park" tool (DDC and LCC). A Definition of Thesauri and Classification as Indexing Tools [http://dublincore.org/documents/thesauri-definition]
Soergel, Dagobert (2009). Illuminating Chaos. Using Semantics to Harness the Web. Presentation at UDC Seminar, The Hague, 29-30 October 2009.
Tillett, Barbara B. (2015). Complementarity of perspectives for resource descriptions. In: The International UDC Seminar entitled "Classification & Authority Control: Expanding Resource Discovery" in National Library of Portugal in Lisbon, on 29-30 October 2015.
Umstätter, Walther (2009). Zwischen Informationsflut und Wissenswachstum. Berlin: Simon Verlag für Bibliothekswissen. 340 pp.
Umstätter, Walther (2012). Email from Inetbib: Re: [InetBib] Leser oder Surfer? Die Zukunft der NYPL - 6 June 2012.
Vickery, Brian Campbell (1966). Faceted classification schemes. New Brunswick, NJ: Graduate School of Library Science at Rutgers University (Rutgers series on systems for the intellectual organization of information, edited by S. Artandi, V. 5).