Proc Indian Natn Sci Acad 74 No.1 pp. 27-38 (2008)

Review Article Materials for Developing Databases in Taxonomic Research – A Review

M SANJAPPA*, P VENU** and W DINESH ALBERTSON**
*Director, Botanical Survey of India, Kolkata – 700 064
**Botanical Survey of India, Deccan Circle, Hyderabad – 500 048

(Received 26 July 2007; Accepted 28 March 2008)

The essential role of databases in taxonomic research is emphasized from the taxonomist’s perspective. Specimens, illustrations and little known catalogues, besides consolidated floras, are considered subjects for databases in taxonomic research. The present communication analyses the holdings in Indian herbaria and the problems that confront them, and suggests ways for their greater exposure. It is felt that a consortium should be evolved to link these herbaria, pool their holdings and facilitate greater accessibility of specimens. Images, though not a complete substitute for specimens, expedite the tracing of specimens in specific herbaria and their prompt borrowing for study, and thereby support appropriate taxonomic decisions and the updating of nomenclature. Such centralization accelerates revisionary studies and makes Indian taxonomists less dependent on European herbaria. Illustrations and less known catalogues, which are also crucially linked to literature, should find a place in databases. Images of live collections with authenticated names empower forestry personnel to collect locality-specific information on various species in monitoring exercises. Problems associated with names in databases on the consolidated flora of a state or region, and their inherent dynamism owing to the flexibility given in the provisions of the International Code of Botanical Nomenclature, are also addressed. The paper reviews the efforts put in by various institutions towards digitization. It is suggested that the database on the consolidated National Flora should be carefully phased, initially building fully referenced species checklists for various families and then appending additional data sets for value addition. This gives the needed stability to the developed database. Our approach should be gradual and graded, and should involve taxonomists, as they are the principal data generators.
As the old databases are customized in configuration and utility, and new ones are added day by day, this review cannot claim to be exhaustive. It nevertheless gives a broad view of the diverse resources on hand to be digitized and emphasizes how an effective database could be built upon them. A thematic proposal is presented as a guideline and submitted for consideration and debate.

Key Words: Databases; Catalogues; Digitization; Herbaria; Illustrations; Flora and Types
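The phased design proposed above (a fully referenced checklist first, further data sets appended later, with room for names to change) can be illustrated with a minimal sketch. It is only one possible reading of the proposal, not the authors' design: the class names, field names and the citation placeholders are hypothetical. The example uses a synonymy discussed later in this review, where Strobilanthes circarensis is placed under S. pulneyensis.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One accepted name in the checklist (hypothetical structure)."""
    accepted_name: str
    family: str
    reference: str                                 # citation for the name
    synonyms: dict = field(default_factory=dict)   # synonym -> citation
    data: dict = field(default_factory=dict)       # data sets added in later phases


class Checklist:
    def __init__(self):
        self.entries = {}      # accepted name -> Entry
        self.name_index = {}   # every known name -> its accepted name

    def add(self, accepted_name, family, reference):
        """Phase 1: register a fully referenced accepted name."""
        entry = Entry(accepted_name, family, reference)
        self.entries[accepted_name] = entry
        self.name_index[accepted_name] = accepted_name
        return entry

    def resolve(self, name):
        """Return the entry for a name, whether accepted or a synonym."""
        accepted = self.name_index.get(name)
        return self.entries[accepted] if accepted is not None else None

    def synonymize(self, name, under, reference):
        """Sink an accepted name into synonymy under another accepted name."""
        target = self.entries[under]
        old = self.entries.pop(name)
        target.synonyms[name] = reference
        target.synonyms.update(old.synonyms)
        # Every name that pointed at the old entry now points at the target.
        for known in (name, *old.synonyms):
            self.name_index[known] = under
        # Keep data recorded under the old name without overwriting the target's.
        for key, value in old.data.items():
            target.data.setdefault(key, value)

    def append_data(self, name, key, value):
        """Phase 2: attach an additional data set to whatever name is given."""
        entry = self.resolve(name)
        if entry is None:
            raise KeyError(f"unknown name: {name}")
        entry.data[key] = value


checklist = Checklist()
checklist.add("Strobilanthes pulneyensis", "Acanthaceae", "citation placeholder 1")
checklist.add("Strobilanthes circarensis", "Acanthaceae", "citation placeholder 2")
checklist.append_data("Strobilanthes circarensis", "type_traced_in", ["DD", "BSI"])
checklist.synonymize(
    "Strobilanthes circarensis",
    under="Strobilanthes pulneyensis",
    reference="citation placeholder 3",
)

entry = checklist.resolve("Strobilanthes circarensis")
print(entry.accepted_name)            # Strobilanthes pulneyensis
print(sorted(entry.synonyms))         # ['Strobilanthes circarensis']
print(entry.data["type_traced_in"])   # ['DD', 'BSI']
```

Because every name is looked up through an index rather than stored as a fixed key on the data, a later revision changes only the index and the synonym list; data appended in earlier phases stay reachable under both the old and the new name.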

Background

The Indian economy is essentially biomass based, and taxonomists, the inventory keepers of biological resources, are expected to play a major role in the exploitation and sustainable use of these resources in the coming years. A taxonomist stands at the doorstep of biodiversity evaluation: he is the person supposed to announce the existence of new species, to caution about the abundance or rarity of certain species and to announce the localities of species richness before anyone else can think of any further action. After the Convention on Biological Diversity (CBD), a new situation has developed, with recognition of biological diversity and thereby of taxonomy and taxonomists. Heywood [1] dealt with how present-day taxonomists should make themselves relevant to societal needs. Floral diversity generates a variety of stakeholders who in turn need information and counsel for its sustainable utilization. Taxonomic databases are developed to provide the needed information to varied clients professionally and effectively. The information is to be analyzed in the context of its intended use, and the format in which it is presented is to be decided prudently.

There is great activity around the world to build taxonomic databases concerning resources, including virtual herbaria. Though some attempts were made towards electronic databases in India, they have neither fully served the perceived needs of taxonomists nor generated the desired impact on data accessibility for the public at large. Another issue considered important is the image database for the numerous collections existing in various herbaria. The label data on specimens are inadequate for old collections, whereas collections from the 1950s onwards carry reasonable field data from which one can assess where the collections were made. Explorations are now undertaken with GPS, so both latitude and longitude are accurately specified for each specimen. The obvious issue to be addressed in database design is how to make the variable data available on these sheets uniform and, at the same time, how to integrate them for reference by users. More complications may be anticipated for specimens bearing more than one name.

It is implied here that one should understand the specific issues inherent in taxonomic research before developing a consolidated database on floral resources. While developing a database, one is likely to confront numerous dilemmas with reference to names existing in the literature. As the subject is historically connected and has evolved with the passage of time, and as specimens and literature are dispersed in various institutions, it is very difficult for a single person to take decisions correctly. People have been attempting floral databases on various themes and in different formats. There are no prescriptions for designing databases, and they are formatted according to individual choices, ignoring compatibility with others. The authors have developed a strong case for initiating such a database in India, so that taxonomists do not face the usual constraints, such as non-availability of literature and specimens, in developing the much needed inventory of plant resources, and so that the public at large can use the unused and underused data generated by taxonomists.

Issues in Taxonomic Research Relevant to Databases

Revisions and Holdings in Herbaria

Revisions are considered serious research topics, particularly when undertaken for larger genera. They pose problems of various kinds, and the study of each is an experience by itself. Some larger genera exhibit indistinct continuities between certain species. Some species are rare, others are confined to narrow geographical regions, and some have taxonomic and nomenclatural complexities. These hurdles make it impossible for anyone to make his own collections of all species under the taxon within a reasonable period. One is invariably compelled to depend on collections available in different herbaria. The country has many herbaria under different institutions, some with huge collections. In total, there are 11 herbaria under the Botanical Survey of India (BSI) and ca 50 under universities and various institutions. Together they hold ca 4 million specimens [2, 3].

Types are considered highly important in any herbarium’s holdings. The type is the name bearer, the specimen associated with a name by first description and publication. It provides an objective basis and a fixed reference for the use of the name [4, 5]. Most ecologists, environmental biologists and chemists keep voucher specimens of their study. This enables systematists to verify the identification and later researchers to collect the same material again if necessary. In a sense, the type is a super voucher specimen, providing a fixed reference to the named taxon for all time. The type series includes all specimens on which the description of a new species is based. Holotype (a single specimen used by an author, either the only specimen he found or one of the several found, but the only one designated as a type), isotype (a specimen believed to be a duplicate of the holotype), syntype (two or more specimens included in the type series), lectotype (a single specimen selected from the syntypes of a previously described species to serve as the equivalent of the holotype) and paratype (additional specimens selected from the material the author had on hand at the time of writing the description) are some designated types in the type series.

Vouchers and types can be loaned from different herbaria. Personnel in some herbaria outside the purview of BSI are sometimes over-protective. In many there is no provision for lending specimens, and a reviser has to visit and spend a considerable period there. The custodians (not always keepers) of certain others do not know conventional herbarium practices. Some herbaria receive little attention for regular care, are sidelined as storehouses of collections and are looked upon as liabilities. There is no effective mechanism to share these resources, and to date only personal communications have facilitated limited exchanges.

Identity Crisis and Lack of Types/Authenticated Specimens

A number of Indian species have not been collected since the type was collected by Europeans, and some have no representative specimens in Indian herbaria. It is pertinent to stress here that many Indian species require redescription, as the earlier descriptions are based on scanty material. Hemigraphis venosa is a classic example: the identity crisis has led some workers to overlook it and others to include it, unfortunately on the basis of wrong determinations. Literature including Nees [6], Clarke [7] and Gamble [8] did not give the exact locality. The confusion over this taxon is due to the absence of authentic material in any Indian herbarium. It was only with great difficulty and protracted correspondence that the type collections of this species could be borrowed from K and LIV. Images of the specimens are now stored for future reference to facilitate collection. The same is the case with Vernonia shevaroyensis Gamble, which is allied to V. arborea Buch.-Ham. (= V. monosis Benth. ex C.B. Clarke) and was erected on the basis of Perrottet’s collection and delimited by leaf shape, tomentum, ribs of the achene and the absence of glands between them. However, some others feel that V. shevaroyensis is not distinct from V. arborea. The absence of the type in CAL and its elimination from the type locality have placed taxonomists in a difficult situation. Perrottet’s collection, supposed to be in P (Paris museum), is needed to ascertain its identity. We should borrow types of such names from international herbaria and at least retain images with us.

The long-established Kanjarum palghatense Ramam. has now been shown to be only the less known Strobilanthes dupeni Gamble. This is another glaring example of how the lack of a type has led to confusion. Ramamurthy [9] described a unispecific new genus, Kanjarum palghatense, based on material collected from Palghat district, with purported affinity to the unispecific
Carvia Bremek. While revising the genus Strobilanthes Blume for the flora of India, it became necessary to look for a species truly allied to K. palghatense, particularly after Khan et al. [10] gave a detailed description of S. dupeni following its rediscovery in Wynad and Idukki. Strobilanthes dupeni was thought to be a narrow endemic confined to Wynad, Idukki and Palghat in Kerala, and critically endangered. Suggestions were made for intensive searches for its rediscovery. Critical examination of the types of S. dupeni located at BM and of K. palghatense located at MH and CAL, together with copious material of K. palghatense, shows that the former is conspecific with the latter. Though collected by many, but never outside Kerala, it was invariably determined as K. palghatense. The absence of an authenticated specimen of Strobilanthes dupeni in any Indian herbarium, including MH, is again the sole reason for this mix-up.

Scattered and untraceable specimens also prove to be as good as not having them. This was experienced in deciding the identity of Strobilanthes circarensis Gamble. Gamble [11] described S. circarensis on the basis of material collected by him and by Lushington from Visakhapatnam District. He allied it to S. neilgherrensis Bedd., but the authors found it much nearer to S. pulneyensis. Both S. neilgherrensis Bedd. and S. pulneyensis C.B. Clarke are distinct species. There are no sheets under the name S. circarensis in MH to decide its affinity. Correspondence with K to obtain the type of S. circarensis did not succeed because of restrictions on borrowing types. However, the type was traced in the general herbarium of DD and BSI (Gamble 21779) after a considerable lapse of time, and its examination expedited the synonymizing of S. circarensis under S. pulneyensis. Had specimen databases been available with each herbarium, this would not have taken so long, and the percolation of S. circarensis into other floras could have been avoided.

Herbaria and Specified/Diverse Collections

Specimens give a taxonomist better comprehension of a species than anything else. A reviser is often constrained to compromise because of non-availability of specimens, either types or any authentic collections. The Survey maintains old and new collections in its network of regional herbaria at Allahabad (BSA), Calcutta (BSIS), Coimbatore (MH), Dehra Dun (BSD), Gangtok (BSHC), Howrah (CAL), Itanagar (ARUN), Jodhpur (BSJD), Pune (BSI), Port Blair (PBL) and Shillong (ASSAM). Specimen collections account for over 3 million in BSI herbaria alone; the largest share, ca 2 million, is in CAL, followed by ca 0.3 million in MH. Various herbaria are known for specified collections. The Central National Herbarium is a national repository and has over 16,000 type specimens, ca 12,000 Wallichian collections and ca 2,000 of Wight’s collections. These herbaria are important not only for the great number of collections but also for the historic and crucial nature of some of them in taxonomic research. CAL also harbors the collections of Griffith (from Northwest Himalayas, Sikkim, Bihar & Orissa), Thomson (from Northwest Himalayas & Kashmir), Thomson & Hooker (from Sikkim, Meghalaya & Northeast India), Clarke (from Khasi Hills, Sikkim, Nilgiris, Kangra, Chamba & Kashmir), King (Eastern India) and Kurz (Andaman and Nicobar Islands) [12]. Among regional herbaria, MH has collections of Barber, Beddome, Cleghorn, Gamble, Drew, Elliot and Lawson, which are essentially from the erstwhile Madras Presidency [13]; BSI has holdings of Woodrow, Cooke (5000 sheets and 1400 spp.), Ryan and Talbot (10000 sheets from North Canara), essentially from the Bombay Presidency, and of Gammie, who made large collections from the erstwhile Bombay state and also from Assam, the Himalayas and Kashmir [14 and 15]; ASSAM has collections of Mann and the Kanjilals from the Northeastern region [16]; BSD has mostly recent collections covering Himachal Pradesh, Uttaranchal, Jammu & Kashmir, Punjab, Haryana, Chandigarh and Delhi [17]; BSHC too has relatively recent collections from Sikkim and Darjeeling (West Bengal); and PBL has collections of King and Prain from the South Andamans and the middle group of the Nicobars. Collections from the Northeast by Roxburgh, Wallich, Griffith, King, Hooker, Prain, McClelland, Thomson and Kingdon Ward are in CAL. Parkinson’s collections from the Andaman Islands were deposited at CAL and DD.

The collections in other herbaria, linked either to universities or to other institutions, account for 1.2 million. Many of them contain collections from different geographical regions. A few, such as DD and BLAT, are important for harboring many collections of Europeans, which formed the basis of some regional floras. These herbaria also hold very old collections, which signify their association with taxonomic activity during the late 19th and early 20th centuries. DD has collections of Royle (Himalayas), Falconer (Kashmir & Ladakh) and Duthie (Himalayas, Gangotri, West Nepal, Rajasthan, Chanda, Nimar & Nagpur) [18], and BLAT has collections of Hallberg (Kashmir), McCann (Bombay & Baluchistan), Sedgewick (South Canara) and Santapau (Khandala, Kathiawar, Mahabaleshwar, the Dangs near Surat) [19]. RHT has collections essentially from the Tamil Nadu Carnatic [20], while CALI and JCB have holdings exclusively from Kerala and Karnataka respectively.

Endemism is a noteworthy component of the Indian flora. About 5750 species of flowering plants are endemic, and these are distributed in 147 genera and 47 families. Among the endemic species, 3471 are found in the Himalayas, 2015 in peninsular India and 239 in the Andaman & Nicobar Islands [21 and 22]. The Eastern Ghats possess 79 endemic taxa under 58 genera and 25 families [23]. The Western Ghats harbor 57 endemic genera, of which 46 are monotypic [24]. The considerable endemism inherent in the Indian flora concentrates certain collections in certain regional herbaria. Revisers do not have easy access to these resources unless they visit each of these herbaria.

Many herbaria still hold types under general collections. Some are not in a position to loan specimens as they have no funds to pack and send them. Many old collections at BLAT are still not mounted for want of funds. In some herbaria, collections are cramped into small pigeonholes for want of space. It is high time that we did something to protect them. RHT is another herbarium that is well maintained and accords visitors limited access to collections. In some herbaria specimens are deteriorating fast and may in a short time be reduced to junk. In some herbaria, specimens are mounted only with stitches, and neither glue nor Fevicol is applied. This allows easy transfer of specimens from one sheet to another if necessary. It also reduces the proclivity to infection due to glue. This is a desirable practice and should be adopted, if agreeable, in all herbaria. It is time to find ways and means to expose our specimens to critical study by experts on various taxa. There should be better coordination in the free exchange of collections between herbaria, so that both old and new collections are used profitably and subjected to critical scrutiny, which makes the determined names on specimens more authentic. Efforts should be directed towards bringing the custodians of these historical collections together to provide greater and more efficient access to specimens and the associated label data. Can a consortium under BSI be thought of? The problems identified in these herbaria point to the fact that digitization is a must: it can achieve accessibility and ensure permanence of at least the image when the specimen perishes.

Illustrations

We have many illustrations (published and unpublished) that are often consulted by taxonomists. Rheede’s Hortus Malabaricus [25], in 12 volumes, contains 794 illustrations covering 780 species, especially from the Malabar Coast. Roxburgh’s famous icones, bound in 35 volumes, contain 2533 drawings, especially from the Coromandel Coast. Roxburgh’s icones gain significance as they are yet to be published in their entirety, and the specimens on which he based Flora Indica and Plants of the Coromandel Coast are not in any Indian herbarium. One report says there are hardly 30 sheets in CAL, where Roxburgh worked for about a couple of decades. Some of his collections have gone into the Wallichian herbarium. Some of Roxburgh’s icones (ca 300) were redrawn by Wight and included in his icones. Many of Roxburgh’s collections are now dispersed in BM, K, LIVE, MANCH, BRUX and LINN, and are not easily accessible to Indian taxonomists. A few of his illustrations were redrawn, painted and published as fascicles by the Botanical Survey of India. Wight’s [26, 27 and 28] Illustrations of Indian Botany (2 vols., 182 plates, mostly coloured) and Icones Plantarum Indiae Orientalis (6 vols., 2101 plates), concerning plants of South India, particularly from the hills, and Spicilegium neilgherrense (202 plates), concerning plants of the Nilgiris, have great significance for the identity of plants of this region. Around the same time Beddome [29 and 30] produced the Flora Sylvatica for Southern India (2 vols., 330 plates), which includes all the principal timber trees of southern India, and Icones Plantarum Indiae Orientalis in 15 parts, each consisting of 20 plates (300 plates in total), which contains new and rare plants from southern India. Elsewhere in the country, plants of the Himalayas and East Indian plants are well illustrated. Royle’s [31] Illustrations of the Botany of the Himalayan Mountains and the Flora of Cashmere (100 plates); Hooker’s [32, 33 and 34] The Rhododendrons of Sikkim Himalayas (30 plates), Illustrations of Himalayan Plants (24 plates) and A Century of Indian Orchids (101 plates); Wallich’s [35] Plantae Asiaticae Rariores (300 plates); and Griffith’s [36] Icones Plantarum Asiaticarum, in 4 volumes and 664 plates (223 coloured), are some of them. Some [37] of the finest orchid paintings available at CAL were prepared during Falconer’s period. In total, CAL has about 1500 such drawings. Hooker, in his “A Century of Indian Orchids”, selectively published only 101 plates from them. A good number of them were described for the first time and so acquire significance, but unfortunately they are inaccessible and need wider circulation among taxonomists through images. Equally important is the work of King and Pantling [38], who contributed 448 illustrations from the Sikkim Himalayas.

Microfiche and Catalogues

Microfiches are small films containing photographs of specimens or literature in a highly reduced form. They have long been one means of accessing European herbaria. CAL has 25 European herbaria on microfiche, and MH has 3 such herbaria: Linnaean (LINN & S), Willdenow (B) and Wallich (K–W). One should find ways to bring these images to desktops. Catalogues are published or unpublished lists of names of plants grown in a garden or specific locality and serially numbered (Hortus Bengalensis [39]), or lists of specimens with specific numbers collected by individuals (Wallich’s “A numerical list of dried specimens” [40],
Graham’s “Catalogue of plants growing in Bombay and its vicinity” [41] and Wight’s untitled Catalogue). Such collections were usually made in 5 to 6 sets and later distributed to various herbaria for study by botanists. These catalogues served as precursors to the regular floras that were planned, in which these numbers were referred to. The importance of catalogues lies in tracing, in different herbaria, the actual specimens cited by the authors.

Historical Collections, Handwritings and Background

This kind of information is most needed for herbaria that hold old collections. As the documentation and label data on such sheets are very poor, indirect indications such as handwriting furnish adequate clues to the collector, the collection locality and other specimen details. About a hundred European collectors had their specimens deposited in Indian herbaria, and their handwritings need wider exposure. Linnaeus often indicated the geographical area of collections through symbols or abbreviations – central , western Asia and the Orient. The meanings of these symbols are listed separately as Linnaeus’ symbols, and they are quite useful for interpretation in taxonomic research.

Data Volume and Consolidated National Flora

A compilation of all published floras at national, regional, state and district levels in the country (ca 450, by BSI and other organizations) is essential, as these data form the basis for a consolidated National Flora. One should also take into account important unpublished Ph.D. theses concerning district floras, revisionary works and other aspects of taxonomy, which may number ca 100. Compilation of publications on new species and new records from India is crucial so that no species is omitted and the distribution data remain accurate. A consolidated account for the National Flora is essentially drawn from these works after critical and cautious scrutiny. The Survey has developed consolidated accounts for certain groups (ca 86 families) under the Flora of India Program, for which the spadework needed is minimal. But there are a great many families (ca 160) for which information is scattered, and this compilation is more relevant in those cases. In total, we are expected to prepare ca 18,000 data sheets (as the angiosperms total ca 18,000 species) under ca 247 families.

Problems Perceived in Data Preparation

It is often difficult to judge how best to incorporate the morphological variability within a species within the ambit of defined variables. Descriptions involve descriptive morphological features, which should be posted to the assumed heads of the format of the proposed database. Certain descriptions may fall outside the available terminology; in such cases, the features are to be posted to the nearest agreeable terminology (without subverting the sense) that the format offers. In some cases experts have to oversee the data and finalize the features.

Another issue concerns the names. Taxonomy deals with the identification, classification and nomenclature of objects in the biological sciences. It is important to realize the distinctions between them. Identification is the determination of a species as being identical with or similar to another, already known element, or as being unlike any previously known element and therefore new to science. Identification in the strict sense has nothing to do with the correct name of a species. The determination of the category to which a species belongs is classification. The correct name of a species in one classification may be different from the correct name it would receive in another classification. It is not uncommon to find a species bearing different names in different floras or manuals. But when there is agreement on the correctness of the identification and the classification, there will be only one correct scientific name for a species. Nomenclature deals with the determination of the correct name of a species. The procedures and rules involved in determining the correct name, a synonym, or any other name that has legitimate standing are dealt with in nomenclature. The general nomenclatural procedures are familiar to botanists, but some situations are so complex that only the few who have devoted years of study to the subject are held competent to resolve them. In response to new situations these procedures have been modified, revised and amplified. Alterations introduced in these rules are oriented towards greater stability, greater certainty and less confusion. One of the best ways to comprehend the scope and modus operandi of nomenclature is to be thorough with the important provisions of the Code as well as to practise resolving such complex situations. The rules of nomenclature prescribed by ICBN/ICZN are mandatory and binding, as they look into rectifying inadequacies.

In general, because names are considered static, databases are created in a manner that does not allow names to be changed, with little clarity about, and little accommodation for, synonyms. Databases involving scientific names must not be designed like others that involve the names of human beings in a banking sector, a company or an electoral list, or the names of provisions in a shopping mall. Correct names of species are dynamic, changing with revisionary and monographic works, and databases must be flexible enough to accommodate these changes. Accommodating both correct names and synonyms is another problem in developing databases on consolidated floras. Before a database is designed, a good compilation of basic data on the names of all species is a must, and it should comprise fully referenced, accurate and factual information. Once this core information is included, other related data can easily be linked to the different names.

Implications

The process exposes the Survey, its achievements and its resources to the public to a great extent, empowering them to use the literature generated over the years. It works as a precursor to the National Flora. Moreover, envisaging a database at this point adds momentum to the Flora of India program. The database effectively builds a link among taxonomists, other subject specialists and the public at large. At the same time, it removes avoidable technical jargon to make it user friendly. It can be accessed by anybody for reference and consultation, and it broadens the user base of the resources. The database also promotes awareness of the magnitude, distribution and status of plant diversity in the country. The developed database takes users not only to the correct identities of plants but also to the other attributes included in it. It also enables taxonomists to establish identities with greater ease. As duplicates located in various Indian herbaria can be traced with this facility, it minimizes personal visits to various herbaria. It also reduces dependence on holdings in foreign herbaria. As the specimens in various herbaria are exposed to experts nationwide, the specimens are scrutinized and their names expeditiously authenticated.

Image databases have implications for conservation schemes too. Knowing the names of the subjects we deal with gives additional strength in managing plant resources. Honestly, one of the problems faced by foresters is that many of them are not aware of the names of numerous herbs and shrubs that occur in their jurisdiction. This can be addressed by popularizing good photographs of plants with correct names among foresters, who in turn keep track of various species and their populations on their routine visits. A good number of botanists who are also great plant collectors are attached either to colleges or to smaller institutions with no worthwhile facility for taxonomic studies. They can do little to arrive at identities except send specimens to national or international herbaria or experts. Interest can be generated in them if the existing material is made available through an electronic database.

In the total angiosperm flora only 20% have recognized utility. Of these, ca 1600 are medicinal species; edible plants (cereals, pulses, fruits and vegetables) constitute ca 2000 species, and plants of other uses ca 2050. The Survey should strive to develop organized online databases on these species. These cater to the social needs of people at large, pave the way for the popularization of taxonomy in the public mind and also serve the literature needs of EIA studies. The general public, NGOs, parataxonomists, traditional medical practitioners and amateur naturalists who want to know the plants in their neighborhood can use them with greater ease. Above all, a comprehensive floral inventory with related information may turn out to be a major revenue earner for the organizations building it.

Existing Databases

A perusal of the web shows that about 400 institutions all over the world have herbarium or museum facilities attached. These institutions have their own websites (some partly developed) and provide information on history, specimen collections and collectors, images, checklists, catalogues, distribution maps, identification keys, publications, library, online journals, research projects, photo galleries etc. But only ca 125 institutions/herbaria have developed databases connected to the materials they possess or the research outputs they have generated. The kinds of databases that exist in these institutions/herbaria are: Viruses (1), Bacteria (1), Lichens (7), Cryptogams (17), Gymnosperms (1), Vascular plants (6), Shade trees (1), Sea- (1), Type specimen (9), Label data (3), Checklists (7), Phenology (1), Plant names (1), Red data books (3), Botanical gardens (2), Agronomy (1), Palynology (2), Ecology (1), Forest pathology (1), Fossil types (1), and subjects related to fauna in diverse combinations (56). Some of these include images for the entire herbarium, for vouchers or for type specimens. Many of these websites provide links to other institutions and search queries that fetch data from other herbaria. Some are accessible online, may be downloaded or may be obtained on a CD on request. These databases were scrutinized, and a few are reviewed here, as they are relevant to evolving a sound design for the databases proposed on floral resources in India.

International Legume Database & Information Service (ILDIS): ILDIS is a founder member of Species 2000 and aims to document and catalogue the world’s legume species diversity. Species 2000 aims at a dynamic checklist of all known organisms by accessing an array of global species databases. A global species database contains all known species for a particular group. Species 2000 will act as an index to a virtual library of biodiversity information on the www. Research groups at ILDIS regional centers in many countries contribute on a co-operative basis to pool information in the ILDIS World Database of Legumes. Information gathered from local herbaria, national botanists and literature in different languages, which would otherwise remain hidden, is made accessible. ILDIS uses a consistent classification worldwide, edited and updated by a network of experts known as Taxonomic Coordinators. The main menu contains 7 modules, Tribes, Scientific
Names, References, Uses, Descriptors, Distribution, Common Names and Options. The tribe link displays the 40 tribes recognized under ILDIS. Two options, accepted names and synonyms, are given for both genera and species. The References link displays a uniquely generated reference number, author, year, title of the book/journal and the relevant data that has been taken from the book/journal. Uses displays various categories. Descriptors links to a page that displays Habit and Lifespan. References, uses and descriptors cannot be linked backwards to scientific names or synonyms or any other data given in the database; it seems this linking is to be done in the future by ILDIS. Distribution is given continent-, country- and state-wise. A user can generate a species list with reference to a continent, a country and a state. Common names are drawn from different books/journals and are linked to the related scientific names. Scientific names generated in this way for both distribution and common names in turn give the entire data of any selected species. Querying or search options are not present in this database and users simply browse information from the CD. The data sheet of an individual species consists of Name and Authority, Status, Genus, Tribe and Reference, Other names, Descriptors, Geographical Data and Uses. Status tells whether the name is accepted, a synonym or provisional. References give the source code of the literature and these codes link to the related data. Other names provide the synonym data. Common names, uses, geographical distribution and notes are also provided with source codes.

A framework for creating further species diversity data sets under ILDIS Phase 2 extends to more applied disciplines, such as germplasm sources and plant breeding data. As a whole these will constitute a full species diversity information system for the world’s legumes. The Phytochemical Dictionary of the Leguminosae, published in 1994, represents the first of the ILDIS Phase 2 projects. At its core are the ‘phytochemical records’, the known substances recorded from legume plants. The Root Nodulation program has two projects, one that records the presence or absence of root nodules and another that compiles data not only on nodulation but also on various nodule characters, such as structure, morphology, mode of infection and, in some cases, type(s) of rhizobia.

Linnaean Herbarium (S-Linn): S-Linn on its website (http://linnaeus.nrm.se/botany/fbo/welcome.html.en) presented 3658 digitized herbarium sheets. The homepage has links for Linnaean types, Swedish species of Linnaeus and handwritings on Linnaeus sheets. Apart from information on the Linnaean Herbarium and about Linnaeus, it provides an alphabetic index search that displays all sheets of S-LINN, and a text box search based on species name or microfiche number. The Linnaean types link leads to the type collections present at S-LINN and displays images of the herbarium sheet (front and reverse sides, inflorescence, flowers, text on the sheets, species name, designated type, microfiche number, reference, type herbarium number). The Swedish species link displays the alphabetic index, which lists names of Linnaeus’ species. One can also identify handwritings on the Linnaean sheets by clicking the handwritings link. It displays the names of the persons associated and samples of their writings. In total they are the handwritings of 23 collectors. Individual names hold a short biography and more illustrations of their writings and references. On the reverse side of some sheets certain details are given, sometimes in a language other than English. These are translated into English and are provided with a link ‘read a story about a Linnaean plant’. These details usually provide information on the circumstances in which the plant was collected and any specific/unique and less known details with reference to the collection.

Sampada: It assists various institutions in developing databases on their own in a decentralized way, thereby managing their biological collections in a more orderly manner. Individual institutions can produce outputs related to taxonomy as well as images. Allocation of a unique barcode to each collection sample is a distinctive feature of this database. Central repository administrators of SAMPADA integrate information on distributed collections received from various institutions. Certain fields in the database appear redundant, but as it combines natural musea/herbaria collections of all kinds, these may be relevant on one occasion or another. Museum, Collector/identifier information, Specimen, Data Source, Search, Help and Theme are the menu items displayed in the menu bar. Museum information comprises information on the institution, and only one entry can be saved as museum information because individual institutions should use this package separately. Institutions will be allocated individual numbers by the central repository administrators of SAMPADA only at the time of integrating the distributed information provided by these institutions. Collector/Identifier information contains data on the individuals/groups of persons who are involved in the collection and identification of specimens. The list of collectors/identifiers connected to the museum/herbarium is to be compiled prior to its inclusion in SAMPADA. The Specimen module has sub-menus, namely Repository, Field Information, Taxonomy, Images and Notes, that collect the actual repository and other related information. The repository information has details about the specimen/collection (scientific name, taxonomic hierarchy, collector’s name, year of collection, collection number, accession number and location of the specimen)
and it is mandatory to fill in the fields of the repository form, as without repository data all other forms would remain disabled. This form generates a unique barcode for every single specimen/collection, taking into account the museum/herbarium acronym and the accession number of the sample. Barcode(s) can be printed individually and pasted on the corresponding specimen. The Save option appends the record to the database. The barcode must be generated before the record is saved. The barcode for each specimen is generated as a combination of geographic location (Country, State and District), acronym of the museum/collection and accession number of the specimen. Date of accession and poisoning are provided for museum managers. The Type field indicates whether the specimen is preserved as a holo-, iso-, para-, syn-, lecto-, neo- or topotype. Field information collates field data from where the sample has been collected. Local name and Local cause take details of the local name and the reason for the name. Sex, Habit and Habitat are other fields. Barcode, scientific name and locality appear automatically as entered in the repository form. Latitude/Longitude, Altitude, Temperature, Rainfall, Field Notes and Ecological Notes are the other fields to be filled in. Under Field Notes, notes describing the geographic, geological, climatic, atmospheric, pollution and meteorological conditions on the day of sampling are to be entered. Much of the data in field information is difficult to fill in, as old sheets provide very scant information and the fields go without any entry. Taxonomy includes information about systematics, synonyms and common names through 3 separate forms. Images are provided to digitize the repository and its illustrations. These can be in the form of images, video and audio. Five digitized images can be stored for a single repository specimen. What SAMPADA has envisaged is of great significance in the sense that it has given various institutions the freedom to develop databases on their own. But not enough maneuverability is given for changing the format depending on situational needs. Central administrators of SAMPADA are supposed to integrate the information developed by various institutions to develop a vast database [41]. But this database may have problems, as it has embarked on collecting diverse information with little foresight of the intricacies.

Sasyabharathi: Sasyabharathi aims at consolidation of published floras from Southern India and opens through a title page “Distribution, Taxonomy & Diversity of Plants of South India”, followed by a second screen with 6 modules, ‘Political Boundary’, ‘Gridwise Search’, ‘Biotic Zone Search’, ‘Thematic Maps’, ‘Images’ and ‘Textform Search’, which facilitate data access of the user’s choice. The ‘textform search’ enables one to look for species through common names and the conventional taxonomic hierarchy. An exhaustive list of 800 languages/regions is used to search for desired species based on common names. This is a favored facility, and even a commoner can reach the species of his interest with little difficulty and through multiple options. The images account for 2676 species and some are comparable to live specimens, allowing effortless identification. Some genera have poor representation and poor quality images. The compilers have used data from ca 200 regional/sub-regional floras and from the Flora of British India. The concept of synonymy is treated very differently, and the compilers chose to adopt highly unconventional categories such as correct, synonym, probably correct, probably synonym and undecided; the last three do not exist in taxonomic literature. Misapplication of names based on wrong identities mentioned in floras also needs some verification before inclusion. All these aspects have been discussed in greater detail earlier [43].

Database of Herbaceous Plants of Baroda and Environs: Digital databases of fresh specimens have proven of greater use to the public than conventional herbaria, as they give clearer images including flower and fruit, which enable even a commoner to identify the plants of his interest. Ms. Mona Dave prepared a digitized flora of Baroda and its environs. This is built on HTML (Hyper Text Markup Language), where text or images are linked to different pages. It provides species search based on family and scientific name. Family search links to a page where all the species present under it are listed. Each species is linked to a page that displays the species name with author citation and its classification. Each page contains 3 links, namely Taxonomic description, Ecology and Medicinal use. Names are arranged alphabetically in both family and genera search. The taxonomic details include information on habit, habitat and detailed morphology. Photographs of plant habit, flower and fruit are in thumbnail form and a double click magnifies them. The database has details of 465 herbaceous species belonging to 74 angiosperm families, with high-resolution digital photographs. The ecological details cover the current status of the plant in the study area, distribution, major associates (wherever observed), phenological details and medicinal uses. Medicinal uses mention the diseases for which the plants are used. This CD is particularly useful to nearby colleges for popularizing the botany of the area among students and college teachers. Hyperlinking is easy, as it involves no programming code or separate database management system, but linking individual words or images to their respective pages or files is a tedious process. Updating, deletion or modification of data is cumbersome when databases are developed on HTML.

Australian Virtual Herbarium (AVH) provides immediate access to the wealth of data associated with scientific plant specimens in each Australian herbarium

(www.anbg.gov.au/avh/) developed under the sponsorship of the Council of Heads of Australian Herbaria (CHAH) representing the major Australian collections amassed at CANB, NSW, DNA, NT, BRI, AD, HO, MEL and PERTH. Australian herbaria house over six million specimens that date from the earliest days of European exploration. The AVH has its own priorities for taxa and targets to digitize species from these herbaria based on the priority of taxa, instead of embracing and digitizing all the specimens in these herbaria. Under prioritization, first priority taxa contain data of 6 families of maximum reliability (Australian taxa); second priority taxa contain data of other highly important families/groups with maximum reliability (Australian taxa); third priority taxa contain taxonomic data on the remainder of native Australian plants and lower groups, which also include external territories and New Guinea; and finally the fourth priority taxa contain geographic data of variable standards, with significant validation effort being directed only at taxa of research and weed-potential interest. Accurate descriptions of over 60% of Australia’s species have been compiled in the last 20 years and over 70% of the specimens housed in Australian herbaria have been digitized, providing a comprehensive resource for accurate depiction of the geographic distribution and occurrence of plant species, for historical mapping of all plant, algal and fungal species, for understanding the threatening processes of vegetation clearance and weed invasion, and for revising classifications of all plant groups for accurate portrayal of the biodiversity existing in Australia. “HISPID (Herbarium Information Standards and Protocols for Interchange of Data)”, a set of specimen data interchange standards developed by Australian herbaria, has now been adopted internationally. The AVH can be accessed through the website of any participating herbarium through a gateway provided at each of these herbaria. This gateway links to the databases of all the other herbaria, consolidating the combined data into a nation-wide view of the botanical information. The vision of AVH is long-term, committed to the sourcing of data and making the information widely available.

Specimen Image Manager: The Bangalore based Ashoka Trust for Research in Ecology and the Environment and the University of Agricultural Sciences, GKVK developed a specimen image database of various herbaria under Specimen Image Manager Version 1.0. This database has a folder called ‘images’ containing 5 folders, namely CES, FRLHT, HARVARD, IFGTB & KEW, that represent 5 different herbaria. Each of these folders in turn holds subfolders, which represent species names. CES includes 300 species, FRLHT 300, HARVARD 173, IFGTB 300 and KEW 421. They include images of various types. The database facilitates viewing, adding and uploading specimen images, species and label details. Images are categorized as whole image, leaf image, flower image, fruit image and miscellaneous image. Every image is linked to its respective herbarium label. More than one image under a category can be viewed through a navigator button. An enlarged view can be obtained by a double click on the image. The image resolution of the inbuilt photos is low, and they lack clarity when enlarged. The ‘species details’ display the hierarchy of the species while the ‘label details’ display the details of the herbarium label. ‘Comments’ allows saving any explanation for a species. Species details, label details and comments are editable, so that the details of these options in the database can be changed easily. One can select a species name to view images from various herbaria. Alternatively, herbarium/herbaria names can be selected to view specimen images pertaining to these institutions. Once a species is selected, a button called ‘Type Specimen’ takes the user to images of vouchers/types. This is an incomplete image database of selected specimens from a few herbaria.

SABONET (Southern African Botanical Diversity Network): The Southern African Plant Red Data Lists database contains more than 7,000 records on 6,700 taxa, providing information essentially drawn from the Southern African Plant Red Data Lists book along with other data, such as extent of occurrence, population size, past decline and future decline. As the Red Data Lists book contained only a limited number of assessments for South African plants, the assessments made by Craig Hilton-Taylor were included for completeness. The Area of occupancy (AOO), Broad IUCN category, Common Name, Conservation Status, Current Decline, Distribution and Endemism are given. ‘Fragmented’ specifies whether the taxon is naturally disjunct or fragmented by urbanisation. Full IUCN Status and Future Decline, as a percentage predicted for the next 10 years, are also given. The SABONET database opens with a search dialogue box that contains four different search options, namely ‘General Search’, ‘In a Single Country’, ‘Multiple Countries’ and ‘Advanced Search’. Searched data can be viewed, printed or exported to Microsoft Excel or Microsoft Word. Page set-up and Print preview are options available in the file menu before printing.

In the overall review of various databases, there are many positive features in ILDIS, SAMPADA and Sasya Bharathi with respect to either concept or design. A thematic proposal is now built on the suitable and desirable aspects of the above databases.

A Thematic Proposal

In view of the diverse nature of materials and the varied clientele, one should be open minded in the database design process, and the following issues should be addressed.

● Which components are intended for inclusion in the database and how should they be formatted? In what way can repetition and stray exercises be minimized?
● Whether a comprehensive database can be schemed that includes both materials and literature under one format, or whether materials of taxonomic research should be separated from literature?
● Whether the database should be built through a graded and gradual approach, initially attempting taxonomy, followed by appending additional sets for value addition as in ILDIS?
● Whether this can be attempted in a much smaller way, group wise (species reference information first and related datasets later) and phase wise, as undertaken in ILDIS?
● Whether we develop databases based on various themes and piecemeal, targeting users’ needs, and also based on different phytogeographic regions?

Our experience suggests that the format for databases on plant materials gets complicated as we go in for comprehensiveness and include various components. The best option is to separate related components and build simple databases, finally integrating them into a master database. Specimens are important materials in taxonomic research, and what a taxonomist needs today is a database management system to digitize the herbarium sheets, to create a labeled database for all the specimens and to develop a unique, easy to use and effective system that could be used by different herbaria for scrutiny of specimens. One should consider standardized resolution and appropriate enlargement for specimens of various sizes. Scanners and digital cameras with high resolution should be used to ensure the quality of the images. Prioritization is a must while digitizing images, as the holdings of various herbaria run into millions. It is also desired that we develop a handwriting database of all European collectors, as recognition of handwritings is central to inferences about the specimens. A type specimen database can be maintained separately, classifying the specimens based on recognized type series. As images include label data, what is available on the label goes with the image. There need not be any separate fields for collector information while developing an image database. Taxonomists, being the general users of specimen images, can understand what exactly a specimen stands for even when it bears more than one name. The accession number along with the acronym of the herbarium can give the needed identity of a specimen. “HISPID (Herbarium Information Standards and Protocols for Interchange of Data)”, a set of specimen data interchange standards developed by Australian herbaria, has to be adopted as far as possible.

As far as the literature is concerned, the work is to be carefully phased. A dynamic checklist for the floral resources of the country should be the prime aim, and the entire activity of building such a database should be centered on BSI. The process of collating information should be initiated simultaneously in different herbaria using the common application developed for data entry and data storage.

Usually the data on systematics, phenology, herbarium data, citation description, illustrations, photographs of live specimens and a sample herbarium sheet, line diagrams, local names, flowering/fruiting season, distributional data, status and known ethnobotanical uses of each species should be the basis for building up records in developing a table. Basic taxonomic data such as botanical names, synonyms and classification are to be carefully compiled, as they form the basis for the integrity of the entire gamut of information. Adopting correct names, deciding synonyms and compiling reports on new species in various families are the true jobs in building consolidated floras. Where there is a dilemma, there should be a facility in the database design to maintain provisional names. Organizing related components to build records under different tables is desirable. At least one component of a table must identify a record uniquely, which aids in relating the data of one table to another and maintains integrity without any confusion. Different tables can be linked without much labour only in a Relational Database Management System (RDBMS). Families that are already revised for the Flora of India should be taken up in the first instance, and we can progress in building the database for other families as revisionary studies get completed under the Flora of India program. Distribution maps can be prepared at one place, carefully accounting for synonymy, misapplication of names and also new reports, to maintain uniformity. If necessary, endemic plants are to be mapped at a much lower scale. Status is again debatable for many species. The information can best be drawn from the experience of recent field works/collections/related publications. Red data for the country and for various regions are to be developed separately, specifying localities that need conservation. Additional data sets that have implications in application can be added in a well-phased program. Species diversity information such as palynology and cytology, and data from more applied disciplines, such as germplasm resources, phytochemical data and known uses, may be appended gradually. Some efficiency in data entry can be brought in, such as avoiding repetitive input of the details common to specimens of a particular taxonomic rank. The common application to build databases must be developed using an advanced programming language, which should also ensure its stability with advancement in technology. At a later stage, centralizing the data by networking should be done, improving the communicability between the regional offices and

[Figure: flow diagram titled “A consortium for herbaria”. BSI herbaria and non-BSI herbaria (acronyms shown include ARUN, ASSAM, BSA, BSD, BLAT, CALI, CDRI, BSHC, BSI, DD, FRLHT, CAL, BSIS, BSJO, JCB, TBGT, LWG, DECCAN, MH, PBL, PCM, RHT and others) contribute to two database types: floras (based on regional/state/district herbaria and publications) and images (specimens, illustrations, fresh specimens, handwritings, etc.). Consolidated floras are built in Phase I and Phase II (family-wise species reference information, then appending additional sets for value addition). Other labelled elements: headquarters (CAL), data collection server, R & D wing, central data server, offline users (obtain data CD), online users (via Internet) and feedback.]

Fig. 1

various other taxonomic institutions. This is possible as the application used for data collection by these regional offices will be of the same type. The data may be directed to a data collection server placed at the headquarters of the Survey. Teams of R & D personnel scrutinize the data concerning various components of materials and literature and export it to the central data server to maintain the quality of the database. They should adopt a consistent classification system edited and updated by experts known as taxonomic coordinators for various groups. It should be ensured that appropriate information is supplied to global users by verifying data with scientists and researchers before storing it in the central server. This central data server in turn not only shares the information with all the regional offices but can also be linked to the world wide web (www), through which it will be accessible to users online. The central data can also be transferred to various other server nodes if needed, preferably at regional offices, to prevent network congestion when the number of users increases. The database should possess the facility of simple query search for laymen and compound query search for scientists. This gives scientists a common and controlled approach for adding new data and for modifying and retrieving existing data within the database. To keep the database up to date, a separate feedback link should be provided to obtain user comments/suggestions/opinions, so that any future modification of data, especially of validly published names, can be done easily. Data can be modified periodically depending on the research outputs that come in the course of time. The organizations involved can generate user-specified reports as and when required from this master database (Fig. 1). These features are imperative as systematics as a subject is not only huge but dynamic. Such a concerted effort in linking our resources and literature will definitely yield the anticipated results and improve our understanding of plant resources, inventory and conservation.

References

1. Heywood V Taxon 50 (2001) 361–380
2. Nair NC, VJ Nair and P Daniel Botanical History; in Flora of India Introductory Volume Part I, PK Hajra and BD Sharma (Eds), Botanical Survey of India, Kolkata (1996) pp. 53–196
3. Singh NP and HJ Chowdhery Botanical Survey of India, Shiva Offset Press, Dehra Dun (2002)
4. Jeffrey C Biological Nomenclature 3rd ed., Edward Arnold, London (1983)
5. Winston JE Describing species – Practical taxonomic procedure for biologists, Bishen Singh Mahendra Pal Singh, Dehra Dun (1999, reprinted 2002)
6. Nees von Esenbeck CGE: In Prodromus systematis naturalis regni vegetabilis 11 AP de Candolle (Ed), Victoris Masson, Paris (1847) pp 46–519

7. Clarke CB Acanthaceae: In Flora of the British India JD Hooker (Ed) L. Reeve & Co., London (1885) pp 387–560
8. Gamble JS Flora of the Presidency of Madras 11 Parts (Parts 1-7 by JS Gamble & 8-11 by CEC Fischer (Eds)) L. Reeve & Co. London (1915)
9. Ramamurthy K Bull Bot Surv India 13(1 & 2) (1971) 153-155
10. Khan AES, ES Santhosh Kumar, S Binu and P Pushpangadan Ann For 4 (1996) 200-202
11. Gamble JS Bull Misc. Inform. (1923) 373
12. Panigrahi G Bull Bot Surv India 19 (1977) 212–224
13. Henry AN Bull Bot Surv India 19 (1977) 225–227
14. Puri GS Bull Bot Surv India 1 (1959) 74–77
15. Singh NP Bull Bot Surv India 19 (1977) 228–235
16. Rao RS and G Panigrahi Bull Bot Surv India 1 (1959) 62–69
17. Rau MA Bull Bot Surv India 1 (1959) 59–61
18. Vaid KM Bull Bot Surv India 19 (1977) 236–240
19. Sukthankar R, J Pathak and PV Bole Bull Bot Surv India 19 (1977) 241-243
20. Mathew KM Bull Bot Surv India 19 (1977) 276–278
21. Nayar MP Bull Bot Surv India 19 (1950) 145–154
22. Nayar MP Hotspots of Endemic plants of India, Nepal and Bhutan, TBGRI, Trivandrum (1997)
23. Rao RS Vegetation and valuable plant resources of the Eastern Ghats with specific reference to Andhra Pradesh and their conservation; In Proc Nat Seminar on Conservation of Eastern Ghats, EPTRI, Hyderabad (1998) 59–86
24. Nair NC and P Daniel Proc. Indian Acad Sci (Anim Sci/ Plant Sci) Suppl (1986) 127–163
25. van Rheede HA Hortus Malabaricus 12 vols, Amsterdam (1678–1693)
26. Wight R Icones Plantarum Indiae Orientalis, JB Pharoah, Madras (1838–1853)
27. Wight R Spicilegium neilgherrense or a selection of Neilgherry Plants drawn and coloured from nature with brief descriptions of each, JB Pharoah, Madras (1846–1851)
28. Wight R Illustrations of Indian Botany vol 2, JB Pharoah, Madras (1850)
29. Beddome RH Icones Plantarum Indiae Orientalis vols 1–3, Gantz Brothers, Madras (1868–1874)
30. Beddome RH The flora sylvatica for southern India vols 1 & 2, Gantz Brothers, Madras (1869-1874)
31. Royle JF Illustrations of the Botany and other Branches of the Natural History of the Himalayan Mountains and of the Flora of Kashmir, Wm H Alland and Co., London (1833–1840)
32. Hooker JD The Rhododendrons of Sikkim Himalayas, L Reeve & Co., London (1849–1851)
33. Hooker JD Illustrations of Himalayan Plants, L. Reeve & Co., London (1855)
34. Hooker JD Annals Roy Bot Gard Calcutta 5 (1895) 1–168 plates 1–101
35. Wallich N Plantae Asiaticae Rariores or Descriptions and figures of a select number of Unpublished Indian Plants 3 vols, Treuttel and Wurtz, London (1830–1832)
36. Griffith R Icones Plantarum Asiaticarum vols 1–4, Govt. Bengal, Calcutta (1847–1854)
37. Sanjappa M, C Satish Kumar and SD Biju Indian Brushes with Orchids, St. Joseph’s Press, Thiruvananthapuram (2002)
38. King G and R Pantling Annals Roy Bot Gard 8 (1898) 1–342, t. 1-448
39. Roxburgh W Hortus Bengalensis, Mission Press, Serampore (1814)
40. Wallich N A numerical list of dried specimens of plants in the East India Companies Museum Catalogue, London (1828–1849)
41. Graham J Catalogue of plants growing in Bombay and its vicinity, Govt. Press, Bombay (1859)
42. Chavan V and S Krishnan Curr Sci, 84 (1) (2003) 34-42
43. Sanjappa M, P Venu and D Albertson Curr Sci, 88(5) (2005) 825–826
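
Illustrative sketch. The table design recommended in the thematic proposal (records organized under related tables, one component identifying each record uniquely, a facility for provisional names, and a specimen identified by herbarium acronym plus accession number) may be easier to picture with a small example. The sketch below is only one possible reading of that recommendation, written with Python's standard `sqlite3` module. The table and column names, the placeholder species names and the accession numbers are all hypothetical and are not taken from ILDIS, SAMPADA or any other database reviewed here; only the acronyms CAL and MH come from Fig. 1.

```python
import sqlite3

# Hypothetical two-table layout: names in one table, specimens in another,
# linked through a key that identifies each name record uniquely.
SCHEMA = """
CREATE TABLE name (
    name_id     INTEGER PRIMARY KEY,
    scientific  TEXT NOT NULL UNIQUE,
    status      TEXT NOT NULL
                CHECK (status IN ('accepted', 'synonym', 'provisional')),
    accepted_id INTEGER REFERENCES name(name_id)  -- filled in only for synonyms
);
CREATE TABLE specimen (
    herbarium   TEXT NOT NULL,   -- herbarium acronym
    accession   TEXT NOT NULL,   -- accession number within that herbarium
    name_id     INTEGER NOT NULL REFERENCES name(name_id),
    PRIMARY KEY (herbarium, accession)
);
"""


def build_demo_db():
    """Create an empty in-memory database with the hypothetical schema."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")
    con.executescript(SCHEMA)
    return con


def load_demo_rows(con):
    """Insert invented placeholder records (not real taxa or accessions)."""
    con.executemany("INSERT INTO name VALUES (?, ?, ?, ?)", [
        (1, "Exemplum primum", "accepted", None),
        (2, "Exemplum alterum", "synonym", 1),
        (3, "Exemplum dubium", "provisional", None),
    ])
    con.executemany("INSERT INTO specimen VALUES (?, ?, ?)", [
        ("CAL", "0001", 1),
        ("MH", "0002", 2),
        ("CAL", "0003", 3),
    ])
    con.commit()


def specimens_for(con, scientific):
    """List (herbarium, accession) pairs filed under a name or its synonyms."""
    return con.execute(
        """
        SELECT s.herbarium, s.accession
        FROM specimen AS s
        JOIN name AS n ON n.name_id = s.name_id
        JOIN name AS a ON a.name_id = COALESCE(n.accepted_id, n.name_id)
        WHERE a.scientific = ?
        ORDER BY s.herbarium, s.accession
        """,
        (scientific,),
    ).fetchall()


if __name__ == "__main__":
    con = build_demo_db()
    load_demo_rows(con)
    print(specimens_for(con, "Exemplum primum"))
```

In this sketch a sheet filed under a synonym is still found when searching by the accepted name, and a provisional name can hold specimens until a taxonomic decision is made. A "simple query" for the layman could wrap a single call such as `specimens_for`, whereas a "compound query" for the scientist would add further conditions to the `WHERE` clause.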