Use of Dublin Core and XML for the organization of agricultural information in the web
Marcia Izabel Fugisawa Souza1; Adriana Delfino dos Santos2; Roberto Hiroshi Higa3; Laurimar Gonçalves Vendrusculo4
Abstract
This article presents how Dublin Core and XML are used for organizing information in the Beef Cattle Agency project, which organizes and publishes on the internet information about the beef cattle supply chain. The contents resulting from the project are electronic resources (web publications) in varied formats and types: texts, images, sound, software, data, interactive resources, events, etc. The project is developing a tool for electronic resource metadata whose functionality covers insertion, alteration, exclusion, and search. The article also addresses the need for developing appropriate tools for the organization of information. XML (eXtensible Markup Language) is used for structuring information content, and the Dublin Core standard for the description of electronic resources. The result presented is the tool for metadata generation, which uses XML for structuring and storing electronic resource metadata.
Keywords: Dublin Core, XML, eXtensible Markup Language, Metadata, Beef cattle supply chain, Agricultural information, Information technology, Cataloging.
Introduction
The 1990s saw the emergence of the World Wide Web, the event that has contributed most to the popularization and use of the Internet. Despite the huge growth already observed, the volume of information published on the Internet only tends to increase. This disordered growth imposes serious difficulties on locating and retrieving the desired information, although it also fuels the development of appropriate tools for organizing it. Information technologies dedicated to editing and publishing in digital format, as well as to converting paper documents to electronic media, are spreading and developing ever more intensely (Marcondes & Gomes, 2000).
Web pages are the most widespread medium of Web publication, and they are written in HTML (Hypertext Markup Language). Most Web publications are formatted with HTML; however, because its tags are fixed, its function is limited to controlling aspects of document appearance, such as typeface, style, color, margins, font and page size, tables, etc. Important aspects related to representing the content of Web information are not handled satisfactorily by HTML. Semantic markup, for instance, is the resource that gives computers the capacity to identify the meaning of each descriptive element. It is this type of markup that allows information content to be structured so that it can be interpreted by machines and thereby directly aid humans in the task of information retrieval. The tool that offers that facility is XML (eXtensible Markup Language), an extensible markup language that makes it easier to develop and publish on the Web.
1 Specialized Technician, Embrapa Information Technology, Campinas, SP, PO Box 6041 - 13083-970 - Brazil - [email protected].
2 Researcher, Embrapa Information Technology - [email protected]
3 Researcher, Embrapa Information Technology - @cnptia.embrapa.br
4 Researcher, Embrapa Information Technology - [email protected]
WORLD CONGRESS ON COMPUTERS IN AGRICULTURE AND NATURAL RESOURCES
The opportunity of limitless access to information distributed globally through the Web requires the use of metadata for the standardized description of electronic resources, with a view to effective retrieval (Miller, 1998). Metadata require common conventions on semantics, syntax, and structure.
XML, developed under the auspices of the World Wide Web Consortium, offers an infrastructure that makes it possible to encode, exchange, and reprocess structured metadata. That infrastructure enables interoperability through mechanisms that support semantics, syntax, and structure.
The objective of this work is to develop an infrastructure for generating metadata for the electronic resources in the information repository of the Agency. The repository will consist of electronic resources (web publications) related to the beef cattle supply chain, in varied formats and types, such as: texts (home pages, periodicals, monographs, proceedings, etc.); images; sound; data (databases, statistical data, etc.); software; interactive resources; events; and others.
The metadata standard chosen to describe the electronic resources is Dublin Core. XML was chosen for structuring and storing the metadata.
Markup languages: SGML, HTML and XML
From 1993 onward, HTML became widely disseminated and adopted as the standard for producing hypertext pages on the Web. However, HTML consists of a limited set of tags and does not offer the flexibility demanded today by Web applications. Because of this, researchers began to study another markup language that could offer more resources than HTML while being more usable, for humans and machines, than the complex SGML (Khare & Rifkin, 1997).
In 1996, a World Wide Web Consortium team started to work on a markup language that would simplify SGML (Standard Generalized Markup Language) while maintaining its extensibility, structure, and validation features. The new language should also offer means not available in HTML, such as the creation of custom tags with their own semantics, in order to express the content of the information and not just its appearance. That effort resulted in XML, a markup language specifically designed to transmit structured data among Web applications (Khare & Rifkin, 1997).
XML is directed at structure (description of the content), while HTML is directed at presentation (description of the format) of the information. In XML, aspects related to the presentation of the information are handled by style sheets, which indicate how to generate formatted renditions in the chosen presentation format.
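The separation between structure and presentation can be illustrated with a minimal Python sketch. The element names (`resource`, `title`, `creator`), the sample values, and the `render_html` function are hypothetical and chosen only for illustration; the function merely plays the role that a style sheet would play and is not the mechanism used in the project.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML record: the tags say what the content means,
# not how it should look.
RECORD = """<resource>
  <title>Pasture recovery report</title>
  <creator>Example Author</creator>
</resource>"""


def render_html(xml_text: str) -> str:
    """Produce one possible HTML presentation of the record."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    creator = root.findtext("creator", default="")
    # Presentation decisions (heading, italics) live here, outside the data.
    return f"<h1>{escape(title)}</h1><p><i>{escape(creator)}</i></p>"


html_view = render_html(RECORD)
print(html_view)
```

A different rendering function could present the same record in another format without any change to the XML itself.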
Effective information management presupposes a system that allows information to be structured logically, so as to allow its retrieval, exchange, and integration. This emphasizes the need to standardize the description of electronic resources in order to improve the effectiveness of search and retrieval mechanisms.
Electronic resources cataloging
Cataloging of electronic resources is a theme that has been widely discussed and advocated, mainly by producers of information content for the Internet. The number of web pages is estimated to be approaching 3 billion already (Online..., 2000), growing at a rate of 7 million pages per day. The vast majority of those billions of pages lack any standard of organization and description, which contributes directly to ineffective retrieval and to the consequent dissatisfaction and frustration of Internet users.
To catalog electronic resources means to describe them in accordance with standards, which adds value to the information. Cataloging is a way of organizing information, and the better organized it is, the more easily accessible it becomes. The most efficient method of giving access to those resources is the creation of catalogs and databases for their online retrieval, whose records can be built through the use of cataloging techniques and procedures (Mey, 1995).
Institutions that produce information in electronic format for the Internet are increasingly concerned with how to prepare their information resources and make them available in online catalogs, so that the resources are visible and can be accessed satisfactorily.
Metadata
Metadata can be defined as: data about data; information about information; a structured description of the essential properties of information. They make it possible to represent information, create a standardized structure for describing it, add value to it, and consequently facilitate retrieval of and access to the desired information (Gill, 2000; Gilliland-Swetland, 2000). Metadata describe the attributes and content of an original document and, if used effectively, make it possible to access the necessary information (Milstead & Feldman, 1999).
Qualifying information through metadata is a necessity, and it aims to create a standardized structure for describing electronic documents. Information in electronic media needs appropriate methods of description, because it has elements and specific features that are not covered by the traditional methods of treatment and description.
Dublin Core is an international standard for the description of electronic information resources, an initiative led by the Online Computer Library Center (OCLC). Dublin Core consists of a set of 15 (fifteen) metadata elements: Title, Creator, Subject, Description, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language, Relation, Coverage, and Rights. Its principal characteristics are: 1) simplicity in the description of resources; 2) semantic interoperability: it promotes a common understanding of the descriptors and helps to unify content description standards, increasing the possibility of semantic interoperability among disciplines; 3) international consensus: it is a description standard with international recognition and acceptance regarding the coverage and scope of the resources; 4) extensibility: it allows other metadata to be added and constitutes an alternative to more elaborate description models, which are slow and expensive.
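As a small illustration of working with the fifteen-element set, the following Python sketch lists the elements under the lowercase names used in version 1.1 of the element set and checks a record against them. The `unknown_elements` helper and the sample record are hypothetical and are not part of the tool described in this paper.

```python
# The fifteen elements of the Dublin Core Metadata Element Set, version 1.1.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def unknown_elements(record: dict) -> list:
    """Return the keys of `record` that are not Dublin Core elements."""
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Hypothetical record; values are kept in lists because an element
# may occur more than once in a description.
record = {
    "title": ["Pasture recovery report"],
    "language": ["pt"],
    "colour": ["green"],  # not a Dublin Core element
}
print(unknown_elements(record))
```

An application that extends the element set, as the standard permits, would add its own names to the tuple rather than reject them.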
Tool in XML to store metadata Dublin Core
The Embrapa Beef Cattle Agency website project includes a tool for cataloging the metadata of electronic resources, presented in this section, whose functionality comprises the creation, alteration, exclusion, and searching of metadata stored in XML.
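The four operations named above can be sketched in Python over an in-memory XML tree. This is only an assumption-laden illustration: the `MetadataCatalog` class, its method names, and the `catalog`/`resource` element names are hypothetical, and the paper does not describe how the actual tool implements these operations.

```python
import xml.etree.ElementTree as ET


class MetadataCatalog:
    """In-memory XML catalog of resource descriptions (illustrative only)."""

    def __init__(self) -> None:
        self.root = ET.Element("catalog")

    def insert(self, resource_id: str, fields: dict) -> None:
        """Create a <resource> record with one child element per field."""
        record = ET.SubElement(self.root, "resource", {"id": resource_id})
        for name, value in fields.items():
            ET.SubElement(record, name).text = value

    def _find(self, resource_id: str):
        for record in self.root.findall("resource"):
            if record.get("id") == resource_id:
                return record
        return None

    def alter(self, resource_id: str, name: str, value: str) -> bool:
        """Set element `name` of a record, creating it if absent."""
        record = self._find(resource_id)
        if record is None:
            return False
        element = record.find(name)
        if element is None:
            element = ET.SubElement(record, name)
        element.text = value
        return True

    def exclude(self, resource_id: str) -> bool:
        """Remove a record; return False if it does not exist."""
        record = self._find(resource_id)
        if record is None:
            return False
        self.root.remove(record)
        return True

    def search(self, name: str, term: str) -> list:
        """Return ids of records whose element `name` contains `term`."""
        term = term.lower()
        return [
            record.get("id")
            for record in self.root.findall("resource")
            if any(term in (el.text or "").lower() for el in record.findall(name))
        ]


catalog = MetadataCatalog()
catalog.insert("r1", {"title": "Pasture recovery", "subject": "Pasture"})
catalog.insert("r2", {"title": "Maize cultivation", "subject": "Maize"})
catalog.alter("r2", "subject", "Maize; Pasture")
print(catalog.search("subject", "pasture"))
catalog.exclude("r1")
print(catalog.search("subject", "pasture"))
```

A persistent version would write the tree to a file after each change; that step is omitted here to keep the sketch short.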
The metadata creation tool is based on version 1.1 of the recommendation “Dublin Core Metadata Element Set” (Dublin Core..., 1999). The tool incorporates attributes and qualifiers into the description of the Dublin Core elements, which aim to increase the degree of specificity of the data being described. The attributes follow the ISO/IEC 11179 standard for the description of metadata elements and form a set of ten attributes: name, identifier, version, registration authority, language, definition, obligation, datatype, maximum occurrence, and comment.
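The ten attributes can be pictured as the fields of a single record type. The Python sketch below is a hypothetical illustration: the `ElementDefinition` class is not part of the tool, and the values shown for the Title element are given from memory of the version 1.1 reference description and should be checked against that document before being relied upon.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ElementDefinition:
    """The ten ISO/IEC 11179 attributes used to describe one element."""
    name: str
    identifier: str
    version: str
    registration_authority: str
    language: str
    definition: str
    obligation: str
    datatype: str
    maximum_occurrence: str
    comment: str


# Illustrative values for the Title element (assumed, not verified here).
title = ElementDefinition(
    name="Title",
    identifier="Title",
    version="1.1",
    registration_authority="Dublin Core Metadata Initiative",
    language="en",
    definition="A name given to the resource.",
    obligation="Optional",
    datatype="Character String",
    maximum_occurrence="Unlimited",
    comment="Typically, a Title will be a name by which the resource "
            "is formally known.",
)
print([f.name for f in fields(ElementDefinition)])
```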
The qualifiers are values assigned to each of the fifteen Dublin Core elements, described in the attribute “comment”, that need to be distinguished from one another. A qualifier can have an identification (scheme) and/or a value (modifier), and both indicate how the value of the element itself is to be understood. The qualifiers improve the consistency and clarity of the definitions of the Dublin Core metadata elements, making them easier for the user to understand.
The use of XML to store Dublin Core metadata is based on the similarity of the two in their concepts of element and of extensibility. Both Dublin Core elements and XML elements are identified by a set of attributes. In Dublin Core, new elements can be added according to the needs of the application; in XML, new elements can be added by altering the rules that govern the structure of the XML document. The qualifiers need to be stored together with the content of the Dublin Core element. In XML, they are represented as attributes of the XML document element.
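The representation of qualifiers as XML attributes can be sketched as follows in Python. The attribute names `scheme`, `modifier`, and `lang`, the `resource` root element, the `qualified_element` helper, and the sample values are assumptions made for illustration; the paper does not give the actual document structure used by the tool.

```python
import xml.etree.ElementTree as ET


def qualified_element(parent, name, value, scheme=None, modifier=None, lang=None):
    """Append element `name` to `parent`, storing qualifiers as XML attributes."""
    attributes = {}
    if scheme is not None:
        attributes["scheme"] = scheme
    if modifier is not None:
        attributes["modifier"] = modifier
    if lang is not None:
        attributes["lang"] = lang
    element = ET.SubElement(parent, name, attributes)
    element.text = value  # the content of the Dublin Core element
    return element


record = ET.Element("resource")
qualified_element(record, "date", "2000-07-26", scheme="YYYY-MM-DD")
qualified_element(record, "identifier", "http://example.org/doc", scheme="URL")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because the qualifier travels on the same element as the value, a reader of the record can interpret each value without consulting a separate table.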
Conclusion
Incorporating metadata through XML for structuring the information, together with adopting the Dublin Core standard, is of great use in establishing systems such as the Beef Cattle Agency. Besides the fact that mapping the elements defined by the Dublin Core standard into XML is practically direct, the design characteristics of that language allow the set of elements that make up the metadata to be extended, whether or not based on the Dublin Core standard.
New technologies for the organization of information, mainly in the description of electronic resources, represent only the framework for optimizing techniques and processes, such as cataloging, that are based on human intervention.
For future work, it is suggested to study: 1) standardizing the structure of the content of Embrapa publications, with the metadata included in the document itself and XML used as the storage format; 2) evolving the cataloging tool so that it uses the definition of the document structure itself as a parameter, which would make it independent of future updates to the structure of the documents, including alterations in the metadata due to the evolution of Dublin Core.
References
DUBLIN CORE METADATA INITIATIVE. Dublin Core metadata element set, version 1.1: reference description. Available on the Internet: <http://purl.org/dc/documents/rec-dces-19990702.htm> Accessed: 29 May 2000.
GILL, T. Metadata and the World Wide Web. In: BACA, M. Introduction to metadata: pathways to digital information. Available on the Internet:
GILLILAND-SWETLAND, A. J. Setting the stage. In: BACA, M. Introduction to metadata: pathways to digital information. Available on the Internet:
KHARE, R.; RIFKIN, A. XML: a door to automated Web applications. IEEE Internet Computing, p. 78-86, July/Aug. 1997.
MARCONDES, C. H.; GOMES, S. L. R. O impacto da Internet nas bibliotecas brasileiras. Rits, v. 2, n. 2, July 2000. Available on the Internet:
MEY, E. S. A. Introdução à catalogação. Brasília: Briquet de Lemos/Livros, 1995. 123 p.
MILLER, E. An introduction to the resource description framework. Bulletin of the American Society for Information Science, p. 15-19, Oct./Nov. 1998.
MILSTEAD, J.; FELDMAN, S. Metadata: cataloging by any other name... Online: the leading magazine for information professionals, v. 23, n. 1, Jan. 1999. Available on the Internet:
ONLINE COMPUTER LIBRARY CENTER. OCLC Office of Research: Web characterization. Available on the Internet:
[Figure: screenshot of the "Inserção de Recurso" (Resource Insertion) form. It shows numbered fields for the fifteen Dublin Core elements (1. title, 2. creator, 3. subject and keywords, 4. description, 5. publisher, 6. contributor, 7. date, 8. type, 9. format, 10. identifier, 11. source, 12. language, 13. relation, 14. coverage, 15. rights), most with scheme, modifier, and language options, followed by project-specific fields: 16. data center, 17. knowledge tree node, 18. client profile, 19. upload source.]
Figure 1 - Metadata generator template.