Logic-Based Web Information Extraction∗


Georg Gottlob and Christoph Koch
Database and Artificial Intelligence Group, Technische Universität Wien, A-1040 Vienna, Austria.
fgottlob, [email protected]

1 Introduction

The Web wrapping problem, i.e., the problem of extracting structured information from HTML documents, is one of great practical importance. The often-observed information overload that users of the Web experience witnesses the lack of intelligent and encompassing Web services that provide high-quality collected and value-added information. The Web wrapping problem has been addressed by a significant amount of research work. Previous work can be classified into two categories, depending on whether the HTML input is regarded as a sequential character string (e.g., [34, 27, 24, 30, 23]) or a pre-parsed document tree (for instance, [35, 25, 22, 29, 3, 2, 26]). The latter category of work thus assumes that systems may make use of an existing HTML parser as a front end.

Robust wrappers are easier to program using a wrapper programming language that models documents as pre-parsed document trees rather than as text strings. Writing a fully standards-compliant HTML parser is a substantial task, which should not have to be redone from scratch for each wrapper being created. The use of an existing parser allows the wrapper implementor to focus on the essentials of each wrapping task and to work on a higher, more user-friendly level.

Simplifications such as this one are important, since Web service designers often face the task of wrapping a large number of Web sites. In order to provide a useful Web service, the information from a significant number of source sites relevant to the domain of the service has to be integrated and made accessible in a uniform manner. Otherwise, a Web service may fail to attract the acceptance of the users it is intended for. Moreover, Web page layouts may be subject to frequent change. This is often intentional, to discourage screen-scraping wrapper access and to force humans to personally visit the sites. In such a case wrapper programs may need to be adapted.

These are just two reasons for which wrapping tools need to assist humans to render the creation of wrappers a more manageable task. Two ways to approach this requirement have been proposed: the use of machine learning techniques to create wrappers automatically from annotated examples (e.g. [23, 30]) and the visual specification of wrappers. The first approach currently suffers from the need to provide machine learning algorithms with too many example instances (which have to be wrapped manually) and from negative theoretical results that put a bound on the expressive power of learnable wrappers.¹

By the second candidate² for a substantial productivity leap, the visual specification of wrappers, we ideally mean the process of interactively defining a wrapper from one (or few) example document(s) using mainly "mouse clicks", supported by a strong and intuitive design metaphor. During this visual process, the wrapper program should be automatically generated and should not actually require the human designer to know the wrapper programming language. Visual wrapping is now a reality supported by several implemented systems (cf. XWrap [25], W4F [35], and Lixto [2]), however with varying thoroughness.

One may thus want to look for a wrapping language over HTML document trees that

(i) has a solid theoretical foundation,
(ii) provides a good trade-off between complexity and the number of practical wrappers that can be expressed,
(iii) is easy to use as a wrapper programming language, and
(iv) is suitable for being incorporated into visual tools, since ideally all constructs of a wrapping language can be realized through corresponding visual primitives.

Clearly, languages which do not have the right expressive power and computational properties cannot be considered satisfactory, even if wrappers are easy to define. A few words on the "right expressiveness" of a wrapper programming language are in order here. Throughout the literature, the scope of wrapping is a conceptually limited one (see e.g. [11, 20]). Information systems architectures that employ wrapping usually consist of at least two layers, a lower one that is restricted to extracting relevant data from data sources and making them available in a coherent representation using the data model supported by the higher layer, and a higher layer in which data transformation and integration tasks are performed which are necessary to fuse syntactically coherent data from distinct sources in a semantically coherent manner. With the term wrapping we refer to the lower, syntactic integration layer. The higher, semantic integration layer will not be further considered in this article. Therefore, a wrapper is assumed to extract relevant data from a possibly poorly structured source and to put it into the desired representation formalism by applying a number of transformational changes close to the minimum possible. A wrapping language that permits arbitrary data transformations may be considered overkill.

In this article, we provide a survey of the (logic-based) approach to visual wrapping that has been introduced in [2, 3, 16] and has been implemented in a commercial software product, the Lixto Visual Wrapper [26]. The core notion that this wrapping approach is based on is that of an information extraction function, which takes a labeled unranked tree (representing a Web document) and returns a subset of its nodes. A wrapper is a program which implements one or several such functions, and thereby assigns unary predicates to document tree nodes. Based on these predicate assignments and the structure of the input document viewed as a tree, a new tree can be computed as the result of the information extraction process in a natural way, along the lines of the input tree but using the new labels and omitting nodes that have not been relabeled. (See Figure 1 for an example of such a transformation.) That way, we can take a tree, re-label its nodes, and declare some of them as irrelevant, but we cannot significantly transform its original structure. This coincides with the intuition that a wrapper may change the presentation of relevant information, its packaging or data model (which does not apply in the case of Web wrapping), but does not handle substantial data transformation tasks. We believe that this captures the essence of wrapping.

Figure 1: Tree annotated with predicates P, Q, and R defined by information extraction functions (a), and wrapping result (b). (Original node labels of the input tree are not shown.)

We assume unary queries in monadic second-order logic (MSO) over trees as our expressiveness yardstick for information extraction functions. MSO over trees is well-understood theory-wise [38, 7, 5, 8] (see also [39]) and quite expressive. In fact, it is considered by many as the language of choice for defining expressive node-selecting queries on trees (see e.g. [33, 32, 16, 21]; [36] acknowledges the role of MSO but argues for even stronger languages). In our experience, when considering a wrapping system that lacks this expressive power, it is usually quite easy to find real-life wrapping problems that cannot be handled (see also the related discussion on MSO expressiveness and node-selecting queries in [21]).

In this article, we discuss monadic datalog over trees, a simple form of the logic-based language datalog, as a wrapper programming language. As argued for in detail in [16], monadic datalog is the first and currently only language which satisfies desiderata (i) to (iv) raised above.

The article is structured as follows. We start with preliminaries regarding trees and MSO in Section 2 and introduce monadic datalog in Section 3. In Section 4, we discuss the complexity of monadic datalog over trees and its expressive power. Interestingly, monadic datalog is equivalent to MSO in its ability to express unary queries on trees. Monadic datalog can be evaluated in linear time both in the size of the data and the query, given that tree structures are appropriately represented. We also define a simple normal form for monadic datalog over trees, TMNF, to which any monadic datalog program over trees can be mapped in linear time. In Section 5, we discuss monadic datalog as a wrapper programming language. Finally, in Section 6, we look at the specification of wrappers based on monadic datalog without requiring the wrapper implementor to deal with or know datalog, by an entirely visual process that works on example Web documents. We conclude with Section 7.

∗ Database Principles Column. Column editor: Leonid Libkin, Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3H5, Canada. E-mail: [email protected].

¹ For example, it is known that even regular string languages cannot be learned from positive examples only [12].

² A promising direction of future work on Web wrapping may be to combine visual specification with machine learning techniques, in order to make the visual specification process more "intelligent" and at the same time guide the learning process to reduce the number of examples needed to learn a wrapper.

2 Tree Structures

Trees are defined in the normal way and have at least one node. We assume that the children of each node are in some fixed order. Each node has a label taken from a finite nonempty set of symbols Σ, the alphabet.³ We consider only unranked finite trees, which cor-

³ The finite alphabet choice is discussed in more detail below, in Remark 2.1.
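
As a small illustration of the ordered, unranked, labeled trees of this section and of the information extraction functions described in the Introduction, the following Python sketch relabels the nodes selected by a wrapper and omits all others. It is not taken from the article: the names Node, descendants, wrap, by_label, and show are hypothetical, selecting nodes by their label merely stands in for a real wrapper program (for instance one written in monadic datalog), and attaching each kept node to its nearest kept ancestor is only one possible reading of building the result "along the lines of the input tree".

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes can go in sets
class Node:
    """A node of an ordered, unranked, labeled tree."""
    label: str
    children: List["Node"] = field(default_factory=list)


def descendants(root: Node) -> List[Node]:
    """All nodes of the tree in document (pre-)order, root included."""
    out = [root]
    for child in root.children:
        out.extend(descendants(child))
    return out


# An information extraction function maps a tree to a subset of its nodes.
ExtractionFn = Callable[[Node], Set[Node]]


def by_label(label: str) -> ExtractionFn:
    """Hypothetical stand-in for a wrapper rule: select nodes by their label."""
    return lambda root: {n for n in descendants(root) if n.label == label}


def wrap(root: Node, wrapper: Dict[str, ExtractionFn]) -> List[Node]:
    """Relabel selected nodes and omit the rest, preserving document order.

    Returns a forest (list of trees) because the root itself may be omitted.
    Assumption: if several predicates select a node, the first one listed
    in `wrapper` wins.
    """
    new_label: Dict[int, str] = {}
    for predicate, fn in wrapper.items():
        for node in fn(root):
            new_label.setdefault(id(node), predicate)

    def rebuild(node: Node) -> List[Node]:
        kept = [n for child in node.children for n in rebuild(child)]
        if id(node) in new_label:
            return [Node(new_label[id(node)], kept)]
        return kept  # node omitted: its kept descendants move up one level

    return rebuild(root)


def show(node: Node):
    """Nested-tuple view of a tree, convenient for printing and comparing."""
    return (node.label, [show(c) for c in node.children])


doc = Node("html", [Node("body", [
    Node("table", [
        Node("tr", [Node("td"), Node("td")]),
        Node("tr", [Node("td")]),
    ]),
    Node("p"),
])])

wrapper = {"P": by_label("table"), "Q": by_label("tr"), "R": by_label("td")}
result = wrap(doc, wrapper)
print([show(t) for t in result])
```

In this sketch the html, body, and p nodes receive no predicate and are omitted, so the result is a single tree whose shape follows the table, tr, and td nodes of the input, carrying the new labels P, Q, and R.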