Processing XML and TEI Into What? a Free-For-All Pair of Workshops
Total Page:16
File Type:pdf, Size:1020Kb
Processing XML and TEI into What? A Free-for-all Pair of Workshops DHSI 2021, Course #12: Elisa Beshero-Bondar Coursepack contents (coauthored by Elisa Beshero-Bondar and David J. Birnbaum) A. Contents ..................................................................................................................................................................... 1 B. About this course ................................................................................................................................................... 3 C. Syllabus (2019) ......................................................................................................................................................................5 D. Resources (bibliography and links) ............................................................................................................. 8 E. Exercises and tutorials (links) ..................................................................................................................... 10 F. XPath 1. What can XPath do for me? .................................................................................................................... 11 2. The XPath functions we use most ....................................................................................................... 21 G. Regular expressions (regex) 1. Autotagging with regular expressions (regex) ............................................................................. 25 H. XSLT 1. Introduction to XSLT ................................................................................................................................. 33 2. Attribute value templates (AVT) ......................................................................................................... 41 3. The XSLT identity transformation a. Tutorial ..................................................................................................................................................... 45 b. Exercise .................................................................................................................................................... 49 4. Modal XSLT .................................................................................................................................................... 52 5. XSLT, part 2: advanced features .......................................................................................................... 55 6. Using <xsl:analyze-string> ..................................................................................................................... 63 I. Schematron 1. Guide to schema writing with Schematron .................................................................................... 66 2. Validating references with Schematron........................................................................................... 69 3. Coding with unique identifiers and Schematron ......................................................................... 73 J. What’s new in XSLT 3.0 and XPath 3.1 (under development) ...................................................... 76 K. Obdurodon exercises and tests 1. Regular expressions (regex) .................................................................................................................. 81 2. XPath ................................................................................................................................................................. 95 3. XQuery ........................................................................................................................................................... 103 4. XSLT................................................................................................................................................................ 108 5. Schematron ................................................................................................................................................. 129 L. Newtfire exercises and tests 1. Regular expressions (regex) ............................................................................................................... 134 2. XPath .............................................................................................................................................................. 136 3. XQuery ........................................................................................................................................................... 144 4. XSLT................................................................................................................................................................ 147 5. Schematron ................................................................................................................................................. 163 M. Mulberry guides and quick references 1. Guide to using the Oxygen XML Editor (v20.0) ......................................................................... 176 2. XQuery 1.0 and XPath 2.0 functions and operators quick reference .............................. 189 3. Regular expressions in XSLT 2.0, XQuery 1.0 and XPath 2.0 quick reference ............ 191 4. ISO Schematron quick reference ...................................................................................................... 193 5. XPath 2.0 quick reference .................................................................................................................... 195 6. XQuery 1.0 quick reference ................................................................................................................. 197 7. XSLT 2.0 quick reference...................................................................................................................... 199 N. Supplemental readings 1. XQuery (Priscilla Walmsley; sample) ............................................................................................. 202 2. XPath 2.0 and XSLT 2.0 programmer’s reference (Michael Kay; table of contents) .. 232 Instructors: Elisa Beshero-Bondar and David J. Birnbaum | XPath for Document Archaeology and Project Management 3/16/19, 7)38 PM XPath for processing XML and managing projects DHSI 2019 Course 49 (Week 2, 10–14 June, 2019) Course description Syllabus Resources and references Course Pack View on GitHub Instructors: Elisa Beshero-Bondar and David J. Birnbaum Description: Learn XPath intensively and gain superpowers with XML processing! Whether you’ve recently learned XML and want to build something with it, or whether you’ve worked with XPath before but are rusty, new and experienced coders alike will benefit from our course. XPath is usually not the center of a DHSI class, and people often gain hasty “ad hoc” experience with it when learning it only along the way to doing something else. Concentrating intensively for a week on XPath will “power up” what you can do with XML, and will help you refine the way you code your documents. Our course will assist XML coders (whether beginners or experienced) with complex processing of information from markup and from plain text. Our goals are 1) to increase our participants’ confidence and fluency in reading and extracting information coded in XML archives and databases, and 2) to share strategies for systematically reviewing, designing, and building those archives and databases. Because we can “dig” latent information out of the document “strata” of texts, we think of working with XPath as something like planning an archaeology project, turning an XML project into a carefully managed digital dig site for cultural data! In our course you’ll gain experience with writing precise and powerful XPath to illuminate information that isn’t obvious on a human reading. For example, we’ll write XPath to calculate how frequently you have marked a certain phenomenon, or locate which names of persons are mentioned together in the same chapter, paragraph, sentence, stanza, footnote, or other structural unit. We’ll apply XPath to check for accuracy of text encoding—to write schema rules to manage your coding (or your project team’s coding). You will learn how XPath can help you to pull data from your documents into lists, tables, and graphic visualizations. XPath is the center of the course, but we will explore how it applies in multiple XML processing contexts so that you learn how these work similarly and how these are used, respectively, to validate documents and to transform them for publication and other reuse. Thus we devote serious, sustained attention to writing and applying XPath by surveying how it is expressed in a variety of frameworks (including XSLT, XQuery, and Schematron), with a variety of materials (including XML and plain-text documents), and involving a variety of task types (such as date arithmetic to calculate how much time elapsed between dates and string surgery to look for and manipulate patterns inside your coded elements). You’ll gain fluency with XPath expressions and https://ebeshero.github.io/UpTransformation/dhsi-XPath_CourseDescription.html Page 1 of 2 Instructors: Elisa Beshero-Bondar and David J. Birnbaum | XPath for Document Archaeology and Project Management 3/16/19, 7)38 PM patterns, including predicates, operators, functions (from the core library and user-defined), regular expressions, and other features, and we’ll practice these in different XML-related contexts, starting with XQuery, and moving to XSLT and Schematron). Whether you are an XML beginner or a more experienced coder, you’ll find that XPath will