Using XML and PDF Together

Why you don’t necessarily have to choose

Leonard Rosenthol
Director of Software Development
Appligent, Inc.

Copyright©1999-2001, Appligent, Inc.

Overview
· Introduction to XML
· Quick review of PDF
· XML & PDF meta-data
· XML & PDF forms
· XML & PDF structure
· XML & PDF content
· XML & PDF creation

You are here because...
· You’re interested in XML, XHTML & SVG and all the hype surrounding them
· You’re currently working with PDF and XML, but not together
· You were already awake, but lunch isn’t till noon, and you had to find something to kill time
· You’re a friend of mine, and wanted to heckle

How I do things
· You already have a draft version of this document in your proceedings, so you shouldn’t need to take too many notes
· A copy of the final document is on Appligent's website (<http://www.appligent.com>) for your downloading pleasure
· Although I’ve left time at the end for Q & A, I’m more than happy to take questions at any time.

XML: eXtensible Markup Language
· XML is really a specification that allows specific markup languages to be created for specific purposes, all within the same compatible syntax (but not the same tag set!)
· Despite what the name suggests, XML itself, unlike HTML, has no tags to learn.
· It provides a standard organization and set of rules for building any markup language
  · separation of data vs. presentation
  · hierarchical structure
  · human readable

XML History: How we got here
· SGML
  · Standard Generalized Markup Language
  · Developed in 1974 by Charles F. Goldfarb & others as a means to create a single basis for any type of markup language
  · ISO Standard as of 1986
· HTML
  · HyperText Markup Language
  · Developed by Tim Berners-Lee as part of a research project to enable sharing of data over the Internet.
  · Presentation oriented, NOT data oriented!
  · HTML has a fixed set of tags.
  · HTML 3.2 was standardized by the W3C in 1996

XML Specification
· The XML 1.0 specification can be found at <http://www.w3.org/TR/1998/REC-xml-19980210.html>.
· An even better version, the “Annotated XML Specification” by Tim Bray, can be found at <http://www.xml.com/axml/testaxml.htm>

Design Principles of XML
· XML should be straightforwardly usable over the Internet
· XML shall support a wide variety of applications
· XML shall be compatible with SGML
· It shall be easy to write programs that process XML documents
· The number of optional features in XML is to be kept to an absolute minimum, ideally zero

Design Principles of XML (cont.)
· XML documents should be human readable and reasonably clear
· The XML design should be prepared quickly
· The design of XML shall be formal and concise
· XML documents shall be easy to create
· Terseness in XML markup is of minimal importance

HTML vs. XML
· No fixed tag set
· ALL tags have both a start and end
  · Can’t just do <P> - have to do <P></P>
  · Empty elements can be written as <IMG xxx />
· Tags must be perfectly nested
  · <B><I>foo</B></I> - NOT!
· Capitalization of tags is significant
  · <BOLD> text </bold> - NOT!
· Whitespace is significant
  · Spaces, tabs, etc. are maintained!

XML Example

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE customerDB SYSTEM "customerDB.dtd">
<!-- Customer DataBase for some unnamed company -->
<DOCUMENT>
  <CUSTOMER>
    <NAME>
      <LASTNAME>Edwards</LASTNAME>
      <FIRSTNAME>Britta</FIRSTNAME>
    </NAME>
    <DATE>April 17, 1998</DATE>
    <ORDERS>
      <ITEM>
        <PRODUCT>Cucumber</PRODUCT>
        <NUMBER>5</NUMBER>
        <PRICE>1.25</PRICE>
      </ITEM>
      <ITEM>
        <PRODUCT>Lettuce</PRODUCT>
        <NUMBER>2</NUMBER>
        <PRICE>.98</PRICE>
      </ITEM>
    </ORDERS>
  </CUSTOMER>
</DOCUMENT>
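
The rules on the HTML vs. XML slide can be tried out with a few lines of Python. This is only a minimal sketch using the standard library's xml.etree.ElementTree; the helper names is_well_formed and order_total are illustrative and not part of the original deck, and the sample data is the customer document above with the XML declaration and DOCTYPE left out so that no external DTD is involved.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The customer example from the slide (declaration and DOCTYPE omitted).
CUSTOMER_XML = """<DOCUMENT>
  <CUSTOMER>
    <NAME><LASTNAME>Edwards</LASTNAME><FIRSTNAME>Britta</FIRSTNAME></NAME>
    <DATE>April 17, 1998</DATE>
    <ORDERS>
      <ITEM><PRODUCT>Cucumber</PRODUCT><NUMBER>5</NUMBER><PRICE>1.25</PRICE></ITEM>
      <ITEM><PRODUCT>Lettuce</PRODUCT><NUMBER>2</NUMBER><PRICE>.98</PRICE></ITEM>
    </ORDERS>
  </CUSTOMER>
</DOCUMENT>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def order_total(root):
    """Sum NUMBER * PRICE over every ITEM, using Decimal to avoid float rounding."""
    total = Decimal("0")
    for item in root.iter("ITEM"):
        quantity = int(item.findtext("NUMBER"))
        price = Decimal(item.findtext("PRICE"))
        total += quantity * price
    return total


if __name__ == "__main__":
    # Each rule from the slide, checked against the parser.
    print(is_well_formed("<P>"))                  # start tag without end tag
    print(is_well_formed("<P></P>"))              # start and end tag
    print(is_well_formed("<B><I>foo</B></I>"))    # badly nested
    print(is_well_formed("<BOLD> text </bold>"))  # case mismatch
    # Whitespace inside an element is kept as-is.
    print(repr(ET.fromstring("<P>  a\tb  </P>").text))
    # Because the data is structured, totals fall out of a simple walk.
    print(order_total(ET.fromstring(CUSTOMER_XML)))
```

Because the markup describes data rather than presentation, the same document can feed an invoice, a report, or a form without any screen scraping.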
XML Example 2

<?xml version="1.0" encoding="UTF-8"?>
<!-- Minneapolis Airline Schedule - January 3rd, 1998 -->
<SCHEDULE>
  <AIRLINE>NorthWest</AIRLINE>
  <FLIGHT>
    <NUMBER>449</NUMBER>
    <STATUS>Cancelled</STATUS>
  </FLIGHT>
  <FLIGHT>
    <NUMBER>640</NUMBER>
    <STATUS depart="0100">Delayed</STATUS>
  </FLIGHT>
  <AIRLINE>TWA</AIRLINE>
  <FLIGHT>
    <NUMBER>1010</NUMBER>
    <STATUS gate="17 Gold">On Time</STATUS>
  </FLIGHT>
</SCHEDULE>

PDF: “the reliable digital master”
· final form presentation
· container for associated materials
· multimedia
· interactivity
· security (encryption)
· authenticity (digital signatures)

Vive la Différence
· Content vs. Presentation
  · Although new XML grammars are appearing that start to move into PDF’s “space” (SVG, SMIL, XML-Sigs), XML’s true strength is in structure and data exchange.
· All in one
  · XML provides wonderful tools (XPath, XLink, XPointer) for linking content around the Web, but has no provision for “bundling it all together into a single package”
· Size does matter
  · PDF files will always be smaller than XML, given their ability to incorporate binary data and selective compression

Let’s work together
· XML & PDF forms
· XML & PDF meta-data
· XML & PDF structure & content
· XML & PDF creation

Forms
· FDF: Forms Data Format
· More details on FDF
· Sample FDF
· That’s just like XML
· Good-bye fair FDF, we knew you well
· XForms
· XForms Requirements
· Sample XForms
· XML form filling

FDF: Forms Data Format
· Documented in Appendix H of the PDF 1.3 Specification
· FDF is used when submitting Form data to a server, receiving the response, and incorporating it into the Form. It can also be used to generate (i.e., “export”) stand-alone files containing Form data that can be stored, transmitted electronically (e.g., via email), and imported back into the corresponding Form.
· FDF can also be used to control more of the document structure. That is, constructs within FDF allow it to control which Acrobat Forms are used in the creation of a new PDF document. This functionality can be used to create complex documents dynamically.
· FDF is also used to define a container for annotations that are separate from the PDF document to which the annotations apply.

More details on FDF
· FDF is based on PDF, and uses the same syntax and set of basic object types as PDF.
· FDF also has the same file structure as PDF, except that the cross-reference table is optional.
· The document structure is much simpler than PDF, since the body of an FDF document consists of only one required object.
· Objects in FDF can only be of generation 0; no two objects can have the same object number, and FDF files cannot have updates appended to them.
· The value of the Length attribute in the dictionary of any stream object appearing inside an FDF document must be a direct object.

Sample FDF

%FDF-1.2
1 0 obj
<< /FDF << /Fields 2 0 R >> >>
endobj
2 0 obj
[
<< /T (name) /V (Virginia Gavin) >>
<< /T (birth) /V (9/16/59) >>
<< /T (sex) /V (F) >>
<< /T (address) /V (215 E Providence Rd. Aldan, PA 19018 610-284-4006) >>
<< /T (ssnum) /V (555-222-1512) >>
<< /T (employer) /V (Digital Applications, Inc.) >>
]
endobj
trailer
<< /Root 1 0 R >>
%%EOF

That’s just like XML

<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://www.adobe.com/std/schema/xfdf" xml:space="preserve">
  <fields>
    <field name="name">
      <value>Virginia Gavin</value>
    </field>
    <field name="birth">
      <value>9/16/59</value>
    </field>
    <field name="sex">
      <value>F</value>
    </field>
    <field name="address">
      <value>215 E Providence Rd. Aldan, PA 19018 610-284-4006</value>
    </field>
    <field name="ssnum">
      <value>555-222-1512</value>
    </field>
    <field name="employer">
      <value>Digital Applications, Inc.</value>
    </field>
  </fields>
</xfdf>

Good-bye fair FDF, we knew you well
· Adobe has made it clear that the future of data exchange for their products will be an XML-based syntax
· Therefore it’s probably a good bet that future versions of Acrobat will use an XML-based grammar (syntax) rather than FDF to represent form data
· There are a number of limitations with FDF and Acrobat forms in general that can be removed by moving to XML, not the least of which is the inability to have both form field names AND values in Unicode!

XForms: the future of Web Forms
· The replacement for the current HTML form technology.
· Based on XML; separates data, logic and presentation
· Currently a W3C recommendation as part of the XHTML working group.

XForms Requirements
· Defined in XML, usable in any XML grammar
· Migration from HTML 4
· Ease of Authoring
· Separate Purpose from Presentation
· Integrate with DOM
· Device and Application Independence
· Unicode, Internationalization, and Region Independence
· Modular Construction

Sample XForms

<XFA>
  <Subform Name="order">
    <Proto>
      <Format ID="telno">
        <Picture>999-999-9999</Picture>
      </Format>
    </Proto>
    <Subform Name="customer">
      <Field Name="name">
        <Caption> <Value> <Text>Your name:</Text> </Value> </Caption>
      </Field>
      <Field Name="street" W="40ch">
        <Caption> <Value> <Text>Street:</Text> </Value> </Caption>
      </Field>
      <Field Name="phone" W="40ch">
        <Caption> <Value> <Text>Phone:</Text> </Value> </Caption>
        <Format Use="#telno"/>
      </Field>
    </Subform> <!-- end customer subform -->
  </Subform>
</XFA>

Meta-data
· What is Meta-data?
· PDF’s “Info Dictionary”
· XML in the Info Dictionary

What is Meta-data?
· information contained in a document, outside of its content, that describes the document
  · document history & information
  · description, keywords
  · digital rights information
  · and just about anything else you can think of!

PDF’s “Info Dictionary”
· a set of defined entries
  · Author, Title, Keywords, CreationDate, etc.
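
Returning briefly to the forms slides: the Sample FDF and its XFDF counterpart carry the same name/value pairs, so the mapping between them can be sketched in a few lines of Python. This is an illustration only, assuming flat fields with no nested or escaped parentheses; the helper names fdf_fields and to_xfdf are hypothetical, and the namespace URI is simply the one shown on the XFDF slide.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied from the XFDF slide; other XFDF versions may differ.
XFDF_NS = "http://www.adobe.com/std/schema/xfdf"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize XFDF elements without a prefix (as a default namespace).
ET.register_namespace("", XFDF_NS)

# Matches simple "/T (name) /V (value)" pairs.
# Assumption: no escaped or nested parentheses and no hierarchical fields.
FIELD_RE = re.compile(r"/T\s*\((.*?)\)\s*/V\s*\((.*?)\)", re.DOTALL)


def fdf_fields(fdf_text):
    """Return a list of (name, value) tuples found in a simple FDF body."""
    return FIELD_RE.findall(fdf_text)


def to_xfdf(fields):
    """Build an XFDF-style document (as a string) from (name, value) pairs."""
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    root.set(f"{{{XML_NS}}}space", "preserve")
    fields_el = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in fields:
        field_el = ET.SubElement(fields_el, f"{{{XFDF_NS}}}field", name=name)
        ET.SubElement(field_el, f"{{{XFDF_NS}}}value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = """
    << /T (name) /V (Virginia Gavin) >>
    << /T (birth) /V (9/16/59) >>
    << /T (employer) /V (Digital Applications, Inc.) >>
    """
    print(to_xfdf(fdf_fields(sample)))
```

A real converter would need a proper PDF-syntax tokenizer for string escapes, hex strings and nested field hierarchies; the regular expression here only covers the flat case shown on the slide.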
