Vector Graphics: from Postscript and Flash to SVG Steve Probets, Julius Mong, David Evans, David Brailsford
Total Page:16
File Type:pdf, Size:1020Kb
Vector Graphics: From PostScript and Flash to SVG Steve Probets, Julius Mong, David Evans, David Brailsford School of Computer Science & IT University of Nottingham (+44) (0)115 9514230 sgp|jxm|dre|[email protected] ABSTRACT standard for displaying vector graphics on the Web. However, SWF uses a binary rather than a text-based format which The XML-based specification for Scalable Vector provides few ‘hooks’ for text searchability or image extraction. Graphics(SVG), sponsored by the World Wide Web consortium, allows for compact and descriptive vector graphics for the Web. In early 1998 the World Wide Web consortium (W3C) invited proposals for a new standard for vector graphics on the Web. SVG's domain of discourse is that of graphic primitives whose Given the widespread adoption of XML[2] meta-syntax as a optional attributes express line thickness, fill patterns, text size means of expressing new functionality in Web documents it was and so on. These primitives have very different properties from to not surprising that two of the initial proposals those of the traditional document components (e.g. sections, –PGML (from paragraphs etc.) that XML is normally called upon to express. Adobe Systems Inc.) and VML (from Microsoft Inc.)–were XML based. In September 1998 these proposals were subsumed into a This paper describes a set of three tools for creating SVG, either new draft proposal called SVG (Scalable Vector Graphics) under from first principles or via the conversion of existing formats. the auspices of the W3C[3]. The SVG working group has more The ab initio generation of SVG is effected from a server-side than 20 members representing most of the major companies in CGI script, using a PERL library of drawing functions; later the graphics software industry. The SVG 1.0 proposal is now a sections highlight the problems of converting Adobe PostScript W3C proposed recommendation. It would be fair to say that the and Macromedia's Shockwave format (SWF) into SVG. graphics model of SVG owes much to PostScript, whereas the tags and their attributes have been influenced by VML. The text- Keywords based nature of SVG (and the tree structure implicit in its XML SVG, Flash, SWF, PDF, PostScript. representation) should make text search and graphics transformations relatively straightforward. 1. INTRODUCTION The remaining sections of this paper focus first of all on The World Wide Web's enormous success has been achieved by generating SVG from first principles via PERL scripts, followed marking up Web documents with a simple set of HTML tags to by a description of how PostScript and Shockwave/SWF can be denote basic structural and layout concepts such as bold converted to SVG. Firstly we focus on SVG-PL, which is a headings, tables etc. But perhaps the most striking weakness of library of Perl functions for generating ‘SVG on the fly’ from today's Web technology is that all graphic material has to be Web CGI scripts. The great advantage of creating such a library expressed in one of three raster-based formats (GIF, JPEG, from first principles is that its design can faithfully reflect the PNG). Such formats distort badly at any scaling factor other than possibilities for object attributes, and object groupings, that are 1:1, owing to the pixel-by-pixel way in which the image data is inherent in the SVG target language. stored. Moreover, these formats are innately non-searchable and We then discuss two further software tools for converting this is a severe limitation given that a common use of graphics on PostScript and Flash into SVG. In both cases the faithfulness of the Web is for representation of annotated maps and line the conversion to SVG, and the degree of code optimisation that diagrams, together with exotic characters and logotypes that lie can be achieved, depends on the degree of similarity between the outside conventional character sets. underlying graphics models of source and target languages. If In recent years Macromedia's Flash and its associated translation occurs from a less complex format into a richly Shockwave format (SWF)[1] have achieved a degree of descriptive format then, in order to make maximum use of the acceptance, via plug-in software to popular Web browsers, as a richer format, intelligence must be derived from the translation process. Bottom-up inference of intent is notoriously hard. For . example, PostScript to SVG is an obviously desirable translation Permission to make digital or hard copies of all or part ofthis ork for and yet most PostScript is in final form, making it difficult to personal or classroom use is granted ithout fee provided that copies decide which parts of a graphical object are inter-related and are not made or distributed for profit or commercial advantage and that capable of being abstracted into groups. copies bear this notice and the full citation on the first page. To copy otherise, or republish, to post on servers or to redistribute to lists, The fact that SVG is an application of XML means that the reuires prior specific permission andor a fee. purely syntactic aspects of the target language are well specified. DocEngl, November 910,2001, tlanta, eorgia, US. Copyright 135 But even within this well-defined framework the design of the The meta-syntax for specifying the DTD itself is similar to SVG tagset has to decide whether certain graphic features shall SGML, but is neither identical to it nor a proper subset of it. be elements in their own right or be hierarchically nested sub- Moreover a DTD is very limited in the amount of type elements. Thus it will be helpful to develop some background to information it can convey, either for the elements themselves or the XML meta-syntax, since this affects certain of the non- for their attributes. For these reasons XML parsers (while graphic semantic properties of SVG. continuing to accept DTDs as a possibility for tagset definition) are increasingly able to accept schemas which specify an XML 2. XML AND SGML tagset (with more powerful typing information than a DTD) XML is designed to be a simplified subset of the Standard using XML meta-syntax. Generalised Mark-up Language (SGML) which has been used in Among the dozen or so restrictions of XML compared to full the publishing community for many years. Like SGML, XML is a SGML, three notable restrictions are: standard meta-syntax . The thing that is ‘standard’ about it is not that it defines a fixed tagset but, rather, that it defines the • No tag omissions, all start and end tags must be present. alphabet and the ‘punctuation’ that are used within and around • All attribute names (c.f. ‘formal parameter names’ in the defined tags. For example, angle brackets are used to programming languages) must be fully specified and attribute delineate a tag (so the start of a paragraph might be marked by values (c.f. ‘actual parameter values’ in programming language <para> ) whereas a forward slash following an open angle function calls) must be enclosed in quotation marks. bracket is the meta-notation for an end tag (so </para> would • Tags that do not enclose embedded content, but which may denote the end of a paragraph.) By contrast, although HTML still have attributes, are allowed a special shorthand form uses SGML meta-syntax, it is not a subset of it. HTML is an which elides the end tag with the start tag. Examples of this application of SGML; it is a fixed set of tags which have been would be <BR/> and <EMPTY attributes = defined for straightforward interpretation by Web browser "allowed"/>. software. XML has its own specification[2] which sets out the allowed A document type definition (DTD) can be made available to an syntax for the tags, together with their related attributes and any SGML or XML parser and this sets out a syntax for the tags that required tag ordering constraints. will be allowed when marking up documents of a particular class. Thus, a DTD is functionally similar to a meta-syntactic XML can represent nested document structures (e.g. sections, definition for the syntax of a programming language and a full sub-sections and so on) and document elements which are SGML system is capable of creating a parser for the given sequenced (e.g. a sequence of paragraphs forming a section of a document class directly from the DTD itself (this is reminiscent document). More generally it can represent arbitrary tree of the use of a meta-syntax or ‘two-level grammar’ in languages structures, with optional attributes attached to the internal nodes, such as Algol 68). and the Document Object Model (DOM) allows the tree structure to be traversed, inspected and modified. The particular properties Although it is possible to specify, in a meta-syntax, a wide range of the DOM for SVG will be addressed in a later section. of structural options within the generated language, this is not generally regarded as a good idea; many years of compiler- 3. PROPERTIES OF SVG writing experience have persuaded language designers not to flirt Because SVG is XML-based, the files are generally easier to with anything that allows valid programs in a language to be parse than the binary format files of SWF, and there is the added either difficult to parse or to veer towards ambiguity. benefit that elements within the files are stored as text, enabling Nevertheless, full SGML has an ability, in a DTD, to specify search engines to index SVG files, or SVG components within ‘inclusions’ and ‘exclusions’, together with optional end-tag and HTML files.