Vector Graphics: From PostScript and Flash to SVG Steve Probets, Julius Mong, David Evans, David Brailsford
School of Computer Science & IT University of Nottingham (+44) (0)115 9514230 sgp|jxm|dre|[email protected] ABSTRACT standard for displaying vector graphics on the Web. However, SWF uses a binary rather than a text-based format which The XML-based specification for Scalable Vector provides few ‘hooks’ for text searchability or image extraction. Graphics(SVG), sponsored by the World Wide Web consortium, allows for compact and descriptive vector graphics for the Web. In early 1998 the World Wide Web consortium (W3C) invited proposals for a new standard for vector graphics on the Web. SVG's domain of discourse is that of graphic primitives whose Given the widespread adoption of XML[2] meta-syntax as a optional attributes express line thickness, fill patterns, text size means of expressing new functionality in Web documents it was and so on. These primitives have very different properties from to not surprising that two of the initial proposals those of the traditional document components (e.g. sections, –PGML (from paragraphs etc.) that XML is normally called upon to express. Adobe Systems Inc.) and VML (from Microsoft Inc.)–were XML based. In September 1998 these proposals were subsumed into a This paper describes a set of three tools for creating SVG, either new draft proposal called SVG (Scalable Vector Graphics) under from first principles or via the conversion of existing formats. the auspices of the W3C[3]. The SVG working group has more The ab initio generation of SVG is effected from a server-side than 20 members representing most of the major companies in CGI script, using a PERL library of drawing functions; later the graphics software industry. The SVG 1.0 proposal is now a sections highlight the problems of converting Adobe PostScript W3C proposed recommendation. It would be fair to say that the and Macromedia's Shockwave format (SWF) into SVG. graphics model of SVG owes much to PostScript, whereas the tags and their attributes have been influenced by VML. The text- Keywords based nature of SVG (and the tree structure implicit in its XML SVG, Flash, SWF, PDF, PostScript. representation) should make text search and graphics transformations relatively straightforward. 1. INTRODUCTION The remaining sections of this paper focus first of all on The World Wide Web's enormous success has been achieved by generating SVG from first principles via PERL scripts, followed marking up Web documents with a simple set of HTML tags to by a description of how PostScript and Shockwave/SWF can be denote basic structural and layout concepts such as bold converted to SVG. Firstly we focus on SVG-PL, which is a headings, tables etc. But perhaps the most striking weakness of library of Perl functions for generating ‘SVG on the fly’ from today's Web technology is that all graphic material has to be Web CGI scripts. The great advantage of creating such a library expressed in one of three raster-based formats (GIF, JPEG, from first principles is that its design can faithfully reflect the PNG). Such formats distort badly at any scaling factor other than possibilities for object attributes, and object groupings, that are 1:1, owing to the pixel-by-pixel way in which the image data is inherent in the SVG target language. stored. Moreover, these formats are innately non-searchable and We then discuss two further software tools for converting this is a severe limitation given that a common use of graphics on PostScript and Flash into SVG. In both cases the faithfulness of the Web is for representation of annotated maps and line the conversion to SVG, and the degree of code optimisation that diagrams, together with exotic characters and logotypes that lie can be achieved, depends on the degree of similarity between the outside conventional character sets. underlying graphics models of source and target languages. If In recent years Macromedia's Flash and its associated translation occurs from a less complex format into a richly Shockwave format (SWF)[1] have achieved a degree of descriptive format then, in order to make maximum use of the acceptance, via plug-in software to popular Web browsers, as a richer format, intelligence must be derived from the translation process. Bottom-up inference of intent is notoriously hard. For . example, PostScript to SVG is an obviously desirable translation Permission to make digital or hard copies of all or part ofthis ork for and yet most PostScript is in final form, making it difficult to personal or classroom use is granted ithout fee provided that copies decide which parts of a graphical object are inter-related and are not made or distributed for profit or commercial advantage and that capable of being abstracted into groups. copies bear this notice and the full citation on the first page. To copy other ise, or republish, to post on servers or to redistribute to lists, The fact that SVG is an application of XML means that the re uires prior specific permission and or a fee. purely syntactic aspects of the target language are well specified. DocEng l, November 9 10,2001, tlanta, eorgia, US . Copyright
135 But even within this well-defined framework the design of the The meta-syntax for specifying the DTD itself is similar to SVG tagset has to decide whether certain graphic features shall SGML, but is neither identical to it nor a proper subset of it. be elements in their own right or be hierarchically nested sub- Moreover a DTD is very limited in the amount of type elements. Thus it will be helpful to develop some background to information it can convey, either for the elements themselves or the XML meta-syntax, since this affects certain of the non- for their attributes. For these reasons XML parsers (while graphic semantic properties of SVG. continuing to accept DTDs as a possibility for tagset definition) are increasingly able to accept schemas which specify an XML 2. XML AND SGML tagset (with more powerful typing information than a DTD) XML is designed to be a simplified subset of the Standard using XML meta-syntax. Generalised Mark-up Language (SGML) which has been used in Among the dozen or so restrictions of XML compared to full the publishing community for many years. Like SGML, XML is a SGML, three notable restrictions are: standard meta-syntax . The thing that is ‘standard’ about it is not that it defines a fixed tagset but, rather, that it defines the • No tag omissions, all start and end tags must be present. alphabet and the ‘punctuation’ that are used within and around • All attribute names (c.f. ‘formal parameter names’ in the defined tags. For example, angle brackets are used to programming languages) must be fully specified and attribute delineate a tag (so the start of a paragraph might be marked by values (c.f. ‘actual parameter values’ in programming language
and
136
137 A complete PERL program (ignoring some initialisation code) fixed overhead for invoking the libraries and setting up File and using SVG-PL, which generates a simple ‘Hello World’ SVG Graphic objects. For larger examples the Perl code is about 50% drawing would be: the size of the generated SVG because of the short aliases already discussed and the availability of control constructs such as ‘for’ and ‘while’ to simplify the generation of similarly- ### Hello World! ### structured SVG elements. #OPEN FILE There are also functions which allow users define a set of x and my $svg = File->new(); y coordinates as a pivot point to which a text string could be $svg->open("Hello.svg", "public", aligned. For example one could indicate the top left hand corner "encoding", "iso-8859-1", "silent"); of a text string to be the point specified by the user-defined set of #BEGIN GRAPHICS x and y coordinates, and thus have the text aligned to the left and my $g = $svg->beginGraphics(); bottom against the pivot point. An example SVG drawing generated from use of such text alignment functions provided by # TEXT lines SVG-PLisshowninfigure1.: $g- >b("svg","viewBox","0","0","600","400","wid th","400","height","300"); $g- >b("text","xval","200","yval","200","style" ,"font-size:24pt; font-face:Helvetica"); $g->printTxt("Hello World"); $g->e(); $g->e(); $svg->close($g); # END line Figure 1: SVG Drawing Generated From Text Alignment Functions In the first OPEN FILE line, a local variable ($svg) is defined SVG code produced using the current release of SVG-PL is, in and initialised to be the reference to a new File object. The some sense, about 80% optimised. This is due to the fact that the second line opens a file for output and prints the relevant headers begin ...() and end...() constructs are a natural way to the SVG file. The BEGIN GRAPHICS line creates a of expressing SVG's groupings of graphic primitives having the Graphics object from the File object and all graphic same attributes and SVG-PL also has some low-level elements must be drawn using Graphics methods only. The optimisations that will, for example, convert multiple lineto TEXT lines first open a drawing area, then define a text marking commands into a single one. It is hoped to implement better area for putting the text string ‘Hello World’ onto the page at a optimisations in the near future. For example, attributes within a specified location in user co-ordinates. The two e() statements group boundary are currently not remembered by SVG-PL; if close the text followed by the graphics area. The END line closes elements within a group boundary repeat the attributes already first the Graphics object and finally the File object as well as the defined in the group statement then the repeated attributes can be output file. discarded. Moreover, there is useful functionality in the The generated SVG code would look like the following: companion PDF-PL library for generating Acrobat/PDF (namely, a function to save the current graphics state) which has not yet been implemented in SVG-PL. This and other improvements are planned for the next release of SVG-PL. an interpreted graphics description language, which uses postfix notation, and which is capable of specifying complex two- Note that in this example the number of lines in the SVG code is dimensional paginated graphics in a manner which is both device rather less than that in the Perl program, largely because of the and resolution independent.
138 The Portable Document Format (PDF)[8] is another page quickly as possible and where optimisation is not a major issue. description standard which has been developed by Adobe There is a large amount of ‘graphic state’ information implicitly Systems, with an imaging model which is very similar to that of present in GhostScript's data structures but it is not usually Level 2 PostScript[9]. necessary to probe this state extensively to generate adequate code for a printer. Indeed, many target printers will be building 5.1 PostScript/PDF and SVG up their own internal notion of ‘state’ from the stream of As part of our SVG project, we have been developing a commands that they receive. For these reasons GhostScript Ghostscript driver to output SVG files. Ghostscript is a output drivers generally retain little or no knowledge of what has PostScript interpreter, which is available under the terms of the gone before, or what is coming next. GNU Public License, and in commercial forms too. It is widely To achieve the desirable code optimisation of grouping elements used as a free PostScript interpreter on various operating together where possible, the GhostScript SVG driver must save systems, both for on-screen display and for printing to various substantial extra state information to decide when it is worth printers through its various ‘back-end’ printer drivers. As well as while to start a group. This grouping of repeated graphical Ghostscript's wide use and extensibility, another benefit is that elements is possible in SVG by using
139 Flash files are created using a Flash editor available from For comparison, one way in which Flash could define the shape Macromedia, which generates files with a fla extension. These is as follows: files are editable by Flash tools and the objects, images and move to A, line to B with fill left = blue, animations can be altered. Flash movies for Web distribution are line to D with fill left = blue, line to C turned into final form by exporting them as SWF files from with fill left = blue and fill right = red, within the editor. This format stops the file from being editable line to E with fill right = red, line to F and is a compact format suitable for Web distribution. The SWF with fill right = red, line to D with fill format defines shapes, which are then placed onto a stage using a right = red, move to A, line to C with fill transformation matrix to scale and position them. The right of blue transformation matrix can be altered from frame to frame to The example above illustrates that in SWF, each line can have create the animation effects. This leads to sequences of controls two fill styles, one of which fills to the left of the line, the other such as: to the right. This difference causes some problems when Define shape 1 converting SWF into SVG. Start and end coordinates of Define shape 2 individual lines need to be compared and those with adjoining Place shape 1 coordinates and similar attributes can be assigned to SVG paths. Place shape 2 6.2 Properties of Groups and Shapes Show frame 1 A further difference between SVG and SWF is that SVG has a Place shape 1 in new position hierarchical model whereas Flash is essentially linear. Groupings of primitives in SVG can support sub-groups which have Place shape 2 in new position inherited properties, enabling common properties to be Show frame 2 abstracted up the hierarchy and to affect whole sub-groups. In comparing SWF and SVG the key differences in the semantics Groups and sub-groups can be named, and named groups can be of the graphics primitives, their grouping into shapes and the referenced through the Document Object Model. Properties such manipulation of these shapes have to be addressed. as styles, fill colour etc. can be manipulated through the DOM by JavaScript or other XML transformation techniques such as XSL. 6.1 Grouping Graphics Primitives into Shapes If the groups are designed intelligently then DOM manipulations Both SVG and SWF allow shapes to be built up from paths can be used to facilitate object re-use and reduce file size. For described using position, line and curve primitives (note that example, let us assume the effect required is to draw the outline SWF curve primitives are described using quadratic B-splines). of a black circle within a square filled in red. If a similar shape is In both languages the graphics primitives have associated required, but with the circle filled in blue (i.e. in a later frame in properties such as fill and stroke colours, and patterns. SVG uses an animation) (see figure 3.), there is no need to define two a similar model to PostScript and PDF in that a path is a set of separate circles; rather, the fill-colour attribute of the circle could points that have equivalent stroke and fill properties. SWF, on be altered. the other hand, has a more flexible method of defining paths, where adjacent primitives can be described in the same path but with different associated fill or stroke properties. This can best be described by looking at the following simple shape (figure 2). A CE