Technical Advisory Service for Images Advice Paper

Choosing a

Introduction

Over the years, a number of image file formats have been proposed and used. Every year this choice grows as new file formats are introduced, and it is not always immediately clear which file format is the best one to use in any particular case. The choice will depend on a number of factors, which will vary according to how you intend to use the file. Each stage of the process, from capture through to delivery, has its own requirements that may affect this choice.

This report takes a brief look at some of these factors and provides guidelines for making the best choice from what is available.

For a full introduction to the file formats themselves, see the TASI Advice Document File Formats and Compression - http://www.tasi.ac.uk/advice/creating/fformat.html.

Choose a non-proprietary open ‘standard’

Despite the large range of available file formats, choosing one should not be too hard, as only a very few of them are normally recommended for digitisation projects. Any digitisation project will need to consider the long-term usefulness and accessibility of the images, and this means choosing a file format that is both an established industry ‘standard’ and non-proprietary. This narrows the range to a manageable number, of which the four most common are:

• Tagged Image File Format (TIFF)
• Portable Network Graphics (PNG)
• Joint Photographic Experts Group File Interchange Format (JPEG or JFIF)
• Graphic Interchange Format (GIF)

Copyright © TASI www.tasi.ac.uk advice – training – resources Last reviewed May 2006 Page 1 of 10

There can be good reasons why a project might wish or need to use another file format at some stage, such as one of the proprietary formats, including:

• Adobe Acrobat File (PDF)
• Adobe Photoshop Image File (PSD)
• The camera’s native RAW file

However this is likely to come about because of some specific need of a particular project and cannot be covered here. For details of these file types and many others, please see the TASI Advice Document File Formats and Compression - http://www.tasi.ac.uk/advice/creating/fformat.html.

File Formats for Capture

This is the first step in the digitisation process. When capturing images, it is important that they are all created at the highest possible quality and at a size appropriate for all subsequent uses. Errors at this point will certainly compromise the quality of the whole project and the only recovery option will be to go back and re-capture the original.

All digital capture devices originally capture values of Red, Green and Blue. The number of different describable colours (or tones of grey) will depend upon the ‘bit-depth’ of the device. Any modern device will be able to capture in at least 24-bit colour (or 8-bit B&W) (see the TASI Advice Document The Digital Image - http://www.tasi.ac.uk/advice/creating/image.html), although some modern devices can capture at higher bit depths, right up to 48-bit.
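The relationship between bit depth and the number of describable colours is simply a power of two. The short Python sketch below illustrates this; the function name is ours and is used for illustration only.

```python
def describable_values(bits_per_pixel: int) -> int:
    """Number of distinct colours (or grey tones) one pixel of this bit depth can hold."""
    if bits_per_pixel < 1:
        raise ValueError("bit depth must be at least 1")
    return 2 ** bits_per_pixel


# The bit depths mentioned in the text above.
for label, bits in [("8-bit B&W", 8), ("24-bit colour", 24), ("48-bit colour", 48)]:
    print(f"{label}: {describable_values(bits):,} values")
```

An 8-bit greyscale image can therefore describe 256 tones and a 24-bit colour image about 16.7 million colours.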

Some of the more advanced cameras offer their own un-processed RAW formats. These files contain all of the original data as captured by the sensor without alteration. These images are then processed on the computer, where fine adjustments can be made to the white balance, exposure and sharpness before saving in a non-proprietary format. RAW files usually contain higher bit depths than the equivalent JPEGs produced by the camera.

Once the capture device has created the image, it must be saved for later use.

Format Requirements

A file format should be chosen that:

• Retains all information that was created by the capture device. This will mean using a file format that can store the image in at least the same colour depth as it was created. 24-bit for colour and 8-bit for B&W should be considered the minimum, although files captured with a larger bit-depth should really be archived with this information
• Retains any capture device colour management information (ICC profile)
• Uses (or can be set up to use) no compression

The suggested format here is: TIFF or the proprietary format of the capture device.

Although it is normally advisable to avoid all proprietary file formats, there can be an argument for the temporary use of a proprietary format within the scanning software if it is able to offer some level of additional functionality. However it would still need to be converted to another open standard format before being archived.

When we mention or specify TIFF, it is important to realise that the TIFF file format comes in a range of types, supporting different functionality, such as multipages and even a choice of compressions including JPEG. So when we specify TIFF for archival purposes we always mean an uncompressed Baseline TIFF v6 with Intel byte order (PC option).
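As an illustration of the byte order option, a TIFF file declares it in its first four bytes: "II" (Intel, little-endian) or "MM" (Motorola, big-endian), followed by the number 42 stored in that byte order. The Python sketch below reads only this header; the function name is hypothetical. It does not check compression, which is recorded separately in the file's Compression tag (tag 259, where a value of 1 means uncompressed).

```python
import struct


def tiff_byte_order(header: bytes) -> str:
    """Return 'Intel' or 'Motorola' for the first bytes of a TIFF file."""
    if len(header) < 4:
        raise ValueError("need at least the first 4 bytes of the file")
    order = header[:2]
    if order == b"II":
        fmt, name = "<H", "Intel"      # little-endian (the 'PC option')
    elif order == b"MM":
        fmt, name = ">H", "Motorola"   # big-endian
    else:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(fmt, header[2:4])
    if magic != 42:                    # every classic TIFF carries the number 42 here
        raise ValueError("not a TIFF file")
    return name


# In-memory headers stand in for real files in this example.
print(tiff_byte_order(b"II*\x00"))   # Intel
print(tiff_byte_order(b"MM\x00*"))   # Motorola
```

For a file on disk, the same function could be given the result of reading the first four bytes in binary mode.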

File Formats for Master Archive

There are two possible methodologies for creating a Master Archive and both have advantages, depending on the project.

Method 1 – Archive all data exactly as created by the capture device. The Master Archive contains a copy of each image in a form as close as possible to the original captured data. This enables the project to go back to the archive knowing that they have an exact copy of everything that was originally created by the capture device for the project. It should be realised that as images are pre-optimisation, they might not look as good as those archived using Method 2. They will be in a totally original form but not necessarily the highest visual quality. With this approach, it is important to use a colour space that in no way compromises the colour gamut of the original data. This will often mean leaving the image within the capture device’s own colour space, but could mean using a larger or unbounded colour space such as CIE Lab. See the TASI Advice Document on Colour Management – http://www.tasi.ac.uk/advice/creating/colour.html.

Method 2 – Archive an optimised version of the image file. The Master Archive contains a copy of each image after it has been prepared and optimised for use at its highest quality (see Basic Guidelines for Image Capture and Optimisation - http://www.tasi.ac.uk/advice/creating/img_capt.html). This has the advantage of archiving the image in a ready-to-use state. The optimisation need only be done once, and all images can be handled in a consistent way. However, it is inevitable that some data will have been lost in the process, and if the optimisation is in any way inappropriate or badly undertaken then the project will be unable to go back to the original data and work from there. For this approach it would make sense to save the image in a colour space appropriate for the intended future use of the image. (Adobe RGB 1998 would be advised for print/Web, but sRGB could be used if the only delivery medium was going to be the Web.)

Format Requirements

The requirements of a file format for archiving are the same as for creation, except that it should also:

• Be an open standard file format - proprietary formats should not be used, as there is uncertainty about the ability to open the file in the future. A possible exception to this might be the Adobe Photoshop format - see below
• Preferably not use any compression, although lossless compression may be acceptable. Be aware that one of the most common lossless compressions is LZW, which was based upon patented technology and has therefore traditionally been avoided (the LZW patent expired in 2004)

Suggested formats:

• Method 1: DNG, TIFF, PNG
• Method 2: TIFF, PNG or possibly PSD

One way around the question of whether to archive before or after optimisation is to use the ‘layers’ features of Photoshop and save the image as a PSD file. This allows both the original (un-optimised) image and any optimisation to be stored, and so archived, within the same file. PSD is however a proprietary format and its use should therefore be approached with great care.

File Formats for Optimisation and Manipulation

All image optimisation and manipulation is undertaken within image processing software. Whilst carrying out this work, it can be useful to save the image in the proprietary format of the image processing software.

Editing can be a time-consuming process and the proprietary formats offer increased functionality that enables extra information (e.g. layers, masks and channels) to be stored. This enables subsequent editing to resume from where the last session finished without having to recreate any prior work. Unfortunately, using a proprietary file format in this way conflicts with the preservation requirements of our archive images. This is where archiving after optimisation can have an advantage.

On the other hand, if the image is going to require a lot of manipulation or will be made for a specific use then it can be helpful to have access to the original file before any other processing has been undertaken. This is an advantage of archiving before optimisation.

Suggested formats: Image processing proprietary formats such as PSD for Photoshop, PSP for Paint Shop Pro and PNG for Fireworks. However, TIFF is still a good choice if the increased functionality of the proprietary formats is not required. (The TIFF format can save some layer information, but only a few programs such as Photoshop CS can read this information, so a layered TIFF can no longer be considered a truly open format.)

However, once the image manipulation has been finished the file should be saved in a form appropriate to its subsequent use.

Formats for Delivery

Choosing the correct image file format for delivery probably poses the hardest decision with the biggest variety of choice. These are just some of the issues that will need to be considered:

• What is the intended use of the image after delivery?
• How much image resolution is needed to convey the intellectual content to the user?
• On what output device is the image going to be used - monitor, printer, projector?
• What are the capabilities of the output device? What bit depth can it handle? What is the required resolution?
• What bandwidth is available for delivery?
• Is the image for photo-realistic or presentation use?
• How is the image going to be delivered? CD-ROM, tape, WAP, Internet (dialup, broadband, LAN or WAN connection)?
• Is there a requirement to add any watermarking or deal with any other digital rights management issue?
• Do the users require the image to be provided with any colour profile or other colour management information?

With so many considerations, combined with the proliferation of file formats, each designed for a specific use, it is little wonder that this subject continues to confuse and engender debate.

With this in mind, the following are more in the form of ideas for consideration than guidelines.

Formats for Commercial Printing

It is hard to give generic advice in this area. The important thing is to talk to the person doing the printing, as mistakes can be costly and it is the printer who should understand what must be provided for the agreed use. They will hopefully be able to give you specific image preparation guidelines to help you prepare images correctly for their workflow.

Normally the printer will want images in a high quality uncompressed format such as TIFF, or within an encapsulated metafile such as EPS or PDF. In the commercial world Quark files are also popular, as many printers have an established workflow based around QuarkXPress, which provides all layout and sizing, whilst the image is provided as a linked TIFF.

Remember that the printing process uses subtractive colour rather than additive colour (see the TASI Advice Document The Digital Image - http://www.tasi.ac.uk/advice/creating/image.html) and this means the image must be printed from a CMYK file rather than an RGB one. It will therefore be necessary for either you or the printer to convert the image file from RGB to CMYK. This is rarely an easy task and should be undertaken with care by a skilled operator who understands the workings of a CMYK printing workflow. Due to problems with this process, it is becoming more common to provide the printer with an RGB file and ask them to undertake the transformation. When this is done, it is normal to use an RGB colour space that is designed to transform to CMYK easily. There are a few possibilities, but the most common and almost standard is Adobe RGB 1998.
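To show why an RGB value cannot simply be relabelled as CMYK, the Python sketch below applies the textbook arithmetic relationship between additive and subtractive colour. This is an illustrative assumption only: it ignores inks, paper and ICC profiles, and is not the colour-managed transformation that a printer or skilled operator would use. The function name is hypothetical.

```python
def naive_rgb_to_cmyk(r: int, g: int, b: int) -> tuple[float, float, float, float]:
    """Profile-free RGB (0-255) to CMYK (0.0-1.0) arithmetic, for illustration only."""
    rf, gf, bf = r / 255, g / 255, b / 255
    k = 1 - max(rf, gf, bf)            # black is whatever the brightest channel lacks
    if k == 1:                         # pure black: avoid dividing by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1 - rf - k) / (1 - k)
    m = (1 - gf - k) / (1 - k)
    y = (1 - bf - k) / (1 - k)
    return (c, m, y, k)


print(naive_rgb_to_cmyk(255, 0, 0))      # pure red: magenta and yellow ink, no cyan
print(naive_rgb_to_cmyk(255, 255, 255))  # white: no ink at all
```

Real conversions depend on the press, inks and paper, which is why the colour space and workflow should be agreed with the printer.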

Suggested formats: TIFF (RGB), TIFF (CMYK), EPS, PDF

Desktop Printing

It is quite normal to have to undertake a fair amount of testing and adjusting with a desktop printer before it is possible to get the best results out of it. Most of these devices (certainly all those using ink/pigment) print in CMYK; however, they normally undertake the conversion themselves and have been designed to work best with RGB data. The exceptions are ‘continuous tone’ printers such as the dye sublimation and photo-printer types, which print in RGB.

The normal desktop printers (ink-jet and colour photocopier) are designed to work happily with a range of image file formats, including JPEG compressed files. However they will still work best with the maximum amount of image data supplied by an uncompressed image such as a TIFF or PSD. Nonetheless, surprisingly good results can be obtained from JPEG compressed files as long as the quality is set at the highest setting (with a file size larger than 10% of original).

Suggested formats: TIFF (RGB), PSD, JPEG (high quality setting)

Web Delivery

For most digitisation projects, the most common delivery format is simply a monitor with the images viewed through a Web browser interface. This makes the choice of file format easy, as the current selection of Web browsers supports only a small range of image file formats (JPEG, GIF and PNG), although this range can be extended with the use of the appropriate plug-in.

Delivering images through a Web browser has some inherent advantages and unfortunately some challenges. The main advantage is that (in common with all monitor delivery) images naturally look ‘good’ on a monitor where their perceived ‘brightness’ (the light is being transmitted to you, rather than reflected) hides many small deficiencies in quality that would compromise quality if the image was printed. On the other hand, present browsers have only limited image-viewing capabilities and are unable to ‘zoom’ in and out of the images. This means that delivery is limited to images with pixel dimensions that fit within the user’s browser - suggested standards at present are to design Web pages to a size of 800 x 600 pixels giving standard image sizes of approx 512 pixels on the longest edge.
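The pixel dimensions of a delivery image for a given longest edge follow from simple proportion. The Python sketch below uses the 512-pixel figure suggested above as its default; the function name and the choice never to enlarge a smaller image are our assumptions.

```python
def fit_longest_edge(width: int, height: int, longest: int = 512) -> tuple[int, int]:
    """Scale pixel dimensions so the longest edge is at most `longest`, keeping proportions."""
    if width <= 0 or height <= 0 or longest <= 0:
        raise ValueError("dimensions must be positive")
    edge = max(width, height)
    if edge <= longest:                # assumption: never enlarge a smaller image
        return (width, height)
    return (max(1, round(width * longest / edge)),
            max(1, round(height * longest / edge)))


print(fit_longest_edge(3000, 2000))   # (512, 341)
print(fit_longest_edge(2000, 3000))   # (341, 512)
```

The resulting dimensions can then be used as the target size when resampling in any image processing software.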

The largest limitation on the quality of images delivered on the Web, and the main influence on ‘choice’, is the need for them to be compressed to a size that makes their delivery over the limited available bandwidth possible. All the file formats supported by Web browsers provide compression; however, the amount and method of compression vary.

Web browsers currently support the following file formats:

• JPEG (JFIF) - JPEG is not actually a file type, but a type of compression proposed by the Joint Photographic Experts Group. It is used within the JFIF file format, which uses the file extension .jpg and which we colloquially call ‘JPEG’. It is a lossy compression and will provide the best quality and lowest file size for continuous tone images. The amount of compression is chosen at the time of saving the file and allows quality to be traded against file size: as a rule of thumb, a file compressed with JPEG to 10% of its original size is normally considered visually acceptable, with no obvious compression artefacts. However, it is common, if required, to compress right down to 2-4% if the lower quality is acceptable.

• GIF - The Graphic Interchange Format is an 8-bit (and under) indexed file type offering only 256 (or fewer) different colours (these can be either a standard selection or an image-dependent selection, by user choice). It was designed in the early days of the Internet by CompuServe and works best with simple images using block colours, such as graphics, logos and banners. GIF uses lossless LZW compression; the amount of compression will depend totally on the type of image being saved. A full colour continuous tone image is unlikely to compress to less than 30% of its original size; however, a solid colour vector image should compress far more. The GIF file format supports layers, allowing it to offer both transparency and animation.

• PNG - The Portable Network Graphic (colloquially called ‘PING’) file is an open ‘standard’ that was introduced to overcome the possible patent problems associated with the GIF format (the LZW patent expired in 2004). It is normally used in either an 8-bit indexed version or a 24-bit full colour version, although there is also an infrequently used 48-bit version. This makes it a very versatile format, offering either lossless compression in full colour (as an archive format) or a GIF replacement in 8-bit form. However, it cannot compete with JPEG in terms of producing high quality, small, full colour images for viewing on the Web. The compression available from PNG in 24-bit mode is typical for a lossless compression, providing a file of about 60-75% of the original size; in 8-bit mode it is much the same as GIF. PNG supports transparency (even variable opacity, although browsers do not!) but is not able to provide animation.
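The percentages quoted for each format compare the compressed file with the uncompressed pixel data. The Python sketch below shows the arithmetic; the function names and the example file size are hypothetical, and file headers and row padding are ignored.

```python
def uncompressed_bytes(width: int, height: int, bits_per_pixel: int = 24) -> int:
    """Approximate size of the raw pixel data, ignoring headers and row padding."""
    return (width * height * bits_per_pixel + 7) // 8


def percent_of_original(file_bytes: int, width: int, height: int,
                        bits_per_pixel: int = 24) -> float:
    """Compressed file size as a percentage of the uncompressed pixel data."""
    return 100 * file_bytes / uncompressed_bytes(width, height, bits_per_pixel)


# Hypothetical example: a 512 x 384 pixel, 24-bit image saved as a 59,000-byte JPEG.
print(uncompressed_bytes(512, 384))                    # 589824
print(round(percent_of_original(59_000, 512, 384)))    # 10
```

On these figures, the 10% rule of thumb for JPEG corresponds to a file of roughly 59 KB, and the 60-75% typical of 24-bit PNG to roughly 350-440 KB.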

The JPEG 2000 (j2k or jp2) format was developed to replace the popular JPEG format; it makes use of wavelet compression, which can be either lossless or lossy. While it doesn’t offer any significant increase in compression ratios over normal JPEG, there is less of the blockiness and artefacting associated with standard JPEG compression. While JPEG 2000 is not as widely supported as was first hoped, it is slowly gaining in popularity; however, it looks unlikely that it will replace JPEG in the near future. Further details of this can be found in the TASI Advice Document New Digital Image File Formats - http://www.tasi.ac.uk/advice/creating/newfile.html.

Suggested formats and relevant uses: JPEG, PNG and GIF

It is quite legitimate to use any of these file formats for Web delivery; however, they do have particular strengths and weaknesses that should be considered in your choice. The table below sets out some of the more common needs, the best choice and the reason for making your choice:

| Need or Use | Recommended File Type | Reason |
| --- | --- | --- |
| Normal continuous-tone full colour image at the highest quality | JPEG or PNG | PNG will allow you to deliver an image at the highest quality using lossless compression. However, file size will be very large (approx 60% of original). JPEG at its best quality setting should be visually identical but provide a larger compression (approx 10-25% of original). |
| Normal continuous-tone full colour image at highest compression | JPEG | JPEG will allow compression of the image down to approx 2-4% of the original size. At this compression, quality is likely to suffer, but in some cases this can be acceptable. |
| A banner or logo with 8-bit or less colour | PNG or GIF | Both PNG and GIF offer the best compression for file size. PNG is ‘patent’ free, but might have problems with browsers prior to v4. |
| Continuous-tone greyscale image | JPEG, PNG or GIF | As greyscale is only 8-bit anyway, all of the formats should provide comparable quality; however, JPEG is likely to provide the highest compression (with a corresponding drop in quality). |
| Black and White bi-tonal images | PNG or GIF | In this case, GIF or PNG should provide equal quality. JPEG is not recommended as it will give a file size larger than PNG/GIF, due to it being unable to store less than 8-bit greyscale. |
| Image or logo with transparent layers | PNG or GIF | Both PNG and GIF support transparency. PNG is non-patented. PNG also offers multi-layers and variable transparency; however, at present this is not supported by any of the current browsers. |
| A full colour image with lossless compression | PNG | As stated above, only PNG allows you to deliver a losslessly compressed image. |
| Animated image | GIF | At present only GIF can support animation. |
| A zoomable or streamable image | JPEG, JP2, VFZ | This will largely depend upon server software; however, it is hoped that browsers will be able to provide this with new file types such as JPEG 2000 or VFZ. |
| A file with reliable image metadata tagging | JPEG, PNG, JP2 | At present this is not supported by the current Web browsers; however, JPEG and PNG both do support IPTC data. JPEG 2000 also has an XML-based inbuilt metadata system, which should hopefully be readable by future Web browsers. |
| A file with integral rights management | VFZ, JP2 | So far all these systems will need some server-side software and plug-ins within the user’s browser; however, again it is hoped that JPEG 2000 and next generation browsers will be able to provide this functionality. |
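For projects that automate the preparation of delivery images, the first rows of the table above could be encoded as a simple lookup. The Python sketch below is one possible arrangement; the key names and the function are hypothetical, and only the format recommendations themselves come from the table.

```python
# Recommended Web formats keyed by need (key names are illustrative only).
WEB_FORMAT_CHOICES = {
    "photo_highest_quality": ("JPEG", "PNG"),
    "photo_highest_compression": ("JPEG",),
    "banner_or_logo_8bit": ("PNG", "GIF"),
    "greyscale_continuous_tone": ("JPEG", "PNG", "GIF"),
    "bitonal": ("PNG", "GIF"),
    "transparency": ("PNG", "GIF"),
    "lossless_full_colour": ("PNG",),
    "animation": ("GIF",),
}


def recommend(need: str) -> tuple[str, ...]:
    """Return the formats the table recommends for a given need."""
    try:
        return WEB_FORMAT_CHOICES[need]
    except KeyError:
        raise ValueError(f"no recommendation recorded for {need!r}") from None


print(recommend("animation"))              # ('GIF',)
print(recommend("photo_highest_quality"))  # ('JPEG', 'PNG')
```

The last three rows of the table are left out because they depend on server software and browser plug-ins rather than on the file format alone.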

File Formats for PowerPoint or other Multimedia Programs

As long as the intended delivery format is still using a monitor, all the file formats recommended for use within a Web browser will still be good choices. However, if MS PowerPoint is being used to create posters or some other printed media, it might well be better to consider some of the image file formats suggested in the section for ‘Commercial Printing’ or ‘Desktop Printing’.

The main influence on choice will be the available bandwidth for the delivery of this material. If there are bandwidth restrictions then it will make sense to use some of the file formats suggested for Web delivery, however if the presentation is to be delivered locally then there is no reason to not use images of a correspondingly higher quality.

Suggested formats for monitor delivery: JPEG, PNG and GIF (at a compression rate to suit delivery bandwidth and PC performance)

Suggested formats for print delivery: JPEG – High Quality, TIFF, PNG and GIF

This document was downloaded from the TASI Web site at http://www.tasi.ac.uk/ Further help and advice available for the FE and HE sector from [email protected]
