Technical Advisory Service for Images Advice Paper Choosing a File Format
Introduction

Over the years, a number of image file formats have been proposed and used. The choice grows every year as new formats are introduced, and it is not always immediately clear which format is best in any particular case. The choice depends on a number of factors, which vary according to how you intend to use the file. Each stage of the process, from capture through to delivery, has its own requirements that may affect this choice.

This report looks briefly at some of these factors and provides guidelines for making the best choice from what is available. For a full introduction to the file formats themselves, see the TASI Advice Document File Formats and Compression - http://www.tasi.ac.uk/advice/creating/fformat.html.

Choose a non-proprietary open ‘standard’

Despite the large range of available file formats, choosing one should not be too hard, as only a very few of them are normally recommended for digitisation projects. Any digitisation project will need to consider the long-term usefulness and accessibility of its images, and this means choosing a file format that is both an established industry ‘standard’ and non-proprietary.
This limits the range to a much more manageable number, of which the four most common are:

• Tagged Image File Format (TIFF)
• Portable Network Graphics (PNG)
• Joint Photographic Experts Group (JPEG) File Interchange Format (JFIF)
• Graphics Interchange Format (GIF)

Copyright © TASI www.tasi.ac.uk advice – training – resources Last reviewed May 2006

There can be good reasons why a project might wish or need to use another file format at some stage, including proprietary formats such as:

• Adobe Portable Document Format (PDF)
• Adobe Photoshop Image File (PSD)
• The camera’s native RAW file

However, this is likely to come about because of some specific need of a particular project and cannot be covered here. For details of these file types and many others, please see the TASI Advice Document File Formats and Compression - http://www.tasi.ac.uk/advice/creating/fformat.html.

File Formats for Capture

Capture is the first step in the digitisation process. When capturing images, it is important that they are all created at the highest possible quality and at a size appropriate for all subsequent uses. Errors at this point will compromise the quality of the whole project, and the only recovery option will be to go back and re-capture the original.

All digital capture devices originally capture values of Red, Green and Blue. The number of different describable colours (or tones of grey) depends upon the ‘bit-depth’ of the device. Any modern device will be able to capture in at least 24-bit colour (or 8-bit B&W) (see the TASI Advice Document The Digital Image - http://www.tasi.ac.uk/advice/creating/image.html), and some can capture at higher bit depths, right up to 48-bit. 24-bit colour means 8 bits for each of the three channels, giving 2^24 (about 16.7 million) describable colours; 8-bit B&W gives 256 tones of grey.

Some of the more advanced cameras offer their own unprocessed RAW formats. These files contain all of the original data as captured by the sensor without alteration.
These RAW images are then processed on the computer, where fine adjustments can be made to the white balance, exposure and sharpness before saving in a non-proprietary format. RAW files usually contain higher bit depths than the equivalent JPEGs and TIFFs produced by the camera.

Once the capture device has created the image, it must be saved for later use.

Format Requirements

A file format should be chosen that:

• Retains all information that was created by the capture device. This means using a file format that can store the image in at least the same colour depth as it was created. 24-bit for colour and 8-bit for B&W should be considered the minimum, although files captured with a larger bit-depth should be archived with this information
• Retains any capture device colour management information (ICC profile)
• Uses (or can be set up to use) no compression

The suggested format here is: TIFF or the proprietary format of the capture device.

Although it is normally advisable to avoid all proprietary file formats, there can be an argument for the temporary use of a proprietary format within the scanning software if it offers some additional functionality. However, the file would still need to be converted to an open standard format before being archived.

When we mention or specify TIFF, it is important to realise that the TIFF file format comes in a range of types supporting different functionality, such as multiple pages and a choice of compression schemes, including JPEG. So when we specify TIFF for archival purposes we always mean an uncompressed Baseline TIFF v6 with Intel byte order (PC option).

File Formats for Master Archive

There are two possible methodologies for creating a Master Archive and both have advantages, depending on the project.

Method 1 – Archive all data exactly as created by the capture device.
The Master Archive contains a copy of each image in a form as close as possible to the original captured data. This enables the project to go back to the archive knowing that it holds an exact copy of everything that was originally created by the capture device. It should be realised that, because these images are archived before optimisation, they might not look as good as those archived using Method 2. They will be in a totally original form, but not necessarily of the highest visual quality.

With this approach, it is important to use a colour space that in no way compromises the colour gamut of the original data. This will often mean leaving the image within the capture device’s own colour space, but could mean using a larger or unbounded colour space such as CIE Lab. See the TASI Advice Document on Colour Management – http://www.tasi.ac.uk/advice/creating/colour.html.

Method 2 – Archive an optimised version of the image file.

The Master Archive contains a copy of each image after it has been prepared and optimised for use at its highest quality (see Basic Guidelines for Image Capture and Optimisation - http://www.tasi.ac.uk/advice/creating/img_capt.html). This has the advantage of archiving the image in a ready-to-use state. The optimisation need only be done once, and all images can be handled in a consistent way.

However, it is inevitable that some data will have been lost in the process, and if the optimisation is in any way inappropriate or badly undertaken then the project will be unable to go back to the original data and work from there.

For this approach it would make sense to save the image in a colour space appropriate for the intended future use of the image.
(Adobe RGB 1998 would be advised for print/Web, but sRGB could be used if the only delivery medium is going to be the Web.)

Format Requirements

The requirements of a file format for archiving are the same as for creation, except that it should also:

• Be an open standard file format. Proprietary formats should not be used, as there is uncertainty about the ability to open the file in the future. A possible exception to this might be the Adobe Photoshop format (see below)
• Preferably not use any compression, although lossless compression may be acceptable. Be aware that one of the most common lossless compressions is LZW, which was for many years covered by patents and has traditionally been avoided for that reason. The relevant patents have since expired, but an uncompressed file remains the simplest choice for an archive

Suggested formats:

• Method 1: DNG, TIFF, PNG
• Method 2: TIFF, PNG or possibly PSD

One way around the question of whether to archive before or after optimisation is to use the ‘layers’ feature of Photoshop and save the image as a PSD file. This proprietary file format allows both the original (un-optimised) image and any optimisation to be stored, so both states can be archived within the same file. The PSD file is, however, a proprietary format and its use should therefore be approached with great care.

File Formats for Optimisation and Manipulation

All image optimisation and manipulation is undertaken within image processing software. Whilst carrying out this work, it can be useful to save the image in the proprietary format of the image processing software. Editing can be a time-consuming process, and the proprietary formats offer increased functionality that enables extra information (e.g. layers, masks and channels) to be stored. This allows subsequent editing to resume from where the last session finished without having to recreate any prior work.

Unfortunately, using a proprietary file format in this way conflicts with the preservation requirements of our archive images.
This is where archiving after optimisation can have an advantage. On the other hand, if the image is going to require a lot of manipulation or will be made for a specific use then it can be helpful to have access to the original file before any other processing has been undertaken.
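
As a practical footnote to the archival requirement stated earlier (an uncompressed TIFF with Intel byte order, stored at no less than its captured bit depth), the following is a minimal sketch of how a file might be checked against those points. It is an illustration only, not a TASI tool. The function names (inspect_tiff, demo_tiff) and the archival_ok flag are hypothetical names introduced here. The tag numbers and defaults are taken from the TIFF 6.0 specification. The sketch reads only the first image directory and does not test full Baseline conformance.

```python
import struct

COMPRESSION_TAG = 259      # TIFF 6.0 tag: Compression (1 = none)
BITS_PER_SAMPLE_TAG = 258  # TIFF 6.0 tag: BitsPerSample
SHORT = 3                  # TIFF field type: 16-bit unsigned integer


def read_shorts(data, endian, count, value_field_pos):
    """Return `count` SHORT values belonging to one directory entry.

    Up to two SHORTs fit inside the entry's 4-byte value field;
    otherwise that field holds an offset to where the values are stored.
    """
    if count <= 2:
        pos = value_field_pos
    else:
        (pos,) = struct.unpack_from(endian + "I", data, value_field_pos)
    return list(struct.unpack_from(endian + "%dH" % count, data, pos))


def inspect_tiff(data):
    """Summarise byte order, compression and bit depth of the first image."""
    if data[:2] == b"II":
        endian = "<"   # Intel (little-endian)
    elif data[:2] == b"MM":
        endian = ">"   # Motorola (big-endian)
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF file")

    info = {
        "byte_order": "Intel" if endian == "<" else "Motorola",
        "compression": 1,        # specification default if tag is absent
        "bits_per_sample": [1],  # specification default if tag is absent
    }
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i   # each directory entry is 12 bytes
        tag, ftype, count = struct.unpack_from(endian + "HHI", data, pos)
        if ftype != SHORT:
            continue
        if tag == COMPRESSION_TAG:
            info["compression"] = read_shorts(data, endian, 1, pos + 8)[0]
        elif tag == BITS_PER_SAMPLE_TAG:
            info["bits_per_sample"] = read_shorts(data, endian, count, pos + 8)

    # Total bit depth, e.g. 8 + 8 + 8 = 24, and the colours it can describe.
    info["bit_depth"] = sum(info["bits_per_sample"])
    info["colours"] = 2 ** info["bit_depth"]
    # Hypothetical flag: checks only byte order and compression.
    info["archival_ok"] = (info["byte_order"] == "Intel"
                           and info["compression"] == 1)
    return info


def demo_tiff():
    """Build a tiny in-memory TIFF header (no pixel data) for illustration."""
    header = struct.pack("<2sHI", b"II", 42, 8)       # first directory at byte 8
    ifd = struct.pack("<H", 2)                         # two directory entries
    ifd += struct.pack("<HHII", BITS_PER_SAMPLE_TAG, SHORT, 3, 38)
    ifd += struct.pack("<HHIHH", COMPRESSION_TAG, SHORT, 1, 1, 0)
    ifd += struct.pack("<I", 0)                        # no further directories
    values = struct.pack("<3H", 8, 8, 8)               # stored at byte 38
    return header + ifd + values


if __name__ == "__main__":
    print(inspect_tiff(demo_tiff()))
```

To check a real file, its bytes could be read with open(path, "rb").read() and passed to inspect_tiff. A result with a compression value other than 1, or a byte order of Motorola, would indicate that the file does not match the archival TIFF described in this paper.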