Compression of the Image: General Schemes for Application of Compression

Compression of the image
Adolf Knoll
National Library of the Czech Republic

General schemes for application of compression
The schemes adapt to the character of the represented objects:
- Bitonal image (1-bit, black-and-white)
- Colour photorealistic image
- Mixed document (both of the above components)

Trends
- Bitonal
  - from CCITT Fax Group 3 and 4 to JBIG variants
- Photorealistic
  - Lossless compression: PNG, TIFF/LZW
  - Lossy: from JPEG DCT to wavelet
- Mixed document
  - Both applied (Mixed Raster Content, usually vertically)

How is it built into formats?
- There are attempts to have it all in ISO TIFF (even JPEG, LZW, or PNG), but this is not enough because tools for conversion and display are lacking.
- That is why other, more suitable formats are used: JPEG, PNG
- That is why there is a lot of development in the area of mixed formats; they do not aim to become ISO standards

Relevant directions
- Bitonal image
  - JBIG2 (ISO): no support (except Xerox), but many similar activities
- Photorealistic image
  - wavelet JPEG2000 and many other non-ISO initiatives (WI, LWF, IW44, SID, Imagepower IW, ...)
- Mixed content
  - DjVu, LDF, Imagepower MRC

Aims
- Image archiving
  - standardized archival format (TIFF, JPEG, PNG, ...)
- Image delivery
  - more efficient modern format (JB2, MrSID, DjVu, LDF, ...)
What will the relationship between the two be? It will be defined by the goal of the project.

Around compression
- Pre-processing of the image
- Compression
- Encoding in a format
- Decoding from the format
- Decompression
- Display, print-out

Pre-processing of the bitonal image - I
- Efficient schemes build on the possibility of applying dictionaries of pixel chunks/groups:
  - E.g. a text is an image that can be interpreted as several dozen images of letters; each repeated occurrence of a letter can be represented by its coordinates (x, y) and a reference to a dictionary that holds only one representation of similar letters (digitized only once as a bitmap)
  - This method is called PATTERN MATCHING, but...

Pre-processing of the bitonal image - II
- However, scanned texts carry a lot of noise in the individual pixel chunks that represent, for instance, letters in text
- Therefore, it is convenient to reduce the differences between chunks that should be identified as identical:
  - smoothing
  - pixel flipping
  - noise removal

Smoothing and pixel flipping

Problems in pattern matching
Sample text: "Česká republika"
Low-quality original and/or scan + inappropriate processing

Soft pattern matching
- Better work with dictionaries; a chunk is replaced only where the threshold value for the pixel chunk is satisfied
- If not, the whole small bitmap is stored
- Tuning these mechanisms is the key to successful application of lossy compression to a bitonal image.

How to know...
- Libraries have documents of various quality, including very bad ones
- These documents are more difficult to process than the good samples presented by software producers
- Tests... tests... tests... on typical materials

Bitonal compression
- Lossless: LZW, PNG, ..., CCITT Fax Group 3 and 4, JB2, JBIG, JBIG2, Algo Vision/Luratech (1-bit LDF component)
- Lossy modern schemes:
  - AT&T (Lizardtech) (JB2): soft pattern matching
  - ImagePower Inc. JBIG2 (JB2): only pattern matching
  - Summus Inc. (Lightning Strike), ...
GIF would be slightly worse than PNG

Květy české, a 19th century Czech journal

Impact of the quality of digitized originals on performance of compression schemes

JB2
- The most efficient compression scheme: JB2 from the DjVu format (AT&T).
- It enables compression:
  - lossless
  - lossy
  - aggressive, while preserving high quality

JB2 as a component part of the DjVu format
- Several files can be merged and saved into one (as in PDF); they share a common dictionary, so together their size is smaller than the sum of the individual files
- Several files can be virtually joined (they are called one after another from the server)
- More advantages: display, references, OCR, ... (DjVu plug-in)
- Software is either expensive, or free for Linux or Solaris

Samples and résumé
- Monitor and test new approaches to image processing
- They can be very suitable for document delivery services
  - Image servers
  - Scanned content

Which formats to use for bitonal image?
- If you have no special tools:
  - GIF
- If you wish smaller files, use PNG
- Both are recommended for WWW
- However, TIFF/CCITT Fax Gr. 4 is better
- Use DjVu if you wish very small files

Problems
- Good image editing software does not support TIFF with Gr. 4 encoding
- Display is possible with normal Windows tools
- GIF and PNG also support higher bit depths (8-bit / 24-bit); take care not to save a bi-level image at a higher bit depth
- DjVu: the authoring software problem needs to be solved

Lossy compression: bitonal image

Compression of colour images
Lossless
- LZW
  - GIF (8-bit only)
  - TIFF (5.0)
- PNG
- Wavelet
  - JPEG2000 (JP2)
- ...
Lossy
- DCT (JPEG)
- Fractals
- Wavelet
  - IW44
  - LWF, WI
  - JPEG2000 (JP2)
  - MrSID, ...
Classical (LZW, RLE, DCT) versus wavelet approaches.

True colour image: DCT versus wavelet

Testing compression efficiency
- Sample
  - Reference
  - Full-colour (JPEG, wavelet)
  - 1-bit (establish thresholds: Paint Shop Pro, LuraWave)
  - MRC (same sample: DjVu Solo)

Compression efficiency: bitonal image

Compression efficiency: true colour

How to apply compression?
It depends on the character of objects in the image:
- Photorealistic image (JPEG, wavelet)
- Text and simple black-and-white graphics (Fax Group 4, JB2, ...)
- Colour graphics (problematic to compress lossily; lossless PNG or GIF is better; an application area of vector graphics, SVG)
- Mixed content (composed solutions: DjVu, LDF, ...)

The most efficient solution
Segment the image into two or more groups of objects:
1. Objects good for bitonal conversion
2. Objects good for true colour representation
Compress each group separately and then merge them into one format.

Horizontal segmentation/zoning
Horizontally
- Text
- Graphics
- Photographs
Imagepower Inc.

Vertical segmentation/zoning
Vertically
- Foreground
- Background
Lizardtech Inc. (AT&T), Luratech GmbH
DjVu, LDF

Comparison of DjVu and LDF
DjVu: 6 layers
- Foreground:
  - JB2
  - IW44
- Background:
  - 4 layers IW44
LDF: 3 layers
- Foreground:
  - LDF 1-bit Comp.
  - LWF
- Background:
  - 1 layer LWF, JP2

Bitonal versus composed image
Larger share of graphics
- The composed compressed image approaches the bitonal one in size
Text
- The composed compressed image is larger

Grey level

Other DjVu properties
More images in one:
- as with TIFF, PDF, LDF, ..., with use of the common dictionary of pixel chunks
- Virtually: pages remain on the server and only the page that is called is delivered

Multiresolution image MrSID
- One file holds several (up to 8) images in various resolutions
- Sample
- Efficient with an image server

SAMPLES
Samples of various compression solutions
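The pattern matching and soft pattern matching slides can be illustrated with a minimal Python sketch. It is not the JB2 or JBIG2 algorithm: the names `hamming` and `encode`, the tuple-based glyph bitmaps, and the use of a plain pixel-difference count as the threshold test are all assumptions made for illustration. A chunk that is close enough to a dictionary entry is stored only as coordinates plus a dictionary reference; otherwise its whole bitmap is stored as a new entry.

```python
from typing import List, Tuple

# A glyph is a small non-empty 1-bit bitmap: rows of 0/1 pixels (hypothetical representation).
Glyph = Tuple[Tuple[int, ...], ...]
Chunk = Tuple[Tuple[int, int], Glyph]  # ((x, y), bitmap)


def hamming(a: Glyph, b: Glyph) -> int:
    """Count differing pixels between two bitmaps of equal shape."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(chunks: List[Chunk], threshold: int):
    """Return (dictionary, placements) for a list of positioned pixel chunks.

    A chunk reuses a dictionary entry when at most `threshold` pixels differ;
    otherwise the whole small bitmap is stored as a new entry.
    """
    dictionary: List[Glyph] = []
    placements: List[Tuple[int, int, int]] = []  # (x, y, dictionary index)
    for (x, y), glyph in chunks:
        match = None
        for idx, entry in enumerate(dictionary):
            same_shape = (len(entry) == len(glyph)
                          and len(entry[0]) == len(glyph[0]))
            if same_shape and hamming(entry, glyph) <= threshold:
                match = idx
                break
        if match is None:
            dictionary.append(glyph)
            match = len(dictionary) - 1
        placements.append((x, y, match))
    return dictionary, placements


# Toy page: the same letter three times (once with one noisy pixel) and another letter.
A = ((1, 1, 1), (1, 0, 1), (1, 1, 1))
A_NOISY = ((1, 1, 1), (1, 0, 1), (1, 1, 0))
B = ((1, 0, 0), (1, 0, 0), (1, 1, 1))
page = [((0, 0), A), ((10, 0), A_NOISY), ((20, 0), B), ((30, 0), A)]

dictionary, placements = encode(page, threshold=1)   # lossy: noisy copy reuses A
strict_dict, strict_places = encode(page, threshold=0)  # exact matches only
```

With `threshold=1` the noisy copy is replaced by the clean dictionary entry, which is where the loss (and the saving) comes from; with `threshold=0` only exact repeats share an entry. Choosing that threshold is the tuning step the slides describe as the key to lossy bitonal compression.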
