
Image Compression

Compression

• Because of data sizes and perceptual issues, compression is typically applied to media data
• Compression may be lossless or lossy
• Lossless compression:
  ◦ All data bits can be recovered exactly from the compressed version
  ◦ The decompressed file has identical bits, in identical order, to the original file before any compression
  ◦ Example: zip files


• Lossy compression: some data bits cannot be recovered after compression
  ◦ The decompressed file has lost some bits or bytes compared to the original file before any compression
• Goal: discard data that doesn’t typically affect perception
  ◦ Human perception of the rendered decompressed data should be similar to perception of the rendered data before compression
• It is best to apply lossy compression only once, at the end of media production, and if possible to work with uncompressed data throughout the production process
  ◦ i.e., avoid compressing to a lossy format, editing that format, then compressing again to a lossy format, etc.

Raster File Formats

Extension | Name | Notes
.jpg | Joint Photographic Experts Group | Lossy compression format, well suited for photographic images
.png | Portable Network Graphics | Lossless compression format, supporting 16-bit sample depth and an alpha channel
.gif | Graphics Interchange Format | 8-bit indexed format, superseded by PNG on all accounts but animation
.tif | Tagged Image File Format | Lossless compression format (Lempel-Ziv-Welch, LZW), good for high-res images
.exr | EXR | HDR (high dynamic range) format, used by the movie industry
.raw | Raw image file | Direct memory dump from a digital camera; contains the direct imprint from the imaging sensor, without processing with whitepoint and gamma corrections. Different cameras use different extensions, many of them derivatives of TIFF; examples are .nef, .raf and .crw
.dng | Digital Negative | A subset/clarification of TIFF, created by Adobe to provide a standard for storing RAW files, as well as exchanging RAW image data between applications
.psd | Photoshop Document | Native format of Adobe Photoshop; allows layers and other structural elements
.xcf | GIMP Project File | GIMP's native image format

Image Sampling

• Earlier we saw what sampling was in the context of audio

Sampling

• In analog-to-digital conversion of an image, what is a sample?
  ◦ Pixel: a small, often square, dot of color or grayscale; pixels merge optically when viewed at a suitable distance to produce the impression of continuous tones


• Resolution:
  ◦ Dimensions of an image; also: the number of pixels that a device can display (render) per unit of length
• Examples
  ◦ My laptop display is about 13” (wide) by 8 ¼” (high), with a maximum resolution of 1440x900 pixels, giving a resolution of about 108 pixels/inch
• Referred to as 108 dpi (dots per inch) or ppi (pixels per inch)

iPhone and iPad resolution

Device | Native Resolution (Pixels) | UIKit Size (Points)
iPhone X | 1125 x 2436 | 375 x 812
iPhone 8 Plus | 1080 x 1920 | 414 x 736
iPhone 8 | 750 x 1334 | 375 x 667
iPhone 7 Plus | 1080 x 1920 | 414 x 736
iPhone 6s Plus | 1080 x 1920 | 414 x 736
iPhone 6 Plus | 1080 x 1920 | 414 x 736
iPhone 7 | 750 x 1334 | 375 x 667
iPhone 6s | 750 x 1334 | 375 x 667
iPhone 6 | 750 x 1334 | 375 x 667
iPhone SE | 640 x 1136 | 320 x 568
iPad Pro 12.9-inch (2nd generation) | 2048 x 2732 | 1024 x 1366
iPad Pro 10.5-inch | 1668 x 2224 | 834 x 1112
iPad Pro (12.9-inch) | 2048 x 2732 | 1024 x 1366
iPad Pro (9.7-inch) | 1536 x 2048 | 768 x 1024
iPad Air 2 | 1536 x 2048 | 768 x 1024
iPad Mini 4 | 1536 x 2048 | 768 x 1024

Typical Resolution

Dimensions | Megapixels | Name | Comment
640x480 | 0.3 | VGA | VGA
720x576 | 0.4 | CCIR 601 DV PAL | Dimensions used for PAL DV and PAL DVDs
768x576 | 0.4 | CCIR 601 PAL full | PAL with square sampling grid ratio
800x600 | 0.4 | SVGA |
1024x768 | 0.8 | XGA | As of 2004, the most common computer screen dimensions
1280x960 | 1.2 | |
1600x1200 | 2.1 | UXGA |
1920x1080 | 2.1 | 1080i HDTV | Interlaced, high-resolution digital TV format
2048x1536 | 3.1 | 2K | Typically used for digital effects in feature films
3008x1960 | 5.3 | |
3088x2056 | 6.3 | |
4064x2704 | 11.1 | |

Sampling

When measuring the value for a pixel, one takes the average color of an area around the location of the pixel.

A simplistic model is to sample a square; this is called a box filter. A more physically accurate measurement is to calculate a weighted Gaussian average.
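As a rough illustration of the two sampling models above, the sketch below averages a small grayscale neighborhood once with a box filter and once with a Gaussian-weighted filter. The function names, the nested-list image representation, and the `radius` and `sigma` parameters are assumptions made for this example; they do not come from the slides.

```python
import math

def _neighborhood(image, cx, cy, radius):
    """Yield (x, y, value) for in-bounds pixels in the square around (cx, cy)."""
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                yield x, y, image[y][x]

def box_sample(image, cx, cy, radius=1):
    """Box filter: every pixel in the square counts equally."""
    values = [v for _, _, v in _neighborhood(image, cx, cy, radius)]
    return sum(values) / len(values)

def gaussian_sample(image, cx, cy, radius=1, sigma=1.0):
    """Gaussian filter: pixels nearer the centre get larger weights."""
    total = 0.0
    weight_sum = 0.0
    for x, y, v in _neighborhood(image, cx, cy, radius):
        w = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        total += w * v
        weight_sum += w
    return total / weight_sum

# A 3x3 patch with one bright pixel in the middle (hypothetical data).
patch = [[0, 0, 0],
         [0, 90, 0],
         [0, 0, 0]]
box_value = box_sample(patch, 1, 1)
gauss_value = gaussian_sample(patch, 1, 1)
```

With this patch the box filter spreads the bright pixel evenly over nine samples, while the Gaussian filter keeps more of the centre value because the centre weight is the largest.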

When perceiving a bitmap image, the human eye blends the pixel values together, recreating an illusion of the continuous image it represents.

Figure: Sampling grid

Sampling Depth (Quantization)

• 8-bit
  ◦ A common sample format. 8-bit integers can represent only 256 discrete values (2^8 = 256), so brightness levels are quantized into these levels.
• 12-bit
  ◦ For high dynamic range images (images with detail in both shadows and highlights), the 256 discrete values of 8 bits do not provide enough precision to store an accurate image. Some digital cameras operate with more than 8-bit samples internally; higher-end cameras (mostly SLRs) also provide RAW images that are often 12-bit (2^12 = 4096).
• 16-bit
  ◦ The PNG and TIFF image formats support 16-bit samples. Many image processing and manipulation programs perform their operations in 16-bit when working on 8-bit images, to avoid quality loss in processing.

Image Compression

• Bitmap images take up a lot of memory; image compression reduces the amount of memory needed to store an image. For instance, a 2.1 megapixel, 8-bit RGB image (1600x1200) occupies 1600 x 1200 x 3 bytes = 5,760,000 bytes ≈ 5.5 megabytes; this is the uncompressed size of the image.
• Compression ratio is the ratio of the uncompressed image size to the compressed image size. If the example image above were stored as a 512 KB file, the compression ratio would be 5.5 MB : 0.5 MB = 11:1.

Format | Typical Compression Ratios | Description
GIF | 4:1 - 10:1 (lossless) | Lossless for images with <= 256 colors. Works best for flat-color, sharp-edged art. Horizontally oriented bands of color compress better than vertically oriented bands.
TIFF | 2:1 (lossless) | Lossless for high-res images. Uses LZW compression. Not a great choice for the web due to large file sizes.
JPEG (High) | 10:1 - 20:1 (lossy) | High quality: little or no loss in image quality with continuous-tone originals. Worse results for flat-color and sharp-edged art.
JPEG (Medium) | 30:1 - 50:1 (lossy) | Moderate quality: usually the best choice for the Web.
JPEG (Low) | 60:1 - 100:1 (lossy) | Poor quality: suitable for thumbnails and previews. Visible blockiness (pixelation).
PNG | 10-30% smaller than GIFs (lossless) | PNGs behave similarly to GIFs, only better; they work best with flat-color, sharp-edged art. PNGs compress both horizontally and vertically, so solid blocks of color generally compress best.

Lossless Image Compression

When an image is losslessly compressed, repetition and predictability are used to represent all the information using less memory.

The original image can be restored.

One of the simplest lossless image compression methods is run-length encoding. Run-length encoding encodes a run of consecutive identical values as one token in a data stream.

Digital Coding

How would you encode this image?

Run length encoding

RLE (monochrome)

20, 3, 1, 1, 11, 1, 1, 4, 10, 1, 1, 2, 1, 2, 9, 1, 1, 1, 3, 2, 8, 2, 5, 2, 6, 2, 7, 2, 4, 2, 9, 2, 2, 3, 9, 3, 3, 1, 1, 3, 1, 3, 1, 1, 2, 4, 1, 1, 1, 1, 1, 1, 1, 1, 7, 1, 1, 1, 1, 1, 3, 1, 7, 1, 1, 1, 1, 5, 7, 1, 1, 1, 1, 5, 35

There are 75 runs of length 1 or more. These could be coded as integers.

File size = 75 x 4 bytes = 300 bytes
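A minimal sketch of monochrome run-length encoding follows. It assumes the image has been flattened into a list of 0/1 pixels and that the color of the first pixel is stored next to the run lengths; the slides do not say how the starting color is recorded, so that detail and the function names are assumptions.

```python
def rle_encode(pixels):
    """Return (first_value, run_lengths) for a flat list of 0/1 pixels."""
    if not pixels:
        return None, []
    runs = []
    current = pixels[0]
    length = 0
    for p in pixels:
        if p == current:
            length += 1
        else:
            runs.append(length)   # close the finished run
            current = p
            length = 1
    runs.append(length)           # close the final run
    return pixels[0], runs

def rle_decode(first_value, runs):
    """Rebuild the pixel list; colors alternate from run to run."""
    pixels = []
    value = first_value
    for length in runs:
        pixels.extend([value] * length)
        value = 1 - value         # monochrome: flip between 0 and 1
    return pixels

# The first five runs of the slide's list: 20 white, 3 black, 1, 1, 11.
row = [0] * 20 + [1] * 3 + [0] + [1] + [0] * 11
first, runs = rle_encode(row)
```

Decoding with `rle_decode(first, runs)` gives back the original pixel list, which is what makes the scheme lossless.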

RLE (by row)

There are now 78 values:

16, 4, 3, 1, 1, 11, 1, 1, 4, 10, 1, 1, 2, 1, 2, 9, 1, 1, 1, 3, 2, 8, 2, 5, 2, 6, 2, 7, 2, 4, 2, 9, 2, 2, 3, 9, 3, 3, 1, 1, 3, 1, 3, 1, 1, 2, 4, 1, 1, 1, 1, 1, 1, 1, 1, 7, 1, 1, 1, 1, 1, 3, 1, 7, 1, 1, 1, 1, 5, 7, 1, 1, 1, 1, 5, 3, 16, 16

There are 78 runs of length 1 or more. These could be coded as integers.

File size = 78 x 4 bytes = 312 bytes

Advantages of using smaller runs

• Storing a standard integer takes 4 bytes
  ◦ To store the integer 20 you need 4 bytes
• Storing a value from 1-16 can be done using only 4 bits!

Value | Bit configuration | Value | Bit configuration
1 | 0001 | 9 | 1001
2 | 0010 | 10 | 1010
3 | 0011 | 11 | 1011
4 | 0100 | 12 | 1100
5 | 0101 | 13 | 1101
6 | 0110 | 14 | 1110
7 | 0111 | 15 | 1111
8 | 1000 | 16 | 0000

Compression

1 integer per run (the 75-run monochrome list) = 75 x 4 = 300 bytes

RLE (by row), 4 bits per run (the 78-run list) = 78 x 4 bits = 39 bytes

*16 is coded as 0, since a run length of 0 is not otherwise valid

Compression – example 2

If we assume that run lengths cannot exceed 256 (1 byte), then 1 byte per run = 75 bytes

RLE (by row), 4 bits per run = 39 bytes

*16 is coded as 0, since a run length of 0 is not otherwise valid

Compression and Color Values

If you want to include color values, you can easily do that. For example, if we use 0 for white and 1 for black, then we could use the second half of the byte to store color data (up to 16 different colors).

RLE (by row), 4 bits per run and 4 bits for color = 78 bytes total

Run length | Color code
0 (16*) | 0
4 | 0
3 | 1
…

*16 is coded as 0, since a run length of 0 is not otherwise valid

Types of Run Length encoding

RLE with color
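The one-byte layout described above (run length in the first half of the byte, color code in the second half) can be sketched as follows. This is only an illustration: the function names are hypothetical, and long runs are split greedily at 16 rather than at row boundaries, which is a simplification of the by-row scheme on the slides.

```python
def split_runs(runs, max_run=16):
    """Split (length, color) runs so that no length exceeds max_run."""
    out = []
    for length, color in runs:
        while length > max_run:
            out.append((max_run, color))
            length -= max_run
        out.append((length, color))
    return out

def pack_runs(runs):
    """One byte per run: high nibble = length (16 stored as 0), low nibble = color."""
    packed = bytearray()
    for length, color in split_runs(runs):
        packed.append(((length % 16) << 4) | (color & 0x0F))
    return bytes(packed)

def unpack_runs(data):
    """Inverse of pack_runs; a stored length of 0 means 16."""
    return [((byte >> 4) or 16, byte & 0x0F) for byte in data]

# First runs of the slide's example: 0 = white, 1 = black.
example = [(20, 0), (3, 1), (1, 0), (1, 1), (11, 0)]
packed = pack_runs(example)
```

Each run costs exactly one byte, which is where the figure of 78 bytes for 78 runs comes from.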

Figure labels: X, Y, R, W

Example of RLE

Encoding Example - 1

Compress the following raster file using RLE. How much space is needed for storage, if we assume that bytes (8 bits) can be used to store the data values?

Figure: raster labeled with color and run length

Color/position pairs

This method stores the pixel value and then the pixel position where the run ends. 188 pieces of information are needed to store the example data. These can be encoded using 4-bit “nibbles”, storing the value and the ending location in one 8-bit byte: 94 bytes total.

Encoding Example - 2

Compress the following raster file using (color, position) pairs. How much space is needed for storage, if we assume that bytes (8 bits) can be used to store the data values?
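The (color, end position) scheme can be sketched for a single raster row as below. The slides do not state whether positions are counted from 0 or 1, so the 1-based column index used here is an assumption, as are the function names.

```python
def encode_end_positions(row):
    """Encode one row as (value, end_position) pairs.

    end_position is the 1-based column at which the run of `value` ends.
    """
    pairs = []
    for i, value in enumerate(row):
        is_last = i == len(row) - 1
        if is_last or row[i + 1] != value:
            pairs.append((value, i + 1))
    return pairs

def decode_end_positions(pairs):
    """Rebuild the row by filling each value up to its end position."""
    row = []
    for value, end in pairs:
        row.extend([value] * (end - len(row)))
    return row

# Hypothetical row using color codes 1, 2 and 3.
row = [1, 1, 1, 2, 2, 3, 3, 3]
pairs = encode_end_positions(row)
```

Because each pair holds two small numbers, a 16-pixel-wide row lets both fit into one byte as two nibbles, matching the slide's count of two data items per byte.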

Figure: raster labeled with color and end position

Quad tree encoding

Examples

Recursive strategy!

Quad tree Example 2

Quadtree encoding

Quadtree encoding recursively divides the space into quarters until all pixels in a given quarter have the same value. The example requires 160 data items and can be stored in 80 bytes.

Encoding Example - 3

Draw the quadtree compression representation of this image. How many data items are required?

Typical Quadtree Code (C++)

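class Node {
public:
    Node(int x, int y, int w, int h);

    int posX;        // x position of this quadrant
    int posY;        // y position of this quadrant
    int width;       // quadrant width in pixels
    int height;      // quadrant height in pixels
    bool leaf;       // true when all pixels in the quadrant share one value
    int value;       // pixel value, meaningful only for a leaf
    Node* child[4];  // the four sub-quadrants, unused for a leaf
};

// CONSTRUCTOR
Node::Node(int x, int y, int w, int h) {
    posX = x;
    posY = y;
    width = w;
    height = h;
    leaf = false;
    value = 0;  // initialise so that no member is left indeterminate
    for (int i = 0; i < 4; ++i) {
        child[i] = nullptr;
    }
}

Huffman Encoding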

The Huffman coding compression technique involves a preliminary analysis of the frequency of occurrence of symbols. The Huffman technique creates, for each symbol, a binary code whose length is inversely related to the symbol's frequency of occurrence: frequent symbols get short codes and rare symbols get long ones.

Decoding example: Color/position pairs
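Before the decoding example, here is a minimal sketch of the Huffman construction described above: count symbol frequencies, then repeatedly merge the two least frequent groups, prefixing their codes with 0 and 1. The function name and the example message are hypothetical, and real codecs store the code table in a more compact form than a dictionary.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Return {symbol: bit string}; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, group1 = heapq.heappop(heap)   # least frequent group
        w2, _, group2 = heapq.heappop(heap)   # next least frequent group
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

message = "aaaaaaaabbbc"          # 8 x 'a', 3 x 'b', 1 x 'c'
codes = huffman_codes(message)
total_bits = sum(len(codes[s]) for s in message)
```

For this message the most frequent symbol gets a 1-bit code and the two rarer symbols get 2-bit codes, so the 12 symbols need 16 bits instead of the 24 bits a fixed 2-bit code would use.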

Image files should provide all of the information necessary to be decoded. Assume that the data shown here was to be put into a file. What other information must be in that file for it to be self-contained?

File Structure

• File header
• Color table
  ◦ 1 = yellow
  ◦ 2 = blue
  ◦ 3 = green
• Image descriptors
  ◦ width
  ◦ height
  ◦ maybe location
• Image data
  ◦ RLE encoded values

Figure: file layout (header, color table, image desc., data)

GIF

The Graphics Interchange Format (GIF, pronounced jiff, though most people say giff) is the oldest graphic file format on the Web, and all browsers except Lynx support it. GIFs are 8-bit images, which limits them to a maximum of only 256 colors.

GIFs use lossless compression and support transparency and animation (display of multiple images within a single GIF file).

GIF Format

Header

Color table

Image descriptor

Image data

PNG

The Portable Network Graphics (PNG) format, pronounced ping, was designed to be a better, patent-free replacement for GIF. PNG is a lossless compression format for transmitting a single bitmap image over computer networks. PNG matches all of GIF's features except animation.

PNG has better compression and interlacing than GIF and adds new features of its own, such as gamma storage, full alpha channel, true color support, and error detection. PNG's full alpha channel makes it possible to create beautiful glows and drop shadows which layer over different-colored backgrounds perfectly.

PNG file structure explained here: http://en.wikipedia.org/wiki/Portable_Network_Graphics#Technical_details

Lossy Image Compression

Lossy image compression takes advantage of the human eye's ability to overlook imperfections, and of the fact that some types of information are more important than others.

For instance, changes in luminance are seen as more significant by a human observer than changes in hue.

JPEG is a file format implementing compression based on the Discrete Cosine Transform (DCT); together with lossless entropy coding, this provides good compression ratios.

The way JPEG works is best suited for images with continuous tonal ranges, like photographs. Logos, scanned text and other images with lots of sharp contours and lines will show more compression artifacts than photographs.

JPEG is most suited for photographic content where the adverse effect of the compression algorithm is not so evident.

JPEG is not suited as an intermediate format; use JPEG only for final distribution, where file size actually matters.

JPEG Loss

This is an image specially constructed to show the deficiencies of the JPEG compression algorithm, saved, reopened and saved again 9 times.

Figure: JPEG artifacts

JPEG Compression Algorithm

The JPEG compression scheme is divided into the following stages:

1. Transform the image into an optimal color space.
2. Downsample chrominance components by averaging groups of pixels together.
3. Apply a Discrete Cosine Transform (DCT) to blocks of pixels, thus removing redundant image data.
4. Quantize each block of DCT coefficients using weighting functions optimized for the human eye.
5. Encode the resulting coefficients (image data) using a Huffman variable word-length algorithm to remove redundancies in the coefficients.

Figure: JPEG file markers

JPEG Compression/Decompression

Original Image

250 x 375 pixels = 93,750 pixels
24-bit color = 3 bytes per pixel
93,750 x 3 = 281,250 bytes

This JPEG, however, is 34,414 bytes.

Compression ratio ~8:1 (281,250 / 34,414 ≈ 8.2)
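The size and ratio arithmetic used in these examples can be checked with a short sketch. The helper names are hypothetical, and 512 KB is taken here as 512 x 1024 bytes.

```python
def uncompressed_size_bytes(width, height, bytes_per_pixel=3):
    """Raw size of a bitmap: one sample per channel per pixel."""
    return width * height * bytes_per_pixel

def compression_ratio(uncompressed_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size (e.g. 11.0 means 11:1)."""
    return uncompressed_bytes / compressed_bytes

# The 250 x 375 photo from the slide, stored as a 34,414-byte JPEG.
photo_raw = uncompressed_size_bytes(250, 375)
photo_ratio = compression_ratio(photo_raw, 34414)

# The earlier 1600 x 1200 example, stored as a 512 KB file.
big_raw = uncompressed_size_bytes(1600, 1200)
big_ratio = compression_ratio(big_raw, 512 * 1024)
```

Rounded, `photo_ratio` is about 8.2 and `big_ratio` is about 11, in line with the figures quoted in the text.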

source: http://www.ams.org/samplings/feature-column/fcarc-image-compression

Compression Algorithm

First, divide into pixel grids

Example: let's choose one cell

Transform RGB into YCbCr

Notice how similar the pixels in this grid cell are.

From the RGB values the following can be computed for each pixel:
Y = Luminance (brightness)
Cb = Chrominance (blue)
Cr = Chrominance (red)
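The slides do not give the conversion formulas. A common choice, assumed here, is the full-range BT.601 conversion used by JFIF files; the sketch below applies it in both directions with hypothetical function names.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JFIF-style) conversion; inputs are in 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Approximate inverse of rgb_to_ycbcr."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b

# White has full brightness and neutral chrominance (both near 128).
white = rgb_to_ycbcr(255, 255, 255)

# A round trip recovers the original color up to small rounding error.
restored = ycbcr_to_rgb(*rgb_to_ycbcr(200, 100, 50))
```

Separating brightness (Y) from the two color-difference channels is what allows the later steps to treat chrominance more coarsely than luminance.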

To reconstruct the image we will go from Y, Cb, Cr back to R, G, B.

Sampling

Luminance shows more variation than chrominance. So… chrominance can be averaged together without being detected.

Sampling occurs when 2x2 pixel sections are averaged.
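Averaging 2x2 pixel sections of a chrominance channel can be sketched as follows. The function name and the nested-list channel layout are assumptions, and the sketch only handles channels with even width and height.

```python
def subsample_2x2(channel):
    """Replace each 2x2 block by its average; assumes even width and height."""
    height = len(channel)
    width = len(channel[0])
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]

# Hypothetical 2x4 Cb channel: two 2x2 blocks side by side.
cb = [[100, 102, 50, 50],
      [98, 100, 50, 50]]
cb_small = subsample_2x2(cb)
```

The result has a quarter as many samples as the input, so each subsampled chrominance channel takes a quarter of the space.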

Figure: Y, Cb and Cr channels

Discrete Cosine Transformation

• Assume that the amount of variation over one cell of pixels is small
• Since we now have a single value for Cr and Cb, we now just need to find one for Y
• Y (luminance)
  ◦ determine the average
  ◦ create a difference table in which you record, for each pixel, how much it differs from the average
  ◦ small frequency changes will be small values
  ◦ large frequency changes will be high values
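The slide describes the transform informally, as an average plus differences. A naive sketch of the standard orthonormal 2-D DCT-II (an assumption about which variant is meant, written for clarity rather than speed) shows the same idea: the first coefficient is proportional to the block average, and the remaining coefficients measure variation at increasing frequencies.

```python
import math

def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as nested lists."""
    n = len(block)

    def alpha(k):
        # Scale factors that make the transform orthonormal.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

# A perfectly flat 8x8 block: all of its energy lands in coefficient [0][0].
flat = [[100] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
```

For a cell with little variation, almost all coefficients other than `coeffs[0][0]` are close to zero, which is what the following quantization step exploits.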

Quantization

• Quantize the frequency values so that higher frequencies are ignored (or lumped together in one category)
• The human eye is less sensitive to high-frequency changes, so we can get away with this undetected
• As usual, the more quanta, the better the image but the larger its size

Image Reconstruction
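The quantization step above, and the approximate reconstruction it permits, can be sketched on a tiny block of frequency coefficients. The step table built here is purely illustrative (it is not the table from the JPEG standard), and all names and values are hypothetical.

```python
def make_step_table(n, base=8, slope=4):
    """Illustrative step sizes that grow with frequency (u + v)."""
    return [[base + slope * (u + v) for v in range(n)] for u in range(n)]

def quantize(coeffs, steps):
    """Divide each coefficient by its step and round to an integer level."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, steps)]

def dequantize(levels, steps):
    """Multiply levels back by the steps; the rounding error is not recovered."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, steps)]

steps = make_step_table(2)                 # [[8, 12], [12, 16]]
coeffs = [[800.0, 31.0], [-12.0, 3.0]]     # hypothetical 2x2 coefficients
levels = quantize(coeffs, steps)
restored = dequantize(levels, steps)
```

The small high-frequency coefficient (3.0) is rounded to level 0 and vanishes on reconstruction, while the large low-frequency coefficient survives unchanged; larger steps would discard more detail and compress further.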

Figure: Reconstructed (q=50)

Figure: Reconstructed (q=10)

Image reconstruction

Figure: Original vs. Reconstructed (q=50)

Figure: Original vs. Reconstructed (q=10)

Comparison

Figure: Reconstructed (q=5) vs. Reconstructed (q=50)

Quantization Level Selection

• If you use too many quantization levels, you may create more distinctions than most humans can perceive (or more than hardware can render)
• Too few quantization levels can introduce obvious inaccuracies, such as posterization (brightness contouring) in images
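The effect of the number of quantization levels can be illustrated with a small sketch that maps 8-bit brightness samples onto a chosen number of evenly spaced levels. The function name and the even spacing are assumptions made for this example.

```python
def quantize_levels(value, levels, max_value=255):
    """Map a 0..max_value sample to the nearest of `levels` evenly spaced values."""
    step = max_value / (levels - 1)
    return round(round(value / step) * step)

# With only 4 levels, every brightness collapses onto 0, 85, 170 or 255.
four_level_values = sorted({quantize_levels(v, 4) for v in range(256)})

# With 256 levels, 8-bit samples pass through unchanged.
unchanged = all(quantize_levels(v, 256) == v for v in range(256))
```

Smooth gradients rendered with the four-level version show visible bands, which is the posterization effect mentioned above.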

Figure: 2^24 levels vs. 4 levels

Example: Vector Graphic Image

Figure labels: physical pixels; more computation

• Typically fewer bytes used for representation or models (smaller files) than bitmaps
• Resolution-independent (can be scaled without loss of quality)
• Typically suited for certain sorts of images (e.g., “synthetic” images; not photographs)

Other Pros/Cons: Bitmap Vs. Vector?

• Scaling
  ◦ Bitmap images generally lose quality when scaled larger
  ◦ Vector images can be scaled larger or smaller without loss of quality
• Selecting “logical” parts of an image
  ◦ Bitmaps: it can be difficult to select “logical” parts of an image as an object, to apply an operation (e.g., change color)
    ▪ e.g., select the petals of a rose in a bitmap
  ◦ Vector: easy to select pre-defined parts and apply an operation
• Selecting arbitrary geometrical parts of an image (e.g., rectangular)
  ◦ Typically easier with bitmap than vector
• Sizes
  ◦ Bitmaps are generally larger because they increase in size with the number of pixels represented; size doesn’t vary with the complexity of the image
  ◦ The size of vector graphics doesn’t change with the number of pixels, but rather with the number of objects in the scene
• Portability
  ◦ Sometimes easier to use a bitmap image (e.g., JPG) than a vector graphic (e.g., SVG) in a web page because of format support and because of complexity of representation

End!

JPEG

JPEG (pronounced jay-peg), is designed for compressing either full-color or gray-scale images of natural, real-world scenes. JPEG is a lossy compression algorithm. When you create a JPEG or convert an image from another format to a JPEG, you are asked to specify the quality of image you want.

Since the highest quality results in the largest file, you can make a trade-off between image quality and file size. The lower the quality, the greater the compression, and the greater the degree of information loss.

JPEGs are best suited for continuous-tone images like photographs or natural artwork; they do not do so well on sharp-edged or flat-color art like lettering, simple cartoons, or line drawings. JPEGs support 24 bits of color depth, or 16.7 million colors.

JPEG is actually just a compression algorithm, not a file format. JPEG is designed to exploit certain properties of our eyes, namely, that we are more sensitive to slow changes of brightness and color than we are to rapid changes over a short distance.

JPEG compression introduces noise into solid-color areas, which can distort and even blur flat-color graphics. This is why JPEGs are not well suited to flat-color sharp-edged art or type. A JPEG can reduce a 900K 24-bit image to 45K (high quality) or 30K (medium quality), a factor of 20:1 to 30:1. With JPEGs, however, the more you compress, the more edge definition and sharpness you lose. JPEGs do not support transparency, either.

It is important to note that saving a graphic to JPEG format with compression should be a last step. Compression effects are cumulative. This means that every time you re-save a JPEG file, you are compressing it further, and thereby tossing away data (photographic detail) that you can't get back.

Sampling Rate Selection

• In general, higher sampling rates give a better representation of the signal
• Consider taking picture snapshots of a clockwise-rotating disk with a radial line on it; the disk is rotating at n rotations per second (e.g., n=1)
  ◦ Goal: Preserve information about the direction of rotation of the disk
• Suppose we sample at 4n snapshots per second (e.g., 4 Hz if n=1)

Figure: snapshots over two 1-second intervals

• Notice that we can correctly tell which direction the disk is spinning from the snapshots; i.e., the information about the rotational direction has been correctly preserved in the snapshots


• Now consider what happens if we sample at considerably less than 4n samples per second; e.g., (4/3)n samples per second:

Figure: snapshots over two 1-second intervals, sampled at (4/3)n

• Notice that now it seems that the disk is rotating counter-clockwise
• The low sampling rate (undersampling) has introduced error into the “digital” signal
• Aliasing: Error introduced by undersampling


• A sampling rate of 4n is generally higher than necessary to accurately represent a signal, but the previous sampling rate (at (4/3)n) is too low
• What about a sampling rate of 2n?
• Here are our samples of the disk:

Figure: snapshots over two 1-second intervals, sampled at 2n

• Any problem?
• Again, we have aliasing: we cannot tell if the disk is rotating clockwise or counter-clockwise
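The three sampling rates discussed for the rotating disk can be compared with a short sketch that computes how far the radial line appears to move between consecutive snapshots. The function name and the sign convention (positive for clockwise, negative for counter-clockwise) are assumptions for this example; exact fractions are used to avoid rounding issues with 4/3.

```python
from fractions import Fraction

def apparent_step_degrees(rotations_per_sec, samples_per_sec):
    """Apparent clockwise movement between snapshots, in the range (-180, 180].

    A negative result means the disk appears to turn counter-clockwise;
    exactly 180 means the direction cannot be determined.
    """
    step = Fraction(360) * Fraction(rotations_per_sec) / Fraction(samples_per_sec)
    step %= 360
    if step > 180:
        step -= 360        # a 270-degree clockwise step looks like 90 degrees backwards
    return step

n = 1                                                 # rotations per second
at_4n = apparent_step_degrees(n, 4 * n)               # 90: clockwise, correct
at_4_3n = apparent_step_degrees(n, Fraction(4, 3) * n)  # -90: looks counter-clockwise
at_2n = apparent_step_degrees(n, 2 * n)               # 180: ambiguous
```

Only sampling rates strictly above 2n give an apparent step of less than half a turn, which is why 2n itself still leaves the direction ambiguous.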

End!