1
coding
5.9. Video Compression Compression Video 5.9. (1) interframe
and
motion motion prediction/estimation, motion compensation, block intraframe
according according peculiar to the videoconferencing,conditions, e.g. applications such as broadcast video matching video compression algorithms and standards are distinguished carried only once wanted: strategies to exploit temporal redundancy and irrelevance! assumption: successive images of a video sequence are similar, directly e.g. adjacent images contain almost the same information to be that has video time sequence := of single images frequent point of view: video compression image = compression temporal with a component
Basics:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
2
quantizer
coding
interframe
Video Compression (2)Compression Video coding, color subsampling 4:1:1, 6 bit 6 bit subsamplingcoding, color 4:1:1,
compression of a time sequence of images based on the sequence of images a time ofcompression
JPEG - hardware solutions hardware possibility synchronize audio is not provided to data position every time full images at direct access to application and video softwarein proprietary consumer cutting temporal redundancy is not used! istemporal redundancy not 20:1, to 5:1 applicable compression ratios from for call higher for rates unfortunately, not standardized! unfortunately, JPEG, of baseline system the use of makes intraframe M JPEG standard
Simple Simple approach:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
3 -
head texture
, e.g. , e.g. and elements
motion
(3)
picture
the
: objects of
geometric
of
of
:
position describing
of pixels
relations
of camera
processing deformation
, ,
of techniques
change compensation Compression
/ prediction
parameters
ranges
shadows semantic objects scaling
or , , zoom )
and no
and based values
model
prediction but video prediction
: Video Video
( pixels
4 rotation rotation , with -
color lights , , of of
arrangements region -
of of
based or
motion motion
model
of of prediction
information e.g. MPEG neighbouring model grid shoulder object extraction change translation change translation prediction • • • • • • • kinds kinds
Motion Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
4
Video Compression (4)Compression Video
little change in the contents of the image from one frame to the next one frame image from the contents of in the little change one from change do not image the of portions significant even if,
transmitted. Objects tend to move between frames! move between Objects tend to to be needs that information of amount the increase would We assumptions: - - next frame to the • • • encoding)? Solution: same the pixel value of the at value a pixel by the Predict the of (differential difference location encode the and previous framein a use a previous frame to generate prediction for the next the prediction generate for to previous frame use a Block matching: No!
Motion prediction and compensation: (cont.) compensation: and prediction Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
5
(5)
Compression
Video Video Block (cont.)matching:
Motion prediction and compensation: (cont.) compensation: and prediction Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
6
and is and
MxM
uncompensable
Video Compression (6)Compression Video that most closely matches the block being encoded beingthe block matches closelythat most
MxM previous reconstructed frame (reference image) is greater than some some is greater than image) (reference reconstructed frame previous is declared the block threshold, predefined of prediction. benefit without the encoded block being of the the difference the threshold, is below If the distance a transmitted. Additionallyand to its closest block is encoded encoded is transmitted. motion vector divide the frame being encoded into blocks of equal size size blocks of equal into encoded frame being the divide of the block for reconstructed frame search a previous for each block: size measure? closest block in a to the encoded block being of the If the distance • • • • Forward Forward prediction:
Motion prediction and compensation: (cont.) compensation: and prediction Motion Block matching: (cont.)
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
7
.)
cont (7)
: (
compensation Compression
.) and
cont : ( Video Video matching prediction Block
Motion Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
8
Video Compression (8)Compression Video Block (cont.)matching:
Motion prediction and compensation: (cont.) compensation: and prediction Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
9
Video Compression (9)Compression Video
objects moving in different increases different positions in moving objects prediction in poor can result to restrict block for search region matching to miss a match probability the increases more computations per comparison but fewer blocks per per frame but blocks fewer comparison per more computations of covering the probability size block increasing with major drawback: reduce search space: reduce increase size of the blocks: of size increase • • "find closelymatching block" of requires large amount computation! Block (cont.)matching: Search strategies:
Motion prediction and compensation: (cont.) compensation: and prediction Motion
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
10
general general video compression encoding /
Video CompressionVideo (10)
Block diagram: Block procedure
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
11
4 part 2" 4 part - 1) were specific 1) were developed for -
4 / H.264) were developed H.264) wider range for 4 / -
standards: e.g. "H.263.1" or "MPEG "H.263.1" e.g. standards: -
x series - Video Compression OverviewCompression Video
new standards (MPEG new standards application of both define sub both define MPEG (H.261 / early standards purposes ITU: H.26x series ITU: H.26x MPEG ISO:
Two organizations Two
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
12
H.261 / H.263(1)
-
Video CompressionVideo start / stop playingvideo start / stop restart of video play for automatic + timeoutimage freeze a still ... • • • control commands from encoder to decoder to encoder from control commands error correction information correction error bit) H.263: 8 bit / image sequence 5 (H.261: number Compressed data stream: p* 64 Kbit/s, p= 1, ..., 30 ..., 1, p= 64 Kbit/s, p* Compressed stream: data Compression in real time, symmetric procedure Compression symmetric in real time, ISDN to Targeted
The data stream stream more information data structure The contains video: just than Video compression conferences video for compression Video
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
13
H.261 / H.263(2)
-
CIF) -
Video CompressionVideo 704 x 576 1408 x 1152 4CIF (4 time CIF) 16CIF (16 times CIF) 128 x 96 176 x 144 (Sub Quarter CIF) SQCIF CIF (Common CIF (Common Intermediate Format) (Quarter QCIF 352 x 288 Video Video input must provide 29,97 images per second YUV representation with color sampling ratio of 4:2:0
Decoders must implement rest only, QCIF is optional the H.263 only: Image sizes: Image preparation: Image
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
14
changed
be
V plane V
may
and only
U
of
factor -
0,0) ) H.263): Q
block null
difference is
H.261 / H.263(3)
the for -
the blocks
position macro
at coefficients
a GOB
difference
macro V plane V each
contains null options
Y plane plane Y + 1 block
other
or the factors
- for be
if of
within
Q
block coefficient
Y U Y
( = all may
more (=
code
of (
vector : fixed
blocks blocks (GOB) = 33 = 33 (GOB) macro
Compression blocks vector pixel
with the
special
to a moving
coefficient coefficients coding coding macro blocks :
is
a
block = 4 moving of
AC DC DCT
is
Video Video for the otherwise there for • • • • • there between apply quantization block = 8 x 8 macro group
Interframe Intraframe Intraframe Structures
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
15
1 - MPEG (1) MPEG
-
Video Compression Compression Video
1.5 Mbit/s for video 1.5 Mbit/s for for audio Kbit/s 448 Today the first standard is often called MPEG called standard is often first Today the algorithms Compression mediacompressed storing or transmitting Formats for • • • • • decoder is limited The resulting bitrate Includes Video and Audio coding the thanencoder is complex more the codec, Asymmetric Definition Definition of MPEG = Moving Group Pictures Expert MPEG =
Properties International standard of ISO 1993 of ISO standard International
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
16
MPEG (2) MPEG
-
interlaced Video interlaced -
Video Compression Compression Video
16:9 for HDTV (Europe and USA) 16:9 for HDTV (Europe 1:1 PAL or NTSC Hz for interlaced 25 Hz or 30 non 50 Hz and 60 Hz for frequency possible lowest Hz is the 23,976 4:3 for PAL 4:3 for • • • • • • Picture frequency Picture frequency Relation of width and height of the pixels to support different TV different pixels support Relation width to of the and height of example: formats; Separation of luminance luminance Separation of and chrominance (similar YUV to using 4:2:0)
More attributes required than for JPEG for than required More attributes Image preparation Image
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
17
MPEG (3) MPEG Quantization Quantization scaling
- -
frame). The order of pictures -
using the same DCT
Layer
Video Compression Compression Video macroblocks chrominance chrominance plane) One 8x8 block compressed using DCT A set of set A block of 16x16 A pixel build of four blocks8x8 of the luminance plane and blocks chrominanceof two 8x8 of the planes (one block of each Contains Contains one picture A set of pictures where of pictures one set at least (usually is encoded first) the A without references to other pictures (I within layer this may be different the from presentation order Controls buffering of video data; each sequence with the starts constant bitrate and amount of memory required for the following sequence
Block Layer Block Macroblock Picture Layer Layer Picture Layer Slice Group of Pictures Layer PicturesGroup of Sequence Layer Layer Sequence
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof. 6. 5. 4. 3.
2. 1.
18 - run
quantized, quantized,
MPEG (4) MPEG
- encoded
huffman Video Compression Compression Video
references to frame #1 in frame #2 and frame #3 Moving are vectors references to a 16x16 pixel area in prior or following frames describing temporal correlations Parts of frame Parts #1 which change only slightly, be may encoded as
Temporal correlations between video frames are used to frames video Temporal correlations between macro reduce the of 16x16 pixelblocks size reduced transformed block is Each 8x8 using the DCT, length encoded, and During image preparation preparation image During the chrominanceis precision
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof. 3.
2. 1.
19
MPEG (5) MPEG
-
Video Compression Compression Video "good" "good" encoders find "good" matches in prior or following frames "medium" encoders find some matches in other frames "bad" encoders do not even try to find moving vectors • • • Usually Usually luminance only the values are similar used to find areas two adjacent macro blocks often have similar moving vectors MPEG does not define MPEG an algorithm for finding such a vector Applying Applying the DCT + quantization + entropy encoding difference to the special only. A value is used to describe a complete 8x8 empty block similar area Calculate the difference actual of the 16x16 pixel and the similar area (using luminance and chrominance planes) The moving The moving vector, describing the frame and the position of the
MPEG defines MPEG how to describe a moving vector only similar 16x16 area in 16x16 area If such other frames. similar an area has been found the block is encoded macro by: While encoding encoding While block a macro the encoder for seeks a
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
20
Frames) are Frames) -
MPEG (6) MPEG Frames)
- -
Frames -
Frames: -
Frames) - , and B and , - , P , -
Video Compression Compression Video frames of the "future" (B "future" the of frames frames that depend only depend (P on preceding frames that frames simplifycalculationB the used to of "past" and the of depend on framestemporal correlations may full frames are transmitted periodically periodically transmitted are (I full frames
Sequence of I Sequence Several frame types with different temporal temporal different frame Several with types correlations:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
- 21
MPEG (7) MPEG
-
frames may be encoded like macro blocks of I be encoded like blocksmay of macro frames -
Video Compression Compression Video
Macro blocks of P Macro blocks of only the difference that used, so may be or moving vectors frames be encoded must frame a prior to coding ~ JPEG coding ~ frames Coding I with relation prior P or to Coding without relation to other images Coding other without relation to 8 pixel Apply entropy DCT quantization blocks spectral 8 x + + to frames - frames -
P I
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
22
MPEG (8) MPEG
-
frames may be encoded be like blocks may of macro frames - Video Compression Compression Video
frames or may be described as an interpolation of I or P frame P frame or I interpolation described as an may be of or frames - P coded be to block macro the to differences its blocks and macro Coding following with relation prior and and P frames to I B of Macro blocks
frames -
B
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
23
MPEG (9) MPEG
- frames are used frames -
Video Compression Compression Video
used to support fast forward fast support used to necessary periodic when not I contain low frequency information only, i.e. the DC only the DC coefficient contain low only, i.e. frequency information
frames -
D
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
24
I B 12 13
B B 11 12
B B 10 11
I 13 B 10 MPEG (10) MPEG
9 B 8 P -
8 B 7 B
7 B 6 B
6 P 9 B
5 B 4 P
4 B 3 B
2 B 3 B
5 B 2 P order
1 I 1 I Video CompressionVideo structure - bitstream display order
GOP
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
25
2 (1) 2 1 - - frames enables the enables frames -
MPEG 1 for high quality quality high 1 for
- -
frames and (optionally) D (optionally) frames and -
Video Compression Compression Video 2 defines scaling capabilities, i.e. a decoder a decoder i.e. capabilities, scaling 2 defines 2 is an extension to MPEG to extension an 2 is - - playback with different frame rates with different playback Each picture is coded in different sizes, whereby the coding of size n the coding of sizes, whereby in different Each picture is coded of size n to the image the differences contains only I of placing Adequate
• • Rate; the number of frames per second may be defined by the be defined second may per frames of number Rate; the receiver Spatial; the horizontal horizontal Spatial; theand vertical be adapted by resolution can receiver the Profiles are defined for application application Profiles classesfor are defined profile qualities per levels of Different
can select the required scale select required can the MPEG video is flexible standard The MPEG
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
26
2 (2) 2 - MPEG
-
Video Compression Compression Video Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
27
use
, as objects
:
of :
video such
including
4 (1) 4 , )
- types example
scene
these objects
analog a conditions
range data
of
decoder
of with
MPEG quality the specific animation
bitrate -
describe
by as
service
high
face 1993 1993
wide number 4
well - the , i.e. i.e. ,
a very
audio as
very to
data a
(MPEG synthesized
started for
spaces objects be
rebuilding bitrates
represent
D to data -
of
bodies digitized 3
low
audio Compression of video plane video + plane
and - D
- music
very 3 : speech
standard
instead graphics efficiently
dynamic
and a 2D for
faces
and from to
of and the Video Video
flexibility
full
Generic human http://www.cs.ualberta.ca/~anup/MPEG4/Demo.html Speech support Text Video Music • • • • • the instead requires Motivation: Motivation:
Work on on Work requirements
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
28
use
-
4 (2) 4 -
MPEG
-
Video Compression Compression Video requirements:
Provide a delivery media that is independent from representation Provide independent is a delivery from thatmedia delivery different borders of the crosstransparently to format environments Provide hyperlinkinginteraction and capabilities intellectualon audiovisual property Manage and protect content only have access authorized users that and algorithms, so Compose Compose audio and visual as well natural and syntheticas audiovisual objects one into scene scenein the Describe the events the objects and mobile ones thus scene, in the various objects the represent Independently allowing their manipulation independent and re accessfor Provide encoding the in the layer resilience residual for to errors channel difficultconditions and for such as types various data
Further
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
29
visual -
Coding Coding audio of
--
4 Standards OverviewStandards 4 (3) - 11:2005 Part 11: Scene description and application engine 13:2004 Part 13: Intellectual Property Management and 14:2003 Part 14: MP4 file format 15:2004 Part 15: Advanced Video Coding (AVC) file format 1:2004 Part 1 Systems (first Version published: 1998) 2:2004 Part 2: Visual 3:2001 Part 3: Audio 6:2000 Part 6: Multimedia Delivery Integration Framework 8:2004 Part 8: Carriage of ISO/IEC 14496 contents over IP 10:2005 Part 10: Advanced Video Coding ------MPEG
total of 78 standards subdivided to 22 Parts ISO/IEC 14496 Protection (IPMP) extensions ISO/IEC 14496 ISO/IEC 14496 networks ISO/IEC 14496 ISO/IEC 14496 ISO/IEC 14496 ISO/IEC 14496 (DMIF) ISO/IEC 14496 ISO/IEC 14496 ISO/IEC 14496
objects ISO/IEC 14496: Information technology technology Information 14496: ISO/IEC
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
30
4 (4) 4 - MPEG
-
Video Compression Compression Video Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
31
4 (4) 4 - , e.g. wavelet , wavelet e.g. MPEG
CoDec -
Video Compression Compression Video
for textures each each AVO is encoded Each AVO by a special A scene is described is described A scene and tree an object by coordinates for
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
32
VOPs -
matchings
of the first row or
2) 8x8or blocks -
2 using I, P, and B or an alpha channel -
coeffizients 4 Visual -
unvisible (0,0) and 4 Part2: Visual 4 Part2:
4 Visual 4 - - An overview
-
koeffizient rectangles, which consists of - MPEG : DC
are codecs that use MPEG codecs that are
predicition
may be predicted by adjacent blocks (only upper or left block)
Xvid colum 4 Natural Video Video Coding 4 Natural a texture defining a bit mask which pixels are visible/ - • • Motion vectors use half pixel resolution to improve precision of Coefficient first compensation Motion compensation may use 16x16 blocks (as MPEG still textures still textures can be encoded using wavelets changing textures are encoded to similarly MPEG encoding of bitmasks: arithmetic encoding (specialized variant) plus motion VOPs may be used to encode foreground and background objects separately A VOP is an arbitrary 2D
MPEG DivX and Some additional features of MPEG Some additional of features A video scene is made up of video sceneA video object planes of made up is (VOP)
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
33
4 - 4 (5) 4 - 1 or 2) in most cases 2) in most 1 or
- MPEG
- 4 is based on VRML on based 4 is -
of a scene adds complexity to the encoder and decoder complexitythe a scene adds to of
Video Compression Compression Video 4 face animation animation 4 face example: - Distinction background and between foregroundwill a lead to MPEG to (comparedcompression better Modelling extensions to the standard are described by the "MPEG bydescribed the are standard the extensions to Description Syntactic Language" (MSDL) new compression techniques data types
http://www.cs.ualberta.ca/~anup/MPEG4/Demo.html MPEG Compression efficiency: Compression The scene description of MPEG of scene description The The standard is very flexible, it may be extended by may be extended it flexible, is very standard The
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
34 2 -
discs)
ray - matchings blu
4 AVC 4 -
length and arithmetic coding), improve efficiency improve coding), arithmetic and length -
codecs MPEG filters (reduce artifacts, so far a common post processing step) are already taken already are step) processing post a common sofar artifacts, (reduce filters
high quality video (AVC is a onmandatoryvideo is codecfor HDTVvideohigh (AVC quality enable very low bandwidth video (e.g. enablelow bandwidthvideo very for video conference) • • Flexible macro block ordering, enables arbitrary ordering of macro blocks, could be used to usedbe couldblocks, of macro ordering arbitrary enables ordering, block macroFlexible often) more regions important (update stream video of the robustness enhance (run encoding entropy adaptiveContext Motion vectors use quarter pixel resolution to improve precision of precision improveto resolutionpixel quarter use vectors Motion boundaries pictureof outside reference may vectors Motion Deblocking mandatory becomethus and compensation motion during account into The motion compensation may reference to multiple previous or following pictures (MPEG pictures following or previous multiple to reference may compensation motion The is picturesof order reference and displaythe Thus only). picture one to references allowsonly decoupled. pictures) fadingof efficiencies (higher compensation motionby prediction Weighted pixel4x4 to down 16x16 betweenvary may blocksize Macro a design goal was to be very scalablevery be to was goal a design
rates but requires much more CPU resources CPU resources more than but requires much rates video previous MPEG The high The degree of flexibility enables high compression AVC is a very flexible a very flexible AVC is format Advanced Advanced Codec Video is identical (AVC) to H.264
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
35 x -
MPEG
-
Video CompressionVideo
7 Overview 7 -
)automatic search capabilities, filtering and selection of - 21: 7: - - includes: includes: financial services, directory services, content providers, aggregators, consumer regulatory devices, services, technology services, delivery services,... MPEG Multimedia framework contents automatic processing, intelligent applications call for proposals in October 1998 efficient efficient identification of multimedia content due to standardized description (semi activities that focusactivities on the development of a multimedia content description interface development of compact descriptions information of
MPEG MPEG
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
36
MPEG Overview MPEG "Multimedia Framework" has
- 21 -
MPEG
7 Overview 7
-
21: 2: 4: 7: 1: Video Compression Compression Video - - - - - includes: includes: financial services, directory services, content providers, aggregators, consumer devices, regulatory services, technology services, delivery services,... Work on Work on the new standard started in June 2000; the standard for description and search of audio and visual content. MPEG the standard the standard which such products as Digital Television boxes set top are based and on; DVD the standard for multimedia for the fixed and mobile web; the standard the standard on which products such as Video CD and MP3 are based;
MPEG MPEG MPEG MPEG MPEG
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
37
uses a nonlinear uses a nonlinear
Companding (Compressing (Compressing Expanding),
: more precise samples low ear requires at
5.10. Audio Compression (1) Compression Audio 5.10. sound compression like silence sound compression methods,
lossy amplitudes amplitudes. than high sample per bits of number the reduce to formula were silence (as samples of zero), since some people since some people were silencehave less samples of zero), (as sensitive hearing Companding Silence they as if compression: some small treated samples are
better quality better Some Some compression and taking advantage perception of our can get of sound; Conventional, Conventional, lossless compression such methods, as Run Length Coding, compress sound can be used file, to depend results but the heavily the specific sound on
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
38
μ=255
Law - based, to encode digitized based, to audio -
Law and A and Law bit input sample x, normalizing it to , it is it normalizing , bit input sample to x, it - - 1,+1], G.711 standard recommends standard G.711 1,+1], µ - Law: encoder Law: - Since encoder performs logarithms are complex, in practice the an approximation produce simpler calculations that much
within interval [ the Experiments indicate that low indicateExperiments speech signals thatamplitudes of amplitudes high than information contain more receivingEncoder 14 International standard, logarithm standard, International ISDN digitalfor samples servicestelephony
Law and A and Law
-
µ
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
39
bit among bits 5 through 5 through 12 bit bits among -
> Segment Code > Segment -
5 = 0 0 1 1 -
bits quantization code is set of the four bits following the bit the set of the code is four following bits bits quantization bit bit is #9 - - 656 Audio Compression Compression Audio (2) – Format
Q0: the 4 S0: subtracting 5 from 5 from S0: subtracting that position -
-
: 5 = 4 in binary representation S2 S1 S0 = 1 0 0 S0 S2 S1 representation 5 = 4 in binary
- Codeword 656| + 33 = 656| - the most significant 1 most significant the 9 Q3 Q28 Q1 Q0 = bits at position
Encode Q3 Encode step 2 in determined position | Adding 33 to the absolute value of the input sample input of the value to absolute the 33 Adding 1 of the most significant the bit position Determining of the input S2 Encode Law - 2. 3. 4.
4. 1. 1. 2. 3.
Example, Example, encode : Algorithm
µ Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
40
33 = 655 33 =
655 – -
Audio Compression Compression Audio (3) equals decimal 4, so 43 x 2^4 = 688 = 2^4 equals so 43 x decimal 4,
Law: decoder decoder Law: example - - Decrement the result by 33: 688 result by 33: Decrement the sign: the Use bit P to determine Binary 0101 equals decimal 5, so 5 x 2 + 33 = 43 33 = 2 + Binary 0101 equals so 5 x decimal 5, code: segment power of the the to raised Multiply result by 2 the Binary 100 Multiply quantization 33: and add the code by 2 Law Law and A - 3. 4. 2. 1.
µ
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
41 1; up to 1; up -
3 3 3 - - - 1, from 32 to from 32 to 1, - 2, Layer 2, -
2 and Layer 2
- 3) -
1, Layer 1, - 3) - channel (stereo) coding coding of channel (stereo) -
2, and from 32 to 320 Kbit/s for Layer Kbit/s and from 32 to 320 2, - MPEG Audio (1)Audio MPEG
channel and two (mono) 1, and from 8 to 160 Kbit/s for Layer for Kbit/s160 8 to from and 1, - - 2 BC: (1994, ISO/IEC 13818 2 BC: (1994, ISO/IEC 1: (1992, ISO/IEC 11172 ISO/IEC 1: (1992, - - Predefined bit rates range from 32 to 448 Kbit/s for layer Kbit/s for to 448 from 32 rates range bit Predefined for Layer 384 Kbit/s • Lower bit rate extension, bit rates down from 32 to 256 Kbit/s for 256 Kbit/s 32 to down from extension, bit rates Lower bit rate Layer channels) coded be can 22.05, extension,16, sampling Low samplefrequencies rateat and 24 kHz A backwards compatible multichannel MPEG extension to 5 main channels enhancement" channel plus a low (5.1 frequent 3 coding methods are defined: Layer defined: are coding methods 3 signal digitized sound signal and 48 kHz sampling rate 44.1, high 32, quality audio atused for
MPEG MPEG
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
42
3) -
MPEG Audio (2)Audio MPEG
7) - 1 audio decoder based coding - - 4: (1999, ISO/IEC 14496 4: (1999, ISO/IEC 2 AAC (Advanced Audio Coding): (1997, (1997, Coding): (Advanced 2 AAC Audio - - Object sampling rates of 8 to 96 kHz 96 8 to of sampling rates an by interpreted and read be cannot compatible, backward Not MPEG A high quality audio coding standard for 1 to 48 channels channels 48 at A high 1 to quality audio coding standard for
MPEG MPEG 13818 ISO/IEC
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
43
Kbit/s
128 Kbit/s 256 (DCC) - - Broadcast Application 112 Internet Audio Internet 192
DVD Digital audio 384 Kbit/s, Stereo 384 Kbit/s, Digital compact Cassette
2 BC three are layers three 2 BC - Delay Layer 19 to 50 ms
35 to 100 ms 59 to 150 ms
4:1 ratio 1 and MPEG 1 and 6:1 to 8:1 6:1 to - 10:1 to 12:1 10:1 to Compression
3 1 2 - - -
(MP3) Layer Layer Layer
complexity, performance delay increase and with each performance complexity, layer Both in MPEG in Both defined. coding These Layers represent family a of algorithms. but codec Basic model is the same,
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
44
Psychoacoustic Psychoacoustic (1)
Critical low audible at high wide band: narrow frequencies, at frequencies Range is about 20 Hz to 20 kHz, most sensitive at 2 to 4 kHz to 2 sensitive at most 20 kHz, Hz toRange 20 is about
Spectral Spectral masking effect the psychoacoustic psychoacoustic the principles. psychoacoustic term The describes human the characteristic auditory of the system human Sensitivity hearing of MPEG's modern perceptual perceptual audio coding modern techniques exploit MPEG's
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
45 can be up to 100 to up can be
postmasking , the the , ms
Psychoacoustic Psychoacoustic (2)
ms Masking effect before and after a strong sound a strong and after before effectMasking 5 to Pre masking 2 is about
Temporal Temporal masking effect
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
46
* 32) and 1152 *
subban subband
of frequency (the are audio samples
MPEG Audio encoderAudio MPEG 1 (12 samples/ 2/3 (36 * 32) - - subbands Mask Ratio) for each Mask Ratio) - to - samples are packaged containing into frames 384
The input audio stream through simultaneouslypassesa threshold the masking model that determines psychoacoustic (SMR, Signal subband Layer samples in Layer samples in The input audio stream passes bank that divides through a filter the inputinto 32 transformed from time domain to frequency domain). The
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
47
2
-
data Ancillary 1 Layer1 -
1 frame 1 - Layer
Samples
are written in the frame in the are written
could by represented be 384)
- Bitstreams (0 Scalefactors 1 this allocation 15 1 this can be 0 to - subband
are scaled such that the largest one one largest the such that scaled are
Audio: Audio: subbands 256) - Bit rate, and coding and coding mode rate, - Allocation (128 2/3 (36 * 32) in each frame (36 * 2/3 subband -
CRC (0,16)
(samples in (samples different
subband
FrameFormat MPEG of (32) Header
SCFSI (scale factor selection information) has 2 bits, it indicates indicates it selection bits, SCFSI 2 information) has (scale factor per scale3 factors 2 or whether 1, has 6 bits has 6 1152 samples in Layer each resembles a Layer is divided parts,A frame into 3 different numbers of bits), for Layer for bits), numbers of different bits per each signals of 12 The scaleeach factor than one, greater not one but close to becomes very number of channels, the bit channels, number of in sample each Bit allocation per bits of number describes the subband The header of each frame contains general contains suchinformation as the frame of each The header the sampling the synchronizationfrequency, MPEG layer, information,
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
48
3 -
" Layer
Samples
main_data_begin
Bitstreams information (136,256) Audio: Audio: Side Side
CRC (0,16)
3 variable allowed is rate bit
- 32 Header Encoder can borrow bits donated from past frames from donated can borrow bitsEncoder Side information includes " bits point, a 9 Frame Format MPEG of
For Layer For
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
49
should should
subband subband
Bit Allocation (1) Allocation Bit Since sample rate, samples per frame and bit rate are known, theare known, and bit rate per frame samples Since sample rate, is available fixed bits per frame
be allocated no more bits than necessary to make the the to make necessary bits than be allocated no more quantization noise inaudible For the most efficient compression, the most efficient each For below the masking threshold, while using than not more the available bits over a frame. The bit allocation algorithm works in this way, that the the that bit allocation way, in this The works algorithm quantization noise (difference between the original spectral and values the quantized spectral values) is The psychoacoustic psychoacoustic The model and allocation the bit algorithm are invoked to determine allocated the number of bits to the quantization each scaled sample of in each
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
50
> -
2: - Bit Allocation (2) Allocation Bit
3: 1 Layer and - -
Outer iteration loop Outer Achieving quantization a smaller noise larger number requires a bits of If the number of bits exceeds the number of bits available number of bits exceeds the number of the If leading larger quantization to The process is done by two nested iteration loops two nested is done byThe process The quantized Coding Hoffman value are coded by loop) iteration loop (rate Inner The process repeats until no more code bits can be allocated bits can be no more code until repeats The process
For Layer For Layer For
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
51
3 (high - 4 - uniform quantization, -
1 -
rates - Advanced Advanced Coding) Audio
AAC : ( AAC
2 -
2 AAC is the state the art of is the schemeaudio coding for 2 AAC - 3 and usesin a lot of details coding tools for new - Digital Digital broadcasting system Audio via Internet
MPEG
The only coding used within MPEG audio scheme the standard Application improved low bit at quality on the contrary standards onlyAAC the format of the encoded audio data. frequency resolution filter bank, non filter bank, resolution frequency coding,Huffman loop structure), iteration but improves on Layer generic codinggeneric up it givessignals; multichannel of stereo to MPEG backwards compatibility coding paradigm the same basicfollowsas Layer AAC MPEG
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
52
e.g. BMP, TIFF, TIFF, e.g. BMP, PNG JPEG, e.g. SVG MP3 e.g. WAV, e.g. MIDI in practice not relevant OGG AVI, e.g.
raster image vector image wave audio synthesis audio 5.11. MultimediaFormats File
combined formats/ container formats formats/ container combined audio video images Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
53
8 bit per channel - rare deployment on non Windows platforms well not for the suited Internet 1 resulting 24 in 1, 4, 8, bit none (mostly) RLE (only colorat 4 or 8 bit depth supported) on Windows platform: standard format large files
Color depth: Color Compression: Advantages: Disadvantages:
,
File Formats: Formats: File BMP
BitMaP
grayscale
1 or 3 color channels no alpha channels only Windows color management Monochrome, RGB; indexed colors Microsoft Corp. (Microsoft Windows) .bmp
Channels: ICC profiles: Development: Color representation: File extension: File extension: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
54
)
PackBits
16 bit per channel - files are comparatively large, thus not well suited for Internet use different compression methods cause interoperability issues 1 None RLE ( Group LZW, CCITT 3 and 4 JPEG multiple pictures per file are supported extendable platform independent
Color depth: Color Compression: Advantages: Disadvantages:
; RGB; File Formats:File TIFF
grayscale
) Image File Format ) Image File
ged 1, 3, 4 color channels or channels alpha 20 indexed colors indexed others and monochrome; monochrome; CMYK Systems, Systems, Inc.), Microsoft, HP, as scanner as and printer other well manufacturers
, .tiff tif yes Aldus Corporation (now part part of Adobe (now Corporation Aldus Tag( .
Channels: ICC profiles: ICC Color representation: Development: File extension: File Name: Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
55
)
compression not suited
16 bit per channel - lossy for graphics (unsatisfying results with outlines, missing transparency) no advanced features (see JPEG2000 1 lossless (prediction, Huffman; used seldom: verynot effective) DCT well suited colorful for images widely supported good compression results Color depth: Color Compression: Advantages: Disadvantages:
File Formats: Formats: File JPG
; RGB; CMYK ; RGB;
3 or 4 color channels no alpha channels and others grayscale Group (JPEG) File Group (JPEG) Interchange Format (JFIF) yes JPEG, ISO JPEG, .jpg, .jpg, .jpeg Joint Photographic Expert
Channels: ICC profiles: Color representation: File extension: Development: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
56
and lossless
more processing needed compared to JPEG still not widely supported Wavelet better compression results at same quality compared to artifacts)(no JPEG errors images in file affect only locally (Region ROI of Interest) many options for progressive image display lossy compression possible Disadvantages: Compression: Advantages:
File Formats:File JPEG2000
yes
; RGB; CMYK ; RGB;
16 bit per channel -
1 256 grayscale and others JPEG, ISO JPEG, Group 2000 .jp2 Joint Photographic Expert
Color depth: Color ICC profiles: Channels: Color representation: Development: File extension: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
57
not suited for colorful images (only 256 colors, limited to lossless compression) two incompatible versions (GIF87a und GIF89a) LZW well suited for Internet graphics widely supported interlaced: progressive image display possible animated graphics possible license issues until mid 2004 Compression: Advantages: Disadvantages:
File Formats:File GIF
max. 8 bit 1 transparent color possible indexed indexed colors
gif no CompuServe CompuServe Inc. Graphics Interchange Format .
Color depth: Color Channels: ICC profiles: Development: Color representation: File extension: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
58
web
web
since IE4 (no since IE4
16 bit per channel 16 bit - Improved interlacing Improved Still lacking by support browsers 1 LZ77) todeflate (similar Supported by browsers Netscape transparency!), 4.04 GIF Superset of functionality Disadvantages: Color depth: Color Compression: Advantages:
File Formats:File PNG
; RGB
3 channels 1 alpha channel grayscale indexed colors Technology Technology (MIT)
png limited Massachusetts Massachusetts of Institute Portable Portable Network Graphics .
ICC profiles: Channels: Color representation: Development: File extension: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
59
files are files comparatively large, suited for Internet thus not use PICT (Macs), TIFF PICT Metafile (WMF; Windows), EPSI none (usually) JPEG well suited printing for supports vector graphics and raster graphics Disadvantages: Format of Preview Images: of Format Preview Compression: Advantages:
;
File Formats:File EPS
grayscale
8 bit per channel -
1 1, 3, or 4 color channels no alpha channels monochrome; CMYK RGB; and others eps yes Adobe Inc. Adobe Systems, Encapsulated Encapsulated PostScript .
Color depth: ICC profiles: Channels: Development: Color representation: File extension: Name:
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
60
1.1)
Format
1 Layer 1 1,2 Layer 1 3 4 - - - Audio FormatAudio (Java
Sun MPEG MPEG MPEG Format Microsoft Wave Interleaved Video Audio Microsoft ogg (not an acronym) QuicktimeApple Audio/Video Formats Audio/Video
Suffix MooV .
,
mov wav . .au .avi .ogg .mp4 . .mp2 .mp3 Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
61
cost,
, bit colors 16 only
bitrate
Microsoft AVI for large uniform uniform for areas large Microsoft, good quality, fast computation, fast high computation, quality, Microsoft, good cost computational high TV quality, high quality, good colors, high computational cost computational high colors, good quality, high computational much for movement, high Intel, without good pictures bad bit colors 8 fast, only relatively lossless, Encoding, Length Random codec creates about 60 MB/min for 384x288 pixels and 25 MB/min for about 60 codec creates
1 5.10 codec creates about 40 MB/min for 384x288 pixels and 25 MB/min for about 40 5.10 codec creates -
Indeo RLE 1 Video MPEG Cinepak
Indeo fps. Cinepak fps.
Some of the codecs used for video compression in an AVI file: AVI in an video compression for codecs used the of Some RIFF is Resource Interchange File Format and serves as general as general File isRIFF serves Interchange Resource and Format was defined exchanging It multimedia for data types. format purpose Electronic Arts. from IFF IBM based on and by Microsoft followed with a header RIFF starts and lists. by chunks AVI Files AVI Files special RIFF files. are a case of
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
62
1) available as video codec on -
Microsoft WMV
4 Video - 9) -
Ray discs -
Blue Three versions: WMV1, WMV2, WMV3 (released WMV3 (released with WMV2, versions: Three WMV1, Media Player 7 WMV3 (also called VC High definition High definition available format (WMVHD) Supports DRM files Similar to MPEG Window Window Media Video Proprietary video from Microsoft codec encapsulated Often ASF (Advanced in Streaming Format)
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
63 -
4. -
based based compression of -
based based measurement data. Quicktime -
4 is based on the QuickTime 4 is based QuickTime on the -
Apple Apple
feel user. for the -
Apple claimsMPEG that with MPEG 6 works QuickTime video format. the video compression the video compression algorithms all are from Apple and not documented. It is possible is possible add new compression It to algorithms and driver codecs. for video Unfortunately, the technical details are hidden. Especially QuickTime comes with a lot of convenient convenient comes QuickTime a lot of with to functions ease programmer's the a standard and create look task and video and synchronized audio. can also be used time for It QuickTime is a method is a method QuickTime time for
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
64
2 with -
1 with 352x288 1 with 352x288 - 2 with 720x576 pixels -
384 Kbit/s audio. 35 to 60 min384 audio. fit on a Kbit/s -
Storing Multimedia Data Multimedia Storing
6144 Kbit/s audio. 60 to 180 min fit on a to 180 min fit audio. 60 6144 Kbit/s -
including 64 including (4.7GB). DVD single disc MB). single (650 (DVD) uses MPEG The digital video disc interlaced frames. and allowsThere to 9.8 is up Mbit/s data The super video compact disc (SVCD) uses MPEG The super video(SVCD) uses compact disc interlaced frames. allows There 480x576 pixels to 2.6 and is up 64 includingMbit/s data single CD (650 MB). According to the standard, isthere 1.15 standard, to the MB). According CD (650 single kHz samplingaudio with 44.1 audio and 224 Mbit/sKbit/s video rate. popular formats: VCD, SVCD, and DVD. and formats:SVCD, popular VCD, MPEG (VCD) uses The video compact disc and can store pixelsup to 74 min and audio on a of video For multimediastoring home, at data there currentlyare three
Prof. Dr. Paul Müller, University of Kaiserslauternof University Müller, Paul Dr. Prof.
kl.de -
2263
-
67653 Kaiserslautern 67653 - Email: [email protected] Email: Phone: +49(0) 631 205 631 +49(0) Phone: D Integrated Communication Systems ICSY Systems Communication Integrated Kaiserslautern of University Science Computer of Department 3049 Box P.O. Prof. Dr. Paul Müller