The Daala Directional Deringing Filter Jean-Marc Valin, Member, IEEE

1 The Daala Directional Deringing Filter Jean-Marc Valin, Member, IEEE Abstract—This paper presents the deringing filter used in 0 0 1 1 2 2 3 3 the Daala royalty-free video codec. The filter is based on a 1 1 2 2 3 3 4 4 non-linear conditional replacement filter and is designed for vectorization efficiency. It takes into account the direction 2 2 3 3 4 4 5 5 of edges and patterns being filtered. The filter works by 3 3 4 4 5 5 6 6 identifying the direction of each block and then adaptively filtering along the identified direction. In a second pass, the 4 4 5 5 6 6 7 7 blocks are also filtered in a different direction, with more 5 5 6 6 7 7 8 8 conservative thresholds to avoid blurring edges. The proposed deringing filter is shown to improve the quality of both Daala 6 6 7 7 8 8 9 9 and the Alliance for Open Media (AOM) AV1 video codec. 7 7 8 8 9 9 10 10 Figure 1. Line number k for pixels following direction d = 1 in an 8 × 8 I. INTRODUCTION block. The Daala video codec has been under development since 2012 with the goal of achieving royalty-free status by using technology that differs greatly from more traditional video The filter works by identifying the direction of each block codecs such as H.264 and HEVC. For example, Daala and then adaptively filtering along the identified direction. In uses lapped transforms [1], [2] rather than relying on more a second pass, the blocks are also filtered in a different di- traditional deblocking filters. It is also based on the percep- rection, with more conservative thresholds to avoid blurring tual vector quantization(PVQ) [3] technique rather than on edges. The deringing filter also vectorizes well, requiring no scalar quantization of a transformed residual. Some of these per-pixel scalar operation. An high-level interactive demon- techniques achieve better quality on textured content, at the stration of the algorithm is available at [7]. cost of more ringing artefacts. These are especially present in sharp edges due to Daala’s lack of intra prediction modes capable of predicting diagonal directions. For these reasons, II. DIRECTION SEARCH Daala greatly benefits from a deringing filter. The main goal of deringing is to filter out ringing artifacts The proposed deringing filter is based on the direction of while retaining all the details of the image. In HEVC, this is edges, so we will start by describing the direction search. achieved by the Sample Adaptive Offset (SAO) [4] algorithm For this, the image is first divided into blocks of 8 × 8. The that defines signal offsets for different classes of pixels. block size is chosen to be fine enough to adequately handle Unlike SAO, the approach we take in Daala is that of a non-straight edges, while being large enough to reliably non-linear spatial filter. From the very beginning, the design estimate directions when applied to a quantized image. of the filter was constrained to be easily vectorizable (i.e. Having a constant direction over an 8 × 8 region also makes implementable with SIMD operations), which was not the vectorization of the filter easier. case for other non-linear filters like the median filter [5] and For each block we want to determine the direction that the bilateral filter [6]. best matches the pattern in the block. This is done by The design of the deringing filter originates from the minimizing the sum of squared differences (SSD) between following observations. The amount of ringing in a coded the quantized block and a perfectly directional block. A image tends to be roughly proportional to the quantization perfectly directional block is a block for which each line step size. The amount of detail is a property of the input im- along a certain direction has a constant value. For each age, but the smallest detail actually retained in the quantized direction, we assign a line number k to each pixel, as shown image tends to also be proportional to the quantization step in Fig.1. size. For a given quantization step size, the amplitude of the For each direction d, the pixel average for line k is ringing is generally less than the amplitude of the details. determined by: This paper describes a deringing filter that takes into account the direction of edges and patterns being filtered. 1 X µd;k = xp ; (1) Nd;k Jean-Marc Valin is with Mozilla Corporation and Xiph.Org Foundation. p2Pd;k Corresponding email: [email protected] Copyright 2014-2016 Mozilla Foundation. This work is licensed under CC-BY 4.0. Send correspondence to Jean-Marc Valin <jm- where xp is the value of pixel p, Pd;k is the set of pixels in [email protected]>. line k following direction d and Nd;k is the cardinality of 2 Figure 3. Conditional replacement filter computation Figure 2. Example of direction search for an 8 × 8 block. The patterns III. CONDITIONAL REPLACEMENT FILTER shown are based on the µd;k values. In this case, the 45-degree direction 2 The conditional replacement filter is designed to remove is the one that minimizes σd, so it would be selected by the search. Note that the error values σd shown are never computed in practice (only sd is). noise without blurring sharp edges. A low-pass finite impulse response (FIR) filter with (2M + 1) taps in one dimension can be expressed as P (for example, in Fig.1, N = 2 and N = 8). The k=M d;k 1;0 1;4 1 X SSD is then defined as: y (n) = w x (n + k) ; (6) W k 2 3 k=−M 2 X X 2 M σd = (xp − µd;k) : (2) P 4 5 where W = k=−M wk. Although it removes noise from k2block;d p2Pd;k a signal, it also blurs any details and sharp edges, which Expanding (2) and then substituting (1) into it, we get is undesirable. Bilateral filters avoid the blurring effect by making each weights w dependent on the difference signal 2 3 k x (n + k) − x (n), typically using a Gaussian function. One 2 X X 2 X X 2 σd = 4 xp − 2 xpµd;k + µd;k5 disadvantage of this approach is that it requires keeping track k2block;d p2Pd;k p2Pd;k p2Pd;k of the sum of the weights for each sample n being filtered. 2 3 1 It also results in having a different W normalization factor X X 2 X 2 = 4 xp − 2µd;k xp + Nd;kµd;k5 for each pixel, making vectorization harder. k2block;d p2Pd;k p2Pd;k Rather than change the wk values, the conditional replace- 2 0 123 ment filter changes the signal x (n + k) used in the filter. X X 2 1 X When filtering sample n, any of the x (n + k) tap inputs that = 6 xp − @ xpA 7 4 Nd;k 5 differs from x (n) by more than a threshold T , is replaced k2block;d p2P p2P d;k d;k by x (n) in the calculation: 0 12 k=M X 2 X 1 X 1 X = xp − @ xpA : (3) y (n) = wkR (x (n) ; x (n + k) ;T ) ; (7) Nd;k W p2block k2block;d p2Pd;k k=−M Note that the simplifications leading to (3) are the same with 2 as to those allowing a variance to be computed as σx = ( P 2 P 2 xk ; jxk − x0j < T x − ( x) =N. Considering that the first term of (3) R (x0; xk;T ) = : (8) is constant with respect to d, we simply find the optimal x0 ; otherwise direction dopt by maximizing the second term: The filter computation is illustrated in Fig.3 and an example is shown in Fig.4. dopt = max sd ; (4) d Through algebraic simplifications, we can express the where filter (7) in terms of the differences x (n + k)−x (n), which 0 12 yields X 1 X sd = @ xpA : (5) k=M Nd;k 1 X k2block;d p2Pd;k y (n) = x (n)+ w f (x (n + k) − x (n) ;T ) ; W k k=−M;k6=0 We can avoid the division in (5) by multiplying sd by (9) 840, the least common multiple of the possible Nd;k values with the threshold function (1 ≤ N ≤ 8). When using 8-bit pixel data (for higher d;k d ; jdj < T bit depths, we downscale to 8 bits), and centering the values f (d; T ) = : (10) 0 ; otherwise such that −128 ≤ xp ≤ 127, then 840sd and all calculations leading to that value fit in a 32-bit signed integer. The advantage of this formulation is that the normalization 1 Fig.2 shows an example of a direction search for an 8×8 by W can be approximated without causing any bias, even block containing a line. when W is not a power of two. Also, because W does 3 50 50 80 45 45 23 24 40 40 25 Second filter 35 35 22 8 Filtered pixel 30 30 26 25 25 Effective spatial support 20 20 0 10 20 30 40 50 0 10 20 30 40 50 sample sample 50 50 Figure 5. Filtering along the direction (left) and filtering across the direction (right). 45 45 40 40 Table II 35 35 SECOND STAGE FILTER PARAMETERS 30 30 Index Direction dx dy 25 25 0 $ 1 0 1 0 1 20 20 $ 0 10 20 30 40 50 0 10 20 30 40 50 2 0 1 sample sample $ 3 0 1 $$ 1 0 Figure 4.

The Daala Directional Deringing Filter Jean-Marc Valin, Member, IEEE

Network Working Group A

Automatic Detection of Perceived Ringing Regions in Compressed Images

On Audio-Visual File Formats

Amphion Video Codecs in an AI World

DVP Tutorial

Opus, a Free, High-Quality Speech and Audio Codec

High Efficiency, Moderate Complexity Video Codec Using Only RF IPR

Encoding H.264 Video for Streaming and Progressive Download

The Daala Video Codec Project Next-Next Generation Video

IP-Soc Shanghai 2017 ALLEGRO Presentation FINAL

PVQ Applied Outside of Daala (IETF 97 Draft)

Efficient Multi-Codec Support for OTT Services: HEVC/H.265 And/Or AV1?