Estimating a Small Signal in the Presence of Large Noise

Amy Zhao    Frédo Durand    John Guttag
[email protected]  [email protected]  [email protected]

MIT Computer Science and Artificial Intelligence Laboratory

Abstract

Video magnification techniques are useful for visualizing small changes in videos. For instance, Eulerian video magnification has been used to visualize the flow of blood in the human face. Such visualizations have possible applications in remote monitoring or screening for diseases. However, when visualizing blood flow, the signal of interest may be similar in amplitude to the noise in the video. This raises the question of what one is actually seeing in a magnified video: signal or noise? We seek to understand these signal and noise characteristics with the goal of producing informative and accurate visualizations. We present a preliminary algorithm for estimating the signal amplitude in the presence of relatively high noise. We demonstrate that the algorithm can be used to accurately estimate the signal amplitude in an uncompressed simulated video, but is susceptible to compression noise and motion.

1. Introduction

Video magnification can be used to visualize minute changes in videos that would normally not be visible. Studies have indicated that there is a signal associated with blood flow in the skin; this signal is undetectable by the naked eye, but can be measured using video processing techniques and visualized using video magnification. Signals relating to blood flow (e.g., heart rate) have been obtained from videos of human faces by applying temporal and spatial filtering [25] or blind source separation [18] to the intensity signal, or by tracking the motion of the head [4]. Techniques for examining these signals have potential applications in remote monitoring or screening for diseases relating to blood flow.

Wu et al. showed that Eulerian linear video magnification can be used to visualize the blood flow signal in human faces [28]. In the context of visualizing blood flow, the signal of interest (that is, the intensity changes reflecting the flow of blood) can have an amplitude that is similar to or lower than the noise level in the video. It is important to understand the signal and noise characteristics in order to produce accurate and informative visualizations. In this work, we discuss how the signal is affected by the noise that is introduced by the acquisition and the compression processes. We demonstrate a preliminary algorithm for estimating the amplitude of a small signal in the presence of relatively large noise.

2. Related work

Noise. Modern images and image sequences are typically affected by noise introduced by the acquisition process, the compression process, or other factors such as motion [15]. CCD noise models usually assume that the acquisition noise level is dependent on the uncorrupted pixel intensity, but is independent between pixels. The relationship between the noise level and pixel intensity is described by a noise level function (NLF), which is dependent on the natural behavior of photons, the properties of the camera and some recording parameters [14, 12].

Compression noise can introduce additional artifacts to an image or video. Many modern image and video compression standards apply a transform (commonly the discrete cosine transform, or DCT) in a block-wise fashion to the image and then quantize the transform coefficients [26, 27]. Images and videos compressed in this fashion typically exhibit "blocking" artifacts.

Noise and signal estimation. There is a large body of literature regarding noise estimation and denoising techniques for images and videos. The NLF of a CCD camera is easy to estimate from a video or multiple images of a static scene [8, 2, 12]. In the image processing literature, the signal of interest is typically defined as the uncorrupted image; denoising aims to recover this image from a noisy image. A large class of image denoising techniques rely on the concept of averaging to reduce noise levels, e.g., through Gaussian smoothing or anisotropic filtering [17]. There are also deblocking techniques that remove visible

block-based compression artifacts by detecting and smoothing block boundaries [23, 22, 3].

Several key differences make it difficult to directly apply any of these noise estimation and denoising techniques to video magnification. Firstly, most existing denoising algorithms focus on applications with relatively high input SNRs, and may not be accurate enough to estimate a low amplitude signal. Typically, video denoising algorithms are evaluated on image sequences with peak signal-to-noise ratio (PSNR) values in the 10-30 dB range, and obtain improvements in PSNR on the order of 0-10 dB [6, 19]. In contrast, in order to visualize blood flow in the human face using Eulerian video magnification, a magnification factor of 100 has been reported [28]; this suggests that the amplitude of the color variation in the input video is extremely low relative to the image intensity, and that the SNR is frequently much lower than 10 dB.

Secondly, video magnification is affected by noise that may be hard to model and/or correlated with the signal. For instance, when visualizing blood flow, the color change signal is commonly corrupted by a motion signal with a similar temporal signature. Many state-of-the-art video denoising techniques are capable of removing the types of noise commonly introduced by imaging devices (e.g., additive, multiplicative or structural noise [5, 13]), but may not be applicable to more complex noise models.

3. Signal Estimation for Video Magnification

We discuss the signal and noise in the context of visualizing blood flow using the Eulerian linear video magnification technique described in [28]. Specifically, we focus on the problem of estimating the amplitude of a color change signal in a video that has been corrupted by acquisition noise of a similar amplitude, as well as compression noise introduced by the Motion JPEG (MJPEG) compression process. The MJPEG scheme is less commonly used than other video compression schemes such as H.264/AVC or MPEG-4. However, we choose to study it because it compresses each frame independently according to the JPEG compression scheme, and thus does not introduce any inter-frame correlations.

In [25], the green channel was observed to have the strongest signal pertaining to blood flow. We focus on the Y or luminance channel, as it contains much of the green channel information, and is not subjected to the subsampling process that is applied to the chroma channels during MJPEG encoding [26].

3.1. Assumptions

We assume that the input video depicts a stationary scene containing a small color change signal. We assume that the signal: (1) is smooth and varies slowly in space, so that we can apply a small spatial averaging filter without significantly altering the signal characteristics; and (2) has a small enough amplitude as to not affect the intensity-dependent noise level. Earlier, we discussed the difficulty of modeling temporally correlated noise terms such as motion. We assume for now that the subject of the video has been mechanically stabilized such that the effect of motion is negligible.

In the image and video denoising literatures, the acquisition noise is commonly assumed to be additive zero-mean with a variance that is proportional to the luminance of the scene [24]. We assume in our model that each video frame is affected by such acquisition noise, as well as by JPEG compression.

We further assume that the video scene is smooth, so that pixels within small neighborhoods exhibit similar noise levels. We can enforce this assumption in real videos by excluding parts of the scene that contain high gradients.

3.2. Signal and Noise Model

We model an uncompressed video frame as follows:

    I = I_0 + φ + n_acq,    (1)

where I_0 denotes the mean observed image intensity, φ the signal of interest and n_acq the zero-mean Gaussian noise that is introduced during the acquisition process. These terms are assumed to be independent. Similarly, we model a compressed video frame as:

    I_c = Q{I_0 + φ + n_acq},    (2)

where Q{·} represents the quantization operator that is applied during the JPEG compression process [26].

In JPEG encoding, the image is first divided into non-overlapping 8×8 blocks. The discrete cosine transform (DCT) of each block is computed according to the formula:

    H(k,l) = (1/4) C(k) C(l) Σ_{x=0}^{7} Σ_{y=0}^{7} I(x,y) cos[(2x+1)kπ/16] cos[(2y+1)lπ/16],

    where C(k), C(l) = 1/√2 for k, l = 0,
          C(k), C(l) = 1 otherwise.

H(k,l) describes the DCT coefficient value at location k,l. Each DCT coefficient is then quantized using an application-defined quantization table Q as follows:

    H_Q(k,l) = Q(k,l) · round(H(k,l) / Q(k,l)).

The quantization operation is lossy and may be modeled differently depending on the image content. DCT coefficients that are smaller than the corresponding quantization step are quantized to zero and result in a reduction in image energy, while larger DCT coefficients contribute noise that may be modeled as additive random noise [9, 21].

We assumed earlier that the video scene and the signal are smooth in space. Under the additional assumption that the noise level is small relative to the quantization step size, each frame contains little energy in the DCT components corresponding to high spatial frequencies. In the JPEG standard, the luminance quantization table applies more aggressive quantization to high-frequency DCT components [1]. This means that high-frequency DCT components are typically quantized to 0, while the low frequency components contribute quantization noise. We observed this effect in simulated videos compressed using moderate compression levels. We model this as follows:

    I_c = Q_L{I_0 + φ + n_acq} + n_Q,

where Q_L{·} represents the spatial lowpass filtering effect of applying quantization to small DCT coefficients, and n_Q represents the quantization noise that is added. Under the assumption that the mean image I_0 and the signal φ contain little energy in the high-frequency DCT components, we may rewrite this as:

    I_c = I_0 + φ + n_Q + Q_L{n_acq}.    (3)

3.3. Signal Estimation Algorithm

Local spatial averaging is used in many image denoising algorithms to reduce noise levels [7]. We present an algorithm that examines the effect of spatial averaging on the acquisition noise variance in a video, and extrapolates this relationship to estimate the theoretical variance of the video when the acquisition noise variance has been reduced to zero; the variance of the video should then be comprised of the signal variance.

Let us first consider the case of an uncompressed video. Under the assumption that the signal varies slowly in space, we may apply a spatial averaging filter to reduce the standard deviation of Gaussian noise without significantly affecting the signal. An averaging filter (or box filter) M of size m × m attenuates the variance of zero-mean Gaussian noise by a factor of 1/m² [16]. From Eq. 1, we may write the variance over time of the filtered frame I_m as:

    Var_t[I_m] = Var_t[M ∗ (I_0 + φ + n_acq)], or

    σ²_{I_m} = σ²_φ + σ²_{n_acq} / m².    (4)

This relationship is valid under the assumption that each of the terms I_0, φ, n_acq are independent, and that n_acq is spatially uncorrelated.

In our algorithm, we choose several box filter sizes m_i and measure the variance σ²_{I_m} for each m_i. We then extrapolate a linear fit of σ²_{I_m} against 1/m² to the point at which σ²_{n_acq}/m² = 0 and σ²_{I_m} = σ²_φ; the intercept of the linear fit is an estimate of the signal variance.

The compression process introduces complications that require some adjustments to the above algorithm. As we discussed earlier, Q_L{·} sets the high-frequency DCT coefficients of n_acq to 0. Applying small spatial filters has little effect on the variance of this compressed noise term, since the high spatial frequency energy that would have been removed by the averaging filter is already zero; we confirmed this effect in simulations. Using a larger filter size affects the energy in the remaining spatial frequencies. In simulations, we found that for m ≥ 8,

    Var[M ∗ (φ + n_Q + Q_L{n_acq})] ≈ Var[M ∗ (φ + n_Q + n_acq)].

The resulting variance over time of the filtered image, σ²_{I_c,m}, is approximated by:

    σ²_{I_c,m} = σ²_φ + (σ²_{n_Q} + σ²_{n_acq}) / m².    (5)

This model assumes that the effect of Q_L{·} can be ignored for m ≥ 8, that n_acq and n_Q are spatially uncorrelated, and that φ, n_acq and n_Q are independent. We discuss these limitations later.

Similarly to the uncompressed case, we can apply several filters of varying sizes, and estimate a linear fit of σ²_{I_c,m} against 1/m². The intercept of the fit can be used as an estimate for σ²_φ.

4. Experimental Results

We demonstrate that we can use our algorithm to estimate the variance of a simple sinusoidal signal in the presence of acquisition noise. We first simulated two videos of a stationary color calibration target. One video contains no signal (Fig. 1b); the other has a small sinusoidal signal added to each pixel (Fig. 1c). We added zero-mean Gaussian acquisition noise with a linear NLF to both videos (Fig. 1a). The videos were then compressed using various levels of compression. We are able to estimate the signal variance fairly accurately from the uncompressed videos, as well as from the videos compressed with a high quality factor (QF; a higher value indicates less aggressive compression). Our predicted signal suffers from a large degree of error when a lower compression quality setting is used. Fig. 1c indicates that even for a high compression quality video, our estimate becomes inaccurate for pixels with high noise levels.
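The extrapolation procedure of Section 3.3 can be sketched in Python for the uncompressed case of Eq. 4. The simulation below is a simplified stand-in for this setup, not our actual implementation: a flat scene with a spatially uniform sinusoidal signal and i.i.d. zero-mean Gaussian acquisition noise (a constant noise level rather than a full NLF); the `box_filter` helper and all parameter values are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic video: T frames of a flat H x W scene, a spatially uniform
# sinusoidal signal phi(t), and i.i.d. zero-mean Gaussian acquisition noise.
T, H, W = 300, 64, 64
sigma_phi = 0.01                 # RMS amplitude of the signal
sigma_acq = 0.05                 # acquisition noise std (noise >> signal)
t = np.arange(T)
phi = np.sqrt(2) * sigma_phi * np.sin(2 * np.pi * t / 30)  # RMS = sigma_phi
video = 0.5 + phi[:, None, None] + rng.normal(0.0, sigma_acq, (T, H, W))

def box_filter(frames, m):
    """m x m spatial box filter (valid region only) via 2-D cumulative sums."""
    c = np.cumsum(np.cumsum(frames, axis=1), axis=2)
    c = np.pad(c, ((0, 0), (1, 0), (1, 0)))
    s = c[:, m:, m:] - c[:, :-m, m:] - c[:, m:, :-m] + c[:, :-m, :-m]
    return s / m**2

# Temporal variance of the filtered video, averaged over pixels; by Eq. 4
# this should be close to sigma_phi^2 + sigma_acq^2 / m^2 for each size m.
ms = np.array([1, 2, 3, 4, 5])
var_m = [box_filter(video, m).var(axis=0).mean() for m in ms]

# Linear fit of variance against 1/m^2: the intercept estimates sigma_phi^2
# and the slope estimates sigma_acq^2.
slope, intercept = np.polyfit(1.0 / ms**2, var_m, 1)
print(f"true signal variance     : {sigma_phi**2:.2e}")
print(f"estimated signal variance: {intercept:.2e}")
```

Even though the per-pixel SNR here is far below 0 dB, the intercept recovers σ²_φ = 1 × 10⁻⁴ closely. The compressed case follows the same loop using Eq. 5, subject to the m ≥ 8 caveat discussed above.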

Figure 1: Estimated signal variance from a simulated color calibration target. To assess the effects of compression, we saved the videos using MJPEG encoding and specified the video quality factor (QF). (a) Simulated video frame (top) and simulated NLF (bottom). (b) Mean (over space) estimated signal variance per mean (over time) luminance, from simulated videos containing no signal. (c) Mean (over space) estimated signal variance per mean (over time) luminance, from simulated videos containing a spatially invariant, temporally varying sine wave with RMS = 0.01.

We applied our signal estimation algorithm to several real videos captured with a Panasonic Lumix DMC-G2 in

MJPEG format. The frames were converted to linear RGB space and then to YCbCr space. Fig. 2 shows our results for a stationary color calibration target, which should contain no signal. Our signal estimates in Fig. 2b contain a similar error to the compression noise observed in our simulation results (although the error in our simulations appears more discontinuous, likely due to the more discretized pixel intensity values in the simulation). We use the maximum of this error to conservatively estimate the noise floor of our signal estimates in the subsequent videos. We took videos of a human hand and human feet under the same lighting conditions as in Fig. 2. We used a noise floor estimate of 1 × 10⁻⁵. Fig. 3 shows that the regions of the hand supplied most directly by arterial flow [11] exhibit a signal with an amplitude that is above the noise floor. The feet (Fig. 4) appear to exhibit a strong signal everywhere; however, when processing this video using Eulerian linear video magnification, it was difficult to visualize a signal at the subject's heart rate. It is likely that this high signal estimate is actually caused by motion rather than blood flow in the feet, as some motion can be observed in the video. In both videos, the signal estimate appears to be correlated with the mean luminance (Figs. 3b and 4b); this indicates that our estimate may be affected by the motion of specularities in the videos.

5. Limitations

Compression noise. Figs. 1 and 2 indicate that the accuracy of our signal estimates is affected by the degree of compression (and thus quantization) applied to the video. We roughly estimated the quantization noise level from a video of a static scene with no signal, using a simplistic model of the compression process (Equations 3 and 5). Our model does not account for the spatial correlations that are introduced by the compression process, or the dependence of the quantization noise on the image content [21]. The effect of compression might be described more accurately using existing models. Quantization noise in the DCT domain has been modeled as uniformly distributed; it has also been modeled based on a Laplacian model for DCT coefficients [21]. Some authors have proposed the use of different distributions depending on the quantized image content [9, 21]. Further studies on the effect of quantization noise should also be conducted by capturing uncompressed real videos.

Motion. Motion is often observed to be a confounding factor when attempting to visualize blood flow using video magnification. Because it often has a similar temporal signature to the blood flow signal, it cannot be removed from the video via simple temporal filtering. Figs. 3 and 4 imply that our signal estimate may have been corrupted by the motion of specularities in the video. It may be possible to physically stabilize the video subject using mechanical restraints. Alternatively, future work can examine the effect of motion on our model and our estimates. For instance, it may be possible to use motion estimation techniques such as optical flow to estimate the amount of variance in each pixel that is caused by motion.

Camera processes. There are camera processes other than compression that may affect the accuracy of our model. For example, many demosaicing strategies perform spatial interpolation [20] that introduces spatial correlations. Modeling these correlations requires the consideration of covariance terms in Equations 4 and 5. The quantization process that occurs when the image is exported from the camera

Figure 2: Estimated signal variance for a stationary color calibration target. The upper right region of the card was excluded from the visualization as it appeared to have extremely high noise levels that interfere with the visualization scale. (a) Estimated signal variance per pixel. (b) Mean (over space) estimated signal variance per mean (over time) luminance.

Figure 3: A healthy female human hand. Video was taken under similar conditions to Fig. 2. (a) Estimated signal variance (left) and estimated signal that is above the estimated noise floor of 1 × 10⁻⁵ (right). (b) Mean (over space) estimated signal variance per mean (over time) luminance.

(typically to 8-16 bits per channel) introduces additional quantization noise [10]. The accuracy of our model and estimates can likely be improved by accounting for these effects.

6. Conclusion

Video magnification can be used to produce informative visualizations of human blood flow from consumer-grade videos. In these videos, the signal of interest may be very small compared to the noise level. We presented a model and an algorithm for estimating the signal in this context. We showed that our algorithm can be used to estimate the signal variance in a simulated video that is corrupted by zero-mean Gaussian noise and compressed at a high quality factor. However, we also demonstrated that our estimates are highly affected by aggressive compression as well as motion. Further work is required to assess the performance of our algorithm in the presence of compression and motion.

Figure 4: A healthy pair of male human feet. Video was taken under similar conditions to Fig. 2. (a) Estimated signal variance (left) and estimated signal that is above the estimated noise floor of 1 × 10⁻⁵ (right). (b) Mean (over space) estimated signal variance per mean (over time) luminance.

References

[1] Digital compression and coding of continuous-tone still images, part 1, requirements and guidelines. ISO/IEC JTC1 Draft International Standard 10918-1, 1991.
[2] A. Amer and E. Dubois. Fast and reliable structure-oriented video noise estimation. Circuits and Systems for Video Technology, IEEE Transactions on, 15(1):113-118, 2005.
[3] A. Z. Averbuch, A. Schclar, and D. L. Donoho. Deblocking of block-transform compressed images using weighted sums of symmetrically aligned pixels. Image Processing, IEEE Transactions on, 14(2):200-212, 2005.
[4] G. Balakrishnan, F. Durand, and J. Guttag. Detecting pulse from head motions in video. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3430-3437. IEEE, 2013.
[5] J. E. Bioucas-Dias and M. A. T. Figueiredo. Multiplicative noise removal using variable splitting and constrained optimization. Image Processing, IEEE Transactions on, 19(7):1720-1730, 2010.
[6] J. C. Brailean, R. P. Kleihorst, S. Efstratiadis, A. K. Katsaggelos, and R. L. Lagendijk. Noise reduction filters for dynamic image sequences: a review. Proceedings of the IEEE, 83(9):1272-1292, 1995.
[7] A. Buades, B. Coll, and J.-M. Morel. A non-local algorithm for image denoising. In Computer Vision and Pattern Recognition (CVPR), 2005 IEEE Computer Society Conference on, volume 2, pages 60-65. IEEE, 2005.
[8] G. E. Healey and R. Kondepudy. Radiometric CCD camera calibration and noise estimation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 16(3):267-276, 1994.
[9] A. Ichigaya, M. Kurozumi, N. Hara, Y. Nishida, and E. Nakasu. A method of estimating coding PSNR using quantized DCT coefficients. Circuits and Systems for Video Technology, IEEE Transactions on, 16(2):251-259, 2006.
[10] K. Irie, A. E. McKinnon, K. Unsworth, and I. M. Woodhead. A technique for evaluation of CCD video-camera noise. Circuits and Systems for Video Technology, IEEE Transactions on, 18(2):280-284, 2008.
[11] D. B. Jenkins. Hollinshead's Functional Anatomy of the Limbs and Back. Elsevier Health Sciences, 2008.
[12] M. Kobayashi, T. Okabe, and Y. Sato. Detecting forgery from static-scene video based on inconsistency in noise level functions. Information Forensics and Security, IEEE Transactions on, 5(4):883-892, 2010.
[13] C. Liu and W. T. Freeman. A high-quality video denoising algorithm based on reliable motion estimation. In Computer Vision - ECCV 2010, pages 706-719. Springer, 2010.
[14] C. Liu, W. T. Freeman, R. Szeliski, and S. B. Kang. Noise estimation from a single image. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 1, pages 901-908. IEEE, 2006.
[15] M. Maggioni, G. Boracchi, A. Foi, and K. Egiazarian. Video denoising, deblocking, and enhancement through separable 4-D nonlocal spatiotemporal transforms. Image Processing, IEEE Transactions on, 21(9):3952-3966, 2012.
[16] P. M. Narendra. A separable median filter for image noise smoothing. Pattern Analysis and Machine Intelligence, IEEE Transactions on, (1):20-29, 1981.
[17] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 12(7):629-639, 1990.
[18] M.-Z. Poh, D. J. McDuff, and R. W. Picard. Non-contact, automated cardiac pulse measurements using video imaging and blind source separation. Optics Express, 18(10):10762-10774, 2010.
[19] M. Protter and M. Elad. Image sequence denoising via sparse and redundant representations. Image Processing, IEEE Transactions on, 18(1):27-35, 2009.
[20] R. Ramanath, W. E. Snyder, G. L. Bilbro, and W. A. Sander. Demosaicking methods for Bayer color arrays. Journal of Electronic Imaging, 11(3):306-315, 2002.
[21] M. A. Robertson and R. L. Stevenson. DCT quantization noise in compressed images. Circuits and Systems for Video Technology, IEEE Transactions on, 15(1):27-38, 2005.
[22] S.-Y. Shih, C.-R. Chang, and Y.-L. Lin. A near optimal deblocking filter for H.264 advanced video coding. In Design Automation, 2006. Asia and South Pacific Conference on, 6 pp. IEEE, 2006.
[23] S.-C. Tai, Y.-Y. Chen, and S.-F. Sheu. Deblocking filter for low bit rate MPEG-4 video. Circuits and Systems for Video Technology, IEEE Transactions on, 15(6):733-741, 2005.
[24] Y. Tsin, V. Ramesh, and T. Kanade. Statistical calibration of CCD imaging process. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, volume 1, pages 480-487. IEEE, 2001.
[25] W. Verkruysse, L. O. Svaasand, and J. S. Nelson. Remote plethysmographic imaging using ambient light. Optics Express, 16(26):21434-21445, 2008.
[26] G. K. Wallace. The JPEG still picture compression standard. Communications of the ACM, 34(4):30-44, 1991.
[27] T. Wiegand, G. J. Sullivan, G. Bjøntegaard, and A. Luthra. Overview of the H.264/AVC video coding standard. Circuits and Systems for Video Technology, IEEE Transactions on, 13(7):560-576, 2003.
[28] H.-Y. Wu, M. Rubinstein, E. Shih, J. V. Guttag, F. Durand, and W. T. Freeman. Eulerian video magnification for revealing subtle changes in the world. ACM Trans. Graph., 31(4):65, 2012.
