VIDEO RECORDINGS LOCALIZATION BASED ON ELECTRIC NETWORK FREQUENCY VARIATION

Vlad-Dragos Darau

Bachelor's Thesis (Trabajo de Fin de Grado)
Escuela de Ingeniería de Telecomunicación
Grado en Ingeniería de Tecnologías de Telecomunicación

Tutor: Fernando Pérez González


Vlad-Dragoș Dărău
Director: Fernando Pérez González
Academic Course 2015/2016

20th of June, 2016

Contents

1 Introduction
  1.1 The Signal Processing in Communications Group
  1.2 Fingerprinting: a Branch of Watermarking
  1.3 ENF - What Is It?
  1.4 ENF - Properties And How The Electric Networks Work
  1.5 ENF - Relation with Generators
  1.6 ENF - How to Extract It?

2 State Of The Art
  2.1 Audio Authentication/Tampering
  2.2 Location Stamping
  2.3 Video Synchronization and Historical Alignment
  2.4 Time Stamping

3 Measuring the ENF - Circuits
  3.1 Analog Option
  3.2 Digital Option
    3.2.1 Acquisition Part
  3.3 Processing Part
  3.4 Other Circuits for Sensing The Mains Hum

4 Video as an ENF Carrier
  4.1 Color Spaces and Video Standards
  4.2 Aliasing Effect Consequences

5 Experiments
  5.1 Category I - ENF Spatial Localization
    5.1.1 ROC Curves and Signal Features
    5.1.2 Experiment I - Frequency Classification
    5.1.3 Experiment II - Range Classification
  5.2 Category II - ENF Time Localization
    5.2.1 Experiment I - Simulated Delay
    5.2.2 Experiment II - Simulated Time Localization

6 Conclusions and Future Work

Appendices

Appendix A
  A.1 Circuit Schematic and BOM
  A.2 Matlab Functions and Scripts
  A.3 C Code for PIC Microcontroller
  A.4 Luminosity Extraction Algorithm

Chapter 1

Introduction

Nowadays, the continuous growth of digital data is driven by the numerous web sites and platforms which encourage each user to be not only a consumer, but also a producer of data. More specifically, anyone can now have their own channel or web page on any video or social platform, where they can promote or simply talk about anything by means of media such as video or audio recordings. One of the problems that arises from this fact is captured by one characteristic of digital data: veracity. The degree to which we can trust a data set is very important, especially when we are trying to prove something based on it. In fact, digital media security is, and probably always will be, a topic of interest in the forensic world. Moreover, the future development of networked multimedia systems, in particular on open networks like the Internet, is and will be conditioned by the development of efficient methods to protect data owners against unauthorized copying and redistribution of material. This thesis is organized in the following manner: first, the concept of fingerprinting and, more specifically, of the ENF as a fingerprint is defined. In the second part, a state-of-the-art overview is presented. In the third part, the circuits and the Matlab code, followed by numerous experiments, are described. Finally, a section of conclusions, followed by the appendices, concludes the document.

1.1 The Signal Processing in Communications Group

This diploma project has been carried out with the support of the Signal Processing in Communications Group (GPSC), GRADIANT and the Hardware in Communications Laboratory, which are part of the Universidad de Vigo. The Signal Processing in Communications Group (GPSC) has been consistently developing two parallel research lines which make extensive use of advanced signal processing tools. On the one hand, they have experience in the application of such techniques to communication systems and sensor networks, e.g., in synchronization, equalization, link monitoring and adaptation, satellite and mmWave communications, spectrum sensing, etc. On the

other hand, the group has been active in the field of multimedia security, including digital watermarking, digital forensics, and signal processing in the encrypted domain. The proficiency of the group is attested by the results obtained in these areas: in the last 10 years, GPSC members have directed 9 PhD theses, published over 70 papers in international journals, and secured 9 M€ in funding from public (including European projects) as well as private sources (with more than 10 patents).

1.2 Fingerprinting: a Branch of Watermarking

Watermarking is the process of protecting anything, from a physical object to any piece of virtual information, against the threat of piracy, thus ensuring intellectual property protection; it has been used since the year 1272 for banknote authentication. There is an increasing need for software (or, in the worst case, hardware) that allows for the protection of ownership rights, and it is in this context where watermarking techniques come to our help. Perceptible marks of ownership or authenticity have been around for centuries in the form of stamps, seals, signatures or classical watermarks; nevertheless, given current data manipulation technologies, imperceptible digital watermarks are mandatory in most applications. A digital fingerprint, on the other hand, is a signal which uniquely identifies a recipient, used mainly to limit unauthorized redistribution of multimedia content. The main difference between a fingerprint and a watermark is that the former analyses the content, identifying a unique set of inherent properties, rather than adding information to the content, as the latter does.

1.3 ENF - What Is It?

The variation of the ENF is a robust, natural fingerprint which is present in audio and video recordings under some conditions. The ENF, which stands for Electric Network Frequency, is nothing but the supply frequency of the electrical power grid, which has a nominal value of 60 Hz in the USA and 50 Hz in most other parts of the world. It is popularly known as the electrical "hum" and it has been proven to be picked up by audio or video recorders which are plugged in or located near power sources [4]. The deviations from the nominal value have been proven to be connected with the local amount of load in the network, which changes over time and geographical location, and with the switching operations that are part of the dynamic management of an electric grid. The voltage frequency of an electrical power grid can fluctuate at any time under the influence of variations in any of the three components of the electrical network: the load, which is subject to various on and off switching according to specific operational conditions; the generators, which follow or anticipate the load variations in order to maintain an equilibrium between generation and utilization; and the electrical grid (mains), whose topology varies whenever switching maneuvers are performed, and which may be affected by faults [1].

1.4 ENF - Properties And How The Electric Networks Work

Although the electrical network frequency (ENF) should have an exact value of 50 or 60 Hz, depending on the region and on the time of the day, it ultimately proves to be a function of several events, as stated in Section 1.3. The supply side of the electrical network is mainly formed by devices called alternators, capable of transforming mechanical energy into electrical energy. Because of the electro-mechanical properties that they exhibit, there are features that relate their way of functioning to the frequency of the generated voltage, which ultimately determines the frequency of the electric network to which each generator contributes. The nominal frequency value of a power grid needs to be strictly respected, as some devices, like motors which spin according to this value or digital clocks which synchronize themselves based on the ENF, depend directly on it. In order to obtain a value that is stable over time, the different (national or regional) networks are interconnected, creating vast synchronization areas, like those presented in Figure 1.1. Although the term "synchronized" is used, this does not mean that two remote locations, both part of the same synchronization area, will present the same variation of the ENF in time, as they cannot be controlled to that extent. Instead, the synchronization process is based on the number of zero-crossings of the voltage waveform: for every given period, for example 24 hours, the number of zero crossings of the voltage signal within one synchronization region has to be the same. This keeps clocks which use the ENF to synchronize at zero global error: although there might be differences between them within the specified period of time, at the end of the, say, 24 hours, they will show the same time.

1.5 ENF - Relation with Generators

Any generator, or more specifically alternator, is composed of two parts: a static part (the stator) and a mobile part (the rotor). Depending on the number of poles that it has, in order to generate a voltage of a desired frequency, the generator must perform a certain number of rotations per minute. In the most common case (except for nuclear plants), the generator has 2 poles, so according to the formula

N = 120 × f / p,

one generator must spin at 3000 rotations per minute in order to generate a 50 Hz voltage. In the case of nuclear plants, the generators usually have 4 poles and, as a consequence, the number of rotations performed by such a generator in order to produce the same voltage frequency as a bipolar one has to be half, that is, 1500 rpm in the case of 50 Hz networks.

Figure 1.1: Synchronization regions by colors in Eurasia [20]
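As a quick check of the relation above, the two cases described in the text work out as:

```latex
N = \frac{120\,f}{p}
\quad\Rightarrow\quad
N_{p=2} = \frac{120 \times 50}{2} = 3000\ \mathrm{rpm},
\qquad
N_{p=4} = \frac{120 \times 50}{4} = 1500\ \mathrm{rpm}.
```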

Figure 1.2: Generation of one cycle of AC voltage [5]

Now, if we also take into account the fact that the produced energy must leave the generator, since its main purpose is to be a source of energy, the total mechanical power at the input of the generator must be equal to the electrical power at the output, discarding the system losses, as shown in Figure 1.4. If the energy demanded by the loads increases, the only source of extra energy is the kinetic energy stored in the rotation of the machine. Thus, in the absence of an increment of mechanical energy, the number of rotations per minute will decrease, causing a temporary decrease in the frequency of the generated voltage. In order to maintain a constant frequency of the generated voltage inside the electrical grid, a mechanism that controls the rotation of the motor is employed. In large electrical grids, which are our main case of interest, droop control is employed. In the droop control mode, the revolution speed of the alternator does not directly set the frequency of the generated voltage. Instead, a regulator changes its operating point in order to make the generator admit more mechanical energy, which in turn changes the frequency of the signal, as represented in Figure 1.3. If the required amount of energy is 50% of the generator's capacity, then the frequency response function of the regulator should have its operating point at the middle of the f0 axis. If an amount of 100% were required, it would be necessary to move the function upwards, in order to intersect the f0 axis at its end [5].

Figure 1.3: Evolution of the frequency in time and operation point of the regulator [5]

Figure 1.4: Energy conservation law in the case of electricity generator

The ENF can also be considered an indicator of the balance between demand and generation of energy. An electric system does not have the capability of storing the electrical energy necessary to cover sudden changes in loads. Thus, the generation must match the demand as closely as possible. Sometimes, a sudden change in load values can be anticipated in time and extra generators can be connected just before those changes are about to occur. In the distributed electrical systems that are used nowadays, the load distribution is made over large geographical areas, controlling which generators are connected to the system and which are not. This is done through a telecontrol system, partially automated, which can transmit control signals through the electrical network itself or through a separate system of wires. Either voltage or frequency can be employed for signaling purposes [5]. Such systems are able to respond considerably fast to changes in the frequency of the voltage caused by sudden changes in energy demand, and they have a common set of characteristics, as we will see later.

Figure 1.5: An example of ENF signal along time, with sudden changes and reactions of the grid

The variation of the ENF is known to be highly useful for various applications in forensics. In this thesis, we will demonstrate some of them, such as media authentication, fingerprinting, and the localization of a recording in space and time. The ENF is also used for binding audio with video signals, or for the chronological alignment of signals.

1.6 ENF - How to extract it?

First, the ENF signal must be recorded directly from the power lines or through an ENF carrier, such as light bulbs or loudspeakers. According to Nyquist's sampling theorem, if we want to reconstruct an analog signal of frequency fa Hz, we must sample it at a rate of at least 2 × fa Hz. Because higher order harmonics up to 250 Hz are useful for ENF analysis, as we shall see later in this thesis, we would need to sample the signal at 500 Hz or more. In order not to be too close to the maximum frequency that we shall use, we set a sampling frequency of 1000-1200 Hz, which allows the successful reconstruction of a 500-600 Hz signal. If we want to feed the input signal (after proper conditioning) to the microphone input of an audio card, we must take into account the maximum voltage that it can safely accept and the supported sampling frequencies. A decimation or downsampling of the recorded signal should be done at this point, after low-pass filtering, in order to avoid large and useless storage and processing costs. If the signal is collected directly at a sample rate of 1200 Hz, as some audio cards allow, these steps can be skipped.

After the successful collection of the raw sinusoidal signal, the next step is the extraction of the variation of its frequency in time. The extraction of the ENF can be approached in two ways: with parametric or non-parametric methods. A non-parametric method does not assume any model for the signal or the noise it is analyzing. Here, the Short Time Fourier Transform together with the spectrogram come in handy. In order to detect the variation of the frequency in time, the raw signal is divided into frames of Tframe seconds, with an overlap of n%. For a signal at a sampling rate of fs Hz, the frequency resolution of the Discrete Fourier Transform of each frame is fs/Nf Hz, where Nf is the number of points of the Discrete Fourier Transform. As an example, let us take a signal of 30 minutes with a sampling frequency fs = 1200 Hz, recorded in Timisoara, Romania. We first pass it through a FIR filter with 3036 coefficients and an equiripple passband frequency response, with cutoff frequencies fc1 = 47 Hz and fc2 = 53 Hz, represented in Figure 1.6, as we are only interested in the nominal frequency and its variations, which are, in the worst case, within ±1 Hz. Now we can analyze the obtained signal in terms of frequency. By setting a window length of 16384 samples, which in our case means 16384/1200 = 13.65 seconds, and an overlap of 50%, we obtain a frequency resolution of (16384 − 16384/2)/65536 = 0.125 Hz. Figure 1.7 shows the obtained spectrogram with the fluctuations of the signal, which can be seen at a resolution that is very poor for our purposes. After obtaining the spectrogram of the signal, we have to estimate the dominant frequency in each frame. In order to do this, we take the frequency component with the maximum amplitude among all the components returned by the DFT in each frame. This reveals the importance of having a very good frequency resolution since, for us, even a variation of 0.01 Hz matters. On the other hand, this method is susceptible to frequency outliers caused by sudden changes in energy due to factors such as sudden variations in ambient lighting; for example, movement near optical sensors can cause changes in the dominant frequency [4]. In Figure 1.8, we can see the result of applying the maximum energy component selection to the same signal, after obtaining the spectrogram.

Figure 1.6: Frequency response of the band-pass filter used for the raw signal

Figure 1.7: Spectrogram of ENF fluctuations of signal used in experiment


Figure 1.8: Maximum energy component along each frame of the signal
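To make the non-parametric procedure concrete, the sketch below reproduces it in MATLAB (Signal Processing Toolbox): band-pass filtering around 50 Hz, spectrogram, and per-frame peak picking. The file name, filter order and zero-padding length are illustrative assumptions; the functions actually used in this work are listed in Appendix A.2.

```matlab
% Non-parametric (STFT) ENF estimation sketch; parameter values restate the
% example above, while the file name and filter order are assumptions.
[x, fs] = audioread('1200hz_timisoara_2101_0650PM_30min_ECON.wav');  % fs = 1200 Hz

b  = fir1(3000, [47 53]/(fs/2), 'bandpass');   % linear-phase band-pass FIR
xf = filtfilt(b, 1, x);                        % zero-phase filtering

wlen = 16384;                                  % ~13.65 s window at 1200 Hz
nfft = 65536;                                  % zero-padded DFT grid
[S, f, t] = spectrogram(xf, hamming(wlen), wlen/2, nfft, fs);

band = f >= 47 & f <= 53;                      % search only around 50 Hz
[~, idx] = max(abs(S(band, :)), [], 1);        % per-frame maximum-energy bin
fb  = f(band);
enf = fb(idx);                                 % one ENF value per frame

plot(t, enf); xlabel('Time (s)'); ylabel('Frequency (Hz)');
```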

These non-parametric methods have a drawback: they allow one to have either a good frequency resolution or a good temporal one, but not both. In the case of the spectrogram, which uses the Short Time Fourier Transform in order to obtain the frequency variation of a signal with time, the frequency resolution depends on the sampling frequency and on the window size. For a signal at 44.1 kHz, if we chose a window size of 4096 samples we would have a resolution of 44.1 kHz/4096 = 10.76 Hz. The time resolution is the inverse of the frequency resolution, 4096/44100 = 0.093 s, or directly 1/10.76 = 93 ms. So, the finer the frequency resolution, the coarser the temporal one. In addition, the two dimensions are relative, because they depend on the sampling frequency as well. In the following graphs, Figures 1.9 to 1.12, we increased the window size for the same signal to 64000 and then to 131000 samples, which, as we can see in the maximum frequency graphic, yields a better discrimination of frequencies, but the 50 Hz component in the spectrogram is barely visible. As a conclusion, if such a method is to be used, at least for initial tests, as in our case, a compromise between frequency and time resolution must be found such that the small variations of the ENF in time are clearly visible.

Figure 1.9: Spectrogram and maximum energy in time of the signal, wsize = 64000


Figure 1.10: Spectrogram and maximum energy in time of the signal, wsize = 64000

The second approach for getting from the raw sinusoidal signal to the ENF time-frequency graphs consists in using parametric methods.

Figure 1.11: Spectrogram and maximum energy in time of the signal, wsize = 131000


Figure 1.12: Spectrogram and maximum energy in time of the signal, wsize = 131000

Parametric methods assume a prior model for the signal. For example, if we assume that what we are looking for in the signal is a sinusoidal wave embedded in white Gaussian noise, then we will obtain better results because of the specificity of the algorithm. MUSIC, or MUltiple SIgnal Classification, is an algorithm which estimates the frequency components of a signal composed of a known, finite number of sinusoids embedded in white noise. The method relies on the orthogonality between the sinusoidal signal subspace and the noise subspace, and provides a high resolution frequency estimate of a sinusoidal signal using a smaller number of data points than spectrogram-based methods [4]. In practice, this algorithm computes the M × M autocorrelation matrix Rx of the signal and performs an eigenvector analysis on that matrix. The eigenvectors corresponding to the highest p eigenvalues (which represent the highest variability) span the signal subspace and the remaining M − p eigenvectors span the noise subspace. The frequency estimation function is then

\hat{P}_{MUSIC}(e^{j\omega}) = \frac{1}{\sum_{i=p+1}^{M} |\mathbf{a}^{H}\mathbf{v}_i|^{2}},   (1.1)

where \mathbf{a} = [1, e^{j\omega}, e^{j2\omega}, \ldots, e^{j(M-1)\omega}]^{T} and \mathbf{v}_i are the noise eigenvectors. The locations of the p largest peaks of the estimation function give the frequency estimates for the p signal components. Unlike the DFT, this method has the advantage of estimating the frequencies with an accuracy finer than one frequency bin, because the above function can be evaluated at any frequency. This concept is called super-resolution. Because multimedia signals can be quite noisy and the dynamic range of the frequency variation of the ENF is very small, this is the method that shall be presented and used in order to explain the concepts and experiments in this thesis. Let us take the same signal as used in the Fourier analysis with the spectrogram. For a window length of 5 seconds, an overlap of 50%, fs = 1200 Hz, and p = 2 (because we expect one real-valued sinusoid, which contributes a pair of complex exponentials), the MUSIC algorithm performs considerably better than the Fourier analysis, as we can see in Figure 1.13. With a window length of 5 seconds, which in our case means 6000 samples, one can obtain a better frequency resolution than with Fourier analysis, while saving a considerable amount of the computational power used by the Fourier transform and the spectrogram. The source code of the presented functions and filter, together with the script to be used with them, can be found in Appendix A.2.
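A minimal MATLAB sketch of this frame-by-frame MUSIC estimation is shown below; it uses the rootmusic function of the Signal Processing Toolbox instead of the thesis' own functions of Appendix A.2, and it assumes xf and fs (1200 Hz) from the previous example.

```matlab
% Frame-by-frame ENF estimation with MUSIC (sketch; the implementation used
% in this work is listed in Appendix A.2). Assumes xf and fs from above.
wlen    = 5 * fs;                        % 5-second window (6000 samples)
hop     = wlen / 2;                      % 50% overlap
nFrames = floor((length(xf) - wlen) / hop) + 1;

enf = zeros(nFrames, 1);
t   = zeros(nFrames, 1);
for k = 1:nFrames
    idx    = (k - 1) * hop + (1:wlen);
    fEst   = rootmusic(xf(idx), 2, fs);  % p = 2: one real sinusoid (+f and -f)
    enf(k) = max(fEst);                  % keep the positive-frequency estimate
    t(k)   = (idx(1) + idx(end)) / (2 * fs);
end

plot(t, enf); xlabel('Time (s)'); ylabel('Frequency (Hz)');
```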


Figure 1.13: ENF from the MUSIC algorithm

Chapter 2

State Of The Art

Digital media security is still an emerging area. The rapid evolution of digital technology, together with the continuous decrease in storage prices, has driven an increase in the usage and popularity of digital content. Digital media security and rights management has also attracted the attention of many security professionals, law enforcement officers and practitioners [6]. In terms of the ENF in forensics, the state of the art is not very voluminous, as research on the ENF in forensics is a relatively new and largely unexplored domain, with applications which have theoretical and practical support but which in most cases have not become commercially available. Starting with Min Wu, who can be considered one of the first scientists to investigate the ENF for forensic purposes, some important steps have been made in the last years.

2.1 Audio Authentication/Tampering

The first articles on the ENF in forensics refer to the authentication of audio files from phone conversations or audio recordings by using the phase continuity of the ENF present inside the signal. In [7], the ENF, used together with a spectral distance measurement, proved to be a good technique for establishing whether an audio recording containing an ENF has been edited or not. Considering that the phase change is uniformly distributed between −180° and 180°, and that a legitimate phase variation is no larger than, say, ±10°, the probability of detecting a phase change due to a digital audio forgery would be greater than 90%. In [8], signals were taken from a public Spanish database, with different SNRs, and the short-time DFT of the first-order signal derivative was used in order to estimate the dominant frequency of each N-point frame of the signal and the phase of the resulting ENF. Again, even at low SNRs, the phase discontinuity of the ENF in edited audio files could be clearly seen. In the same study [8], on the presence of the ENF in audio recordings taken with battery supplied devices, it was concluded that only dynamic microphones can capture the field from nearby electric grid

sources. With the main type of microphone used in medium-class devices, the electret microphone, the field is not sensed. Instead, the audible mains hum is considered to be a carrier of the ENF signal, which can be sensed even at 20 meters and behind 5 walls from the source. In fact, any noise generated by a mains powered device in the proximity of a recording device might carry the ENF signal. Even air conditioning fans have been used successfully in experiments for detecting the presence of the mains hum.

2.2 Location Stamping

The first study on location stamping was made in the USA, where audio recordings from different locations within the same interconnected (US Eastern) grid were taken in order to see the differences in ENF which may appear due to local load changes and the finite propagation speed of load balancing effects within the same interconnected grid. Based on the load compensation mechanism used in power grids, which was explained in the introductory chapter, the authors of [9] assume that, due to small or large changes in load in some areas, specific ENF signatures may be generated which would allow the investigator to be more precise in localizing the recording within the grid. Those changes are known to propagate through the electric grid at a speed of 500 miles per second [9]. Thus, the ENF signals should be more similar in closer locations than in remote ones. Of course, such a property of the signal could be used in a triangulation process in order to detect, or at least narrow down, the location of a recording, if a database of recordings at different locations existed. By using the correlation coefficient between three 14-hour recordings of the power line in three different cities on the East Coast of the USA, it has been shown that the correlation coefficient has the potential to reveal distances with respect to one location. According to [9], it is difficult to characterize empirically a relation between the correlation coefficient value and the distance between two ENFs, because the correlation coefficient also depends on power line density, and wired distances differ from road or flight distances. In [10], the authors even propose a machine learning method in order to detect the grid in which a recording was made, based on the different statistical features of the signals in each grid. This is very useful for detecting the location of videos with malicious purposes, like terrorist threats, child pornography, or ransom demands. The success rate in their case was approximately 88% in detecting the grid of origin of audio or power ENFs based on 7 statistical features.

2.3 Video Synchronization and Historical Alignment

In an article presented in 2014 at the IEEE International Conference on Acoustics, Speech and Signal Processing [11], the authors explore the use of the ENF for synchronizing and time-stamping videos, based on the ENF extracted from their soundtracks. They affirm that, viewing the ENF signal as a continuous-time random process, its realization in each recording may serve as a timing fingerprint. Synchronization of audio and video recordings can therefore be performed by matching and aligning their embedded ENF signals. After dividing the soundtracks of two videos into frames of a few seconds in length, they perform, using the MUSIC algorithm, the frequency estimation of each frame. Afterwards, a cross correlation is performed between the two resulting estimation signals, which reveals the lag between them. The accuracy of the synchronization is also studied in terms of Mean Square Error (MSE) for different frame lengths and overlapping factors, finally obtaining an error on the order of a few seconds for videos at 30 frames/second. The authors believe that using the ENF to analyze historical recordings can have many useful applications for forensics and archivists. For instance, many 20th century recordings are important cultural heritage records, but some lack necessary metadata, such as the date and time of recording. Also, the need may arise to time stamp old recordings for investigative purposes, and the ENF may provide a way to do that. Finally, by using a spectral combining technique, they manage to successfully use the ENF from higher order harmonics of the mains fundamental, thereby solving the issues of extracting the ENF at low SNRs and of a possible built-in mains hum filter that a digital device nowadays might have.

2.4 Time Stamping

Time-stamping a media file involves comparing the ENF fluctuations of a reference power ENF with the media ENF and computing the cross-correlation between the two obtained vectors. The highest correlation between the two sets of ENFs corresponds to the instant of the beginning of the media file. In signal processing, the cross-correlation is a measure of the similarity of two series as a function of the lag of one relative to the other. This is also known as a sliding dot product. It is commonly used for searching a long signal for a shorter, known feature. It has applications in pattern recognition, single particle analysis, electron tomography, averaging, cryptanalysis, and neurophysiology [12]. Recalling the mathematical concept of correlation between two signals, different or not (in which case we are talking about the auto-correlation function), we can see that, in the case of discrete signals, it is given by

r_{fg}[n] = \frac{1}{N} \sum_{m=-\infty}^{+\infty} f^{*}[m]\, g[m+n],   (2.1)

where

1. f[n] and g[n] are discrete-time signals.

2. f*[m] is the complex conjugate of f[m], which for a real-valued signal is equal to the original function.

3. n is the delay in samples.

4. N is the length of the signals.

The length of the result of such an operation between two signals is 2N − 1, where N represents the length of the longer signal. The closer the samples of the two functions are in terms of values, the higher the result at the instant n when they are aligned. One can note that the result is normalized by the number of samples of the signals, N. In [13], the authors perform several experiments in order to observe the lag between two recordings with the help of the ENF variation. They use a light sensing circuit together with the ENF taken directly from the power grid, and they deliberately delay the starting time of one of the two recordings. After successfully extracting the ENF, they check whether the estimated delay matches the one introduced by starting the recordings at different time instants, by means of the Normalized Cross Correlation (NCC) coefficient, as presented in Section 5.2. In a first experiment, they record in India, directly from the power mains and with a video camera, at simultaneous times. The resulting correlation coefficient confirms the synchronization of the ENFs, one extracted directly from the voltage signal and the other from the luminance signal of the video: a value of 0.91 at a distance of k = 0 frames (synchronous recordings, as expected). In another experiment, made in China, the video recording is started with a delay of 50 seconds with respect to the power mains signal and again the NCC is calculated after ENF extraction. A maximum of the correlation is observed at k = 6 frames, which in this case means 48 seconds. Finally, another experiment is made in the USA, where the nominal frequency of the network is 60 Hz. The maximum value of the coefficient was found to be 0.95, corresponding to a lag of k = 4, indicating that the video recording was started with a delay of approximately 32 seconds with respect to the power signal. The actual delay in recording time is approximately 38 seconds. This shows a time-stamping capability on the order of approximately 8 seconds.
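The matching step itself can be illustrated with a few lines of MATLAB; the sketch below uses synthetic ENF vectors (real ones would come from the estimators of Chapter 1), and the offset, durations and one-value-per-second rate are assumptions made only for the example.

```matlab
% Time-stamping by normalized cross-correlation (illustrative sketch with
% synthetic data; the offset and signal lengths are assumptions).
rng(0);
enfRef   = 50 + cumsum(0.001*randn(1800,1));                 % 30-min reference ENF, 1 value/s
delay    = 250;                                              % true offset in seconds
enfMedia = enfRef(delay+1 : delay+300) + 0.002*randn(300,1); % 5-min "media" ENF

L     = length(enfMedia);
bc    = enfMedia - mean(enfMedia);
nLags = length(enfRef) - L + 1;
ncc   = zeros(nLags, 1);
for k = 1:nLags
    seg    = enfRef(k : k+L-1);
    segc   = seg - mean(seg);
    ncc(k) = (segc' * bc) / (norm(segc) * norm(bc));   % NCC at lag k-1
end

[peak, kmax] = max(ncc);
fprintf('Estimated offset: %d s (true: %d s), NCC = %.2f\n', kmax-1, delay, peak);
```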

Chapter 3

Measuring the ENF - Circuits

3.1 Analog Option

Measuring the Electric Network Frequency directly from the power mains is a simple task, at least conceptually. With proper signal conditioning, the audio board of any computer may be used as an input and read with any software which supports it. We shall describe, step by step, two circuits together with the corresponding methods used to measure and record the ENF on digital support. First, we shall describe a simple circuit which is connected to the ADC of the audio board of a computer and, second, we shall present an autonomous circuit which reads and transmits, via USB, digital samples of the voltage signal to be analyzed. The first, simple circuit consists of a toroidal transformer which steps the voltage down from 220 V to 6 V AC. Then, a low-pass analog passive filter designed in Genesys has been used in order to reject the high-frequency noise which might be present in the network due to switched power supplies. Its frequency response, together with the group delay and the electric scheme, can be seen in Figure 3.8. Its −3 dB point is located at 1254 Hz. Finally, the audio card of a computer does not normally support more than 1 Volt at its input, although this should be checked in the manual of each model before using or designing any circuit that will connect to it. For safety purposes, we decided to use a voltage divider in order to lower the 6 Volts to a value of 0.6 V peak-to-peak. The schematic of the whole circuit can be seen in Figure 3.9. Two electrolytic capacitors of 1 nF have been used to decouple noise and compensate for the parasitic capacitance introduced by the transformer. It is important to recall at this point that, for signal processing purposes, toroidal transformers are the most recommended ones, due to the low parasitic capacitance they introduce and to electromagnetic considerations.


3.2 Digital Option

The digital device which will be presented in this section implements the analog to digital part and is capable of transmitting data through the USB port of a computer. This is done by using a PIC16LF1788 8-bit micro-controller. This micro-controller is equipped with the three 12-bit successive-approximation ADC channels that we need for measuring data from the electrical network, from a luminosity-sensing circuit and from a Light Dependent Resistor. An adequate program is needed in order to receive and store the values in a file format chosen by the user. The code for this micro-controller to receive, convert and transmit data via its asynchronous serial port can be found in Appendix A.3 of this document. In order to design a proper system, we have considered some preliminary features which are needed.

3.2.1 Acquisition Part

As in the case of the analog circuit, we need to step the voltage down from 220 V, so we choose a transformer which lowers it to a value of 6 V. An AC/DC converter is chosen as the DC power supply for the circuitry, which outputs +5 VDC and −5 VDC (a symmetrical source is needed because the op-amps used in this circuit will receive and amplify bipolar signals), and it also provides a reliable ground path, as the mains GND cannot be used as a reference for the rest of the circuitry. At the output we add two polarized 22 µF capacitors for noise decoupling, as we can see, together with the schematic for this part, in Figure 3.1. On the other hand, we employ the circuit described in the analog section, together with an INA134UA differential line amplifier with fixed Gain = 1 which is referenced to the GND plane provided by the AC/DC converter for the whole circuit. This is the solution for creating a stable, low impedance return path for the circuitry without using the negative potential from the electrical mains, and for adding a DC component to the voltage signal from the mains, making it suitable as an input to the ADCs, which work between 0 and the Vdd of the MCU, 3.3 V. In Figure 3.2, we can see the circuit for creating the reference potential of 2 Volts, thus making the voltage signal vary between 1.6 and 2.3 Volts. The micro-controller shall be programmed to take 100 samples of 12 bits (ADC characteristic) per signal cycle; the signal has a nominal frequency of 50 Hz, resulting in a period of 20 milliseconds. Therefore, the minimum sampling frequency is 100 × 50 = 5 kHz. The MCU can be set to use its internal oscillator as a clock, which works at a frequency of 32 MHz. The internal ADC can be configured to work with a basis frequency of fOSC/8 (it may be configured to work with a maximum basis frequency of fOSC/2). Taking into account that a successful and reliable conversion takes

Figure 3.1: Powering circuit

Figure 3.2: Voltage reference and DC-component-adding circuit

15 × T_ADC, the maximum sampling frequency will be

f_{ADC} = \frac{1}{15} \times \frac{f_{OSC}}{8} = \frac{32\ \mathrm{MHz}}{120} \approx 266.67\ \mathrm{kHz}.   (3.1)

The minimum voltage step can be calculated as the maximum voltage that the ADC can accept divided by 2^{12}, because it has 12 bits of resolution. Thus, the voltage step is

\Delta V = \frac{V_{ref}}{2^{12}} = \frac{3.3\ \mathrm{V}}{4096} \approx 805\ \mathrm{\mu V}.   (3.2)

In order to transmit at a rate valid for the 5 kHz minimum sampling frequency, we need

R_b = 16\ \mathrm{bits} \times 5\ \mathrm{kHz} = 80\ \mathrm{kbps}.   (3.3)

In order to avoid bottlenecks, it can be seen that a data rate of 115 kbps would be enough. This data rate is valid both for the USB port and for the serial port of the micro-controller.

With respect to the photo-diode circuit, which is part of the acquisition section of the PCB (see Figure 3.4), the output of the BPW21, the chosen model, is a current proportional to the intensity of the light which falls on it and to the relative sensitivity, which depends on the angle of incidence of the light, as presented in Figure 3.5. At maximum sensitivity, a value of 9 nA/lx is present at the output. This current is transformed into a voltage with the help of U3A and amplified by a factor of 48 by U3B. The high photo-sensitivity in the bandwidth of interest, as seen in Figure 3.6, was the main factor for choosing this model of photo-diode. Also, the fast reaction time to luminosity changes, which is very important because it ultimately translates into sampling frequency, is a parameter which should be taken into account when choosing such a device. Normally, photo-diodes meet the minimum requirements in terms of sampling frequency. The band of the spectrum in which the chosen device is sensitive is also important, because every bulb type radiates at a different frequency in the optical spectrum, as we can see in Figure 3.3.

Figure 3.3: Wavelengths radiated by different types of light [19]

LDRs, or Light Dependent Resistors, are widely used for controlling circuitry as a function of the intensity of the light. Compared to the photo-diode, they have two disadvantages for our case: 1. They are non-linear devices, i.e., their resistance follows a law given by

\frac{R}{R_0} = \left(\frac{I}{I_0}\right)^{-\gamma},   (3.4)

where R_0 is the resistance at intensity I_0 and γ is a device-specific constant with a typical value between 0.6 and 0.8 [17].

2. Their response time, which is around 30 to 50 ms [17], corresponding to roughly 20 Hz to 33 Hz, is rather poor for an application like ENF analysis, but it can be interesting for simulating the aliasing produced when the sample rate is lower than twice the frequency of the rectified 100 Hz variation which appears from the light bulbs.

The circuit which acquires data from the LDR consists of a resistive divider whose output is fed to a voltage-follower op-amp, which outputs the same voltage that it receives at its input, as we can see in Figure 3.7.

Figure 3.4: Light sensing circuit

Figure 3.5: Relative radiant sensitivity vs. angular displacement [16]

Figure 3.6: Relative spectral sensitivity vs. wavelength[16]

Figure 3.7: LDR acquisition circuit

3.3 Processing Part

Some technical considerations with respect to the processing of the resulting signal values must be taken into account at this point. This is required because very long stretches of the signal need to be processed, as the ENF varies very slowly in time. In order to see that the typical memory size of a micro-controller, which is up to 256 KB, is not enough for our task, let us make a simple calculation. If we needed to record just 3 minutes of data and store it in the MCU for later processing (which would of course be insufficient for concluding anything with respect to ENF variations), we would need 80 kbps × 180 seconds = 14.4 Mbits = 1.8 MB, much more than the 256 KB. So, at this point two options remain for recording long periods of time:

1. Record online and store the values in a file for later signal processing.

2. Process the values in real time and extract the desired parameters.

For the latter option, an autonomous ARM-based computer like the Raspberry Pi or the BeagleBone Black running a Linux distribution would probably solve the issue at a reasonable cost. From this point on, we choose and describe the online-recording option. In order to transmit the raw digital values from the ADCs of the micro-controller through its serial port, we choose a transmission rate of 115200 bauds and a sampling rate of 2 kHz. In order to convert the serial port protocol to the USB protocol, we use a UM232H module, whose FTDI chip takes care of this operation. For testing purposes, a terminal application has been used on the computer side to check that the transmitted values were the expected ones. Furthermore, the MATLAB code used to read the virtual serial port and to create a .mat file with the vector to be given as an input to the processing algorithm can be found in Appendix A.2.
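A rough illustration of the online-recording step in MATLAB is given below; the COM port name, the recording length and the 16-bit framing of the 12-bit samples are assumptions, and the code actually used in this work is the one listed in Appendix A.2.

```matlab
% Online recording sketch: read raw ADC samples from the virtual serial
% port created by the UM232H and store them for later ENF processing.
% Port name, duration and framing are assumptions for illustration.
port = "COM3";                      % virtual COM port (assumed)
fs   = 2000;                        % sampling rate configured in the firmware (Hz)
dur  = 180;                         % seconds to record

s   = serialport(port, 115200);     % 115200 baud, as configured on the PIC side
raw = read(s, fs*dur, "uint16");    % one 16-bit word per 12-bit sample
x   = double(raw) * 3.3/4096 - 2.0; % back to volts, remove the 2 V offset
clear s                             % close the port

save("mains_capture.mat", "x", "fs");  % input for the ENF extraction script
```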

Figure 3.8: Low-Pass Chebyshev Characteristics

Figure 3.9: Analog ENF Sensing Circuit Schematic

3.4 Other Circuits for Sensing The Mains Hum

As stated and proved by Kirchner in [8], certain types of microphones capture the ENF signal when they are in the proximity of the electric network. In order to confirm this, a reconstruction of a real scenario has been made in Vigo, Spain. For this purpose, we used a standard, low-cost microphone from a headset. The microphone was placed near a socket which was used by normal electrical household consumers and was connected to the input of the sound board of a laptop which was disconnected from the electrical network for the veracity of the results. The ambient was a normal one, with people talking and moving around, making the experimental setup a real scenario of everyday life. In parallel, a video recording was made using a GoPro Hero 4 Black Edition camera at 240 fps. By using signal subspace analysis with the MUSIC algorithm, an ENF estimation has been performed. Although the amplitude of the signal recorded by the microphone was much lower, as seen in Figure 3.10, the MUSIC algorithm recognized as the dominant frequency a signal varying around 50 Hz, as seen in Figure 3.11. At first sight, the ENF estimation does not look like a normal one and the result is very noisy. However, upon comparing it with the ENF from the video recording made simultaneously, the result was surprising: although the pattern is not followed with fidelity, the ENF follows the mean variation of the ENF from the video, as we can see in Figure 3.12. Audio systems can be categorized as ENF sensing circuits as well. Electric hum in audio systems is a very well known and common problem. It can be transferred capacitively, magnetically, or it can occur because of ground imbalances. Experiments were also carried out in [8] with the audible hum coming from audio systems and captured by mobile phones in the proximity (20 meters and 5 walls). As mentioned by Kirchner, the microphone type together with the DSP algorithms inside mobile phones makes it more or less possible to detect the ENF at its nominal frequency or at a multiple of it.


Figure 3.10: Raw signal recorded with a standard microphone near the power mains


Figure 3.11: ENF estimation using MUSIC for the power mains hum captured by the microphone


Figure 3.12: ENF estimation using MUSIC for the power mains hum captured by the microphone and for a video recorded simultaneously

Chapter 4

Video as an ENF carrier

Extracting the ENF from a video is not straightforward, as several theoretical concepts are needed in order to obtain the vector which can give us information about the ENF. The signal of interest in the video, which is captured by cameras under some circumstances, is the luminance. The luminance or brightness of a video varies in accordance with the luminosity of the environment in which we are recording. As we supply our light bulbs with a sinusoidal waveform of 50 or 60 Hz, the luminance varies with the same frequency. But the negative part of the sinusoid cannot be transmitted as luminosity; thus we will see a rectified signal at twice the nominal electric network frequency, corresponding to the variation of the current circulating through our light source. This variation of 100 or 120 Hz, depending on the region, can be perfectly captured by any light sensing diode or camera sensor. In the following subsections, we shall introduce the color space concept together with the video standards in terms of frames per second and, finally, our experimental setup for video recording, in order to show how a video can be an electric network frequency carrier.

4.1 Color Spaces and Video Standards

A color space is nothing but a mathematical representation of the colors of a video. The three most popular color spaces are RGB (red, green, blue), used in computer graphics; YIQ, YUV or YCbCr, used in video systems; and CMYK, used in color printing [15]. However, because of the poor processing efficiency and the bad performance with real-life images of RGB [15], color spaces which use luma and two color difference signals, like YUV, are used in video systems. YUV is a color space used in modern video cameras together with popular compression formats like MPEG. In terms of frame rate, or sampling frequency as adapted to our context, the most widely used values are those of the PAL standard, corresponding to 25 frames per second, and the NTSC value of 29.97 fps. Modern video cameras can use presettable fps values of up to hundreds of frames per second.


The YUV space encodes, as stated above, the video signal in three components, of which the Y component is of interest for us because it represents the luminosity level. In order to extract the luminance component, decompression needs to be done with an adequate codec, according to the format of the video. Because the human eye cannot distinguish colors as well as it distinguishes luminosity, the sampling scheme used by MPEG codecs for the Y component extraction is 4:2:0. This means that for every 4 samples of the Y component of one frame, two chrominance samples (one Cb and one Cr) are kept when decoding the video, as can be seen in Figure 4.1. So if the Y component of a frame is an m × n matrix (depending on the resolution of the video), each chrominance matrix will have a dimension of (m/2) × (n/2) = (m × n)/4 per frame. Since there are two chrominance components, Cb and Cr, the total number of chrominance values per frame is (m × n)/2. Thus, each frame occupies 1.5 × (m × n) samples of the decoded stream, from which the m × n values of the Y matrix of pixels are extracted for each frame. Then all the Y samples from one frame are summed and the resulting 1-D vector (one value per frame) is stored in order to be given as an input to the ENF extraction algorithm. After a normalization in resolution and a filtering at the proper frequency of 100 or 120 Hz with a band-pass filter of the same type as the one used for the electrical signal and described in the Introduction, we use the MUSIC algorithm to obtain the variation of the ENF in time. The C code used for processing the video frames and extracting the Y component can be found in Appendix A.4 of this document and has been used together with Ubuntu and the ffmpeg library.
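For quick tests, the same per-frame luminance vector can also be obtained directly in MATLAB; the sketch below (VideoReader plus rgb2ycbcr from the Image Processing Toolbox, with an assumed file name) is an alternative to, not a copy of, the ffmpeg-based C code of Appendix A.4.

```matlab
% Per-frame mean luminance extraction (MATLAB sketch; the thesis uses the
% ffmpeg-based C code of Appendix A.4). The file name is an assumption.
v   = VideoReader('gopro_240fps_clip.mp4');
fps = v.FrameRate;

lum = [];
while hasFrame(v)
    rgb = readFrame(v);                 % decoded frame, RGB
    ycc = rgb2ycbcr(rgb);               % convert to Y'CbCr
    Y   = double(ycc(:, :, 1));         % keep only the luma plane
    lum(end+1, 1) = mean(Y(:));         % one luminance value per frame
end

% lum is the flicker signal sampled at fps Hz: band-pass filter it around
% 100 Hz (or 120 Hz) and feed it to the MUSIC estimator, as described above.
plot((0:numel(lum)-1)/fps, lum); xlabel('Time (s)'); ylabel('mean(Y)');
```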

Figure 4.1: Chroma Subsampling Compression [21]

For the purpose of the experiments described in this thesis, a GoPro Hero 4 Black has been used, set to record at a resolution of 1280 × 720 pixels and at a frame rate of 240 fps. This means that the 100 Hz component can be reconstructed perfectly, because the sample rate is more than double the original signal frequency. In Figure 4.2 we can see the raw luminance signal, followed by the ENF estimation in Figure 4.3, and ending with its superposition on the synchronous recording from the electrical network at the same location, in Timisoara, Romania, in Figure 4.4. With the aim of matching the power ENF signal to the video ENF signal, a factor A is calculated as the mean of the samplewise ratio between the two signals. Notice that, before calculating the samplewise ratio, an interpolation process over the video signal is needed in order to increase its sampling frequency up to the sampling frequency of the power signal. The resulting value of A is ≈ 2. Notice that the expected value of the factor A is 2, as the frequency of the video flickering is twice the power frequency. Figure 4.3 shows the obtained results, where a remarkable similarity is observed after applying the factor A to the power ENF signal. Furthermore, experiments considering several types of artificial light were performed. In those experiments, different light sources (fluorescent lamps, energy-saving lamps, and bulbs) have shown flickering in the video recordings, and the corresponding video ENF signal has been successfully extracted. Even Light Emitting Diode (LED) lamps have exhibited flicker in different experiments.
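A minimal sketch of this matching step, assuming the per-frame estimates and time axes (enfVid, tVid, enfPow, tPow) produced by the MUSIC estimator for the video and power signals:

```matlab
% Align the video ENF (around 100 Hz) with the power ENF (around 50 Hz).
% enfVid/tVid and enfPow/tPow are assumed to come from the MUSIC estimator.
enfVidInterp = interp1(tVid, enfVid, tPow, 'linear', 'extrap');  % common time grid

A = mean(enfVidInterp ./ enfPow);       % scale factor, expected to be close to 2

plot(tPow, enfVidInterp, tPow, A*enfPow);
legend('Video ENF', 'A x Power ENF'); xlabel('Time (s)'); ylabel('Frequency (Hz)');
```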


Figure 4.2: Raw luminance signal

4.2 Aliasing Effect Consequences

As we expect to see a variation at 100 Hz in the case of 50 Hz electric networks, or at 120 Hz in the case of those using 60 Hz, a first consequence arises from sampling the signal at 30 or 25 fps: the aliasing effect. From the Nyquist sampling theorem, we can obtain the aliased frequency which results for the different frame rates used in video cameras. Generally, movie-making cameras use 24 fps, while hand-held ones use 25 or 30 fps. Suppose that a fluorescent light source is connected to a 50 Hz electrical network. Then the current will change polarity at a rate of 100 Hz. This means that we expect to see a rectified real-valued signal.


Figure 4.3: ENF estimation of the luminance


Figure 4.4: Luminance and power ENF estimations at concurrent time

Real signals have a symmetrical spectrum with respect to the y-axis. Now suppose that we are using a 29.97 fps (NTSC) camera. The aliased frequency can then be calculated in the following simple way: the closest multiple of 29.97 to 100 is 29.97 × 3 = 89.91. Subtracting this value from 100, we obtain 10.09 Hz, which is where we shall see the 100 Hz frequency component. But it is also known that a real signal has a symmetrical spectrum about the y-axis, as mentioned before. Thus we shall have a repeated spectrum at

100 ± 29.97 × k  [Hz],   (4.1)

where k is any integer. The original −200 Hz component will appear aliased to 9.79 Hz on the x-axis. In Figure 4.5, taken from [4], we can see the phenomena described above graphically.
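The folding arithmetic can be checked with a short helper (a sketch, not part of the thesis code):

```matlab
% Where does a real tone at f0 Hz land when sampled at fps frames per second?
% Both spectral lines at +f0 and -f0 fold into the band [0, fps/2].
alias = @(f0, fps) min(mod(f0, fps), fps - mod(f0, fps));

alias(100, 29.97)   % 100 Hz flicker, NTSC camera  -> 10.09 Hz
alias(200, 29.97)   % 200 Hz harmonic              ->  9.79 Hz
alias(100, 25)      % 100 Hz flicker, PAL camera   ->  0 Hz (exact multiple)
alias(120, 30)      % 120 Hz flicker, 30 fps       ->  0 Hz (exact multiple)
```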

Figure 4.5: Frequency domain graph; the upper part represents the original spectrum; the lower part represents the aliased spectrum

Chapter 5

Experiments

5.1 Category I-ENF Spatial Localization

The best way to determine whether a statistical hypothesis is true or false would be to examine the whole population of data samples, but this is often impractical. In those situations, one needs to assume that an event happens in a certain way and to verify whether it does so by analyzing random data. If the sampled data is not consistent with the assumed hypothesis, the hypothesis is rejected. There are two types of statistical hypotheses:

1. The null hypothesis, also denoted by H0. It normally assumes that the observations result purely from chance.

2. The alternative hypothesis, denoted by Hα, which assumes that the observations are influenced by a non-random event.

The final result of such a hypothesis test is used to reject or accept the null or the alternative assumption. In order to evaluate the performance of the employed test, also called the classifier, performance curves such as the receiver operating characteristic are used.

5.1.1 ROC Curves and Signal Features

In statistics, a receiver operating characteristic (ROC), or ROC curve, is a graphical plot that illustrates the performance of a binary classifier system as its discrimination threshold is varied. The curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings. The true positive rate is also known as sensitivity, or recall in machine learning. The false positive rate is also known as the fall-out and can be calculated as (1 − specificity). The ROC curve is thus the sensitivity as a function of the false positive rate [18]. The area under the ROC curve gives the performance of the classifier in a directly proportional fashion. In Figure 5.1, we can see a table with the value of the area under the

receiver operating characteristic and the associated performance. A ROC curve shows several things:

1. It shows the trade-off between sensitivity and specificity (any increase in sensitivity will be accompanied by a decrease in specificity).

2. The closer the curve follows the left-hand border and then the top border of the ROC space, the more accurate the test.

3. The closer the curve comes to the 45-degree diagonal of the ROC space, the less accurate the test.

4. The slope of the tangent line at a cut point gives the likelihood ratio (LR) for that value of the test.

Figure 5.1: Comparison between 3 ROC curves, in terms of the AUC (Area under the Curve)[22]

In our case, such a performance curve can be used to detect which signal feature is the most adequate to act as a classifier between two networks or, in other words, to see whether each network leaves a specific "stamp" on the ENF and to what degree we can be sure that we can detect that stamp. With respect to certainty, the selection of the operating point of the ROC curve is very important, as it offers a cut-off between the samples of the signal which contain a feature and the ones which do not. The selection of such a cut-off point depends on the type of application in which the ROC curve is used. As an example, in the case of diagnosing a patient with cancer, we would prefer a false positive instead of a false negative because, although a false positive can scare the patient, further investigation can be made in order to confirm whether the person has cancer, and in that case a treatment can be prescribed. Giving a false negative in such a case can result in the death of the patient because of the lack of treatment when it was needed.

5.1.2 Experiment I-Frequency Classification

In a first experiment, a test has been made in order to distinguish automatically between networks of 50 and 60 Hz. The operating frequency of the electrical network represents one of the signal features and its nominal value is either 50 or 60 Hz. From the grids recorded at different locations in the world for the Signal Processing Cup 2016 organized by the IEEE, 20 have been picked and their frequency was measured over time with the help of the spectrogram. After that, it was checked whether the global estimated frequency is closer to 50 or to 60 Hz, the two possible values of this feature. Finally, 50 Hz networks were labeled with a '0' and 60 Hz networks with a '1'. The global frequency estimates of each network were used as predictors and the logical labels as responses in a logistic regression analysis (a generalized linear model). A binomial distribution has been considered, which is generally used when counting the number of positive occurrences out of a set of negative and positive occurrences. The coefficient estimates of the generalized linear regression were then passed to a function which computes predicted values for the generalized linear model with the logit link function, used with the binomial distribution. The resulting ROC curve, shown in Figure 5.2, tells us that the classification algorithm has a performance of 100% in discriminating between 50 Hz and 60 Hz electrical grids; the area under the ROC curve is 1. Translated into geographical localization, we can localize a video recording with 100% certainty at a continental level.
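A sketch of this classification and of the ROC computation in MATLAB (Statistics and Machine Learning Toolbox) is shown below; the frequency values are toy numbers standing in for the 20 per-grid estimates, which are not reproduced here.

```matlab
% Logistic-regression classification of 50 Hz vs 60 Hz grids (sketch).
% freq holds the global frequency estimate of each grid; isSixty its label.
freq    = [49.98; 50.01; 60.02; 59.97; 50.00; 60.01];   % toy predictor values
isSixty = [0;     0;     1;     1;     0;     1    ];   % 0 = 50 Hz, 1 = 60 Hz

b      = glmfit(freq, isSixty, 'binomial', 'link', 'logit');  % binomial GLM, logit link
scores = glmval(b, freq, 'logit');                            % predicted probabilities

[fpr, tpr, ~, auc] = perfcurve(isSixty, scores, 1);           % ROC curve and its area
plot(fpr, tpr); xlabel('False positive rate'); ylabel('True positive rate');
title(sprintf('AUC = %.2f', auc));
```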


Figure 5.2: ROC curve for 50/60 Hz network classification using logistic regression as a prediction model

5.1.3 Experiment II-Range Classification

In a second experiment, after estimating the ENF of two different networks, we performed a regression analysis in order to see how well we can distinguish between the two. The signal feature used this time is the range of each ENF frame. This means that the maximum and minimum of the ENF estimate of each grid, within a time window of fixed length, were taken and subtracted. Then, the results were put in a 1-D vector and given as the predictor input of the generalized linear regression model. The response vector was formed by 1's when the samples belonged to the 'A' grid (or class) and by 0's when the samples belonged to the 'B' grid (or class). As we can clearly see in Figure 5.5, a clear classification between the two chosen networks cannot be made. This is because both of them have very similar ranges, as they are part of the same United States East Coast grid. Two sample spectrograms, from Grid A in Figure 5.7 and Grid C in Figure 5.4, can be seen; the range of the two is clearly very similar.
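As an illustrative sketch of the range feature (variable names and data are hypothetical; the script in Appendix A.2 splits each grid into 12 such windows), the per-window range can be computed as follows:

% Per-window range feature, assuming enf is the ENF estimate (in Hz) of one grid.
enf = 60 + 0.01*randn(1800, 1);            % illustrative ENF trace
win = floor(length(enf)/12);               % window length in samples
range_per_frame = [];
for probe = 1:win:length(enf)-win
    seg = enf(probe:probe+win);
    range_per_frame(end+1) = max(seg) - min(seg);   % maximum minus minimum in the window
end

The per-window ranges of both grids are then stacked into a single predictor vector and passed to the same regression and perfcurve chain used in the previous sketch.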

[Plot: ENF estimation using Root MUSIC (w = 16384, overlap = 50%); frequency axis from 60.05 to 60.13 Hz, time axis 0 to 1800 s]

Figure 5.3: Sample spectrogram from grid A

By choosing a 50 Hz network which is much worse in terms of quality than any of the 60 Hz networks used previously, we can see in Figure 5.8 that the result changes significantly: we can now clearly distinguish between those two networks. Two sample spectrograms are shown in Figure 5.7 and Figure 5.6, from which we can see that the range of Grid B is significantly higher than the one of Grid A.

[Plot: ENF estimation using Root MUSIC (w = 16384, overlap = 50%); frequency axis from 59.99 to 60.07 Hz, time axis 0 to 3000 s]

Figure 5.4: Sample spectrogram from grid C


Figure 5.5: ROC curve for networks A, C; classification using linear regression as a prediction model and range as the signal feature

5.2 Category II- ENF Time localization

Localizing an ENF in time can be a strong proof in the judicial environment when the spatial location is already known and a temporal one is required to confirm it. Generally speaking, two signals can be localized in time by fixing one (the ground truth in time and frequency, in our case the continuous recordings of the power lines made with Tensiunators¹ at different geographical locations), shifting the other, shorter one (for example, the luminosity of a video recording), and computing for each shift the correlation between the two signals, as presented in Equation (2.1).

[Plot: ENF estimation using Root MUSIC (w = 16384, overlap = 50%); frequency axis from 49.6 to 50.8 Hz, time axis 0 to 4000 s]

Figure 5.6: Sample spectrogram from Grid B

[Plot: ENF estimation using Root MUSIC (w = 16384, overlap = 50%); frequency axis from 60.05 to 60.13 Hz, time axis 0 to 1800 s]

Figure 5.7: Sample spectrogram from Grid A

¹Name given to the ENF data collection circuit.


Figure 5.8: ROC curve for networks A, B; classification using linear regression as a prediction model and range as the signal feature

Using the cross-correlation directly as it is defined poses two problems in our case: it depends on the mean of the signals and it requires the signals to have equal lengths in order for the normalization to be performed correctly. Thus, a normalized cross-correlation coefficient (NCCC) is used, which accounts for the different means and different lengths of the signals. The NCCC is given by

\[
\rho[m] = \frac{\sum_{n=1}^{N}\bigl(f[n]-\mu_f\bigr)\bigl(g[n+m]-\mu_w\bigr)}{\sqrt{\sum_{n=1}^{N}\bigl(f[n]-\mu_f\bigr)^{2}\,\sum_{n=1}^{N}\bigl(g[n+m]-\mu_w\bigr)^{2}}}\,, \qquad (5.1)
\]

where:

1. N is the total number of samples of the shorter sequence.

2. f[n] is the signal under test.

3. g[n] is the nth sample of the ground-truth signal, which is shifted to the right and compared to the one under test.

4. µf is the average of the test signal.

5. µw is the average of the ground-truth signal.
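A minimal MATLAB sketch of equation (5.1) follows; the function name nccc and the assumption that both ENF estimates are column vectors sampled at the same rate are ours, while the thesis implementation is the ENF_time_origin function listed in Appendix A.2.

% nccc.m - minimal sketch of equation (5.1): f is the shorter test ENF,
% g the longer ground-truth ENF (both column vectors, same sampling rate).
function rho = nccc(f, g)
    N   = length(f);
    muf = mean(f);
    rho = zeros(length(g) - N + 1, 1);
    for m = 1:length(rho)
        w      = g(m:m+N-1) - mean(g(m:m+N-1));   % ground-truth window at shift m
        rho(m) = sum((f - muf).*w) / sqrt(sum((f - muf).^2)*sum(w.^2));
    end
end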

The entire chain of operations carried out in order to obtain the following experiments can be seen more clearly in Figure 5.9.

Figure 5.9: Signal Processing From Raw to Detection [4]

5.2.1 Experiment I- Simulated Delay

In Experiment I, carried out in Timisoara, Romania, a fluorescent, low-consumption light was switched on and the optical sensor of the camera was oriented towards a white area on a wall, with no movements. The recording of the power mains signal was started 7.5 seconds before the video recording in order to assess the accuracy of our time-localization method. During this video recording, all other lights in the room were switched off. The ENF estimates obtained with the MUSIC algorithm from the two signals around the frequencies of interest can be seen in Figure 5.10 and follow exactly the same pattern. A value of 50 Hz has been added to the power-signal ENF in order to make it easier to analyze. A delay of 100 seconds is simulated by shifting the luminosity signal. Then, the cross-correlation between the resulting signal and the ground truth is computed. Since the synchronization at the moment of recording was not perfect between the camera and the power circuit (7.5 seconds), an algorithm for a finer alignment of the signals would be required in order to simulate a fully correct situation. In Figure 5.10, we can see the ENF estimates used to simulate a delay of 100 seconds of the video recording with respect to the power recording. In order to perform the cross-correlation, two methods are used. First, the predefined MATLAB function xcorr is used together with a global normalization of the signals in energy. Second, the NCCC defined previously in this section has been used to verify the effectiveness of the method. As depicted in Figure 5.11, a peak of the cross-correlation is present at second 92.5. This is due to the fact that the recordings were not started at exactly the same time; the true offset between the two raw recordings can be calculated from the position of the cross-correlation peak, which results in a 7.5 second advance of the power signal with respect to the video signal. The day and the hour are taken, with the help of a text-processing algorithm, from the ground-truth signal, which is permanently recorded and stored in files whose names follow a standard format: fs_date_place_time_duration. Thus, upon detecting a match, exact information about the localization in time of the ENF can be provided in a fast and cheap way. By comparing the MATLAB correlation function with the NCCC, we can clearly see that the NCCC is more adequate, as it gives an undoubtedly better result, making it clearer where the peak of the correlation is present.
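As a sketch of how the detected peak is turned into a time stamp, assuming rho is the output of the nccc sketch from the previous section and using a hypothetical file name that follows the fs_date_place_time_duration convention:

% Convert the NCCC peak into a time stamp (names, values and file name are illustrative).
frame_rate = 1;                              % seconds per ENF sample (assumed)
[~, delay_pos] = max(rho);                   % index of the correlation maximum
delay_sec = delay_pos*frame_rate;            % detected shift in seconds

gt_name = '1000Hz_21-Jan-2016_timisoara_0610PM_30min.wav';   % hypothetical ground-truth file
parts = strsplit(gt_name, '_');
fprintf('Video starts %.1f s into the recording of %s at %s\n', ...
        delay_sec, parts{2}, parts{4});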


Figure 5.10: Red: ground truth; blue: luminosity, cut at seconds 100 and 830

[Plot: "Matching detected on 2101 at 6:51.541667 with a value of 0.999239"; peak marker at lag 92.5 s (NCCC 0.9992); legend: samplewise normalised, Matlab xcorr; Normalized Cross Correlation vs. Lag (s)]

Figure 5.11: Time localization of the unknown ENF extracted from the video signal; cross-correlation peaks at 92.5 seconds of delay.

To confirm the validity of the method, we cut the same signal at seconds 500 and 830, as can be seen in Figure 5.12. The resulting cross-correlation peak moved correspondingly to 492.5 seconds, as seen in Figure 5.13. Also, the information was updated in terms of local hour, as can be seen in the title of the plot (6:58 PM).


Figure 5.12: Red: ground truth; blue: luminosity, signal cut at seconds 500 and 830

[Plot: "Matching detected on 2101 at 6:58.208333 with a value of 0.998537"; legend: samplewise normalised, Matlab xcorr; Normalized Cross Correlation vs. Lag (s)]

Figure 5.13: Time localization of the unknown ENF extracted from the video signal; cross-correlation peaks at 492.5 seconds of delay.

5.2.2 Experiment II-Simulated Time Localization

In another experiment, carried out in Vigo, Spain, a concurrent power recording and video recording were made at a considerable distance from each other. Again, a 200 second delay has been simulated by cutting the luminance signal between seconds 200 and 530 and computing the correlation coefficient over time by shifting it over the ground truth (power signal). As seen in Figure 5.14 and in Figure 5.15, the algorithm again returned with very good accuracy the exact date and hour of the video recording, leading to the conclusion that a very precise time localization can be achieved with the NCCC.


Figure 5.14: Signals to be correlated for time localization

[Plot: "Matching detected on lab+lab at 6:5.375000 with a value of 0.996551"; peak marker at lag 200 s (NCCC 0.9947); legend: samplewise normalised, Matlab xcorr; Normalized Cross Correlation vs. Lag (s)]

Figure 5.15: Time localization of the unknown ENF extracted from the video signal in Vigo, Spain; cross-correlation peaks at 200 seconds of delay.

Chapter 6

Conclusions and Future Work

We have proved in this thesis that the electric network frequency variation along time, which at first sight seems to be a completely random phenomenon, can in fact be the key to new discoveries in the security of media and of people. If used with the right techniques, the potential of the ENF is very large, ranging from applications such as outdoor night security to evidence in court. Practically, any video recording which contains the so-called flicker can be localized both in time and space with the help of a database generated by circuits recording the electrical network and very low-end equipment for basic signal processing computations. The detection of the flickering inside videos, and thus the localization in time and/or space, has its own limitations. In fact, flicker-avoidance techniques have been developed such that this bothersome effect is not noticeable in video recordings. Factors like the dynamic light response mechanism of modern cameras or the frame rate can make the ENF apparently undetectable within the luminosity. Future work in the direction of determining the factors which influence the quality of the ENF signal captured from videos is being done. Video background extraction algorithms are being tried in order to improve the quality of the ENF signal extracted from video. A first approach, based on a custom configuration of the background extraction algorithm MOG2, has been implemented, but the results achieved applying the background extraction are very similar in terms of the maximum of the NCCC between the power and video ENF signals. Future work lines regarding the background extraction are:

- Trying different parameters for the background extraction algorithm.

- Trying image processing techniques in order to improve the mask returned by the background algorithm (erosion, dilation, removal of dark areas, segmentation, connected regions detection, etc.).

Moreover, other techniques can be applied to each frame in order to improve the extracted ENF signal. Image processing techniques such as image background extraction, object segmentation, thresholding (disregarding

dark pixels with low flicker) and so on, could be explored. We have also performed some experiments in real scenarios, such as videos downloaded from YouTube, and, besides the aliasing at low frame rates (fr < fv), we have detected another issue related to the shutter speed (or exposure time) of each frame, since the downloaded videos do not show flicker, at least at first glance. The shutter speed is the amount of time during which the sensor is exposed to the light for each frame (the maximum shutter speed is 1/fr). The amount of light reaching the sensor depends on the shutter speed, the lens aperture (also called f-stop),¹ and the scene's luminance. The video camera settings usually allow modifying the Exposure Value (EV), which is a parameter that is a function of the shutter speed and the lens aperture. Therefore, experiments using the GoPro at 30 fps and varying the EV compensation option have been conducted. Also, band-pass filter improvements are being studied in order to maximize the extraction of the ENF. Below, in Figure 6.1, we can see the ENF of a 30 fps video recorded with the GoPro at EV = −1, uploaded to and re-downloaded from YouTube.

[Plot: traces "ENF from light" and "A x ENF − 90" around 10 Hz; Frequency (Hz) vs. Time (s)]

Figure 6.1: ENF matching for video flicker at 30 fps. EVC = −1.

Here, there are two interesting things to be noted:

1. The MUSIC algorithm detects two powerful aliased replicas around the frequency of 10 Hz, as it was reported in [4], which means that the fps is not exactly 30.

2. The proper ENF detection could be made by MUSIC only after using a filter with a gain factor of 10 dB in the pass-band, with the frequency response shown in Figure 6.2.

Finally, we are trying to analyze all possible aliased frequencies along the spectrum (between 0 and fr/2 Hz) to find out where the ENF signal lies and how to create automatic algorithms to detect and exploit it to its maximum potential; the expected alias location for a given frame rate can be predicted as sketched below.
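A short sketch of this prediction, assuming a 50 Hz grid (so a 100 Hz flicker fundamental) and a few candidate frame rates; the folding formula is a direct consequence of the sampling theorem and the list of rates is illustrative.

% Where does the 100 Hz flicker component land after sampling at fps frames/s?
f_light = 100;                                   % flicker fundamental in a 50 Hz grid
for fps = [20 24 25 29.97 30 240]
    f_alias = abs(f_light - fps*round(f_light/fps));   % folded into [0, fps/2]
    fprintf('fps = %6.2f  ->  alias at %6.2f Hz\n', fps, f_alias);
end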

¹Common f-stop values are f/1.4, f/2, f/2.8, and f/4.


Figure 6.2: Improved band-pass filter applied before ENF extraction

One of the problems which has not been solved yet is the case where the aliased frequency falls at baseband. In that case, experiments are being conducted to see whether the information (the variation of the ENF) has been lost or whether signal processing techniques can be used to extract the wanted signal. In order to simulate what would happen in such a case, we started from a video recorded again with the GoPro camera at 240 fps, downsampled the signal so that the new frame rate would be 30 fps, and gave the signal as input to the MUSIC algorithm. As we can see in Figure 6.3, the matching was still kept and the aliased component appears around 10 Hz, the expected value. The next and final step was to simulate a frame rate which is an exact divisor of 100, in which case the aliased frequency is 0, according to the sampling theorem; the variations of the ENF should then be around 0, since they are very slow in time. After downsampling the luminance signal from 240 to 20 frames per second and giving it as an input to MUSIC, the result was as expected, as we can see in Figure 6.4. The information appears to be lost, since the algorithm no longer detects a matching between the two ENFs, as seen in Figure 6.5. As a final experiment, we tried to see whether there is any temporal match between the two signals after downsampling, and we plotted their spectra. As we can see in Figure 6.7, the cross-correlation suggests that the information has not been entirely lost and that it is rather a failure of MUSIC to detect it, but future work in this direction has to be done in order to determine whether that is true. The spectrum computed with the FFT also gives the expected results, as seen in Figure 6.6.
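A minimal simulation of this baseband case, using a synthetic luminance trace whose flicker follows a slowly varying ENF around 50 Hz; all values are illustrative, not the actual GoPro data.

% Synthetic 240 fps luminance with flicker at twice a slowly varying ENF.
fs0 = 240;                             % original frame rate
t   = (0:fs0*60-1)'/fs0;               % one minute of "video"
enf = 50 + 0.02*sin(2*pi*t/30);        % slow ENF variation around 50 Hz
lum = 1 + 0.05*cos(2*pi*cumsum(2*enf)/fs0);   % flicker at ~100 Hz

lum20 = lum(1:12:end);                 % naive downsampling to 20 fps (240/12)
pwelch(lum20 - mean(lum20), [], [], [], 20);  % the flicker folds to (near) 0 Hz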

[Plot: "ENF from power and light flicker at concurrent time (Amplitude factor applied)"; legend: Luminance, mean(Amplitude)*Power; Frequency (Hz) vs. Time (s), traces between about 10 and 11.2 Hz]

Figure 6.3: Matching between Power and Luminance after downsampling from 240 to 30 fps

[Plot: "Fluorescent lamp flicker (fps = 20)"; mean luminance vs. Time (s)]

Figure 6.4: Luminance after downsampling from 240 to 20 fps

As a conclusion, the research direction of the ENF in forensics is still a new concept which must be studied more deeply in order to understand and exploit a natural phenomenon that can give us valuable information in forensics. There are still a number of problems and limitations to address, but the objective of a possible project regarding the ENF problem is very promising, taking into account that the ENF proves to be solid evidence that something happened at a certain time and location.

[Plot: "ENF from power and light flicker at concurrent time (Amplitude factor applied)"; legend: Luminance, mean(Amplitude)*Power; Frequency (Hz) vs. Time (s), traces between about 8.2 and 10.2 Hz]

Figure 6.5: No matching between Power and Luminance after downsampling from 240 to 20 fps

[Plot: spectrum of luminance after filtering and downsampling (top) and spectrum of power after filtering and downsampling (bottom)]

Figure 6.6: Spectrum of Power and Luminance after downsampling from 240 to 20 fps


Figure 6.7: Cross Correlation between Power and Luminance after downsampling from 240 to 20 fps

Appendix A

A.1 Circuit Schematic and BOM


Figure A.1: Digital Tensiunator Schematic

Figure A.2: Digital Tensiunator BOM

A.2 Matlab Functions and Scripts

%% Script to collect data from the electrical grid with the analog circuit presented in the Analog Option section.
clear all; close all; clc;

b_analize_recording = 1;
fs = 1000;      % sampling frequency of the recorded signal
rec_time = 900; % recording time in seconds

har= dsp.AudioRecorder(’SampleRate’,fs,’NumChannels’,1,’ SamplesPerFrame’, fs);

% For saving the file after recording
date_now = date;
hour_now = hour(now);
minute_now = minute(now);

name_file= sprintf(’Network_Recording_Tensiunator_%s_%i_%i. wav’, date_now,hour_now,minute_now);

hmfw= dsp.AudioFileWriter(name_file,’FileFormat’,’WAV’,’ SampleRate’,fs);

disp(’Recording...’);

tic; while toc <=rec_time, step(hmfw, step(har)); end

release(har); release(hmfw); fprintf(’\nRecording complete\n’); fprintf(’File%s saved\n’, name_file);

if b_analize_recording

[x, fs]=audioread(name_file); figure; t= linspace(0, 1/fs*length(x), length(x)); plot(t,x); figure; pwelch(x,[],[],[],fs) coeff=filt_47_53(fs); x=filter(coeff,x); window =10; nooverlap_wfactor = 7; overlap= window-round(window/nooverlap_wfactor);

[P, Fp, Tp] = music_ENFestimation(x,2,[],fs,window, overlap, 1,’seconds’); power_signal= max(Fp, [], 2); figure hold on plot(Tp, power_signal,’r’); xlabel(’Time(s)’), ylabel(’Frequency(Hz)’), title(sprintf(’ENF from power recording’))

end

%%%%%%%%%%%%%%%%%%%%%%Read data from Digital Tensiunator %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

clear all;clc;close all;

%%%% ENTER TIME TO RECORD: %%%%%
timetorecord_seconds = 900; % seconds

%%%%%%% BOOLEAN CONTROL VARIABLES %%%%%%%%%%%%%%%%%%%%%%%%%
save_raw_recording = 1; % save recording or not
b_ENF_estimation = 1;   % 1 to perform the ENF estimation when the recording finishes

%%%%%%%%%%%%%%%%%%%%%%%

delete(instrfindall);
b_filter = 1;
port = '/dev/ttyUSB0'; % 'COM3' for Windows, /dev/ttyUSB0 for Linux
s = serial(port);      % MODIFY COM PORT NUMBER ACCORDING TO DEVICE MANAGER

%set(s,'InputBufferSize',512); % number of bytes in input buffer
set(s,'FlowControl','none');
set(s,'BaudRate', 115200);
set(s,'Parity','none');
set(s,'DataBits', 8);

set(s,’StopBit’, 1); set(s,’Terminator’,’’);

disp(get(s,’Name’)); prop(1)=(get(s,’BaudRate’)); prop(2)=(get(s,’DataBits’)); prop(3)=(get(s,’StopBit’)); prop(4)=(get(s,’InputBufferSize’)); fopen(s); disp([num2str(prop)]);

fs=1000;

n=timetorecord_seconds*fs;

datafin=zeros(n,1);

% For saving the file after recording
date_now = date;
hour_now = hour(now);
minute_now = minute(now);
data = 0;
disp('Tensiunator is recording, please wait...');
tic;

k=1; while toc< timetorecord_seconds,

data=fscanf(s,’%x’);

b=isfloat(data); if(b) datafin(k)=(data*3.3)/4096; k=k+1; end

end disp(toc); fclose(s);%close the serial port

detect=0;

disp('Tensiunator has finished recording!');
datafin = datafin - mean(datafin);

%%%%%%%%%%%%plot sinusoid%%%%%%% dt = 1/fs; t = 0:dt:(length(datafin)*dt)-dt; figure hold on plot(t, datafin), xlabel(’Time(s)’), ylabel(’Amplitude’), s_title= sprintf(’ENF power recording from Tensiunator’); title(’sinusoid from Tensiunator’)

%%%%%%%plot spectral componentsPSD%%%%%%%% figure hold on pwelch(datafin’,[],[],[],fs) title(’spectral composition of sinusoid from Tensiunator’)

if save_raw_recording save_load_file= sprintf(’ Network_Recording_Tensiunator_%s_%i_%i.mat’, date_now,hour_now,minute_now); save(save_load_file,’fs’,’datafin’); end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%ENF estimation %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

if b_ENF_estimation %%%%%%%filter sinusoid%%%%%%%%%%%%% if b_filter coeff=filt_47_53(fs); datafinfilt=filter(coeff,datafin); end

%%%%%%%analiseENF%%%%%%%%%%% window =10; nooverlap_wfactor= 7; overlap= window-round(window/nooverlap_wfactor);

[P, Fp, Tp]= music_ENFestimation(datafinfilt,2,[],fs, window,overlap, 1,’seconds’); power_signal= max(Fp,[], 2);

%%%%%%%%%%%plotENF%%%%%%%%%% figure hold on plot(Tp, power_signal,’r’); xlabel(’Time(s)’), ylabel(’Frequency(Hz)’), title(sprintf(’ENF from power recording’))

end

%% Band-pass filter between 47 and 53 Hz, in order to condition the signal from the power grid for processing
function Hd = filt_47_53(Fs)
%FILT_47_53 Returns a discrete-time filter object.

%MATLAB Code % Generated byMATLAB(R) 8.3 and the Signal Processing Toolbox 6.21. % Generated on: 19-Nov-2015 18:42:43

% Equiripple Bandpass filter designed using theFIRPM function.

% All frequency values are in Hz. % Fs = 1200; % Sampling Frequency

Fstop1=46; % First Frequency Fpass1=47; % First Passband Frequency Fpass2=53; % Second Passband Frequency Fstop2=54; % Second Stopband Frequency Dstop1=0.0001; % First Stopband Attenuation Dpass = 0.057501127785; % Passband Dstop2=0.0001; % Second Stopband Attenuation dens=20; % Density Factor

% Calculate the order from the parameters usingFIRPMORD. [N, Fo, Ao,W] = firpmord([Fstop1 Fpass1 Fpass2 Fstop2]/(Fs /2) , [0 1 ... 0], [Dstop1 Dpass Dstop2]);

% Calculate the coefficients using theFIRPM function. b= firpm(N, Fo, Ao,W,{dens}); Hd= dfilt.dffir(b);

%[EOF]

%% Code to extract and compare the processed luminance from a video with the ENF extracted from the power grid.
%%%%%%%%%%%%%%%%%%%%%% ENF vs light flicker %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

function[luminance_signal,power_signal,T,Tp,interp_F,F,Fp, A_factor]=Compare_ENF_light_concurrent(power_ENF_wav,fs, power_ENF_mat,ENFpower_file,window,nooverlap_wfactor, resolution,fps,Y)

overlap= window-window/nooverlap_wfactor;

%%%%%%%%%%%%%%% ENF estimation from light flicker %%%%%%%%
av_Y = Y./prod(resolution);
t = 1/fps*linspace(0, numel(av_Y), numel(av_Y));
figure
hold on

plot(t, av_Y) xlabel(’Time(s)’), ylabel(’mean(Y)’) title(sprintf(’Fluorescent lamp flicker(fps=%i)’, fps))

coeff=filter_10(fps); av_Y=filter(coeff,av_Y);

%%%ENF analysis [~,F,T] = music_ENFestimation(av_Y,2,[],fps,window,overlap , 1,’seconds’);

luminance_signal= max(F,[],2);

figure hold on

plot(T, luminance_signal); xlabel(’Time(s)’), ylabel(’Frequency(Hz)’), title(sprintf(’ENF from light flicker(w(sec)=%i, overlap =%g%%)’, window, (1-1/nooverlap_wfactor)*100))

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%%%%%%%%%%%%%%%ENF estimation from power recording%%%%%%% if power_ENF_wav [x_power, fs] = audioread(ENFpower_file); else if power_ENF_mat ENFpower=load(ENFpower_file);

x_power=ENFpower.datafin;

end end coeff=filt_47_53(fs); x_power=filter(coeff,x_power); dt = 1/fs; t = 0:dt:(length(x_power)*dt)-dt; figure hold on plot(t, x_power), xlabel(’Time(s)’), ylabel(’Amplitude’), %s_title= sprintf(’ENF power recording from file%s’, ENFpower_file); title(’Filtered power signal’)

%%%ENF analysis [~, Fp, Tp] = music_ENFestimation(x_power,2,[],fs,window, overlap, 1,’seconds’);

power_signal = max(Fp, [], 2);
figure
hold on

plot(Tp, power_signal,’r’); xlabel(’Time(s)’), ylabel(’Frequency(Hz)’), title(sprintf(’MUSICENF%s’,ENFpower_file))

%%%%%%%%%%%%%%%%

interp_F= interp1(T, max(F,[],2), Tp);

difference= interp_F./max(Fp,[],2)’;

%%%% Plot together with the gain factor%%%%%%%%%%% A_factor= mean(difference(~isnan(difference)));

figure hold on plot(T, max(F,[], 2)), hold on xlabel(’Time(s)’), ylabel(’Frequency(Hz)’),

plot(Tp,(A_factor)*max(Fp,[], 2),’-r’), hold off xlabel(’Time(s)’), ylabel(’Frequency(Hz)’), ylim([99.6 100.5])

title(’ENF from power and light flicker at concurrent time( Amplitude factor applied)’) legend(’Luminance’,’mean(Amplitude)*Power’)

%%%%% Plot together normalized by mean and standard deviation%%%%%%% figure hold on s1= max(F,[],2); s1=(s1- mean(s1))./ std(s1); plot(T, s1), hold on xlabel(’Time(s)’), ylabel(’Frequency(Hz)’),

s2= max(Fp,[],2); s2=(s2- mean(s2))./ std(s2); plot(Tp, s2,’-r’), hold off xlabel(’Time(s)’), ylabel(’Frequency(Hz)’),

title(’ENF from power and light flicker at concurrent time( Normalized by mean and std)’) legend(’Normalized Luminance’,’Normalized Power’) end

%%%%%%%%%%%%%%%%%%%%%% Correlation Coefficient Calculation and Time Origin Detection between 2 ENFs %%%%%%%%%%%%%%%%%%

clear all; close all; clc;
ENF_ENF = 0;      % both are power/audio recordings
ENF_LIGHT = 1;    % one is luminance, the other is power
n_sec = 2;
window = n_sec;
nooverlap_wfactor = 2;
frame_rate = n_sec/nooverlap_wfactor;   % sample rate after ENF estimation
cut_ini = 200;
cut_fin = 530;

if ENF_ENF
    both_mat = 0;
    both_wav = 1;
    fs_different = 1;
    if both_mat
        ENFpower_file1 = 'myspeech.wav';
        ENFpower_file2 = 'recording20160301211510.wav';
        name = ENFpower_file1;
        numberStr = regexp(name, '(\d*)', 'tokens');
        % day = cell2mat(numberStr{1});
        % year = cell2mat(numberStr{2});
        hour = cell2mat(numberStr{3});
        min = cell2mat(numberStr{4});
        j = strsplit(ENFpower_file1, '_');
        date = j{4};
    else if both_wav
        ENFpower_file1 = 'myspeech.wav';
        ENFpower_file2 = 'recording20160301211510.wav';
    end
    end

    fs = 1000;
    dbstop in Compare_ENF_ENF_concurrent at 7
    [power_signal_ext, power_signal_tens, T, Tp, interp_F, F, Fp, A_factor] = Compare_ENF_ENF_concurrent(both_wav, both_mat, fs_different, ENFpower_file1, ENFpower_file2, window, nooverlap_wfactor);
    signal1 = power_signal_ext;
    signal2 = power_signal_tens;
else if ENF_LIGHT

    Y_1 = load('30fps_cuvi+cuvi_0205_1201_PM_1.mat', '-ASCII');
    Y_2 = load('30fps_cuvi+cuvi_0205_1201_PM_2.mat', '-ASCII');
    %Y_3 = load('240fps_timisoara_2101_0610PM_3.mat', '-ASCII');
    %Y_4 = load('240fps_timisoara_2101_0610PM_4.mat', '-ASCII');
    Y = [Y_1 Y_2];
    fps = 30;
    resolution = [1280 720];

ENFpower_file=’1200Hz_CUVI+CUVI_0205_1201PM.wav’; name=ENFpower_file; numberStr=regexp(name,’(\d*)’,’tokens’); % day=cell2mat(numberStr{1}); % year=cell2mat(numberStr{2}); hour=cell2mat(numberStr{3}); min=cell2mat(numberStr{2}); j= strsplit(ENFpower_file,’_’); date=j{3}; % load(’Network_Recording_Tensiunator_08-Feb-2016_22_30.mat ’); fs=1200; dbstop in Compare_ENF_light_concurrent at 12 power_ENF_wav=1; power_ENF_mat=0; [luminance_signal,power_signal,T,Tp,interp_F,F,Fp,A_factor ]=Compare_ENF_light_concurrent(power_ENF_wav,fs, power_ENF_mat,ENFpower_file,window,nooverlap_wfactor, resolution,fps,Y); signal1=power_signal; signal2=luminance_signal; end end

ind_ini=find(Tp>=cut_ini,1); ind_fin=find(Tp >= cut_fin,1);

signal2=signal2(ind_ini:ind_fin); % save_load_file= sprintf(’lum_power’); % save(save_load_file,’ENF_estimation1’,’ENF_estimation2’,’ Tp’,’T’,’cut_ini’,’cut_fin’);

figure hold on plot(T(ind_ini:ind_fin),signal2) plot(Tp,signal1.*A_factor,’r-’) title(’ENF signals to cross correlate’)

%dbstop in ENF_time_origin at5 [crosscor,delay_pos_sec,correct_asynch,correct_eq_in_frames, T_int]=ENF_time_origin(signal1,signal2,Tp,cut_ini, frame_rate);

%%%%%%%%%%%%%%%%%%%%%%%%%%alignment%%%%%%%%%%%%%%%%%%%%% [signal_1,signal_2,D]= alignsignals(signal1,signal2, correct_eq_in_frames);

ind_begi=find(signal_2>=1,1);

figure

hold on %plot(Tp(ind_ini+ind_begi:length(signal_2)+ind_ini+ind_begi -1),signal_2(ind_begi:end)) plot(Tp(ind_ini-correct_eq_in_frames:length(signal_2)+ ind_ini-correct_eq_in_frames-1),signal_2(ind_begi:end)) plot(Tp,signal_1’.*A_factor,’r-’) title(’ENF signals aligned in time’) hour=str2num(hour); min=str2num(min); real_min=delay_pos_sec/60; if real_min>60 real_min=real_min/60; real_time=hour+floor(real_min); real_min=min+mod(real_min, 60); else real_min=min+real_min;

end

%%%%%%%%%%%%%%%%%%%%Matlab crosscor%%%%%%%%%%%%%%%%%%%%%%%% signal1=signal1-mean(signal1); signal2=signal2-mean(signal2); [Rmm,lags]=xcorr(signal1,signal2,’none’); EN=sqrt(sum(signal2.^2) * sum(signal1.^2)); Rmm=Rmm./EN;

figure hold on plot(T_int,crosscor) plot(lags*frame_rate,Rmm,’r-’) xlabel(’Lag(s)’) ylabel(’Normalized Cross Correlation’) legend(’Samplewise normalised’,’Matlab xcorr’) title(sprintf(’Matching detected on%s at%i:%f witha value of%f’,date,hour,real_min,max(crosscor)))

%% Code for classifying 2 networks and trying to localize them geographically.
clear all; close all; clc;
spect_files = dir(fullfile('GridA_ROOT_MUSIC', 'Spectrograms', '*.mat'));
i = 1;
contor = 0;
contor50 = 0;
contor60 = 0;
savefiles = 0; % 1 to save statistics for each grid in a file, 0 to output statistics for each grid in the workspace
clas_names = {'B', 'C'};
true_clas = {'1'};
range_tot = [];
mean_spect_tot = [];

for file=spect_files’ mn_spect_per_frame=[]; range_per_frame=[]; variance=[]; contor=0; nae=spect_files(i).name;

j= strsplit(nae,’_’); clas{i}=j{4};

% fullfile(’GridA_ROOT_MUSIC’,’Spectrograms’,name); load(fullfile(’GridA_ROOT_MUSIC’,’Spectrograms’,nae),’T’,’F ’); step1=floor(length(F(:,1))/12);%must work on this for probe=1:step1:length(F(:,1))-step1 contor=contor+1; mn_spect_per_frame(contor)=mean(F(probe:probe+step1,1),1) ;

variance(contor)=var(F(probe:probe+step1,1),0,1); species{contor}= nae(24:27);

range_per_frame(contor)=abs(max(F(probe:probe+step1,1))-min (F(probe:probe+step1,1)));

end %put the global value of each param at the end of the vector with per-frame %values mean_spect_tot(i)=mean(F(:,1)); mn_spect_per_frame=[mn_spect_per_frame mean_spect_tot(i)]; mean_spect_tot=sort(mean_spect_tot);

variance_tot(i)=var(F(:,1)); variance=[variance variance_tot(i)]; species=[species nae(24:27)]; range_tot(i)=log(max(F(:,1))-min(F(:,1))); range_per_frame=[range_per_frame range_tot(i)]; F(:,1)=F(:,1)-mean_spect_tot(i); F(:,2)=abs(F(:,2));

%produce name for calculated parameters in the above for loop if savefiles save_load_file= sprintf(’Statistics_%s_window_%i.mat’,nae (13:27),step1); save(save_load_file,’mn_spect_per_frame’,’variance’,’ range_per_frame’,’F’,’T’,’species’); fprintf(’Spectrogram saved in%s\n’, save_load_file); else namefinvar=nae(13:end); 64 APPENDIX A.

aux3=genvarname([’mn_spect_per_frame_’,namefinvar]); eval([aux3,’=mn_spect_per_frame’]); aux4=genvarname([’variance_per_frame_’,namefinvar]); eval([aux4,’=variance’]); aux5=genvarname([’range_per_frame_’,namefinvar]); eval([aux5,’=range_per_frame’]); aux1=genvarname([’T_fin’,num2str(i)]); aux2=genvarname([’F_fin’,num2str(i)]); eval([aux1,’=T’]); eval([aux2,’=F’]); end

i = i + 1;
end
mean_spect_tot = mean_spect_tot';
range_tot = range_tot';
for k = 1:length(mean_spect_tot)
    if mean_spect_tot(k,1) - 50 < 5
        mean_spect_tot(k,2) = 0;
        mean_spect_tot(k,3) = 50;
    else if mean_spect_tot(k,1) - 60 < 5
        mean_spect_tot(k,2) = 1;
        mean_spect_tot(k,3) = 60;

    end
    end
    if strcmp(clas{k}, clas_names{2})
        range_tot(k,2) = 0;
        range_tot(k,3) = 50;
    else if strcmp(clas{k}, clas_names{1})
        range_tot(k,2) = 1;
        range_tot(k,3) = 60;

else error(’unknown class!’) end

end end x=mean_spect_tot(:,1); x_range=range_tot(:,1); y=logical(mean_spect_tot(:,2)); y_range=logical(range_tot(:,2)); b= glmfit(x,y,’binomial’); b_range=glmfit(x_range,y_range,’normal’); % Logistic regression p= glmval(b,x,’logit’); p_range=glmval(b_range,x_range,’identity’); auxmean=num2str(mean_spect_tot(:,3)); auxmean_range=num2str(range_tot(:,2)); species=cellstr(auxmean); species_range=cellstr(auxmean_range); % Fit probabilities for scores %[X,Y]= perfcurve(species,p,’60’); [X_range,Y_range]=perfcurve(species_range,p_range,true_clas) ;

% figure

% plot(X,Y) % xlabel(’False positive rate’); ylabel(’True positive rate’ ) % title(’ROC for classification by logistic regression’) %

figure plot(X_range,Y_range) xlabel(’False positive rate’); ylabel(’True positive rate’) title(’ROC for classification by logistic regression,range feature’)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% detect matching and time origin between 2 ENF signals %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [crosscor, delay_pos_sec, correct_asynch, correct_eq_in_frames, T_int] = ENF_time_origin(signal1, signal2, Tp, cut_ini, frame_rate)

if length(signal1) > length(signal2)   % the ground truth must be the longer sequence

    for u = 1:length(signal1) - length(signal2)
        trozo = signal1(u:u+length(signal2)-1);
        crosscor(u) = sum((trozo - mean(trozo)).*(signal2)) / sqrt(sum(signal2.^2) .* sum((trozo - mean(trozo)).^2));
    end

T_int=Tp; T_int=T_int(1:length(crosscor));

[~,delay_pos]=max(crosscor); delay_pos_sec=delay_pos*frame_rate;

correct_asynch=cut_ini-delay_pos_sec; correct_eq_in_frames=correct_asynch/frame_rate;

end

function [P, F, T] = music_ENFestimation(x, p, nfft, fs, window, overlap, b_rootmusic, p_units)
%MUSIC_ENFESTIMATION Spectrogram using MUSIC or RootMUSIC.
%   window:      window length.
%   overlap:     window overlap length.
%   b_rootmusic: 1 for rootMUSIC, 0 for MUSIC.
%   p_units:     indicates the units of window and overlap. If 'seconds', the length of the window
%                and the overlap is defined in seconds; if 'samples', the lengths are in number of samples.
%                The units are 'samples' by default.

if nargin<8, p_units=’samples’; end if nargin<7, b_rootmusic = 1; end

switch p_units case’seconds’ b_unit_sec = 1; case’samples’ b_unit_sec = 0; otherwise error(’Unexpected units parameter’) end if b_unit_sec == 1 window= window*fs; overlap= overlap*fs; end

l_x= length(x); nwin= floor((l_x-overlap)/(window-overlap)); index_win = 1 + (0:(nwin-1))*(window-overlap); if b_rootmusic F= NaN(nwin,p); P= NaN(nwin, 0.5*p); end

for i = 1:nwin
    if b_rootmusic
        %[F, P_cell] = rootmusic(x(index_win(i):index_win(i)+window-1), p, fs);
        [F(i,:), ~] = rootmusic(x(index_win(i):index_win(i)+window-1), p, fs);

%P(i,:)

        %if size(P_cell{i}) ~= 2
        %    warning('RootMUSIC estimation problem!!!')
        %    fprintf('Estimation result in the window %i: %g ', P_cell{i});
        %end
    else % this part is not implemented correctly!!! revise!!!
        [P{i}, F] = pmusic(x(index_win(i):index_win(i)+window-1), p, nfft);

end end

% check if all vectors into the cell P_cell have the same length

%if(P_cell{:})

%end

%P= cell2mat(P_cell); %else % warning(’Different lengths between vectors from the cell array P_cell’) %end

T = ((index_win-1)+(window/2)) / fs;
if ~b_rootmusic
    %P = cell2mat(P_cell);
    P = 0;
    clear P_cell
    F = fs/2*F/pi;
end

end

%% Code to extract the ENF using the Short Time Fourier Transform
clear all; clc; close all;
name = '1200Hz_timisoara+timisoara_2101_0610PM_30min_ECON.wav';
[x, fs] = audioread('1200Hz_timisoara+timisoara_2101_0610PM_30min_ECON.wav'); % get the samples of the .wav file

%x=x(:,1); % get the first channel % xmax= max(abs(x));% find the maximum abs value %x=x/xmax;% scalling the signal

% define analysis parameters
xlen = length(x);    % length of the signal
if (xlen < 3000000)
    wlen = 131072;   % window length (recommended to be a power of 2)
    wlen = wlen/2;
    h = wlen/2;      % hop size (recommended to be a power of 2)
    nfft = 131072*2; % number of fft points (recommended to be a power of 2)

else
    wlen = 65536;    % window length (recommended to be a power of 2)
    h = wlen/2;      % hop size (recommended to be a power of 2)
    nfft = 131563;   % number of fft points (recommended to be a power of 2)

end
% define the coherent amplification of the window
K = sum(hamming(wlen,'periodic'))/wlen;

% performSTFT [Y,s,f,t] = stft_mod(x, wlen,h, nfft, fs); s= abs(s)/wlen/K; %Y(1,:)=abs(Y(1,:))./wlen./K; % po=Y(1,:)’; % correction of theDC& Nyquist component if rem(nfft, 2)% odd nfft excludes Nyquist point s(2:end,:)=s(2:end,:).*2; else% even nfft includes Nyquist point s(2:end-1,:)=s(2:end-1,:).*2; end

% convert amplitude spectrum to dB(min= -120 dB) s= 20*log10(s+1e-6);

% plot the spectrogram [val_out pos_out]=max(s);%max works on columns by default( likeFFT) pos_out=pos_out*fs/nfft; spectrogram(x,wlen,h,nfft,fs) title(sprintf(’SpectrogramENF%s’,name))

figure plot3(t, pos_out,val_out) view(2) title(sprintf(’Maximum of SpectrogramENF%s’,name))

function[Y,stft,f,t] = stft_mod(x, wlen,h, nfft, fs)

%x- signal in the time domain % wlen- length of the hamming window %h- hop size % nfft- number ofFFT points % fs- sampling frequency, Hz %f- frequency vector, Hz %t- time vector,s % stft-STFT matrix(only unique points, time across columns, freq across rows)

% represent x as a column-vector if it is not
if size(x, 2) > 1
    x = x';
end

% length of the signal xlen= length(x);

% forma periodic hamming window win= hamming(wlen,’periodic’);

% form the stft matrix rown= ceil((1+nfft)/2);% calculate the total number of rows coln= 1+fix((xlen-wlen)/h);% calculate the total number of columns stft= zeros(rown, coln);% form the stft matrix Y=zeros(2,coln); % initialize the indexes indx= 0; col= 1; count=0; % performSTFT while indx+ wlen <= xlen % windowing xw=x(indx+1:indx+wlen).*win; count=count+1; %FFT X= fft(xw, nfft); %u=10.*log10(X(1:rown)); %[val, pos]=max(abs(X)); %Y(1,col)=val; %Y(2,col)= pos*fs/nfft; % update the stft matrix stft(:, col)=X(1:rown);

% update the indexes indx= indx+h; col= col+ 1; end

% calculate the time and frequency vectors t=(wlen/2:h:wlen/2+(coln-1)*h)/fs; f= (0:rown-1)*fs/nfft;

end

A.3 C Code for PIC Microcontroller

#include #include #include #include #include

#define _XTAL_FREQ 32000000

#pragma config CLKOUTEN=OFF #pragma config WDTE=OFF #pragma config PWRTE=OFF #pragma config CP=OFF #pragma config BOREN=OFF #pragma config FCMEN=OFF #pragma config CPD=OFF #pragma config IESO=OFF #pragma config BORV=HI #pragma config MCLRE=OFF #pragma config FOSC=INTOSC #pragma config PLLEN=OFF #pragma config STVREN=ON #pragma config LPBOR=OFF #pragma config LVP=OFF #pragma config WRT=OFF

#define VCC 3.3

char measureAC[3], measureLDR[2], measurePhotD[2]; char adH; char adL; bit test_out;

void _sysinit(void){ TRISA0 = 1; // 1.CONFIGUREPORTADC TRISA1 = 1; //pins configured asINPUT TRISA2 = 1; TRISA3 = 0; TRISC6 = 0; TRISC7 = 1; ANSA0 = 1; //pins selected asANALOG input ANSA1 = 1; ANSA2 = 1; CCP3SEL = 1; //pin selected as inputRX WPUC = 0; //?? INLVLC = 0; //TTL input fromFTDI 5-volt chip,ST=CMOS with trigger schmidth OSCCON=0b11111000; ADCON1=0b11100000; // 2.CONFIGUREADCMODULE ADCON2=0x0F; ADIE = 0; GIE = 0; //CONFEUSART SPBRGH = 0; SPBRGL = 68; BRG16 = 1; BRGH = 1; SYNC = 0; SPEN = 1; //CSRC= 1; //SREN= 0; CREN = 0; A.3. C CODE FOR PIC MICROCONTROLLER 71

TXEN = 0; TX9 = 0; TXIE = 0; PEIE = 0; OPTION_REG=0x05; //pre scaler for timer 0, 101 or 1:64 mode selected INTCON=0; //disable all interrupts

}

void rs232_printf(const char *string, int length){
    unsigned char i;
    TXEN = 1;
    for(i = 0; i < length; i++) {
        TXREG = string[i];
        __delay_us(90);
    }
}

void conversion(unsigned char channel){

switch(channel){
    case 0: ADCON0 = 1; break;
    case 1: ADCON0 = 5; break;
    case 2: ADCON0 = 9; break;
    default: break;
}

__delay_us(50); GO = 1; while(GO); adH=ADRESH; adL=ADRESL; ADIF = 0; ADON = 0; }

// adResult=(ADRESH << 8)+ADRESL;

void main(void){
    _sysinit();
    int value;
    //float decValue;
    char stringX[5], stringY[5];

    test_out = 1;
    while (1) {
        TMR0 = 131;   // 256 minus the number of timer-0 cycles (f(timer0) = 125 kHz): 131 -> 1 kHz, 6 -> 500 Hz
        RA3 = test_out;
        test_out = !test_out;
        conversion(0);

measureAC[0] = adH; measureAC[1] = adL; value=(adH << 8)+adL; conversion(1); // measurePhotD[0]=adH; // measurePhotD[1]=adL;

//sprintf(stringX,"%X",value); // stringX[4]=’’; // value=(adH<<8)+adL;

sprintf(stringY,"%X",value); stringY[4]=’’; // rs232_printf(stringX,5); rs232_printf(stringY,5); // measureAC[2]=0; //value=(adH << 8)+adL; // decValue=(value*VCC)/4096.0; //sprintf(s,"%1.4f\r\n",decValue);

//conversion(1); /* measureLDR[0]=adH; measureLDR[1]=adL; value=(adH << 8)+adL; decValue=(value*VCC)/4096.0; sprintf(s,"%1.4f",decValue); // rs232_printf(s,8); **/ /* conversion(2); measurePhotD[0]=adH; measurePhotD[1]=adL; value=(adH<<8)+adL; decValue=(value*VCC)/4096.0; sprintf(s,"%1.4f",decValue); */

// rs232_printf(s,8); while(!TMR0IF); //wait for timer TMR0IF=0;

    }
}

A.4 Luminosity Extraction Algorithm

#include #include #include

int main(int argc, char **argv){
    int c;
    FILE *fid;
    long int sum;

    // open output file
    fid = fopen("output.mat", "w+");

    // loop for reading pipe
    while((c = getc(stdin)) != EOF){
        ungetc(c, stdin);
        unsigned char buffer[1280*360*3];   // 1280*720*1.5
        int i;

        // initialize sum
        sum = 0;

        // read an entire YUV420 frame of size 1280x720
        fread(buffer, 1280*720*1.5, 1, stdin);

        // sum all pixel values of one frame (Y plane only)
        for(i = 0; i < 1280*720; i++){
            sum += (int) buffer[i];
        }

        fprintf(fid, "%ld\n", sum);   // one luminance sum per frame
    }

    fclose(fid);
    return 0;
}

Bibliography

[1] Daniele Gallo, Carmine Landi, Nicola Pasquino, "An Instrument for the Objective Measurement of Light Flicker", IMTC 2005 - Instrumentation and Measurement Technology Conference, Ottawa, Canada, May 17-19, 2005

[2] Fernando Perez-Gonzalez, Juan R. Hernandez, "A Tutorial on Digital Watermarking", Proceedings of the IEEE 33rd Annual 1999 International Carnahan Conference on Security Technology (Cat. No.99CH36303), 1999

[3] Prabhishek Singh, R S Chadha, "A Survey of Digital Watermarking Techniques, Applications and Attacks", International Journal of Engineering and Innovative Technology (IJEIT), Volume 2, Issue 9, March 2013

[4] Ravi Garg, Avinash L. Varna, Adi Hajj-Ahmad, Min Wu, "'Seeing' ENF: Power-Signature-Based Timestamp for Digital Multimedia via Optical Sensing and Signal Processing", IEEE Transactions on Information Forensics and Security, Vol. 8, No. 9, September 2013

[5] GRADIANT, ENF Internal Report, 2015

[6] Wojciech Mazurczyk, Krzysztof Szczypiorski, "Advances in digital media security and right management", Multimedia Systems (2014) 20:101-103, DOI 10.1007/s00530-013-0339-8

[7] Daniel Patricio Nicolalde, Jose Antonio Apolinario Jr, "Evaluating Digital Audio Authenticity with Spectral Distances and ENF Phase Change", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), Taipei, Taiwan, 19-24 April 2009, DOI: 10.1109/ICASSP.2009.4959859

[8] Niklas Fechner, Matthias Kirchner, "The Humming Hum: Background Noise as a Carrier of ENF Artifacts in Mobile Device Audio Recordings", Eighth International Conference on IT Security Incident Management and IT Forensics (IMF), 2014


[9] Adi Hajj-Ahmad, Ravi Garg, Min Wu, "Instantaneous Frequency Estimation and Localization for ENF Signals", Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012

[10] Adi Hajj-Ahmad, Ravi Garg, Min Wu, "ENF-Based Region-of-Recording Identification for Media Signals", IEEE Transactions on Information Forensics and Security, Vol. 10, No. 6, June 2015

[11] Hui Su, Adi Hajj-Ahmad, Min Wu, Douglas W. Oard, "Exploring the Use of ENF for Multimedia Synchronization", 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

[12] https://en.wikipedia.org/wiki/Cross-correlation

[13] Ravi Garg, Avinash L. Varna, Min Wu, "'Seeing' ENF: Natural Time Stamp for Digital Video via Optical Sensing and Signal Processing", Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28 - December 1, 2011, DOI: 10.1145/2072298.2072303

[14] Hui Su, Adi Hajj-Ahmad, Chau-Wai Wong, Ravi Garg, Min Wu, "ENF Signal Induced by Power Grid: A New Modality for Video Synchronization", Proceedings of the 2nd ACM International Workshop on Immersive Media Experiences, pp. 13-18

[15] Keith Jack, "Video Demystified: A Handbook for the Digital Engineer", Newnes, 2005, Chapter 3

[16] http://www.vishay.com/docs/81519/bpw21r.pdf

[17] http://astro.u-strasbg.fr/~koppen/blueskies/photometer.html

[18] https://en.wikipedia.org/wiki/Receiver_operating_characteristic

[19] http://housecraft.ca/eco-friendly-lighting-colour-rendering-index-and-colour-temperature/

[20] Alinor at English Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=30154547

[21] http://www.videomaker.com/article/f6/15788-the-anatomy-of-chroma-subsampling

[22] http://gim.unmc.edu/dxtests/roc3.htm