Comparing Oversampling Techniques to Handle the Class Imbalance Problem: a Customer Churn Prediction Case Study

Total Page:16

File Type:pdf, Size:1020Kb

Comparing Oversampling Techniques to Handle the Class Imbalance Problem: a Customer Churn Prediction Case Study Received September 14, 2016, accepted October 1, 2016, date of publication October 26, 2016, date of current version November 28, 2016. Digital Object Identifier 10.1109/ACCESS.2016.2619719 Comparing Oversampling Techniques to Handle the Class Imbalance Problem: A Customer Churn Prediction Case Study ADNAN AMIN1, SAJID ANWAR1, AWAIS ADNAN1, MUHAMMAD NAWAZ1, NEWTON HOWARD2, JUNAID QADIR3, (Senior Member, IEEE), AHMAD HAWALAH4, AND AMIR HUSSAIN5, (Senior Member, IEEE) 1Center for Excellence in Information Technology, Institute of Management Sciences, Peshawar 25000, Pakistan 2Nuffield Department of Surgical Sciences, University of Oxford, Oxford, OX3 9DU, U.K. 3Information Technology University, Arfa Software Technology Park, Lahore 54000, Pakistan 4College of Computer Science and Engineering, Taibah University, Medina 344, Saudi Arabia 5Division of Computing Science and Maths, University of Stirling, Stirling, FK9 4LA, U.K. Corresponding author: A. Amin ([email protected]) The work of A. Hussain was supported by the U.K. Engineering and Physical Sciences Research Council under Grant EP/M026981/1. ABSTRACT Customer retention is a major issue for various service-based organizations particularly telecom industry, wherein predictive models for observing the behavior of customers are one of the great instruments in customer retention process and inferring the future behavior of the customers. However, the performances of predictive models are greatly affected when the real-world data set is highly imbalanced. A data set is called imbalanced if the samples size from one class is very much smaller or larger than the other classes. The most commonly used technique is over/under sampling for handling the class-imbalance problem (CIP) in various domains. In this paper, we survey six well-known sampling techniques and compare the performances of these key techniques, i.e., mega-trend diffusion function (MTDF), synthetic minority oversampling technique, adaptive synthetic sampling approach, couples top-N reverse k-nearest neigh- bor, majority weighted minority oversampling technique, and immune centroids oversampling technique. Moreover, this paper also reveals the evaluation of four rules-generation algorithms (the learning from example module, version 2 (LEM2), covering, exhaustive, and genetic algorithms) using publicly available data sets. The empirical results demonstrate that the overall predictive performance of MTDF and rules- generation based on genetic algorithms performed the best as compared with the rest of the evaluated oversampling methods and rule-generation algorithms. INDEX TERMS SMOTE, ADASYN, mega trend diffusion function, class imbalance, rough set, customer churn, mRMR. ICOTE, MWMOTE, TRkNN. I. INTRODUCTION certain patterns that may be exploited for acquiring customer In many subscription-based service industries such as information on time for intelligent decision-making practice telecommunications companies, are constantly striving to of an industry [2]. recognize customers that are looking to switch providers (i.e., However, knowledge discovery in such rich CRM Customer churn). Reducing churn is extremely important in databases, which typically contains thousands or millions of competitive markets since acquiring new customers in such customers' information, is a challenging and difficult task. markets is very difficult (attracting non-subscribers can cost Therefore, many industries have to inescapably depend on up to six times more than what it costs to retain the current prediction models for customer churn if they want to remain customers by taking active steps to discourage churn behavior in the competitive market [1]. As a consequence, several [1]). Churn-prone industries such as the telecommunication competitive industries have implemented a wide range of sta- industry typically maintain customer relationship manage- tistical and intelligent machine learning (ML) techniques to ment (CRM) databases that are rich in unseen knowledge and develop predictive models that deal with customer churn [2]. 2169-3536 2016 IEEE. Translations and content mining are permitted for academic research only. 7940 Personal use is also permitted, but republication/redistribution requires IEEE permission. VOLUME 4, 2016 See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. A. Amin et al.: Comparing Oversampling Techniques to Handle the CIP: A Customer Churn Prediction Case Study Unfortunately, the performance of ML techniques is SMOTE [6] is commonly used as a benchmark for over- considerably affected by the CIP. The problem of imbal- sampling algorithm [7], [8]. ADASYN is also an important anced dataset appears when the proportion of majority oversampling technique which improves the learning about class has a higher ratio than minority class [3], [4]. the samples distribution in an efficient way [9]. MTDF was The skewed distribution (imbalanced) of data in the dataset first proposed by Li et al. [10] and reported improved per- poses challenges for machine learning and data mining algo- formance of classifying imbalanced medical datasets [11]. rithms [3], [5]. This is an area of research focusing on skewed CUBE is also another advanced oversampling technique [12] class distribution where minority class is targeted for classi- but we have not considered this approach since as noted fication [6]. Consider a dataset where the imbalance ratio is by Japkowicz [13], CUBE oversampling technique does not 1:99 (i.e., where 99% of the instances belong to the majority increase the predictive performance of the classifier. The class, and 1% belongs to the minority class). A classifier above-mentioned oversampling approaches are used to han- may achieve the accuracy up to 99% just by ignoring that dle the imbalanced dataset and improve the performance 1% of minority class instances—however, adopting such of predictive models for customer churn, particularly in an approach will result in the failure to correctly classify the telecommunication sector. Rough Set Theory (RST) is any instances of the class of interest (often the minority applied to the four different rule-generation algorithms— class). (i) Learning from Example Module, version 2 (LEM2), It is well known that churn is a rare object in service- (ii) Genetic (Gen), (iii) Covering (Cov) and (iv) Exhaustive based industries, and that misclassification is more costly for (Exh) algorithms—in this study to observe the behavior of rare objects or events in the case of imbalance datasets [7]. customer churn and all experiments are applied on the pub- Traditional approaches, therefore, can provide mislead- licly available dataset. It is specified here that this paper is ing results on the class-imbalanced dataset—a situation an extended version of our previous work [14], and makes that is highly significant and one that occurs in many the following contributions: (i) more datasets are employed domains [3], [6]. For instance, there are a significant num- to obtained more generalized results for the selected over- ber of real-world applications that are suffering from the sampling techniques and the rules-generation algorithms, class imbalance problem (e.g., medical and fault diagnosis, (ii) another well-known oversampling technique—namely, anomaly detection, face recognition, telecommunication, the ADASYN—is also used (iii) detailed analysis and discus- web & email classification, ecology, biology and financial sion on the performance of targeted oversampling tech- services [3], [4], [6], [85], [86]). niques (namely, ADASYN, MTDF, SMOTE, MWMOTE, Sampling is a commonly used technique for handling ICOTE and TRkNN) followed by the rules-generation algo- the CIP in various domains. Broadly speaking, sampling rithms—namely, Gen, Cov, LEM2 and Exh, and (iv) detailed can be categorized into oversampling and undersampling. performance evaluation—in terms of the balance accuracy, A large number of studies focused on handling the class the imbalance ratio, the area under the curve (AUC) and imbalance problem have been reported in literature [2], [4]. the McNemar's statistical test—is performed to validate the These studies can be grouped into the following approaches results and avoid any biases. Many comparative studies [2], based on their dealing with class imbalance issue: (i) the [7], [11], [15], [16] have already been carried out on the internal-level: construct or update the existing methods to comparison of oversampling and undersampling methods for emphasize the significance of the minority class and (ii) handling the CIP; however, the proposed study differs from the external level: adding data in preprocessing stage where the previous studies in that in addition to evaluating six over- the distribution of class is resampled in order to reduce sampling techniques (SMOTE, ADASYN, MTDF, ICOTE, the influence of imbalanced distribution of class in the MWMOTE and TRkNN), we also compare the performance classification process. The internal level approach is fur- of four rules-generation algorithms (Exh, Gen, Cov and ther divided into two groups [5]. Firstly, the cost-sensitive LEM2). The proposed study is also focused on considering approach, which falls between internal and external level the following research questions (RQ): approaches, is based on reducing incorrect classification • RQ1: What is the list of attributes that is highly symp- costs for minority class leading to reduction of the over- tomatic in the targeted data set for prediction of customer all
Recommended publications
  • Efficient Supersampling Antialiasing for High-Performance Architectures
    Efficient Supersampling Antialiasing for High-Performance Architectures TR91-023 April, 1991 Steven Molnar The University of North Carolina at Chapel Hill Department of Computer Science CB#3175, Sitterson Hall Chapel Hill, NC 27599-3175 This work was supported by DARPA/ISTO Order No. 6090, NSF Grant No. DCI- 8601152 and IBM. UNC is an Equa.l Opportunity/Affirmative Action Institution. EFFICIENT SUPERSAMPLING ANTIALIASING FOR HIGH­ PERFORMANCE ARCHITECTURES Steven Molnar Department of Computer Science University of North Carolina Chapel Hill, NC 27599-3175 Abstract Techniques are presented for increasing the efficiency of supersampling antialiasing in high-performance graphics architectures. The traditional approach is to sample each pixel with multiple, regularly spaced or jittered samples, and to blend the sample values into a final value using a weighted average [FUCH85][DEER88][MAMM89][HAEB90]. This paper describes a new type of antialiasing kernel that is optimized for the constraints of hardware systems and produces higher quality images with fewer sample points than traditional methods. The central idea is to compute a Poisson-disk distribution of sample points for a small region of the screen (typically pixel-sized, or the size of a few pixels). Sample points are then assigned to pixels so that the density of samples points (rather than weights) for each pixel approximates a Gaussian (or other) reconstruction filter as closely as possible. The result is a supersampling kernel that implements importance sampling with Poisson-disk-distributed samples. The method incurs no additional run-time expense over standard weighted-average supersampling methods, supports successive-refinement, and can be implemented on any high-performance system that point samples accurately and has sufficient frame-buffer storage for two color buffers.
    [Show full text]
  • ESE 531: Digital Signal Processing
    ESE 531: Digital Signal Processing Lec 12: February 21st, 2017 Data Converters, Noise Shaping (con’t) Penn ESE 531 Spring 2017 - Khanna Lecture Outline ! Data Converters " Anti-aliasing " ADC " Quantization " Practical DAC ! Noise Shaping Penn ESE 531 Spring 2017 - Khanna 2 ADC Penn ESE 531 Spring 2017 - Khanna 3 Anti-Aliasing Filter with ADC Penn ESE 531 Spring 2017 - Khanna 4 Oversampled ADC Penn ESE 531 Spring 2017 - Khanna 5 Oversampled ADC Penn ESE 531 Spring 2017 - Khanna 6 Oversampled ADC Penn ESE 531 Spring 2017 - Khanna 7 Oversampled ADC Penn ESE 531 Spring 2017 - Khanna 8 Sampling and Quantization Penn ESE 531 Spring 2017 - Khanna 9 Sampling and Quantization Penn ESE 531 Spring 2017 - Khanna 10 Effect of Quantization Error on Signal ! Quantization error is a deterministic function of the signal " Consequently, the effect of quantization strongly depends on the signal itself ! Unless, we consider fairly trivial signals, a deterministic analysis is usually impractical " More common to look at errors from a statistical perspective " "Quantization noise” ! Two aspects " How much noise power (variance) does quantization add to our samples? " How is this noise distributed in frequency? Penn ESE 531 Spring 2017 - Khanna 11 Quantization Error ! Model quantization error as noise ! In that case: Penn ESE 531 Spring 2017 - Khanna 12 Ideal Quantizer ! Quantization step Δ ! Quantization error has sawtooth shape, ! Bounded by –Δ/2, +Δ/2 ! Ideally infinite input range and infinite number of quantization levels Penn ESE 568 Fall 2016 - Khanna adapted from Murmann EE315B, Stanford 13 Ideal B-bit Quantizer ! Practical quantizers have a limited input range and a finite set of output codes ! E.g.
    [Show full text]
  • CSMOUTE: Combined Synthetic Oversampling and Undersampling Technique for Imbalanced Data Classification
    CSMOUTE: Combined Synthetic Oversampling and Undersampling Technique for Imbalanced Data Classification Michał Koziarski Department of Electronics AGH University of Science and Technology Al. Mickiewicza 30, 30-059 Kraków, Poland Email: [email protected] Abstract—In this paper we propose a novel data-level algo- ity observations (oversampling). Secondly, the algorithm-level rithm for handling data imbalance in the classification task, Syn- methods, which adjust the training procedure of the learning thetic Majority Undersampling Technique (SMUTE). SMUTE algorithms to better accommodate for the data imbalance. In leverages the concept of interpolation of nearby instances, previ- ously introduced in the oversampling setting in SMOTE. Further- this paper we focus on the former. Specifically, we propose more, we combine both in the Combined Synthetic Oversampling a novel undersampling algorithm, Synthetic Majority Under- and Undersampling Technique (CSMOUTE), which integrates sampling Technique (SMUTE), which leverages the concept of SMOTE oversampling with SMUTE undersampling. The results interpolation of nearby instances, previously introduced in the of the conducted experimental study demonstrate the usefulness oversampling setting in SMOTE [5]. Secondly, we propose a of both the SMUTE and the CSMOUTE algorithms, especially when combined with more complex classifiers, namely MLP and Combined Synthetic Oversampling and Undersampling Tech- SVM, and when applied on datasets consisting of a large number nique (CSMOUTE), which integrates SMOTE oversampling of outliers. This leads us to a conclusion that the proposed with SMUTE undersampling. The aim of this paper is to approach shows promise for further extensions accommodating serve as a preliminary study of the potential usefulness of local data characteristics, a direction discussed in more detail in the proposed approach, with the final goal of extending it to the paper.
    [Show full text]
  • Signal Sampling
    FYS3240 PC-based instrumentation and microcontrollers Signal sampling Spring 2017 – Lecture #5 Bekkeng, 30.01.2017 Content – Aliasing – Sampling – Analog to Digital Conversion (ADC) – Filtering – Oversampling – Triggering Analog Signal Information Three types of information: • Level • Shape • Frequency Sampling Considerations – An analog signal is continuous – A sampled signal is a series of discrete samples acquired at a specified sampling rate – The faster we sample the more our sampled signal will look like our actual signal Actual Signal – If not sampled fast enough a problem known as aliasing will occur Sampled Signal Aliasing Adequately Sampled SignalSignal Aliased Signal Bandwidth of a filter • The bandwidth B of a filter is defined to be between the -3 dB points Sampling & Nyquist’s Theorem • Nyquist’s sampling theorem: – The sample frequency should be at least twice the highest frequency contained in the signal Δf • Or, more correctly: The sample frequency fs should be at least twice the bandwidth Δf of your signal 0 f • In mathematical terms: fs ≥ 2 *Δf, where Δf = fhigh – flow • However, to accurately represent the shape of the ECG signal signal, or to determine peak maximum and peak locations, a higher sampling rate is required – Typically a sample rate of 10 times the bandwidth of the signal is required. Illustration from wikipedia Sampling Example Aliased Signal 100Hz Sine Wave Sampled at 100Hz Adequately Sampled for Frequency Only (Same # of cycles) 100Hz Sine Wave Sampled at 200Hz Adequately Sampled for Frequency and Shape 100Hz Sine Wave Sampled at 1kHz Hardware Filtering • Filtering – To remove unwanted signals from the signal that you are trying to measure • Analog anti-aliasing low-pass filtering before the A/D converter – To remove all signal frequencies that are higher than the input bandwidth of the device.
    [Show full text]
  • Enhancing ADC Resolution by Oversampling
    AVR121: Enhancing ADC resolution by oversampling 8-bit Features Microcontrollers • Increasing the resolution by oversampling • Averaging and decimation • Noise reduction by averaging samples Application Note 1 Introduction Atmel’s AVR controller offers an Analog to Digital Converter with 10-bit resolution. In most cases 10-bit resolution is sufficient, but in some cases higher accuracy is desired. Special signal processing techniques can be used to improve the resolution of the measurement. By using a method called ‘Oversampling and Decimation’ higher resolution might be achieved, without using an external ADC. This Application Note explains the method, and which conditions need to be fulfilled to make this method work properly. Figure 1-1. Enhancing the resolution. A/D A/D A/D 10-bit 11-bit 12-bit t t t Rev. 8003A-AVR-09/05 2 Theory of operation Before reading the rest of this Application Note, the reader is encouraged to read Application Note AVR120 - ‘Calibration of the ADC’, and the ADC section in the AVR datasheet. The following examples and numbers are calculated for Single Ended Input in a Free Running Mode. ADC Noise Reduction Mode is not used. This method is also valid in the other modes, though the numbers in the following examples will be different. The ADCs reference voltage and the ADCs resolution define the ADC step size. The ADC’s reference voltage, VREF, may be selected to AVCC, an internal 2.56V / 1.1V reference, or a reference voltage at the AREF pin. A lower VREF provides a higher voltage precision but minimizes the dynamic range of the input signal.
    [Show full text]
  • Undersampling and Oversampling in Sample Based Shape Modeling
    Undersampling and Oversampling in Sample Based Shape Modeling Tamal K. Dey Joachim Giesen Samrat Goswami James Hudson Rephael Wenger Wulue Zhao Ohio State University Columbus, OH 43210 Abstract early paper on the problem was by Boissonat [11] who pro- posed a ‘sculpting’ of the Delaunay triangulation for recon- Shape modeling is an integral part of many visualization struction. A more refined sculpting strategy was designed problems. Recent advances in scanning technology and a by Edelsbrunner and Muck¨ e [16] in their -shape algorithm. number of surface reconstruction algorithms have opened up Bajaj, Bernardini and Xu [9] used -shapes for reconstruct- a new paradigm for modeling shapes from samples. Many of ing scalar fields and 3D CAD models. In [15] Edelsbrun- the problems currently faced in this modeling paradigm can ner reported the design of a commercial software WRAP be traced back to two anomalies in sampling, namely under- that eliminated the need for uniform samples in -shapes. sampling and oversampling. Boundaries, non-smoothness Hoppe et al. [24] reconstructed the surface using the zero and small features create undersampling problems, whereas level set of a distance function defined over the samples. oversampling leads to too many triangles. We use Voronoi Curless and Levoy [14] used a distance function to con- cell geometry as a unified guide to detect undersampling and struct an implicit surface from multiple range scans. Turk oversampling. We apply these detections in surface recon- and Levoy [31] devised an incremental algorithm that itera- struction and model simplification. Guarantees of the algo- tively improves a reconstruction by erosion and zippering.
    [Show full text]
  • Basics on Digital Signal Processing
    Analog to Digital Conversion Oversampling or Σ-Δ converters Vassilis Anastassopoulos Electronics Laboratory, Physics Department, University of Patras Outline of the Lecture Analog to Digital Conversion - Oversampling or Σ-Δ converters Sampling Theorem - Quantization of Signals PCM (Successive Approximation, Flash ADs) DPCM - DELTA - Σ-Δ Modulation Principles - Characteristics 1st and 2nd order Noise shaping - performance Decimation and A/D converters Filtering of Σ-Δ sequences 2/39 Analog & digital signals Analog Digital Continuous function V Discrete function Vk of of continuous variable t Sampled discrete sampling (time, space etc) : V(t). Signal variable tk, with k = integer: Vk = V(tk). 0.3 0.3 0.2 0.2 0.1 0.1 0 0 Voltage [V] Voltage Voltage [V] Voltage Voltage [V] Voltage -0.1 -0.1 ts ts -0.2 -0.2 0 2 4 6 8 10 0 2 4 6 8 10 time [ms] sampling time, tk [ms] Uniform (periodic) sampling. Sampling frequency fS = 1/ tS 3/39 AD Conversion - Details 4/39 Sampling 5/39 Rotating Disk How fast do we have to instantly stare at the disk if it rotates with frequency 0.5 Hz? 6/39 1 The sampling theorem A signal s(t) with maximum frequency fMAX can be Theo* recovered if sampled at frequency fS > 2 fMAX . * Multiple proposers: Whittaker(s), Nyquist, Shannon, Kotelnikov. Naming gets confusing ! Nyquist frequency (rate) fN = 2 fMAX or fMAX or fS,MIN or fS,MIN/2 Example s(t) 3cos(50πt) 10sin(300πt) cos(100πt) Condition on fS? F1 F2 F3 f > 300 Hz F1=25 Hz, F2 = 150 Hz, F3 = 50 Hz S fMAX 7/39 Sampling and Spectrum 8/39 1 Sampling low-pass signals Continuous spectrum (a) (a) Band-limited signal: frequencies in [-B, B] (fMAX = B).
    [Show full text]
  • AN118: Improving ADC Resolution by Oversampling and Averaging
    AN118 IMPROVING ADC RESOLUTION BY OVERSAMPLING AND AVERAGING 1. Introduction 2. Key Points Many applications require measurements using an Oversampling and averaging can be used to analog-to-digital converter (ADC). Such increase measurement resolution, eliminating the applications will have resolution requirements need to resort to expensive, off-chip ADCs. based in the signal’s dynamic range, the smallest Oversampling and averaging will improve the SNR and measurement resolution at the cost of increased change in a parameter that must be measured, and CPU utilization and reduced throughput. the signal-to-noise ratio (SNR). For this reason, Oversampling and averaging will improve signal-to- many systems employ a higher resolution off-chip noise ratio for “white” noise. ADC. However, there are techniques that can be 2.1. Sources of Data Converter Noise used to achieve higher resolution measurements and SNR. This application note describes utilizing Noise in ADC conversions can be introduced from oversampling and averaging to increase the many sources. Examples include: thermal noise, resolution and SNR of analog-to-digital shot noise, variations in voltage supply, variation in conversions. Oversampling and averaging can the reference voltage, phase noise due to sampling increase the resolution of a measurement without clock jitter, and noise due to quantization error. The resorting to the cost and complexity of using noise caused by quantization error is commonly expensive off-chip ADCs. referred to as quantization noise. Noise power from This application note discusses how to increase the these sources can vary. Many techniques that may resolution of analog-to-digital (ADC) measure- be utilized to reduce noise, such as thoughtful ments by oversampling and averaging.
    [Show full text]
  • Achieving Higher ADC Resolution Using Oversampling
    AN1152 Achieving Higher ADC Resolution Using Oversampling THEORY OF OPERATION Author: Jayanth Murthy Madapura Microchip Technology Inc. As previously mentioned, ADCs transform analog signals into digital sample values. Analog signal amplitude is quantized into digital code words with a INTRODUCTION finite word length. This process of quantization introduces noise in the signal called “quantization An Analog-to-Digital Converter (ADC) is an active inter- noise”. The smaller the word length, the greater the face between the analog and digital signal chains in an noise introduced. embedded system. An ADC converts analog signals into digital signals in electronic systems. The key fea- Quantization noise can be reduced by adding more bits ture of an ADC is the accuracy (resolution) it offers. The into the ADC hardware design. This noise can also be higher the desired accuracy, the higher the ADC cost. reduced in software by oversampling the ADC and then processing the digital signal. The oversampling ADC Higher ADC accuracy is achieved by designing hard- method and a few associated terms are explained in ware to quantize the analog signal amplitude into the the following sections. digital signal with a higher code-word length. Practical ADCs have finite word lengths. ADC Voltage Resolution To effectively strike a balance between system cost and accuracy, higher conversion accuracy is achieved Voltage resolution of an ADC is defined as the ratio of by oversampling the low-resolution ADC integrated full scale voltage range to the number of digital levels within a digital signal controller (DSC), and then pro- that are accommodated in that range.
    [Show full text]
  • Oversampling with Noise Shaping • Place the Quantizer in a Feedback Loop Xn() Yn() Un() Hz() – Quantizer
    Oversampling Converters David Johns and Ken Martin University of Toronto ([email protected]) ([email protected]) University of Toronto slide 1 of 57 © D.A. Johns, K. Martin, 1997 Motivation • Popular approach for medium-to-low speed A/D and D/A applications requiring high resolution Easier Analog • reduced matching tolerances • relaxed anti-aliasing specs • relaxed smoothing filters More Digital Signal Processing • Needs to perform strict anti-aliasing or smoothing filtering • Also removes shaped quantization noise and decimation (or interpolation) University of Toronto slide 2 of 57 © D.A. Johns, K. Martin, 1997 Quantization Noise en() xn() yn() xn() yn() en()= yn()– xn() Quantizer Model • Above model is exact — approx made when assumptions made about en() • Often assume en() is white, uniformily distributed number between ±∆ ⁄ 2 •∆ is difference between two quantization levels University of Toronto slide 3 of 57 © D.A. Johns, K. Martin, 1997 Quantization Noise T ∆ time • White noise assumption reasonable when: — fine quantization levels — signal crosses through many levels between samples — sampling rate not synchronized to signal frequency • Sample lands somewhere in quantization interval leading to random error of ±∆ ⁄ 2 University of Toronto slide 4 of 57 © D.A. Johns, K. Martin, 1997 Quantization Noise • Quantization noise power shown to be ∆2 ⁄ 12 and is independent of sampling frequency • If white, then spectral density of noise, Se()f , is constant. S f e() ∆ 1 Height k = ---------- ---- x f 12 s f f f s 0 s –---- ---- 2 2 University of Toronto slide 5 of 57 © D.A. Johns, K. Martin, 1997 Oversampling Advantage • Oversampling occurs when signal of interest is bandlimited to f0 but we sample higher than 2f0 • Define oversampling-rate OSR = fs ⁄ ()2f0 (1) • After quantizing input signal, pass it through a brickwall digital filter with passband up to f0 y ()n 1 un() Hf() y2()n N-bit quantizer Hf() 1 f f fs –f 0 f s –---- 0 0 ---- 2 2 University of Toronto slide 6 of 57 © D.A.
    [Show full text]
  • Oversampling Techniques Using the Tms320c24x Family
    Oversampling Techniques using the TMS320C24x Family Literature Number: SPRA461 Texas Instruments Europe June 1998 IMPORTANT NOTICE Texas Instruments and its subsidiaries (TI) reserve the right to make changes to their products or to discontinue any product or service without notice, and advise customers to obtain the latest version of relevant information to verify, before placing orders, that in- formation being relied on is current and complete. All products are sold subject to the terms and conditions of sale supplied at the time of order acknowledgement, including those pertaining to warranty, patent infringement, and limitation of liability. TI warrants performance of its semiconductor products to the specifications applicable at the time of sale in accordance with TI’s standard warranty. Testing and other quality control techniques are utilized to the extent TI deems necessary to support this warranty. Specific testing of all parameters of each device is not necessarily performed, except those mandated by government requirements. CERTAIN APPLICATIONS USING SEMICONDUCTOR PRODUCTS MAY INVOLVE POTENTIAL RISKS OF DEATH, PERSONAL INJURY, OR SEVERE PROPERTY OR ENVIRONMENTAL DAMAGE (“CRITICAL APPLICATIONS”). TI SEMICONDUCTOR PRODUCTS ARE NOT DESIGNED, AUTHORIZED, OR WARRANTED TO BE SUITA- BLE FOR USE IN LIFE-SUPPORT DEVICES OR SYSTEMS OR OTHER CRITICAL AP- PLICATIONS. INCLUSION OF TI PRODUCTS IN SUCH APPLICATIONS IS UNDERS- TOOD TO BE FULLY AT THE CUSTOMER’S RISK. In order to minimize risks associated with the customer’s applications, adequate design and operating safeguards must be provided by the customer to minimize inherent or pro- cedural hazards. TI assumes no liability for applications assistance or customer product design.
    [Show full text]
  • Sampling: What Nyquist Didn't Say, and What to Do About It
    Sampling: What Nyquist Didn’t Say, and What to Do About It Tim Wescott, Wescott Design Services The Nyquist-Shannon sampling theorem is useful, but often misused when engineers establish sampling rates or design anti-aliasing filters. This article explains how sampling affects a signal, and how to use this information to design a sampling system with known performance. June 20, 2016 Sampling: What Nyquist Didn’t Say, and What to Do About It What Nyquist Did Say The assertion made by the Nyquist-Shannon sampling theorem is simple: if you have a signal that is perfectly band limited to a bandwidth of f0 then you can collect all the information there is in that signal by sampling it at discrete times, as long as your sample rate is greater than 2f0 . As theorems go this statement is delightfully short. Unfortunately, while the theorem is simple to state it can be very misleading when one tries to apply it in practice. Often when I am teaching a seminar, working with a client, or reading a Usenet newsgroup, someone will say “Nyquist says”, then follow this with an incorrect use of the Nyquist- Shannon theorem. This paper is about explaining what the Nyquist-Shannon sampling theorem really says, what it means, and how to use it. It is a common misconception that the Nyquist-Shannon sampling theorem could be used to provide a simple, straight forward way to determine the correct minimum sample rate for a system. While the theorem does establish some bounds, it does not give easy an- swers.
    [Show full text]