Towards Robust Audio-Visual Speech Recognition


Research Collection
Doctoral Thesis

Towards Robust Audio-Visual Speech Recognition

Author(s): Naghibi, Tofigh
Publication Date: 2015
Permanent Link: https://doi.org/10.3929/ethz-a-010492674
Rights / License: In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information, please consult the Terms of use.

ETH Library

DISS. ETH NO. 22867

Towards Robust Audio-Visual Speech Recognition

A thesis submitted to attain the degree of
DOCTOR OF SCIENCES of ETH ZURICH
(Dr. sc. ETH Zurich)

presented by
TOFIGH NAGHIBI
MSc, Sharif University of Technology, Iran
born June 8, 1983
citizen of Iran

accepted on the recommendation of
Prof. Dr. Lothar Thiele, examiner
Prof. Dr. Gerhard Rigoll, co-examiner
Dr. Beat Pfister, co-examiner

2015

Acknowledgements

I would like to thank Prof. Dr. Lothar Thiele and Dr. Beat Pfister for supervising my research. Beat has always allowed me complete freedom to define and explore my own research directions. While this may be difficult and sometimes even frustrating at first, I have come to appreciate the wisdom of his way, since it not only encouraged me to think for myself but also gave me enough self-confidence to tackle open problems that I otherwise would not have dared to touch.

The former and current members of the ETH speech group helped me in various ways. In particular, I would like to thank Sarah Hoffmann, who was my computer science and cultural mentor. Her vast knowledge of programming and algorithm design, and her willingness to share it with me, made my transformation from electrical engineer to computer scientist much easier. Moreover, following her cultural advice, particularly on food and beverages, created the best memories of my PhD life. I want to thank Hui Liang for his valuable feedback and for reading parts of this thesis, usually on very short notice. I am also grateful to him for always being available to listen to my ideas and for studying newly emerging machine learning techniques together with me, which I would not have read without him. I am also indebted to Lukas Knauer for the German translation of my dissertation abstract and to Naoya Takahashi for lending his PowerPoint expertise.

On the personal side, I wish to thank my parents for all they have done for me. It takes many years to realize how much of what you are depends on your parents. I wish to thank my sister Nasim and my close friend Damoun for the fun we have had growing up together.

Finally, special thanks must go to my amazing wife Haleh for her patience. She waited for me many nights while I was playing with my machine learning toys at the office or even at home. Her invisible management of everything that might distract me, and her favorite quote "everything is under the control", were literally my anchor throughout my PhD life.

Contents

List of Abbreviations 9
Abstract 11
Kurzfassung 15

1 Introduction 19
1.1 Problem Statement 19
1.2 Scientific Contributions 21
1.3 Structure of this Thesis 22

2 Visual Features and AV Datasets 25
2.1 Visual Feature Descriptions 26
2.1.1 Scale-invariant Feature Transform (SIFT) 27
2.1.2 Invariant Scattering Convolution Networks 28
2.1.3 3D-SIFT 29
2.1.4 Bag-of-Words 29
2.2 Datasets 30
2.2.1 CUAVE 31
2.2.2 GRID 32
2.2.3 Oulu 34
2.2.4 AVletter 35
2.2.5 ETH Digits 36

3 Feature Selection 39
3.1 Introduction to the Feature Selection Problem 40
3.1.1 Various Feature Selection Methods 40
3.1.2 Feature Selection Search Strategies 42
3.2 Mutual Information Pros and Cons 43
3.2.1 First Expansion: Multi-way Mutual Information 46
3.2.2 Second Expansion: Chain Rule of Information 49
3.2.3 Truncation of the Expansions 50
3.2.4 The Superiority of the D2 Approximation 54
3.3 Search Strategies 55
3.3.1 Convex Based Search 56
3.3.2 Approximation Analysis 61
3.4 Experiments for COBRA Evaluation 62
3.5 COBRA-selected ISCN Features for Lipreading 73
3.6 Conclusion and Discussion 80

4 Binary and Multiclass Classification 81
4.1 Introduction to the Boosting Problem 82
4.2 Our Results 83
4.3 Fundamentals 85
4.4 Boosting Framework 87
4.4.1 Sparse Boosting 91
4.4.2 Smooth Boosting 93
4.4.3 MABoost for Combining Datasets (CD-MABoost) 93
4.4.4 Lazy Update Boosting 94
4.5 Multiclass Generalization 96
4.5.1 Preliminaries for the Multiclass Setting 97
4.5.2 Multiclass MABoost (Mu-MABoost) 98
4.6 Classification Experiments with Boosting 100
4.6.1 Binary MABoost Experiments 101
4.6.2 Experiment with SparseBoost 102
4.6.3 Experiment with CD-MABoost 104
4.6.4 Multiclass Classification Experiments 106
4.7 Conclusion and Discussion 108

5 Visual Voice Activity Detection 111
5.1 Introduction to Utterance Detection 112
5.2 Supervised Learning: VAD by Using SparseBoost 113
5.3 Semi-supervised Learning: Audio-visual VAD 119
5.3.1 Ro-MABoost 121
5.3.2 Experiments on Supervised & Semi-supervised VAD 123
5.4 Conclusion 126

6 Lipreading with High-Dimensional Feature Vectors 127
6.1 Introduction to Visual Speech Recognition 128
6.2 Color Spaces 132
6.3 Visual Features 135
6.4 Multiclass Classifier 135
6.5 Lipreading Experiments with ISMA Features 136
6.5.1 Oulu: Phrase Recognition 137
6.5.2 GRID: Digit Recognition 142
6.5.3 AVletter: Letter Recognition 147
6.6 Conclusion and Discussion 150

7 Audio Visual Information Fusion 153
7.1 The Problem of Audio Visual Fusion 154
7.2 Preliminaries & Model Description 157
7.3 AUC: The Area Under an ROC Curve 160
7.3.1 AUC and Mismatch 163
7.3.2 AUC Generalization to the Multiclass Problem 164
7.3.3 Transforming HMM Outputs 165
7.3.4 Online AUC Maximization 166
7.4 Multi-vector Model 170
7.5 Experiments on the AUC-based Fusion Strategy 172
7.6 Conclusion 183

8 Multichannel Audio-video Speech Recognition System 185
8.1 Introduction to the Beamforming Problem 185
8.2 Signal Cancellation 187
8.3 Transfer Function Estimation 189
8.4 Simulation Results 192
8.5 Experiments with the ETH Digits Dataset 194
8.6 Conclusion 198

9 Conclusion 201
9.1 Achievements 201
9.1.1 Perspectives 202
9.1.2 Open Problems 204

A Complementary Fundamentals 207
A.1 Definitions and Preliminaries 207
A.2 Proof of Theorem 4.7 208
A.3 Proof of Lemma 4.8 209
A.4 Proof of the Boosting Algorithm for Combined Datasets 209
A.5 Proof of Theorem 4.9 210
A.6 Proof of Entropy Projection onto Hypercube 213
A.7 Proof of Theorem 4.11 214
A.8 Multiclass Weak-learning Condition 216
List of Abbreviations

AAM  active appearance model
A-ASR  audio-based automatic speech recognition
ADABoost  adaptive boosting
ANN  artificial neural network
ASM  active shape model
ASR  automatic speech recognition
AUC  area under the curve
A-VAD  audio-based voice activity detection
AV-ASR  audio-visual automatic speech recognition
BE  backward elimination
BoW  bag of words
CART  classification and regression tree
CD-MABoost  combining datasets with MABoost
COBRA  convex based relaxation approximation
DCT  discrete cosine transform
fps  number of frames per second (frame rate)
FS  forward selection
GMM  Gaussian mixture model
HMM  hidden Markov model
ISCN  invariant scattering convolution network
ISMA  combined feature vector consisting of ISCN features and MABoost posterior probabilities
JMI  joint mutual information
KL  Kullback-Leibler
LBP  local binary pattern
LDA  linear discriminant analysis
MABoost  mirror ascent boosting
MADABoost  modified adaptive boosting
MAPP  Mu-MABoost posterior probabilities
MFCC  mel-frequency cepstral coefficient
MIFS  mutual information feature selection
ML  machine learning
MRI  magnetic resonance imaging
mRMR  minimal redundancy maximal relevance
Mu-MABoost  multiclass MABoost
MVDR  minimum variance distortionless response
PAC  probably approximately correct
PCA  principal component analysis
RF  random forest
ROC  receiver operating characteristic
SAMME  Stagewise Additive Modeling using a Multi-class Exponential loss function
SDP  semidefinite programming
SIFT  scale-invariant feature transform
SSP  subset selection problem
SVM  support vector machine
TF  transfer function
VAD  voice activity detection
V-ASR  visual-based automatic speech recognition
V-VAD  visual-based voice activity detection

Abstract

Human speech production is by nature a multimodal process. While the acoustic speech signal usually constitutes the primary cue in human speech perception, visual signals can also contribute substantially, particularly in noisy environments. Research in automatic audio-visual speech recognition is motivated by the expectation that automatic speech recognition systems can likewise exploit the multimodal nature of speech. This thesis explores the main challenges faced in the realization and development of robust audio-visual automatic speech recognition (AV-ASR), namely developing (I) a high-accuracy speaker-independent lipreading system and (II) an optimal audio-visual fusion strategy.

Extracting a set of informative visual features is the first step towards building an accurate visual speech recognizer. In this thesis, various visual feature extraction methods are employed. Some of these methods, however, encode visual information into very high-dimensional feature vectors. In order to reduce the computational complexity and the risk of overfitting, a new feature selection algorithm is introduced that selects a subset of
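As a rough illustration of the kind of feature-subset selection the abstract refers to, the following is a minimal sketch of a greedy mutual-information criterion in the spirit of the MIFS/mRMR baselines listed among the abbreviations. It is not the COBRA algorithm developed in Chapter 3 of the thesis; the function name, parameters, and toy data are illustrative assumptions only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression


def greedy_mi_selection(X, y, k):
    """Greedily pick k features, scoring each candidate by its relevance
    I(feature; class) minus its mean redundancy with the features already
    chosen (an mRMR-style criterion)."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < k and remaining:
        def score(j):
            if not selected:
                return relevance[j]
            # Mean estimated MI between candidate j and each selected feature.
            redundancy = np.mean([
                mutual_info_regression(X[:, [s]], X[:, j], random_state=0)[0]
                for s in selected
            ])
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy stand-in for a high-dimensional visual feature matrix.
    X, y = make_classification(n_samples=300, n_features=60, n_informative=5,
                               random_state=0)
    print(greedy_mi_selection(X, y, k=5))
```

Greedy forward selection of this kind trades optimality for speed; the thesis instead pursues a convex relaxation of the subset selection problem (COBRA, per the table of contents), rather than such a greedy search.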