2.2 Background Subtraction 14 2.3 Combined Effect 16 2.4 Additional Considerations 17

Total Page:16

File Type:pdf, Size:1020Kb

2.2 Background Subtraction 14 2.3 Combined Effect 16 2.4 Additional Considerations 17 DEPARTMENT OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE MASTER´S THESIS CONTROL AND PROGRAMMING OF AN INDUSTRIAL ROBOT USING MICROSOFT KINECT BY: PABLO PÉREZ TEJEDA MAT. NO.: PROFESSOR: Prof. Dr. Ing. Konrad Wöllhaf SUBMISSION DATE: 31. August 2013 Acknowledgement I would like to include in this short acknowledgement a gratitude to my Professor, Dr. Wöllhaf, for giving me hints in the analysis of the problem and providing my facilities for finishing my thesis; also mention Prof. Matthias Stark, for his help with the hardware, and finally, to my family and friends, for giving me support and understanding in the long days spent in front of three computer screens. ii Abstract Industrial robot programming relies on a suitable interface between a human operator and the robot hardware. This interface has evolved through the years to facilitate the task of programming, making possible, for example, to position a robot at real time with a handheld unit, or designing ‘offline’ the layout and operation of a complex industrial process in the GUI of a computer application. The different approaches to robot programming aim to increasing rates of efficiency without impairing already assumed capacities of the programming environment, like stability control, precision or safety. Some of these approaches have found their way in computer vision, and the last generation of image sensors boosts today many applications, inside and outside the automation industry, featuring extended capabilities in a new low-cost market. The Kinect™ sensor from Microsoft® emerged in the market of video game consoles in 2010, based around a webcam-style add-on peripheral for the Xbox 360™ console, enabling gamers to control and interact with it through a natural user interface using gestures and spoken commands. Nevertheless, the project was aimed at broadening the Xbox 360’s audience beyond its typical gamer base and in 2011 Microsoft released the Kinect SDK for Windows® 7, allowing developers to write Kinect applications in C++/CLI, C# and Visual Basic .NET™. The aim of this thesis is the design, implementation, testing and documentation of the software capable of identifying, tracking, locating and representing three objects in real time, based on their color characteristic, through the use of the Kinect sensor. Such a system is also the beginning of an interface for robot programming. The sensor is to be programmed in C++ language using the Kinect for Windows SDK and the Desktop App UI for Windows. The OpenCV library is the tool for the image processing algorithms. For this thesis, the IDE selected for programming is Visual Studio 2012, running in a 32-bit OS1. 1 Other details of the hardware are the microprocessor, AMD Phenom™ II X4 955 at 3.20 GHz, and the graphic accelerator, NVIDIA GeForce GTS 450. The OS version is Windows 8 Premium iii Contents 1 Introduction 6 1.1 Motivation. The Use of the Natural User Interface 7 1.2 Hardware and Software Tools 8 1.3 Approach to the Problem 9 2 Segmentation Algorithm 10 2.1 Color Segmentation 11 2.2 Background Subtraction 14 2.3 Combined Effect 16 2.4 Additional Considerations 17 3 Proposed Extensions on Image Processing 20 3.1 Labeling 20 3.2 Shape recognition 21 3.3 Video Tracking 21 4 Design and Interface for the Solution 22 4.1 System Schematic 22 4.2 Application’s Flow Diagram 22 4.3 Application’s Performance 24 5 Application’s User Guide 26 5.1 Algorithms Bitmap 27 5.2 Object Tracking Bitmap 31 Appendix A I Bibliography II iv Introduction The use of the term “interface” is recurrently used throughout the whole document, and therefore, it is appropriate to include a short explanation of this term. Interface can adopt different definitions depending on the context to which it is applied: In computer science… “An interface is the point of interaction between a computer and any other entity, such as a printer or human operator.” More specifically, in software development… “A software interface is the layout of an application’s graphic or textual controls in conjunction with the way the application responds to user activity.” In object-oriented programming… “An interface is an abstract type that contains no data, but exposes behaviors defined as methods.” A general definition… “An interface is a surface forming a common boundary between bodies or regions.” Attending to the first definition, an interface is required to connect the computer to the data captured by the Kinect. This connection is assessed by the Kinect Natural User Interface and the Kinect drivers and hardware. The Kinect Natural User Interface is a part of the Kinect for Windows API focused on the acquisition of data streams from the Kinect. Attending to the second definition, another interface is needed to permit the interaction between a computer application and its user. This connection must be designed using the Windows Desktop API. The Windows Desktop API is a set of software implementations which operates directly on the computer hardware, permitting us, for example, to display images on a screen device or process mouse and keyboard events from the user. 6 To address the objectives of this thesis, we need to develop the second interface of the two preceding and a processing routine. With these elements, the system must be able to retrieve an input, three objects to be recognized, screen this input to obtain useful data, and show the appropriate results at an output2. Interfaces and routine are embedded into a computer application, and run into a single computer process. The processing routine uses C++ implementations from both of the previous APIs, from the standard C++ libraries and from the OpenCV libraries. The whole application is compiled in C++ language and executed into a Win32 app. The following schematic helps making clear the structure of the system (the elements to be developed in this thesis are marked in red). Application Interface Natural Routine Kinect Kinect User Operator API Interface 1.1 Motivation. The Use of the Natural User Interface A Natural User Interface (NUI), as opposed to a Graphical User Interface (GUI) or a Command-Line Interface (CLI) is based on actions that come naturally to humans, like basic body motion or gesturing. It effectively becomes invisible for its users (as the details of its implementation remain hidden), improves performance when working with an application and permits successive learned interactions that make the process of learning more intuitive and easily assimilated at each step. NUIs deploy interaction design concepts, such as “shape digital things for people’s use” and rely on a user being able to quickly transition from novice to expert3. A Kinetic User Interface (KUI) allows users to interact with computing devices through the motion of objects or the human body. The Kinect is an example of a KUI. The development of this interface technology relies on the ability of interaction devices to detect an increasingly wide range of human actions and gestures. 2 The output can be any suitable representation of the position of the objects. For this thesis, two representations are provided; one, showing the position of the objects as perceived by the camera; the other, simulating a robotic arm which takes the positional data of the objects as kinematic parameters. The second representation is intended to demonstrate the behaviour of an hypothetical robot connected to the system 3 One example of a strategy for designing a NUI is the strict limiting of functionality and customization so that users have very little to learn in the operation of a device. Provided that the default capabilities match the user’s goals, the interface is effortless to use. Other approaches use “reality-based interfaces” (RBI) methods, whose main efforts point to a merge between cyberspace and real world 7 The Kinect NUI gives the programmer the possibility to deploy solutions that permit users the natural interaction with computers by simply gesturing and speaking. The control of the motion of a robot through one of these interfaces is known as vision-based robot control, or simply visual servoing (abbreviated VS). VS is a technique which uses feedback information extracted from a vision sensor, like the Kinect, to control the motion of a robot. The first works in these fields were published in 1996, and today a complete methodology that treats this technology exists. 1.2 Hardware and Software Tools Kinect sensor and the Kinect for Windows API The Kinect has hardware components (including the Kinect sensor and the USB hub through which it is connected to the computer), drivers to connect these hardware components (installed as part of the SDK for Windows OS) and audio and video components and APIs. It features an infrared emitter and depth sensor, a RGB camera, a microphone array and an accelerometer, as well as a software pipeline that processes color, depth and skeleton data up to 30 FPS. Some of these data can be retrieved from the sensor in the form of media streams; for example, in one of those streams we find data about the color camera, other provides depth data from the depth sensor, audio data from the microphone, skeleton data, etc. For the purposes of this thesis we will only need the two first streams: the color stream and the depth stream. The Kinect for windows API delivers these streams with minimal software latency across various video, CPU, and device variables. Besides the Kinect for Windows API, the Kinect also makes use of implementations of media APIs available to desktop applications in Windows 7 and Windows 8. Image Processing and the OpenCV libraries The application developed for this thesis uses concepts, algorithms and implementations from image processing and computer vision.
Recommended publications
  • Tracking and Automation of Images by Colour Based
    Vol 11, Issue 8,August/ 2020 ISSN NO: 0377-9254 TRACKING AND AUTOMATION OF IMAGES BY COLOUR BASED PROCESSING N Alekhya 1, K Venkanna Naidu 2 and M.SunilKumar 3 1PG student, D.N.R College of Engineering, ECE, JNTUK, INDIA 2 Associate Professor D.N.R College of Engineering, ECE, JNTUK, INDIA 3Assistant Professor Sir CRR College of Engineering , EEE, JNTUK, INDIA [email protected], [email protected] ,[email protected] Abstract— Now a day all application sectors are mostly image analysis involves maneuver the moving for the automation processing and image data to conclude exactly the information sensing . for example image processing in compulsory to help to answer a computer imaging medical field ,in industrial process lines , object problem. detection and Ranging application, satellite Digital image processing methods stems from two imaging Processing ,Military imaging etc, In principal application areas: improvement of each and every application area the raw images pictorial information for human interpretation, and are to be captured and to be processed for processing of image data for tasks such as storage, human visual inspection or digital image transmission, and extraction of pictorial processing systems. Automation applications In information this proposed system the video is converted into The remaining paper is structured as follows. frames and then it is get divided into sub bands Section 2 deals with the existing method of Image and then background is get subtracted, then the Processing. Section 3 deals with the proposed object is get identified and then it is tracked in method of Image Processing. Section 4 deals the the framed from the video .This work presents a results and discussions.
    [Show full text]
  • Color Spaces YCH and Ysch for Color Specification and Image Processing in Multi-Core Computing and Mobile Systems
    Programación Matemática y Software (2012) Vol. 4. No 2. ISSN: 2007-3283 Recibido: 14 de septiembre del 2011 Aceptado: 3 de enero del 2012 Publicado en línea: 8 de enero del 2013 Color spaces YCH and YScH for color specification and image processing in multi-core computing and mobile systems Yuriy Kotsarenko, Fernando Ramos Tecnológico de Monterrey, Campus Cuernavaca [email protected], [email protected] Resumen. En este trabajo dos nuevos espacios de color se describen para especificación de colores y procesamiento de imágenes utilizando la forma cilíndrica del espacio de color YIQ. Los espacios de colores clásicos tales como HSL y HSV no toman en cuenta la visión humana y son perceptualmente inexactos. Los espacios de colores perceptualmente uniformes como CIELAB y CIELUV son muy costosos computacionalmente para aplicaciones interactivas de tiempo real y son difíciles de implementar. Las alternativas propuestas, por otro lado, tienen un balance entre uniformidad perceptual, desempeño y simplicidad de cálculo. Estos espacios modelan colores de forma más exacta y son rápidos de calcular. Los resultados experimentales en este trabajo comparan espacios de colores clásicos con los propuestos en términos de uniformidad, riqueza de colores y desempeño, incluyendo numerosas pruebas de rapidez en procesadores de varios núcleos y sistemas móviles tales como ultra portátiles y los tablets tipo iPad. Los resultados evidencian que los espacios de colores propuestos son mejores alternativas para la industria de computación donde actualmente se utilicen los espacios de colores clásicos. Abstract. Two novel color spaces are described for color specification and image processing using cylindrical variants of YIQ color space.
    [Show full text]
  • Fiery Color Reference C9800 OKI Americas Inc
    2Copyright Copyright ES3640e MFP Color Reference Guide P/N 59377001, Revision 1.0 June, 2005 Every effort has been made to ensure that the information in this document is complete, accurate, and up-to-date. Oki assumes no responsibility for the results of errors beyond its control. Oki also cannot guarantee that changes in software and equipment made by other manufacturers and referred to in this guide will not affect the applicability of the information in it. Mention of software products manufactured by other companies does not necessarily constitute endorsement by Oki. While all reasonable efforts have been made to make this document as accurate and helpful as possible, we make no warranty of any kind, expressed or implied, as to the accuracy or completeness of the information contained herein. The most up-to-date drivers and manuals are available from the Oki web site: http://my.okidata.com Copyright © 2005 Oki Data Americas, Inc. and Electronics for Imaging, Inc. All rights reserved. This publication is protected by copyright, and all rights are reserved. No part of it may be reproduced or transmitted in any form or by any means for any purpose without express prior written consent from Electronics for Imaging, Inc. Information in this document is subject to change without notice and does not represent a commitment on the part of Electronics for Imaging, Inc. This publication is provided in conjunction with an EFI product (the “Product”) which contains EFI software (the “Software”). The Software is furnished under license and may only be used or copied in accordance with the terms of the Software license set forth below.
    [Show full text]
  • Color-Proposal.Pdf
    Colors in Snap! -bh proposal, draft, do not distribute Your computer monitor can display millions of colors, but you probably can’t distinguish that many. For example, here’s red 57, green 180, blue 200: And here’s red 57, green 182, blue 200: You might be able to tell them apart if you see them side by side: … but maybe not even then. Color space—the collection of all possible colors—is three-dimensional, but there are many ways to choose the dimensions. RGB (red-green-blue), the one most commonly used, matches the way TVs and displays produce color. Behind every dot on the screen are three tiny lights: a red one, a green one, and a blue one. But if you want to print colors on paper, your printer probably uses a different set of three colors: CMY (cyan-magenta-yellow). You may have seen the abbreviation CMYK, which represents the common technique of adding black ink to the collection. (Mixing cyan, magenta, and yellow in equal amounts is supposed to result in black ink, but typically it comes out a not-very-intense gray instead.) Other systems that try to mimic human perception are HSL (hue-saturation-lightness) and HSV (hue-saturation-value). If you are a color professional—a printer, a web designer, a graphic designer, an artist—then you need to understand all this. It can also be interesting to learn about. For example, there are colors that you can see but your computer display can’t generate. If that intrigues you, look up color theory in Wikipedia.
    [Show full text]
  • Color Control of LED Luminaires by Robert Bell
    Color control of LED luminaires BY ROBERT BELL Why it is not as easy as you might think. Another description is by hue, saturation, not create every color your eye can see. Below A bit about additive and luminance, HSL. (Some say “intensity” is a hypothetical locus of an RGB system color mixing or “lightness” instead of “luminance.”) rendered on the entire visible light spectrum. WITH RECENT MASS ACCEPTANCE Equally valid is hue, saturation, and value, of solid-state LED lighting, it’s time HSV. Value is sometimes referred to as for an explanation of this technology’s brightness and is similar to luminance. complexities and ways in which it can be However, saturation in HSL and HSV differ tamed. LED luminaires use the output of dramatically. For simplicity, I define hue multiple sources to achieve different colors as color and saturation as the amount of and intensities. Additive color mixing is color. I also try to remember if “L” is set to nothing new to our industry. We’ve done 100%, that is white, 0% is black, and 50% it for years on cycloramas with gelled is pure color when saturation is 100%. As luminaires hitting the same surface, but for “V”, 0% is black and 100% is pure color, control can be tricky. The first intelligent and the saturation value has to make up the luminaire I used was a spotlight that had difference. That over simplifies it, but let’s three MR16 lamps, fitted with red, green, carry on, as we’re not done yet. and blue filters.
    [Show full text]
  • Color Image Segmentation Using Perceptual Spaces Through Applets for Determining and Preventing Diseases in Chili Peppers
    African Journal of Biotechnology Vol. 12(7), pp. 679-688, 13 February, 2013 Available online at http://www.academicjournals.org/AJB DOI: 10.5897/AJB12.1198 ISSN 1684–5315 ©2013 Academic Journals Full Length Research Paper Color image segmentation using perceptual spaces through applets for determining and preventing diseases in chili peppers J. L. González-Pérez1 , M. C. Espino-Gudiño2, J. Gudiño-Bazaldúa3, J. L. Rojas-Rentería4, V. Rodríguez-Hernández5 and V.M. Castaño6 1Computer and Biotechnology Applications, Faculty of Engineering, Autonomous University of Queretaro, Cerro de las Campanas s/n, C.P. 76010, Querétaro, Qro., México. 2Faculty of Psychology, Autonomous University of Queretaro, Cerro de las Campanas s/n, C.P. 76010, Querétaro, Qro., México. 3Faculty of Languages Letters, Autonomous University of Queretaro, Cerro de las Campanas s/n, C.P. 76010, Querétaro, Qro., México. 4New Information and Communication Technologies, Faculty of Engineering, Department of Intelligent Buildings, Autonomous University of Queretaro, Cerro de las Campanas s/n, C.P. 76010, Querétaro, Qro., México. 5New Information and Communication Technologies, Faculty of Computer Science, Autonomous University of Queretaro, Cerro de las Campanas s/n, C.P. 76010, Querétaro, Qro., México. 6Center for Applied Physics and Advanced Technology, National Autonomous University of Mexico. Accepted 5 December, 2012 Plant pathogens cause disease in plants. Chili peppers are one of the most important crops in the world. There are currently disease detection techniques classified as: biochemical, microscopy, immunology, nucleic acid hybridization, identification by visual inspection in vitro or in situ but these have the following disadvantages: they require several days, their implementation is costly and highly trained.
    [Show full text]
  • Transforming Thermal Images to Visible Spectrum Images Using Deep Learning
    Master of Science Thesis in Electrical Engineering Department of Electrical Engineering, Linköping University, 2018 Transforming Thermal Images to Visible Spectrum Images using Deep Learning Adam Nyberg Master of Science Thesis in Electrical Engineering Transforming Thermal Images to Visible Spectrum Images using Deep Learning Adam Nyberg LiTH-ISY-EX–18/5167–SE Supervisor: Abdelrahman Eldesokey isy, Linköpings universitet David Gustafsson FOI Examiner: Per-Erik Forssén isy, Linköpings universitet Computer Vision Laboratory Department of Electrical Engineering Linköping University SE-581 83 Linköping, Sweden Copyright © 2018 Adam Nyberg Abstract Thermal spectrum cameras are gaining interest in many applications due to their long wavelength which allows them to operate under low light and harsh weather conditions. One disadvantage of thermal cameras is their limited visual inter- pretability for humans, which limits the scope of their applications. In this the- sis, we try to address this problem by investigating the possibility of transforming thermal infrared (TIR) images to perceptually realistic visible spectrum (VIS) im- ages by using Convolutional Neural Networks (CNNs). Existing state-of-the-art colorization CNNs fail to provide the desired output as they were trained to map grayscale VIS images to color VIS images. Instead, we utilize an auto-encoder ar- chitecture to perform cross-spectral transformation between TIR and VIS images. This architecture was shown to quantitatively perform very well on the problem while producing perceptually realistic images. We show that the quantitative differences are insignificant when training this architecture using different color spaces, while there exist clear qualitative differences depending on the choice of color space. Finally, we found that a CNN trained from day time examples generalizes well on tests from night time.
    [Show full text]
  • Fiery Color Reference
    Fiery® Color Server SERVER & CONTROLLER SOLUTIONS Fiery Color Reference © 2004 Electronics for Imaging, Inc. The information in this publication is covered under Legal Notices for this product. 45046197 24 September 2004 CONTENTS 3 CONTENTS INTRODUCTION 7 About this manual 7 For additional information 8 OVERVIEW OF COLOR MANAGEMENT CONCEPTS 9 Understanding color management systems 9 How color management works 10 Using ColorWise and application color management 11 Using ColorWise color management tools 12 USING COLOR MANAGEMENT WORKFLOWS 13 Understanding workflows 13 Standard recommended workflow 15 Choosing colors 16 Understanding color models 17 Optimizing for output type 18 Maintaining color accuracy 19 MANAGING COLOR IN OFFICE APPLICATIONS 20 Using office applications 20 Using color matching tools with office applications 21 Working with office applications 22 Defining colors 22 Working with imported files 22 Selecting options when printing 23 Output profiles 23 Ensuring color accuracy when you save a file 23 CONTENTS 4 MANAGING COLOR IN POSTSCRIPT APPLICATIONS 24 Working with PostScript applications 24 Using color matching tools with PostScript applications 25 Using swatch color matching tools 25 Using the CMYK Color Reference 25 Using the PANTONE reference 26 Defining colors 27 Working with imported images 29 Using CMYK simulations 30 Using application-defined halftone screens 31 Ensuring color accuracy when you save a file 32 MANAGING COLOR IN ADOBE PHOTOSHOP 33 Loading monitor settings files and ICC device profiles in Photoshop 6.x/7.x 33 Specifying
    [Show full text]
  • Analysis of Binarization Techniques and Tsetlin Machine Architectures Targeting Image Classification
    Analysis of binarization techniques and Tsetlin machine architectures targeting image classification ERIK M(T)ISE* )(LVOR S. S'ØRVIK SUPERVISORS Ole-Christo!er "ranmo# Jivitesh Sharma University of Agder, 2020 Faculty of Engineering and Science Department of Information and Communication Technology Abstract The Tsetlin Machine is a constantly evolving and developing machine learning technique with ever-increasing success. However, for every success, the Tsetlin Machine achieves, a new set of challenges are put ahead. To sufficiently bring the Tsetlin Machine to a broadly used standard, these challenges must be completed. This thesis focuses on the challenge of doing color image classification and will provide an introductory description of how this is possible through the usage of an older technique, namely binarization. A comparison with the various Tsetlin Machine adaptations made public in recent times is also present after the achieved color image classification. The results of both the initial color image classification experiment and the comparison between the varying adaptations show that the Tsetlin Machine, with a little extra work, can achieve high accuracy color image classification without image augmentation or pre-training. Preface This thesis summarizes the culmination education received at the University of Agder, Grimstad. The thesis encompasses work done over the last semester. The task was pro- posed by Prof. Ole-Christoffer Granmo in early January 2020. Both the authors of this thesis have a great interest in the Tsetlin Machine and its potential capabilities. Therefore when Granmo proposed working on a proof of concept towards color image classification along with a likely comparison of the various architectures, which had yet to be done, we were highly interested.
    [Show full text]
  • A Colour Segmentation Method for Detection of Based NZ Speed Signs
    A Colour Segmentation Method for Detection of New Zealand Speed Signs Abhishek Bedi A thesis submitted to Auckland University of Technology in fulfilment of the requirements for the degree of Master of Engineering August 2011 School of Engineering Primary Supervisor: Dr John Collins Acknowledgements This dissertation is a part of the Masters of Engineering at Auckland University of Technology New Zealand. This piece of work is a result of hard work, patience, sacrifice and unconditional support of many people. I wish to thank everyone who has supported and helped me in completing this research. Firstly, I wish express my gratitude to Geosmart NZ Limited for inspiring me to carry out this work. I am thankful to my supervisor Dr. John Collins for keeping patience, being a motivating force and taking care of me during the process. I would like to thank them for the guidance they have given me throughout the period. I express my honest thanks to the AUT library for their overall support for the entire period of the study at AUT. At the end, I am grateful to my parents and family members, especially my wife for her extreme sacrifices and providing moral support. i Statement of Originality ‘I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person nor material which to a substantial extent has been accepted for qualification of any other degree or diploma of a university or other institution of higher learning, except where due acknowledgement is made in the acknowledgements.’ Abhishek Bedi ii Abstract New Zealand Speed signs provide safe travelling speed limit or guidance information to drivers on roads.
    [Show full text]
  • Real-Time Uncertainty Visualization for B-Mode Ultrasound
    Personal reprint with permission; the definite version of this article is available at http://ieeexplore.ieee.org/ Copyright c 2015, IEEE, DOI: 10.1109/SciVis.2015.7429489 Real-time Uncertainty Visualization for B-Mode Ultrasound Christian Schulte zu Berge, Denis Declara, Christoph Hennersperger, Maximilian Baust, and Nassir Navab Figure 1: Uncertainty visualization for B-mode abdominal ultrasound. Given the original B-mode ultrasound image (a), we first com- pute the corresponding Confidence Map (b), which maps regions of low attenuation to high confidence. Using different schemes, we then compute the proposed fused uncertainty visualizations. For educational purposes, we introduce an uncertainty color overlay (c). For clinical applications, we further propose mapping to chroma (d) and to fuzziness (e) as perceptional visualization schemes maintaining the original diagnostic value. ABSTRACT adjust to the transducer positioning in real-time, provides also ex- pert clinicians with a strong visual feedback on their actions. This B-mode ultrasound is a very well established imaging modality and helps them to optimize the acoustic window and can improve the is widely used in many of today’s clinical routines. However, ac- general clinical value of ultrasound. quiring good images and interpreting them correctly is a challeng- ing task due to the complex ultrasound image formation process de- Keywords: Ultrasound, Uncertainty Visualization, Confidence pending on a large number of parameters. To facilitate ultrasound Maps, Real-time. acquisitions, we introduce a novel framework for real-time uncer- tainty visualization in B-mode images. We compute real-time per- Index Terms: Computer Graphics [I.3.m]: Miscellaneous Com- pixel ultrasound Confidence Maps, which we fuse with the original puter Applications [J.3]: Life and Medical Sciences—Health.
    [Show full text]
  • Constructing Cylindrical Coordinate Colour Spaces
    Constructing Cylindrical Coordinate Colour Spaces Allan Hanbury Pattern Recognition and Image Processing Group (PRIP), Institute of Computer-Aided Automation, Vienna University of Technology, Favoritenstraße 9/1832, A-1040 Vienna, Austria Abstract A cylindrical coordinate colour space (lightness, saturation/chroma, hue) is derived from an opponent colour space in the RGB space. It is shown how cylindrical co- ordinate colour models widely used in the literature are related to or can be re- duced to the derived model, thereby contributing to creating a unified cylindri- cal coordinate colour model. In particular, the widely used saturation expression max(R, G, B) − min(R, G, B) is derived from the proposed model. Properties of the derived chroma and saturation expressions are examined. Finally, some applications of cylindrical coordinate colour spaces are briefly reviewed. Key words: hue, saturation, chroma, lightness, colour, cylindrical coordinates 1 Introduction The representation of the colour coordinates of images in cylindrical coordi- nates (hue, saturation/chroma, lightness), is widely used in the image analysis and computer vision community. There are however two main difficulties as- sociated with its use: (1) the large choice of available transformations from an RGB space, e.g. HSV (Smith, 1978), HSL, HMMD (Manjunath et al., 2002), HSB and HSI (Gonzalez and Woods, 1992). (2) the fact that some of these transformations have their saturation nor- malised by the lightness, making them unsuitable for some tasks. The latter concern can be understood by looking at the saturation images shown for three colour models in Figure 1. The saturation for the HSV and HSL models represent percentages of the maximum saturation obtainable for Preprint submitted to Elsevier 24 July 2007 (a) original (b) HSV (c) HSL (d) max − min Fig.
    [Show full text]