Combining Generic Programming with Vector Processing for Machine Vision

University of Wollongong Research Online
University of Wollongong Thesis Collection 1954-2016
2005

Recommended Citation
Lai, Bing-Chang, Combining generic programming with vector processing for machine vision, PhD thesis, School of Information Technology and Computer Science, University of Wollongong, 2005. http://ro.uow.edu.au/theses/326
Combining Generic Programming with Vector Processing for Machine Vision

A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy (Computer Science) from UNIVERSITY OF WOLLONGONG by Bing-Chang Lai, BCompSc (Hons)
School of Information Technology and Computer Science
2005

Declaration

I, Bing-Chang Lai, hereby declare that I am the sole author of this thesis. I also declare that the material presented within is my own work, except where duly acknowledged, and that I am not aware of any similar work either prior to this thesis or currently being pursued.

___________________________
Bing-Chang Lai
___________________________
Date

Abstract

This thesis addresses the integration of generic programming with vector processing for machine vision. While generic libraries have been shown to provide near optimal performance without sacrificing flexibility and adaptability, current generic libraries do not utilise the vector processing unit (VPU), nor can they be vectorised directly. Generic vectorised libraries require a mechanism for expressing vectorised algorithms independently of the VPU. This is a problem since different VPUs can have different instructions and different limitations; programs written to use one vector technology are not portable to other vector technologies. Lastly, most existing machine-vision libraries do not provide image capture from sequence grabbers; the programmer has to use another library to capture images, and to supply additional code to enable the two libraries to work together.

To allow vectorised, machine-vision algorithms to be portable across different vector technologies, this thesis proposes the use of an abstract VPU. The abstract VPU represents a set of real VPUs with a virtual VPU that has an idealised instruction set and constraints common to the real VPUs being represented.
An abstract VPU, named Virtual Vector Machine (VVM), was developed to support generic programming. Different methods of implementing VVM were evaluated against hand-coded AltiVec (a vector technology found in PowerPC G4 and G5 processors) and scalar programs. The implementation chosen has no significant overheads when processing VVM vectors that use a single AltiVec vector or a single scalar when compiled using Apple GCC 3.1 20021003. VVM vectors that use a single AltiVec vector or scalar cover all byte AltiVec vectors in AltiVec mode and all types in scalar mode. When processing VVM vectors that use more than one AltiVec vector, the VVM implementation chosen is at most 24% slower than a hand-coded program.

Vectorised algorithms are difficult to implement, because they must handle VPU-specific issues such as memory alignments, edges and prefetching. Thus, to reduce the number of algorithms required, a categorisation of image processing operations based on input-to-output correlation is proposed. This categorisation maps easily to generic programming and provides implementation hints. The categorisation scheme separates image processing operations into three categories, which this thesis refers to as quantitative, transformative and convolutive operations. Quantitative operations require one input element to produce zero or more output elements. Transformative operations require one input element to produce one output element. Convolutive operations produce a single output element from a rectangle of input elements. Each category requires only one general algorithm. Variations of the algorithm to handle differing input and output set combinations are also required.

The generic, vectorised, machine-vision library, Vectorised Vision (VVIS), developed in this thesis uses the abstract VPU (VVM) and the three categories to provide cross-platform, vectorised algorithms.
Because the division of duties used by existing generic libraries is unsuitable for vectorisation, two new divisions of duties are proposed and their performance is evaluated. Generic, vectorised algorithms for each category were evaluated against hand-coded programs, and the speedup gained by using VVIS in AltiVec mode instead of scalar mode was collated. For quantitative and transformative operations on single-channel byte images, the VVIS implementation is comparable to hand-coded AltiVec and scalar programs. For convolutive operations, the final VVIS implementation is twice as slow as the hand-coded program in AltiVec mode when processing single-channel byte images, because the hand-coded AltiVec program did not need to support variable kernel sizes. In scalar mode, the final VVIS convolutive algorithm is comparable to hand-coded scalar programs. VVIS was slower than hand-coded programs when processing non-byte images, because of overheads introduced by VVM. For the transformative operations tested, VVIS in AltiVec mode provides at most a fourfold speedup over scalar mode, depending on the operation, when processing single-channel byte images. For the convolutive operations tested, VVIS provides a speedup of approximately 1.5 times when processing single-channel byte images, despite using signed short internally; the VVM implementation used had noticeable overheads when operating on signed shorts in AltiVec mode.

Results show that a generic, vectorised, machine-vision library generally has comparable performance to hand-coded programs when processing single-channel byte images. Because of overheads introduced by the VVM implementation, VVIS does not have comparable performance for all image types. Most image processing operations, however, use either single-channel byte or multi-channel byte images.
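The three categories described in the abstract can be illustrated with a small scalar sketch. The code below is not taken from VVIS: the `Image` type, the function names `quantitative`, `transformative` and `convolutive`, and the `box_mean` helper are hypothetical names chosen for illustration. It makes no attempt at vectorisation, alignment handling or prefetching, and it simply leaves edge pixels untouched in the convolutive case. It only shows how each category's input-to-output correlation might map onto one generic algorithm parameterised by a functor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical single-channel byte image, row-major (illustration only).
struct Image {
    std::size_t width, height;
    std::vector<std::uint8_t> pixels;
    Image(std::size_t w, std::size_t h, std::uint8_t fill = 0)
        : width(w), height(h), pixels(w * h, fill) {}
    std::uint8_t at(std::size_t x, std::size_t y) const { return pixels[y * width + x]; }
    std::uint8_t& at(std::size_t x, std::size_t y) { return pixels[y * width + x]; }
};

// Quantitative: every input element is visited once; the functor decides
// whether it contributes zero or more outputs (e.g. a histogram bin).
template <typename Accumulate>
void quantitative(const Image& in, Accumulate acc) {
    for (std::uint8_t p : in.pixels) acc(p);
}

// Transformative: exactly one output element per input element.
template <typename UnaryOp>
void transformative(const Image& in, Image& out, UnaryOp op) {
    assert(in.width == out.width && in.height == out.height);
    for (std::size_t i = 0; i < in.pixels.size(); ++i)
        out.pixels[i] = op(in.pixels[i]);
}

// Convolutive: one output element from a (2*radius+1)-square window of
// input elements. Pixels closer than `radius` to an edge are skipped.
template <typename WindowOp>
void convolutive(const Image& in, Image& out, std::size_t radius, WindowOp op) {
    assert(in.width == out.width && in.height == out.height);
    for (std::size_t y = radius; y + radius < in.height; ++y)
        for (std::size_t x = radius; x + radius < in.width; ++x)
            out.at(x, y) = op(in, x, y, radius);
}

// Example window functor: integer mean of the window centred on (cx, cy).
inline std::uint8_t box_mean(const Image& in, std::size_t cx, std::size_t cy,
                             std::size_t r) {
    std::size_t sum = 0;
    for (std::size_t y = cy - r; y <= cy + r; ++y)
        for (std::size_t x = cx - r; x <= cx + r; ++x)
            sum += in.at(x, y);
    const std::size_t side = 2 * r + 1;
    return static_cast<std::uint8_t>(sum / (side * side));
}
```

Under these assumptions, a histogram would be passed to `quantitative` as a lambda that increments a bin, a threshold would be passed to `transformative` as a lambda that maps one pixel value to another, and a mean filter would be `convolutive(in, out, 1, box_mean)`.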
Acknowledgements

I would like to thank my primary supervisor, Associate Professor Phillip John McKerrow, for providing guidance and support, and allowing me the freedom to pursue my interests; and my secondary supervisor, Dr Jo Abrantes, for sharing her valuable experiences and providing support while Phillip was away. I would also like to thank Associate Professor Neil Gray for taking time to review and to provide comments on this thesis.

I would like to thank my parents for giving me the opportunity to undertake a Ph.D. at the University of Wollongong and for raising me to be a responsible, upstanding citizen. I would also like to express my gratitude to my elder brother, Hsuan-Cheng Lai, for checking part of this thesis. Finally, I would like to thank Chia-Yun Chen, my girlfriend, who kept me company and motivated me throughout my thesis. I would also like to mention my cat, Jaime, who re-wrote portions of my thesis by walking on my keyboard.

This research was supported by a grant from the Apple University Development Fund in Australia.

Publications from this Thesis

Lai, B., McKerrow, P.J. & Woolley, D. (2001), Developing a Java API for digital video control using the FireWire SDK, in N. Smythe, ed., ‘e-Xplore 2001: a face to face odyssey’, Apple University Consortium.
Lai, B. & McKerrow, P.J. (2001), Programming the Velocity Engine, in N. Smythe, ed., ‘e-Xplore 2001: a face to face odyssey’, Apple University Consortium.
Lai, B. & McKerrow, P.J. (2001), Image processing libraries, Australasian Conference on Robotics & Automation.
Lai, B., McKerrow, P.J. & Abrantes, J. (2002), Vectorized machine vision algorithms using AltiVec, Australasian Conference on Robotics & Automation.
Lai, B., McKerrow, P.J. & Abrantes, J. (2005), The abstract vector processor, Microprocessors and Microsystems, Accepted June 10th 2005.
Lai, B. & McKerrow, P.J. (2005), Design of a generic, vectorised, machine-vision library, in Cutting Edge Robotics, ISBN 3-86611-038-3, International Journal
