Multimedia Information Retrieval Copyright © 2010 by Morgan & Claypool

Copyright © 2010 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other except for brief quotations in printed reviews) without the prior permission of the publisher.

Multimedia Information Retrieval
Stefan Rüger
www.morganclaypool.com

ISBN: 9781608450978 paperback
ISBN: 9781608450985 ebook
DOI 10.2200/S00244ED1V01Y200912ICR010

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON INFORMATION CONCEPTS, RETRIEVAL, AND SERVICES
Lecture #10
Series Editor: Gary Marchionini, University of North Carolina, Chapel Hill
Series ISSN
Synthesis Lectures on Information Concepts, Retrieval, and Services
Print 1947-945X
Electronic 1947-9468

Synthesis Lectures on Information Concepts, Retrieval, and Services
Editor: Gary Marchionini, University of North Carolina, Chapel Hill

Multimedia Information Retrieval
    Stefan Rüger, 2009
Information Architecture: The Design and Integration of Information Spaces
    Wei Ding, Xia Lin, 2009
Reading and Writing the Electronic Book
    Catherine C. Marshall, 2009
Hypermedia Genes: An Evolutionary Perspective on Concepts, Models, and Architectures
    Nuno M. Guimarães, Luís M. Carriço, 2009
Understanding User-Web Interactions via Web Analytics
    Bernard J. (Jim) Jansen, 2009
XML Retrieval
    Mounia Lalmas, 2009
Faceted Search
    Daniel Tunkelang, 2009
Introduction to Webometrics: Quantitative Web Research for the Social Sciences
    Michael Thelwall, 2009
Exploratory Search: Beyond the Query-Response Paradigm
    Ryen W. White, Resa A. Roth, 2009
New Concepts in Digital Reference
    R. David Lankes, 2009
Automated Metadata in Multimedia Information Systems: Creation, Refinement, Use in Surrogates, and Evaluation
    Michael G. Christel, 2009

Multimedia Information Retrieval
Stefan Rüger
The Open University
SYNTHESIS LECTURES ON INFORMATION CONCEPTS, RETRIEVAL, AND SERVICES #10
Morgan & Claypool Publishers

ABSTRACT

At its very core multimedia information retrieval means the process of searching for and finding multimedia documents; the corresponding research field is concerned with building the best possible multimedia search engines. The intriguing bit here is that the query itself can be a multimedia excerpt: For example, when you walk around in an unknown place and stumble across an interesting landmark, would it not be great if you could just take a picture with your mobile phone and send it to a service that finds a similar picture in a database and tells you more about the building, and about its significance for that matter?

This book goes further by examining the full matrix of a variety of query modes versus document types. How do you retrieve a music piece by humming? What if you want to find news video clips on forest fires using a still image? The text discusses underlying techniques and common approaches to facilitate multimedia search engines from metadata driven retrieval, via piggy-back text retrieval where automated processes create text surrogates for multimedia, automated image annotation and content-based retrieval. The latter is studied in great depth looking at features and distances, and how to effectively combine them for efficient retrieval, to a point where the readers have the ingredients and recipe in their hands for building their own multimedia search engines.

Supporting users in their resource discovery mission when hunting for multimedia material is not a technological indexing problem alone. We look at interactive ways of engaging with repositories through browsing and relevance feedback, roping in geographical context, and providing visual summaries for videos.

The book concludes with an overview of state-of-the-art research projects in the area of multimedia information retrieval, which gives an indication of the research and development trends and, thereby, a glimpse of the future world.

KEYWORDS

multimedia information retrieval, multimedia digital libraries, visual search, content-based retrieval, piggy-back text retrieval, automated image annotation, audiovisual fingerprinting, semantic gap, polysemy, multimedia features and distances, fusion of features and distances, high-dimensional indexing, video summaries, information visualisation, relevance feedback, geo-temporal browsing

Contents

Preface

1 What is Multimedia Information Retrieval?
    1.1 Information Retrieval
    1.2 Multimedia
    1.3 Multimedia Information Retrieval
    1.4 Challenges of Automated Multimedia Indexing
    1.5 Summary
    1.6 Exercises
        1.6.1 Memex
        1.6.2 Loops and Interaction
        1.6.3 Automated vs Manual
        1.6.4 Compound Text Queries
        1.6.5 Search Types

2 Basic Multimedia Search Technologies
    2.1 Metadata Driven Retrieval
    2.2 Piggy-back Text Retrieval
    2.3 Content-based Retrieval
    2.4 Automated Image Annotation
    2.5 Fingerprinting
        2.5.1 Audio Fingerprinting
        2.5.2 Image Fingerprinting
    2.6 Exercises
        2.6.1 Search Types Continued
        2.6.2 Intensity Histograms
        2.6.3 Fingerprint Block Probabilities
        2.6.4 Fingerprint Block False Positives
        2.6.5 Shazam’s Constellation Pairs
        2.6.6 One Pass Algorithm for Min Hash

3 Content-based Retrieval in Depth
    3.1 Content-based Retrieval Architecture
    3.2 Features
        3.2.1 Colour Histograms
        3.2.2 Statistical Moments
        3.2.3 Texture Histograms
        3.2.4 Shape
        3.2.5 Spatial Information
        3.2.6 Other Feature Types
    3.3 Distances
        3.3.1 Geometric Component-wise Distances
        3.3.2 Geometric Quadratic Distances
        3.3.3 Statistical Distances
        3.3.4 Probabilistic Distance Measures
        3.3.5 Ordinal and Nominal Distances
        3.3.6 String-based Distances
    3.4 Feature and Distance Standardisation
        3.4.1 Component-wise Standardisation using Corpus Statistics
        3.4.2 Range Standardisation
        3.4.3 Ratio Features
        3.4.4 Vector Normalisation
    3.5 High-dimensional Indexing
    3.6 Fusion of Feature Spaces and Query Results
        3.6.1 Single Query Example with Multiple Features
        3.6.2 Multiple Query Examples
        3.6.3 Order of Fusion
    3.7 Exercises
        3.7.1 Colour Histograms
        3.7.2 HSV Colour Space Quantisation
        3.7.3 CIE LUV Colour Space Quantisation
        3.7.4 Skewness and Kurtosis
        3.7.5 Boundaries for Tamura Features
        3.7.6 Distances and Dissimilarities
        3.7.7 Ordinal Distances - Pen-pal Matching
        3.7.8 Asymmetric Binary Features
        3.7.9 Jaccard Distance
        3.7.10 Levenshtein Distance
        3.7.11 Co-occurrence Dissimilarity
        3.7.12 Chain Codes and Edit Distance
        3.7.13 Time Warping Distance
        3.7.14 Feature Standardisation
        3.7.15 Curse of Dimensionality
        3.7.16 Image Search

4 Added Services
    4.1 Video Summaries
    4.2 Paradigms in Information Visualisation
    4.3 Visual Search and Relevance Feedback
    4.4 Browsing
    4.5 Geo-temporal Aspects of Media
        4.5.1 Importance of Geography as Context
        4.5.2 Geo-temporal Browsing and Access
    4.6 Exercises
        4.6.1 Interface Design and Functionality
        4.6.2 Shot Boundary Detection for Gradual Transitions
        4.6.3 Relevance Feedback: Optimal Weight Computation
        4.6.4 Relevance Feedback: Lagrangian Multiplier
        4.6.5 Geographic Attributes
        4.6.6 View of the World - Retrieval and Context

5 Multimedia Information Retrieval Research
    5.1 Multimedia Representation and Management
        5.1.1 GATE
        5.1.2 AXMEDIS
        5.1.3 NM2
        5.1.4 3DTV
        5.1.5 VISNET II
        5.1.6 X-Media
        5.1.7 MediaCampaign
        5.1.8 SALERO
        5.1.9 LUISA
        5.1.10 SEMEDIA
    5.2 Digital Libraries
        5.2.1 Greenstone
        5.2.2 Informedia
        5.2.3 SCULPTEUR
        5.2.4 CHLT
        5.2.5 DELOS
        5.2.6 BRICKS
        5.2.7 StoryBank
    5.3 Metadata and Automated Annotation
        5.3.1 aceMedia
        5.3.2 LAVA
        5.3.3 KerMIT
        5.3.4 POLYMNIA