Bioinformatics-Inspired Analysis for Watermarked Images with Multiple Print and Scan

Bioinformatics-Inspired Analysis for Watermarked Images with Multiple Print and Scan By Abhimanyu Singh Garhwal A thesis submitted to Auckland University of Technology in fulfilment of the requirements for the degree of Doctor of Philosophy September 2017 Acronyms Used in This Thesis BIIA - Bioinformatics-Inspired Image Analysis BIIIA - Bioinformatics-Inspired Image Identification Approach BIIIG - Bioinformatics-Inspired Image Grouping Approach DNA – Deoxyribonucleic Acid MPS – Multiple Print and Scan MSA – Multiple Sequence Alignment NW - Non-Watermarked NWA – Needleman Wunch Algorithm NWD – Non-Watermarked and Degraded NWND – Non-Watermarked and Non-degraded PSA – Pairwise Sequence Alignment SWA – Smith Waterman Algorithm W – Watermarked WD – Watermarked and Degraded WND – Watermarked and Non-Degraded II Abstract Image identification and grouping through pattern analysis are the core problems in image analysis. In this thesis, the gap between bioinformatics and image analysis is bridged by using biologically-encoding and sequence-alignment algorithms in bioinformatics. In this thesis, the novel idea is to exploit the whole image which is encoded biologically in DNA without extracting its features. This thesis proposed novel methods for identifying and grouping images no matter whether having or not having watermarks. Three novel methods are proposed. The first is to evaluate degraded/non-degraded and watermarked/non-watermarked images by using image metrics. The bioinformatics-inspired image identification approach (BIIIA) is the second contribution, where two DNA-encoded images are aligned by using SWA algorithm or NWA algorithm to derive substrings, which are exploited for pattern matching so as to identify the images having a watermark or degradation generated from MPS. The outcomes of identification affirm the capability of BIIIA algorithm. Furthermore, it asserts that DNA-based encoding is the best way for digital images as well as SWA algorithm is the best one for the sequence alignment. The last one is the bioinformatics-inspired image grouping approach (BIIGA), where the DNA-encoded images are aligned by using multiple sequence alignment (MSA), which is exploited by using the phylogenetic tree to group the watermarked / non- watermarked and degraded / non-degraded images; the resultant analysis confirms the potential of BIIGA algorithm. All three methods are empirically verified and validated by using real datasets. Keywords: Multiple print and scan, multiple sequence alignment, local pairwise and global alignment, image quality metrics, image analysis, pattern matching, phylogenetic tree, bioinformatics tool. III Table of Contents Abstract ................................................................................................................... III List of Figures ......................................................................................................... VI List of Tables ....................................................................................................... VIII Attestation of Authorship ........................................................................................ X Acknowledgement .................................................................................................. XI 1. Introduction ............................................................................................................... 1 1.1 Motivation .................................................................................................................. 2 1.2 Scope .......................................................................................................................... 6 1.2.1 Concepts of Bioinformatics ....................................................................... 6 1.2.2 Sequence Analysis ..................................................................................... 9 1.2.3 Image Analysis ........................................................................................ 10 1.3 Thesis Structure ....................................................................................................... 14 1.3.1 Contribution of Thesis ............................................................................. 14 1.3.2 Organisation of Thesis ............................................................................. 14 1.4 Summary .................................................................................................................. 16 2. Literature Survey .................................................................................................... 17 2.1 Image Analysis ........................................................................................................ 18 2.1.1 Image Analysis ........................................................................................ 18 2.1.2 Major Methodologies in Image Analysis ................................................. 18 2.2 Bioinformatics Image Analysis ................................................................................ 20 2.2.1 Existing Techniques of Bioinformatics Image Analysis .......................... 20 2.2.2 Biologically-Based Image Representation ............................................... 21 2.2.3 Biological Sequence Alignment .............................................................. 23 2.2.4 Phylogenetic Tree for Sequence Visualisation ........................................ 29 2.3 Image Watermarking Systems ................................................................................. 38 2.3.1 Background and Components of Digital watermarking .......................... 38 2.3.2 Fundamental Properties of Watermarking ............................................... 43 2.4 Multiple Print and Scan (MPS) ................................................................................ 49 2.4.1 Image Degradation Due to MPS .............................................................. 50 2.4.2 Metrics for Measuring Image Degradation .............................................. 53 2.5 Research Problems ................................................................................................... 58 2.6 Summary .................................................................................................................. 59 3. Research Methodology ........................................................................................... 61 3.1 Introduction .............................................................................................................. 62 3.2 Empirical Research Methodology ............................................................................ 63 3.3 Research Problems and Open Questions ................................................................. 64 3.3.1 Research Problems ................................................................................... 64 3.3.2 Research Questions .................................................................................. 65 3.4 Proposed Method ..................................................................................................... 66 3.4.1 Limitations of Previous Methods ............................................................. 66 3.4.2 Hypothesis ............................................................................................... 67 3.4.3 Evaluating Image Degradation from MPS ............................................... 68 3.4.4 The Idea for Developing BIIIA and BIIGA Algorithm ........................... 68 3.5 Design of Experimental Methods ............................................................................ 68 3.6 Analysis of BIIIA and BIIGA .................................................................................. 70 IV 3.7 Evaluation and Review of Output Reports .............................................................. 70 3.8 Tools ........................................................................................................................ 70 3.8.1 uMark ....................................................................................................... 71 3.8.2 Printer and Scanner .................................................................................. 71 3.8.3 Image Datasets ......................................................................................... 72 3.8.4 WUtils.com & Tomeko.net Web Tools ................................................... 78 3.8.5 JAligner .................................................................................................... 79 3.8.6 MAFT ...................................................................................................... 80 3.8.7 MEGA7 .................................................................................................... 81 3.8.8 Clamscan .................................................................................................. 82 3.9 Summary .................................................................................................................. 82 4. Evaluations of Image Degradation from MPS ..................................................... 84 4.1 Introduction .............................................................................................................. 85 4.2 Image Quality Metrics ............................................................................................. 85 4.3 A Novel Method for Evaluating Images from MPS ................................................ 86 4.4 Results .....................................................................................................................

Bioinformatics-Inspired Analysis for Watermarked Images with Multiple Print and Scan

Simplified Matching Algorithm Using a Translated Codon (Tron)

Gap Opening Penalty Formula

Alignment Principles and Homology Searching Using (PSI-)BLAST

Sequence Analysis

Sequence Alignment

Gap Penalty in Sequence Alignment Pdf

Structural and Evolutionary Considerations for Multiple Sequence Alignment of RNA, and the Challenges for Algorithms That Ignore Them

The Biologist's Guide to Paracel's Similarity Search Algorithms

Sequence Alignment Algorithms

Aligning Coding Sequences with Frameshift Extension Penalties