PDF/A for Scanned Documents

<p><strong>Webinar </strong></p><p><a href="/goto?url=http://www.pdfa.org" target="_blank"><strong>www.pdfa.org </strong></a></p><p><strong>PDF/A for Scanned Documents </strong></p><p><strong>Paper Becomes Digital </strong></p><p><strong>Mark McKinney, LuraTech, Inc., President<br>Armin Ortmann, LuraTech, CTO </strong></p><p>Mark McKinney <br>President, LuraTech, Inc. </p><p>© 2009&nbsp;<a href="/goto?url=http://www.pdfa.org" target="_blank">PDF/A Competence Center, www.pdfa.org </a></p>
<p><strong>Existing Solutions for Scanned Documents </strong></p><p><strong>Black &amp; White: TIFF G4<br>Color: Mostly JPEG, but sometimes PNG, BMP and other raster graphics formats </strong></p><p><strong>Often special format variants such as “JPEG in TIFF” </strong></p><p><strong>Disadvantages: </strong></p><p><strong>Several formats already for scanned documents<br>Even more formats for born-digital documents<br>Loss of information, e.g. with TIFF G4<br>Poor image quality and huge file sizes, e.g. with JPEG<br>No standardized metadata across all formats<br>Files are not full-text searchable (no OCR text inside the files) </strong></p><p><strong>Black/White: Color: </strong>- <strong>TIFF FAX G4&nbsp;</strong>- <strong>TIFF </strong><br>- <strong>TIFF LZW </strong>- <strong>JPEG </strong>- <strong>PDF </strong></p>
<p><strong>Existing Solutions for Scanned Documents </strong></p><p><strong>Bad image quality vs. file size </strong></p><ul style="display: flex;"><li style="flex:1"><strong>TIFF G4: 60 kB </strong></li><li style="flex:1"><strong>JPEG: 180 kB </strong></li><li style="flex:1"><strong>TIFF/BMP: 23.8 MB </strong></li></ul>
<p><strong>Alternative Solution: PDF </strong></p><p><strong>PDF is already widely used to: </strong></p><p><strong>Unify file formats </strong></p><p>Image → PDF<br>“Office” documents → PDF<br>Other sources → PDF </p><p><strong>Create full-text searchable files<br>Apply modern compression technology (e.g. the JPEG2000 family of file formats)<br>Harmonize metadata </strong></p><p><strong>Conclusion: PDF avoids the disadvantages of the legacy formats </strong></p><p><strong>“So if you are already using PDF as an archival format, why not use PDF/A with its many advantages?” </strong></p>
<p><strong>PDF/A </strong></p><p><strong>What is PDF/A? </strong><br>• <strong>ISO 19005-1, Document Management </strong><br>• <strong>Electronic document file format for long-term preservation </strong></p><p><strong>Goals of PDF/A: </strong><br>• <strong>Maintain static visual representation of documents </strong><br>• <strong>Consistent handling of metadata </strong><br>• <strong>Option to maintain structure and semantic meaning of content </strong><br>• <strong>Transparency to guarantee access </strong><br>• <strong>Limit the number of restrictions </strong></p>
<p><strong>PDF/A – Full-Text Searchability (OCR) </strong></p><p><strong>Benefit: Searchable at the File Level </strong></p><p><strong>Digital library: “after book download”<br>Large manuals / multi-page construction documents<br>Documents downloaded from archive databases<br>Documents sent to customers, suppliers, lawyers, etc. as email attachments </strong></p>
<p><strong>PDF/A – Enhanced Compression </strong></p><p><strong>For Black &amp; White Documents </strong></p><p><strong>JBIG2 - ISO/IEC 14492 </strong></p><p>Used as an alternative to TIFF G4<br>Fully lossless and visually lossless modes<br>Embedded in PDF/A, available in Acrobat Reader </p><ul style="display: flex;"><li style="flex:1"><strong>FAX G4: 60 kB </strong></li><li style="flex:1"><strong>JBIG2/lossless: 46 kB </strong></li><li style="flex:1"><strong>JBIG2/lossy: 29 kB </strong></li></ul>
<p><strong>PDF/A – Enhanced Compression </strong></p><p><strong>For Color Documents </strong></p><p><strong>MRC (Mixed Raster Content) compression, as used in JPEG2000 Part 6 (JPM)<br>Splits documents into three layers that are compressed independently and stored in PDF/A </strong></p>
<p><strong>PDF/A – Enhanced Compression </strong></p><p><strong>For Color Documents </strong></p><p><strong>Extreme compression, fully legible<br>Preserves color and visual quality </strong></p><ul style="display: flex;"><li style="flex:1"><strong>TIFF: 23.8 MB </strong></li><li style="flex:1"><strong>JPEG: 180 kB </strong></li><li style="flex:1"><strong>TIFF G4: 60 kB </strong></li><li style="flex:1"><strong>PDF/A: 65 kB </strong></li></ul>
<p><strong>PDF Compressor Basics: How it works </strong></p><p><strong>[Diagram: Paper, Scanner, TIFF / JPEG, Network / Workflow, LuraDocument PDF Compressor (Conversion and Optimization Process), PDF, Storage / ECM] </strong></p><p><strong>Convert scanned documents </strong></p><p><strong>Batch conversion, “unattended”<br>Fully automated </strong></p>
<p><strong>Demo </strong></p><p><strong>Armin, let’s have a look! </strong></p>
<p><strong>Question: </strong></p><p><strong>PDF/A: hype or the future archiving format? </strong></p>
<p><strong>PDF/A – Example e-Government </strong></p><p><strong>Medical and Student Records<br>State of New York Long-term Archive </strong></p><p><strong>Department of Health<br>Department of Education </strong></p><p><strong>Project Outline </strong></p><p><strong>Previously using 1 terabyte of storage every 2 weeks<br>Capture all documents through a scan service provider using Fujitsu and Kodak scanners </strong></p><p><strong>Convert images to optimized PDF/A with LuraDocument PDF Compressor </strong></p><p><strong>Deliver and store PDF/A documents with ECM </strong></p><p><strong>Results </strong></p><p><strong>Highly compressed PDF/A files reduce storage costs and bandwidth needs by 90% </strong></p><p><strong>Long-term readability of all files with retention times of over 40 years </strong></p><p><strong>Files are now available quickly for daily research<br>AIIM 2008: Best Practices Award<br>GTC West 2008: Best Solutions Award </strong></p>
<p><strong>PDF/A – Example Credit Files </strong></p><p><strong>Mailroom for credit files and international checks<br>Example: HeLaBa (German State Bank) Mailroom </strong></p><p><strong>Revenue: 168B Euros<br>Employees: 5,700 </strong></p><p><strong>Project Outline </strong></p><p><strong>Convert a paper-based archive of 20 million pages to PDF/A<br>Convert all daily incoming mail to PDF/A<br>Create complete electronic credit files<br>Tools used: LuraTech PDF Compressor, Kofax Ascent, EMC Centera, Wincor Nixdorf archive:net (Taxnet) </strong></p><p><strong>Results </strong></p><p><strong>Full-color scans in the electronic archive<br>Highly compressed PDF/A files<br>Full-text searchable credit files<br>Long-term readability of credit files<br>First step on the way to a single archiving format </strong></p>
<p><strong>Billions of Pages Preserved </strong></p><p><strong>Airbus (D)<br>AOK (D)<br>APO-Bank (D)<br>Bank Julius Baer (CH)<br>Blohm &amp; Voss (D)<br>Bosch Rexroth (D)<br>British Library (UK)<br>City of Arlington (USA)<br>City of Toronto (CA)<br>DAK Insurance (D)<br>Department of Defense (USA)<br>Harvard Library (USA)<br>Het Utrechts Archief (NL)<br>International Labor Organization (CH)<br>Library of Congress (USA)<br>OCE (NL/D)<br>RWE Energy (D)<br>Siemens (D)<br>Southern Nuclear (USA)<br>Southern CA Edison (USA)<br>Sparkassen Informatik (D)<br>State of New York (USA)<br>Swiss RE (CH)<br>Universa Insurance (D)<br>Vattenfall (D)<br>West LB (D) </strong></p><p><strong>A few of the projects that LuraTech knows about. </strong></p>
<p><strong>PDF/A for Scanned Documents </strong></p><p><strong>Thanks for your interest! </strong><br><strong>Please fill out our questionnaire. </strong></p><p><strong>Demo software or more information? </strong></p><p><a href="mailto:[email protected]" target="_blank"><strong>[email protected] </strong></a></p>
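<p>The file sizes and the storage figures quoted on the slides can be put into perspective with a little arithmetic. The sketch below is only an illustration, not part of the webinar: the pairing of each format with its size is our reading of the slide layout, 1 MB is taken as 1000 kB, a year is taken as 52 weeks, and the 90% reduction from the State of New York example is assumed to apply uniformly to the stored volume. The function names are hypothetical.</p>

```python
# Sizes quoted on the slides, in kB (assumption: 1 MB = 1000 kB).
# The format-to-size pairing is our reading of the flattened slide layout.
SIZES_KB = {
    "TIFF/BMP (uncompressed)": 23.8 * 1000,
    "JPEG": 180,
    "PDF/A (MRC)": 65,
    "TIFF G4": 60,
    "JBIG2 lossless": 46,
    "JBIG2 lossy": 29,
}


def compression_ratio(reference_kb, compressed_kb):
    """How many times smaller the compressed file is than the reference."""
    return reference_kb / compressed_kb


def yearly_storage_tb(tb_per_period, period_weeks, reduction):
    """Storage consumed per year after applying a fractional size reduction.

    Assumes a 52-week year and that the reduction applies to all stored data.
    """
    periods_per_year = 52 / period_weeks
    return tb_per_period * periods_per_year * (1 - reduction)


if __name__ == "__main__":
    reference = SIZES_KB["TIFF/BMP (uncompressed)"]
    for name, size_kb in SIZES_KB.items():
        print(f"{name:26s} {size_kb:8.0f} kB  ratio {compression_ratio(reference, size_kb):6.1f}:1")

    # State of New York example: 1 TB every 2 weeks before, 90% reduction after.
    before = yearly_storage_tb(1, 2, 0.0)
    after = yearly_storage_tb(1, 2, 0.9)
    print(f"Storage per year: {before:.1f} TB before, {after:.1f} TB after")
```

<p>Under these assumptions, the color PDF/A file is several hundred times smaller than the uncompressed scan while staying close to the size of the black-and-white TIFF G4 file, which is the point the comparison slides make.</p>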
