PDF/A for Scanned Documents
Webinar www.pdfa.org PDF/A for Scanned Documents Paper Becomes Digital Mark McKinney, LuraTech, Inc., President Armin Ortmann, LuraTech, CTO Mark McKinney President, LuraTech, Inc. © 2009 PDF/A Competence Center, www.pdfa.org Existing Solutions for Scanned Documents www.pdfa.org Black & White: TIFF G4 Color: Mostly JPEG, but sometimes PNG, BMP and other raster graphics formats Often special version formats like “JPEG in TIFF” Disadvantages: Several formats already for scanned documents Even more formats for born digital documents Loss of information, e.g. with TIFF G4 Bad image quality and huge file size, e.g. with JPEG No standardized metadata spread over all formats Not full text searchable (OCR) inside of files Black/White: Color: - TIFF FAX G4 - TIFF - TIFF LZW Mark McKinney - JPEG President, LuraTech, Inc. - PDF 2 Existing Solutions for Scanned Documents www.pdfa.org Bad image quality vs. file size TIFF/BMP JPEG TIFF G4 23.8 MB 180 kB 60 kB Mark McKinney President, LuraTech, Inc. 3 Alternative Solution: PDF www.pdfa.org PDF is already widely used to: Unify file formats Image à PDF “Office” Documents à PDF Other sources à PDF Create full-text searchable files Apply modern compression technology (e.g. the JPEG2000 file formats family) Harmonize metadata Conclusion: PDF avoids the disadvantages of the legacy formats “So if you are already using PDF as archival Mark McKinney format, why not use PDF/A with its many President, LuraTech, Inc. advantages?” 4 PDF/A www.pdfa.org What is PDF/A? • ISO 19005-1, Document Management • Electronic document file format for long-term preservation Goals of PDF/A: • Maintain static visual representation of documents • Consistent handing of Metadata • Option to maintain structure and semantic meaning of content • Transparency to guarantee access • Limit the number of restrictions Mark McKinney President, LuraTech, Inc.
[Show full text]