PDF/A in a Nutshell 2.0 PDF for Long-Term Archiving
Recommended publications
The LAS File Format Contains a Header Block, Variable Length
LAS Specification Version 1.2. Approved by ASPRS Board 09/02/2008.
LAS FORMAT VERSION 1.2: This document reflects the second revision of the LAS format specification since its initial version 1.0 release. Version 1.2 retains the same structure as version 1.1 including identical field alignment. LAS 1.1 file Input/Output (I/O) libraries will require slight modifications in order to be compliant with this revision. A LAS 1.1 Reader will read LAS 1.2 (without the new enhancements) with no modifications. A detailed change document that provides both an overview of the changes in the specification as well as the motivation behind each change is available from the ASPRS website in the LIDAR committee section.
The additions of LAS 1.2 include:
• GPS Absolute Time (as well as GPS Week Time) – LAS 1.0 and LAS 1.1 specified GPS “Week Time” only. This meant that GPS time stamps “rolled over” at midnight on Saturday. This makes processing of LIDAR flight lines that span the time reset difficult. LAS 1.2 allows both GPS Week Time and Absolute GPS Time (POSIX) stamps to be used.
• Support for ancillary image data on a per point basis. You can now specify Red, Green, Blue image data on a point by point basis. This is encapsulated in two new point record types (type 2 and type 3).
LAS FORMAT DEFINITION: The LAS file is intended to contain LIDAR point data records. The data will generally be put into this format from software (e.g.
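The week-time rollover described in the LAS excerpt above can be illustrated with a short sketch. It is only a hedged illustration, not part of the LAS specification: the function name `unwrap_week_time` and the half-week threshold used to detect the reset are assumptions chosen for this example, and it works on plain lists of seconds-of-week values rather than on real LAS point records.

```python
# Seconds in one GPS week: week-time stamps count from 0 up to this value
# and then start again at 0 when the week resets.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800


def unwrap_week_time(stamps):
    """Return week-time stamps made monotonic across weekly resets.

    `stamps` is a sequence of seconds-of-week values in acquisition order.
    A large backwards jump (more than half a week) is taken to be the
    weekly reset; this threshold is an assumption of the sketch.
    """
    unwrapped = []
    offset = 0.0
    previous = None
    for t in stamps:
        if previous is not None and t < previous - SECONDS_PER_WEEK / 2:
            # The counter wrapped: shift everything after it by one week.
            offset += SECONDS_PER_WEEK
        unwrapped.append(t + offset)
        previous = t
    return unwrapped


# A flight line that spans the reset: raw stamps jump back to near zero.
flight_line = [604798.0, 604799.5, 0.5, 2.0]
print(unwrap_week_time(flight_line))
```

With week time alone, sorting or differencing the raw stamps of such a flight line gives misleading results, which is the difficulty the excerpt attributes to LAS 1.0 and 1.1; an absolute time stamp avoids the need for this kind of unwrapping.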
“Why ODF?” - The Importance of OpenDocument Format for Governments
“Why ODF?” - The Importance of OpenDocument Format for Governments
Documents are the life blood of modern governments and their citizens. Governments use documents to capture knowledge, store critical information, coordinate activities, measure results, and communicate across departments and with businesses and citizens. Increasingly documents are moving from paper to electronic form. To adapt to ever-changing technology and business processes, governments need assurance that they can access, retrieve and use critical records, now and in the future. OpenDocument Format (ODF) addresses these issues by standardizing file formats to give governments true control over their documents. Governments using applications that support ODF gain increased efficiencies, more flexibility and greater technology choice, leading to enhanced capability to communicate with and serve the public.
ODF is the ISO Approved International Open Standard for File Formats
ODF is the only open standard for office applications, and it is completely vendor neutral. Developed through a transparent, multi-vendor/multi-stakeholder process at OASIS (Organization for the Advancement of Structured Information Standards), it is an open, XML-based document file format for displaying, storing and editing office documents, such as spreadsheets, charts, and presentations. It is available for implementation and use free from any licensing, royalty payments, or other restrictions. In May 2006, it was approved unanimously as an International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard.
Governments and Businesses are Embracing ODF
The promotion and usage of ODF is growing rapidly, demonstrating the global need for control and choice in document applications. For example, many enlightened governments across the globe are making policy decisions to move to ODF.
2 Portable Document Format (PDF)
ESCOLA TÈCNICA SUPERIOR D’ENGINYERIA ELECTRÒNICA I INFORMÀTICA LA SALLE
FINAL DEGREE PROJECT, COMPUTER ENGINEERING
PDFLab
STUDENT: Yuri González Azín
SUPERVISING PROFESSOR: Mª Antonia Mozota Coloma
RECORD OF THE FINAL DEGREE PROJECT EXAMINATION
The examining board having convened on the date given, the student Mr. Yuri González Azín presented his final degree project, which dealt with the following topic: PDFLab
Once the presentation was finished and the student had answered the objections raised by the members of the board, the board awarded the project the grade of
Barcelona,
BOARD MEMBER BOARD MEMBER BOARD PRESIDENT
Abstract
This project set out to develop the tool PDFLab. PDFLab is a tool that can read PDF files and generate an intermediate structure on which any other kind of application can be built. This document presents all of the study that was needed to develop the project, together with a survey of the applications available on the market and technical documentation of the development stages: analysis, design and implementation.
To my father, because this work is as much his as it is mine.
Summary
This project developed PDFLab, a tool that can read PDF files and generate an intermediate structure on which any other kind of application can be built. The development covers the study of the internal structure of PDF files, the study of development techniques (both methodological and design-related), requirements gathering, requirements analysis, object-oriented design, implementation, testing and maintenance.
Key Aspects in 3D File Format Conversions
Key Aspects in 3D File Format Conversions
Kenton McHenry and Peter Bajcsy, Image Spatial Data Analysis Group, NCSA
Presented by: Peter Bajcsy, National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
Outline
• Introduction
• What do we know about 3D file formats?
• Basic Archival Questions
• Is there an optimal format to convert to?
• Can we quantify 3D noise introduced during conversions?
• NCSA Polyglot to Support Archival Processes
• Automation of File Format Conversions
• Quality of File Format Conversions
• Scalability with Volume
• Conclusions
• Live demonstration
Introduction
Introduction to 3D File Format Reality: *.k3d *.pdf (*.prc, *.u3d) *.ma, *.mb, *.mp *.w3d *.lwo *.c4d *.dwg *.blend *.iam *.max, *.3ds
Introduction: Our Survey about 3D Content
• Q: How Many 3D File Formats Exist?
• A: We have found more than 140 3D file formats. Many are proprietary file formats. Many are extremely complex (1,200 and more pages of specifications).
• Q: How Many Software Packages Support 3D File Format Import, Export and Display?
• A: We have documented about 16 software packages. There are many more. Most of them are proprietary/closed source code. Many contain incomplete support of file specifications.
Examples of Formats and Stored Content
Format Geometry Appearance Scene Animation Faceted Parametric CSG B-Rep Color Material Texture Bump Lights Views Trans. Groups
3ds √ √ √ √ √ √ √ √ √
igs √ √ √ √ √ √ √
lwo √ √ √ √ √ √
obj √ √ √ √ √ √ √
ply √ √ √ √ √
stp √ √ √ √ √ √
wrl √ √ √ √ √ √ √ √ √ √ √
u3d √ √ √ √ √
The Microsoft Compound Document File Format
OpenOffice.org's Documentation of the Microsoft Compound Document File Format
Author: Daniel Rentz ✉ mailto:[email protected] http://sc.openoffice.org
License: Public Documentation License
Contributors:
Other sources: Hyperlinks to Wikipedia (http://www.wikipedia.org) for various extended information
Mailing list: ✉ mailto:[email protected]
Subscription: ✉ mailto:[email protected]
Download: PDF http://sc.openoffice.org/compdocfileformat.pdf, XML http://sc.openoffice.org/compdocfileformat.odt
Project started: 2004-Aug-30
Last change: 2007-Aug-07
Revision: 1.5
Contents
1 Introduction
1.1 License Notices
1.2 Abstract
1.3 Used Terms, Symbols, and Formatting
2 Storages and Streams
3 Sectors and Sector Chains
3.1 Sectors and Sector Identifiers
3.2 Sector Chains and SecID Chains
4 Compound Document Header
4.1 Compound Document Header Contents
4.2 Byte Order
4.3 Sector File Offsets
5 Sector Allocation
5.1 Master Sector Allocation Table
5.2 Sector Allocation Table
6 Short-Streams
File Formats
Electronic Records Management Guidelines - File Formats Rapid changes in technology mean that file formats can become obsolete quickly and cause problems for your records management strategy. A long-term view and careful planning can overcome this risk and ensure that you can meet your legal and operational requirements. Legally, your records must be authentic, complete, accessible, legally admissible in court, and durable for as long as your approved records retention schedules require. For example, you can convert a record to another, more durable format (e.g., from a nearly obsolete software program to a text file). That copy, as long as it is created in a trustworthy manner, is legally acceptable. The software in which a file is created usually has a default format, often indicated by a file name suffix (e.g., *.PDF for portable document format). Most software allows authors to select from a variety of formats when they save a file (e.g., document [DOC], Rich Text Format [RTF], text [TXT] in Microsoft Word). Some software, such as Adobe Acrobat, is designed to convert files from one format to another. Legal Framework: Key Concepts As you consider the file format options available to you, you will need to be familiar with the following concepts: Proprietary and non-proprietary file formats File format types Preservation: conversion and migration Compression Importance of planning File Format Decisions and Electronic Records Management Goals Proprietary and Non-proprietary File Formats A file format is usually described as either proprietary or non-proprietary: Proprietary formats. Proprietary file formats are controlled and supported by just one software developer, or can only be read by a limited number of other programs. -
EDTA Conference: Part 1
EDTA, iText and INBATEK Conference, Bangkok, July 27, 2017
© 2015, iText Group NV, iText Software Corp., iText Software BVBA
How standards drive business
• History of PDF
• Umbrella of standards
• Focus on: PDF/A, PDF/UA, PAdES, PDF 2.0, next-generation PDF
Speaking the same language
Not being able to understand each other is a punishment, NOT a business model! Standards are about speaking the same language!
History of PDF
Version                       Date                # pages     Content
Adobe PDF 1.0                 June 1993           230         43 tables, 42 figures
Adobe PDF 1.1                 23 January 1996     302         20 references
Adobe PDF 1.2                 12 November 1996    394         137 tables, 86 examples
Adobe PDF 1.3                 July 2000           696         223 tables, 73 figures
Adobe PDF 1.4                 December 2001       978         277 tables, 20 color plates
Adobe PDF 1.5                 August 2003         1172        333 tables, 70 figures
Adobe PDF 1.6                 November 2004       1236        370 tables, 80 figures
Adobe PDF 1.7                 October 2006        1310        389 tables, 98 figures
ISO 32000-1:2008 (PDF 1.7)    1 July 2008         756 (A4)    78 Normative References
ISO 32000-2:2017 (PDF 2.0)    2017                970 (A4)    5836 “shall”, 411 “should”
PDF: an umbrella of standards
PDF (Portable Document Format): first released by Adobe in 1993; ISO Standard since 2008 (ISO 32000)
Standard    Domain           Since    ISO number
PDF/X       graphic arts     2001     ISO 15930
PDF/A       archive          2005     ISO 19005
PDF/E       engineering      2008     ISO 24517
PDF/VT      printing         2010     ISO 16612
PDF/UA      accessibility    2012     ISO 14289
Related:
• EcmaScript (ISO)
• PRC (ISO)
• PAdES (ETSI)
• ZUGFeRD (DIN)
PDF/A
ISO 19005: long-term preservation
Goals and concept
• ISO-19005
• Long-term preservation of documents
• Approved parts will never become invalid
• Individual parts define new, useful features.
FileWeaver: Flexible File Management with Automatic Dependency Tracking
FileWeaver: Flexible File Management with Automatic Dependency Tracking
Julien Gori, Han L. Han, Michel Beaudouin-Lafon
Université Paris-Saclay, CNRS, Inria, Laboratoire de Recherche en Informatique, F-91400 Orsay, France
{jgori, han.han, mbl}@lri.fr
ABSTRACT
Knowledge management and sharing involves a variety of specialized but isolated software tools, tied together by the files that these tools use and produce. We interviewed 23 scientists and found that they all had difficulties using the file system to keep track of, re-find and maintain consistency among related but distributed information. We introduce FileWeaver, a system that automatically detects dependencies among files without explicit user action, tracks their history, and lets users interact directly with the graphs representing these dependencies and version history. Changes to a file can trigger recipes, either automatically or under user control, to keep the file consistent with its dependants. Users can merge variants of a file,
Specialized tools typically load and save information in proprietary and/or binary data formats, such as Matlab .mat files or SPSS .sav files. Knowledge workers have to rely on standardized exchange file formats and file format converters to communicate information from one application to the other, leading to a multiplication of files.
Moreover, as exemplified by Guo’s “typical” workflow of a data scientist [8, Fig. 2.1], knowledge workers’ practices often consist of several iterations of exploratory, production and dissemination phases, in which workers create copies of files to save their work, file revisions, e.g. to revise the logic of their code, and file variants, e.g.
Common Object File Format (COFF)
Application Report SPRAAO8–April 2009
Common Object File Format
ABSTRACT
The assembler and link step create object files in common object file format (COFF). COFF is an implementation of an object file format of the same name that was developed by AT&T for use on UNIX-based systems. This format encourages modular programming and provides powerful and flexible methods for managing code segments and target system memory. This appendix contains technical details about the Texas Instruments COFF object file structure. Much of this information pertains to the symbolic debugging information that is produced by the C compiler. The purpose of this application note is to provide supplementary information on the internal format of COFF object files.
Topics
1 COFF File Structure
2 File Header Structure
3 Optional File Header Format
4 Section Header Structure
5 Structuring Relocation Information
6 Symbol Table Structure and Content
Preservation with PDF/A (2nd Edition)
Preservation with PDF/A (2nd Edition)
Betsy A Fanning, AIIM
DPC Technology Watch Report 17-01, July 2017
Series editors on behalf of the DPC: Charles Beagrie Ltd.
Principal Investigator for the Series: Neil Beagrie
© Digital Preservation Coalition 2017, Betsy A Fanning 2017, and AIIM 2017, unless otherwise stated
ISSN: 2048-7916
DOI: http://dx.doi.org/10.7207/twr17-01
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior permission in writing from the publisher. The moral rights of the author have been asserted. First published in Great Britain in 2008 by the Digital Preservation Coalition. Second Edition 2017.
Foreword
The Digital Preservation Coalition (DPC) is an advocate and catalyst for digital preservation, ensuring our members can deliver resilient long-term access to digital content and services. It is a not-for-profit membership organization whose primary objective is to raise awareness of the importance of the preservation of digital material and the attendant strategic, cultural and technological issues. It supports its members through knowledge exchange, capacity building, assurance, advocacy and partnership. The DPC’s vision is to make our digital memory accessible tomorrow.
The DPC Technology Watch Reports identify, delineate, monitor and address topics that have a major bearing on ensuring our collected digital memory will be available tomorrow. They provide an advanced introduction in order to support those charged with ensuring a robust digital memory, and they are of general interest to a wide and international audience with interests in computing, information management, collections management and technology.
Design and Implementation of a Framework for Viewing and Analysis of Malicious Documents
Masaryk University, Faculty of Informatics
Design and implementation of a framework for viewing and analysis of malicious documents
Diploma Thesis
Bc. Richard Nossek
2013
Statement
I hereby declare that I have worked on this thesis independently using only the sources listed in the bibliography. All resources, sources, and literature that I used or drew on in preparing the thesis are properly quoted, with a full reference to the source.
Acknowledgement
I would like to thank my advisor, RNDr. Václav Lorenc, for his guidance and advice during my work on this thesis.
Abstract
The goal of this thesis is to provide an in-depth assessment of the use of the PDF (Portable Document Format) file format as an attack vector and of the current state of the field of malicious document analysis. First, we provide a detailed introduction to the inner organization and structure of PDF files and describe how different features can be used for obfuscation purposes. Next, we survey available options for viewing PDF documents in a web browser environment, as well as tools for PDF document analysis. The practical part consists of designing and implementing a web application that serves as a framework for malicious document analysis.
Keywords
Portable Document Format, PDF, malware, malicious documents, PDF analysis, PDF analysis tools, analysis framework.
Contents
Introduction
1 Portable Document Format
1.1 Version history
1.2 PDF architecture
1.3 PDF file structure
1.3.1 File header
1.3.2 File body
1.3.2.1 Boolean objects
Image Formats
Image Formats
Ioannis Rekleitis
CSCE 590: Introduction to Image Processing
https://en.wikipedia.org/wiki/Image_file_formats
Many different file formats
• JPEG/JFIF
• JPEG 2000
• GIF
• PNG
• TIFF
• PPM, PGM, PBM, and PNM
• Exif
• BMP
• WebP
• HDR raster formats
• HEIF
• BAT
• BPG
• JPEG/JFIF (Joint Photographic Experts Group) is a lossy compression method; JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. The JPEG/JFIF filename extension is JPG or JPEG. Nearly every digital camera can save images in the JPEG/JFIF format, which supports eight-bit grayscale images and 24-bit color images (eight bits each for red, green, and blue). JPEG applies lossy compression to images, which can result in a significant reduction of the file size. Applications can determine the degree of compression to apply, and the amount of compression affects the visual quality of the result. When not too great, the compression does not noticeably affect or detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. (JPEG also provides lossless image storage, but the lossless version is not widely supported.)
• JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. JPEG 2000 also adds features that are missing in JPEG. It is not nearly as common as JPEG, but it is used currently in professional movie editing and distribution (some digital cinemas, for example, use JPEG 2000 for individual movie frames).