A Framework for Mime Type Identification and Content Filtering In


The Pennsylvania State University
The Graduate School
Department of Computer Science and Engineering

A FRAMEWORK FOR MIME TYPE IDENTIFICATION AND CONTENT FILTERING IN THE FIREFOX WEB BROWSER

A Thesis in Computer Science and Engineering
by Matthew James Rummel

© 2012 Matthew James Rummel

Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science

December 2012

The thesis of Matthew James Rummel has been reviewed and approved* by the following:

Patrick McDaniel, Professor of Computer Science and Engineering, Thesis Adviser
Trent Jaeger, Associate Professor of Computer Science and Engineering
Lee Coraor, Associate Professor of Computer Science and Engineering, Director of Graduate Affairs

*Signatures are on file in the Graduate School

Abstract

Modern Web browser architectures allow for extensibility in order to support an evolving variety of content. Each supported plugin interacts with the browser and underlying host through a diverse set of operations that bring new challenges to the security model. These capabilities provide the means for a growing number of attack vectors that leverage the lax MIME type verification utilities in browsers to disguise malicious files. Once loaded by a browser, these objects take advantage of the escalated privileges available to their concealed payload in order to execute commands on the client. Such attacks can be launched from files shared on social media sites, through email, or from a server controlled by the attacker. To protect against these threats, we offer MIME Detector, a Firefox browser extension to identify and monitor the browser's use of loading objects. By utilizing a collection of open source tools and internal browser components, the tool is able to determine the MIME type of incoming content and enforce an acceptable use policy.
Our testing shows that this research provides a solid framework towards providing users with a greater level of control over how Web based content interacts with their client.

Table of Contents

List of Tables
List of Figures
Acknowledgments
Chapter 1. Introduction
    1.1 Camouflaging Malicious Content
        1.1.1 GIFAR
        1.1.2 Flash and ZIP Archives
        1.1.3 Chameleon Files
    1.2 Research Statement
Chapter 2. Related Work
    2.1 Client Filtering
        2.1.1 String Based Filtering
        2.1.2 Control Flow Detection
    2.2 Server Filtering
        2.2.1 Common Approaches
        2.2.2 Automata Based
    2.3 Comparison to Project
Chapter 3. Implementation
    3.1 User Interface
        3.1.1 Site Elements
        3.1.2 Settings
        3.1.3 Action Log
    3.2 Browser Interaction
        3.2.1 Channel Proxy
        3.2.2 Content Evaluation
    3.3 MIME Identification and HTML Parsing
Chapter 4. Evaluation
    4.1 Rule Set Tests
    4.2 MIME Identification Tests
    4.3 Web Browsing Test
Chapter 5. Conclusions
Appendix. Web Browsing Test Results
References

List of Tables

3.1 Monitored HTML tags and their associated reference attribute.
4.1 The result of the tag test evaluation.
4.2 The results of the camouflaged objects evaluation.
A.1 A sample rule set for general Web browsing.
A.2 An evaluation of identification results.
A.3 A listing of items blocked by the extension.
A.4 A comparison of collected performance metrics.

List of Figures

1.1 Sample Java and HTML code to launch a GIFAR attack that lists a user's files [9].
1.2 A Postscript file modified to contain HTML code and an HTML file with a GIF header [7].
3.1 The user interface tabs
3.2 The stages of a file’s evaluation

Acknowledgments

I am appreciative of the guidance I have received from my advisor, Dr. Patrick McDaniel. His perspective and feedback were instrumental in leading this thesis to successful completion. I am also grateful for the support of my family and friends. Their unwavering encouragement has always been a positive influence in all of my endeavors. Most of all, I would like to express my deepest gratitude to Allison for her understanding, patience, and reassurance throughout the duration of this project; I couldn't have done it without her.

Chapter 1
Introduction

The incorporation of Web 2.0 technologies in the World Wide Web has brought substantial changes to both the user experience and security model of Internet applications. As a platform for services and user content, Web based products allow for increased ease of collaboration and data dissemination amongst distributed parties. This vast dispersion of files originating from end users, combined with the execution of client side code, can also be leveraged to compromise the privacy of users and the integrity of their devices. A recent report by Symantec lists blogs and Web communication as the category of websites most frequently utilized to launch such an attack [1]. The report further cites plugins, including Oracle Java, Adobe Flash, and Adobe Acrobat Reader, as commonly providing a mechanism for many malicious exploits. Research has shown these manipulations to include cross site forgeries [8], cross site script attacks [7], and malware [10]. Additionally, it has been revealed that any type of file, even those as seemingly benign as images, can be used to exploit properties of Web architectures [9] [7].
Thus, the ability to add media to websites coupled with the requirement that browsers support rich content presents an ongoing challenge in browser security.

In this research, we examined a particular category of Web based attacks in which an object loaded into a browser is embedded with the payload of a malicious object of a different MIME type. By disguising malicious files in this manner, attackers are able to circumvent content policies enforced by both browsers and servers. Such attacks have been described as content repurposing by Sundareswaran and Squicciarini [29] and "chameleons" by Barth et al. [8]. The objective of this project was to develop a framework, implemented as a browser extension, to prevent such exploits.

1.1 Camouflaging Malicious Content

Regardless of the method used to repurpose content, there are some common characteristics that can be recognized in each approach. Each attack implements some form of digital steganography, or the practice of disguising data by placing it within other data, thereby concealing the secret payload [4]. Although standard MIME types have recognizable signatures, the process of finding all signatures within a given payload of data has proven to be a difficult task at both the client and server. Furthermore, when MIME types are inferred through different recognition techniques, it is possible that the server will identify the object as being of one type, while the client attempts to utilize it as though it were another. The following descriptions exemplify the attack vectors and capabilities of hidden Web content.

1.1.1 GIFAR

To date, the most highly publicized repurposing attack is the GIFAR, so named for its construction as a concatenation of an image, such as a GIF, and a Java archive, or JAR. The GIFAR vulnerability was presented at the Black Hat USA Conference in 2008 based on research by Billy Rios and Petko Petkov.
The attack was regarded as one of the top Web hacking techniques of that year based on its simplicity and ability to compromise a victim's privacy [14]. The vulnerability was patched shortly after the presentation and is no longer a threat in versions of Java since 1.6.0.11 and 1.5.0.17 [9].

A notable property that contributed to the effectiveness of the GIFAR is its distribution through images. While most Web applications will not allow executable code to be uploaded, images are frequently permitted and widely shared in social media and content management applications. In addition to third party sites, an attacker may consider storing the malicious content on their own domain and attract users to their site through advertisements or other means.

Once the GIFAR is stored on a third party server, the attacker must find a way to embed HTML code that enables the JAR to execute. This code can be inserted into the webpage due to lax text input sanitization or other attack methods whereby HTML can be injected. An additional method is to upload the GIFAR to a server and then send an HTML email to the victim. The HTML message would contain an <a> tag that embeds the <img> tag, thus referencing the GIFAR as a link. When the user clicks on the link, a page is loaded that invokes the applet and thus carries out the attack [29].

The overall extent of a GIFAR's effectiveness is largely based on the security measures in place on the client, the browser settings, and the security awareness of a potential victim. A number of these scenarios were discussed by Ron Brandis, a researcher at EWA-Australia [9]. If the firewall setting on the user's local machine prohibits the GIFAR from establishing a connection back to a server controlled by the attacker, then no information can be retrieved.
If a TCP tunnel can be established, then a fairly low level set of attacks could be launched to return information such as the target's internal IP address, send spam emails, or forward commands to botnets. The

    // Included in Evil.class in a JAR concatenated to evil.gif
    public class Evil extends JApplet {
        public void start() {
            Socket socket = new Socket(attackerIP, attackerPort);
            DataOutputStream out = new DataOutputStream(socket.getOutputStream());
            Process p = Runtime.getRuntime().exec("ls -l");
            BufferedReader in = new BufferedReader(new InputStreamReader(p.getInputStream()));
            String line = "";
            while ((line = in.readLine()) != null)
                out.writeUTF(line + "\n");
        }
    }

    <!-- Included in loading HTML code -->
    <html>
    <body>
    <img src=evil.gif>
    <applet archive=evil.gif code=Evil.class>
    </body>
    </html>

Fig.
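
Section 1.1 observes that standard MIME types have recognizable signatures, yet finding every signature inside a payload is difficult. As a minimal sketch of that observation, and not the MIME Detector implementation described in this thesis, the Java class below scans a byte array for two well-known magic numbers at any offset: the GIF89a header and the ZIP local file header that also begins each entry of a JAR. The class name SignatureScan, its methods, and the sample bytes are hypothetical and exist only for illustration; a real detector would need many more signatures and format-aware parsing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: report every known signature found at any offset.
class SignatureScan {
    // Two well-known magic numbers; a real tool would carry a much larger table.
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("image/gif", "GIF89a".getBytes(StandardCharsets.US_ASCII));
        // ZIP local file header "PK\003\004"; JAR files are ZIP archives.
        SIGNATURES.put("application/zip", new byte[] {0x50, 0x4B, 0x03, 0x04});
    }

    /** Returns one "type@offset" entry for each signature occurrence in data. */
    static List<String> findSignatures(byte[] data) {
        List<String> hits = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset++) {
            for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
                if (matchesAt(data, offset, entry.getValue())) {
                    hits.add(entry.getKey() + "@" + offset);
                }
            }
        }
        return hits;
    }

    /** True when sig appears in data starting exactly at offset. */
    private static boolean matchesAt(byte[] data, int offset, byte[] sig) {
        if (offset + sig.length > data.length) {
            return false;
        }
        for (int i = 0; i < sig.length; i++) {
            if (data[offset + i] != sig[i]) {
                return false;
            }
        }
        return true;
    }

    /** Harmless stand-in for a concatenated file: GIF header, 4 zero bytes, ZIP header. */
    static byte[] sampleConcatenation() {
        byte[] gif = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        byte[] zip = {0x50, 0x4B, 0x03, 0x04};
        byte[] out = new byte[gif.length + 4 + zip.length];
        System.arraycopy(gif, 0, out, 0, gif.length);
        System.arraycopy(zip, 0, out, gif.length + 4, zip.length);
        return out;
    }

    public static void main(String[] args) {
        // Prints [image/gif@0, application/zip@10]
        System.out.println(findSignatures(sampleConcatenation()));
    }
}
```

A check that inspects only the first bytes would label the sample as a GIF, while the full scan also reports the archive signature at offset 10. That mismatch between what a leading-bytes check sees and what the payload contains is the property the repurposing attacks in this chapter rely on.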