File Formats Slides
Workshop Background
Purpose
• To provide you with resources and tools to help you handle file format decisions as a researcher.
Context
• Workshop Series: Preservation and Curation of ETD Research Data and Complex Digital Objects
• Other topics: Copyright, Data Organization, Metadata, Storage, Version Control
• https://educopia.org/research/etdplus

Learning Objectives
• Understand that you have a range of file format options and that each choice has implications for future use and access.
• Gain exposure to tools for archiving particularly challenging file types (e.g., web pages).
• Understand how to reduce your risk by using export and “save as” functions.

Examples of file formats
• Images: jpg, gif, tiff, png, ai, svg, ...
• Video: mpeg, m2tvs, flv, dv, ...
• GIS: kml, dxf, shp, tiff, ...
• CAD: dxf, dwg, pdf, ...
• Data: csv, mdf, fp, spv, xls, tsv, ...

Key concept
The file formats you choose will determine how easy (or difficult!) your research outputs are to access and build upon in the future.

How to choose
• Use software that imports and exports data in common formats to which you know you’ll have long-term access.
• Ask advisors and colleagues what formats they use and why.
• Choose a format with functions that support your research needs (e.g., collaboration).
• Save your content in multiple formats to spread your risk across software platforms (e.g., docx, pdf, & txt; or mp4, avi, & mpg).
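The advice to save content in multiple formats can be sketched for tabular data. This is a minimal illustration using only the Python standard library, not a method from the workshop: the function name `save_in_multiple_formats`, its parameters, and the choice of JSON as a third format are all assumptions made for the example.

```python
import csv
import json
from pathlib import Path


def save_in_multiple_formats(rows, fieldnames, out_dir, stem):
    """Write the same records as CSV, TSV, and JSON; return the paths written.

    Hypothetical helper for illustration. `rows` is a list of dicts whose
    keys match `fieldnames`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []

    # Comma- and tab-separated text files can be opened by most tools.
    for suffix, delimiter in ((".csv", ","), (".tsv", "\t")):
        path = out_dir / f"{stem}{suffix}"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=delimiter)
            writer.writeheader()
            writer.writerows(rows)
        written.append(path)

    # JSON (an assumed extra format) keeps field names attached to each record.
    json_path = out_dir / f"{stem}.json"
    json_path.write_text(json.dumps(rows, indent=2, ensure_ascii=False), encoding="utf-8")
    written.append(json_path)
    return written
```

Calling `save_in_multiple_formats(records, ["site", "depth_m"], "exports", "samples")` would write `samples.csv`, `samples.tsv`, and `samples.json` into an `exports` folder. Plain-text copies like these complement, rather than replace, the file produced by your main research software.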
Informing your decision
• Sustainability of Digital Formats https://www.loc.gov/preservation/digital/formats/intro/intro.shtml
• Recommended Formats Statement https://www.loc.gov/preservation/resources/rfs/

Archiving Web-based Resources
• Wayback Machine (Internet Archive) https://archive.org/web/
• Robust Links http://robustlinks.mementoweb.org/
• Screen shots

File Format Conversions
• Options include proprietary, freeware, and open source solutions.
• Formats in broad use usually have more available options for conversion.
• When you convert the file, recognize that the process may transform your content.
• Before you convert, identify what characteristics are most important to maintain in the conversion process.

Specific PDF advice
• Embed fonts.
• Embed hyperlinks.
• Stabilize hyperlinks.
• Store supplementary materials as separate files.
• Verify PDF/A compliance.
• Test EVERYTHING.

File Formats
There is no perfect file format. Each will have advantages and disadvantages depending on your research uses. Select a file format, or set of file formats, that helps you complete your research now, and that you can access again in the future. This is important both for your research outputs (what you create) and your research inputs (materials you use in the research process).

Common file types include:
▪ Images: jpg, gif, tiff, png, ai, svg, …
▪ Video: mpeg, m2tvs, flv, dv, …
▪ GIS: kml, dxf, shp, tiff, …
▪ CAD: dxf, dwg, pdf, …
▪ Data: csv, mdf, fp, spv, xls, tsv, …
▪ Text: txt, rtf, tvi, doc, pdf, …

Consider what might happen if you can no longer use your software. Whether the software publisher goes bankrupt, the latest version refuses to read older data, or you can’t afford a personal license for it after you graduate, the end result is the same. Losing access to your software can mean losing your data, especially if it is the only software that can read your data.

How to select file formats:
● Use software that imports and exports data in common formats.
● Ask advisors and colleagues what formats they use.
● Choose a format with functions that support your research needs.
● Save final versions of your content in multiple formats in order to spread your risk across multiple software platforms (e.g., docx, pdf, and txt; or mp4, avi, and mpg).

Before you undertake any conversion, you need to identify what characteristics of your data are important to maintain during the conversion. For example, are the colors in a document or image important? Is the pagination essential? What about references? You will want to test these after your conversion is complete to ensure that the result meets your needs.

If you use website-based materials as evidence or references, take precautions to ensure that if the content moves, changes, or disappears, you still have evidence of its existence. Current tools to help you ensure the longevity of these materials include Robust Links and Archive-It. You can also take screenshots of important digital content in order to preserve the look and feel of an object.

Many ETD programs favor pdf files. If you export research outputs to pdf, make sure you:
1. Embed your fonts
2. Embed (and test!) hyperlinks
3. Stabilize your web-based resources and citations (using a tool like Robust Links, Archive-It, or Perma.cc)
4. Store supplementary materials as separate files
5. Verify the PDF/A compliance (use the Acrobat Pro “Preflight” feature under “Edit”)

Additional Resources:
● List of File Formats (Wikipedia)
● Recommended Formats Statement (Library of Congress)
● Evaluating Your File Formats (UK National Archives)
● Reformatting Guides (US National Archives)

Activity
• Look at a folder of your research materials and answer the following questions.
• What software do you need to access these materials?
• Do you face a risk of losing access to that software, now or in the future?
• Would a colleague be able to open and use your materials if you shared them?
• Can you submit your thesis/dissertation and its related research materials using the file formats supported by the software you are using?
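One way to begin the Activity is to list which file types a research folder actually contains. The sketch below is an illustration only, written with the Python standard library; the function names `count_extensions` and `print_inventory` and the "(no extension)" label are assumptions for the example, not part of the workshop materials.

```python
from collections import Counter
from pathlib import Path


def count_extensions(folder):
    """Count files under `folder` (including subfolders) by lowercase extension."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under a placeholder label.
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def print_inventory(folder):
    """Print one line per extension, most common first."""
    for extension, total in count_extensions(folder).most_common():
        print(f"{total:5d}  {extension}")
```

Running `print_inventory("path/to/your/research/folder")` gives a starting list. For each extension, you can then ask which software opens it and whether you or a colleague will still have access to that software later.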