File Formats
South Carolina Department of Archives & History
www.state.sc.us/scdah/erg/erg.htm
January 2005 Version 1 — FF

Summary

Rapid changes in technology mean that file formats can become obsolete quickly and cause problems for your records management strategy. A long-term view and careful planning can overcome this risk and ensure that you can meet your legal and operational requirements.

Legally, your records must be trustworthy, complete, accessible, admissible in court, and durable for as long as your approved records retention schedules require. For example, you can convert a record to another, more durable format (e.g., from a nearly obsolete software program to a text file). That copy, as long as it is created in a trustworthy manner, is legally acceptable.

The software in which a file is created usually has a default format, often indicated by a file name suffix (e.g., *.PDF for portable document format). Most software allows authors to select from a variety of formats when they save a file (e.g., document [DOC], Rich Text Format [RTF], text [TXT]). Some software, such as Adobe Acrobat, is designed to convert files from one format to another.

Legal Framework

For more information on the legal framework you must consider when selecting digital file formats, refer to the chapter Records Management in an Electronic Environment in the Electronic Records Management Guidelines and Appendix A6 of the Trustworthy Information Systems Handbook. Also review the requirements of the:

◆ Public Records Act [PRA] (Code of Laws of South Carolina, 1976, Section 30-1-10 through 30-1-140, as amended) available at www.scstatehouse.org/code/t30c001.htm, which supports government accountability by mandating the use of retention schedules to manage records of South Carolina public entities. This law governs the management of all records created by agencies or entities supported in whole or in part by public funds in South Carolina. Section 30-1-70 establishes your responsibility to protect the records you create and to make them available for easy use. The act does not discriminate between media types. Therefore, records created or formatted electronically are covered under the act.

Proprietary, Non-proprietary, Open Standard and Open Source File Formats

◆ Proprietary formats. Proprietary file formats are controlled and supported by just one software developer. Microsoft Word (.DOC) format is an example.
◆ Non-proprietary formats. These formats are supported by more than one developer and can be accessed with different software systems. For example, eXtensible Markup Language (XML) is becoming an increasingly popular non-proprietary format.
◆ Open Source formats. In general, open source refers to any program whose source code is made available for use or modification as users or other developers see fit. Open source software may be developed, modified and distributed by independent software companies for profit. The Linux operating system is an example.
◆ Open Standard formats. Open standard software formats are created using publicly available specifications. Although software source code remains proprietary, the availability of the standard increases compatibility by allowing other developers to create hardware and software solutions that interact with, or substitute for, other software. The Portable Document Format (.PDF) is based on an open standard.

File Format Types

There are hundreds of file formats used to encode digital information. Below are brief descriptions of the basic files you are likely to encounter. Use the resources in the Annotated List of Resources for more detailed information on specific file formats. Basic file format types include:

◆ Text files. Text files are most often created in word processing software programs. Common file formats for text files include:
  — Proprietary formats, such as Microsoft Word files and WordPerfect files, which carry the extension of the software in which they were created.
  — Rich Text Format (RTF) files, which are supported by a variety of applications and saved with formatting instructions (such as page layout).
  — Portable Document Format (PDF) files, which contain an image of the page, including text and graphics. PDF files are widely used for read-only file sharing and printing. Adobe Acrobat is by far the most popular software for creating PDF files, although other programs are available. Acrobat Reader, available at no charge, is necessary for reading an Adobe PDF file.
◆ Graphics files. Graphics files store an image (e.g., photograph, drawing) and are divided into two basic types:
  — Vector-based files, which store the image as geometric shapes defined by mathematical formulas, allowing the image to be scaled without distortion. Common types of vector-based file formats include:
    – Drawing Interchange Format (DXF) files, which are widely used in computer-aided design software programs, such as those used by engineers and architects
    – Encapsulated PostScript (EPS) files, which are widely used in desktop publishing software programs
    – Computer Graphics Metafile (CGM) files, which are widely used in many image-oriented software programs (e.g., Photoshop) and offer a high degree of durability
    – Shapefiles (SHP), which ESRI GIS applications use to store non-topological geometry and attribute information for features as vector coordinates
  — Raster-based files, which store the image as a collection of pixels. Raster graphics are also referred to as bitmapped images. Raster graphics cannot be scaled without distortion. Common types of raster-based file formats include:
    – Bitmap (BMP) files, which are uncompressed, relatively low-quality files used most often in word processing applications
    – Tagged Image File Format (TIFF) files, which are widely usable in many different software programs. TIFF files are either uncompressed or compressed using a lossless algorithm
    – Graphics Interchange Format (GIF) files, which are widely used for Internet applications. GIF is a lossless compression format but is limited to 256 colors or fewer.
    – Joint Photographic Experts Group (JPEG) files, which are used for full-color or gray-scale images. Used primarily for photographs, the standard JPEG format uses a lossy compression algorithm that discards some information to achieve a smaller file size.
    – Portable Network Graphics (PNG) files, a lossless compressed format designed to replace GIF files. PNG is completely patent- and license-free and is of higher quality than GIF.
◆ Data files. Data files are created in database software programs. Data files are divided into fields and tables that contain discrete elements of information. The software builds the relationships between these discrete elements. For example, a customer service database may contain customer name, address, and billing history fields. These fields may be organized into separate tables (e.g., one table for all customer name fields). You may convert data files to a text format, but you will lose the relationships among the fields and tables. For example, if you convert the information in the customer database to text, you may end up with ten pages of names, ten pages of addresses, and a thousand pages of billing information, with no indication of which information is related.
◆ Spreadsheet files. Spreadsheet files store the value of the numbers in their cells, as well as the relationships of those numbers. For example, one cell may contain the formula that sums two other cells. Like data files, spreadsheet files are most often in the proprietary format of the software program in which they were created. Some software programs can import and export data from other sources, including formats designed for such data sharing (e.g., Data Interchange Format [DIF]). Spreadsheet files can be exported as text files, but the value and relationship of the numbers are lost.
◆ Video and audio files. These files contain moving images (e.g., digitized video, animation) and sound data. They are most often created and viewed in proprietary software programs and stored in proprietary formats. Common file formats in use include QuickTime, Moving Picture Experts Group (MPEG) formats and RealVideo.
◆ Markup languages. Markup languages, also called markup formats, contain embedded instructions for displaying or understanding the content of the file. They provide the means to transmit and share information over the web. The World Wide Web Consortium (W3C) (www.w3c.org) supports these standards. Common markup language file formats include the following:
  — Standard Generalized Markup Language (SGML), a common markup language used in government offices worldwide, is an international standard. HTML and XML are derived [...]
  — [...] with content through the use of pre-defined tags, HTML is simple to use but limited in scope. Other markup languages such as XHTML and XML offer greater flexibility.
  — eXtensible Hypertext Markup Language (XHTML) combines the flexibility found in XML with the ease of use associated with HTML. Strict XHTML rules improve consistency and provide the ability to create your own markup tags. Because they share similar rules, converting XHTML into XML is easier than converting HTML into XML.
  — eXtensible Markup Language (XML) is a relatively simple language based on SGML that is gaining popularity for managing and sharing information. XML provides even greater flexibility and control than XHTML while avoiding the complexities associated with SGML.
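
The point that XHTML converts to XML more easily than HTML can be illustrated with a minimal sketch. Because XHTML follows XML's strict rules, a generic XML parser can usually read it directly, while HTML that leaves tags unclosed is rejected. The two sample strings and the helper name is_well_formed_xml are hypothetical, invented for illustration only; the parser is Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for illustration.
# The XHTML sample closes every tag, including the empty <br/>.
XHTML_SAMPLE = (
    "<html><body>"
    "<p>Retention schedule<br/>Series 101</p>"
    "</body></html>"
)
# The HTML sample leaves <br> and <p> unclosed, which browsers tolerate.
HTML_SAMPLE = (
    "<html><body>"
    "<p>Retention schedule<br>Series 101"
    "</body></html>"
)


def is_well_formed_xml(markup: str) -> bool:
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("XHTML parses as XML:", is_well_formed_xml(XHTML_SAMPLE))
    print("HTML parses as XML:", is_well_formed_xml(HTML_SAMPLE))
```

In this sketch the XHTML sample is accepted and the HTML sample is rejected, so the HTML file would need its tags closed before XML tools could process it.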