To Collect the Data, First We Generated Some Input Files


A Comparison of Two Data Compression Programs
by Walter Scheper

Table of Contents

Executive Summary
Methods
Raw Data
Analysis of Data
Conclusion
Appendix

Executive Summary

The purpose of this study is to determine which compression program, WinZip v8.1 or WinRAR v2.90, is the better general-purpose tool in terms of efficiency for compressing data. WinZip is the de facto standard consumer compression program. WinRAR uses a less well-known format and is the predominant method of compressing files for distribution via Usenet newsgroups. Deciding between the two is important, because an incorrect decision could waste time and space.

Through the course of the study, I found that WinZip v8.1 significantly outperformed WinRAR v2.90 for compressing both ASCII text and multimedia WAV files. When compressing ASCII text files, WinRAR typically produced output about 7 percentage points smaller (relative to the original size) than WinZip, but at the cost of taking 10 times as long. The comparison is even worse for WinRAR when dealing with the hard-to-compress multimedia files. The difference in compression for multimedia files shrinks to less than 1 percentage point, with some cases actually in WinZip’s favor, while WinRAR takes 13 times as long.

These results are completely different from what I expected to see. Data compression is primarily achieved by encoding information so that fewer bits are used to represent those parts which occur more frequently. The tradeoff is that less frequently used items must be encoded with more bits. As a result, what you count and the relative frequency of those parts will determine how well your data is compressed.
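
The frequency argument above can be made concrete with a small sketch. It is not part of the original study: the sample data and the function name `entropy_bits_per_byte` are illustrative assumptions. It estimates the Shannon entropy of a byte stream, a rough lower bound on the bits per byte that a simple frequency-based coder would need, for skewed and for evenly spread byte values.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical stand-ins for the two kinds of input discussed in the text.
english_like = b"the quick brown fox jumps over the lazy dog " * 200  # skewed counts
uniform = bytes(range(256)) * 35  # every byte value equally frequent

print("english-like:", entropy_bits_per_byte(english_like))
print("uniform:     ", entropy_bits_per_byte(uniform))
```

The skewed sample needs noticeably fewer bits per byte than the uniform one, which is the effect the paragraph describes. Real compressors also exploit repeated sequences, so this is only a first approximation.
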
The ASCII text I used was mostly written in English, which provides a particularly skewed frequency count and is therefore easily compressed. WAV files, however, are much more random, meaning the frequency count is less skewed towards certain values. This makes compression difficult. Knowing this, I expected WinZip and WinRAR to perform about the same, with WinRAR achieving slightly better performance, while the type of file would primarily affect overall efficiency. I was quite surprised to see just how much WinZip outperformed WinRAR and how little difference the file type made in overall efficiency.

Methods

To collect the data, I first created the input files. This was done by searching the internet for a large number of files containing only ASCII text, preferably consisting of standard English. I then concatenated the files together until I had six files of ASCII text, each approximately 12,000,000 bytes in size. The six WAV files were created by 'ripping' music tracks from CDs; ripping each track generated a WAV file of approximately 26,000,000 bytes.

Each file was then compressed using the latest versions of both WinRAR and WinZip. The order in which I compressed the files was randomized to break up any order-related effects, and the computer was freshly rebooted to free as many system resources as possible for the tests. A stopwatch was used to time how long it took each program to compress a file, and the original and compressed file sizes were recorded. Finally, I calculated the ratio of compressed size to original size.

The experiment resulted in two measures of efficiency that are important when dealing with computers: time and space. Since time and space are two key elements of computing efficiency, I wanted to explore these two measures together rather than apart. Using the gathered information, I determined how much time it took WinRAR and WinZip to compress a file by a certain percentage.
To do this, I simply divided the size ratio, expressed as a percentage, by the time to compress (the Cr/s column in the raw data). Doing this also eliminates the effect of the input WAV files being much larger than the input text files and should give us a good measurement of the total efficiency of each compression program. This value was then used to determine whether program type (WinZip or WinRAR) or file type (wav or text) was the more important factor in overall efficiency.

Raw Data

Trial  Prog  Type  Time (s)  Orig. (bytes)  Comp. (bytes)  Ratio (%)  Cr/s
    6  zip   txt       3.09     12,003,805      4,533,235     37.765  12.22
   18  zip   txt       2.42     12,061,980      4,624,590     38.340  15.84
    7  zip   txt       3.97     12,007,756      4,290,472     35.731  9.000
   24  zip   txt       2.38     12,036,161      4,594,080     38.169  16.04
   11  zip   txt       3.38     12,017,031      4,546,661     37.835  11.19
    5  zip   txt       2.76     12,021,495      4,433,054     36.876  13.36
   17  zip   wav       6.31     27,694,844     26,274,644     94.872  15.04
   22  zip   wav       6.35     26,041,388     25,007,422     96.030  15.12
   14  zip   wav       8.15     34,879,148     32,339,260     92.734  11.38
   10  zip   wav       7.31     30,853,580     28,462,609     92.251  12.62
    9  zip   wav       5.97     26,471,804     25,163,007     95.056  15.92
    3  zip   wav       7.02     28,482,764     24,572,574     86.272  12.29
    8  rar   txt      24.90     12,003,805      3,696,517     30.795  1.237
    4  rar   txt      25.74     12,061,980      3,772,400     31.275  1.215
   19  rar   txt      23.27     12,007,756      3,479,978     28.981  1.245
   20  rar   txt      24.74     12,036,161      3,758,207     31.224  1.262
   23  rar   txt      24.99     12,017,031      3,720,525     30.960  1.241
    1  rar   txt      25.49     12,021,495      3,601,750     29.961  1.175
   15  rar   wav      81.82     27,694,844     26,335,278     95.091  1.162
   16  rar   wav      76.81     26,041,388     25,065,804     96.254  1.253
    2  rar   wav     102.59     34,873,148     32,397,711     92.902  0.9056
   12  rar   wav      90.30     30,853,580     28,525,337     92.454  1.024
   21  rar   wav      78.58     26,471,804     25,191,745     95.164  1.211
   13  rar   wav      83.04     28,482,764     24,522,980     86.098  1.037

Analysis of Data

Even looking at just the raw data shows that WinZip has a significant performance edge over WinRAR: it scores an average of 13.3 in the Compression per Second statistic and is about 10 times faster than WinRAR.
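
As a check on how the Ratio and Cr/s columns of the raw data relate to the measured values, the sketch below recomputes them for four rows copied from the table. It assumes that Ratio is the compressed size as a percentage of the original size and that Cr/s is that ratio divided by the time in seconds. The function names are illustrative and do not come from the study.

```python
# Rows copied from the raw data table:
# (trial, program, file type, seconds, original bytes, compressed bytes)
ROWS = [
    (6, "zip", "txt", 3.09, 12_003_805, 4_533_235),
    (17, "zip", "wav", 6.31, 27_694_844, 26_274_644),
    (8, "rar", "txt", 24.90, 12_003_805, 3_696_517),
    (15, "rar", "wav", 81.82, 27_694_844, 26_335_278),
]


def ratio_percent(original: int, compressed: int) -> float:
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed / original


def cr_per_second(original: int, compressed: int, seconds: float) -> float:
    """Ratio (in percent) divided by compression time in seconds (assumed definition)."""
    return ratio_percent(original, compressed) / seconds


for trial, prog, ftype, seconds, original, compressed in ROWS:
    ratio = ratio_percent(original, compressed)
    crs = cr_per_second(original, compressed, seconds)
    print(f"trial {trial:2d} {prog} {ftype}: ratio={ratio:.3f}  Cr/s={crs:.4g}")
```

Under that assumption the recomputed values agree with the table to the printed precision. Because the statistic divides by time, the roughly tenfold difference in compression time dominates it.
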
However, the raw data fails to tell us whether there is any interaction between the compression program types and the file types. For that we need to look at the output from Matlab in the Appendix. Going through the output from the means(), mfit() and ANOVA functions in Matlab quickly demonstrates the primacy of program type in determining the overall efficiency.

First we look at the output of the means() function for each factor, program type and file type. Here we begin to see that program type is the more important factor: the differences between programs (zip minus rar) are 12.6295 for WAV files and 11.7125 for text files, while the differences between file types (wav minus txt) are only 0.7866 for zip and -0.1304 for rar. This is reinforced by the Fitted Main Effects in section III of the Appendix. Finally, looking at the ANOVA table, we see that program type is the only factor with an F-value greater than one, 325.18140, and is also the only factor with a p-value that indicates statistical significance. These results strongly indicate that there is no interaction between program and file type, and that program type is the most important factor in overall efficiency.

Conclusion

The importance of this study is that it demonstrates a major difference between these two compression programs in terms of efficiency. While WinRAR does achieve somewhat better compression, it has a cost in terms of time that outweighs the compression advantage. However, the study does not answer the question of the underlying cause of the results. It may be that the data compression algorithm used by WinRAR is simply slower, but it could also be an implementation issue. To determine this, I would need more specific information about the algorithms used by WinZip and WinRAR. It is interesting, though, that the WAV files, which are hard to compress, don’t affect the efficiency as I had assumed they would. This fact tends to point towards an implementation issue in WinRAR.

Appendix

[Figure: Mean Plot of Cr/s vs Program Type and File Type]

I. means() output for Program Type

Means of Compression/second, by Program Type
Source   N   Mean
zip     12   13.335
rar     12    1.164

II. means() output for File Type

Means of Compression/second, by File Type
Source   N   Mean
wav     12   7.4136
txt     12   7.0854

Table of means of Compression/second by Program Type and File Type, with Mean Differences
x1         zip       rar     |
x2  wav    13.7283   1.0988  |  12.6295
    txt    12.9417   1.2292  |  11.7125
    ------------------------
            0.7866  -0.1304

III. mfit(comp.rps,comp.prog,comp.type)

Overall Mean 7.2495

Fitted Main Effect of Compression/second, by Program Type
Source   N   Main Effect
zip     12    6.0855
rar     12   -6.0855

Fitted Main Effect of Compression/second, by File Type
Source   N   Main Effect
wav     12    0.16407
txt     12   -0.16407

Table of 2-way Program Type by File Type Interaction Effects
x1         zip        rar
x2  wav     0.22927   -0.22927
    txt    -0.22927    0.22927

IV. Lm output

Sequential Sums of Squares ANOVA Table
Model: Compression/second = Program Type + File Type + (Program Type * File Type)

Source      df   SS          MS          F           P-val
prog         1   888.80430   888.80430   325.18140   7.7161e-14
type         1     0.64603     0.64603     0.23636   0.63213000
prog*type    1     1.26150     1.26150     0.46154   0.50469000
Error       20    54.66510     2.73330

R-square        0.94218
Standard Error  1.6533
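
The sums of squares and F-values in the ANOVA table above can be approximately reproduced from the Cr/s column of the raw data. The following is a minimal sketch of a balanced two-way ANOVA with interaction, written in plain Python rather than the Matlab functions the study used. The function name `two_way_anova` and the data layout are assumptions for illustration, and the results differ slightly from the table because the published Cr/s values are rounded.

```python
from statistics import mean

# Cr/s values copied from the Raw Data table, grouped by (program, file type).
CELLS = {
    ("zip", "txt"): [12.22, 15.84, 9.000, 16.04, 11.19, 13.36],
    ("zip", "wav"): [15.04, 15.12, 11.38, 12.62, 15.92, 12.29],
    ("rar", "txt"): [1.237, 1.215, 1.245, 1.262, 1.241, 1.175],
    ("rar", "wav"): [1.162, 1.253, 0.9056, 1.024, 1.211, 1.037],
}


def two_way_anova(cells):
    """Balanced two-way ANOVA with interaction; returns SS, df, MS and F per source."""
    progs = sorted({p for p, _ in cells})
    types = sorted({t for _, t in cells})
    n = len(next(iter(cells.values())))  # replicates per cell (balanced design assumed)

    grand = mean(v for vals in cells.values() for v in vals)
    cell_mean = {key: mean(vals) for key, vals in cells.items()}
    prog_mean = {p: mean(cell_mean[(p, t)] for t in types) for p in progs}
    type_mean = {t: mean(cell_mean[(p, t)] for p in progs) for t in types}

    ss = {
        "prog": n * len(types) * sum((prog_mean[p] - grand) ** 2 for p in progs),
        "type": n * len(progs) * sum((type_mean[t] - grand) ** 2 for t in types),
        "prog*type": n * sum(
            (cell_mean[(p, t)] - prog_mean[p] - type_mean[t] + grand) ** 2
            for p in progs
            for t in types
        ),
    }
    df = {
        "prog": len(progs) - 1,
        "type": len(types) - 1,
        "prog*type": (len(progs) - 1) * (len(types) - 1),
    }
    ss_error = sum((v - cell_mean[key]) ** 2 for key, vals in cells.items() for v in vals)
    df_error = len(progs) * len(types) * (n - 1)
    ms_error = ss_error / df_error

    table = {}
    for source in ss:
        ms = ss[source] / df[source]
        table[source] = {"df": df[source], "SS": ss[source], "MS": ms, "F": ms / ms_error}
    table["Error"] = {"df": df_error, "SS": ss_error, "MS": ms_error, "F": None}
    return table


result = two_way_anova(CELLS)
for source, row in result.items():
    print(source, row)
```

With equal cell sizes, the sequential sums of squares reported by the study coincide with the ones computed here, so the order in which the factors enter the model does not matter for this data set.
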