Automatic Identification of Architecture and Endianness Using Binary File Contents

Sami Kairajärvi
Master's Thesis in Information Technology
May 14, 2019
University of Jyväskylä
Faculty of Information Technology

Author: Sami Kairajärvi
Contact information: [email protected]
Supervisor: Andrei Costin
Title: Automatic identification of architecture and endianness using binary file contents
Finnish title (Työn nimi): Arkkitehtuurin ja tavujärjestyksen automaattinen tunnistaminen binääritiedoston sisällön avulla
Project: Master's Thesis
Study line: Cyber security
Page count: 71+5

Abstract: This thesis explores how the architecture and endianness of executable code can be identified using binary file contents, as falsely identifying the architecture caused about 10% of the failures of firmware analysis in a recent study (Costin et al. 2014). A literature review was performed to identify the current state-of-the-art methods and how they could be improved in terms of algorithms, performance, data sets, and support tools. The review identified the methods presented by Clemens (2015) and De Nicolao et al. (2018) as the state of the art and found that they produced good results. However, these methods were found to lack essential tools for acquiring or building the data sets, and they required a more comprehensive comparison of classifier performance on full binaries.

An experimental evaluation was performed to test classifier performance in different situations. For example, when the classifiers were trained and tested with only the code sections of executable files, all of them performed equally well, achieving over 98% accuracy. On samples with very small code sections, 3-nearest neighbors and SVM performed best, achieving 90% accuracy at 128 bytes. On full binaries, the random forest classifier performed best, achieving 90% accuracy when trained with code sections and 99.2% when trained with full binaries.

Keywords: Firmware Analysis, Supervised Machine Learning, Classification, Binary Code

Finnish abstract (Suomenkielinen tiivistelmä), translated: This thesis studies how the architecture and endianness of an executable program can be identified from the contents of a binary file. This information was needed because, for example, incorrect identification of the architecture caused 10% of the firmware analysis failures in a recent study (Costin et al. 2014). A literature review was carried out to find the current best methods and how they could be improved. Clemens (2015) and De Nicolao et al. (2018) were identified as the leading studies, and the methods they present gave good classification results. Shortcomings were found in data collection and in the broader comparison of classifiers on full binaries. In the experimental part of the thesis, classifier performance was measured in different situations. For example, classifiers were trained and tested using only the code sections of executable programs, in which case all classifiers reached at least 98% accuracy. The 3-nearest-neighbors method and the SVM classifier performed best on small samples, reaching 90% accuracy already with 128-byte files. The random forest classifier had the best performance on full binaries, with 90% accuracy when trained on the code sections of binaries and 99.2% when trained on full binaries.

Finnish keywords (Avainsanat), translated: firmware analysis, supervised machine learning, classification, binary code

List of Figures

Figure 1. Example of an ELF file structure
Figure 2. Overview of steps in supervised learning (Kotsiantis, I. Zaharakis, and P. Pintelas 2007)
Figure 3. Workflow of the tool to download data sets
Figure 4. Bayesian network representation (Dua and Du 2016)
Figure 5. Example of a kNN where k=5 (Dua and Du 2016)
Figure 6. Example of a decision tree (Dua and Du 2016)
Figure 7. Example of a logistic regression
Figure 8. Example of a multilayer perceptron. It is composed of input neurons X1..XN, N number of neurons H in m hidden layers, and output neurons Y1..YN (Kononenko and Kukar 2007)
Figure 9. Example of SVM classifier (Dua and Du 2016)
Figure 10. Steps to choose a research methodology (University of Jyväskylä 2010)
Figure 11. Classifier and dataflow setup used in Azure Machine Learning
Figure 12. Coefficients in logistic regression when using the additional powerpcspe signatures. Class 13 is powerpc while class 14 is powerpcspe.
Figure 13. Impact of the sample size on classifier performance
Figure 14. Confusion matrix of random forest classification results on full binaries. Columns show the predicted class frequencies while rows show the true class frequencies.
Figure 15. Output JSON from FIRMADYNE scraper
Figure 16. Output JSON from Jigdo downloader
Figure 17. Output JSON from Debian extractor
Figure 18. (Partial) Output JSON from Debian unpacker
Figure 19. Output JSON from binary extractor

List of Tables

Table 1. A confusion matrix for binary classification (Sokolova, Japkowicz, and Szpakowicz 2006)
Table 2. Cumulative data set statistics
Table 3. Data set statistics
Table 4. Classifiers and settings used by Clemens (2015)
Table 5. F1 score with random forest trained and tested with full binaries in Weka with and without powerpcspe signatures
Table 6. Confusion matrix of powerpc and powerpcspe architectures without the additional signatures
Table 7. Confusion matrix of powerpc and powerpcspe architectures with the additional signatures
Table 8. Classifier performance with different feature sets, also compared with some metrics presented by Clemens (2015)
Table 9. Logistic regression 10-fold cross validation performance on different platforms
Table 10. Classifier performance on full binaries
Table 11. Accuracy, precision and recall for random forest and decision jungle classifiers trained in Azure Machine Learning platform. Performance results from random forest implemented in Weka added for comparison.
Table 12. Classifier performance on full binaries trained with full binaries. Random forest and random jungle are missing AUC and F1 measures, as the Azure platform does not provide them. Accuracies achieved in classifying full binaries with classifiers trained with code sections added for comparison.

Contents

1 INTRODUCTION
2 OVERVIEW OF SECURITY-ORIENTED FIRMWARE ANALYSIS
  2.1 Unpacking
  2.2 Identification and classification
  2.3 Static analysis
  2.4 Architecture identification
  2.5 Emulation
  2.6 Dynamic analysis
  2.7 Domain-specific analysis
3 MACHINE LEARNING
  3.1 Overview of artificial intelligence and its subfields
  3.2 Unsupervised learning
  3.3 Supervised learning
    3.3.1 Identification of required data
    3.3.2 Data pre-processing and feature selection
    3.3.3 Algorithm selection
    3.3.4 Evaluation
    3.3.5 Evaluation metrics
    3.3.6 Cross-validation
  3.4 Reinforcement learning
  3.5 Deep learning
  3.6 Classifiers
    3.6.1 Bayesian classifiers
    3.6.2 Lazy classifiers
    3.6.3 Trees
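
The abstract describes classifying architecture and endianness from the raw bytes of a file, but this front matter does not say which features are fed to the classifiers. As an illustration only, the sketch below assumes features of the kind commonly used in this line of work: a normalized byte-value histogram plus counts of two-byte patterns whose frequency tends to depend on byte order. The names `byte_histogram`, `endianness_counts`, `feature_vector`, and `ENDIAN_PATTERNS` are hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Assumed, illustrative patterns: the 16-bit values 0x0001 and 0xfffe as they
# appear in memory when stored big-endian (be_*) or little-endian (le_*).
ENDIAN_PATTERNS = {
    "be_0001": b"\x00\x01",
    "le_0001": b"\x01\x00",
    "be_fffe": b"\xff\xfe",
    "le_fffe": b"\xfe\xff",
}


def byte_histogram(data: bytes) -> list[float]:
    """Return the relative frequency of each of the 256 possible byte values."""
    if not data:
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


def endianness_counts(data: bytes) -> dict[str, int]:
    """Count occurrences of each byte-order-sensitive two-byte pattern."""
    return {name: data.count(pattern) for name, pattern in ENDIAN_PATTERNS.items()}


def feature_vector(data: bytes) -> list[float]:
    """Concatenate the histogram with length-normalized pattern counts (260 values)."""
    hist = byte_histogram(data)
    total = len(data) or 1  # avoid division by zero for empty input
    pattern_counts = endianness_counts(data)
    extra = [pattern_counts[name] / total for name in ENDIAN_PATTERNS]
    return hist + extra


if __name__ == "__main__":
    # Four big-endian 32-bit ones followed by two big-endian 32-bit -2 values.
    sample = b"\x00\x00\x00\x01" * 4 + b"\xff\xff\xff\xfe" * 2
    print(endianness_counts(sample))
    print(len(feature_vector(sample)))
```

A vector like this could then be passed to any of the classifier families the thesis compares (for example random forest, nearest neighbors, or SVM). Note that the pattern counts are noisy on their own: in the sample above, the little-endian pattern `01 00` also occurs across word boundaries, which is one reason to treat them as features for a learned classifier and not as a standalone test.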