Handbook of Open Source Tools

Total Pages: 16

File Type: PDF, Size: 1020 KB

Handbook of Open Source Tools
Sandeep Koranne
8005 SW Boeckman Road, Wilsonville, OR 97070, USA
[email protected]

ISBN 978-1-4419-7718-2
e-ISBN 978-1-4419-7719-9
DOI 10.1007/978-1-4419-7719-9
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2010938855

© Springer Science+Business Media, LLC 2011. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper. Springer is part of Springer Science+Business Media (www.springer.com).

Preface

The constant and speedy progress made by humankind, in the industrial revolution and more recently in the information technology era, can be directly attributed to the sharing of knowledge between various disciplines, the reuse of that knowledge as science and technology advanced, and the inclusion of this knowledge in the curriculum. The phrases “do not reinvent the wheel” and “to stand upon the shoulders of giants” come to mind as representative of this thought process of using existing solutions and building upon existing knowledge, while at the same time contributing to society as a whole. It was with this intention of documenting existing (circa 2010) open-source tools for scientists and engineers that I set about writing this book.

Computer technology has progressed at such a fast pace that it is difficult (nigh impossible) to catalog all of the existing software systems available to us. To simplify the task I have chosen a representative software package for each class of problem; where space and time permitted I have provided alternatives as well.

A key benefit of using open-source applications is that the code can be compiled on a system which is non-standard, or compiled with CPU-specific optimizations which a general-purpose binary released by an ISV cannot assume. As CPU technology advances rapidly, and software has a longer lifespan, the ability to recompile the source code becomes more and more important. The same can be said of open-source implementations of data standards in image processing and documentation retrieval. In this media-focused era, more and more content is being stored as digital data. Unlike paper, whose archival properties have been refined over centuries, digital media has not gone through the same process of archival management. In situations where data archival is necessary, a key component is the persistence of the key software components which read and write the digital data files. Since no one can predict the computers of the next century, open-source software is essential to the long-term archival of information.

Rather than duplicate the fine documentation for each package, this book is organized in sections related to solving a particular problem.
This book should be treated as an “existential quantifier” (∃), rather than a universal one (∀), on the information provided for each task. Once the existence of a solution or software tool to address the problem is known, more details about the solution can be researched. Each software system or tool is presented in a simple-to-read manner describing the main problem the system addresses and the tasks performed. Each chapter is presented in a similar manner to ease referencing.

Although many of the software packages mentioned in this book are routinely used in science and engineering tasks, I have increasingly found that students and general practitioners from other fields, such as the liberal arts, music, and statistics, are using them in their work and study. This book contains references to artificial intelligence programs and tools which are being widely used in the cognitive sciences. Many of the software tools use libraries and development tools which are themselves open-source; this synergy is representative of the open-source concept, and a key driver of its proliferation. As such, any large open-source software system is a good learning example for studying the use of its components. For example, the GRASS GIS software (presented in Section 14.9.1) is itself developed using a number of libraries such as (i) PROJ4, (ii) HDF5, (iii) MySQL, (iv) FFTW, and many others. To learn how to use these libraries in a real-world example, one only has to study the GRASS GIS source code. This is a key advantage of open-source tools.

Another argument (made mostly in the context of mathematical proofs) stems from the scientific validity and acceptance of computer-generated, or computer-assisted, proofs. For such proofs to be included as standard material, the software system used to arrive at the result must also be available to researchers, and its own correctness must be verifiable. These goals are readily achieved by the open-source mathematical software presented in Chapter 16.

I present a short summary of the book contents. This book is divided into six parts. The first part describes the open-source operating systems and user interaction as well as introspection tools. Chapter 1 includes a discussion of the Bash shell, POSIX-compliant libraries, and open-source programming languages (including Erlang, Lua, and Smalltalk). External utilities (such as tar, find, and rsync) as well as OpenSSH are discussed to ease the user's interaction with a modern GNU/Linux-type operating system. Chapter 2 presents several text-processing and document creation and management tools. These include OpenOffice and various LaTeX processing tools. Software for Wiki management as well as graphical page layout is also described.

Part II of this book focuses on the process of open-source software creation. For the reader who wants to know more about the methods and systems used by the authors to create the open-source tools, this part provides information on the GNU build system, version control, compilers, APIs, and much more. In Chapter 3 I present the GNU Compiler Collection. Commonly used command-line options, pragmas, and pre-processor defines, as well as GCC intrinsics, are explained. Examples of using GCC to compile Java and Ada are presented, as well as recent features of GCC including OpenMP support and C++ advice features. Source code version control with CVS and SVN is presented with the help of examples; a GUI front-end for version control (TkCVS) is presented and used in many examples.
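As a small illustration of the OpenMP support mentioned above, here is a minimal sketch of my own (not an example from the book) that sums a series in parallel. The file name and compile command in the comments are assumptions; any reasonably recent GCC built with -fopenmp support should work.

    // Hypothetical file name: omp_sum.cpp
    // Assumed compile command: g++ -fopenmp -O2 omp_sum.cpp -o omp_sum
    #include <cstdio>
    #include <omp.h>

    int main() {
      const int n = 1000000;
      double sum = 0.0;
      // GCC's OpenMP support distributes the loop iterations across threads
      // and combines the per-thread partial sums via the reduction clause.
      #pragma omp parallel for reduction(+ : sum)
      for (int i = 0; i < n; ++i)
        sum += 1.0 / (i + 1.0);
      std::printf("partial harmonic sum = %f (max threads = %d)\n",
                  sum, omp_get_max_threads());
      return 0;
    }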
The GNU build system is discussed with the help of examples in Section 3.3. GNU Make as well as SConstruct are described in Section 3.4; both are also used in many examples in the sequel of this book. Chapter 3 also contains a description of Bugzilla for defect tracking (Section 3.5) and a section on the various editors and IDEs available on GNU/Linux for editing source code. Static checking and analysis of source code is presented in Section 3.8, while the use of the GNU debugger GDB is shown with examples in Section 3.9.1. Graphical front-ends to GDB (including GDB Insight) are shown in Section 3.9.2. Code optimization using profiling and cache measurement with the GNU profiler and Valgrind is discussed with examples in Section 3.12.

The C standard library and the C++ standard library (including the STL) are discussed in Chapter 4. In Chapter 5 I describe the Apache Portable Runtime (APR) library. In particular, APR memory pools, the APR process library, the APR thread and thread-pool library, and the APR file-information and memory-mapping library are explained with the help of short examples. Advanced APR concepts, dealing with the use of the Memcache library, are also demonstrated. All examples have been compiled and run on a GNU/Linux system running Fedora Core 12, and thus are known to work. Using the examples presented in this part of the book, along with the documentation for the library, simplifies the learning process of the API.

Chapter 6 contains a description of the most useful parts of the Boost C++ API. For lack of space, I had to choose only a small portion of Boost for demonstration. I depict the design and usage of the Boost library with the help of examples, including the Boost smart pointer and memory pool, the Boost asynchronous I/O (asio) framework, and Boost data structures. The Boost Graph Library (BGL) is an almost complete graph-representation library, which also includes implementations of many graph algorithms; BGL is presented with the help of examples in Section 6.4. Boost multi-threading (like APR) is a portable threading system integrated with C++; I have presented examples which the reader can contrast with the APR threading and POSIX pthread examples presented in this book. Python language integration with C++ can be achieved using SWIG as well as the Boost Python integration framework, an example of which is presented in Section 6.7. The Boost generic image-processing library is presented in Section 6.8, while the Boost parsing framework (Spirit) is presented in Section 13.4.1 of Chapter 13 on compiler systems.

Performance optimization of programs using the Google perftools memory allocator and profiler is shown in Chapter 7.
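To give a flavor of the Boost facilities summarized above, here is another minimal sketch of my own (again, not an example from the book) that uses a reference-counted Boost smart pointer and builds a tiny graph with the BGL. The file name and compile command are assumptions; it only requires the Boost headers to be installed.

    // Hypothetical file name: boost_sketch.cpp
    // Assumed compile command: g++ boost_sketch.cpp -o boost_sketch
    #include <iostream>
    #include <string>
    #include <boost/shared_ptr.hpp>
    #include <boost/graph/adjacency_list.hpp>

    int main() {
      // Boost smart pointer: shared, reference-counted ownership; the string
      // is released automatically when the last shared_ptr goes out of scope.
      boost::shared_ptr<std::string> msg(new std::string("hello, boost"));
      std::cout << *msg << " (use_count = " << msg.use_count() << ")\n";

      // Boost Graph Library: a small undirected graph with four vertices.
      typedef boost::adjacency_list<boost::vecS, boost::vecS,
                                    boost::undirectedS> Graph;
      Graph g(4);
      boost::add_edge(0, 1, g);
      boost::add_edge(1, 2, g);
      boost::add_edge(2, 3, g);
      std::cout << "vertices = " << boost::num_vertices(g)
                << ", edges = " << boost::num_edges(g) << "\n";
      return 0;
    }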
Recommended publications
  • FreeNAS® 11.0 User Guide
    FreeNAS® 11.0 User Guide, June 2017 Edition. FreeNAS® is © 2011-2017 iXsystems. FreeNAS® and the FreeNAS® logo are registered trademarks of iXsystems. FreeBSD® is a registered trademark of the FreeBSD Foundation. Written by users of the FreeNAS® network-attached storage operating system. Version 11.0, copyright © 2011-2017 iXsystems (https://www.ixsystems.com/). Contents: Welcome; Typographic Conventions; 1 Introduction (1.1 New Features in 11.0; 1.2 Hardware Recommendations: 1.2.1 RAM, 1.2.2 The Operating System Device, 1.2.3 Storage Disks and Controllers, 1.2.4 Network Interfaces; 1.3 Getting Started with ZFS); 2 Installing and Upgrading (2.1 Getting FreeNAS®; 2.2 Preparing the Media: 2.2.1 On FreeBSD or Linux, 2.2.2 On Windows, 2.2.3 On OS X; 2.3 Performing the Installation; 2.4 Installation Troubleshooting; 2.5 Upgrading: 2.5.1 Caveats).
  • GNU/Linux AI & Alife HOWTO
    GNU/Linux AI & Alife HOWTO, by John Eikenberry. Table of Contents: 1. Introduction; 2. Symbolic Systems (GOFAI); 3. Connectionism; 4. Evolutionary Computing; 5. Alife & Complex Systems; 6. Agents & Robotics; 7. Statistical & Machine Learning; 8. Missing & Dead.
  • PHP Game Programming
    PHP Game Programming, by Matt Rutledge. © 2004 by Premier Press, a division of Course Technology. All rights reserved. No part of this book may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system without written permission from Course PTR, except for the inclusion of brief quotations in a review. The Premier Press logo and related trade dress are trademarks of Premier Press and may not be used without written permission. Paint Shop Pro 8 is a registered trademark of Jasc Software. PHP Coder is a trademark of phpIDE. All other trademarks are the property of their respective owners. Important: Course PTR cannot provide software support. Please contact the appropriate software manufacturer’s technical support line or Web site for assistance. Course PTR and the author have attempted throughout this book to distinguish proprietary trademarks from descriptive terms by following the capitalization style used by the manufacturer. Information contained in this book has been obtained by Course PTR from sources believed to be reliable. However, because of the possibility of human or mechanical error by our sources, Course PTR, or others, the Publisher does not guarantee the accuracy, adequacy, or completeness of any information and is not responsible for any errors or omissions or the results obtained from use of such information. Credits: SVP, Course Professional, Trade, Reference Group: Andy Shafran; Publisher: Stacy L. Hiquet; Senior Marketing Manager: Sarah O’Donnell; Marketing Manager: Heather Hurley; Series Editor: André LaMothe; Manager of Editorial Services: Heather Talbot; Senior Acquisitions Editor: Emi Smith; Associate Marketing Manager: Kristin Eisenzopf; Project Editor and Copy Editor: Dan Foster, Scribe Tribe; Technical Reviewer: John Freitas.
  • Linux on the Road
    Linux on the Road: Linux with Laptops, Notebooks, PDAs, Mobile Phones and Other Portable Devices. Werner Heuser <wehe[AT]tuxmobil.org>. Linux Mobile Edition, Version 3.22. TuxMobil, Berlin. Copyright © 2000-2011 Werner Heuser. 2011-12-12. Revision History: Revision 3.22, 2011-12-12, revised by wh: the address of the opensuse-mobile mailing list has been added, a section on power management for graphics cards has been added, a short description of Intel's LinuxPowerTop project has been added, all references to Suspend2 have been changed to TuxOnIce, links to OpenSync and Funambol synchronization packages have been added, some notes about SSDs have been added, many URLs have been checked and some minor improvements have been made. Revision 3.21, 2005-11-14, revised by wh: some more typos have been fixed. Revision 3.20, 2005-11-14, revised by wh: some typos have been fixed. Revision 3.19, 2005-11-14, revised by wh: a link to keytouch has been added, minor changes have been made. Revision 3.18, 2005-10-10, revised by wh: some URLs have been updated, spelling has been corrected, minor changes have been made. Revision 3.17.1, 2005-09-28, revised by sh: a technical and a language review have been performed by Sebastian Henschel; numerous bugs have been fixed and many URLs have been updated. Revision 3.17, 2005-08-28, revised by wh: some more tools added to the external monitor/projector section, link to Zaurus Development with Damn Small Linux added to the cross-compile section, some additions about acoustic management for hard disks added, references to X.org added to X11 sections, link to laptop-mode-tools added, some URLs updated, spelling cleaned, minor changes.
  • Extending the UHC LLVM Backend: Adding Support for Accurate Garbage Collection
    Extending the UHC LLVM backend: Adding support for accurate garbage collection. Paul van der Ende, MSc Thesis, October 18, 2010, INF/SCR-09-59. Daily Supervisors: dr. A. Dijkstra, drs. J.D. Fokker; Second Supervisor: prof. dr. S.D. Swierstra. Center for Software Technology, Dept. of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands. Abstract: The Utrecht Haskell Compiler has an LLVM backend for whole-program analysis mode. This backend previously used the Boehm-Weiser library for heap allocation and conservative garbage collection. Recently an accurate garbage collector for the UHC compiler has been developed. We wanted this new garbage collection library to work with the LLVM backend. But since this new collector is accurate, it needs cooperation with the LLVM backend: functionality needs to be added to find the set of root references and to traverse all live references. To find the root set, the bytecode interpreter backend currently uses static stack maps. This is where the problem arises. The LLVM compiler is known to do various aggressive transformations, and these optimizations might change the stack layout described by the static stack map. To avoid this problem we wanted to use the shadow-stack provided by the LLVM framework. This is a dynamic structure maintained on the stack to safely find the garbage collection roots. We expect that the shadow-stack approach comes with a runtime overhead for maintaining this information dynamically on the stack, so we measured the impact of this method. We also measured the performance improvement of the new garbage collection method used and compared it with the other UHC backends.
  • MELT: A Translated Domain Specific Language Embedded in the GCC Compiler
    MELT: a Translated Domain Specific Language Embedded in the GCC Compiler. Basile Starynkevitch, CEA, LIST Software Safety Laboratory, boîte courrier 94, 91191 GIF/YVETTE CEDEX, France. [email protected] [email protected]. The GCC free compiler is a very large software system, compiling source in several languages for many targets on various systems. It can be extended by plugins, which may take advantage of its power to provide extra specific functionality (warnings, optimizations, source refactoring or navigation) by processing various GCC internal representations (Gimple, Tree, ...). Writing plugins in C is a complex and time-consuming task, but customizing GCC by using an existing scripting language inside it is impractical. We describe MELT, a specific Lisp-like DSL which fits well into existing GCC technology and offers high-level features (functional, object or reflexive programming, pattern matching). MELT is translated to C fitted for GCC internals and provides various features to facilitate this. This work shows that even huge, legacy software can be a posteriori extended by specifically tailored and translated high-level DSLs. 1 Introduction. GCC is an industrial-strength free compiler for many source languages (C, C++, Ada, Objective C, Fortran, Go, ...), targeting about 30 different machine architectures, and supported on many operating systems. Its source code size is huge (4.296 MLOC for GCC 4.6.0), heterogeneous, and still increasing by 6% annually. It has no single main architect and hundreds of (mostly full-time) contributors, who follow strict social rules. 1.1 The powerful GCC legacy. The several GCC [8] front-ends (parsing C, C++, Go, ...
  • Bringing GNU Emacs to Native Code
    Bringing GNU Emacs to Native Code. Andrea Corallo, Luca Nassi, Nicola Manca. [email protected] [email protected] [email protected]. CNR-SPIN, Genoa, Italy. ABSTRACT: Emacs Lisp (Elisp) is the Lisp dialect used by the Emacs text editor family. GNU Emacs can currently execute Elisp code either interpreted or byte-interpreted after it has been compiled to byte-code. In this work we discuss the implementation of an optimizing compiler approach for Elisp targeting native code. The native compiler employs the byte-compiler's internal representation as input and exploits libgccjit to achieve code generation using the GNU Compiler Collection (GCC) infrastructure. Generated executables are stored as binary files and can be loaded and unloaded dynamically. Most of the functionality of the compiler is written in Elisp itself, including several optimization passes, paired with a C back-end to interface with the GNU Emacs core and libgccjit. Though still a work in progress, our implementation is able to bootstrap a functional Emacs and compile all lexically scoped Elisp files, including ...
    ...such a long-standing project. Although this makes it didactic, some limitations prevent the current implementation of Emacs Lisp from being appealing for broader use. In this context, performance issues represent the main bottleneck, which can be broken down into three main sub-problems: lack of true multi-threading support, garbage collection speed, and code execution speed. From now on we will focus on the last of these issues, which constitutes the topic of this work. The current implementation traditionally approaches the problem of code execution speed in two ways: implementing a large number of performance-sensitive primitive functions (also known as subr) in C.
  • Improving Memory Management Security for C and C++
    Improving memory management security for C and C++. Yves Younan, Wouter Joosen, Frank Piessens, Hans Van den Eynden. DistriNet, Katholieke Universiteit Leuven, Belgium. Abstract: Memory managers are an important part of any modern language: they are used to dynamically allocate memory for use in the program. Many managers exist, depending on the operating system and language. However, two major types of managers can be identified: manual memory allocators and garbage collectors. In the case of manual memory allocators, the programmer must manually release memory back to the system when it is no longer needed. Problems can occur when a programmer forgets to release it (memory leaks), releases it twice, or keeps using freed memory. These problems are solved by garbage collectors. However, both manual memory allocators and garbage collectors store management information for the memory they manage. Often, this management information is stored where a buffer overflow could allow an attacker to overwrite it, providing a reliable way to achieve code execution when exploiting these vulnerabilities. In this paper we describe several vulnerabilities for C and C++ and how these could be exploited by modifying the management information of a representative manual memory allocator and a garbage collector. Afterwards, we present an approach that, when applied to memory managers, will protect against these attack vectors. We implemented our approach by modifying an existing widely used memory allocator. Benchmarks show that this implementation has a negligible, sometimes even beneficial, impact on performance. 1 Introduction: Security has become an important concern for all computer users. Worms and hackers are a part of everyday internet life.
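As an aside (my own sketch, not material from the paper above), the manual-allocation errors the abstract lists can be illustrated in a few lines of C-style C++. The erroneous calls are left commented out because each one is undefined behavior; the file name and compile command are assumptions.

    // Hypothetical file name: manual_alloc.cpp
    // Assumed compile command: g++ manual_alloc.cpp -o manual_alloc
    #include <cstdio>
    #include <cstdlib>
    #include <cstring>

    int main() {
      // Manual memory management: every malloc must be paired with exactly one free.
      char *buf = static_cast<char *>(std::malloc(16));
      if (buf == NULL) return 1;
      std::strcpy(buf, "hello");
      std::printf("%s\n", buf);

      std::free(buf);     // correct: the block is released exactly once
      // std::free(buf);  // double free: corrupts the allocator's management information
      // buf[0] = 'X';    // use after free: writes into memory the allocator may have reused
      // (never calling free(buf) at all would be the third error: a memory leak)
      return 0;
    }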
  • Ethereal Developer's Guide Draft 0.0.2 (15684) for Ethereal 0.10.11
    Ethereal Developer's Guide: Draft 0.0.2 (15684) for Ethereal 0.10.11, by Ulf Lamping. Copyright © 2004-2005 Ulf Lamping. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or any later version published by the Free Software Foundation. All logos and trademarks in this document are property of their respective owner. Table of Contents: Preface (1. Foreword; 2. Who should read this document?; 3. Acknowledgements; 4. About this document; 5. Where to get the latest copy of this document?; 6. Providing feedback about this document); I. Ethereal Build Environment (1. Introduction).
  • A Compiler Front-End for the WOOL Parallelization Library
    A compiler front-end for the WOOL parallelization library. Georgios Varisteas, yorgos(@)kth.se. KTH Information and Communication Technology, Master of Science Thesis, Stockholm, Sweden 2010, TRITA-ICT-EX-2010:291, Royal Institute of Technology, 15 October, 2010. Examiner: Mats Brorsson; Supervisor: Karl-Filip Faxén. Abstract: WOOL is a C parallelization library developed at SICS by Karl-Filip Faxén. It provides the tools for developing fine-grained, independent, task-based parallel applications. This library is distinguished from other similar projects by being really fast and light; it manages to spawn and synchronize tasks in under 20 cycles. However, all software development frameworks which expose radically new functionality to a programming language gain a lot by having a compiler to encapsulate and implement them, and WOOL does not differ from this category. This project is about the development of a source-to-source compiler for the WOOL parallelization library, supporting an extension of the C language with new syntax that implements the WOOL API, transforming it and eventually outputting GNU C code. Additionally, this compiler is augmented with a wrapper script that performs compilation to machine code by using GCC. This script is configurable and fully automatic. The main advantage gained from this project is to satisfy the need for less overhead in software development with WOOL. The simplified syntax results in faster and more economical code writing while being less error-prone. Moreover, this compiler enables the future addition of many more features not applicable with the current state of WOOL as a library.
  • SNC: a Cloud Service Platform for Symbolic-Numeric Computation Using Just-In-Time Compilation
    This article has been accepted for publication in a future issue of IEEE Transactions on Cloud Computing, but has not been fully edited; content may change prior to final publication. Citation information: DOI 10.1109/TCC.2017.2656088. SNC: A Cloud Service Platform for Symbolic-Numeric Computation using Just-In-Time Compilation. Peng Zhang, Member, IEEE, Yueming Liu, and Meikang Qiu, Senior Member, IEEE. Abstract: Cloud services have been widely employed in the IT industry and scientific research. By using Cloud services users can move computing tasks and data away from local computers to remote datacenters. By accessing Internet-based services over lightweight and mobile devices, users deploy diversified Cloud applications on powerful machines. The key drivers towards this paradigm for the scientific computing field include the substantial computing capacity, on-demand provisioning and cross-platform interoperability. To fully harness the Cloud services for scientific computing, however, we need to design an application-specific platform to help users efficiently migrate their applications. In this, we propose a Cloud service platform for symbolic-numeric computation: SNC.
    ...and SQL. Other types of Cloud services include: Software as a Service (SaaS), in which the software is licensed and hosted in Clouds; and Database as a Service (DBaaS), in which managed database services are hosted in Clouds. While enjoying these great services, we must face the challenges raised by these unexploited opportunities within a Cloud environment. Complex scientific computing applications are being widely studied within the emerging Cloud-based services environment [4-12]. Traditionally, the focus is given to parallel scientific HPC (high performance computing) applications [6, 12, 13], where substantial effort has been given to the integration of...
  • Xcode Package from App Store
    KH Computational Physics, 2016: Introduction. Setting up your computing environment. Installation: MAC or Linux are the preferred operating systems in this course on scientific computing. Windows can be used, but the most important programs must be installed: python (there is a nice package "Enthought Python Distribution", http://www.enthought.com/products/edudownload.php), a C++ and Fortran compiler, BLAS & LAPACK for linear algebra, and a plotting program such as gnuplot. Software for this course. Essentials: Python and its packages, in particular numpy, scipy, matplotlib; a C++ compiler such as gcc; a text editor for coding (for example Emacs, Aquamacs, Enthought's IDLE); make to execute makefiles. Highly recommended: a Fortran compiler, such as gfortran or Intel Fortran; the BLAS & LAPACK library for linear algebra (most likely provided by the vendor); an OpenMP-enabled Fortran and C++ compiler. Useful: gnuplot for fast plotting; gsl (GNU Scientific Library) for implementations of various scientific algorithms. Installation on MAC: Install the Xcode package from the App Store. Install "Command Line Tools" from Apple's software site. For Mavericks and later, open the Xcode program and choose from the menu Xcode -> Open Developer Tool -> More Developer Tools... You will be linked to the Apple page that allows you to access downloads for Xcode. You will have to register as a developer (free). Search for the Xcode Command Line Tools in the search box in the upper left. Download and install the correct version of the Command Line Tools; for example, for OS X "El Capitan" and Xcode 7.2 you need Command Line Tools OS X 10.11 for Xcode 7.2. Apple's Xcode contains many libraries and compilers for Mac systems. Kristjan Haule, 2016.