Parallel Processing.Pptx

Compute Grid: Parallel Processing
RCS Lunch & Learn Training Series
Bob Freeman, PhD
Director, Research Technology Operations
HBS
8 November, 2017

Overview
• Q&A
• Introduction
• Serial vs parallel
• Approaches to Parallelization
• Submitting parallel jobs on the compute grid
• Parallel tasks
• Parallel Code

Serial vs Parallel work

Serial vs Multicore Approaches
Traditionally, software has been written for serial computers:
• To be run on a single computer having a single Central Processing Unit (CPU)
• Problem is broken into a discrete set of instructions
• Instructions are executed one after the other
• Only one instruction can be executed at any moment in time

Serial vs Multicore Approaches
In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
• To be run using multiple CPUs
• A problem is broken into discrete parts (either by you or the application itself) that can be solved concurrently
• Each part is further broken down to a series of instructions
• Instructions from each part execute simultaneously on different CPUs or different machines

Serial vs Multicore Approaches
Many different parallelization approaches, which we won't discuss:
• Shared memory
• Distributed memory
• Hybrid Distributed-Shared memory

Parallel Processing…
So, we are going to briefly touch on two approaches:
• Parallel tasks
  • Tasks in the background
  • gnu_parallel
  • Pleasantly parallelizing
• Parallel code
  • Considerations for parallelizing
  • Parallel frameworks & examples
We will not discuss parallelized frameworks such as Hadoop, Apache Spark, MongoDB, ElasticSearch, etc.

Parallel Jobs on the Compute Grid…
Nota Bene!!
• In order to run in parallel, programs (code) must be explicitly programmed to do so.
• And you must ask the scheduler to reserve those cores for your program/work to use.
Thus, requesting cores from the scheduler does not automagically parallelize your code!
    # SAMPLE JOB FILE
    #!/bin/bash
    #BSUB -q normal      # Queue to submit to (comma separated)
    #BSUB -n 8           # Number of cores
    ...
    blastn -query seqs.fasta -db nt -out seqs.nt.blastn    # WRONG!! (still runs on a single core)
    blastn -query seqs.fasta -db nt -out seqs.nt.blastn -num_threads $LSB_MAX_NUM_PROCESSORS    # YES!!

    # SAMPLE PARALLELIZED CODE
    bsub -q normal -n 4 -W 24:00 -R "rusage[mem=4000]" stata-mp4 -d myfile.do

    # SAMPLE PARALLEL TASKS
    bsub -q normal -n 4 -W 24:00 -R "rusage[mem=4000]" \
        parallel --joblog .log --outputasfiles -j\$LSB_MAX_NUM_PROCESSORS :::: tasklist.txt

    # SAMPLE PLEASANT PARALLELIZATION
    for file in folder/*.txt; do
        echo "$file"
        bsub -q normal -W 24:00 -R "rusage[mem=1000]" python process_input_data.py "$file"
    done

Parallel Tasks

Background tasks
Shells, by default, have the ability to multitask: doing more than one thing at a time.
In BASH, this can be accomplished by sending a command to the background:
• Explicitly, with &
• After the fact, with ^Z and bg
When you put a task in the background:
• The task keeps running, while you continue to work at the shell in the foreground
• If the task produces any output, it appears on your screen immediately
• If input is required, the process prints a message and stops
• When it is done, a message will be printed
From Processes & Job Control: http://slideplayer.com/slide/4592906/

GnuParallel Approach
GNU parallel is a shell tool for executing jobs in parallel using one or more computers.
• A job is a single command or small script that has to be run for each of the lines in the input
• Typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables
• Many options for controlling execution and the output of results
• Can specify the degree of parallelization

    # create list of files to unzip
    for index in $(seq 1 100); do echo "unzip myfile$index.zip" >> tasklist.txt; done

    # Ask the compute cluster to do this for me in parallel, using 4 CPU/cores
    bsub -q normal -n 4 -W 2:00 -R "rusage[mem=4000]" \
        parallel --joblog .log --outputasfiles -j\$LSB_MAX_NUM_PROCESSORS :::: tasklist.txt

Concept of Pleasant Parallelization
Problem: How do I BLAST 200,000 transcripts against NR?
Solution: Fake a parallel BLAST. But how?
• Divide your input file into n separate files
• BLAST each smaller input file on a separate core
• Running on n cores will be almost exactly n times faster! Why?
  • The cores don't need to talk to one another
• You could submit n jobs individually, but this is not recommended
• Use more sophisticated techniques: job arrays, gnu_parallel, GridRunner
• Don't confuse this with truly parallel mpiBLAST

The efficiency of your work depends on how parallelized you make your task:
• You want to ensure that your jobs spend most of their time computing, and not in the queue or doing compute prep
[Figure: two job timelines, each showing schedule, module load, BLAST, job finish; labeled "1 X" versus "100??"]
What would you choose?
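
The first step listed above, dividing the input into n separate files, could be sketched roughly as follows. This is a minimal illustration, not the deck's own method: it assumes FASTA input in which every record starts with a ">" header line, and the function name split_fasta, the chunk prefix, and the demo file names are all hypothetical.

```shell
#!/bin/bash
# Sketch: split a FASTA file into n chunk files, assigning records round-robin.
# split_fasta, the prefix, and all file names here are hypothetical examples.
split_fasta() {
    local input=$1 n=$2 prefix=$3
    awk -v n="$n" -v prefix="$prefix" '
        # Each header line starts a new record; pick the next chunk in turn.
        /^>/ { out = sprintf("%s.%03d.fasta", prefix, (count++ % n) + 1) }
        # Write every line (header or sequence) to the current chunk.
        { print > out }
    ' "$input"
}

# Tiny demo input: 5 two-line records, split across 2 chunks.
workdir=$(mktemp -d)
for i in 1 2 3 4 5; do
    printf '>seq%d\nACGT\n' "$i"
done > "$workdir/seqs.fasta"

split_fasta "$workdir/seqs.fasta" 2 "$workdir/chunk"

# Show how many records landed in each chunk (remove "$workdir" when done).
grep -c '>' "$workdir"/chunk.*.fasta
```

Each resulting chunk can then be handed to its own BLAST job. Round-robin assignment keeps the chunks close in record count, though not necessarily in run time if sequence lengths vary widely.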
Manual (Script) Approach
• Split input file into N files that run 1 to 6 hrs each
• Can be done with perl or python script, unix split, etc.
• User script parses the datafile whose name is passed as the command parameter

    for file in my*.dat
    do
        echo "$file"
        bsub -q normal -W 6:00 -R "rusage[mem=1000]" \
            python process_data_file.py "$file"
        sleep 1
    done

For advanced users, one can submit this as one job in a job array, a feature on most schedulers:

    # create script for job array (process_data_file_array.py)
    # and now submit file as job array
    num_files=$(ls -1 my*.dat | wc -l)    # count the data files to size the array
    bsub -J "myarray[1-$num_files]" -q normal -W 6:00 -R "rusage[mem=1000]" \
        python process_data_file_array.py

This process is ideal for serially numbered files, parameter sweeps, & optimization routines!!

Parallel Code

Can my code be parallelized?
• Does it have large loops that repeat the same operations?
• Does your code do multiple tasks that are not dependent on one another? If so, is the dependency weak?
• Can any dependencies or information sharing be overlapped with computation? If not, is the amount of communication small?
• Do multiple tasks depend on the same data?
• Does the order of operations matter? If so, how strict does it have to be?

Basic guidance for efficient parallelization:
• Is it even worth parallelizing my code?
  • Does your code take an intractably long amount of time to complete?
  • Do you run a single large model or do statistics on multiple small runs?
  • Would the amount of time it takes to parallelize your code be worth the gain in speed?
• Parallelizing established code vs. starting from scratch
  • Established code: Maybe easier / faster to parallelize, but may not give good performance or scaling
  • Start from scratch: Takes longer, but will give better performance, accuracy, and gives the opportunity to turn a “black box” into a code you understand

Basic guidance for efficient parallelization:
• Increase the fraction of your program that can be parallelized.
  Identify the most time-consuming parts of your program and parallelize them. This could require modifying your intrinsic algorithm and code’s organization
• Balance parallel workload
• Minimize time spent in communication
• Use simple arrays instead of user-defined derived types
• Partition data. Distribute arrays and matrices: allocate specific memory for each MPI process

Designing parallel programs - partitioning:
One of the first steps in designing a parallel program is to break the problem into discrete “chunks” that can be distributed to multiple parallel tasks.
Domain Decomposition: Data associated with a problem is partitioned; each parallel task works on a portion of the data. There are different ways to partition the data.

Designing parallel programs - partitioning:
One of the first steps in designing a parallel program is to break the problem into discrete “chunks” that can be distributed to multiple parallel tasks.
Functional Decomposition: Problem is decomposed according to the work that must be done. Each parallel task performs a fraction of the total computation.

Designing parallel programs - communication:
Most parallel applications require tasks to share data with each other.
Cost of communication: Computational resources are used to package and transmit data. Communication frequently requires synchronization, so some tasks will wait instead of doing work. It could also saturate network bandwidth.
Latency vs. Bandwidth: Latency is the time it takes to send a minimal message between two tasks. Bandwidth is the amount of data that can be communicated per unit of time. Sending many small messages can cause latency to dominate communication overhead.
Synchronous vs. Asynchronous communication: Synchronous communication is referred to as blocking communication, since other work stops until the communication is completed. Asynchronous communication is referred to as non-blocking, since other work can be done while communication is taking place.
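
The latency versus bandwidth point above can be made concrete with a simple cost model. This is a common textbook approximation rather than anything stated in the slides, and the numbers used (50 µs latency, 1 GB/s bandwidth, 1 MB of data) are illustrative assumptions only.

```latex
% Time to send one message of s bytes, with latency \alpha and bandwidth \beta:
T(s) = \alpha + \frac{s}{\beta}

% Time to send M bytes in total, split into k equal messages:
T_k = k\,\alpha + \frac{M}{\beta}

% Assumed values: \alpha = 50\,\mu s, \beta = 10^{9} B/s, M = 10^{6} B
% One large message (k = 1):
T_1 = 50\,\mu s + 1000\,\mu s = 1050\,\mu s

% One thousand small messages (k = 1000):
T_{1000} = 1000 \times 50\,\mu s + 1000\,\mu s = 51\,000\,\mu s = 51\,ms
```

Under these assumptions the same data takes roughly 50 times longer when sent as a thousand small messages, because the latency term is paid once per message while the bandwidth term is unchanged.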
Scope of communication: Point-to-point communication is data transmission between two tasks. Collective communication involves all tasks (in a communication group).
This is only a partial list of things to consider!

Designing parallel programs - load balancing:
Load balancing is the practice of distributing approximately equal amounts of work so that all tasks are kept busy all the time.
How to Achieve Load Balance?
Equally partition the work given to each task: For array/matrix operations, equally distribute the data set among parallel tasks. For loop iterations where the work done for each iteration is equal, evenly distribute iterations among tasks.
Use dynamic work assignment: Certain classes of problems result in load imbalance even if data is distributed evenly among tasks (sparse matrices, adaptive