
Lecture Notes in Computer Science 8385

Commenced Publication in 1973
Founding and Former Series Editors: Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison, Lancaster University, Lancaster, UK
Takeo Kanade, Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler, University of Surrey, Guildford, UK
Jon M. Kleinberg, Cornell University, Ithaca, NY, USA
Alfred Kobsa, University of California, Irvine, CA, USA
Friedemann Mattern, ETH Zurich, Zürich, Switzerland
John C. Mitchell, Stanford University, Stanford, CA, USA
Moni Naor, Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz, University of Bern, Bern, Switzerland
C. Pandu Rangan, Indian Institute of Technology, Madras, India
Bernhard Steffen, TU Dortmund University, Dortmund, Germany
Demetri Terzopoulos, University of California, Los Angeles, CA, USA
Doug Tygar, University of California, Berkeley, CA, USA
Gerhard Weikum, Max Planck Institute for Informatics, Saarbruecken, Germany

For further volumes: http://www.springer.com/series/7407

Roman Wyrzykowski • Jack Dongarra • Konrad Karczewski • Jerzy Waśniewski (Eds.)

Parallel Processing and Applied Mathematics

10th International Conference, PPAM 2013, Warsaw, Poland, September 8–11, 2013
Revised Selected Papers, Part II

Editors

Roman Wyrzykowski and Konrad Karczewski
Institute of Computer and Information Science
Czestochowa University of Technology
Czestochowa, Poland

Jerzy Waśniewski
Informatics and Mathematical Modelling
Technical University of Denmark
Kongens Lyngby, Denmark

Jack Dongarra
Department of Computer Science
University of Tennessee
Knoxville, TN, USA

ISSN 0302-9743
ISSN 1611-3349 (electronic)
ISBN 978-3-642-55194-9
ISBN 978-3-642-55195-6 (eBook)
DOI 10.1007/978-3-642-55195-6
Springer Heidelberg New York Dordrecht London

Library of Congress Control Number: 2014937670

LNCS Sublibrary: SL1 – Theoretical Computer Science and General Issues

© Springer-Verlag Berlin Heidelberg 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

This volume comprises the proceedings of the 10th International Conference on Parallel Processing and Applied Mathematics, PPAM 2013, which was held in Warsaw, Poland, September 8–11, 2013. The jubilee PPAM conference was organized by the Department of Computer and Information Science of the Czestochowa University of Technology, under the patronage of the Committee of Informatics of the Polish Academy of Sciences, in cooperation with the Polish-Japanese Institute of Information Technology. The main organizer was Roman Wyrzykowski.

PPAM is a biennial conference. Nine previous events have been held in different places in Poland since 1994. The proceedings of the last six conferences have been published by Springer-Verlag in the Lecture Notes in Computer Science series (Nałęczów, 2001, vol. 2328; Częstochowa, 2003, vol. 3019; Poznań, 2005, vol. 3911; Gdańsk, 2007, vol. 4967; Wrocław, 2009, vols. 6067 and 6068; Toruń, 2011, vols. 7203 and 7204).

The PPAM conferences have become an international forum for exchanging ideas between researchers involved in parallel and distributed computing, including theory and applications, as well as applied and computational mathematics. The focus of PPAM 2013 was on models, algorithms, and software tools that facilitate efficient and convenient utilization of modern parallel and distributed computing architectures, as well as on large-scale applications.

This meeting gathered the largest number of participants in the history of PPAM conferences – more than 230 participants from 32 countries. A strict refereeing process resulted in the acceptance of 143 contributed presentations, while approximately 44 % of the submissions were rejected.
Regular tracks of the conference covered such important fields of parallel/distributed/cloud computing and applied mathematics as:

– Numerical algorithms and parallel scientific computing
– Parallel non-numerical algorithms
– Tools and environments for parallel/distributed/cloud computing
– Applications of parallel computing
– Applied mathematics, evolutionary computing, and metaheuristics

The plenary and invited talks were presented by:

– Fran Berman from the Rensselaer Polytechnic Institute (USA)
– Ewa Deelman from the University of Southern California (USA)
– Jack Dongarra from the University of Tennessee and Oak Ridge National Laboratory (USA), and University of Manchester (UK)
– Geoffrey Ch. Fox from Indiana University (USA)
– Laura Grigori from Inria (France)
– Fred Gustavson from the IBM T.J. Watson Research Center (USA)
– Georg Hager from the University of Erlangen-Nuremberg (Germany)
– Alexey Lastovetsky from University College Dublin (Ireland)
– Miron Livny from the University of Wisconsin (USA)
– Piotr Luszczek from the University of Tennessee (USA)
– Rizos Sakellariou from the University of Manchester (UK)
– James Sexton from the IBM T.J. Watson Research Center (USA)
– Leonel Sousa from the Technical University of Lisbon (Portugal)
– Denis Trystram from the Grenoble Institute of Technology (France)
– Jeffrey Vetter from the Oak Ridge National Laboratory and Georgia Institute of Technology (USA)
– Richard W. Vuduc from the Georgia Institute of Technology (USA)
– Robert Wisniewski from Intel (USA)

Important and integral parts of the PPAM 2013 conference were the workshops:

– Minisymposium on GPU Computing organized by José R. Herrero from the Universitat Politecnica de Catalunya (Spain), Enrique S. Quintana-Ortí from the Universidad Jaime I (Spain), and Robert Strzodka from NVIDIA
– Special Session on Multicore Systems organized by Ozcan Ozturk from Bilkent University (Turkey), and Suleyman Tosun from Ankara University (Turkey)
– Workshop on Numerical Algorithms on Hybrid Architectures organized by Przemysław Stpiczyński from the Maria Curie Skłodowska University (Poland), and Jerzy Waśniewski from the Technical University of Denmark
– Workshop on Models, Algorithms and Methodologies for Hierarchical Parallelism in New HPC Systems organized by Giulliano Laccetti and Marco Lapegna from the University of Naples Federico II (Italy), and Raffaele Montella from the University of Naples Parthenope (Italy)
– Workshop on Power and Energy Aspects of Computation organized by Richard W. Vuduc from the Georgia Institute of Technology (USA), Piotr Luszczek from the University of Tennessee (USA), and Leonel Sousa from the Technical University of Lisbon (Portugal)
– Workshop on Scheduling for Parallel Computing, SPC 2013, organized by Maciej Drozdowski from Poznań University of Technology (Poland)
– The 5th Workshop on Language-Based Parallel Programming Models, WLPP 2013, organized by Ami Marowka from the Bar-Ilan University (Israel)
– The 4th Workshop on Performance Evaluation of Parallel Applications on Large-Scale Systems organized by Jan Kwiatkowski from Wrocław University of Technology (Poland)
– Workshop on Parallel Computational Biology, PBC 2013, organized by David A. Bader from the Georgia Institute of Technology (USA), Jarosław Żola from Rutgers University (USA), and Bertil Schmidt from the University of Mainz (Germany)
– Minisymposium on Applications of Parallel Computations in Industry and Engineering organized by Raimondas Čiegis from Vilnius Gediminas Technical University (Lithuania), and Julius Žilinskas from Vilnius University (Lithuania)
– Minisymposium on HPC Applications in Physical Sciences organized by Grzegorz Kamieniarz and Wojciech Florek from A. Mickiewicz University in Poznań (Poland)
– Minisymposium on Applied High-Performance Numerical Algorithms in PDEs organized by Piotr Krzyżanowski and Leszek Marcinkowski from Warsaw University (Poland), and Talal Rahman from Bergen University College (Norway)
– Minisymposium on High-Performance Computing Interval Methods organized by Bartłomiej J. Kubica from Warsaw University of Technology (Poland)
– Workshop on Complex Collective Systems organized by Paweł Topa and Jarosław Wąs from AGH University of Science and Technology in Kraków (Poland)

The PPAM 2013 meeting began with five tutorials:

– Scientific Computing on GPUs, by Dominik Göddeke from the University of Dortmund (Germany), and Robert Strzodka from NVIDIA
– Design and Implementation of Parallel Algorithms for Highly Heterogeneous HPC Platforms, by Alexey Lastovetsky from University College Dublin (Ireland)
– Node Level Performance Engineering, by Georg Hager from the University of Erlangen-Nuremberg (Germany)
– Delivering the OpenCL Performance Promise: Creating and Optimizing OpenCL Applications with the Intel OpenCL SDK, by Maxim Shevtsov from Intel (Russia)
– A History of A Central Result of Linear Algebra and the Role that Gauss, Cholesky and Others Played in Its Development, by Fred Gustavson from the IBM T.J. Watson Research Center (USA)

The PPAM Best Poster Award, established at PPAM 2009, is granted to the best poster on display at the PPAM conferences. The Program Committee members bestow the award on the presenting author(s) of the best poster. The selection criteria are the scientific content and the quality of the poster presentation. The PPAM 2013 winners were Lars Karlsson and Carl Christian K. Mikkelsen from Umea University, who presented the poster “Improving Perfect Parallelism.” The Special Award was bestowed on Lukasz Szustak and Krzysztof Rojek from the Częstochowa University of Technology, and Pawel Gepner from Intel, who presented the poster “Using Intel Xeon Phi to Accelerate Computation in MPDATA Algorithm.”

A new topic was introduced at PPAM 2013: Power and Energy Aspects of Computation (PEAC). Recent advances in computer hardware have made power and energy consumption the driving metric for the design of computational platforms for years to come. Power-conscious designs, including multicore CPUs and various accelerators, dominate large supercomputing installations as well as large industrial complexes devoted to cloud computing and big data analytics. At stake are serious financial and environmental impacts, which the large-scale computing community now has to consider as it embarks on careful re-engineering of software to fit demanding power caps and tight energy budgets. The workshop presented research into new ways of addressing these pressing issues of energy preservation, power consumption, and heat dissipation while attaining the best possible performance levels at the scale demanded by modern scientific challenges.

The PEAC Workshop, as well as the conference as a whole, featured a number of invited and contributed talks covering a diverse array of recent advances, including:

– Cache-aware roofline model for monitoring performance and power in connection with application characterization (by L. Sousa et al.)
– Resource scheduling and allocation schemes based on stochastic models (by M. Oxley et al.)
– A comprehensive study of iterative solvers on a large variety of computing platforms including modern CPUs, accelerators, and embedded computers (by Enrique S. Quintana-Ortí et al.)
– Energy and power consumption trends in HPC (by P. Luszczek)
– Sensitivity of graph metrics to missing data and the benefits they have for overall energy consumption (by A. Zakrzewska et al.)
– Cache energy models and their analytical properties in the context of embedded devices (by K. de Vogeleer et al.)
– Predictive models for execution time, energy consumption, and power draw of algorithms (by R. Vuduc)

The organizers are indebted to the PPAM 2013 sponsors, whose support was vital to the success of the conference. The main sponsor was the Intel Corporation. The other sponsors were: IBM Corporation, Hewlett-Packard Company, Rogue Wave Software, and AMD. We thank all the members of the international Program Committee and additional reviewers for their diligent work in refereeing the submitted papers. Finally, we thank all of the local organizers from the Częstochowa University of Technology, and the Polish-Japanese Institute of Information Technology in Warsaw, who helped us to run the event very smoothly. We are especially indebted to Grażyna Kołakowska, Urszula Kroczewska, Łukasz Kuczyński, Adam Tomaś, and Marcin Woźniak from the Częstochowa University of Technology; and to Jerzy P. Nowacki, Marek Tudruj, Jan Jedliński, and Adam Smyk from the Polish-Japanese Institute of Information Technology.

We hope that this volume will be useful to you. We would like everyone who reads it to feel invited to the next conference, PPAM 2015, which will be held September 6–9, 2015, in Kraków, the old capital of Poland.

January 2014
Roman Wyrzykowski
Jack Dongarra
Konrad Karczewski
Jerzy Waśniewski

Organization

Program Committee

Jan Węglarz, Poznań University of Technology, Poland (Honorary Chair)
Roman Wyrzykowski, Częstochowa University of Technology, Poland (Program Committee Chair)
Ewa Deelman, University of Southern California, USA (Program Committee Vice-Chair)
Francisco Almeida, Universidad de La Laguna, Spain
Pedro Alonso, Universidad Politecnica de Valencia, Spain
Peter Arbenz, ETH, Zurich, Switzerland
Piotr Bała, Nicolaus Copernicus University, Poland
David A. Bader, Georgia Institute of Technology, USA
Michael Bader, TU München, Germany
Włodzimierz Bielecki, West Pomeranian University of Technology, Poland
Paolo Bientinesi, RWTH Aachen, Germany
Radim Blaheta, Institute of Geonics, Czech Academy of Sciences
Jacek Błażewicz, Poznań University of Technology, Poland
Adam Bokota, Częstochowa University of Technology, Poland
Pascal Bouvry, University of Luxembourg
Tadeusz Burczyński, Silesia University of Technology, Poland
Jerzy Brzeziński, Poznań University of Technology, Poland
Marian Bubak, AGH Kraków, Poland, and University of Amsterdam, The Netherlands
Christopher Carothers, Rensselaer Polytechnic Institute, USA
Jesus Carretero, Universidad Carlos III de Madrid, Spain
Raimondas Čiegis, Vilnius Gediminas Technical University, Lithuania
Andrea Clematis, IMATI-CNR, Italy
Jose Cunha, University Nova of Lisbon, Portugal
Zbigniew Czech, Silesia University of Technology, Poland
Jack Dongarra, University of Tennessee and ORNL, USA, and University of Manchester, UK
Maciej Drozdowski, Poznań University of Technology, Poland
Erik Elmroth, Umea University, Sweden
Mariusz Flasiński, Jagiellonian University, Poland
Franz Franchetti, Carnegie Mellon University, USA
Tomas Fryza, Brno University of Technology, Czech Republic
Pawel Gepner, Intel Corporation

Domingo Gimenez, University of Murcia, Spain
Mathieu Giraud, LIFL and Inria, France
Jacek Gondzio, University of Edinburgh, UK
Andrzej Gościński, Deakin University, Australia
Laura Grigori, Inria, France
Adam Grzech, Wroclaw University of Technology, Poland
Inge Gutheil, Forschungszentrum Juelich, Germany
Georg Hager, University of Erlangen-Nuremberg, Germany
José R. Herrero, Universitat Politecnica de Catalunya, Barcelona, Spain
Ladislav Hluchy, Slovak Academy of Sciences, Slovakia
Florin Isaila, Universidad Carlos III de Madrid, Spain
Ondrej Jakl, Institute of Geonics, Czech Academy of Sciences
Emmanuel Jeannot, Inria, France
Bo Kågström, Umea University, Sweden
Alexey Kalinov, Cadence Design System, Russia
Aneta Karaivanova, Bulgarian Academy of Sciences, Sofia
Eleni Karatza, Aristotle University of Thessaloniki, Greece
Ayse Kiper, Middle East Technical University, Turkey
Jacek Kitowski, Institute of Computer Science, AGH, Poland
Jozef Korbicz, University of Zielona Góra, Poland
Stanislaw Kozielski, Silesia University of Technology, Poland
Dieter Kranzlmueller, Ludwig Maximillian University, Munich, and Leibniz Supercomputing Centre, Germany
Henryk Krawczyk, Gdańsk University of Technology, Poland
Piotr Krzyżanowski, University of Warsaw, Poland
Mirosław Kurkowski, Częstochowa University of Technology, Poland
Krzysztof Kurowski, PSNC, Poznań, Poland
Jan Kwiatkowski, Wrocław University of Technology, Poland
Jakub Kurzak, University of Tennessee, USA
Giulliano Laccetti, University of Naples Federico II, Italy
Marco Lapegna, University of Naples Federico II, Italy
Alexey Lastovetsky, University College Dublin, Ireland
Joao Lourenco, University Nova of Lisbon, Portugal
Hatem Ltaief, KAUST, Saudi Arabia
Emilio Luque, Universitat Autonoma de Barcelona, Spain
Vyacheslav I. Maksimov, Ural Branch, Russian Academy of Sciences
Victor E. Malyshkin, Siberian Branch, Russian Academy of Sciences
Pierre Manneback, University of Mons, Belgium
Tomas Margalef, Universitat Autonoma de Barcelona, Spain
Svetozar Margenov, Bulgarian Academy of Sciences, Sofia
Ami Marowka, Bar-Ilan University, Israel
Norbert Meyer, PSNC, Poznań, Poland
Jarek Nabrzyski, University of Notre Dame, USA
Raymond Namyst, University of Bordeaux and Inria, France
Maya G. Neytcheva, Uppsala University, Sweden
Gabriel Oksa, Slovak Academy of Sciences, Bratislava

Ozcan Ozturk, Bilkent University, Turkey
Tomasz Olas, Częstochowa University of Technology, Poland
Marcin Paprzycki, IBS PAN and SWPS, Warsaw, Poland
Dana Petcu, West University of Timisoara, Romania
Enrique S. Quintana-Ortí, Universidad Jaime I, Spain
Jean-Marc Pierson, Paul Sabatier University, France
Thomas Rauber, University of Bayreuth, Germany
Paul Renaud-Goud, Inria, France
Jacek Rokicki, Warsaw University of Technology, Poland
Gudula Runger, Chemnitz University of Technology, Germany
Leszek Rutkowski, Częstochowa University of Technology, Poland
Robert Schaefer, Institute of Computer Science, AGH, Poland
Olaf Schenk, Università della Svizzera Italiana, Switzerland
Stanislav Sedukhin, University of Aizu, Japan
Franciszek Seredyński, Cardinal Stefan Wyszyński University in Warsaw, Poland
Happy Sithole, Centre for High Performance Computing, South Africa
Jurij Silc, Jozef Stefan Institute, Slovenia
Karolj Skala, Ruder Boskovic Institute, Croatia
Peter M.A. Sloot, University of Amsterdam, The Netherlands
Leonel Sousa, Technical University of Lisbon, Portugal
Radek Stompor, Université Paris Diderot and CNRS, France
Przemysław Stpiczyński, Maria Curie Skłodowska University, Poland
Maciej Stroiński, PSNC, Poznań, Poland
Ireneusz Szcześniak, Częstochowa University of Technology, Poland
Boleslaw Szymanski, Rensselaer Polytechnic Institute, USA
Domenico Talia, University of Calabria, Italy
Christian Terboven, RWTH Aachen, Germany
Andrei Tchernykh, CICESE Research Center, Ensenada, Mexico
Suleyman Tosun, Ankara University, Turkey
Roman Trobec, Jozef Stefan Institute, Slovenia
Denis Trystram, Grenoble Institute of Technology, France
Marek Tudruj, Polish Academy of Sciences and Polish-Japanese Institute of Information Technology, Warsaw, Poland
Bora Uçar, Ecole Normale Superieure de Lyon, France
Marian Vajtersic, Salzburg University, Austria
Jerzy Waśniewski, Technical University of Denmark
Bogdan Wiszniewski, Gdańsk University of Technology, Poland
Andrzej Wyszogrodzki, IMGW, Warsaw, Poland
Ramin Yahyapour, University of Göttingen/GWDG, Germany
Jianping Zhu, Cleveland State University, USA
Julius Žilinskas, Vilnius University, Lithuania
Jarosław Żola, Rutgers University, USA

Contents – Part II

Workshop on Scheduling for Parallel Computing (SPC 2013)

Scheduling Bag-of-Tasks Applications to Optimize Computation Time and Cost
Anastasia Grekioti and Natalia V. Shakhlevich

Scheduling Moldable Tasks with Precedence Constraints and Arbitrary Speedup Functions on Multiprocessors
Sascha Hunold

OStrich: Fair Scheduling for Multiple Submissions
Joseph Emeras, Vinicius Pinheiro, Krzysztof Rzadca, and Denis Trystram

Fair Share Is Not Enough: Measuring Fairness in Scheduling with Cooperative Game Theory
Piotr Skowron and Krzysztof Rzadca

Setting up Clusters of Computing Units to Process Several Data Streams Efficiently
Daniel Millot and Christian Parrot

The 5th Workshop on Language-Based Parallel Programming Models (WLPP 2013)

Towards Standardization of Measuring the Usability of Parallel Languages
Ami Marowka

Experiences with Implementing Task Pools in Chapel and X10
Claudia Fohry and Jens Breitbart

Parampl: A Simple Approach for Parallel Execution of AMPL Programs
Artur Olszak and Andrzej Karbowski

Prototyping Framework for Parallel Numerical Computations
Ondřej Meca, Stanislav Böhm, Marek Běhálek, and Martin Šurkovský

Algorithms for In-Place Matrix Transposition
Fred G. Gustavson and David W. Walker

FooPar: A Functional Object Oriented Parallel Framework in Scala
Felix Palludan Hargreaves and Daniel Merkle

Effects of Segmented Finite Difference Time Domain on GPU
Jose Juan Mijares Chan, Gagan Battoo, Parimala Thulasiraman, and Ruppa K. Thulasiram

Optimization of an OpenCL-Based Multi-swarm PSO Algorithm on an APU
Wayne Franz, Parimala Thulasiraman, and Ruppa K. Thulasiram

Core Allocation Policies on Multicore Platforms to Accelerate Forest Fire Spread Predictions
Tomàs Artés, Andrés Cencerrado, Ana Cortés, and Tomàs Margalef

The 4th Workshop on Performance Evaluation of Parallel Applications on Large-Scale Systems

The Effect of Parallelization on a Tetrahedral Mesh Optimization Method
Domingo Benitez, Eduardo Rodríguez, José M. Escobar, and Rafael Montenegro

Analysis of Partitioning Models and Metrics in Parallel Sparse Matrix-Vector Multiplication
Kamer Kaya, Bora Uçar, and Ümit V. Catalyürek

Achieving Memory Scalability in the GYSELA Code to Fit Exascale Constraints
Fabien Rozar, Guillaume Latu, and Jean Roman

Probabilistic Analysis of Barrier Eliminating Method Applied to Load-Imbalanced Parallel Application
Naoki Yonezawa, Ken’ichi Katou, Issei Kino, and Koichi Wada

Multi-GPU Parallel Memetic Algorithm for Capacitated Vehicle Routing Problem
Mieczysław Wodecki, Wojciech Bożejko, Michał Karpiński, and Maciej Pacut

Parallel Applications Performance Evaluation Using the Concept of Granularity
Jan Kwiatkowski

Workshop on Parallel Computational Biology (PBC 2013)

Resolving Load Balancing Issues in BWA on NUMA Multicore Architectures
Charlotte Herzeel, Thomas J. Ashby, Pascal Costanza, and Wolfgang De Meuter

K-mulus: Strategies for BLAST in the Cloud
Christopher M. Hill, Carl H. Albach, Sebastian G. Angel, and Mihai Pop

Faster GPU-Accelerated Smith-Waterman Algorithm with Alignment Backtracking for Short DNA Sequences
Yongchao Liu and Bertil Schmidt

Accelerating String Matching on MIC Architecture for Motif Extraction
Solon P. Pissis, Christian Goll, Pavlos Pavlidis, and Alexandros Stamatakis

A Parallel, Distributed-Memory Framework for Comparative Motif Discovery
Dieter De Witte, Michiel Van Bel, Pieter Audenaert, Piet Demeester, Bart Dhoedt, Klaas Vandepoele, and Jan Fostier

Parallel Seed-Based Approach to Protein Structure Similarity Detection
Guillaume Chapuis, Mathilde Le Boudic-Jamin, Rumen Andonov, Hristo Djidjev, and Dominique Lavenier

Minisymposium on Applications of Parallel Computation in Industry and Engineering

A Parallel Solver for the Time-Periodic Navier–Stokes Equations
Peter Arbenz, Daniel Hupp, and Dominik Obrist

Parallel Numerical Algorithms for Simulation of Rectangular Waveguides by Using GPU
Raimondas Čiegis, Andrej Bugajev, Žilvinas Kancleris, and Gediminas Šlekas

OpenACC Parallelisation for Diffusion Problems, Applied to Temperature Distribution on a Honeycomb Around the Bee Brood: A Worked Example Using BiCGSTAB
Hermann J. Eberl and Rangarajan Sudarsan

Application of CUDA for Acceleration of Calculations in Boundary Value Problems Solving Using PIES
Andrzej Kuzelewski, Eugeniusz Zieniuk, and Agnieszka Boltuc

Modeling and Simulations of Beam Stabilization in Edge-Emitting Broad Area Semiconductor Devices
Mindaugas Radziunas and Raimondas Čiegis

Concurrent Nomadic and Bundle Search: A Class of Parallel Algorithms for Local Optimization
Costas Voglis, Dimitrios G. Papageorgiou, and Isaac E. Lagaris

Parallel Multi-objective Memetic Algorithm for Competitive Facility Location
Algirdas Lančinskas and Julius Žilinskas

Parallelization of Encryption Algorithm Based on Chaos System and Neural Networks
Dariusz Burak

Minisymposium on HPC Applications in Physical Sciences

Simulations of the Adsorption Behavior of Dendrimers
Jarosław S. Kłos and Jens U. Sommer

An Optimized Lattice Boltzmann Code for BlueGene/Q
Marcello Pivanti, Filippo Mantovani, Sebastiano Fabio Schifano, Raffaele Tripiccione, and Luca Zenesini

A Parallel and Scalable Iterative Solver for Sequences of Dense Eigenproblems Arising in FLAPW
Mario Berljafa and Edoardo Di Napoli

Sequential Monte Carlo in Bayesian Assessment of Contaminant Source Localization Based on the Sensors Concentration Measurements
Anna Wawrzynczak, Piotr Kopka, and Mieczyslaw Borysiewicz

Effective Parallelization of Quantum Simulations: Nanomagnetic Molecular Rings
Piotr Kozłowski, Grzegorz Musiał, Michał Antkowiak, and Dante Gatteschi

DFT Study of the Cr8 Molecular Magnet Within Chain-Model Approximations
Valerio Bellini, Daria M. Tomecka, Bartosz Brzostowski, Michał Wojciechowski, Filippo Troiani, Franca Manghi, and Marco Affronte

Non-perturbative Methods in Phenomenological Simulations of Ring-Shape Molecular Nanomagnets
Piotr Kozłowski, Grzegorz Musiał, Monika Haglauer, Wojciech Florek, Michał Antkowiak, Filippo Esposito, and Dante Gatteschi

Non-uniform Quantum Spin Chains: Simulations of Static and Dynamic Properties
Artur Barasiński, Bartosz Brzostowski, Ryszard Matysiak, Paweł Sobczak, and Dariusz Woźniak

Minisymposium on Applied High Performance Numerical Algorithms in PDEs

A Domain Decomposition Method for Discretization of Multiscale Elliptic Problems by Discontinuous Galerkin Method
Maksymilian Dryja

Parallel Preconditioner for the Finite Volume Element Discretization of Elliptic Problems
Leszek Marcinkowski and Talal Rahman

Preconditioning Iterative Substructuring Methods Using Inexact Local Solvers
Piotr Krzyżanowski

Additive Schwarz Method for Nonsymmetric Local Discontinuous Galerkin Discretization of Elliptic Problem
Filip Z. Klawe

Fast Numerical Method for 2D Initial-Boundary Value Problems for the Boltzmann Equation
Alexei Heintz and Piotr Kowalczyk

Simulating Phase Transition Dynamics on Non-trivial Domains
Łukasz Bolikowski and Maria Gokieli

Variable Block Multilevel Iterative Solution of General Sparse Linear Systems
Bruno Carpentieri, Jia Liao, and Masha Sosonkina

An Automatic Way of Finding Robust Elimination Trees for a Multi-frontal Sparse Solver for Radical 2D Hierarchical Meshes
Hassan AbouEisha, Piotr Gurgul, Anna Paszyńska, Maciek Paszyński, Krzysztof Kuźnik, and Mikhail Moshkov

Parallel Efficiency of an Adaptive, Dynamically Balanced Flow Solver
Stanislaw Gepner, Jerzy Majewski, and Jacek Rokicki

Modification of the Newton’s Method for the Simulations of Gallium Nitride Semiconductor Devices
Konrad Sakowski, Leszek Marcinkowski, and Stanislaw Krukowski

Numerical Realization of the One-Dimensional Model of Burning Methanol
Krzysztof Moszyński

Minisymposium on High Performance Computing Interval Methods

A Shaving Method for Interval Linear Systems of Equations
Milan Hladík and Jaroslav Horáček

Finding Enclosures for Linear Systems Using Interval Matrix Multiplication in CUDA
Alexander Dallmann, Philip-Daniel Beck, and Jürgen Wolff von Gudenberg

GPU Acceleration of Metaheuristics Solving Large Scale Parametric Interval Algebraic Systems
Jerzy Duda and Iwona Skalna

Parallel Approach to Monte Carlo Simulation for Option Price Sensitivities Using the Adjoint and Interval Analysis
Grzegorz Kozikowski and Bartłomiej Jacek Kubica

Subsquares Approach – A Simple Scheme for Solving Overdetermined Interval Linear Systems
Jaroslav Horáček and Milan Hladík

Using Quadratic Approximations in an Interval Method for Solving Underdetermined and Well-Determined Nonlinear Systems
Bartłomiej Jacek Kubica

The Definition of Interval-Valued Intuitionistic Fuzzy Sets in the Framework of Dempster-Shafer Theory
Ludmila Dymova and Pavel Sevastjanov

Interval Finite Difference Method for Solving the Problem of Bioheat Transfer Between Blood Vessel and Tissue
Malgorzata A. Jankowska

Workshop on Complex Collective Systems

Bridging the Gap: From Cellular Automata to Differential Equation Models for Pedestrian Dynamics
Felix Dietrich, Gerta Köster, Michael Seitz, and Isabella von Sivers

Cellular Model of Pedestrian Dynamics with Adaptive Time Span
Marek Bukáček, Pavel Hrabák, and Milan Krbálek

The Use of GPGPU in Continuous and Discrete Models of Crowd Dynamics
Hubert Mróz, Jarosław Wąs, and Paweł Topa

Modeling Behavioral Traits of Employees in a Workplace with Cellular Automata
Petros Saravakos and Georgios Ch. Sirakoulis

Probabilistic Pharmaceutical Modelling: A Comparison Between Synchronous and Asynchronous Cellular Automata
Marija Bezbradica, Heather J. Ruskin, and Martin Crane

The Graph of Cellular Automata Applied for Modelling Tumour Induced Angiogenesis
Paweł Topa

Neighborhood Selection and Rules Identification for Cellular Automata: A Rough Sets Approach
Bartłomiej Płaczek

Coupling Lattice Boltzmann Gas and Level Set Method for Simulating Free Surface Flow in GPU/CUDA Environment
Tomir Kryza and Witold Dzwinel

Creation of Agent’s Vision of Social Network Through Episodic Memory
Michał Wrzeszcz and Jacek Kitowski

The Influence of Multi-agent Cooperation on the Efficiency of Taxi Dispatching
Michał Maciejewski and Kai Nagel

Basic Endogenous-Money Economy: An Agent-Based Approach
Ivan Blecic, Arnaldo Cecchini, and Giuseppe A. Trunfio

Author Index

Contents – Part I

Algebra and Geometry Combined Explains How the Mind Does Math
Fred G. Gustavson

Numerical Algorithms and Parallel Scientific Computing

Exploiting Data Sparsity in Parallel Matrix Powers Computations ...... 15 Nicholas Knight, Erin Carson, and James Demmel

Performance of Dense Eigensolvers on BlueGene/Q ...... 26 Inge Gutheil, Jan Felix Münchhalfen, and Johannes Grotendorst

Experiences with a Lanczos Eigensolver in High-Precision Arithmetic . . . . . 36 Alexander Alperovich, Alex Druinsky, and Sivan Toledo

Adaptive Load Balancing for Massively Parallel Multi-Level Monte Carlo Solvers ...... 47 Jonas Šukys

Parallel One–Sided Jacobi SVD Algorithm with Variable Blocking Factor.... 57 Martin Bečka and Gabriel Okša

An Identity Parareal Method for Temporal Parallel Computations ...... 67 Toshiya Takami and Daiki Fukudome

Improving Perfect Parallelism ...... 76 Lars Karlsson, Carl Christian Kjelgaard Mikkelsen, and Bo Kågström

Methods for High-Throughput Computation of Elementary Functions ...... 86 Marat Dukhan and Richard Vuduc

Engineering Nonlinear Pseudorandom Number Generators...... 96 Samuel Neves and Filipe Araujo

Extending the Generalized Fermat Prime Number Search Beyond One Million Digits Using GPUs ...... 106 Iain Bethune and Michael Goetz

Iterative Solution of Singular Systems with Applications...... 114 Radim Blaheta, Ondřej Jakl, and Jiří Starý

Statistical Estimates for the Conditioning of Linear Least Squares Problems. . . 124 Marc Baboulin, Serge Gratton, Rémi Lacroix, and Alan J. Laub

Numerical Treatment of a Cross-Diffusion Model of Biofilm Exposure to Antimicrobials ...... 134 Kazi Rahman and Hermann J. Eberl

Performance Analysis for Stencil-Based 3D MPDATA Algorithm on GPU Architecture...... 145 Krzysztof Rojek, Lukasz Szustak, and Roman Wyrzykowski

Elliptic Solver Performance Evaluation on Modern Hardware Architectures. . . 155 Milosz Ciznicki, Piotr Kopta, Michal Kulczewski, Krzysztof Kurowski, and Pawel Gepner

Parallel Geometric Multigrid Preconditioner for 3D FEM in NuscaS Software Package ...... 166 Tomasz Olas

Scalable Parallel Generation of Very Large Sparse Benchmark Matrices . . . . 178 Daniel Langr, Ivan Šimeček, Pavel Tvrdík, and Tomáš Dytrych

Parallel Non-Numerical Algorithms

Co-operation Schemes for the Parallel Memetic Algorithm ...... 191 Jakub Nalepa, Miroslaw Blocho, and Zbigniew J. Czech

Scalable and Efficient Parallel Selection ...... 202 Christian Siebert

Optimal Diffusion for Load Balancing in Heterogeneous Networks ...... 214 Katerina A. Dimitrakopoulou and Nikolaos M. Missirlis

Parallel Bounded Model Checking of Security Protocols ...... 224 Mirosław Kurkowski, Olga Siedlecka-Lamch, Sabina Szymoniak, and Henryk Piech

Tools and Environments for Parallel/Distributed/Cloud Computing

Development of Domain-Specific Solutions Within the Polish Infrastructure for Advanced Scientific Research ...... 237 J. Kitowski, K. Wiatr, P. Bała, M. Borcz, A. Czyżewski, Ł. Dutka, R. Kluszczyński, J. Kotus, P. Kustra, N. Meyer, A. Milenin, Z. Mosurska, R. Pająk, Ł. Rauch, M. Sterzel, D. Stokłosa, and T. Szepieniec

Cost Optimization of Execution of Multi-level Deadline-Constrained Scientific Workflows on Clouds ...... 251 Maciej Malawski, Kamil Figiela, Marian Bubak, Ewa Deelman, and Jarek Nabrzyski

Parallel Computations in the Volunteer–Based Comcute System ...... 261 Paweł Czarnul, Jarosław Kuchta, and Mariusz Matuszek

Secure Storage and Processing of Confidential Data on Public Clouds . . . . . 272 Jan Meizner, Marian Bubak, Maciej Malawski, and Piotr Nowakowski

Efficient Service Delivery in Complex Heterogeneous and Distributed Environment...... 283 Mariusz Fras and Jan Kwiatkowski

Domain-Driven Visual Query Formulation over RDF Data Sets ...... 293 Bartosz Balis, Tomasz Grabiec, and Marian Bubak

Distributed Program Execution Control Based on Application Global States Monitoring in PEGASUS DA Framework ...... 302 Damian Kopański, Łukasz Maśko, Eryk Laskowski, Adam Smyk, Janusz Borkowski, and Marek Tudruj

Application of Parallel Computing

New Scalable SIMD-Based Ray Caster Implementation for Virtual Machining. . . 317 Alexander Leutgeb, Torsten Welsch, and Michael Hava

Parallelization of Permuting XML Compressors ...... 327 Tyler Corbin, Tomasz Müldner, and Jan Krzysztof Miziołek

Parallel Processing Model for Syntactic Pattern Recognition-Based Electrical Load Forecast...... 338 Mariusz Flasiński, Janusz Jurek, and Tomasz Peszek

Parallel Event–Driven Simulation Based on Application Global State Monitoring...... 348 Łukasz Maśko and Marek Tudruj

Applied Mathematics, Evolutionary Computing and Metaheuristics

It’s Not a Bug, It’s a Feature: Wait-Free Asynchronous Cellular Genetic Algorithm ...... 361 Frédéric Pinel, Bernabé Dorronsoro, Pascal Bouvry, and Samee U. Khan

Genetic Programming in Automatic Discovery of Relationships in Computer System Monitoring Data ...... 371 Wlodzimierz Funika and Pawel Koperek

Genetic Algorithms Execution Control Under a Global Application State Monitoring Infrastructure ...... 381 Adam Smyk and Marek Tudruj

Evolutionary Algorithms for Abstract Planning ...... 392 Jaroslaw Skaruz, Artur Niewiadomski, and Wojciech Penczek

Solution of the Inverse Continuous Casting Problem with the Aid of Modified Harmony Search Algorithm ...... 402 Edyta Hetmaniok, Damian Słota, and Adam Zielonka

Influence of a Topology of a Spring Network on its Ability to Learn Mechanical Behaviour...... 412 Maja Czoków and Jacek Miękisz

Comparing Images Based on Histograms of Local Interest Points...... 423 Tomasz Nowak, Marcin Gabryel, Marcin Korytkowski, and Rafał Scherer

Improved Digital Image Segmentation Based on Stereo Vision and Mean Shift Algorithm ...... 433 Rafał Grycuk, Marcin Gabryel, Marcin Korytkowski, Jakub Romanowski, and Rafał Scherer

Minisymposium on GPU Computing

Evaluation of Autoparallelization Toolkits for Commodity GPUs ...... 447 David Williams, Valeriu Codreanu, Po Yang, Baoquan Liu, Feng Dong, Burhan Yasar, Babak Mahdian, Alessandro Chiarini, Xia Zhao, and Jos B.T.M. Roerdink

Real-Time Multiview Human Body Tracking Using GPU-Accelerated PSO. . . 458 Boguslaw Rymut and Bogdan Kwolek

Implementation of a Heterogeneous Image Reconstruction System for Clinical Magnetic Resonance ...... 469 Grzegorz Tomasz Kowalik, Jennifer Anne Steeden, David Atkinson, Andrew Taylor, and Vivek Muthurangu

X-Ray Laser Imaging of Biomolecules Using Multiple GPUs ...... 480 Stefan Engblom and Jing Liu

Out-of-Core Solution of Eigenproblems for Macromolecular Simulations . . . 490 José I. Aliaga, Davor Davidović, and Enrique S. Quintana-Ortí

Using GPUs for Parallel Stencil Computations in Relativistic Hydrodynamic Simulation ...... 500 Sebastian Cygert, Daniel Kikoła, Joanna Porter-Sobieraj, Jan Sikorski, and Marcin Słodkowski

Special Session on Multicore Systems

PDNOC: An Efficient Partially Diagonal Network-on-Chip Design ...... 513 Thomas Canhao Xu, Ville Leppänen, Pasi Liljeberg, Juha Plosila, and Hannu Tenhunen

Adaptive Fork-Heuristics for Software Thread-Level Speculation ...... 523 Zhen Cao and Clark Verbrugge

Inexact Sparse Matrix Vector Multiplication in Krylov Subspace Methods: An Application-Oriented Reduction Method...... 534 Ahmad Mansour and Jürgen Götze

The Regular Expression Matching Algorithm for the Energy Efficient Reconfigurable SoC ...... 545 Paweł Russek and Kazimierz Wiatr

Workshop on Numerical Algorithms on Hybrid Architectures

Performance Evaluation of Sparse Matrix Multiplication Kernels on Intel Xeon Phi ...... 559 Erik Saule, Kamer Kaya, and Ümit V. Çatalyürek

Portable HPC Programming on Intel Many-Integrated-Core Hardware with MAGMA Port to Xeon Phi ...... 571 Jack Dongarra, Mark Gates, Azzam Haidar, Yulu Jia, Khairul Kabir, Piotr Luszczek, and Stanimire Tomov

Using Intel Xeon Phi Coprocessor to Accelerate Computations in MPDATA Algorithm...... 582 Lukasz Szustak, Krzysztof Rojek, and Pawel Gepner

Accelerating a Massively Parallel Numerical Simulation in Electromagnetism Using a Cluster of GPUs ...... 593 Cédric Augonnet, David Goudin, Agnès Pujols, and Muriel Sesques

Multidimensional Monte Carlo Integration on Clusters with Hybrid GPU-Accelerated Nodes ...... 603 Dominik Szałkowski and Przemysław Stpiczyński

Efficient Execution of Erasure Codes on AMD APU Architecture ...... 613 Roman Wyrzykowski, Marcin Woźniak, and Lukasz Kuczyński

AVX Acceleration of DD Arithmetic Between a Sparse Matrix and Vector . . . 622 Toshiaki Hishinuma, Akihiro Fujii, Teruo Tanaka, and Hidehiko Hasegawa

Using Quadruple Precision Arithmetic to Accelerate Krylov Subspace Methods on GPUs...... 632 Daichi Mukunoki and Daisuke Takahashi

Effectiveness of Sparse Data Structure for Double-Double and Quad-Double Arithmetics ...... 643 Tsubasa Saito, Satoko Kikkawa, Emiko Ishiwata, and Hidehiko Hasegawa

Efficient Heuristic Adaptive Quadrature on GPUs: Design and Evaluation .... 652 Daniel Thuerck, Sven Widmer, Arjan Kuijper, and Michael Goesele

An Efficient Representation on GPU for Transition Rate Matrices for Markov Chains ...... 663 Jarosław Bylina, Beata Bylina, and Marek Karwacki

Eigen-G: GPU-Based Eigenvalue Solver for Real-Symmetric Dense Matrices . . . 673 Toshiyuki Imamura, Susumu Yamada, and Masahiko Machida

A Square Block Format for Symmetric Band Matrices ...... 683 Fred G. Gustavson, José R. Herrero, and Enric Morancho

Workshop on Models, Algorithms, and Methodologies for Hierarchical Parallelism in New HPC Systems

Transparent Application Acceleration by Intelligent Scheduling of Shared Library Calls on Heterogeneous Systems ...... 693 João Colaço, Adrian Matoga, Aleksandar Ilic, Nuno Roma, Pedro Tomás, and Ricardo Chaves

A Study on Adaptive Algorithms for Numerical Quadrature on Heterogeneous GPU and Multicore Based Systems...... 704 Giuliano Laccetti, Marco Lapegna, Valeria Mele, and Diego Romano

Improving Parallel I/O Performance Using Multithreaded Two-Phase I/O with Processor Affinity Management...... 714 Yuichi Tsujita, Kazumi Yoshinaga, Atsushi Hori, Mikiko Sato, Mitaro Namiki, and Yutaka Ishikawa

Storage Management Systems for Organizationally Distributed Environments PLGrid PLUS Case Study ...... 724 Renata Słota, Łukasz Dutka, Michał Wrzeszcz, Bartosz Kryza, Darin Nikolow, Dariusz Król, and Jacek Kitowski

The High Performance Internet of Things: Using GVirtuS to Share High-End GPUs with ARM Based Cluster Computing Nodes ...... 734 Giuliano Laccetti, Raffaele Montella, Carlo Palmieri, and Valentina Pelliccia

Workshop on Power and Energy Aspects of Computation

Monitoring Performance and Power for Application Characterization with the Cache-Aware Roofline Model ...... 747 Diogo Antão, Luís Taniça, Aleksandar Ilic, Frederico Pratas, Pedro Tomás, and Leonel Sousa

Energy and Deadline Constrained Robust Stochastic Static Resource Allocation. . . 761 Mark A. Oxley, Sudeep Pasricha, Howard Jay Siegel, and Anthony A. Maciejewski

Performance and Energy Analysis of the Iterative Solution of Sparse Linear Systems on Multicore and Manycore Architectures...... 772 José I. Aliaga, Hartwig Anzt, Maribel Castillo, Juan C. Fernández, Germán León, Joaquín Pérez, and Enrique S. Quintana-Ortí

Measuring the Sensitivity of Graph Metrics to Missing Data ...... 783 Anita Zakrzewska and David A. Bader

The Energy/Frequency Convexity Rule: Modeling and Experimental Validation on Mobile Devices ...... 793 Karel De Vogeleer, Gerard Memmi, Pierre Jouvelot, and Fabien Coelho

Author Index ...... 805