Zuse Institute Berlin (ZIB) | 2017 Annual Report
SPATIAL AND DYNAMIC STRUCTURE OF LIFE | MATHEMATICS MEETS HUMANITIES | SOFTWARE SUSTAINABILITY IN THE AGE OF OPEN SCIENCE | SCALABILITY AND CONCURRENCY
PREFACE
We are living in a rapidly changing world with enormous opportunities, but also many threats. Mastering complexity is becoming more and more important. The last decade has caused the digitization of almost every aspect of our daily lives and generated an avalanche of digital data. What looks like an immense gain in information for our networked society at first, quickly turns out to be a curse. Data per se still is not information, and certainly is not knowledge. On the contrary: only intelligent questions about data diversity can generate information in the true sense of the term, and thus allow for taming complexity. However, even the most reliable information is not sufficient for decision-making, which has to be based on exploring several options and trying to predict their consequences (costs and benefits) in order to select the best option or create new solutions to grand challenges. The prediction of complex processes, however, requires massive simulation based on efficient algorithms and high-performance computing infrastructures.

In its white paper SWD (2016) 106, the European Commission states: “High-performance computing (HPC) is at the core of major advances and innovation in the digital age. In the massively connected digital economy, the exponential growth of data, networking, and computing will continue to drive societal changes, scientific advances, and productivity gains. The nature of computing is changing with an increasing number of data-intensive critical applications, and the intertwining of HPC with a growing number of industrial applications and scientific domains makes HPC the engine to power the new global digital economy. Mastering HPC technologies has become indispensable for supporting policy making, maintaining national sovereignty, and economic competitiveness. The development of the next HPC generation has become a national strategic priority for the most powerful nations, including the USA, China, Japan, Russia, India, and Europe as well.”

From a more scientific view, one has to add that MSO (modeling, simulation, and optimization) is equally important for the development of most innovations in technology, health, energy, and finance. Although HPC, data analytics, and artificial intelligence offer new opportunities, their impact on decision-making, technological and social innovation, and the improvement of products and services will be limited without a massive effort on the MSO axes. This statement is supported by various scientific studies that show that MSO methods have outpaced computational power in terms of capability over the past decades. It is well known that the performance of computing machines improves by a factor of two every 18 months. Moore’s law describes this trend. In the period from 1990 to 2014, this resulted in a speedup of more than a factor of 1,000,000. What is not known to the general public is that MSO research has achieved similar or even higher speedups in algorithms needed for grand challenge simulations, so that the combined speedup is estimated to be larger than an impressive factor of 1,800,000,000,000. This means that computations that we can do in one second today would have needed more than 50,000 years in 1990.

At ZIB, we are guided by the belief that a high-level approach to modeling, simulation, and optimization (MSO), enriched by data analytics and high-performance computing (HPC), delivers a considerable contribution to solving the grand challenges, improving the scientific and industrial innovation capability, and allowing better services for citizens and better decision-making. Major opportunities rely on the connections at this interface across the whole spectrum of the sciences and humanities. Additionally, the convergence of MSO, HPC, and big data will enable traditional computationally intensive sectors to be more productive and move up into higher-value products and services, and will also pave the way for new science, businesses, and applications that we are far from being able to imagine now. ZIB is one of the places where this convergence is happening now.

In 2017, we worked frantically to open new horizons, for example, into computational social science and humanities, and we intensified our research activities in established application fields like life and materials sciences, nanophotonics, and traffic and transport networks. This annual report provides insights into a variety of other success stories and gives a general overview of ZIB’s organization and key factors for its successful development. In particular, six feature articles highlight aspects of our work: “Mathematics Meets Humanities” provides insight into new fruitful interactions between mathematics, facing new challenges by integrating the human factor into complex systems modeling, and the humanities, seeking possible solutions to otherwise unsolvable tasks. In “Spatial and Dynamic Structure of Life,” we report on recent research results at ZIB that provide insight into fundamental principles of biological processes, such as cell division, protein regulation, and brain formation and activity. “20,000 Feet Above the Ground” sheds some light on one of ZIB’s innovation projects with industry partners by showing how basic research leads to new navigation systems for aircraft that save both fuel and time. The article “Scalability and Concurrency” features joint research problems in two interdisciplinary projects in the fields of high-energy physics and the automotive industry. “Dusting off Cometary Surfaces” shows how simulation and data analytics can be combined to model the gas and dust cloud around the comet Churyumov–Gerasimenko in order to extract relevant information from the data gathered by a space mission to the comet. And, “Software Sustainability in the Age of Open Science” reports on research activities at ZIB regarding sustainability methods that form part of increasing efforts to conduct research in accordance with open-science principles, and demonstrates how these efforts may change the way research is done in the future.

In summary, ZIB continues to be a place booming with excellent research and first-rate scientific services and infrastructure. Against our own expectations, we again broke several all-time records in 2017, although present capacity limitations seemed to prevent this. For example, in total, €8 million worth of third-party funding was acquired, which marked an increase for the sixth year in a row and a new record in ZIB’s history.

Berlin, May 2018
Christof Schütte
President of ZIB
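The speedup figures quoted in the preface can be checked with a few lines of arithmetic. The sketch below converts the combined factor of 1,800,000,000,000 into years of 1990 computing time. The variable names are illustrative. Treating the combined factor as the product of a hardware factor and an algorithmic factor is an assumption made here, not a statement from the report.

```python
# Back-of-the-envelope check of the speedup figures quoted in the preface.
# Assumption (not from the report): combined speedup = hardware * algorithms.

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year in seconds

combined_speedup = 1.8e12   # "larger than ... 1,800,000,000,000"
hardware_speedup = 1.0e6    # "more than a factor of 1,000,000" (1990 to 2014)

# One second of computation today, expressed as 1990 computing time in years.
years_in_1990 = combined_speedup / SECONDS_PER_YEAR

# Algorithmic factor implied under the product assumption above.
implied_algorithmic_speedup = combined_speedup / hardware_speedup

print(f"1 s today ~ {years_in_1990:,.0f} years in 1990")
print(f"implied algorithmic factor ~ {implied_algorithmic_speedup:,.0f}")
```

Under these assumptions the result is roughly 57,000 years, consistent with the "more than 50,000 years" stated in the preface.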
CONTENTS

ZUSE INSTITUTE BERLIN: Preface | Executive Summary | Organization | ZIB Structure | ZIB in Numbers | Kombiverkehr 2021 – Optimizing the Line Network of Karlsruhe | ZIB's Data Center | Economic Situation | Spin-Offs | Number of Employees
SPATIAL AND DYNAMIC STRUCTURE OF LIFE: Insight into fundamental principles of life
MATHEMATICS MEETS HUMANITIES: Illustrating the potential of enhanced collaboration
SOFTWARE SUSTAINABILITY IN THE AGE OF OPEN SCIENCE: Software as sustained scientific output
20,000 FEET ABOVE THE GROUND: New navigation systems for aircraft
SCALABILITY AND CONCURRENCY: Solving common problems in high-energy physics and the automotive industry
DUSTING OFF COMETARY SURFACES: First realistic simulations of the dust and gas coma of a comet
ZIB PUBLICATIONS | REFERENCES | IMPRINT

Executive Summary
MATHEMATICS FOR LIFE AND MATERIAL SCIENCES

This year was marked by the preparation of the Cluster of Excellence initiative MATH+, which fueled work on new application directions, in particular in the framework of collaborations with the existing excellence clusters TOPOI (digital humanities and computational social sciences) and Neurocure (analyzing and interpreting life microscopy data).

Many research activities that were started in previous years have been thematically extended, such as augmenting UQ by optimal design of experiments and empirical Bayesian approaches, or moving from linear to nonlinear shape manifolds. The investigation and utilization of machine-learning techniques has increased, with applications particularly in image segmentation and inverse problems, and massive simulation of particle-based and agent-based models allowed for novel insights in such diverse areas as biomedicine or archaeology.

These new developments are complemented by a successful continuation of activities in modeling, simulation, and optimization as well as visual data analysis. They resulted, for instance, in the development of a nonaddictive painkiller (with a publication in Science and emergence of a spin-off company), the first reconstruction and analysis of entire microtubule spindles of C. elegans (with a publication in Nature Communications), the acquisition of EU projects on optical metrology and scientific cloud computing, an industry project on cloth simulation and visualization, and the first prize in the worldwide MICCAI segmentation competition.

MATHEMATICAL OPTIMIZATION AND SCIENTIFIC INFORMATION

The ongoing digitization of industry and society continues to open up new opportunities for optimization in many applications, such as intermodal travel or computational biology, and brings up exciting mathematical challenges on the interfaces to data and computer science, as well as economics. The computation of a new line plan for the city of Karlsruhe (see the respective report below) and for the ICE rotations of Deutsche Bahn, as well as spectacular performance improvements in the general mixed-integer programming solver SCIP and the parallel computing framework UG were particular highlights.

Networking and outreach were also in the focus of the department in 2017. Together with Freie Universität Berlin’s Department of Information Systems, ZIB’s Optimization department co-organized the 55th International Conference on Operations Research (OR2017), in which more than 900 participants from 44 countries contributed almost 600 presentations in 21 parallel streams. Two international workshops in Tokyo and Berlin, jointly organized with the Institute of Statistical Mathematics in Tokyo, the Institute of Mathematics for Industry of Kyushu University, and MODAL AG, explored the fascinating interface of mathematical optimization and data analysis. Successful activities for the upcoming extension of the Research Campus MODAL focused on the acquisition of new partners from both industry and academia.

PARALLEL AND DISTRIBUTED COMPUTING

From the smallest structures at the sub-atomistic level to large-scale cosmic matter: supercomputers are indispensable tools to simulate important real-world phenomena. Our feature article “Dusting off Cometary Surfaces” describes how the beautiful dust corona of the comet 67P/Churyumov–Gerasimenko was investigated on the HLRN supercomputer. Taking the shape of the nucleus, the temperature distribution on the surface, and the centrifugal and Coriolis forces into account, the dust distribution was predicted with a much higher accuracy than previously possible. This project is also a good example of the new breed of mixed HPC/big-data projects, since our simulation results had to be correlated with a great deal of observation data from the Rosetta mission.

The two interdisciplinary projects of the feature article “Scalability and Concurrency” also combine high-performance computing with data-stream processing. Scalable software for the real-time analysis of nucleus–nucleus collisions at the FAIR particle accelerator was developed, and parallel algorithms were designed for testing and verifying software components that are used in cars for the real-time analysis of sensor data. As in the years before, research at ZIB was focused on improving scalability and fault tolerance in dynamic systems with failing components.

Our next supercomputer, the HLRN-IV, will make it much easier to conduct the most challenging research projects. The European procurement led by ZIB was successfully finalized in 2017 and the contract signed in early 2018. With 244,000 processor cores, the new system will provide a six-fold increase in application performance. The supercomputer will be operated on behalf of the North German HLRN consortium by ZIB in Berlin and the University of Göttingen in Lower Saxony.
Organization
THE STATUTES
The Statutes, adopted by the Board of Directors at its meeting on June 30, 2005, define the functions and procedures of ZIB’s bodies, determine ZIB’s research and development mission and its service tasks, and decide upon the composition of the Scientific Advisory Board and its role.

ADMINISTRATIVE BODIES

The bodies of ZIB are the President and the Board of Directors (Verwaltungsrat).

President of ZIB: PROF. DR. CHRISTOF SCHÜTTE
Vice President: N.N.

The Board of Directors was composed in 2017 as follows:
PROF. DR. PETER FRENSCH | Vice President, Humboldt-Universität zu Berlin (Chairman)
PROF. DR. CHRISTIAN THOMSEN | President, Technische Universität Berlin (Vice Chairman)
PROF. DR. BRIGITTA SCHÜTT | Vice President, Freie Universität Berlin
DR. JUTTA KOCH-UNTERSEHER | Der Regierende Bürgermeister von Berlin, Senatskanzlei – Wissenschaft und Forschung
DR. JÜRGEN VARNHORN | Senatsverwaltung für Wirtschaft, Energie und Betriebe
PROF. DR. MANFRED HENNECKE | Bundesanstalt für Materialforschung und -prüfung (BAM)
THOMAS FREDERKING | Helmholtz-Zentrum Berlin für Materialien und Energie (HZB)
DR. HEIKE WOLKE | Max-Delbrück-Centrum für Molekulare Medizin (MDC)

The Board of Directors met on May 19, 2017, and December 4, 2017.

SCIENTIFIC ADVISORY BOARD

The Scientific Advisory Board advises ZIB on scientific and technical issues, supports ZIB’s work, and facilitates ZIB’s cooperation and partnership with universities, research institutions, and industry.

The Board of Directors appointed the following members to the Scientific Advisory Board:
PROF. DR. JÖRG-RÜDIGER SACK (Chairman) | Carleton University, Ottawa, Canada
PROF. DR. ALFRED K. LOUIS | Universität des Saarlandes, Saarbrücken
PROF. DR. RAINER E. BURKARD | Technische Universität Graz, Austria
PROF. DR. MICHAEL DELLNITZ | Universität Paderborn
LUDGER D. SAX | Grid Optimization Europe GmbH, Essen
DR. ANNA SCHREIECK | BASF SE, Ludwigshafen
DR. REINHARD UPPENKAMP | Berlin Chemie AG, Berlin
DR. KERSTIN WAAS | Deutsche Bahn AG, Frankfurt am Main
PROF. DR. DOROTHEA WAGNER | Karlsruher Institut für Technologie (KIT), Karlsruhe

The Scientific Advisory Board met on July 3 and 4, 2017, at ZIB.

DIVISIONS

MATHEMATICS FOR LIFE AND MATERIAL SCIENCES | Prof. Dr. Christof Schütte
MATHEMATICAL OPTIMIZATION AND SCIENTIFIC INFORMATION | Prof. Dr. Ralf Borndörfer, Prof. Dr. Thorsten Koch
PARALLEL AND DISTRIBUTED COMPUTING | Prof. Dr. Alexander Reinefeld
ADMINISTRATION AND LIBRARY | Annerose Steinke
ZIB Structure
ZIB is structured into four divisions: three scientific divisions and ZIB’s administration. Each of the scientific divisions is composed of two departments that are further subdivided into research groups and research service groups.

MATHEMATICS FOR LIFE AND MATERIAL SCIENCES | C. Schütte
  NUMERICAL MATHEMATICS | M. Weiser
    Computational Medicine | M. Weiser, S. Zachow
    Computational Molecular Design | M. Weber
    Computational Nano Optics | F. Schmidt; since May 1, 2017: S. Burger
    Computational Systems Biology | S. Röblitz
    Uncertainty Quantification | T. Sullivan
  VISUAL DATA ANALYSIS | H.-C. Hege
    Visual Data Analysis in Science and Engineering | H.-C. Hege
    Image Analysis in Biology and Material Science | S. Prohaska, D. Baum
    Therapy Planning | S. Zachow
    Bioinformatics in Medicine | T. Conrad
    Machine Learning for Time Series | since June 1, 2017: H. Wu

MATHEMATICAL OPTIMIZATION AND SCIENTIFIC INFORMATION | R. Borndörfer, T. Koch
  MATHEMATICAL OPTIMIZATION | R. Borndörfer, T. Koch
    Mathematics of Transportation and Logistics | R. Borndörfer
    Mathematical Optimization Methods | A. Gleixner
    Mathematics of Telecommunication | until June 30, 2017: R. Borndörfer
    Energy Network Optimization | J. Zittel, J. Schweiger
    Mathematics of Health Care | since July 1, 2017: G. Sagnol
  SCIENTIFIC INFORMATION | T. Koch (B. Rusch)
    Web Technology and Multimedia | W. Dalitz
    Digital Preservation | W. Peters-Kottig
    Service Center Digitization Berlin | A. Müller
    KOBV Library Network – Research and Development | B. Rusch
    KOBV Library Network – Operating | S. Lohrum
    Friedrich-Althoff-Konsortium | U. Kaminsky

PARALLEL AND DISTRIBUTED COMPUTING | A. Reinefeld
  DISTRIBUTED ALGORITHMS | F. Schintke
    Distributed Data Management | F. Schintke
    Scalable Algorithms | T. Schütt
    Massively Parallel Data Analysis | F. Schintke
  SUPERCOMPUTING | T. Steinke
    HPC Consulting | T. Steinke
    HPC Systems | C. Schimmel
    Algorithms for Innovative Architecture | T. Steinke

ADMINISTRATION AND LIBRARY | A. Steinke

CORE FACILITY IT AND DATA SERVICES | C. Schäuble
BRAIN BERLIN RESEARCH AREA INFORMATION NETWORK | C. Schäuble
ZIB IN NUMBERS

Data archive at ZIB: total capacity of 65 PB on 16,600 tapes
Third-party funds: €7,896,000 in total (€6,147,000 project-related public third-party funds, €1,749,000 industrial third-party projects)
230 scientific talks: 65 distinguished, 165 invited
136 peer-reviewed publications in international scientific journals
In June 2017: ZIB supercomputer is No. 140 in the TOP500 list
13,986 SEMINARS GIVEN BY ZIB SCIENTISTS AT UNIVERSITIES 15 OUTREACH EVENTS FOR 2,004 SCHOOL CLASSES AND THE GENERAL PUBLIC 58 9 VISITORS SCIP LECTURES LONG NIGHT OF THE SCIENCES GIVEN BY ZIB SCIENTISTS AT UNIVERSITIES OF MIP SOLVER SCIP PROMOTION OF YOUNG SCIENTISTS: DISSERTATIONS MASTER’S PROFESSORSHIPS OFFERED TO ZIB RESEARCHERS 6,150 99 18 INTERNATIONAL GUESTS CONFERENCES AND 4 AT ZIB IN 2017 WORKSHOPS AT ZIB
KOMBIVERKEHR 2021 – OPTIMIZING THE LINE NETWORK OF KARLSRUHE

ZIB’s research group Mathematics of Transportation and Logistics supported the design of a new line network for Karlsruhe. The optimized solution reduces costs by around 5% while travel times remain constant.

Kombiverkehr 2021 (combined traffic 2021) is currently one of the largest inner-city construction projects in Germany. It aims at increasing the overall traffic flow and the attractiveness of the city of Karlsruhe by tunneling public transport under the pedestrian zone of Karlstrasse. In this way, the tramway network is expanded, such that it becomes necessary to redesign the complete line network. As early as 2002, a line plan (“Kombi”, see figure 2) was designed using the so-called “Standardisierte Bewertung” method. Kombi was supposed to be operated in 2021 after all construction works were finished. The population of Karlsruhe, however, has increased since then by around 20,000 people, such that the Kombi plan was no longer adequate. In a joint project with Verkehrsbetriebe Karlsruhe GmbH (VBK), PTV Transport Consult GmbH, and TTK Transport Technology Consult GmbH, our goal was to investigate the potentials in cost and efficiency improvements by using mathematical optimization methods, in order to find the best compromise between travel-time improvements and cost efficiency.

The mathematical basis of our line-planning optimization system is a mixed-integer linear optimization model that considers all possible line routes and passenger paths simultaneously. To favor traveling without transfers, the model is based on a novel concept of metric inequalities for direct connections; for further details see [1]. All kinds of practical and technical requirements can be included in the model, e.g. minimum cycle times for certain transportation modes, minimum frequency requirements for each station, or maximum frequency bounds on tracks. A solution of the optimization model implies a set of line routes together with frequencies of operations, such that the given passenger demand can be routed with respect to the resulting line capacities. Using bi-criteria optimization, the whole potential and the interdependencies between cost and travel-time minimization can be investigated by computing the entire Pareto front of efficient solutions.

The focus in Karlsruhe was on the redesign of the tramway network. All other transportation systems (e.g. bus and regional traffic) were considered for passenger routing as well; however, the line routes of these systems remained unchanged. The computation was done for the peak hour.

Potentials for travel-time improvements and cost savings were illustrated by computing different solutions addressing cost minimization, travel-time minimization, and some compromise solutions (see table 1). In several rounds of computation, evaluation, and discussion, the infrastructure, operational, and service-related constraints and requirements of VBK, as well as a suitable cost model, were incorporated into both the line-planning optimizer and the accompanying micro-simulation model VISUM of PTV. In this way, realistic line plans could be computed.

After assessing many alternatives (which are all Pareto optimal, i.e. realize “best possible compromises”), VBK decided in the end on a network that yields similar travel times as the Kombi line network, but reduces costs by around 5%. This line plan is illustrated in figure 3. The Verkehrsbetriebe Karlsruhe GmbH are currently evaluating the implementation of either plan for the year 2021.

Figure 1: Karlsruhe. Source: VBK/Uli Deck.
Figure 2: Kombi: line plan computed for the Standardisierte Bewertung.
Figure 3: Optimized solution.

Solution | No. tram | Cost [% Kombi] | Total travel time [T h] (incl. eight-minute transfer penalty) | Total no. transfers
Kombi | 9 | 100.0% | 131.78 | 124,259
C+ | 8 | 83.1% | 133.82 | 125,302
C | 7 | 86.8% | 132.27 | 128,482
Final | 8 | 95.1% | 131.71 | 126,138
T | 9 | 95.2% | 131.42 | 126,130
T+ | 9 | 102.0% | 130.77 | 122,730

Table 1: Overview of some solutions.
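The idea of a Pareto front of efficient solutions can be illustrated on the values in table 1. The following minimal sketch filters the listed plans for non-dominated ones with respect to two criteria only, cost and total travel time. This is an assumption made for illustration: the optimization model in the report may weigh further criteria, and the names `Plan`, `dominates`, and `pareto_front` are hypothetical helpers, not part of ZIB's software.

```python
from typing import NamedTuple


class Plan(NamedTuple):
    name: str
    cost_pct: float      # cost in % of the Kombi plan (table 1)
    travel_time: float   # total travel time in T h (table 1)


# Values copied from table 1.
PLANS = [
    Plan("Kombi", 100.0, 131.78),
    Plan("C+", 83.1, 133.82),
    Plan("C", 86.8, 132.27),
    Plan("Final", 95.1, 131.71),
    Plan("T", 95.2, 131.42),
    Plan("T+", 102.0, 130.77),
]


def dominates(a: Plan, b: Plan) -> bool:
    """True if a is no worse than b in both criteria and better in one."""
    no_worse = a.cost_pct <= b.cost_pct and a.travel_time <= b.travel_time
    better = a.cost_pct < b.cost_pct or a.travel_time < b.travel_time
    return no_worse and better


def pareto_front(plans: list[Plan]) -> list[Plan]:
    """Keep only the plans that no other plan dominates."""
    return [p for p in plans if not any(dominates(q, p) for q in plans)]


front = pareto_front(PLANS)
print([p.name for p in front])
```

With these two criteria, the original Kombi plan is dominated (for example by "Final", which is cheaper and slightly faster), while the five computed alternatives remain on the front.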
ZIB’S DATA CENTER

After the presentation of ZIB’s open science infrastructure project in October 2016, the institute’s core facility “IT and Data Service” (ITDS) started to rebuild and modernize the entire IT infrastructure at ZIB. From the infrastructure of the data center to virtual file systems and server environments, all IT services have been revised and reviewed since 2016. We introduced new operating and development models for IT services. ZIB’s new life-cycle management system transforms the IT deployment processes from crafting to automation. The hardware basis is now strongly geared toward commodity hardware. Universal servers can be used for specific software configurations. The specifics of IT services will no longer be implemented in hardware but in software and thus at runtime of the systems.

The integration and operational density in the data center with more than 15 kW per server rack is close to the maximum cooling capacity when using in-row coolers.

OPEN SOURCE 100 GBIT/S FIREWALLS

The Berlin Research Area Information Network (BRAIN) and ITDS are working closely together to create a 100 Gbit firewall solution based on standard hardware with open-source software components. This implements a highly flexible firewall system with horizontal scaling and built-in redundancy. In other words, this is a step from active/passive standby to an active/active system.

SCALABLE DATA MANAGEMENT

Since the beginning of 2017, the data service infrastructure has been updated: the cache systems have been upgraded with the latest design SSDs, the Oracle HSM is now available in the latest version, additional protocols such as CIFS and NFS are now directly available for cooperation partners with BRAIN connection, and additional services like Nextcloud or a scalable file-system integration are possible. The connection of our largest service consumer is now available with 80 Gbit/s sustainable data rate.

VIRTUAL FILE SYSTEM

The handling of petabytes of user data in the management for direct data access on a multiuser level is a challenge that repeatedly produces previously unresolved problems for ITDS and our users in searching, accessing, authorizing, and analyzing the data. In response, ITDS introduced a virtual file system that implements data migration, data hierarchy (HSM), metadata handling and capture, role-based management, backup and duplication, as well as versioning transparently to the user. In addition to the established access protocols, a Web interface will enable user self-service for Windows, Linux, and Mac OS X. Automation interfaces are added to build data-related applications. Before, users had to interact with many different storage and data management systems. With the introduction of the new paradigm, ITDS realized a new convergent view upon the virtual file system.

This is a shift in paradigm away from file-system management to data management. Central aspects are automatic data classification, storage inventory and classification, storage analysis, user groups and project quota, and storage pools that can be expanded dynamically and vendor independently.

[Figure: Convergence. Now: users access VMs, HPC, scratch data, ZIB storage, and backup separately. Future: users access a converged layer via Windows (CIFS), Linux (NFS), and Web (HTTPS), with metadata handling, search, and revisioning, on top of VMs, HPC, scratch data, ZIB storage, and backup.]

DATA CENTER

A complete renewal of ZIB’s data-center operating area was started in August 2017. All server racks, the cooling, and the cabling have been updated and provided with a modern and variable cooling and alignment concept combined with a hot-aisle air containment. We now operate 50 directly cooled server racks with a total capacity of 2,350 height units with a theoretical peak cooling capacity of 900 kW. In addition, redundant systems such as network and fiber channels have been strategically distributed to build a high-availability data center.

VIRTUAL INFRASTRUCTURE – TENANT MODEL

Parallel to the transformation of the data center, a new multi-tenant virtualization solution was introduced. The system now consists of 144 processors, 3.5 TB RAM, and more than 200 TB SSD memory connected with a 40 Gbit network. This allows us to combine about 200 virtual servers in a common infrastructure.

The virtual infrastructure tenant model implements a combined environment for software projects. These tenants consist of virtual servers, a firewall, IP segments, network management like DNS and DHCP, and deployment procedures for virtual machine installations.
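As a rough plausibility check of the data-center figures, the sketch below derives per-rack averages from the totals quoted in the text. It assumes, purely for illustration, that the 900 kW theoretical peak cooling capacity and the 2,350 height units are spread evenly over the 50 racks; the report does not say how they are actually distributed.

```python
# Per-rack averages derived from the totals quoted in the text.
# Assumption (not from the report): capacity is spread evenly over all racks.

racks = 50
total_height_units = 2350
peak_cooling_kw = 900.0
reported_density_kw = 15.0   # "more than 15 kW per server rack"

cooling_per_rack_kw = peak_cooling_kw / racks        # average cooling budget
height_units_per_rack = total_height_units / racks   # average rack height
headroom_kw = cooling_per_rack_kw - reported_density_kw

print(f"{cooling_per_rack_kw:.0f} kW cooling per rack on average")
print(f"{height_units_per_rack:.0f} height units per rack on average")
print(f"{headroom_kw:.0f} kW average headroom above 15 kW")
```

Under the even-spread assumption, each rack has about 18 kW of theoretical cooling, which is in line with the statement that an operational density of more than 15 kW per rack is close to the limit.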
Economic Situation in 2017

In 2017, the total income of ZIB amounted to €18.5 million. The main part of this was made available by the Federal State of Berlin as the basic financial stock of ZIB (€8.4 million), including investments and Berlin’s part of the budget of HLRN at ZIB. A similarly large part resulted from third-party funds (€8.0 million) acquired by ZIB from public funding agencies (mainly DFG and BMBF) and via industrial research projects. This was complemented by a variety of further grants like the budgets of BRAIN (State of Berlin) and KOBV (mixed funding) as well as the part of the HLRN budget made available by other German states.

ZIB INCOME
Core budget by State of Berlin | €8,400,000 | 45%
Third-party funds | €8,000,000 | 43%
Further grants | €2,100,000 | 12%
The Zuse Institute Berlin (ZIB) finances its scientific work via three main sources: the basic financial stock of the Federal State of Berlin, third-party funds from public sponsors, and third-party funds from industrial cooperation contracts. In 2017, ZIB raised third-party funding through a large number of projects. Project-related public third-party funds rose from €5,487 thousand in 2016 to €6,147 thousand in 2017, while industrial third-party projects declined from €2,371 thousand to €1,749 thousand. In total, €7,896 thousand of third-party funding marked a new record in ZIB’s history and an increase for the sixth year in a row.

[Figure: ZIB third-party funds in euros, 2007 to 2017, split into industry and public funds; vertical axis from €0 to €9,000,000.]

ZIB THIRD-PARTY FUNDS BY SOURCE
BMBF incl. FC MODAL | €2,992,033 | 38%
Other public funds | €1,945,551 | 25%
Industry | €1,748,496 | 22%
DFG | €1,102,968 | 14%
EU | €106,628 | 1%
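The breakdown by source can be cross-checked against the totals quoted in the text. The sketch below sums the five amounts, recomputes the rounded percentage shares, and derives the public share as everything except industry. The dictionary layout and variable names are illustrative only.

```python
# Cross-check of the third-party funding breakdown (amounts in euros).
funds = {
    "BMBF incl. FC MODAL": 2_992_033,
    "Other public funds": 1_945_551,
    "Industry": 1_748_496,
    "DFG": 1_102_968,
    "EU": 106_628,
}

total = sum(funds.values())

# Percentage share of each source, rounded to whole percent.
shares = {source: round(100 * amount / total) for source, amount in funds.items()}

# Public funds are taken here to be everything except industry.
public_total = total - funds["Industry"]

print(f"total: {total:,}")
print(f"public: {public_total:,}")
for source, pct in shares.items():
    print(f"{source}: {pct}%")
```

The sum comes to €7,895,676 and the public part to €6,147,180, matching the rounded figures of €7,896 thousand and €6,147 thousand given in the text.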
Spin-Offs | Number of Employees
SPIN-OFFS

COMPUTING IN TECHNOLOGY GMBH (CIT)
1992 | www.cit-wulkow.de
Mathematical modeling and development of numerical software for technical chemistry

RISK-CONSULTING PROF. DR. WEYER GMBH
1994 | www.risk-consulting.de
Database marketing for insurance companies

INTRANETZ GMBH
1996 | www.intranetz.de
Software development for logistics, database publishing, and e-government

AKTUARDATA GMBH
1998 | www.aktuardata.de
Development and distribution of risk-evaluation systems in health insurance

VISAGE IMAGING GMBH
(Originating from the ZIB spin-off Visual Concepts GmbH)
1999 | www.visageimaging.com
Advanced visualization solutions for diagnostic imaging

ATESIO GMBH
2000 | www.atesio.de
Development of software and consulting for planning, configuration, and optimization of telecommunication networks

BIT-SIDE GMBH
2000 | www.bit-side.com
Telecommunication applications and visualization

DRES. LÖBEL, BORNDÖRFER & WEIDER GBR / LBW OPTIMIZATION GMBH
2000 | www.lbw-berlin.de
Optimization and consulting in public transport
LBW Optimization GmbH was founded in 2017 and is a spin-off of LBW GbR

LENNÉ 3D GMBH
2005 | www.lenne3d.com
3-D landscape visualization, software development, and services

JCMWAVE GMBH
2006 | www.jcmwave.com
Simulation software for optical components

ONSCALE SOLUTIONS GMBH
2006 | www.onscale.de
Software development, consulting, and services for parallel and distributed storage and computing systems

LAUBWERK GMBH
2009 | www.laubwerk.com
Construction of digital plant models

1000SHAPES GMBH
2010 | www.1000shapes.com
Statistical shape analysis

TASK – Berthold Gleixner Heinz Koch GbR
2010
Distribution, services, and consulting for ZIB’s optimization suite

QUOBYTE INC.
2013 | www.quobyte.com
Quobyte develops carrier-grade storage software that runs on off-the-shelf hardware

KEYLIGHT GMBH
2015 | www.keylight.de
Keylight develops scalable real-time Web services and intuitive apps. The focus is on proximity, marketing, iBeacon, and Eddystone for interactive business models

DOLOPHARM BIOSCIENCES UG
2017
A specialty pharmaceutical company focused on the clinical and commercial development of new products in pain management that meet the needs of acute and chronic care practitioners and their patients

NUMBER OF EMPLOYEES

In 2017, 231 people were employed at ZIB; of these, 171 positions were financed by third-party funds.

Group | 1/1/2017 Permanent | 1/1/2017 Temporary | 1/1/2017 Total | 1/1/2018 Permanent | 1/1/2018 Temporary | 1/1/2018 Total
Management | 3 | 0 | 3 | 3 | 0 | 3
Scientists | 15 | 99 | 114 | 15 | 100 | 115
Service personnel | 38 | 10 | 48 | 34 | 10 | 44
KOBV headquarters | 8 | 8 | 16 | 8 | 9 | 17
Students | 0 | 55 | 55 | 0 | 52 | 52
Total | 64 | 172 | 236 | 60 | 171 | 231
Dr. Steffen Prohaska | [email protected] | +49-30-84185-337

SPATIAL AND DYNAMIC STRUCTURE OF LIFE

How cells organize patterns in space and time

Researchers at ZIB develop data analysis and mathematical modeling techniques and apply them in cooperation with biologists in order to gain insight into fundamental principles of biological processes, such as cell division, protein regulation, and brain formation and activity.
UNDERSTANDING MICROTUBULE ORGANIZATION DURING CELL DIVISION

Microtubules are tube-like polymers that play an important role in cell division. They form the spindle, which is a microtubule assembly that segregates the chromosomes, see figure 1 (a,b). Electron tomography is the preferred technique to acquire the most detailed spatial information about microtubules. The cells are prepared for electron microscopy as 300 nm thick serial sections (slices). With the application of image-processing techniques, we are able to automatically extract the microtubules within a section [1,2,3] (see figure 1). Furthermore, we have been developing reconstruction methods for joining multiple sections to create a geometric model of the entire spindle [4], see figure 1 (c,d). Our integration of automated techniques with real-time user interactions allows handling of practical challenges during day-to-day research, such as varying tomogram quality.

In cooperation with TU Dresden, we visually and quantitatively analyzed entire microtubule spindles of C. elegans for the first time [5]. We identified different classes of microtubules and analyzed geometrical properties, such as length and density distributions. In particular, we investigated potential microtubule interactions based on their spatial arrangement. One major finding is that microtubules do not directly connect chromosomes and centrosomes, but form an interactive dynamic network. Currently, the full reconstruction and analysis pipeline is being adapted for upcoming experiments that will study spindles in different phases of cell division, spindles in mutant organisms, meiotic spindles, and spindles of different species.

Figure 1: The mitotic spindle during the metaphase of cell division in C. elegans. (a) Schematic of the microtubule spindle (green) and the chromosomes (blue). (b) Electron-microscopy image of the full cell. (c) Complete stitched electron-tomography serial-section stack (120 GB). (d) Segmented microtubules (green, red) and chromosomes (blue).
CELLULAR REACTION KINETICS

Figure 3: Gene expression in a eukaryotic cell. In the nucleus, the DNA is transcribed into messenger RNA (mRNA), which is then exported to the cytoplasm and translated into proteins. Such cellular reaction kinetics can appropriately be characterized by the spatiotemporal chemical master equation [8]. (Diagram labels: nucleus with DNA, transcription, and mRNA; export; cytoplasm with mRNA, translation, and protein.)

Formula 2: Chemical master equation (CME) for well-mixed stochastic reaction kinetics. It defines the temporal evolution of the probability P(x,t) to find the system at time t in a certain state x, given the reaction propensities k.
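For reference, the CME described in the caption above is commonly written in the following form. This is a textbook sketch, not necessarily the formula as typeset in the report: the reaction index k, the propensity functions a_k, and the state-change vectors \nu_k are notation assumed here.

```latex
\frac{\partial}{\partial t} P(x,t)
  = \sum_{k=1}^{K} \Bigl[ a_k(x-\nu_k)\, P(x-\nu_k,t) \;-\; a_k(x)\, P(x,t) \Bigr]
```

Here x is the vector of molecule counts, a_k(x) is the propensity of reaction k in state x, and \nu_k is the change in x caused by one firing of reaction k. There is one such equation per reachable state x, which is why the system becomes infinite whenever copy numbers are unbounded.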
Accurate modeling of reaction kinetics is important to understand how biological cells work. Spatially well-mixed reaction dynamics can be modeled by the chemical master equation (CME, see formula 2), an infinite set of ordinary differential equations, which is, in general, too complex to be solved analytically. There are accurate numerical simulation schemes for solving the CME indirectly, like Gillespie’s stochastic simulation algorithm [6]. For many relevant realistic settings, however, even our high-performance computers fail to create reliable statistics within an acceptable amount of time. This is the motivation to reduce the model complexity by considering approximative mathematical formulations of the cellular dynamics. Multiscale reaction systems, which often appear in real-world applications, are a particular focus of our investigations because they require new combinations of existing approximation methods.

An example of a cellular reaction process that involves cascades of particle numbers is the process of gene expression, which is relevant for all known life. See figure 3 for an illustration. The information encoded in a gene is used for the synthesis of functional gene products, the proteins, which typically arise in bursts with much higher abundance than all other involved reactive species. In collaboration with the DFG research center CRC 1114 “Scaling cascades in complex systems,” we applied new hybrid methods to efficiently simulate the gene expression process and showed that, in contrast to classical uniform approximation methods, advanced hybrid schemes are able to reproduce the characteristic patterns of the cellular dynamics [7].

But what can we do if the central well-mixed assumption underlying the CME is broken and some spatial resolution is needed to capture the dynamics? Based on mathematical theory, we have been developing suitable extended models, like the spatiotemporal CME [8], to include the necessary spatial information, with the level of detail adapted to the reactive system under consideration. A eukaryotic cell, for example, naturally decomposes into several compartments (e.g. the nucleus and parts of the cytoplasm, see figure 3), which both have a central meaning for the process of gene expression. By making use of this natural cellular structure, we obtain models that are computationally practical and at the same time close to the reality of a biological cell. Spatiotemporal hybrid models form the basis for the simulation of cellular reaction kinetics in all relevant details; their use in massive simulation environments, however, still poses significant challenges that will be the motivation for further research in this direction at ZIB.
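To make the stochastic simulation algorithm mentioned above concrete, the following is a minimal sketch of Gillespie’s direct method applied to a toy gene expression network (transcription, translation, and decay of mRNA and protein). The four reactions, the rate constants, and the function name are illustrative assumptions; this is not the hybrid scheme of [7].

```python
import random


def gillespie_gene_expression(t_end, k_tx=1.0, k_tl=5.0, d_m=0.5, d_p=0.1, seed=0):
    """Simulate mRNA (m) and protein (p) counts with Gillespie's direct method.

    Reactions (rate constants are illustrative assumptions):
      0: DNA     -> DNA + mRNA      propensity k_tx
      1: mRNA    -> mRNA + protein  propensity k_tl * m
      2: mRNA    -> (degraded)      propensity d_m * m
      3: protein -> (degraded)      propensity d_p * p
    Returns a list of (time, m, p) tuples, one per reaction event.
    """
    rng = random.Random(seed)
    changes = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # effect of each reaction on (m, p)
    t, m, p = 0.0, 0, 0
    trajectory = [(t, m, p)]
    while True:
        propensities = [k_tx, k_tl * m, d_m * m, d_p * p]
        total = sum(propensities)  # always > 0 because k_tx > 0
        # The waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its propensity.
        k = rng.choices(range(4), weights=propensities)[0]
        dm, dp = changes[k]
        m += dm
        p += dp
        trajectory.append((t, m, p))
    return trajectory
```

Every single reaction event is simulated explicitly, so the cost grows with the number of events. With abundant proteins and fast translation, most of the effort goes into the high-copy species, which is the bottleneck that motivates the approximate and hybrid formulations discussed above.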
3-D BRAIN MODELING TO UNDERSTAND SENSORY INFORMATION PROCESSING

How does the brain process information from our senses, and how does this ultimately lead to specific behavior? To investigate this question, we, together with our collaborators at CAESAR Bonn, created an anatomically realistic model of the part of the rat brain that processes information from the whisker hairs on the animal’s snout. The model consists of a large network of interconnected neurons.

In the recent past, we developed methods to count the number of neurons in the network from 3-D microscope images [9]. The synaptic connectivity between them, called the “connectome,” cannot, however, be directly recovered from image data at this scale. Therefore, we estimate the number of synapses and their location on the individual neurons based on morphological features, in particular spatial overlap between signal-sending axons and signal-receiving dendrites (figure 5) [10]. This predicted connectivity turned out to match available measurements well.

Based on this realistic model, neuron simulations can be performed, which may provide insight into the function of each neuron type in the network [11]. Also, mathematical models, for example, describing population dynamics [12], can be tested using this model, and may lead to a better understanding of biological mechanisms involved.

Currently, we are developing a Web application to make this model publicly available to the research community. Neuroscientists can perform in-silico experiments by interactively defining queries that extract specific information from this large dataset on the fly (figure 5).

Within the Priority Programme “Computational Connectomics” of the DFG (German Research Foundation) starting 2018, we aim to use a data-driven Bayesian approach to investigate more in depth what anatomical properties underlie synapse formation in order to make even more accurate predictions of network connectivity.

Figure 4: Single “spiny pyramid” neuron in a cortical column (transparent cylinder) in the rat brain, with its dendrites (red) and axons (gray). Synapse density (yellow) indicates where neurons of this type receive information through synaptic contacts. Data: M. Oberlaender (CAESAR Bonn).

Figure 5: Modeling and analysis pipeline, consisting of an offline and an online part. The 3-D neural-network model is created using the Amira visualization software. The large connectome dataset is precomputed on a batch cluster. A Web application, intended for the research community and implemented using state-of-the-art Web frameworks and cloud technologies, provides a query interface to extract and analyze on the fly specific subsets of the large dataset. (Diagram labels. Offline modeling and batch processing: 3-D neural network (Amira), connectome computation (HTC cluster). Online interactive querying and analysis: compute server (Openstack), Web browser (Meteor, D3, ThreeJS), Web server (Galaxy), storage and DB server (S3, MongoDB).)
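The idea of estimating synapse numbers from the spatial overlap of axons and dendrites can be illustrated with a small sketch. It assumes a simple voxel-wise rule: the presynaptic sites (boutons) of an axon in a voxel are shared among all dendrites present in that voxel in proportion to their postsynaptic target density. The rule, the data layout (dictionaries keyed by voxel index), and all names are assumptions for illustration, not necessarily the method of [10].

```python
def expected_synapses(axon_boutons, dendrite_targets, all_targets):
    """Estimate the expected number of synapses from one axon onto one dendrite.

    axon_boutons:     voxel -> number of boutons of the presynaptic axon
    dendrite_targets: voxel -> postsynaptic target density of the chosen dendrite
    all_targets:      voxel -> summed target density of ALL dendrites in the voxel
    (All three inputs are hypothetical data structures for this sketch.)
    """
    total = 0.0
    for voxel, boutons in axon_boutons.items():
        pool = all_targets.get(voxel, 0.0)
        if pool > 0:
            # The dendrite receives a share of the boutons equal to its
            # share of the available postsynaptic targets in this voxel.
            total += boutons * dendrite_targets.get(voxel, 0.0) / pool
    return total
```

Voxels in which axon and dendrite do not both occur contribute nothing, so the estimate is driven entirely by where the two morphologies overlap.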
STOCHASTICITY DRIVING PATTERN FORMATION IN BRAIN WIRING

Neurons form specific patterns in a very robust and reliable way during brain development. How axons and dendrites find the appropriate synaptic partners has been studied for decades. But the question is posed today with a new twist.

AXON GUIDANCE REVISITED

For the last 70 years, the dominant model of axon guidance has been based on global concentration gradients of guidance molecules. During axon growth, filopodia, small extrusions of the growth cone, sprout in different directions, as is visible in microscopy images. The general assumption has been that the filopodia sample the surroundings for chemical gradients of guidance molecules, allowing them to find the right direction despite the stochasticity of molecule sensing.

With new microscopy technology, the group of R. Hiesinger (FU) is able to acquire in-vivo 4-D movies of axons growing in drosophila brains, which shed new light on the brain wiring process. The filopodia dynamics feature a much richer structure than would be necessary for stochastic gradient sampling (figure 6). Currently, the role of these complex dynamics for brain wiring is essentially unknown.

LOOKING FOR SIMPLICITY

The genome is too small for encoding brain wiring explicitly. Thus, the pattern must emerge from rather simple regulatory mechanisms encoded in the genome [13]. One compelling hypothesis is that these developmental rules do not only tolerate randomness in the axons’ environment, but use stochasticity as a driving force and to achieve robustness.

In a joint Matheon project with M. von Kleist (FU), we aim at identifying mechanistic models that are both physically plausible and able to reproduce observed wiring patterns and the statistics of filopodia dynamics (figure 7). The simpler and more general the mechanisms forming such a model are, the more one can expect that structurally similar processes are actually driving the neural development.

As a first candidate, we consider a model comprising three essential components. First, diffusion and decay of guidance molecules in the extracellular space seem necessary for inter-axon communication. Second, reception of guidance molecules by the filopodia is a stochastic process due to the low concentration. Finally, nonlinear reactions within the axons, affected by the sensing of guidance molecules, control growth and retraction of filopodia as well as release of guidance molecules [14]. For simulating such models, solvers for deterministic partial differential equations describing diffusion, reaction, and transport processes need to be coupled to a stochastic simulation algorithm [6] capturing the random events of guidance molecule reception and filopodia growth.

One of the simplest models of this type, containing just a single type of guidance molecule, can already create robust and quasi-regular space-filling axon structures that avoid self-contact as well as neighbor contact (figure 8).

Figure 6: Four examples of growth cones with extending filopodia of varying number and shape, displayed as volume renderings for individual time points from in-vivo microscopy image time series. Data courtesy of R. Hiesinger.

Figure 7: Statistics of filopodia dynamics extracted from 4-D microscopy data. Length, a characteristic quantity, appears to be correlated with filopodia lifetime, and differs between mutants (green) and wild type (blue) for growth cones during synaptogenesis. Data courtesy of R. Hiesinger. (Plot panels: before synaptogenesis, during synaptogenesis; horizontal axis: filopodia lifetime classes [min], <8, 8-30, >30; vertical axis: average length; series: wild type, mutant.)

Figure 8: 25 axons in a spatially periodic setting growing to a robust and quasi-regular space-filling but contact-avoiding pattern.
Dr. Daniel Baum | [email protected] | +49-30-84185-293

MATHEMATICS MEETS HUMANITIES

Image: © mapsland.com

Examples from ancient studies and psychology

For many years, the humanities and mathematics were considered as two sciences that have little to do with one another. During the last decade, however, researchers on both sides realized the potential lying in the respective other field. In this article, we present four examples illustrating this potential.
MODELING THE PAST: INNOVATION SPREADING IN THE PREHISTORIC WORLD

How can we understand processes that happened in prehistoric times when there were no written records of events? One possibility is to use available archaeological findings and build a good mathematical model that we can study. However, most of the existing archaeological data is sparse and uncertain; lots of information is unknown and there is no procedure to repeat the history and obtain new data. To deal with these problems, we work closely with researchers from the humanities, whose expertise enhances our studies and provides new links in our modeling approach.

Let us consider a more specific problem: How did innovations spread in the prehistoric world? One of the impactful innovations that spread from the Near East to Europe between 6200 and 3000 BC was the wool-bearing sheep. The change from herding hairy sheep to woolly sheep was an essential driving force for the later textile production. But the exact spreading path is unknown. Since the knowledge of an innovation and its usage was passed on from one person to another, we can expect the spreading path to be strongly connected to the human mobility dynamics. In order to study this, we built a mathematical model that allows us to simulate both human migration and innovation spreading. These simulations can be used to analyze dynamical properties of the two processes, explore how they are coupled, and discover the important factors that could have affected this innovation spreading.

Figure 1: Geographical area of our interest for woolly sheep spreading. (a) Suitability landscape constructed from environmental data; the assumed origin of the woolly sheep is around Tell Sabi Abyad (star). (b) Agents move in the suitability landscape and adopters (red) of the innovation can pass on the innovation to neighboring non-adopters (yellow) with a certain probability.
Where are we from? 3
Through a fruitful collaboration with experts from the Cluster of Excellence TOPOI, we gained access to the archaeological and environmental data. We used this data to build a suitability landscape that indicates how suitable a particular region was for herding woolly sheep in the respective period (see figure 1a). Then, we constructed an agent-based model (ABM) in order to simulate possible spreading scenarios. An ABM consists of a set of rules describing the behavior of agents and their interaction patterns with each other and the environment. In our model, an agent represents a group of people in the ancient world. Agents move in the suitability landscape by following the so-called dynamical equation, according to which they move freely but with a bias toward regions that are attractive for herding woolly sheep. At the same time, agents can interact socially with other agents in their vicinity and pass on the innovation with a certain probability. By applying this abstract model to the example of the woolly sheep (figure 2), we can generate time series for the spreading process (see figure 3) [1]. This opens up new research questions like: What are the best parameters of our model? How did the innovation spread between different geographical regions? With our research, we try to answer such questions and, thus, help understand processes in prehistoric times.

Figure 2: Snapshots of one realization of the wool-bearing sheep innovation spreading in the prehistoric world. (Panels: 6200 BC, 5900 BC, 5000 BC, 4000 BC, 3000 BC.) Image: © Georg Mittenecker

INFO BOX

This research is done in collaboration with Brigitta Schütt (Freie Universität Berlin, FB Geowissenschaften and TOPOI, Berlin, Germany), Wolfram Schier (Freie Universität Berlin, FB Geschichts- und Kulturwissenschaften and TOPOI, Berlin, Germany), Daniel Fürstenau (Freie Universität Berlin, FB Wirtschaftswissenschaft and Einstein Center Digital Future, Berlin, Germany), and their coworkers. It has been partially funded by the Cluster of Excellence TOPOI – The Formation and Transformation of Space and Knowledge in Ancient Civilizations and ECMath (Einstein Center for Mathematics Berlin).
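The two rules of the agent-based model described above (biased movement in a suitability landscape and probabilistic passing-on of the innovation between nearby agents) can be sketched in a few lines. Everything concrete below is an assumption for illustration: the single-peak toy landscape, the Euler-Maruyama discretization of a drift-plus-noise movement rule, and all parameter values and function names. It is not the calibrated model of [1].

```python
import math
import random


def suitability_gradient(x, y, cx=5.0, cy=5.0, width=3.0):
    """Gradient of a toy suitability S = exp(-r^2 / (2 width^2)) peaked at (cx, cy)."""
    s = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * width ** 2))
    return (-(x - cx) / width ** 2 * s, -(y - cy) / width ** 2 * s)


def simulate(n_agents=50, steps=200, dt=0.1, sigma=0.5, radius=0.5,
             p_adopt=0.3, seed=0):
    """Return the number of adopters after each step (time series of the spreading)."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n_agents)]
    adopted = [False] * n_agents
    adopted[0] = True  # a single initial adopter
    history = [1]
    for _ in range(steps):
        # Movement: drift toward more suitable regions plus random diffusion.
        moved = []
        for x, y in pos:
            gx, gy = suitability_gradient(x, y)
            x += gx * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1)
            y += gy * dt + sigma * math.sqrt(dt) * rng.gauss(0, 1)
            moved.append((x, y))
        pos = moved
        # Social interaction: adopters may pass the innovation to nearby non-adopters.
        newly = set()
        for i in range(n_agents):
            if not adopted[i]:
                continue
            for j in range(n_agents):
                if adopted[j] or j in newly:
                    continue
                if math.dist(pos[i], pos[j]) <= radius and rng.random() < p_adopt:
                    newly.add(j)
        for j in newly:
            adopted[j] = True
        history.append(sum(adopted))
    return history
```

Running `simulate` with different seeds gives different realizations of the same process, and averaging the resulting adopter counts over many runs is one way to obtain the kind of spreading time series mentioned in the text.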
ACCESSING HIDDEN TEXTS IN EGYPTIAN PAPYRI

One of the best sources of information about our cultural origin is provided by written texts. For example, excavations in Elephantine, a small island in the Nile in Egypt, brought to light large quantities of papyri, telling us 4,000 years of cultural history of various religious, ethnic, and linguistic groups that lived there [2,3]. The papyri are typically rolled or folded (see figure 4). However, for papyri that are too fragile to be physically unfolded or unrolled, the writings are inaccessible.

Our aim is to make such papyri readable. For this, we acquire a 3-D tomographic image of the papyrus, reconstruct it geometrically on the computer, virtually unfold and unroll it, and finally visualize the writing on it [4]. A prerequisite is that the tomographic method depicts the writing with a sufficiently high contrast. For documents written with ferrous ink, this is the case when using X-ray-based tomography.

For virtual unrolling, we depict cross sections in high detail, allowing the user to manually define closed polylines that approximate the contours (figure 6, left); using interpolation and extrapolation, similar contours are computed in other planes (figure 6, right). Ensuring that the number of points in each polyline is equal, a quadrangular 2-D mesh is implicitly defined, representing a surface; this surface is flattened such that differences in the distances between neighboring contour points are minimized (figure 5).

For virtual unfolding, each step of the physical folding process is virtually undone. After a series of steps, the unfolded papyrus becomes topologically equivalent to a roll, which is then unrolled, as described before. In each step, we provide the user with an overview of the current state using volume rendering (see figure 7, left); furthermore, we display cross sections in high detail in which the user can interactively draw polylines. For the actual unfolding, Moving Least Squares deformation is applied based on these polylines to unfold the package as rigidly as possible. After all folds have virtually been undone, we unroll the roll as described previously. In a final step, an equalization is performed that largely eliminates all distortions that accumulated during the whole procedure. Then the writing is visualized using contrast-enhancing techniques (figure 7, right).

Only some papyri are written with ferrous ink; for the majority, soot ink was used. For this case, alternative tomographic procedures are currently being investigated, in the hope of finding one that offers sufficient contrast.

Figure 4: Left: Photograph of sealed legal texts. Right: Physically unfolded package. (ÄMP Berlin, SMB)

Figure 5: (a) Volume rendering of CT scan of rolled mock-up papyrus. (b) Virtually unfolded mock-up from CT scan seen on the left.

Figure 6: Virtual unrolling of a papyrus roll. Right: Red contours are interactively set; yellow and blue contours are computed by inter- and extrapolation. Left: Depiction of a cross section. (Diagram labels: extrapolation, slice i, interpolation, slice j, interpolation, slice k, extrapolation.)

Figure 7: Left: Computer-tomographic (CT) scan of folded mock-up papyrus. Right: Virtually unfolded papyrus from CT scan seen on the left.

INFO BOX

This project was carried out in collaboration with the archaeologist Verena M. Lepper (Ägyptisches Museum und Papyrussammlung (ÄMP) – Staatliche Museen zu Berlin (SMB) – Stiftung Preußischer Kulturbesitz), the physicists Heinz-Eberhard Mahnke (Freie Universität Berlin, FB Physik and TOPOI, Berlin, Germany) and Ingo Manke (Helmholtz-Zentrum Berlin), and their coworkers. It was financially supported by the Starting Grant ELEPHANTINE of the European Research Council (ERC), the Beauftragte der Bundesregierung für Kultur und Medien (BKM), and the Cluster of Excellence TOPOI (The Formation and Transformation of Space and Knowledge in Ancient Civilizations) of the Deutsche Forschungsgemeinschaft (DFG).
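The contour handling used for virtual unrolling can be illustrated with a small sketch. It assumes that contours are lists of 2-D points with equal point counts, that contours in other planes are obtained by linear blending between two user-defined slices (weights outside [0, 1] give extrapolation), and that one contour is flattened simply by laying its points out along a line at their cumulative arc length. The last step is a much simpler stand-in for the distortion-minimizing flattening described above, and all function names are hypothetical.

```python
import math


def interpolate_contour(c_i, c_k, z_i, z_k, z):
    """Linearly blend two contours with equal point counts for the slice at height z.

    For z between z_i and z_k this interpolates; outside that range it extrapolates.
    """
    if len(c_i) != len(c_k):
        raise ValueError("contours must have the same number of points")
    w = (z - z_i) / (z_k - z_i)
    return [((1 - w) * xa + w * xb, (1 - w) * ya + w * yb)
            for (xa, ya), (xb, yb) in zip(c_i, c_k)]


def flatten_by_arc_length(contour):
    """Place contour points on a straight line, preserving neighbor-to-neighbor distances."""
    u = [0.0]
    for p, q in zip(contour, contour[1:]):
        u.append(u[-1] + math.dist(p, q))
    return u
```

Because every contour has the same number of points, point n of slice i and point n of the next slice are natural neighbors, which is what makes the quadrangular mesh implicit: row index is the slice, column index is the point number along the contour.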
3-D MESH MORPHING FOR PSYCHOLOGICAL EXPERIMENTS

Understanding how object recognition and categorization is performed by the human brain is a topic that has been studied in psychology for many years. As a result, it is widely believed that for this task our brain builds up a mental representation of objects. Studies have shown that such mental representations are not fixed but remain flexible and are subject to continuous short- and long-term adaptation. The majority of such studies have so far concentrated on human faces, the recognition of which plays a very essential role in our daily life. Hence, it is not clear how these findings generalize to the recognition and categorization of other objects.

Therefore, in an interdisciplinary project, perceptual psychologists and biologists in the Cluster of Excellence “Image Knowledge Gestaltung” have been investigating object categorization on the biological categories of crabs and lobsters. In particular, they are investigating how a previously seen input stimulus influences the categorization decision of a test person. This effect is known as adaptation effect and has been shown to exist for the recognition of human faces [6]. For their study, they produced images in which categorization-relevant features of crabs and lobsters were systematically morphed using standard 2-D image-processing software. This resulted in intermediate images between a crab and a lobster that were then used to study the adaptation effect [5]. Unfortunately, these images showed a rather poor resemblance with real crabs and lobsters, which may bias the experimental results.

To create more realistic-looking images, we utilize 3-D shape morphing using state-of-the-art methods from differential geometry. This allows us to easily create natural-looking 3-D shapes that are in between a crab and a lobster (see figure 8). Controlled psychological experiments, however, require local mesh morphing that transforms one part (e.g. the carapace) of the animal at a time. To achieve this goal, we developed an approach that combines two previously proposed methods for mesh morphing to obtain a natural-looking transformation of selected parts of the mesh, while keeping the rest of the mesh as stable as possible (see figure 9). From the interpolated representations, standardized images were created that, even though artificial, resemble real crabs and lobsters much better. Currently, new psychological experiments are being carried out using these new images, thus greatly improving the experimental conditions.

Figure 8: Global morphing from a lobster (far left) to a crab (far right).

Figure 9: Local morphing of the carapace from a lobster (far left) to the carapace of a crab (far right) while keeping all other parts fixed as in the lobster. Even though only the carapace is modified, the images show a smooth transition of the whole object without any noticeable artifacts at the joints between the carapace and other parts of the body.

Figure 10: Overlay of images representing a global morphing from a lobster to a crab.

INFO BOX

The project is done in collaboration with the biologist Gerhard Scholtz (HU Berlin) and the psychologists Torsten Schubert (University Halle) and Antonia Reindl (Cluster of Excellence Image Knowledge Gestaltung). It is financially supported by the Cluster of Excellence “Image Knowledge Gestaltung”.
FACIAL MORPHOLOGY AND ITS APPLICATIONS

Our face has an immense significance for interpersonal communication. Sympathy and antipathy for others often depend on our first impression. Already within milliseconds, stereotypes may have been triggered by an opponent’s face. The perception of facial expressions determines our interpretation of statements (see figure 10). Similarly, we always use facial expressions, consciously or unconsciously, to emphasize what we say. Therefore, the face and its perception have been the subject of art and science for centuries and are still intensively researched.

People with congenital or acquired facial malformations, or palsy due to nerve damage, are often stigmatized. An aim of facial plastic surgery is to correct such dysplasia in view of a normal or natural facial anatomy. To plan and perform surgical modifications of faces in such a way that both desired aesthetic visual appearance and expressions are considered is still a challenge.

A small percentage of the population suffers from an autism-spectrum disorder. Those people often have difficulties in interpreting facial expressions. It is difficult for them to imagine what other people think or feel, which causes impairments of social interaction. The training of expression interpretation and production therefore supports their social skills. Another small group of the population has difficulties in recognizing familiar faces. They suffer from so-called face blindness (prosopagnosia), which is a neurological disorder of the brain that needs to be identified and thoroughly tested for any treatment.

Together with our collaborators from anthropology, we have studied regular facial development in male and female children. Nonlinear growth trajectories were determined for the use in the early diagnosis of syndromes. Similarly, typical aging effects like facial sagging have been investigated morphometrically. For the analysis and synthesis of facial expressions (computational facial morphology), we have established a 3-D morphable model that is able to synthesize various individual nuances beyond typical expression categories [6]. Their significance for expression perception is being investigated in a broad user study as part of the Mimik-Explorer shown at the German Hygiene Museum in Dresden.

In collaboration with psychologists, we are developing novel experiments for the investigation of self-representation based on virtual-reality techniques. Our goal is to gain understanding of the role of facial identity in (virtual) communication. For the planning of facial surgery, our work allows the assessment of expressions before and after treatments [7]. For example, in facial palsy, a detailed planning of facial nerve reanimation along with an objective follow-up evaluation method can be established.

Figure 10: We have probably all encountered the experience of drawing conclusions from interpreting other people’s facial expressions. Disputes up to serious conflicts may have arisen from a (mis)interpretation of facial expressions. Hence, facial expressions and their deliberate use are intensively trained, not only in acting but also for successful negotiations (e.g. in business and politics). Image source: https://twitter.com/realdonaldtrump

Figure 11: Expressions are formed by the actions of facial muscles. In our research, we are studying the resulting effects on the facial surface.

INFO BOX

With applications in anthropology, medicine, psychology, and affective computing in mind, we target questions like: (a) How do facial proportions vary in a normal or healthy range, (b) what kinds of growth and aging trajectories can be observed during childhood or in elderly people, (c) are there characteristic patterns for facial expressions giving rise to an objective classification scheme, and (d) is it possible to employ such a scheme to artificially synthesize plausible expressions? These are all driving questions in research that motivate an in-depth analysis of the almost endless diversity of real faces.
DIGITAL FACIAL MORPHOLOGY

COMPUTATIONAL ANALYSIS OF FACIAL MORPHOLOGY AND EXPRESSIONS

Computational facial morphology offers the opportunity to apply mathematical methods for the analysis of facial shape and proportions as well as skin appearance and expression. Based on image and geometry processing, statistics, and machine learning, the relation of facial morphology to attributes like sex, age, culture, and so forth, can be studied. Similarly, characteristic patterns describing the change in shape and appearance due to expressions can be precisely extracted and quantified from a representative set of facial performances. Our research comprises (1) the development of new techniques for the acquisition of high-resolution 3-D faces, (2) their integration into a large-scale database, and, using this, (3) establishing a foundation of morphological analysis as well as applications in psychology and medicine.

CAMERA FACIALIS – 3-D FACIAL PERFORMANCE CAPTURE

Human face perception is particularly sensitive to the subtlest details. For the acquisition of high-resolution facial details including small wrinkles resulting from micro expressions, the 3-D portrait studio Camera Facialis (see figure 12) has been established at ZIB [8]. In contrast to expensive and laborious commercial setups common in entertainment productions and media art, Camera Facialis has been built on top of affordable optical devices and commodity hardware to enable ultrafast, high-fidelity measurements for a very detailed capture of facial performances in a large-scale manner (see figure 14).

ZIB 3-D FACE DATABASE

Photometric reconstruction of 3-D faces is usually a tedious manual process even for expert users. To establish a comprehensive large-scale 3-D face database, we have developed a fully automated reconstruction pipeline for high-throughput data acquisition. This has been achieved by consistent integration of statistical shape templates. Detailed 3-D facial scans result from specifically optimized algorithms [9]. That way, several thousand faces have already been integrated into the 3-D face database, which is continuously extended for digital facial morphology.

DIGITAL 3-D FACIAL MORPHOLOGY

The statistical analysis of shape and appearance requires a well-defined mathematical framework. Descriptors are needed to transform a face into a typically high-dimensional space, where similarities or distances, correlations, or clusters can be analyzed. At ZIB, we are developing correspondence-based descriptors that take the Riemannian structure of shape spaces into account [10]. We enable the analysis of facial features up to the level of wrinkles and skin pores by establishing accurate dense correspondence with our techniques (see figure 15) [6].

Figure 12: The 3-D portrait studio Camera Facialis uses multiview photogrammetry to capture high-resolution 3-D facial performances.

Figure 13: Expression scans from the ZIB 3-D Face Database with photographic texture containing high-resolution skin details.

Figure 14: Nuances of facial expressions synthesized using expression patterns determined by statistical shape analysis.

Figure 15: In the Camera Facialis, 3-D face scans in dense correspondence to an animatable surface mesh are acquired. On top of a neutral face scan (left), arbitrary expression patterns extracted from the ZIB face database can be transferred, for instance, the smile expression shown on the right.
SOFTWARE SUSTAINABILITY IN THE AGE OF OPEN SCIENCE

Prof. Dr. Thorsten Koch | [email protected] | +49-30-84185-213
When mathematicians combine mathematical theory with real-world applications, mathematical research software complements chalk and blackboard, or pencil and paper. Today, mathematical software plays a central role in research and key technologies and has an increasing impact on mathematical education.
Since its creation more than 30 years ago, ZIB has promoted this combination as a significant extension of scientific work. This has resulted in many software packages, some of which are of outstanding importance to the scientific community. Prominent examples today include SCIP in the area of optimization, KASKADE for numerical mathematics, and AMIRA for scientific visualization and data analysis. An overview of software developed at ZIB can be found here: http://www.zib.de/software.

Software Sustainability is a new working area that has been gaining momentum in the past few years. With the continuing shift toward data-centric science, the software utilized to produce the scientific output is increasingly regarded as an equally important “product of science” itself.

A general understanding is evolving that not only a publication, but also the documented and available datasets should get scientific credit. Moreover, there is a growing consensus in the research-data-management community that research software could be considered a dataset (i.e. research data). Publication and reuse of software are necessary to ensure validity and reproducibility and to drive further discoveries. Since a common research-data-management practice fosters publication along the principles of Open Access, software should also be made publicly available, ideally corresponding to the FAIR principles of research-data management: data should be Findable, Accessible, Interoperable, and Reusable.

Transferring these principles to the management of research software is a true challenge, especially because software is characterized by an exceptional variety of dependencies on other entities (hardware, OS, algorithms, etc.), which makes it very difficult to define an encapsulated status as a datum. In addition, software is a living entity that evolves over time. The concept of Software Sustainability involves development, deployment, maintenance, and publication efforts, with the aim of ensuring the ongoing functionality of research software. The incorporation of Software Sustainability methods at ZIB is part of increasing efforts to conduct research in accordance with open-science principles.

SOFTWARE SUSTAINABILITY AT ZIB

In order to establish sustainable software development practices, a variety of infrastructure and organizational measures have to be addressed. The following approaches, services, and project results form the basis and starting point for these measures.

In recent years there have been several activities to promote research-data management. One of the essential first steps was the formation of a focus group supporting the handling of research data, composed of researchers from different divisions and departments at ZIB. This group has started to identify common needs across all departments and divisions. It acts as a contact point concerning all questions related to research-data management, including the generation of data-management plans. Its most important action item is the publication of research data and research software.

A twofold strategy is being pursued to increase the visibility of produced software. In addition to the project page (i.e., the product page of the research software, which is maintained by the respective research group), all released versions of a software package are intended to be published as research data with the institutional repository OPUS (https://opus4.kobv.de/opus4-zib/home). In this context, the Kooperativer Bibliotheksverbund Berlin-Brandenburg (KOBV), part of ZIB since 1997, provides support for OPUS.

OPUS is utilized to present landing pages (called front doors in OPUS) for specific software with an additional set of metadata and a download link. The software packages will reside on a dedicated download server. OPUS itself is an open-source product of ZIB, published under a GNU General Public License. Current work on OPUS involves other important features with regard to open-science principles: automatic DOI minting and ORCID implementation.

A further step to achieve software sustainability at ZIB involves the use of the swMATH portal, which analyzes mathematical publications for software citations. The aim is to identify the connection between research software and scientific articles that present results based on the use of this software.

Another aspect of the long-term availability of research software is being addressed by the experimental application of digital preservation techniques. The OAIS-compliant digital preservation system EWIG at ZIB, which is composed of freely available open-source components, is used as a research-software source-code archive, accompanied by extensive metadata. To date, the incorporation of digital preservation strategies within the context of software sustainability has not been straightforward. There is a gap to be closed between software archiving and maintenance approaches on the one hand, and long-term aspects of digital curation of software on the other.

Figure 1: Search interface of ZIB OPUS institutional repository
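A landing page of the kind described for OPUS pairs a released software version with descriptive metadata and a download link. As a minimal sketch of what such a record might hold, the following Python data class checks which findability-related fields are still empty. All class names, field names, and the example package are illustrative assumptions; this is not the OPUS metadata schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Author:
    name: str
    orcid: Optional[str] = None  # hypothetical field: ORCID iD, if known


@dataclass
class SoftwareRelease:
    # All field names are illustrative assumptions, not the OPUS schema.
    title: str
    version: str
    download_url: str
    license: Optional[str] = None
    doi: Optional[str] = None
    authors: List[Author] = field(default_factory=list)

    def missing_metadata(self) -> List[str]:
        """Return the names of optional metadata items that are still empty."""
        missing = []
        if not self.doi:
            missing.append("doi")
        if not self.license:
            missing.append("license")
        if not self.authors:
            missing.append("authors")
        elif any(a.orcid is None for a in self.authors):
            missing.append("orcid")
        return missing


# Invented example release; the package name and URL are placeholders.
release = SoftwareRelease(
    title="ExampleSolver",
    version="1.0.0",
    download_url="https://example.org/downloads/examplesolver-1.0.0.tar.gz",
    authors=[Author(name="A. Researcher")],
)
print(release.missing_metadata())  # ['doi', 'license', 'orcid']
```

A check like this could run before a release is published, so that records without a persistent identifier or license are flagged early.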
SWMATH – AN OPEN-ACCESS DATABASE FOR MATHEMATICAL SOFTWARE

How does one find software for a specific mathematical problem? Is there already a solution or an implementation? What is the mathematical background? Who are the authors? What hardware do you need? Is there any documentation? Is it free for educational use or do you need a license? Moreover, it is also a real problem to cite software when writing a scientific paper. Where to find the software? Which version was used? Is it accessible? Can you reproduce the results?

The project swMATH (www.swmath.org) is an attempt to develop and establish an information service for mathematical software and mathematical research data. It started as a project of Mathematisches Forschungsinstitut Oberwolfach (MFO) and FIZ Karlsruhe, and is presently continued as a project of the Research Campus MODAL at ZIB. swMATH provides information about mathematical software and its mathematical background. It will improve the visibility of software and strengthen the role of software within mathematics. swMATH is focused on software, but benchmarks, data collections, and manuals are also listed.

CENTRAL IDEA: PUBLICATION-BASED APPROACH

The most informative and relevant secondary source for information about mathematical software is the corresponding scientific literature. Therefore, the most complete bibliographic database of mathematical literature, zbMATH (the former Zentralblatt MATH), is used as a basis to extract information about mathematical software. The fact that zbMATH covers almost all mathematical journals with a focus on mathematical software is of crucial importance.

zbMATH provides a review or a summary, characteristic key phrases, the reference lists, and classification of the mathematical subjects and application areas of mathematical publications. Information about the authors, the sources, and the language of the publication is also presented. Today, the database zbMATH stores the bibliographic data of nearly four million peer-reviewed mathematical publications with an increase of 10,000 items per month. In general, this approach ensures quality control.

ANALYZING ZBMATH

The manual maintenance of Web databases is an effort that is both expensive and time-consuming. Therefore, the use of machine-based methods for data analysis and content generation has been a significant aspect in the design of the swMATH service from its beginning. Heuristic methods have been developed to identify and analyze software information in zbMATH entries. The information regarding publications citing a software package is aggregated and provides a profile of the software and its context involving use cases, mathematical background, acceptance, life cycle, related software, and a list of publications citing the software. The publication-based approach is the basic step in the swMATH workflow and provides general information about mathematical software. But details, such as source code, versions, the technical environment, license information, documentation, manuals, or installation guides, as well as links to related benchmarks or data collections, are missing.

WEB-BASED APPROACH

This kind of information can be found on the websites related to the software, in repositories, or on portals, which provide information about and access to software for a special subject. Therefore, some concepts for capturing and analyzing further information about software from websites, software repositories, and Internet archives have been developed. In principle, we are faced with the same tasks as in the publication-based approach: identifying the relevant information and analyzing and structuring the information about a software package. But instead of publications, we have to extract it from information on the Web.

The swMATH pages represent an online portal to retrieve information regarding mathematical software: they provide general information about the software, namely the profile of a software package derived from the publication-based approach and links to other relevant Web resources that contain more detailed information. Each swMATH page has a unique identifier, which can be used for the citation of a software package.

FINDING SOFTWARE IN SWMATH

swMATH offers several ways to find software. Based on the referenced publications, swMATH analyzes the abstracts and the MSC classification for each software entry. This allows software to be searched by key word (e.g. “integer programming”) or by browsing through the MSC classification scheme (e.g. “90C10”). Additionally, swMATH classifies all entries by types, for example, programming languages, benchmarks, data collections, or Web services. Last but not least, you also have access to documentation and manuals wherever they are accessible.

An important side effect of analyzing research papers is the discovery of related software: if software S1 is referred to by paper P1 and paper P2 refers not only to software S1, but also to software S2, we conclude that S1 is related to S2. This helps to identify more than one possible software solution for solving a given problem.

HOW TO CITE SOFTWARE?

When analyzing zbMATH articles, you will find different citation styles for software. Software companies, repositories, and publishers give different recommendations for citing software or research data. swMATH will close this gap. In cooperation with related communities, e.g. the Software Citation Working Group of the FORCE11 Initiative, we are working on standards for citing software and research data.
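The related-software rule described for swMATH (two packages are related when a single paper refers to both, as with S1 and S2 in paper P2) can be illustrated with a short Python sketch. The input format, a mapping from paper identifiers to the software names each paper cites, and the function name are assumptions made for illustration; the report does not describe how swMATH implements this step.

```python
from collections import defaultdict
from itertools import combinations


def related_software(citations):
    """Map each software name to the set of packages co-cited with it.

    `citations` maps a paper identifier to the set of software names
    that the paper refers to (hypothetical input format).
    """
    related = defaultdict(set)
    for software_in_paper in citations.values():
        # Every pair of packages cited by the same paper is treated as related.
        for a, b in combinations(sorted(software_in_paper), 2):
            related[a].add(b)
            related[b].add(a)
    return dict(related)


# Toy data mirroring the example in the text: P1 cites S1; P2 cites S1 and S2.
citations = {
    "P1": {"S1"},
    "P2": {"S1", "S2"},
}
related = related_software(citations)
print(related)  # {'S1': {'S2'}, 'S2': {'S1'}}
```

Paper P1 contributes no pair because it cites only one package, so the relation between S1 and S2 comes entirely from P2.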
VISIBILITY OF AUTHORS

Careers in the scientific community are based on the visibility of papers published in peer-reviewed journals. But what about the contributions of software developers? Do they get any credit for writing software while helping to find solutions to a mathematical problem? Software developers play a different role from the authors of publications. swMATH lists the authors of a software package and builds a citation graph presenting the annual numbers of publications citing the software. This also characterizes the development stage and reflects the distribution and acceptance of the software. A large number of citing publications indicates that the software has been widely used and can be considered an indicator of the quality of the software.

If you take, for example, the SCIP Optimization Suite, you find more than 250 bibliographical entries in zbMATH that cite the software. The software packages with most citations are MATLAB (8,000 references), CPLEX, MAPLE, MATHEMATICA, and R, each with more than 4,000 references.

STATE OF THE ART AND CHALLENGES

Currently (i.e. December 2017), swMATH contains information about more than 20,000 software entries with nearly 270,000 software references in publications. swMATH is an innovative service for mathematical software citations and has unique features. The swMATH service is largely based on the automatic processing of information and requires little effort for maintenance. Usage statistics reveal increasing interest in the service.

Software information is challenging because it is highly dynamic and standards for software information, e.g. for software citations and metadata schemes, have been missing until now. The swMATH project working group will participate in the development of standards and will work both on new concept algorithms for a more complete recording of mathematical software and on improved content analysis of the information about a software package.

Figure 2: swMATH figures for software and zbMATH references 2015–2017. Series: software (20,000), zbMATH references (150,000).

Figure 3: Usage statistics for swmath.org 2015–2017 (Apache log file, Webalizer pages, no robots). Vertical axis: pages.
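The citation graph mentioned under Visibility of Authors shows, for one software package, the number of citing publications per year. A minimal Python sketch of that aggregation follows, assuming the citing publications are available as (identifier, year) pairs; this input format, the function name, and the toy records are assumptions for illustration and are not taken from swMATH or zbMATH.

```python
from collections import Counter


def citations_per_year(citing_publications):
    """Count citing publications per publication year.

    `citing_publications` is an iterable of (publication_id, year) pairs
    (hypothetical input format); a repeated identifier is counted once.
    """
    seen = set()
    counts = Counter()
    for pub_id, year in citing_publications:
        if pub_id in seen:
            continue  # skip duplicate records of the same publication
        seen.add(pub_id)
        counts[year] += 1
    return dict(sorted(counts.items()))


# Invented toy records, not real zbMATH data.
records = [("a", 2015), ("b", 2016), ("c", 2016), ("c", 2016), ("d", 2017)]
history = citations_per_year(records)
print(history)  # {2015: 1, 2016: 2, 2017: 1}
```

Sorting by year gives the series in the order needed for plotting, and a rising or falling trend in these counts is what the text reads as a signal of the development stage and acceptance of the software.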
20,000 FEET ABOVE THE GROUND

Prof. Dr. Ralf Borndörfer | [email protected] | +49-30-84185-243
New navigation systems for aircraft save fuel and time