max planck institut informatik

Report 2009/2010

09 10 11 12 13 bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 2

max planck institut informatik bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 3 ...... 3

Report 2009 / 2010 bericht_2011_en_korr2 24.11.201116:39UhrSeite4 4

...... emnRcos Conference,Bonn German Rectors’ Prof. Dr. MargretWintermantel, University ofKarlsruhe Informatics andFormal DescriptionMethods(AIFB), Prof. Dr. Wolffried Stucky, EMEA, Google,Zurich,Switzerland Dr. NelsonMattos, of EducationandResearch,Bonn Prof. Dr. Wolf-Dieter Lukas, University, Saarbrücken Prof. Dr. Volker Linneweber, Academy ofScienceandTechnology (acatech), Prof. Dr. HenningKagermann, Computer Science Association (GI),Bonn Prof. Dr. StefanJähnichen, Software GmbH,Saarbrücken Prof. Dr. Joachim Hertel, Science, Saarland Dr. ChristophHartmann, of ScienceandTechnology, ZDF, Munich Christiane Götz-Sobel, Management ofRobertBoschGmbH,Stuttgart Dr. SiegfriedDais, and Technologies, Siemens AG, Munich Head ofCorporateResearch Hon. Adv. Prof.(Tsinghua)Dr. Reinhold Achatz, Curatorship Board Science, UniversityofCalifornia,Los Angeles, USA Prof. Dr. DemetriTerzopoulos, Prof. Dr. ÉvaTardos, Prof. Dr. ClaudePuech, Carnegie MellonUniversity, Pittsburgh,USA Prof. Dr. Frank Pfenning, Ashburn, USA Prof. Dr. EugeneMyers, University ofPaderborn, Prof. Dr. Friedhelm MeyeraufderHeide, Telecommunications, Universityof Athens, Greece Prof. Dr. Yannis E.Ioannidis, Science Division,UniversityofCalifornia,Berkeley, USA Prof. Dr. Joseph M.Hellerstein, School ofMedicine,USA Prof. Dr. DouglasL.Brutlag, Computer Science,DukeUniversity, Durham,USA Prof. Dr. Pankaj Kumar Agarwal, Scientific AdvisoryBoard Prof. Dr. GerhardWeikum Prof. Dr. Hans-Peter Seidel Prof. Dr. BerntSchiele Prof. Dr. KurtMehlhorn Prof. Dr. Dr. ThomasLengauer Directors deputy chairmanoftheBoard Vice PresidentEngineering, Cornell University, Ithaca,USA Head oftheeditorialdepartment Howard HughesMedicalInstitute, Laboratoire INRIA,France CEO ofDACOS Minister ofEconomicsand Computer ScienceDepartment, Institute of Applied President oftheGerman German Federal Minstry President of Stanford University, Department ofInformatics& Department ofComputer President oftheNational EECS Computer President ofthe Department of Heinz NixdorfInstitute,

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite5 CONTENTS 118 112 108 102 100 82 66 62 52 44 36 28 25 24 22 20 74 18 92 90 26 14 16 14 8 7 ...... DIRECTIONS PUBLICATIONS COOPERATIONS INFORMATION SERVICES&TECHNOLOGY THE INSTITUTEINFIGURES NEWS IMPRS-CS THE RESEARCHAREAS DEPARTMENT OVERVIEW THE MAXPLANCKINSTITUTEFORINFORMATICS: OVERVIEW PREFACE VISUALIZATION SOFTWARE OPTIMIZATION MULTIMODAL INFORMATION INFORMATION SEARCH &DIGITAL KNOWLEDGE GUARANTEES BIOINFORMATICS UNDERSTANDING IMAGES & VIDEOS THE MAXPLANCKCENTER AUTOMATION OFLOGIC . 1 RG RESEARCH GROUPS . DATABASES 5 AND INFORMATIONDEPT SYSTEMS GRAPHICS COMPUTER . 4 DEPT COMPUTATIONAL . BIOLOGY ANDAPPLIEDALGORITHMICS 3 DEPT . COMPUTER VISION AND MULTIMODAL2 DEPT COMPUTING COMPLEXITY AND ALGORITHMS . 1 DEPT DEPARTMENTS CONTENTS 5 bericht_2011_en_korr2 24.11.201116:39UhrSeite6 6

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 7 ...... INTRODUCTION 7

PREFACE

PREFACE

Every two years, the Max Planck Institute for Informatics publishes a report for the general public. We would like to take this opportunity to present some of the current topics, goals and methods of modern informatics. We introduce the work of our insti- tute and hope to bring you, our readers, closer to the fascinating world of our science.

The Max Planck Institute for Informatics aims to be a lighthouse in research in informatics. We aim to have impact in the following ways: First, through our scientific work, which we disseminate mainly through publications and books, but also in the form of software and internet services. Second, through the training of young scien- tists, particularly during the doctoral and postdoctoral phases. We are educating future leaders for science and industry. Third, through our role in the field. We initiate and coordinate large research programs, and serve on important committees. Fourth, by attracting talent from within and outside the country. About half of our staff of over 190 scientists comes from abroad. Fifth, through the transfer of our results into industry. These transfers take place through cooperation projects, spin-offs, and people. Sixth, by building a world-class competence center for informatics in co- operation with our partners: , the German Research Center for Artificial Intelligence and the Max Planck Institute for Software Systems. We have

...... been very successful in all of these undertakings in recent years.

Our success is becoming visible in the new buildings for the Max Planck Institute for Software Systems, the Center for Bio- informatics, the Intel Visual Computing Institute, the informatics lecture halls, the informatics and mathematics library, and the Cluster of Excellence, all of which are located at the Platz der Informatik.

The report is structered as follows: After an overview of the institute and its depart- ments and research groups, we present the main areas of recent work. These topics span several departments and will also be the focus of our work in the next years. The last part of the report contains a selection of recent scientific publications and a compact presentation of the institute through key indicators.

Please enjoy reading this report.

Kurt Mehlhorn Managing Director ...... bericht_2011_en_korr2 24.11.201116:39UhrSeite8 8

...... an Overview The MaxPlanckInstituteforInformatics, computer systems,andtofurtherdevelopcomputationalthinking. informatics isneededtocopewiththiscomplexity, tolaythefoundationsforpowerful by man.Computationalthinkingisanewwayofstudyingtheuniverse.Basicresearchin software, andnetworks,areamongthemostcomplexstructuresthathavebeenconstructed Information technologyinfluencesallaspectsofourlives.Computersystems,hardware, OVERVIEW

......

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 9 ...... 9

OVERVIEW

Basic research in informatics has Goals Fourth, by attracting talent from led to dramatic changes in our everyday The Max Planck Institute for Infor- within and outside the country. Half of lives in recent years. This has become matics aims to be a lighthouse in research the research staff of the institute comes particularly clear in the last two decades: in informatics. We aim to have impact in from outside Germany. This strengthens The worldwide web, search engines, com- the following ways: the talent base in Germany and estab- pression processes for video and music, lishes bridges to foreign countries. and secure electronic banking using cryp- First, through our scientific work,

...... tographic methods have revolutionized ...... which we disseminate mainly through ...... Fifth, by transfering our results our lives just a few years after their publications and books, but also in the to industry. These transfers take place discovery at universities and research form of software and internet solutions. through cooperation projects, spin-offs, institutes. At the moment, we are concentrating and people. Intel founded the Intel Visual on algorithms for very large, multimodal Computing Institute in 2009 together The Max Planck Society, the lead- data. Multimodal means text, speech, with the UdS (Saarland University), the ing organization for basic research in images, videos, graphs and high-dimen- DFKI (German Center for Artificial Germany, reacted to these challenges by sional data. Intelligence), the MPI-SWS, and the founding and the Max Planck Institute MPI-INF. Intel is investing $12 million for Informatics (MPI-INF) in Saarbrücken Second, through the training of in a new research project with its head- in 1990. In 2005, the Max Planck Insti- young scientists, particularly in the doc- quarters on the campus of the UdS. tute for Software Systems (MPI-SWS) toral and postdoctoral phases, we are The development of future graphics was established with sites in Saarbrücken educating future leaders for research and visual computing technologies are and Kaiserslautern. There are departments and business. Over 190 researchers are at the core of the center's work. The with a strong emphasis on informatics working at our institute and remain with investment will take place over a period in other institutes of the Max Planck us on average for three years. In this of five years and is the most extensive Society as well. The restructuring of the way, we provide the society with over 60 cooperation to date by Intel with a Max Planck Institute for Metal Research well-trained young scientists each year. European university. into an Institute for Intelligent Systems has further strengthened informatics Third, by our role in the profession. Sixth, by building a world-class within the Max Planck Society. Given We initiate and coordinate large research competence center for informatics in the importance of the area, the establish- programs and serve on important com- cooperation with our partners, the UdS, ment of further institutes for informatics mittees, e.g., the “Wissenschaftsrat”. the DFKI, and the MPI-SWS. or related areas is desirable. The Institute has played a significant role in forming the Excellence Cluster We have been very successful in “Multimodal Computing and Interaction” all of these endeavors in recent years. and the “Graduate School for Computer Science”...... bericht_2011_en_korr2 24.11.201116:39UhrSeite10 10

...... accepted aprofessorshipinDüsseldorf. will endin2011because Alice has and Epidemiology” group carry outtheirworkattheinstitute.The the cooperationwithStanfordUniversity, puting andInteraction” Cluster ofExcellence Wand havebeenestablishedwithinthe Thorsten ThormählenandMichael Robert Strzodka,MartinTheobald, Ihrke, MeinardMuller, RalfSchenkel, Mario Albrecht, JanBaumbach,Ivo and severalresearchgroups,headedby research group groups. ChristophWeidenbach leadsthe institute ishometoindependentresearch led byBerntSchiele,wasadded. Multimodal Computing” of 2010,thenew department since2003.Inthesummer the departm tational Biologyand Applied Algorithmics” then joinedin2001toleadthe Hans-Peter Seidel.ThomasLengauer followed in1999underthedirectionof A thirddepartment ing” ginning andledthe Ganzinger wasinvolvedfromtheverybe- department sincethen. Also, Harald has ledthe Mehlhorn asthefoundingdirector. He matics wasfoundedin1990withKurt History andOrganization OVERVIEW “Data department untilhisdeathin2004. In additiontothedepartments, The MaxPlanckInstituteforInfor- “Informatics forGenomeResearch ent. GerhardWeikum hasled bases andInformationSystems” “Algorithms andComplexity” “Automation ofLogic” , ledby Alice McHardy, “Computer Vision and “Computer Graphics” “Logic ofProgramm- “Multimodal Com- as wellthrough department, “Compu- ,

...... calculation?) (Do Ihaveenoughstoragespace for my my computation?) (How longmustIwaitfortheresult of important resourcesarerunning time requirements ofalgorithms.The most Complexity” of cities. find anoptimalroutethroughthousands hardware andalgorithmsallowsusto cities. Thecombinedadvancementin allow ustosolveproblemswith135 ware andtheclassicalalgorithmwould grows super-exponentially. Currenthard- multiplies itby121;therunningtime algorithm by120andaddinganothercity multiplies theruntimeofclassical ter performance). Adding onemorecity and arecognizedbenchmarkforcompu- 120 cities(aclassicoptimizationproblem optimal travelingsalesmantourthrough 1970 madeitpossibletocalculatean The stateofhardwareandalgorithmsin even largergains.Hereisanexample: algorithms canlead,andhaveledto, pressive gainsinefficiency;improved rithms tointerestingapplicationproblems. bust softwaredesign.We applythealgo- tation andinvestigatetechniquesforro- We studyinherentpropertiesofcompu- software librariesandinternetservices. available totheworldinformof through experiment.We makethem We implementthemandvalidate rithms andanalyzetheirperformance. We provethecorrectnessofthesealgo- the designofnewandbetteralgorithms. large thecomputationaluniversethrough for solvingaclassofproblems.We en- subject. An algorithmisageneralrecipe Research Topics The department Improved hardwarehasledtoim- Algorithms areourcentralresearch concentrates ontheresource . Thegroupdevelopsnew and spacerequirement “Algorithms and

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 11 ...... 11

OVERVIEW

algorithms with improved running time generated by them. The computer is archives. We investigate various approa- or storage requirement and also studies now an essential tool for biology and ches for the implementation of these the basic limits of computation: How medicine. The understanding of biolo- methods in distributed computing much time and space are provably ne- gical processes on the molecular level systems for better scalability. cessary for a computation? is not possible without sophisticated information processing. Vast amounts The newest department “Computer The “Automation of Logic” research of data need to be processed in modern Vision and Multimodal Computing” in- ...... group investigates logic-based generic ...... biology, and the biochemical interactions vestigates processing and understanding procedures for solving “hard” combina- in the living organism are so complex sensor information. Sensors range from torial and decision problems. Typical that studying them is hopeless without relatively simple, e.g., GPS and accelera- applications are the verification of hard- algorithmic support. Therefore, bioinfor- tion sensors, to very powerful sensors, ware or software as well as optimization matics methods have become essential e.g., cameras. They are embedded in problems. for modern research on the diagnosis more and more devices. Although algo- and treatment of illnesses. rithmic processing of sensor information Nowadays, computers are used to has advanced considerably, it is still model, represent, and simulate aspects The “Informatics for Genome Re- mainly limited to low-level processing. of real or virtual worlds. Since the visual search and Epidemiology” research group In particular, we are far from being able sense is a key modality for humans, com- has developed new methods for the ana- to fully interpret and understand sensor puter graphics has become a key tech- lysis of genomic sequences for questions information. Such sensor understanding nology in modern information and com- of medical and biotechnological relevance. is, however, a necessary prerequisite for munication societies. The department many areas such as man-machine inter- “Computer Graphics” researches the entire The “Databases and Information action, the indexing of image and video processing chain from data acquisition Systems” department is dedicated to the databases, or for autonomous systems via modeling (creation of a suitable scene topics of search, distribution and organi- such as robots. representation) to image synthesis (gene- zation of data in digital libraries, scienti- ration of views for human consumption). fic data collections, and the worldwide The following scientific challenges emerge web. Our long-term goal is the develop- from this: For the input side, we want ment of easy-to-use, scalable and precise to develop modeling tools for efficient tools for intelligent searches, which actively handling and processing of large data support the user in the formulation of flows, and, on the output side, we are queries and in finding relevant informa- seeking new algorithms for fast compu- tion in different data formats. A special tation of high-quality views; these algo- characteristic of this research is the auto- rithms should exploit the capabilities matic extraction of structured information of modern graphics hardware. from unstructured sources such as the worldwide web. Our extraction processes The department “Bioinformatics combine pattern recognition, linguistic and Applied Algorithmics” addresses the methods, and statistical learning. In this potential of computing for the life sciences. way, the department has created one The life sciences have an increasing of the most comprehensive knowledge demand for algorithmic support due to bases over the past few years and has the recent large increase in experimental made it publicly available. In addition, data. Algorithms play a central role in we are developing new methods and the preparation and configuration of software tools for the search and analysis biological experiments and even more of XML documents, graphics-based RDF in the interpretation of biological data data, and very data-intensive internet ...... bericht_2011_en_korr2 24.11.201116:39UhrSeite12 12

...... in anaturalandintuitiveway. create systemswhichmakethispossible ciently, intelligentlyand robustly, andto stand, andsearchitinallitsformseffi- mit information,buttoorganize,under- tions technologyiswithoutprecedence. prices forinformationandcommunica- of increasingperformanceandfalling faster thaneverbefore.Thiscombination compactly, morecost-efficientlyand store, processandtransmitdatamore are moderncomputersystemswhich The technologicalbasisforthesechanges work. We liveinacomputerizedsociety. dramatic changestothewayweliveand of whichhavealreadybeenaccomplished: the cluster’s goalsforfuturework,many tion isanannouncementfrom2007of The followingsummaryoftheapplica- pleted threeyearsofsuccessfulwork. cluster’s work.Theclusterhasnowcom- and Hans-Peter Seidelcoordinatesthe 13 principalinvestigatorsofthecluster, of theinstitute’s directorsbelongtothe started itsworkinNovember2007.Four modal ComputingandInteraction” role intheExcellenceCluster Computing andInteraction Excellence ClusterMultimodal OVERVIEW The challengeisnolongertotrans- The lastthreedecadeshavebrought The Instituteplaysanimportant “Multi- , which

...... financed bytheuniversityandpar- groups. Halfofallgroupleadersare be deployedtosetupjuniorresearch scientists. A largepartofthebudgetwill over theyearsasan Saarbrücken hasacquiredareputation to qualifyandpromoteyoungscientists. presented ajointresearchprogram. from is thefirsttimethatleadingresearchers gether successfullyforseveralyears,this stitutions havealreadybeenworkingto- ces. Although somegroupsinthesein- for SoftwareSystemshavealljoinedfor- the DFKI,andMaxPlanckInstitute the MaxPlanckInstituteforInformatics, linguistics andphoneticsattheUdS partments forinformaticsandcomputer application-oriented research. multimodal dialoguesystems,focuson ments, syntheticvirtualcharacters,and sciences, large-scalevirtualenviron- web, informationprocessinginthelife The otherfive,namelyopenscience are orientedtowardsbasicresearch. secure autonomousnetworkedsystems, computing, algorithmicfoundations,and namely textandspeechprocessing,visual in nineresearchareas.Four ofthem, dimensional data.Theclusterisorganized images, movies,3Dmodelsandhigh- types ofdigitaldatasuchastext,voice, human expression,andthedifferent ally visionandhearing,thediversityof name referstothehumansensesespeci- The term A prominentgoaloftheclusteris For thisresearchprogram,thede- The clusterisfacingthesechallenges. all participating institutionshave “multimodal” “elite school” in thecluster’s for young

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 13 ...... 13

OVERVIEW

ticipating institutions. This financing Promotion of Young Scientists Structure of the Report scheme ensures continuity during and A further goal of the institute is the After a brief introduction to the beyond the life cycle of the cluster. creation of a stimulating environment for departments and research groups of our A new professorship for “Image, Video young scientists, in which they can grow, institute, we survey our work by means and Multimedia Systems” will be financed develop their own research programs, of representative examples. We group first by the cluster, and then permanently and build their own groups. We concen- the examples into subject areas, each of by the university. All participating institu- trate on doctoral and postdoctoral trai- which spans at least two departments...... tions have open leadership positions in ...... ning. Our 110 PhD students are trained This report ends with a presentation of the area of the cluster. in cooperation with the Graduate School the IMPRS-CS, a presentation of the for Computer Science at Saarland Uni- institute in figures, infrastructure aspects, The cluster strengthens the already versity and the International Max Planck and a tabular listing of cooperations and existing academic excellence in Saar- Research School for Computer Science publications. Enjoy reading it. ::: brücken and establishes it as a leading (IMPRS-CS). Our postdoctoral researchers center for informatics and speech tech- participate in international collaborations nology. such as the “Max Planck Center for Visual Computing” (a cooperation with Stan- ford University in the area of computer Publications and Software graphics), or the “Indo Max Planck Center The scientific results of the Max for Computer Science” (a cooperation with Planck Institute for Informatics are dis- leading universities in India), or one of tributed through presentations, publica- our many EU projects. Our training effort tions, software, and web services. Our is successful; about 20 PhDs graduate publications appear in the best venues every year and more than 100 alumni of the computer science field. Most from the institute are now professors at publications are freely available in the institutions all over the world. institute’s repository. Part of our results is available in the form of downloadable software or as a web service. Examples are CGAL (Computational Geometry Algorithms Library), the GISAID EpiFlu database as well as the clinically-used web service Geno2pheno for HIV therapy support. Publications in the form of soft- ware and web services make our results available more directly and to a larger audience than classical publications...... bericht_2011_en_korr2 24.11.201116:39UhrSeite14 hn +496819325-1000 Phone Ingrid Finkler-Paul |ChristinaFries Secretary Algorithms andComplexity CONTACT DEPT PROF. DR.KURT MEHLHORN Algorithms andComplexity 14 SOFTWARE .1

...... ET.1AlgorithmsandComplexity . 1 DEPT and abroad. in largeramountsofdata.Manyformermembersthegrouparetoppositionsdomestically we offercompletelynewsearchengineopportunitiesforefficientandintelligentsearches conferences inthefield,ourLEDAandCGALsoftwarelibrariesareusedworldwide, and people.We publishinthebestjournals,presentourresultsatleadinginternational We aresuccessfulinallthreeaspects.We areeffectivethroughpublications,software promotingyoungscientistsinastimulatingworkgroupsenvironment. • implementing ourfundamentalworkindemonstratorsandgenerally • carrying outoutstandingbasicresearchinthefieldofalgorithms, • members anddoctoralcandidates.Ourgoalsare The grouphasexistedsincethefoundingofinstitute.Itcurrentlyabout40staff useful softwarelibraries,

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite15

...... DEPARTMENTS DEPT. questions. understanding formanyalgorithmic search in complexitytheoryforrandomized for randomplanargraphsandourwork Our latestanalysesofdegreedistribution puter networksandreal-timescheduling. metric spaces,forloadbalancingincom- ing salesmanprobleminappropriate for findinggoodsolutionsthetravel- for provablycorrectisolationofzeroes, the pasttwoyearsincludenewalgorithms collaborate withcompanies. ries; asapartofourpracticalwork,we ware demonstratorsandsoftwarelibra- tical insightisusedtoimplementsoft- in awindfarm. A portionofourtheore- applications, e.g.,placingwindturbines cations problemsaswellforconcrete algorithms forabstractversionsofappli- on theaverage.We developefficient bounds, analysesintheworstcaseand randomized solutions,upperandlower and generalheuristics,deterministic tive solutions,problem-specificmethods or hierarchicaldata),exactandapproxima- (sequential, parallel,distributed,flatdata processes, verydifferentcomputermodels algorithms, datastructuresandsearch combinatorial, geometricandalgebraic and analysisofalgorithmsinmanyfacets: ware systems.We workonthedesign Outstanding theoreticalresultsin Algorithms aretheheartofallsoft- heuristics provideafundamental 1

...... gular internationalexchangethrough (IMPECS). Inaddition,thereisare- Planck CenterforComputerScience Tel Aviv) andtheIndo-GermanMax computing –withtheUniversityof nal projects:theGIFproject(geometric priority program“AlgorithmEngineering”. direction. has be and experimentalresearchinalgorithms standing ofalgebraiccurvesandsurfaces. rithms relyonadeeptheoreticalunder- require noadditionalspace.CGALalgo- powerful thanknownstructures,but on newindexstructures,whicharemore the CompleteSearchEngineinbased monstrators andlibraries.For instance, theoretical workformsthebasisforde- our practicalworkandviceversa.Our placement ofwindturbinesinfarms. algorithms tocalculateayield-optimized computer science,aswellevolutionary most importantliteraturedatabasein engine anditsapplicationtooneofthe possible, theCompleteSearchsearch makes thehandlingofnon-linearobjects the CGALsoftwarelibrary, whichalso recent yearsincludeourcontributionto The groupisinvolvedininternatio- The combinationoftheoretical Our theoreticalworkisinspiredby Outstanding practicalresultsin come awidelyacceptedresearch The DFGsupportsitviaits

...... oetclyadara. ::: domestically andabroad. leading universitiesorresearchinstitutes or tocontinuetheirscien tions inindustrynotlimi Planck Institutetoattain equipped aftertheirstayattheMax ensures thatourgroupmembersarewell- abroad. Theentiretyofthesemeasures stay ofoneyearata promotion andonlyafteraminimum to employourstudentsaftersuccessful of ourtrainingconceptisthatwecontinue but alsotoourdoctoral which areaddressednotonlytostudents, We givelecturesatSaarlandUniversity, is anintegralcomponentofourwork. Analysis ofComplexSystems). area AVACS (AutomaticVerification and part ofthetrans-regionalspecialresearch Engineering” priorityprogramandare many, weparticipateinthe“Algorithm National ScienceFoundation. InGer- a recipientofscholarshipfromtheSwiss scholarships, aMarie-Curiefellow, and are hostingfourrecipientsofHumboldt research. For example,atthistimewe various meansofsupportforexcellent The promotionofyoungscientists research institution ted toresearch, candidates. Part attractive posi- tific careersat informatik max planckinstitut 15 bericht_2011_en_korr2 24.11.201116:39UhrSeite16 hn +496819325-2000 Phone Anja Weber Secretary Computer Vision andMultimodalComputing CONTACT PROF. DR.BERNTSCHIELE Computer Vision andMultimodalComputing 16 DEPT SOFTWARE .2

...... ET.2ComputerVision andMultimodalComputing . 2 DEPT and wearablecomputing. description aswellmulti-sensor-based contextrecognitionintheareaofubiquitous main researchareasarecomputervisionwithafocusonobjectrecognitionand3Dscene The departmentwasfoundedin2010andcurrentlyincludes10scientists.group's

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite17

...... DEPARTMENTS DEPT. using movingcameras.Thisproblemis partment ispeopledetectionandtracking in amultimodallearningapproach. speech andimageprocessingiscombined from computergraphicmodels,orwhere where objectmodelsarelearneddirectly Current workhaspresentedapproaches in comparisonwithstandardapproaches. leading tosignificantlyimprovedresults and segmentstheobjectsimultaneously, One oftheseapproachesrecognizes presenting severalinnovativeapproaches. partment hasplayedapioneeringrolein has madeimpressiveprogress,andthede- recent years,thisareaofcomputervision problems ofimageunderstanding.In as objectrecognition,oneofthebasic department dealswithproblemssuch as gyroscopesandaccelerometers. cameras, andembeddedsensors,such using bothpowerfulsensors,suchas understanding ofsensorinformation, ment isthereforeconcernedwiththe stand theirenvironment.Thedepart- interpret itandthuscannottrulyunder- to thissensorinformationdonotfully devices andcomputerswhichhaveaccess matters. Thismeans,inparticular, that progress butisgenerallylimitedtosimple sensor informationhasmadeenormous ways. Computer-controlled processingof and theyarealreadyhelpfulinvarious embedded indevicesandenvironments, accelerometers arebeingincreasingly Another centralthemeofthede- In theareaofcomputervision, Sensors suchascameras,GPSand 2

...... embedded intheirclothing. surprising accuracyusingafewsensors interruptibility canbepredictedwith also possibletoshowthataperson's model personaldailyroutines.Itwas to recognizelong-termactivitiesand the departmenthaspresentedapproaches need forexplicitinteraction.Inthisarea, computing environmentswithoutthe communication betweenhumansand port andenableseamlessinteraction Ultimately, contextawarenessmaysup- to thesituationanduser’s needs. of makingthecomputingtaskssensitive ness andsensingisoftenseenasameans and eveninourclothing.Contextaware- be foundinourenvironment,objects ing amountofcomputersandsensorscan lying observationhereisthatanin modal sensorinformation.Theunder- processing andunderstandingofmulti- the secondcentralresearchareais image andsceneunderstanding. senting afurthersteptowardscomplete people butalsoentire3Dscenes,repre- presented approachnotonlydescribes over longerperiodsoftime. A recently that robustlydetectpeopleandtrackthem department hasdevelopedapproaches to theirbehaviormoreeffectively. The ments ofpedestriansandthereforereact such acameramaypredictthemove- dustry. For example,carsequippedwith or inroboticsandtheautomobilein- such asimageandvideounderstanding, also hasawidevarietyofapplications, not onlyscientificallychallengingbut In additiontocomputervision, creas-

...... elegantly integratepreviousknowledge. of largeamountsdata,andcanalso cessing. Inaddition,theyallowtheuse tainties thatexistwithanysensorpro- for example,themodelingofuncer- and inferencetechniques.Theseallow, extensive useofprobabilisticmodeling theme, astheotherresearchareasmake the importantroleasacross-cutting partment ismachinelearning.Thisplays The thirdresearchareaofthede- informatik max planckinstitut ::: 17 bericht_2011_en_korr2 24.11.201116:39UhrSeite18 hn +496819325-3000 Phone Ruth Schneppen-Christmann Secretary Computational BiologyandAppliedAlgorithmics CONTACT DEPT PROF. DR. THOMAS LENGAUER, PH.D. Computational BiologyandAppliedAlgorithmics 18 SOFTWARE .3

...... ET. ComputationalBiologyandAppliedAlgorithmics . DEPT 3 in theareaofbionformatics. The departmentemployscurrentlyabout20scientistswhoperformresearchexclusively This departmenthasexistedsinceOctober2001andisdirectedbyProf.Dr. ThomasLengauer.

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite19

...... DEPARTMENTS DEPT. 39) aswellotherdiseasessuch C ( HBV Resistance” such as AIDS, HepatitisB( specific casestudiesofinfectiousdiseases page 43). How LittleDoWe Actually Know?” ( lyzed action net must bemodelled,andcomplexinter- binding processesbetweenbiomolecules ships ofProteins” function ( struct between protein involved dimensional structuresoftheproteins requires of therespectivemolecularfunctions within andbetweencells.Theelucidation expression ofgenesandtransductsignals catalyze chemicalreactions,regulatethe to nucleicacids.Inthisway, proteins to eachother, toorganicmoleculesor usually proteinsthatinteractviabinding blocks ofsuchbiochemicalnetworksare circuitry of an by ab-errationsinthebiochemical level, diseaseprocessesaremanifested therapy ofdiseases. At themolecular directly relevantforthediagnosisand gates topics whicharedirectlyorin- “Fighting theHepatitisCVirus” This methodsarealsoappliedto The departmentprimarilyinvesti- “Regulatory Feedback Networks– the determinationofthree- and theanalysisofrelationships “Structure FunctionRelation- works ofproteinsmustbeana- organism. Thebuilding , page76)andHepatitis , page38).Inaddition, 3 ure andprotein “Analysis of , page ,

...... as welltheoptimizationof AIDS and 38) Function RelationshipsofProteins” analysis ofproteinfunction( include thefieldofepigenetics, are reportedoninthis2009/2010issue, and industrial users.Examples,which worldwide by in softwaresystemswhichareused development inthedepartmentresults page 41). ( an effective patient-specificdrugtherapy as otherimportantviralpreconditionsfor HIV DrugResistance” bination drugtreatments( of HIvirusesagainstadministeredcom- a stepfurther. We analyzetheresistances Planck InstituteforInformaticsgoeseven special role.W for infectiousdiseases, AIDS playsa ing thesearchforoptimizedtreatments ally RelevantProteins” proteins ( and characterizationofdisease-related the emphasisliesonidentification of Cancer” research focus( called epigeneticchangesconstitutea tion ofcancerthroughgeneticandso- logical diseases.Whiletheearlyrecogni- cancer, neurodegenerativeandimmuno- “Prediction ofHIVCoreceptorUsage” A substantialpartofthemethod protein interactionnetworks “Functional Analysis of Medic- of “Functional Analysis , page42),inotherdiseases many academic,clinical ith thisdisease,theMax “The GeneticFoundation , page77)aswell , page40).Regard- “Analysis of “Structure , page ,

...... by theGermanScienceMinistry. ::: Genome ResearchNetworksupported Hepatitis C,aswelloftheNational Deutsche Forschungsgemeinschaft on the ClinicalResearchGroup129of analyzing viraldrugresistance,partof conduct bioinformaticsresearchfor Consortium “EuResist”,bothofwhich Arevir networkaswelltheE partment isamemberoftheGerman in theareaof versity focusingonteachingandresearch an inter-faculty centeratSaarlandUni- pillars oftheBioinformaticsCenterSaar, provide aw of fluresearch( Coreceptor Usage” sistance” therapies ( The departmentisoneofthemain , page77, “Analysis ofHIVDrugRe- orld- bioinformatics. Thede- wide-used database. “GISAID” , page 41). In , page41).In “Prediction ofHIV , page86)we the area uropean informatik max planckinstitut 19 bericht_2011_en_korr2 24.11.201116:39UhrSeite20 hn +496819325-4000 Phone Sabine Budde Secretary Computer Graphics CONTACT DEPT PROF. DR.HANS-PETER SEIDEL Computer Graphics 20 SOFTWARE .4

...... DEPT . 4 Computer Graphics Computer . 4 DEPT requirement forfaster, andmoreinteractivepresentation,wherepossible. synthesis). Typical inthisareaisthebringingtogetherofverylargedatasetswith chain, fromdataacquisitionthroughmodelingtoimagesynthesis(3Danalysisand An importantcharacteristicoftheworkisthoroughconsiderationentireprocess The ComputerGraphicsworkgroupwasfoundedin1999andincludes40scientiststoday.

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite21

...... DEPARTMENTS DEPT. therefore qualitativelyhigh-value pre- lopment ofnewalgorighmsforfast and from theinputsideaswell deve- ling andfurtherprocessingofdata flows priate modelingtoolsforefficient hand- from thisdevelopment,especially arising As centralscientificchallenges lysis andsynthesishasbeen integrated views,theterm3Dimageana- hardware). Inthemeantime,for processes) aswelltheoutput(graphics hardware, theinput(imagegenerating the performancecapabilitiesofmodern are ne-cessaryinordertoadequatelyuse arbitrary views).Theseintegratedviews tion) toimagesynthesis(generationof suitable internalcomputerscenedescrip- acqui of theentireprocessingchainfromdata is there portant from ascientificpointofview. An im- lenges, alsonewapproachesareneeded with hisenvironment. much aspossibleinaninteractiveway user hastheintentionofinteractingas with highimagequality. Inaddition,the interactive) visualpresentationofresults requirement forfaster(ifpossible,more together verylargedatasetswiththe cation. Typical forthisareaisbringing or 3Dinternetaremerelyapartialindi- vision, telcommunications,virtualreality words suchasmultimedia,digitaltele- applications potentialthroughheadline communication companies,whosefuture technology formoderninformationand previous decadehavebecomeakey for people,computergraphicsinthe Due tothemeaningofvisualinformation order tosimulateandpresenttheworld. or avirtualworldonthecom today toconstructsegmentsofthereal Due totheabove-mentionedchal- Computers areoftendeployed sition viamodeling(creationofa fore thethoroughconsideration characteristic intheworkgroup 4 developed. puter, in these appro-

...... Computing andCommunication” founded nal, Europeanandinternationallevel. in aseriesofprojectactivitiesonnatio- puter Graphicsworkgroupareembedded graphics ontheoutputside. possibilities andprospectsofmodern sentation withcloselinkagethe Zurich), wasparticula of theboard,Prof.ThomasGross (ETH board inNovember2010.Thechairman lence clusterwasheldbytheadvisory second regularinspectionofthe excel- off workshopin and lence initiativebytheGermanfederal 2007 withintheframeworkofanexcel- clusterwasnewlyformedin excellence modal involved intheactivitiesof for Informatics) Hans-Peter Seidel(MaxPlanckInstitute Girod (StanfordUniversity)andProfessor center isinthehandsofProfessorBernd young scientists.Thedirectionofthe training andwinningbackoutstanding there isasignificantcontributionto with attractivereturnopportunities, establishing newexchangemechanisms and communicationstechnology. By in thiskeyareaofmoderninformation States istostrengthenresearchefforts locations inGermanyandtheUnited this bridge-buildingbetweenthetwotop additional fundingphase.Theaimof been extendedinrecentyearsforan It wasfoundedinOctober2003andhas with significantsupportfromtheBMBF. Planck InstituteandStanfordUniversity veloped withthesupportofMax state Especially importantisthejointly The scientificactivitiesoftheCom- In addition,thegroupissignificantly Computing andInteraction”. “Max PlankCenterforVisual governments. Af governments. November 2008,the rly impressedby ter thekick- “Multi- , de- The

...... teghndls er ::: strengthened lastyear. the Institute,groupwasfurther at (Stanford) toaW2permanentchair the appointmentofChristianTheobalt Society forProfessorH.-P. Seidel.With Leibniz PrizeoftheGermanResearch ding youngscientistprizesandalsothe has earnedanumberofprizes, tically andinternationally. Thegroup been promotedtoprofessorshipsdomes- 30 youngscientistsfromthegrouphave Institute. tute ontheGovernanceBoardof representative oftheMaxPlanckInsti- ware Systems.Prof.H.-P. Seidelisa and theMaxPlanckInstituteforSoft- the MaxPlanckInstituteforInformatics by Intel, SaarlandUniversity, the the campusandissupportedtogether The newresearchinstituteislocatedon Computing Institute” ment wasthefoundingof cluster isProf.Hans-Peter Seidel. scientific coordinatoroftheexcellence than isolatedindividualprojects.” This allowsthemtoachievemuchmore within theframeworkofthiscluster. approaches worktogetherveryclosely different researchgroupswith cluster: the interdisciplinaryapproachof Over thepasttenyears,morethan An additionalimportantdevelop- “Particularly impressiveisthat in May2009. “Intel Visual inclu- The DFKI, informatik max planckinstitut 21 bericht_2011_en_korr2 24.11.201116:39UhrSeite22 hn +496819325-5000 Phone Petra Schaaf Secretary Databases andInformationSystems CONTACT DEPT PROF. DR.-ING.GERHARD WEIKUM Databases andInformationSystems 22 SOFTWARE .5

...... ET. DatabasesandInformationSystems . DEPT 5 . Analysisofdistributeddata,especiallyinscalablepeer-to-peer systemsandonline 5. Queryprocessingandoptimizationofexecutionplansforefficientsearchonstructured 4. Rankingandinferencemethodsforquerieswhichonlythetopkanswersareimportant, 3. Text miningfortheautomaticclassificationofdocumentsandidentificationin- 2. KnowledgediscoveryontheWeb withstatistics-andlogics-basedmethodsforautomatic 1. The Department,whichisledbyGerhardWeikum, pursuesresearchinfivetopicareas: communities, forexample,insocialnetworksandWeb 2.0media. semistructured data(e.g.inXMLorRDFformat); and forcopingwithuncertaindata(e.g.automaticallyextractedrelationsfromtext); teresting patternsintextcorpora,especiallynewsandwebarchivesoverlongtimescales; fact extractionfromInternetsourceslikeWikipedia;

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite23

...... with itsstatisticalapproachtodatacap- facto predominating “Wisdom oftheCrowds” Social Web logic-oriented searchandinference,the tic Web the expectedconvergenceof but uncertainknowledge,andmore. relationships onthebasisofprobable changes inknowledge,deductionofnew (e.g. photosofpeople),analysistime systematic gatheringofmultimodaldata Inclusion ofmultilingualinformation, within the YAGO-NAGA framework: further researchtopicsareaddressed 3X fastest RDFsearchenginescalledRDF- and thegrouphasdevelopedoneof RDF Both usethesemanticwebdata Google Answer) engine calledNAGA search in YAGO, anewtypeofsearch taxonomy. For explorationandintelligent Wikipedia andintegratingtheWordNet been createdusingfactextractionfrom DEPARTMENTS DEPT. YAGO a verylargeknowledgecollectioncalled sources. Inthe YAGO-NAGA project, their cross-relationshipsindynamicweb entities (people,organisations,etc.)and ing, andanalyzingpatternsindividual Wide Web aswelldiscovering,track- bases frominformationontheWorld gathering ofcomprehensiveknowledge work ofthedepartmentisautomatic (RDF Triple Express) The visionthatdrivesthisworkis A centralthemeinthescientific (Resource DescriptionFrame (Yet Another GreatOntology) with itsformalontologiesand (Web 2.0)withitslatent has beendeveloped. 5 Statistical Web (Not Another , andthede- . A varietyof work) model Seman- has ,

...... These include,inparticular: other researchgroupsaroundtheworld. open sourcesoftwareandareusedby by thegrouparepubliclyavailableas Many oftheprototypesystemsdeveloped building forapplicationsandexperiments. tical foundationstopracticalsystems spans theentirespectrumfromtheore- tegic directionofDB-IRintegration. The groupisatrendsetterinthisstra- for companiesandintheWeb itself. e-science groupsand,lastbutnotleast, libraries, socialonlinecommunities, creasingly mixeddataformsfordigital become evermoreimportantwithin- torically divided;theirconnectionhas engines. Thesetwodirectionswerehis- information retrieval(IR)andsearch data analysis,thelatterintoareaof the areaofdatabasesystems(DB)and tured textualdata.Theformerfallinto statistically-based methodsforunstruc- algorithms forstructureddatasetsand its methodsofcombininglogic-based be enormous. benefits ofsuchaknowledgebasewould programs andintelligentservices.The presentation, andenableshigher-level structured, andmachine-readablere- pedic knowledgeofmankindinaformal, base, which the basisofacomprehensiveknowledge of search turing andprobabilisticunderpinnings The group’s researchmethodology The groupisaworldwideleaderin engines. Thewebcouldbe contains theentireencyclo-

...... TheXMLsearchengine 1. a eevdaGol eerhAad ::: has receivedaGoogleResearch Award. Computing andInteraction.” DFG ExcellenceCluster of Web Archive Data” Knowledge” especially theEUprojects number ofexternallyfundedprojects, Thescalablesystem 4. Thesoftware toolsthatautoma- 3. TheRDFsearchengine 2. The departmentisinvolvedina pages andnatural-languagetexts. which extractsnewfactsfromweb the YAGO contentsitself; YAGO knowledgebase,aswell tically createandmaintainthe patterns; data veryefficientlyforcomplex data andothergraph-structured which cansearchsemanticweb corpus fortheINEXcompetition; annotated Wikipedia reference and providesthesemantically the INEXbenchmarkingseries which hasachievedtopranksin and “Longitudinal Analytics as wellthe “Multimodal PROSPERA “Living The group RDF-3X TopX , informatik max planckinstitut , , 23 bericht_2011_en_korr2 24.11.201116:39UhrSeite24 hn +496819325-2900 Phone Secretary JenniferMüller Prof. Dr. ChristophWeidenbach Automation ofLogic CONTACT RG PROF. DR.CHRISTOPH WEIDENBACH Automation ofLogic 24 RESEARCHGROUPS .1

...... RG . 1 Automation of Logic of Automation . 1 RG used tools. Weidenbach, servesthecompletepipelinefrombasicresearchonnewlogicstoindustrially The independentresearchgroup"AutomationofLogic",underthedirectionProf.Dr. Christoph This is the goal of of This isthegoal perties, the problems. Inordertokeepupwithincreasinglycomplexsystems andrequiredpro- protocols, safetyproperties,softwareandinsolutionsfordecision andoptimization amount oftime.We areparticularlyinterestedintheverification ofcontrols, be capableofcomputingsolutionsforaclassapplicationswithinanacceptable years, e.g.,combinatorialoptimizationproblems. proof techniqueshasthusonceagainextendedtogeneralhardproblemsinrecent analysisofcomputersystemsandmathematics.Theapplication mal are generallysuccessfulonhardproblems,eveniftheydonotoriginatefromthefor- broadest sense.Inaddition,ithasbeenfoundthatmanyofthetechniquesdeveloped of logicwasextendedfrommathematicstopropertiescomputersystemsinthe developed toworkfullyautomaticallyinreasonabletime.Insum,theapplication however, foranumberofpracticallyrelevantproperties,procedureshavebeen grows atleastexponentiallywiththesizeofproblem.From themid-90stodate, even ifthereareautomaticproceduresavailable,thenumberofcalculationsteps time. Provingoranalyzingcomputersystemspropertiesistypically“hard”,thatis, and proofeffortsrequiredmassivemanualinteractionthereforeagreatdealof formulated, analyzedandultimatelyproven.Byaboutthemid-1990s,suchanalytical systems sothatthesecouldalsobedescribedpreciselyandproperties puters andinformationtechnology, theyhavebeenfurtherdevelopedforcomputer describe andcalculategeneralmathematicalarguments.Sincetheinventionofcom- method. Logicwasdevelopedattheendof19thcenturyinordertoexactly systems thatweknowfromhighschool,withasolution,includesthecalculation ing andexactrulesforreasoning. A simpleexampleofalogicarelinearequation Our researchgroupdevelopsautomatic,logic-basedprocedures.Theseshould Logics areformallanguagesforcalculationwithmathematicallyprecisemean- formal proceduresusedtodaymustsignificantlyincreaseinproductivity. u eerhgop Atmto fLgc.::: our researchgroup,“AutomationofLogic”.

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 25 ...... MAX PLANCK CENTER 25

Max Planck Center for Visual Computing and Communication

PROJECT DIRECTOR: PROF. DR. HANS-PETER SEIDEL

The “Max Planck Center for Visual Computing and Communication” The Max Planck Center for Visual Computing and Communication (MPC-VCC) with corresponding research was founded in 2003 with substantial support from the BMBF (Federal activities at the Max Planck Institute for Computer Science and at Stanford University was established jointly by the Ministry for Education and Research). The Max Planck Center brings Max Planck Society and Stanford University in October 2003. together the Max Planck Institute for Informatics in Saarbrücken In 2010 BMBF approved continuation of the center for an- other five year period. The proposed collaboration has two and Stanford University. It has been granted 6.9 million Euros from ......

intertwined goals: Establish a joint research program in the ...... area of visual computing and communication and incorporate a the BMBF since its founding. Due to the success of the program, strong career development component to alleviate the shortage in recent years, funds of up to 7.8 million Euros have been approved of qualified faculty and scientists in information technology in Germany. The Max Planck Center fosters the professional for a further support phase. development of a small number of selected outstanding in- dividuals by providing them with the opportunity to work at Stanford University as visiting assistant professors in the area of visual computing and communication for two years and then return to Germany to continue their research as leader of a junior research group at the Max Planck Institute for Computer Science and ultimately as a professor at a German university.

The center is directed jointly by Hans-Peter Seidel (Max Planck Institute for Informatics) and Bernd Girod and Leo Guibas (Stanford). :::

A succsessful alumni of the program is Christian Theobalt. He rejoined the Max Planck Institute for Informatics as a tenured faculty member (W2) after his previous appointment as Visiting Assistant Professor at Stanford University. Since December 2010, Christian Theobalt is also an Associate Professor in the Computer Science Department of Saarland University.

CONTACT

Prof. Dr. Hans-Peter Seidel Secretary Sabine Budde Phone +49 681 9325-4000 ...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 26

26

Research Areas

UNDERSTANDING 30 People Detection and Pose Estimation IMAGES & VIDEOS in Challenging Real-World Scenes

31 Scalability of Object Class Recognition

32 3D Scene Understanding from Monocular Cameras

33 Image-based 3D Scene Analysis

34 Markerless Reconstruction of Dynamic Scenes

BIOINFORMATICS 38 Structure Function Relationships of Proteins

39 Fighting the Hepatitis C Virus ...... 40 Functional Analysis of Medically Relevant Proteins

41 Prediction of HIV Coreceptor Usage

42 The Genetic Foundation of Cancer

43 Regulatory Feedback Networks – How Little Do We Actually Know?

GUARANTEES 46 Dealing with Selfishness in Optimization

47 Geometric Algorithms – Exact and Efficient

48 Controlled Perturbation – How to Avoid Difficult Special Situations

49 Automated Deduction

50 Model Checking for Hybrid Systems

51 Modular Proving in Complex Theories

INFORMATION SEARCH 54 Random Telephone Chains – Efficient Communication & DIGITAL KNOWLEDGE in Data Networks

55 Search and Mining in Web Archives

56 Information Search in Social Networks

57 YAGO – a Collection of Digital Knowledge

58 URDF – Efficient Reasoning in Uncertain RDF Knowledge Bases

59 EnBlogue – What is New and Interesting in Web 2.0?

60 Decision Procedures for Ontologies ...... bericht_2011_en_korr2 24.11.201116:39UhrSeite27 VISUALIZATION SOFTWARE OPTIMIZATION INFORMATION MULTIMODAL 65 64 88 87 86 85 84 80 79 78 77 76 72 71 70 69 68 ...... Correspondences, Symmetriesand Automatic Modeling Statistical ModelsofPeople Computational Photography Advanced Real-Time Rendering HDR –ImagesandVideo withEnhancedContrast RDF-3X –Fast SearchesforSemanticData TopX 2.0–EfficientSearchinDigitalLibraries GBACE: Parallel Processingof Adaptive Data Analysis ofHIVDrugResistance Analysis ofHBVResistance Model-Based ProductConfiguration Combinatorial OptimizationinInformationScience Placement ofWind Turbines Approximation Algorithms forHardOptimizationProblems The TheoryofBio-inspired Algorithms Automated MusicProcessing Recognizing Human Activity

...... 27 bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 28 ...... 28 RESEARCH AREAS

UNDERSTANDING IMAGES & VIDEOS

Understanding images and videos is one of Various research groups within the One of the fundamental problems Max Planck Institute for Informatics of understanding images is the recogni- the fundamental problems of image process- work on the different aspects of image tion of objects. With the current omni- and video understanding. In the field of presence of digital image material, auto- ing. The scientific challenges range from the modeling people, work is carried out on matic visual object-class recognition modeling and tracking of people and objects the reconstruction of animated models techniques are becoming increasingly from multi-video data, for example. The important. Even if today's approaches in camera systems to the reconstruction and ...... primary goal here is the ability to model ...... can achieve remarkable results, one of ...... and visualize people in detail and as ac- the most significant problems remains: description of 3D scenes. This area has many curately as possible. Furthermore, methods How can object models be learned or applications; these include the animation for people detection and tracking are re- constructed? Therefore, a central topic searched that require monocular came- of various research work at the institute of people and visualization of 3D scenes, ras only and that are able to detect and concerns how such object models can track many people simultaneously in be constructed with as little manual the indexing of image and video material, complex scenes. Even if these studies input as possible to enable a broad pursue fundamentally different targets applicability of today’s approaches. ::: and also 3D-capturing of the environment and use different camera configurations, for automotive applications. This research they benefit from each other, as these areas use similar models and algorithms. area is thus situated at the intersection A further area of understanding between computer vision and computer images and video is the reconstruction graphics, resulting in numerous opportunities and description of 3D scenes. Again, there are different studies within the for cooperation within the institute. institute that benefit from each other. For example, 3D scenes are reconstruc- ted from image series to enable a surface description of 3D objects that is highly detailed and accurate. Furthermore, the 3D environment of a moving car is de- scribed. Here, a special focus is on the complete identification and description of pedestrians and other road users. All this work shows that the integration and use of 3D scene models can significantly improve results...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 29 ...... 29 CONTRIBUTIONS

People Detection and Pose Estimation

in Challenging Real-World Scenes ...... 30

Scalability of Object Class Recognition ...... 31 ......

3D Scene Understanding from Monocular Cameras ...... 32

Image-based 3D Scene Analysis ...... 33

Markerless Reconstruction of Dynamic Scenes ...... 34

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:39UhrSeite30 30 Examples ofpeopledetectionandposeestimationobtainedwithourapproach People DetectionandPoseEstimationinChallengingReal-World Scenes UNDERSTANDING IMAGES&VIDEOS monocular andsingleimages. estimate viewpointsofpeoplefrom structures modelandalsoenablesto mation thatisbasedonthepictorial for peopledetectionand2Dposeesti- first isanovelandgenericprocedure to thesuccessofapproach.The 2D humanposeestimation monocular data. in recovering3Dbodyposesfrom changing backgrounds,andambiguities of people,clutteredanddynamically are frequentlyfullandpartialocclusions challenges addressedinourapproach shown onthepictures.Theimportant and estimatedbodyconfigurationsare Several examplesofpeopledetections monocular anduncalibratedcamera. can operateonimagesfromamoving, multiple synchronizedvideostreamsbut grounds. Ourapproachdoesnotrequire multiple peopleanddynamicback- poses incomplexstreetsceneswith for detectingpeopleandestimatingtheir listic scenes. remains ascientificchallengeforrea- ing problemsincomputervisionand same timeitisoneofthemostchalleng- the weborsurveillancecameras. At the or forindexingimagesandvideosfrom human-computer interactionscenarios, such asroboticsandautomativesafety, key technologyformanyapplications Several keycomponentscontribute We havedevelopedanewapproach Finding andfollowingpeopleisa

...... in everystep. guity of3Dposeestimationeffectively ple stagestherebyreducingtheambi- work, weaccumulateevidenceinmulti- 3D poseestimation.Incontrasttoprior approach enablespeopletrackingand person tracking 3D humanposeestimationandmulti- belief propagation. configuration ofthebodypartsusing a locallikelihoodweinfertheoptimal preting theoutputofeachclassifieras part atagivenimagelocation.Inter- mative forthepresenceofbody which oftheselocalfeaturesareinfor- notated humanposesinordertolearn classifier trainedonthedatasetofan- descriptors. We employaboosting is representedbyasetoflocalimage body. Theappearanceofthebodyparts parts andakinematicmodelofthehuman local appearancemodelsofhumanbody The secondkeycomponentinour Our approachisbasedonlearned

...... hn +496819325-2000 Phone ComputerVision andMultimodal Computing DEPT. 2 Bernt Schiele +496819325-2119 Phone ComputerVision andMultimodalComputing DEPT. 2 Mykhaylo Andriluka CONTACT piiain ::: optimization. minima thatotherwisehindersuccessful and allowstoprunemanyofthelocal nificantly constrains3Dposeestimation the combinationoftheseestimatessig- the sameperson.We demonstratethat and grouphypothesescorrespondingto which allowstodetectocclusionevents estimates bytrackingthemovertime, addition, werefineandimprovethese discriminative appearancemodel.In and viewpointsobtainedwithastrong estimates ofthe2Dbodyconfigurations pose likelihoodisformulatedintermsof contrast totheseapproaches,our3D in adjacentframesofthesequence.In conditional independenceofevidence edge mapsandwhichoftenassume image featuressuchassilhouettesand pose likelihoodsareoftenbasedonsimple beyond priorworkinthisarea,which 3D poseestimation.Ourapproachgoes multi-stage inferenceprocedurefor In particularweproposeanovel

...... bericht_2011_en_korr2 24.11.201116:39UhrSeite31

...... and swans. tively dissimilarclasses,suchashorses components tobere-used,evenforrela- rather thanonappearanceallowsmodel is reduced.Thefocusonobjectshape the numberofrequiredtrainingexamples nents donothavetobelearned.Hence, giraffe model.Re-usedmodelcompo- in ordertobuildanothermodel,e.g.,a of thehorses'bodyparts,canbere-used pected variationbasedonthelocalisation Components ofonemodel,e.g.,theex- UNDERSTANDING IMAGES&VIDEOS Scalability ofObjectClassRecognition [ classes, suchashorsesandgiraffes strained partialshapes.Similarobject objects asanassemblyofspatiallycon- an objectclassmodelthatrepresents Re-using objectclassmodelcomponents of achievingthisgoal. bility. We explorethreedifferentways class models,therebyincreasingscala- training examplesforbuildingobject aims atreducingtheamountofrequired manual annotationiscostly, ourresearch of manuallyannotatedimages.Since tive trainingexamples,oftenintheform sufficiently highnumberofrepresenta- re-liable objectclassmodelsrequiresa remains amajorchallenge:building taneous recognitionofmultipleclasses mance forindividualclasses,thesimul- achieve remarkablerecognitionperfor- While currentstate-of-the-artsystems such asoneparticularredsportscar. as cars,ratherthanspecificinstances, gnition ofentireclassesobjects,such portant. Thefocushereliesonthereco- jects arebecomingmoreandim- niques forthevisualrecognitionofob- digital imagerytoday, automatictech- Figure 1] In thisproject,wearedevising Because oftheomnipresence , sharesimilarrepresentations.

...... [ computer-aided design(CAD) data learning objectclassmodelsfrom3D images altogether, thisprojectaimsat Learning from3DCADdata by humansubjects. ding theobjectclass-attributerelations demonstrated tobeenparwithprovi- mance ofthefullyautomaticsystemis other models.Therecognitionperfor- entirely onre-usedcomponentsfrom training examplesareavailable,relying models, evenforclasseswhichno can thenbeusedtobuildobjectclass Yahoo websearch.Theminedrelations e.g., WordNet, Wikipedia, Flickr, and of linguisticknowledgebasesinclude, linguistic knowledgebases.Examples and attributesbyautomaticallymining We proposetorelateobjectclasses or texture,toentireobjectclassmodels. i.e., distinctvisualpropertieslikecolor nents correspondeithertoattributes, matically. Inthisproject,modelcompo- components havetobeidentifiedauto- model componentsinpractice,re-usable automatically Identifying re-usablecomponents Figure 2] In anattempttoabandontraining In ordertoapplythere-useof . CAD modelsaretypically

...... riigiae.::: training images. compared toapproaches of carsfrommultiple the-art performancefortherecognition Our experimentsdemonstratestate-of- based abstractionofobjectappearance. a images, whichweachieveby training imagesandreal-worldtest in bridgingthegapbetweenrendered rendering. Thekey artificial trainingexamples,through allow forarbitraryview-points,thatmeans they areviewpointindependent, learning ofobjectclassmodels.Since shape, theylendthemselvestothe and detailedrepresentationofobject Since CAD modelsprovideanaccurate ily availableformanyobjectclasses. games, andmoviemaking,areread- employed inproductdesign,computer recognition output Figure 2:Learningfrom3DCADdata,example nenthttp://www.d2.mpi-inf.mpg.de/nlp4vision Internet +496819325-2120 Phone ComputerVision andMultimodal Computing DEPT. 2 Michael Stark +496819325-1206 Phone ComputerVision andMultimodalComputing DEPT. 2 Marcus Rohrbach CONTACT recognition output components, example Figure 1:Re-usingmodel challenge consists viewpoints, even using real-world shape- 31 bericht_2011_en_korr2 24.11.201116:39UhrSeite32 32 Additionally, weincludeaGaussianprior and thatallobjectsaresupported byit. ground isflatinalocalneighborhood Moreover, itcanbeassumedthatthe strained andapproximatelyknown. tively toitssurroundingarea,are con- ground andtheorientation,thatisrela- area, therefore,cameraheightabove be calibratedrelativelytoitssurrounding tion areas.Thecamera,forinstance,can rage domainknowledgeforbothapplica- ning exampleforourwork.We canleve- a vehicleorrobotastestcaseandrun- cars fromamovingcameramountedon use therecognitionofpedestriansand and commercialinterest.Therefore,we systems, areclearlyofmajorscientific plications, suchaspedestrianprotection robots, aswellautomotivesafetyap- Robotics andautomotiveapplications sequences. 3D scenesfromstillimagesandvideo the problemtoautomaticallyunderstand computer visionshouldreinvestigate performance levels,webelievethat sub-tasks startstoachieveremarkable tracking. As theperformanceforthese segmentation, objectdetection,and areas suchascamerageometry, image and hasachievedremarkableresultsin munity hasturnedtoeasiersub-problems these earlyattempts,theresearchcom- and simplescenes.Disappointedby sive goalevenforrelativelyconstrained scene understandingremainedanillu- As aresultanddespiteenormousefforts, power andtheirinherentambiguities. difficult duetotheirlimitedexpressive of suchlow-levelfeaturesprovedvery Unfortunately, thereliableextraction description andsceneunderstanding. in thederivationofacompletescene their orientationasthekeyingredients low-level featuressuchasedgesand sed inabottom-upfashion,startingfrom the earlydaysthisproblemwasaddres- machine visionsinceitsbeginning.In been advocatedasthe“holygrail”for system, visualsceneunderstandinghas goal incomputervision Scene understanding–alongstanding 3D SceneUnderstandingfromMonocularCameras UNDERSTANDING IMAGES&VIDEOS Robotics, suchasmobileservice Inspired bythehumanvisual

...... ling object classdetectors ledge withpowerfulstate-of-the-art combines theabovepriordomainknow- Integrated 3Dscenemodel for asamplescene. cation domainsmoreeasily. Seefigure1 understanding problemforsuchappli- guities andallowustosolvethescene additional constraintstoeliminateambi- heights ofcars.Theseassumptionspose for pedestrianheightsaswellthe videos ofamonocularcamera. on acombinationofpriordomainknowledgewith Figure 1:Oursystemaimstoinfera3Dscenebased results). detections, segmentation,andsystem robust reasoning(seefigure2forsample over severalframestoachieveamore of time,accumulateimageinformation dynamic consistencyoveralongerperiod whereas tracklets,duetogeometricand roads, thesky, orobjectsforeachpixel, labeling inferssemanticclassessuchas cations intheimage,andsemanticscene class detectorsdetermine2Dobjectlo- and thenotionof The systemthatwehavedeveloped , semantic scenelabe- tracklets . Object

...... results ofour3Dsceneestimation Figure 2:2Ddetections,semanticscenelabelingand aeastp ::: camera setup. form similarsystemsthatuseastereo theless, ourapproachisabletooutper- depth fromtheimagedisparity. None- therefore notabletodirectlyextract have usedmonocularcamerasandare Throughout thisresearchprojectwe given bytheobjectclassdetectors. sentation butalsoimprovesthebaseline only allowsustoinfera3Dworldrepre- tion oftheindividualcomponentsnot couraging andshowthatthecombina- geometric context.Theresultsareen- sical exclusionbetweenobjects,and actions likeinter-object occlusion,phy- model isabletorepresentcomplexinter- nenthttp://www.d2.mpi-inf.mpg.de/ Internet +496819325-2000 Phone ComputerVision andMultimodalComputing DEPT. 2 Christian Wojek +496819325-2000 Phone ComputerVision andMultimodalComputing DEPT. 2 Bernt Schiele CONTACT By employing3Dreasoning,our monocular-3d-scene

...... bericht_2011_en_korr2 24.11.2011 16:39 Uhr Seite 33 ...... UNDERSTANDING IMAGES & VIDEOS 33

Image-based 3D Scene Analysis

but these skeletons are limited to mani- pulating the pose. Our approach produces a skeleton that can change the body shape; and, in addition, it produces a combined skeleton that can change both the shape and pose of the body [see figure 2]. This skeleton-based representation has the advantage that current 3D modeling packages or 3D games, which use ske- letons to change poses, can now also change the body shape without adjusting their internal representation.

Production of artificial camera shake Animated camera movements in ...... virtual 3D scenes are frequently criticized because they look too perfect. They are missing the slight camera shake which is observed in true camera recording. In our project, we have developed a tool Figure 1: Top: Input images, Bottom: line-based 3D reconstruction that analyzes the camera shake of real cameras and transfers this onto a virtual camera’s path. A database of various real Line-based 3D reconstruction can be estimated more reliably compared camera recordings was established, and Automatic approaches for 3D recon- to estimates for individual lines. Our camera movements were automatically struction from image sequences usually approach optimizes the 3D position of estimated from the image sequences. In produce, in a first step, merely a 3D the lines and their connections at the addition, the style of the observed camera point cloud. The next step is then to same time, achieving a better 3D con- shake was also stored in the database, generate a surface reconstruction of the struction. for example, whether the camera move- 3D objects. The reliable reconstruction ment took place in a helicopter, on a bi- of 3D lines is thereby an important inte- cycle, on a skateboard, etc. A 3D animator rim step. We have developed an approach Learning skeletons for the manipulation can now produce a virtual camera path that improves the reconstruction of 3D of body shape and pose and then create camera shake in the lines from image sequences. The advan- The project uses a large number desired style. With our approach only tage of our approach is that not every 3D of 3D scans of people with different the spectrum of the camera shake is line is estimated individually, but rather body shapes and in different poses to transferred and not the actual camera the global connectivity constraints be- automatically extract a skeleton. There shake itself. This prevents unwanted tween neighboring lines can be exploited. are already existing approaches that repetitions in the shaking motion of the The 3D positions of connections of lines allow automatic extraction of a skeleton, camera. :::

Figure 2: These skeletons were automatically generated from different 3D scans of people: a) a skeleton that changes body shape CONTACT b) different body shapes with a skeleton that allows to change the person’s pose Thorsten Thormählen c) a combined skeleton that allows to change body DEPT. 4 Computer Graphics shape as well as the pose Phone +49 681 9325-4017 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite34 34 the scene. process requiresnoopticalmarkersin solely frommulti-video for example,adressorballgown, a personincomplexclothing,suchas, mic geometryanddy-namictextureof to reconstructdetailed rithms a newtypeof of apersoniscaptured. time changinggeometrynorthetexture addition, withsuchsystems,neitherthe skin-tight suitwithopticalmarkers.In is measuredmustoftenwearaspecial complex process,andthepersonwho method ofmovementcaptureisavery video imagestreams.Unfortunately, this struct askeletalmodelofpersonusing ture systemsmakeitpossibletorecon- animation fromrealpeople.Motioncap- consists ofmeasuringsomeaspects level ofdetailatruehuman. movement animation,donotachievethe especially withrespecttothequalityof pletely manually-producedanimations, It isthereforeeasytoimaginethatcom- of movementmustbefinelyspecified. in precisedetailwork,andeachnuance metry ofthepersonmustbeconstructed intensive andcomplexprocess.Thegeo- animation program.Thisisaverytime- the overallappearancemanuallyinan this isdefiningeverypartialaspectof listically modeled.Onewaytoachieve surface texture,mustbeextremelyrea- example, hismovement,geometryand racteristics oftheperson,suchas,for in aconvincingway, theindividualcha- order tobeablepresentvirtualhumans or networkedvirtualenvironments.In example, incomputeranimations,films component ofvisualmedia,suchas,for called avatars,havebecomeanimportant models frommulti-videodata Reconstruction ofdetailedanimation Markerless ReconstructionofDynamicScenes UNDERSTANDING IMAGES&VIDEOS In ourresearchwehavedeveloped The alternativetomanualmodeling Computer-generated people,so- . For thefirsttimeitwaspossible Performance Capture Algo- movement, dyna- streams. Our

...... son arbitrarily the movementofreconstructedper- mation modelthatenablesustochange arbitrary clothing,andtolearnanani- estimate themovementsofapersonin the newdevelopmentofaprocessto which wasrecordedwitheightcameras. Right: Right: Figure 1: clothing model. problem. complex andhighlyunderconstrained single cameraperspectiveisanextremely cameras. Motionmeasurementfroma these methodsrequireseveralvideo are thereforenotreal-time.Inaddition, require verycomplexcalculationsand were describedintheprevioussection, for exactdetailedreconstruction,which sensors Real-time motioncapturewithdepth people canbecaptured. ment andgeometryofseveralinteracting process, withwhoseassistance,move- bined segmentationandreconstruction milestone wasthedevelopmentofacom- lation parametersofclothing. A further estimates thephysicalmaterialandsimu- nent ofthisisanewalgorithmwhich The reconstructedgeometric,skeletaland An importantstepforwardwas The Performance CaptureMethods Left: Left: An imagefromamulti-videosequence [Figure 1] . A corecompo-

...... of-Flight depthimage Figure 2:Themovementoftheperson resolution process) calculations ontheoriginaldata and increasingcameraresolutionthrough depth sensorswhicheliminatenoise, developed methodsinordertocalibrate measurement bias.We havetherefore low resolutionandexhibitsasystematic depth cameradataisverynoisy, has single cameraperspective.Unfortunately, construct bodymovementsthanfroma video datatogetheritiseasiertore- in realtime.Byusingbothdepthand cameras as, forexample,so-called of movementsequences of similarposesfromalargedatabase dure withaprocedureforfastfinding a depth-basedposeoptimizationproce- rithm tomeasuremovementcombines camera perspective.Therealtimealgo- person canbemeasuredfromonesingle our research,allbodymovementsofa through anewly-developedprocessfrom nenthttp://www.mpi-inf.mpg.de/resources/perfcap/ Internet +496819325-4028 Phone ComputerGraphics DEPT. 4 Christian Theobalt CONTACT an intensity image) intensity an New typesofdepthcameras,such With improveddepthdataand measure 2.5Dscenegeometry is reconstructedfromtheTime- (center) . in realtime [Figure 2] Time-of-Flight (bottom left as left (bottom . ::: (super- (right) .

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 35 ...... 35 bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 36 ...... 36 RESEARCH AREAS

BIOINFORMATICS

Bioinformatics is a key discipline for progress For about 20 years, bioinformatics afford cell-wide capture of molecular data, has significantly contributed to progress e.g., by measuring the frequency of all in biological sciences, such as biotechnology, in the life sciences. Therefore, the field read off genes (transcription), by captur- is a key ingredient of the revolution of ing the population of the proteins used pharmacy and medicine. With computational biology. Bioinformatics methods and by the cell (proteome), and of their in- methods Bioinformatics deepens and accel- software support researchers in planning teractions (interactome). Generating new experiments, store data which come from insight into the biology of the cell as well erates planning highly complex biological ...... all areas of biology, and assess this data ...... as the basis of diseases and providing ...... with computer support. With the help of new approaches to therapy involves experiments as well as interpreting the Bioinformatics, scientists can elucidate highly complex information processing. resulting very large amounts of data. the molecular processes in cells – the Bioinformatics addresses this challenge. basic units of living organisms – which The Max Planck Institute for Informa- form a complex system for processing tics performs research on many of the matter, energy and information. The challenges mentioned above. genome harbors the construction plan for the cell and the working plan of its Bioinformatics has the hybrid cha- metabolic and regulatory processes in racter of a basic science which poses and a complex, encoded form. In order to follows clear application perspectives. facilitate these cellular processes, parts This unique quality is highlighted by a of the genome must be “read off”. This significant number of spin-offs of bioin- includes the genes, which amount to the formatics research groups. For example, blueprints of proteins, the molecular Professor Lengauer co-founded the com- machinery of the cell. In turn, reading pany BioSolvelT GmbH, which develops off the genes is controlled via complex and distributes software for drug design. molecular networks. For the synthesis of Pharmaceutical companies all over the proteins, and also for their removal, there world are among the users of this soft- are specific molecular complexes which ware. themselves are again subject to exquisite molecular control. The cells convert The Bioinformatics Center Saar, energy, they communicate with cells in whose speaker is Professor Lengauer, their neighborhood, they take on diffe- received top marks for its research in rent structures and forms, and they move. the final evaluation in 2007 of the five They react to changes in their environ- bioinformatics centers that have been ment, for example, pertaining to light, funded by the German Research Found- temperature and pH, and they mount ation (DFG) in the past decade. ::: defenses against invaders. Dysregulation of such processes constitutes the mole- cular basis of disease. Drug therapy aims to restore a tolerable molecular balance.

For the past two decades, classical biological research, which used to be concentrated on cellular sub-systems with limited complexity, has been com- plemented by high-throughput experi- ments. The respective screening methods ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 37 ...... 37 CONTRIBUTIONS

Structure Function Relationships of Proteins ...... 38

Fighting the Hepatitis C Virus ...... 39 ...... Functional Analysis of Medically Relevant Proteins ...... 40

Prediction of HIV Coreceptor Usage ...... 41

The Genetic Foundation of Cancer ...... 42

Regulatory Feedback Networks – How Little

Do We Actually Know? ...... 43

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite38 38 BIOINFORMATICS patterns. interfaces ofproteinstorevealrecurring sites andinteractionpatternsatthe given. (2)Thecomparisonofbinding function whenproteinstructuresare here: (1)Thepredictionofmolecular Informatics. We highlighttwomethods ment 3oftheMaxPlanckInstitutefor proteins areoneresearchfocusinDepart- Methods the largercontextofcellularprocesses? the roleoftheseindividualfunctionsin or lastingissuchabinding? And whatis protein regionsbindtogether?Howstrong Which proteinsbindtogether? therefore relatestoquestionssuchas: such asdrugs.Themolecularstructure with DNA,RNA,orsmallmolecules, structure alsoplaysaroleininteraction the interactionwithotherproteins, a complementarysurface.Inadditionto can bindtoanotherspecificproteinwith mines whetherandhowwellaprotein the structureofproteinsurfacedeter- directly affectsitsfunction.For example, structure. Theprotein’s 3Dstructure into acharacteristicthree-dimensional long chainofaminoacids,whichfolds of therespectiveproteins. A proteinisa nothing isknownyetaboutthefunction 25,000 protein-codinggenesinhumans, However, formanyoftheapproximately on theinteractionofseveralproteins. cells. Mostcellularprocessesarebased blocks oflife Proteins aretheessentialbuilding Structure FunctionRelationshipsofProteins Figure 1:ProteinFunction Prediction:GOdot Structure-function relationshipsof Proteins carryoutmanytasksin

...... online at function. TheGOdotmethodisavailable with knownstructurebutunknown molecular functionofaqueryprotein models. SotheGOdotmethodpredicts structure ismodeledandbasedonthese servation offunctionwithrespectto of molecularfunction.Thelocalcon- able differenceinthelocalconservation protein structurespaceexhibitconsider- proteins areanalyzed.Various regionsin molecular functionsannotatedtothese them tothereferencedomains.The be embeddedintothatspacebyrelating reference proteins.Newproteinscan all-against-all structuralsimilaritiesof space isdefinedintermsofmeasuring protein structurespace.Protein of conservationfunctioninregions Function prediction:GOdot in red). PFAM family'PF00343=phosphorylase' are markedinblueandmembersofthe activity, transferringglycosylgroups' function 'GO:0006757=transferase is visualized,wherebyproteinswiththe local neighborhoodofprotein1gyp:B and familyaffiliations(inFigure1,the colored accordingtomolecularfunctions each other. Theproteinscanthenbe and similarproteinsaredrawncloseto space; eachproteinisplottedasapoint, embedded intothesurroundingprotein the web-servervisualizestargetproteins The GOdotsystemassessesthelevel In additiontopredictionoffunction, http://godot.bioinf.mpi-inf.mpg.de .

...... Figure 2 hn +496819325-3006 Phone ComputationalBiologyand DEPT. 3 Ingolf Sommer CONTACT fiiny ::: efficiency. be investigatedandcomparedwithgreat Using ourtools,suchbindingmodescan developed asaninteractionantagonist. cells. TheCD4M33-F23proteinhasbeen receptors onthesurfaceofhumanhost the HIVgp120proteinmustbindtoCD4 order forHIVtoinfecthumanhostcells, example fromHIVdrugdevelopment.In interaction mimicry. Figure2showsan of the investigationofmolecularbasis actions. Thisisparticularlyrelevantto similar patternsofnon-covalentinter- in identifyinglocalinterfaceregions peptidic compounds.Galinterassistsusers to comparinginterfacesinvolvingnon- mimicry. Themethodisalsoapplicable subunits areinvolved,includingcasesof parison ofinterfacesinwhichhomologous method hasbeenvalidatedinthecom- conserved interfaceregions.TheGalinter interfaces andallowsforrecognizing method successfullyalignsrespective bonds basedontheirgeometry. The der Waals interactionsandhydrogen aligns thevectorrepresentationsofvan protein-protein interfaces.Themethod covalent interactionsbetweendifferent Galinter isamethodforaligningnon- occurring atprotein-proteininterfaces. geometry ofnon-covalentinteractions available fordirectlycomparingthe sites, priortoGalinter, nomethodswere protein structuresandbinding there aremanymethodsforcomparing of interactionbetweenproteins.While the understandingofmechanisms tein-protein interfacesisessentialfor interactions: Galinter Characterization ofprotein-protein The studyandcomparisonofpro- Applied Algorithmics with

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 39 ...... BIOINFORMATICS 39

Fighting the Hepatitis C Virus

Infectious disease with world-wide University Hospital of Heidelberg since spread the experimental detection methods More than two hundred million used for this purpose require special people worldwide are affected by the bioinformatics methods to evaluate and infectious disease hepatitis C – most interpret the extensive measurement of them without even knowing it. The results [Figure 2]. In this way, medical bio- disease is triggered by the hepatitis C informatics contributes to the discovery virus (HCV), which was only discovered Figure 1: Viral sequence fragments (in red and blue) of potential host factors and their mole- in 1989. At first the disease seems to be and their mapping to the genome of the hepatitis C cular interactions in the human cell virus (on the top) inconspicuous, but without medication as well as to the identification of new it progressively damages the liver, eventu- This information gives doctors important drug targets. ::: ally provoking liver cancer and oftentimes clues about the course of the viral infect- leading to death. Like HIV and many ion and also aids pharmaceutical research other viruses, HCV is extremely adaptable in developing personalized therapy and and rapidly disguises its appearance by more effective drugs. changing its genome sequence. As a re- ...... sult, antiviral medicine becomes ineffec- tive, and to date there is still no vaccine Analysis of human host factors against HCV. At the moment, currently An alternative approach to fighting approved drugs do not work effectively HCV consists of drugs that do not target for fifty percent of all HCV infected viral molecules, but the molecular factors patients in Europe. Therefore, the bio- of the human host that are essential informatics research at the Max Planck for the virus. The virus needs such host Institute for Informatics supports lead- factors in order to replicate in human ing clinical research groups in the search cells. Thus, these factors are promising for better drugs against HCV. To fight targets for novel drugs as the virus can- the virus, the scientists use two com- not escape the drug treatment by chang- plementary approaches. ing its genome. To determine the host Figure 2: Schematic drawing of human host factors for factors for HCV, researchers at the Max the hepatitis C virus and their interactions in different Planck Institute help virologists at the molecular processes of the human cell Analysis of viral sequence changes HCV reacts to drugs that block its replication cycle by changing its genome. For this reason, a first approach for the bioinformatics support of HCV therapy is the computational analysis of the viral genomes found in patients. In this ana- lysis, sequence changes that allow the virus to resist the latest antiviral drugs CONTACT are identified. In cooperation with the Mario Albrecht University Hospital of Frankfurt, the se- DEPT. 3 Computational Biology and quences of viruses in patients are deter- Applied Algorithmics mined at different time points during Phone +49 681 9325-3000 drug therapy. Since the next-generation sequencing technology produces huge volumes of sequence data with high Hagen Blankenburg accuracy, millions of sequence fragments DEPT. 3 Computational Biology and Applied Algorithmics have to be mapped to the corresponding Phone +49 681 9325-3000 positions in the viral genome [Figure 1]. This enormous task can only be solved by powerful computers and requires the Sven-Eric Schelhorn development and application of sophisti- DEPT. 3 Computational Biology and cated algorithms. Subsequently, statistical Applied Algorithmics methods are used to identify changes in Phone +49 681 9325-3000 the viral genomes that render the hepa- ...... titis C virus insensitive to antiviral drugs...... Internet http://medbioinf.mpi-inf.mpg.de bericht_2011_en_korr2 24.11.201116:40UhrSeite40 40 improve theunderstanding oftheexact the datawithcomputational methodsto genome. Bioinformaticiansthen analyze are discoveredintheDNAof human control group,disease-associated SNPs By comparingpatientswithahealthy ill personsforgeneticcommonalities. matically searchtheDNAofchronically at theUniversityHospitalofKiel,syste- those SNPs ofinflammatoryboweldisease cell anemiaandcysticfibrosis. diseases. Well-known examplesaresickle biological processcanevenprovoke function oftheproteinandinvolved actions ofproteins.Thetherebyimpaired of theintra-andintermolecularinter- protein andresultinsignificantchanges the exchangeofanaminoacidin like SNPs. A pointmutationcanleadto caused, forexample,bypointmutations interactions. Suchmodificationscanbe of theproteinstructurecandisturbthese interactions betweenproteins,andchanges Most m interactions withintheproteinstructure. structure, whichisformedbyatomic have a cular buildingblocksofproteins.Proteins the orderofaminoacids,mole- the blueprintofaproteinasitdetermines Protein structureandfunction by disease-relevantSNPs. the molecularchangesthatarecaused research isthusparticularlyinterestedin tolerability andeffectofdrugs.Medical Additionally, SNPscaninfluencethe to diseasesandtheseverityoftheircourse. and areresponsibleforthesusceptibility appear moreofteninparticularsubgroups tion studiesrevealthatcertainSNPs than othersequencevariations.Popula- code thatoccurwithgreaterprobability changes ofsinglelettersinthegenetic are pointmutationsintheDNA,thatis, nucleotide polymorphisms The mostfrequentchangesare through individualgeneticchanges. Genetic variationwithDNA Functional AnalysisofMedicallyRelevantProteins BIOINFORMATICS Clinical associationstudies,suchas The DNAsequenceofageneis People differfromeachother carried outbyourmedicalpartners characteristic three-dimensional olecular processesoccurdueto (SNPs). They single

...... receptor IL17REL isayetlittle-exploredcell lished inanassociationstudy. Theprotein in thegene link betweenthediseaseandvariations naturally occurringbacteria.Recently, a intestinal wallreactsoversensitivelytothe system isdisturbedintheboweland that thebarrierfunctionofimmune are stilllargelyunknown.Itissuspected accumulation, whosemolecularcauses chronic bowelinflammationwithfamilial of suchadiseaseisulcerativecolitis, course ofchronicdiseases.Oneexample and theirSNPsinthedevelopment function ofrelevanthumanproteins this position protein sequenceofIL17RELatposition333.Theexchangetheaminoacidleucine(L)withproline(P) proteins inhumanandothermammals.Oneofthetwodisease-associatedSNPsisfoundatend The figureshowspartoftheverysimilarproteinsequenceshumanIL17RELandcloselyrelated ceptor effect thebindingsitesofcellre- found thatbothpointmutationscan by usingbioinformaticsmethods.Itwas protein weremorecloselyinvestigated structure andfunctionoftheIL17REL Computer analysisoftheproteinIL17REL tant signalingpathwaysofinflammations. system bindandtherebyenableimpor- chemical messengersoftheimmune Possible effectsoftwoSNPsonthe [see Figure] [see Figure] (brown) IL17REL can impactthefunctionofIL17RELstrongly. . Thus,thissuggests , ontowhichcertain could beestab-

...... hn +496819325-3000 Phone ComputationalBiologyand DEPT. 3 Gabriele Mayr http://medbioinf.mpi-inf.mpg.de Internet +496819325-3000 Phone ComputationalBiologyand DEPT. 3 Mario Albrecht CONTACT findings onamolecularle interpret andbetterunderstandgenetic supports clinicalresearchandhelpsto Medical bioinformatics colitis. treatment ofpatientswithulcerative IL17REL isnowconsideredasapossible this reason,thetherapeuticinfluenceof the emergenceofchronicdiseases.For ways canbeaffected,whichpromotes sequently, inflammatorysignalingpath- the bindingchemicalmessengers.Con- the interactionbetweenIL17RELand that thesesequencevariationsdisturb development ofdrugstocombatthem.::: of moleculardiseasecausesandthe tional methodsacceleratethe In thisway, medicalbioinformatics Applied Algorithmics Applied Algorithmics vel. Computa- discovery

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 41 ...... BIOINFORMATICS 41

Prediction of HIV Coreceptor Usage

The immunodeficiency disease be detected. Based on different viral AIDS is one of the greatest global chal- sequences, the server produces a quan- lenges, responsible for millions of deaths titative and qualitative forecast of core- per year. Despite more than two dozen ceptor use for the entire viral population. different AIDS drugs, there is still no cure in sight. Further approaches In recent years, we have researched Viral tropism and coreceptor inhibitors additional approaches to further improve The HIV virus requires two target our server’s forecasting quality. A focus molecules to enter into a cell: the has been on the analysis of further gene Prediction of coreceptor use through CD4 receptor, which is mostly found geno2pheno[coreceptor] sections of viral envelope proteins. While on immune cells, as well as a so-called up to now only the short V3 loop has coreceptor. This can be either CCR5 or been used in the analysis, we have been CXCR4. Not every virus can use these geno2pheno[coreceptor] and able to show that prediction with the V2 two receptors. There are viruses which geno2pheno[454] loop can significantly improve it and that ...... only use CCR5 (R5 viruses), while other ...... geno2pheno[coreceptor] is a web ...... the gp41 envelope protein also carries viruses can only infect through CXCR4 server which determines coreceptor use, information about coreceptor use. (X4 viruses). Viruses which are capable developed in Department 3 of the Max of using both receptors are called dual Planck Institute for Informatics. The In addition, we showed in an inter- tropic. The differentiation of viral types system predicts the tropism of a viral national cooperation that the prediction is also called viral tropism. population using the sequence of the V3 of geno2pheno[coreceptor] is as good loop of the viral envelope of the protein as the phenotypic reference test, if the In recent years, a new drug class gp120. Additional information about the patient's response to coreceptor anta- has found its way into everyday clinical immune status of the patients can be gonist therapy is used as a criterion. ::: practice, the so-called coreceptor antago- considered in the prediction, enabling nists. Maraviroc (Celsentri, ViiV Health- individual cases to be investigated in care), the first licensed active ingredient further detail. in this class, selectively binds to the CCR5 receptor, thus preventing entry Since its start in June 2004, the into the cell by R5 viruses. This does web server has been used more than not, however, affect dual tropic or X4 240,000 times. It is recommended in the viruses. Before drugs are administered, German-Austrian AIDS guidelines and the tropism of the viral population must in the European treatment guidelines therefore be determined and the pres- together with a commercial phenotypic ence of viruses that can use CXCR4 test for coreceptor determination and is must be ruled out. thus routinely used in most European laboratories.

Determination of coreceptor use geno2pheno[454] represents an ex- In order to determine the tropism tension of geno2pheno[coreceptor] which of a viral population, there is a pheno- allows the analysis of data with modern typic and a genotypic approach. In the 454-sequencing technology. Thanks to this phenotypic approach, the viral popula- technology, it is now possible to analyze tion is added to cell cultures which, in and sequence hundreds of thousands of addition to CD4, express only the CCR5 different viruses at the same time from receptor or only the CXCR4 receptor on the same blood sample. Minorities of the cell. Observing which cells the virus X4 viruses that could possibly lead to a infects allows coreceptor use to be deter- failure of a coreceptor therapy can thus mined. With a genotypic approach, the

viral genome is first sequenced and then CONTACT used for estimating the tropism with Alexander Thielen computer models. Modern statistical DEPT. 3 Computational Biology and learning methods are applied for this. Applied Algorithmics Phone +49 681 9325-3008 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite42 42 The GeneticFoundationofCancer BIOINFORMATICS between earlyand to certaintreatments,orthatdiscriminate patterns thataresusceptibleorresistant models canbeusedtoidentifygenomic genome ofhis/hertumorcells.Similar copy numberaberrationsobservedinthe between thesurvivalofapatientand models thatestablishastatisticalrelation and theirtumorsforestimatingstatistical plan tousedataonmonitoredpatients genome-scale CNAs.For example,we ted predictionofcancerphenotypesfrom cifically, wedevelopmodelsforautoma- from setsofobservedtumors.Morespe- the roleofCNAsincancerprogression work, weusestatisticaltoolsforinferring available formosttypesofcancer. Inour measurements ofCNAshavebecome quencing, largecohortsofgenome-scale microarrays andhigh-throughputse- advances inmolecularbiologysuchas e.g., uncontrolledproliferation. which leadstoaberrantcellbehavior, genes) iseitherinhibitedorpromoted, mic sequenceslocatedwithinCNAs(e.g., As aconsequence,thefunctionofgeno- their genomeduringdiseaseprogression. cerous cellsaccumulatetensofCNAsin number aberrations(CNAs).Mostcan- copies). Thesesegmentsarecalledcopy or insmallercopynumber(onezero ber (rangingfromthreetotensofcopies) cells maybepresentinhighercopynum- some segmentsoftheDNAcancerous the DNAispresentinexactlytwocopies, While inhealthysomaticcellsmostof alteration, namely we studyaspecifictypeofstructural lar death(apoptosis).Inourresearch, plication andpreventprogrammedcellu- tural alterations,whichpromotecellre- accumulates manymutationsandstruc- a cancerousstate,thegenomeofcell Due totherecenttechnological In thecourseofprogressiontowards copy numberaberrations advanced tumors...... to 85%predictionaccuracy).We also tions, withrelativelyhighaccuracies(up survival fromDNAcopynumberaberra- we areabletopredicttumorstageand lung cancer. Inthecaseofneuroblastoma, two typesofcancer:neuroblastomaand biomarkers. come promisingcandidatesfordiagnostic with whichtheyenterthemodelandbe- ted CNAscanberankedbytheweights relevant CNAsforprediction.Theselec- which relyonasmallsubsetofhighly We chooserobustclassificationmodels significant lossofrelevantinformation. to achievedimensionreductionwithno genomic featuresintoclusters,suchas We tacklethisproblembygroupingthe This phenomenoniscalledovertraining. at thecostofgeneralizationpower. of thefewavailabletrainingexamples process onlygraspstheidiosyncrasies great chancethatthestatisticallearning of observationsavailable,thatthereisa ion issolargecomparedtothenumber play asignificantroleindiseaseprogress- nomic featuresthatcouldpotentially data isrobustness.Thenumberofge- is trainedonhigh-throughputgenomics requirement foranypredictionmodelthat data miningtask.Themostimportant which addsdifficultytothestatistical of thecopynumbermeasurements, in dealingwiththehighdimensionality In ourapplicationsweinvestigate One challengeofourendeavorlies

...... process of The Figureillustrates bevto.::: observation. genetic explanationofthisempirical show thatthereexistsanunderlying in neuroblastoma.Ourresultstherefore to beaveryimportantprognosisfactor cal observations,agehasbeenknown older patients.Basedonrepeatedclini- of veryyoungpatientsandtumors that candiscriminatebetweenthetumors identify copynumberaberrationpatterns hn +496819325-3026 Phone ComputationalBiologyand DEPT. 3 Laura Tolosi CONTACT Applied Algorithmics cancer treatment the role (step III of the process) the of III (step of bioinformaticsinthe ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite43

...... An illustrationofhowlittleweknow:Theknown regulatory networkofBacillussubtilis. BIOINFORMATICS (blue) present. Theunderstandingofthese produced inacellifsugarisactually ensure, forexample,thatinsulinisonly of proteinsinanylivingorganism.They of theDNAandregulateproduction tion factorproteinsbindtocertainareas teins withtheDNA.So-calledtranscrip- from complexinteractionpatternsofpro- Gene regulatorynetwork this isaquiteastonishingresult. gene regulatoryinteractionsareknown; organisms atmostaboutonethirdofall even forthebest-researchedmodel Our statisticalsimulationsshowthat the areaofgeneregulatorynetworks. and anincreasedneedforresearchin tic estimations,whichmustbecorrected, environmental conditions. adjust theirgeneticprogramstovariable fine-tune theabilityoflivingcellsto molecular mechanismsthattriggerand stages. We knowverylittleaboutthe that thisresearchareaisstillinitsearly we candemonstrateveryimpressively regulation. Byusingstatisticalmodels, tant areasofmoderngenetics:gene of knowledgeinonethemostimpor- A soberingpicture Regulatory FeedbackNetworks–HowLittleDoWe ActuallyKnow? vs. theunknown Gene regulatorynetworksemerge Our workdealswithformeroptimis- We haveanalyzedthecurrentstate (orange) part ofthegene

...... results. provide scientificallysoundandcredible intervals weredeterminedinorderto veloped model.Inaddition,confidence methods wereusedtovalidatethede- tes to whichonecouldcomparetheestima- not asingleknown,completenetwork validation oftheresultssincethereis actions. Themainproblemwasthe whole-organism generegulatoryinter- we estimatedtheexpectednumberof construction ofstatisticalmodels,then different regulatorynetworksforthe know? How doweestimatehow rather, howlittle. about thesefundamentalnetworks,or mates ofhowmuchwealreadyknow We providestatisticallysignificantesti- answered completlyforasinglespecies. interplay betweenthegenes,cannotbe up questions,thequestionabout that oneofthemostimportantfollow- sequenced organisms,itisastounding behavior ofalllivingorganisms. able foradeeperunderstandingofthe feed-back regulationnetworksisinevit- today. We utilizedtheknownpartsof Despite wellover1,500completely Therefore, complexstatistical little we

...... latory relationships. further methodstodeterminegeneregu- portance ofaconstantdevelopment What doesthismeanforus? worse; hereweknowlessthan1%. such ashumans,thesituationiseven an illustration).For complexmammals regulatory relationships(seeFigure1for most aquarterofitstranscriptionalgene studied livingorganismandweknowat subtilis, forexample,isoneofthebest knowledge base.ThebacteriumBacillus our resultsdemonstrateverylimited different organisms. As explainedabove, h eeomn fdus ::: the developmentofdrugs. for personalizedmedicineaswell understand 99%.Thisisagreatproblem also means of thehumangeneticinteractions,this can show, thatwe have urgentmedicalrelevance.Ifwe bute toscientificknowledge,butalso These arenotonlyissuesthatcontri- stand established methodsinordertounder- tivate consistentfurtherresearchwith nenthttp://csb.mpi-inf.mpg.de Internet +49681302-70885 Phone ComputationalBiologyand DEPT. 3 Jan Baumbach CONTACT Our workdemonstratestheim- Then, themodelwasappliedto the complexinterplayofgenes. Applied Algorithmics conversely, thatwedonot know lessthan1% The resultsalsomo- 43 bericht_2011_en_korr2 24.11.201116:40UhrSeite44 44 ment researchquestionattheinstitute. performance guaranteesisacross-depart- is nothelpful.Thesearchforcorrectnessand A correctanswerthatisnotreceivedintime as important,however, isperformance: criterion forreliabilityiscorrectness.Nearly Software mustbereliable.Themostimportant GUARANTEES RESEARCH AREAS

...... ness inOptimization” pursuing differentgoals. dently butalsoreacttoeachotherwhile programs inwhichpartsworkindepen- only asingleprogram,butsystemof more complexwhenweexaminenot result. Thegrantingofguaranteesiseven for yearscansuddenlygenerateawrong and softwarewhichhasworkedreliably infrequent butpossiblebordercase, developers maynothaveconsideredan crash ortodeliverwrongresults.Or, the lations cancausetheoverallprogramto a smallchangeintheprocessor’s calcu- ral hundredthousandlinesofcode,or error inonelineofaprogramwithseve- A singlechangeinaprogram,minor hardware andsoftwarearenotrobust. is especiallydifficultandcomplexas justified. Theanswertothisquestion trust thatweplaceintheseproductsis more weposethequestionwhether dent uponhardwareandsoftware,the machine. Themoreourlivesaredepen- controls inthecar, airplaneorwashing times unknowingly, suchastheelectronic such asacomputeronthedesk,some- stantly usethem,sometimesknowingly, a ubiquitouspartofourlives.We con- microprocessor controlledsystemsare profit. of well-payingcustomerstooptimize would ideallyprioritizethedatatransfer ing allInternettraffic,althoughthey committed, ontheonehand,toforward- our currentinternetproviderswhohave is investigated. An exampleofthisare In thearticle Today, computers,networks,and “Dealing withSelfish- , thelattersituation

...... for HybridSystems” sented inthearticle based onmodelassumptions,asispre- case, itmakessensetouseprocedures an acceptableperiodoftime.Insucha of anarbitrarycomplexprogramwithin possible togivealldesiredguarantees Ultimately, itiscurrently notgenerally “Modular ProvinginComplexTheories” This aspectisdiscussedinthearticle verifications canthusbereduced. proof obligations.Thecomplexityof to properlydivideandconquerresulting “Automated Deduction” this, whicharediscussedinthearticle verification proceduresarerequiredfor gram intheformofproofs. Automatic properties tobederivedfromthepro- must beusedthatenablethedesired programs ingeneral,deductivemethods situations. required how thesubstantialnecessaryeffort in ageometriccontext,andsecondly, what isneededforaccuratecalculations Difficult SpecialSituations” “Controlled Perturbation –Howto Avoid Algorithms –ExactandEfficient” its servicelife.Thearticles“ of anairplanetobecomejammedduring error caused,forexample,therudder It wouldbeundesirableifacalculation airplane, isdesignedonacomputer. every largermachine,suchasacaror are especiallyimportantastodaynearly In ordertoprovideguaranteesof Guarantees ofgeometriccalculations for thiscanbeavoidedinmany . ::: “Model Checking , asaretechniques show firstly, Geometric and ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 45 ...... 45 CONTRIBUTIONS

Dealing with Selfishness in Optimization ...... 46

Geometric Algorithms – Exact and Efficient ...... 47 ...... Controlled Perturbation – How to Avoid Difficult

Special Situations ...... 48

Automated Deduction ...... 49

Model Checking for Hybrid Systems ...... 50

Modular Proving in Complex Theories ...... 51

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite46 46 storage. then boundedbythesizeofsmallest mum possiblecapacity:thiscapacityis consisting ofparallelstoragesmaxi- we wanttocreateabackupmedium hard drivesofvariouscapacities,and the scenariowherejobsrepresent an allocationisforinstancerelevantin measure intheareaoffairdivision.Such job allocation. optimizing mum loadonselfishmachines,i.e., ied inthisareaismaximizingthemini- is adominantstrategyfortheagents. mechanisms thatensuretruth-telling zation goal.Hence,weneedtodesign quite differentfromtheoveralloptimi- mizing theirownutility, whichmaybe The agentsareonlyinterestedinmaxi- agent knowshowfasthismachineis. trolled byselfishagents,andonlyeach problems wherethemachinesarecon- in someway. For example,scheduling zation problemsthatinvolveselfishness Dealing withSelfishnessinOptimization GUARANTEES A problemthatIhaverecentlystud- In myresearch,Ifocusonoptimi- the max-minfairnessof This isastandardfairness

...... ratio. algorithm withaconstantapproximation signed thefirstmonotoneapproximation and Annamaria Kovacs,Ihaverecentlyde- however). not necessarilyhavetoremainthesame, increase (thesetofassignedjobsdoes jobs assignedbythemechanismmaynot one unitofwork),thetotalsize its bid(i.e.,claimedcostforrunning tone meansthatifoneagentincreases dominant strategyfortheagents.Mono- monotone, thentellingthetruthisa allocation donebythemechanismis jective function.Itisknownthatifthe otherwise wecannotoptimizeanyob- true for itself.We wanttheagentstogivetheir on itsspeed,whichonlyeachagentknows the jobsassignedtoit.Thiscostdepends minus thecostthatithasforprocessing off thatitreceivesfromthemechanism speeds asinputtothemechanism; Together withGiorgosChristodoulou The utilityofanagentisthepay-

...... about on rings.Ongeneralgraphsmuchisknown work onis“selfishrouting”,inpar narn) ::: on aring). rings (e.g.,theGermannetworkisbased Europe infactconsistsofinterlocking relevant, however, astheInternetin been obtained).Thisstructureisvery detail (thoughsomeinitialresults have ture hasnotyetbeenstudiedin hn +496819325-1005 Phone AlgorithmsandComplexity DEPT. 1 Rob vanStee CONTACT An areathatIamjuststartingto selfish routing,butthe ring struc- much ticular

...... bericht_2011_en_korr2 24.11.201116:40UhrSeite47

...... GUARANTEES Geometric Algorithms–ExactandEfficient are computationallyintensive.For the jects, andthecorrespondingoperations cessed tohandlecurvedgeometric ob- systems mustbefoundandfurther pro- is thatsolutionstogeneralpolynomial computation times.Thereasonforthis exactness withasignificantincreaseof must tradeoffthecompletenessand increases and,whatismuchworse,one Firstly, thecomplexityof algorithm along withaseriesofdisadvantage. algorithms withthispropertyoftencome can be in suchawaythatallpossibleinputs veloped aredesignedandimplemented fined bythegivengeometricobjects. sition ofaplaneorspaceimplicitlyde- an explicitdescriptionofthedecompo- tion ofso-calledarrangements,thatis, operations areessentialforthecomputa- and surfacesofhigherdegree.These ellipses, planes,spheres,ellipsoids,curves, objects suchasstraightlines,circles, (intersection, union,etc.)ofgeometric tions mainlyincludeBooleanoperations methods aredeveloped.Suchbasicopera- which respectivespecializedsolution deal withanalgorithmicframeworkupon mentioned problem. provide possible Planck InstituteforInformaticsaimsto metric Computing”groupattheMax programs canbetheresult. results, crashes,ornon-terminating algorithm duetoroundingerrors.False arising fromwrongdecisionwithinthe metric algorithmsareimplemented, Difficulties frequentlyoccurwhengeo- visualize ofengineeringcompo and methods aredevelopedtodesign CAD (Computer Aided Design),where Another importantapplicationareais staircase thenexttimeyoumovehouse. piano wouldfitthroughthenarrow culations todecidewhetherornotthe which involves,simplyput,usingcal- the so-called“pianomover’s problem”, everyday problems. A classicexampleis portant roleforthesolutionofmany The algorithmsthatwehavede- The “Computer Algebra andGeo- Geometric algorithmsplayanim- exactly processed.Unfortunately, solutions fortheabove- Here, weprimarily nents.

...... Arrangement calculationofplanealgebraiccurves many smallerunitswithsignificantly the calculationsarebrokendowninto avoidable symboliccalculations.Here, efficient methodsforcarryingoutun- is thereforethedevelopmentofmore of ourmethods. severely restrictsthepracticalapplication steps isconsiderable,however, which tional effortforthesymboliccalculation of highlycurvedsurfaces,thecomputa- fastest intheworld.For theinvestigation which wehavedevelopedareamongthe the bestcomparablemethods,those algorithmic geometry. Measuredagainst efficient (combinatorial)methodsfrom methods fromcomputeralgebrawith and surfaces.We havecombinedsymbolic for theinvestigationofgeneralcurves we havebeenabletodevelopmethods presumption asgiven. tical justificationinnotseeingsucha however, thatthereiscertainlyaprac- input ispresupposed.Examplesshow, correct resultsifa“sufficientlybenign” Such methods, pletely relyupon latter reason, The focusofourlatestresearch In thefirstyearsofourresearch, commercial systemscom- however, onlyprovide numerical methods.

...... arrangement computationofcurves.::: for morecomplexalgorithmssuchasthe finding andapproximationofzeroes both forbasicoperationssuchasthe to matchorevenimproveexistingbounds, case ofacertainsize.We havemanaged is requiredfortheprocessingofanyinput bound forthenumberofoperationsthat means thatwewanttospecifyanupper retical complexityofourmethods.This their correctness. previous methodsandtofinallyensure find thesolutionsmuchfasterthanwith Following thisapproach,weareableto for post-certificationthefirsttime. numeric methodswithexactprocedures nomial equations,wehavecombined leration. Whensolvingsystemsofpoly- it, theyalsofullybenefitfromtheacce- the morehigh-levelalgorithmsbasedon hundredfold. After wehaveadapted greatest commondivisor, etc.)bya tant symboliccalculations(resultants, we havebeenabletoaccelerateimpor- found, forexample,ongraphicscards, parallel hardwarearchitecture,asitis a completesolution.Usingmassively sults areultimatelyreassembledtoform lower complexity, andtheindividualre- nenthttp://www.mpi-inf.mpg.de/ Internet +496819325-1006 Phone AlgorithmsandComplexity DEPT. 1 Michael Sagraloff CONTACT We arealsointerestedinthetheo- departments/d1/areas/GCCA.html 47 bericht_2011_en_korr2 24.11.201116:40UhrSeite48 48 Controlled Perturbation–HowtoAvoid DifficultSpecialSituations GUARANTEES program. results, crashes,ortoanon-terminating binatorially. Bothcan lead tofalseend they cannothandlethespecialcasecom- errors intheapproximatearithmetic,or do actuallyintersect)duetorounding circles donotintersect”eventhoughthey false statementsinthesub-steps(“The intersect inonepoint.Eithertheygive aforementioned onewherethecircles able todealwithspecialcaseslikethe the numberofcells. or scalingofthecirclesdoesnotchange this isduetothefactthataslightshift the sameresult.Inmoregeometricterms, in thecalculation,andeventuallyleadto the lattercase,smallerrorsareallowed the solutionismucheasier. Namely, in les howeverdonotshareacommonpoint, plicates theoverallalgorithm.Ifcirc- be regardedasaspecialcase,whichcom- In addition,itmustalsocombinatorially ted byanexactalgebraiccomputation. the mostdifficult.Itmayinfactbetrea- common intersectionpoint,isthereby special casethatthecircleshaveone circles meetinexactlyonepoint.The that youcandecidewhetherthethree quires atacertainstepinthealgorithm posed. Thesolutionofthisproblemre- of cellsintowhichtheplaneisdecom- circles inaplane,computethenumber using thefollowingexample:Giventhree (e.g. polynomial). the signofaso-calledpredicatefunction is normallyequivalenttodetermining made underguarantee.Suchadecision such thatalloccurringdecisionscanbe operations withintheprocessinaway is toformulateandimplementall problems. Thesolutionpre methods forthetreatmentofgeometric address theneedforexactandreliable rithms –exactandefficient” geometric algorithms Many idealizedalgorithmsarenot We wouldliketomakethisclear In thearticleonexactandefficient “Geometric algo- sented there , we already we , basic

...... cision ofthefloatingpointcalculations. the inputagainandincreasespre- approximate arithmetic),oneperturbs failure (adecisioncannotbemadewith to handlespecialcases.Intheeventof for theexactinput)withoutneed ly perturbedinput algorithm givesanexactoutputforaslight- account. Ifthisapproachsucceeds,the possible roundingerrorsaretakeninto fast floatingpointarithmetic.Inthiscase, sulting decisionsusinginexactbutvery algorithm itisnowtriedtomakeallre- do notintersectinonepoint.Within the means that,withhighprobabilitythey the circlesarefirstshiftedslightly, which In ourexampleabove,thismeansthat that difficultspecialcasesareexcluded. ing ofanumbergeometricobjects,so is toslightlyperturbtheinputs,consist- algorithm tomakeallinputsrun.Thetrick scribes amethodwhichusesanidealized small errorsleadtothesameresult,whilespecialcaserequiresexactcalculation. A specialcasewhichcanbegenericallymadethroughperturbation.Inagenericcase, Controlled Perturbation (CP)de- (but notnecessarily

...... rcia td fnnierpolm.::: practical studyofnonlinearproblems. to extendCPforthefirsttime with theUniversityofTel Aviv, weaim polynomial predicates.Working together a generalanalysisforthetreatmentof for avarietyofalgorithmsandtoprovide managed toprovethefeasibilityofCP probability ofsuccesstheprocess.We metic, sizeoftheperturbation)and (e.g., precisionoffloatingpointarith- relationship betweenCPparameters ral. Inaddition,weareinterestedinthe way CPisusefulforalgorithmsingene- group, weaddressedthequestioninwhich ter Algebra andGeometricComputing” ments andcircles.Within the“Compu- methods forthetreatmentoflines,seg- successfully usedwithanumberof of Tel Aviv and,sincethen,hasbeen the 1990’s byscientistsattheUniversity nenthttp://www.mpi-inf.mpg.de/ Internet +496819325-1006 Phone AlgorithmsandComplexity DEPT. 1 Michael Sagraloff CONTACT CP wasintroducedattheendof departments/d1/areas/GCCA.html

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 49 ...... GUARANTEES 49

Automated Deduction

In order to guarantee that a piece of errors, but we are not getting any nearer In our research group, we currently hardware or software is working correctly, to our goal either. The actual challenge focus on refining the general superposi- it must be verified – i.e., its correctness is thus to find the few useful derivations tion method for special applications. For must be proven formally. Proving correct- among infinitely many correct derivations. instance, we are developing techniques ness means that one has to check whether One quickly comes to the conclusion for combining the capabilities of various certain properties of the system follow that it is useful to apply equations in proof procedures (for instance, super- from other properties of the system that such a way that the result is simplified, position and arithmetic decision proce- are already known. The question how to e.g., “x + 0 = x” only from left to right dures). We are addressing the question use computer programs for such proof and not the other way round. of how superposition can be used to tasks has been an important research process even very large quantities of topic for many decades. Ever since the data, such as those that occur in the fundamental theoretic results of Gödel analysis of ontologies (knowledge bases). and Turing at the beginning of the twenti- We also use superposition to verify eth century, it is known that not every- network protocols and to analyze pro- thing that is true in a mathematical babilistic systems (i.e., systems whose sense is actually provable, and that not behavior is partially dependent on ran- ...... everything that is provable is automatically dom decisions). ::: provable. Correspondingly, deduction systems significantly differ in their ex- pressiveness and properties. For example, decision procedures are specialized for a certain type of data (e.g., real numbers) Application of equations and are guaranteed to detect the correct- ness or incorrectness of a statement from However, this approach is not this area. Automatic provers for so-called always sufficient. A typical example is first-order logic can handle any data types fractional arithmetic: We all know that defined in a program. For these provers it is occasionally necessary to expand a one can only guarantee, though, that they fraction before we can continue calculat- eventually find a proof if a proof exists; ing with it. Expansion, however, causes if none exists, the prover may continue exactly what we are actually trying to searching without success, possibly for- avoid: The equation “(x · z)/(y · z) = x/y” ever. Even more complicated problems is applied from right to left – a simple can be handled using interactive provers; expression is converted into a more com- these provers, however, only work with plicated one. The superposition calculus user assistance and without any guaran- developed by Bachmair and Ganzinger tee of completeness. in 1990 offers a way out of this dilemma. On the one hand, it performs calculations How does a theorem prover for in a forward direction; on the other hand, first-order logic work? It is not difficult it systematically identifies and repairs to write a program that correctly derives the possible problematic cases in a set of new formulas from given formulas. A formulas, where backward computation logically correct derivation is, however, could be inevitable. Superposition is thus not necessarily a useful derivation. For the foundation of almost all theorem example, if we convert 2 · a + 3 · a to provers for first-order logic with equality, 2· a + 3 · a + 0 and then to 2 · a + 3 · a + including the prover SPASS that we have 0 + 0, we do not make any computation developed at the institute.

CONTACT

Uwe Waldmann RG. 1 Automation of Logic Phone +49 681 9325-2905 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite50 50 GUARANTEES Example Model CheckingforHybridSystems machine is door open.”Ifwewanttoverifythatthe desir possible statesthatareobviouslyun- a buttonispressed.Thereare,however, through othereffects,forexamplewhen between thewashinganddrainage,or place automatically, suchasthechange one statetoanother. Thesecantake ing.” Statetransitionsarechangesfrom machine turnedoff”orthestate“wash- starting state“dooropen,drumempty, ing machinecan,forexample,beinthe their statesandstatetransitions. A wash- devices canbedescribedbyspecifying there isonlyonesuchstate: which anarrowpointsto decessor statesof state and may lookasfollows: state transitionbyanarrow. Theresult each statebyacircleandpossible using We candescribestatesandstatetransitions the startingstatetosuchan there arenostatetransitionsleadingfrom able, forexample:“waterinletopen, The behaviorofmanytechnical Let usassumethat How canthisproblembeapproached? graphs L safe is unsafe.Whatarethepre- . To thisend,wesymbolize , thenwemustshowthat L , i.e.,thestatesfrom A L Obviously, ? is thestarting unsafe K . K is thus state.

...... predecessor states,namely known as nique toverifythesafetyofasystemis the startingstate reached. All otherstates,inparticular or fromwhichanunsafestatecanbe se, are theonlystatesthatunsafeper at 2 nations aretakenintoaccount,wearrive not many, butifallthepossiblecombi- value oftrueorfalse.Now, 60bitsare on 60componentsthatcaneachhavethe sitions explicitly, butwerepresentthem unsafe statesandpermissiblestatetran- out here:We nolongerenumeratethe Symbolic modelchecking set { that thereisnowayofreach already knowntobeunsafe. L predecessor statesof equally unsafe.Ifwetakealookatthe then also unsafe,sinceif eiedpnso 0bt ( device dependson60bits Imagine thatthebehaviorofanelectronic can quicklyleadtoproblemsinpractice. Non-safe states and 60 G L It isthereforeclearthat{ Unfortunately, thenumberofstates , G states, morethanonequintillion. can bereachedfromthere. J ), thenwenoticethattheseare , K model checking , L } fromoutside. A , aresafe.Thistech- K G can bereached, and . offers away b G ing thestate 1 J This means ,..., G and (namely , b J 60 , J K , are K ), i.e., , ’s L } ...... and symbolically b For example,theformula stands forallstateswhich b clude numericalvariables,forexample: state formulaemustthereforealsoin- which canchangecontinuously. The ing), butalsowithvariablessuchasspeed, state changes(e.g.,accelerationorbrak- is nolongersolelydealingwith is a system inthisway. Suchacontrolsystem to verifythesafetyofavehiclecontrol have tobeenumeratedexplicitly. quintillion states,whichwouldotherwise as uniformly), onecanuseamethodknown point ofview(e.g.,ifthespeedchanges behavior issimplefromamathematical transitions? As longasthedynamic before. Whataboutthecontinuousstate with thediscretestatetransitionsas eair. ::: behaviors. be appliedeventocomplicateddynamic cusing ondevelopingmethodsthatcan In ourworkgroup,wearecurrentlyfo- arriving attoolargeaformulaquickly. tricks areneeded,however, toprevent hn +496819325-2905 Phone AutomationofLogic RG. 1 Uwe Waldmann CONTACT 5 5 quantifier elimination and not and not hybrid b A newproblememergesifwetry Using suchformulas,wecandeal 32 is false–thisaquarterof system. Thismeansthatone b b using logicalformulas. 32 32 and ( x 2 . Certaintechnical > 50). b 5 discrete is true

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 51 ...... GUARANTEES 51

Modular Proving in Complex Theories

The enormous advances in the de- velopment of information technology have led to the widespread use of complex, computer-controlled systems: in household appliances, cars, trains, airplanes or power plants. Especially in the last-mentioned safety-critical application domains, errors can have catastrophic consequences. It is therefore extremely important to guarantee the proper functioning of such systems and to mathematically prove their safety. In fact, it would be desirable to carry out such correctness proofs fully auto- matically by computer. However, funda- mental theoretical results from Gödel, Church and Turing show that this is Applications areas ...... not possible. Nevertheless, effective ...... automatic verification systems do exist In general, it is difficult to deal with (i.e., the numbers), it is guaranteed that for concrete application areas. such extensions even when effective in the end, the two partial solutions can procedures for the base theory exist. We be combined into one solution to the Our goal is to identify conditions approach this problem by identifying a original problem. under which efficient verification proce- class of theory extensions in which it is dures for complex systems exist. In the possible to reduce the original problem Our theoretical contributions are formal description of a complex system, to a problem in the base theory, which the basis for the development of prac- we need a diversity of data types (for can then be solved with a specialized tically usable tools for the verification example, numbers, data structures such proof procedure for this theory. of safety-critical systems, particularly as sets, lists or arrays, or functions over in the context of the SFB Transregio numerical domains). Efficient reasoning In addition, we are developing Project AVACS (Automatic Verification in such complex theories is an important methods for modular reasoning in com- and Analysis of Complex Systems). goal of our research. We are interested binations of theories. If, for example, we in developing proof procedures which want to prove a statement in program We use our method both for use the modular structure of complex verification which depends on both lists program verification and for verifying theories and allow us to use specialized of numbers and arrays of numbers, then various control systems, such as a radio provers for reasoning in the component we can decompose this problem into two controller for train systems, as well as theories. Such a modular approach is – one concerning only lists of numbers a (parametric) hybrid control system for particularly flexible and efficient as well and another dealing with arrays only. chemical plant which supervise the rela- as applicable in many areas (such as For each of these two problems we can tive concentration of various substances. mathematics, verification, and knowledge use existing proof methods for lists resp. Furthermore, we have also applied our representation [see Figure] ). arrays. If both methods exchange suffi- approaches to the verification of distri- cient information about the shared data buted databases, in security, and for The simplest form of complex theo- mathematical problems. ::: ries we consider are extensions of a given theory (which we refer to as base theory) with additional functions. Such theory extensions can, for example, be used in parametric approaches to the verification of reactive or hybrid systems: e.g., in the verification of controllers regulating systems of trains in which either the number of trains or certain variables (e.g., their speed or their acceleration) and their changes can be considered to CONTACT be parameteric. Viorica Sofronie-Stokkermans RG. 1 Automation of Logic Phone +49 681 9325-2900 ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 52 ...... 52 RESEARCH AREAS

INFORMATION SEARCH & DIGITAL KNOWLEDGE

Digital information has changed our society, In parallel to this aim for a The quantity and variety of the avail- quantum leap, we observe a complexity able information in the Internet has grown economy, work in science, and everyday life. explosion in digital information along to such a scale that search engines are several dimensions: quantity, structural no longer able to maintain all relevant Modern search engines deliver useful infor- variety, multimodality, digital history, references in a centralized index. There- mation for nearly every question, and the and distribution. fore, global information search necessi- tates distributed algorithms, in which, Internet has the potential to be the world’s ...... In addition to more than 20 billion ...... for example, many local search engines ...... web pages, important information sources are dynamically coupled for specific most comprehensive knowledge base. are: online news streams, blogs, tweets tasks. Here, not only local computing However, knowledge structures in the Inter- and social networks with several hundred and search speed play an important role, million users, Web2.0 communities for but also communication efficiency in net are amorphous, and search engines rarely photos, music, books, scientific topics, the network of networks, the Internet, and last but not least the encyclopedia and in embedded peer-to-peer networks. have precise answers to expert questions for Wikipedia. The total volume of this data is in the order of exabytes: 1018 of bytes which users would consult reference books – more than one million terabyte disks. This global issue is examined from and expert literature. A great challenge and various perspectives at the Max Planck There is an increasing amount of Institute for Informatics. These include research opportunity lies in making the leap more expressive data representations: work on efficient search in semistructu- XML documents, RSS feeds, semantic- red XML documents, which play a big from processing raw information to computer- ally connected RDF graphs, and much role in digital libraries and e-science data. more. The richer structure and hetero- Another research area is the scalable aided intelligent use of digital knowledge. geneity of the data increases the com- management of graph-structured RDF plexity of mastering this digital variety. data which arises in the context of the semantic web but is also important as In addition to text-oriented and struc- data representation in computational tured data, we are also experiencing an biology. In other projects, the user-centric explosion of multimodal information: view of Web 2.0 communities and billions of people are becoming data multimodal data are the primary issues. producers in the web by sharing images, The great vision of making the quantum videos, and sound recordings with the leap towards knowledge search and dis- rest of the world. This is often accom- covery is pursued in work on automatic panied by interpersonal contacts that knowledge extraction from web sources arise from the Internet, and large online such as Wikipedia. The Max Planck networks are organized. Institute for Informatics is a world- wide leader in this important research The history of digital information – for direction. ::: example earlier versions of our institute’s web pages, which are partly conserved in the Internet Archive – is a potential gold mine for in-depth analyses along the time dimension. Sociologists and political scientists, but also media and market analysts as well as experts in intellectual property, can profit from this...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 53 ...... 53 CONTRIBUTIONS

Random Telephone Chains – Efficient Communication

in Data Networks ...... 54

Search and Mining in Web Archives ...... 55 ......

Information Search in Social Networks ...... 56

YAGO – a Collection of Digital Knowledge ...... 57

URDF – Efficient Reasoning in Uncertain RDF

Knowledge Bases ...... 58

EnBlogue – What is New and Interesting in Web 2.0? ...... 59

Decision Procedures for Ontologies ...... 60 UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite54 54 Random Telephone Chains–EfficientCommunicationinDataNetworks INFORMATION SEARCH&DIGITAL KNOWLEDGE remain uninformed. all thefollowingparticipantsinchain a participantdoesnotpassonthenews, chain appearsnottobeveryrobust.If news. Theclassicsequentialtelephone some reason,hedoesnotforwardthe reached, orthatheisreachable,butfor might be,thataparticipantcannotbe In atelephonechain,themainproblem if somepartialstepsarenoterror-free. a processcanstillfunctionproperlyeven of theprocedure.Robustnessmeansthat a fixedseries,oneafteranother. long ifthenodesaregivennewsin information distributionlastsmuchtoo nodes, asinthisaffiliatesexample,the In anetworkwithseveralthousand such databasesystemscanbeverylarge: informatics intheeverydayworld.First, phone chainproblemisdifferentfrom is slowlycommunicatedtoallothers. the datainventoryofanaffiliate,which there mustnaturallybechangesgivenin When usingsuchreplicateddatabases, not reachableforashortperiodoftime. is alsoavailableifthecentraldatabase with littlecomplication,andthatthedata affiliate onecanquicklyaccessthisdata office. Thishasthebenefitthatineach its customerdatalocallyateachbranch with affiliatesatseveralsitescanstore centers mustbedistributed. A company informatics ifinformationincomputer third, andsoforth. the secondperson,whothencalls first persononthislist.Thenhecalls bers. Here,spreadingnewsgoestothe based onanorganizedlistofgroupmem- case, thesequentialtelephonechainis the telephonechain.Insimplest group ofpeople? A classicprocedureis sage quicklyandefficientlytoalarger A secondaspectistherobustness In twocentralaspects,thistele- Similar taskdescriptionsappearin How doyousendanurgentmes-

...... very simpleprotocoldemonstratesthis. ween thetwoextremes.Thefollowing chain. Actually, theoptimum liesbet- good asacompletelyrandomtelephone participant givesresultsthatarejustas one that strate randomness? Thefirstresultsdemon- of question is:Whatistheoptimalamount other researchefforts. A highlyinteresting domised telephonechainhasstimulated of themembersfallwithinthis10%. bers. Therefore,itisnoti on averagetoinformtheremainingmem- never reachable,ittakesonly19.49rounds accept thatevery10thparticipantis we same istruefortherobustness;evenif The rounds inordertoinformallnodes. phone chainneedsonaverageonly18.09 with oneanother, thentherandomtele- 1,024 nodeswhichcanallcommunicate take anexampleofadatanetworkwith amazingly efficientandrobust.Ifwe participant isalsoinformed. At thislatestpointintime,chosen another randomlychosenparticipant. round, eachinformedparticipantcalls change takingplaceinrounds.Ineach time. Thisleadstotheinformationex- that allcallslastthesameamountof plify thepresentation,onecanaccept calls randomotherparticipants.To sim- each callerwhohasreceivedthenews, solution. With the informatics withasurprisinglysimple there aretelephonechainproblemsin for applicationininformatics.However, classic telephonechainsareinappropriate The goodperformanceoftheran- The randomtelephonechainis For thisreason,processesbasedon sole randomdecisionper random telephonechain mportant which mportant

...... needs 17.63 rounds instead of 18.09. ::: needs 17.63roundsinsteadof18.09. the fullyrandomtelephonechainand the newprocedureisslightlybetterthan after theyhavecalledaninformednode, Even iftheparticipantsleaveimmediately stop, thelongerproceduretakes. off situation:Theearliertheparticipants 1,024 nodes. Apparently, thereisatrade- only 14.14roundsinordertoinform time. Inthiscase,forexample,weneed case toaminimalincreaseinrunning ded terminationcriterionleadsinany cipant makesonly5calls.Therecommen- efficient procedure–onaverage,aparti- reached. Furthermore,itprovidesavery thereafter thatallparticipantswillbe telephone chain,anditisguaranteed criterion hasbeengivenforarandomized This isthefirsttimethatatermination called aninformednodeafourthtime. cipants leavetheprocedureiftheyhave We canestablish,forexample,thatparti- is thatwecangiveaterminationcriterion. are neededonaverage. are neverreachable, p the (randomly dialed)10%of in only14.11roundson This simpleprotocolinforms1,024nodes position andcontinuestocallin informed, hedialsanew, randomstarting If hereachessomeonewhois at thisposition,alltheotherparticipants. position onthelistandcalls,beginning is informed,hechooseanotherrandom list ofallparticipants.Ifaparticipant have one(e.g.,alphabetically)organized hn +496819325-1004 Phone AlgorithmsandComplexity DEPT. 1 Benjamin Doerr CONTACT A furtherbenefitofthisprocedure We assumethatallparticipants only 15.09rounds average. Evenif articipants already order.

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 55 ...... INFORMATION SEARCH & DIGITAL KNOWLEDGE 55

Search and Mining in Web Archives

The World Wide Web evolves con- Our approach builds on an inverted To capture their semantics, we stantly, and every day contents are added index that keeps a list of occurrences for formally represent temporal expressions and removed. Some of the new contents every word. Depending on the type of as time intervals. We then integrate them are published first and exclusively on the query that has to be supported, the in- into a retrieval approach based on statis- Web and reflect current events. In recent verted index remembers an identifier, tical language models that has been years, there has been a growing aware- how often the word occurs, or the exact shown to improve result quality for tem- ness that such born-digital contents are positions at which the word can be found poral information needs. part of our cultural heritage and there- in the document for every document in fore worth preserving. National libraries which a word occurs. We extend this and organizations such as the Internet information by a valid-time interval to Mining of characteristic phrases Archive [http://www.archive.org] have taken also keep track of when a word was con- Mining of web archives is another over this task. Other contents were pub- tained in a document and thus to enable aspect of our current work. More preci- lished a long time ago and are now, thanks time-travel queries. sely, we are interested in insights about to improved digitization techniques, for ad-hoc subsets of the web archive, for the first time available to a wide public Consecutive versions of the same instance, all documents that deal with via the Web. Consider, as one concrete document tend to differ only slightly. Olympics. Given such an ad-hoc subset, ...... example, the archive of the British news- ...... We exploit this observation to reduce ...... we can identify persons, locations, or in paper The Times that contains articles index size drastically. To process time- general, phrases that are characteristic published from as early as 1785. travel queries more efficiently, we keep for documents published in a particular multiple lists for every word in the inver- year. In our Olympics example, these Our current research focuses on ted index, each of which is responsible could include Michael Phelps, Beijing scalable search and mining techniques for a specific time interval. This intro- and “bird‘s nest” for documents published for such web archives. Improved search duces redundancy to the index, increases in 2008. techniques, on the one hand, make it its size, and thus leads to a trade-off easier for users to access web archives. between index size and query-processing To identify such characteristic Mining techniques, on the other hand, performance. Our approach casts this phrases efficiently, one needs frequency help to gain insights about the evolution trade-off into optimization problems that statistics for so-called n-grams (i.e., se- of language or popular topics. In the can be solved efficiently and determine quences of one or more words). We de- following, we describe three aspects of the lists to be kept for every word in the velop efficient and scalable techniques our current work. inverted index. to compute these n-grams statistics in a distributed environment. One design objective here is to allow for easy scale- Time travel in web archives Temporal information needs out in order to keep up with the growth Existing search techniques ignore Information needs often have a of web archives in the future. ::: the time dimension inherent to web temporal dimension, as expressed by a archives. For instance, it is not possible temporal phrase contained in the user’s to restrict a search, so that only docu- query and are best satisfied by documents ments are retrieved that existed at a that refer to a particular time. Existing specified time in the past. retrieval models fail for such temporal information needs. For the query “german In our work, we consider time- painters 15th century”, a document with travel keyword queries that combine a detailed information about the life and keyword query (e.g., “ election work of Albrecht Durer (e.g., mentioning projection”) with a temporal context such 1471 as his year of birth) would not ne- as September 2009. For this specific cessarily be considered relevant. This is query, only relevant documents that because existing methods are unaware discuss election projections and which of the semantics inherent to temporal existed back in September 2009 should expressions and thus do not know that be retrieved as results. 1471 refers to a year in the 15th century.

CONTACT

Klaus Berberich DEPT. 5 Databases and Information Systems Phone +49 681 9325-5018 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite56 56 Information SearchinSocialNetworks INFORMATION SEARCH&DIGITAL KNOWLEDGE of relationships,inwhichthenumber a socialnetworkisbuilt,dense about yourcurrentlocation.Inthisway, or, especiallylately, toinformfriends newly addedcontent,newassessments followers. Itiseasilypossibletolookfor often providesadditionalvaluetosuch plicit listsoftheirfriends,andthesystem example, theso-called find contentsbasedonannotations,for use andintuitiveinterfacesinorderto interests. Mostservicesoffereasyto different tags,reflectingtheir i the same set. Various usersfrequentlyannotate without havingtostickapredefined tions, becausetheycanbefreelychosen tags areoftenverygoodcontentdescrip- and annotatedwithso-calledtags.These can beexplored,commented,assessed, content. Datapublishedbyotherusers enabling them,forexample,toshare the usertointeractwithotherusers, data ontheWeb. Inaddition,theyallow pictures, videos,bookmarksandsimilar Facebook offerasimplewaytopublish LibraryThing, YouTube, MySpace,or Online servicessuchasdel.icio.us,Flickr, simply generateandpublishcontents. information asbefore,nowanyonecan the Web. Insteadofsimplyconsuming lutionized thewayusersinteractwith In addition,userscanmaintainex- The adventofWeb 2.0hasrevo- mage orthesamevideowith “tag clouds”.

...... weighting annotationsbasedonthe among thefriendsofqueryinguser, It providestag-basedsearchforcontent Search andExploration”) coined not supportneighborhoodsearchatall. but notfortransitivefriends);othersdo fashion (forexample,onlyfordirect, functions, butonlyinaverylimited ing socialnetworksalreadyoffersuch than remoteacquaintances.Someexist- as closefriendsusuallyaremoretrusted therefore betakenintoconsideration, and thedistancebetweenusersshould searching insuchsystems,relations intensive useofannotations).When comments) orimplicit(e.g.,through explicit (e.g.,throughjudgmentsand Such arecommendationcaneitherbe exploiting the“wisdomofcrowds”. or throughtransitivefriendsoffriends), recommended byfriends(eitherdirectly tions, allowstofindvaluablecontent bination withtheuser-defined annota- unknown userswithsimilarinterests. it growsovertimetoincludepreviously from “real”lifeusingthesameservice, with friendsandotherpeopleknown these listsoffriendsareinitiallyfilled of one’s reputationonthenetwork.While of friendsisoftentakenasanindicator We havedevelopedasearchengine This denseusernetwork,incom- SENSE ( for “Socially ENhanced to closethisgap.

...... h ur.::: the query. pay-as-you-go mannerwhileprocessing ing; then,updatesareperformedina information isneededforqueryprocess- as possibletothepointwhenup-to-date to date.SENSEdefersupdatesaslong transitive userrelationshipsalwaysup gies inordertokeeptheprecomputed rated, requiresoptimizedindexingstrate- new contentsandannotationsaregene- especially theveryhighratewithwhich The fastgrowthoftheseservices,and of datapresentinlargesocialnetworks. rithms tocopewiththemassiveamount highly efficientandscalablesearchalgo- sonalized searches.Oursystemuses significantly betterqualityforsuchper- frequency, SENSEretrievesresultswith existing systemsthatconsiderplaintag ence inweights.Incomparisontomost from similarityofannotations)hasinflu- network, similarityininterests(derived In additiontothedistanceinsocial strength ofrelationshiptothefriend. hn +49681302-70798 Phone DatabasesandInformationSystems DEPT. 5 Ralf Schenkel +496819325-5006 Phone DatabasesandInformationSystems DEPT. 5 Tom Crecelius CONTACT

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 57 ...... INFORMATION SEARCH & DIGITAL KNOWLEDGE 57

YAGO – a Collection of Digital Knowledge

In recent years, the Internet has There is for example an article Most of the approximately 7 million developed into the most significant about Albert Einstein, so the physicist locations in YAGO2 have geographic co- source of information we have available can be recognized as an entity for the ordinates which place them on the earth’s today. Train schedules, news, and even ontology. Each article in Wikipedia is surface. Thus, spatial proximity between entire encyclopedias are available online. classified into specific categories, the two locations can be used as search cri- Using search engines, we can very ef- article about Einstein, for example, into terion. An exemplary search using the ficiently query this information. But, the category “born in 1879”. The key- space and time criteria could be: Which despite efficiency, search engines face word “born” allows the computer to store 20th century scientists were awarded limitations in terms of effectiveness, the fact that Einstein was born in 1879. a Nobel Prize and were born in the vi- in particular for complicated queries. Using this approach, we get a very large cinity of Stuttgart. In YAGO2, one finds, Assume we would like to know which ontology, in which all of the entities known among others, Albert Einstein, since scientists are also active in politics. This to Wikipedia have their place. This onto- both his lifetime (1879–1955) and his question can hardly be formulated in a logy is called YAGO (Yet Another Great birthplace Ulm (70 kilometers from way that it can be answered by Google. Ontology, http://www.mpi-inf. mpg.de/yago- Stuttgart) are stored in YAGO2. Queries like “politicians scientists” only naga/yago/). At the moment, YAGO con- return results about opinions on political tains nearly 10 million entities and about ...... events. The problem here is that the ...... 80 million facts...... computers we use today can store a tre- mendous amount of data, but are not YAGO2, a recently created extension able to relate this data to a given con- of the original knowledge base, pays text or, for that matter, even understand particular attention to the organization it. If it would be possible to make com- of entities and facts in space and time – puters understand data as knowledge, two dimensions that are highly useful this knowledge could be helpful not only when searching in a knowledge base. for Internet search, but also for many In fact, the great majority of the ap- other tasks, such as understanding spoken proximately 900,000 person-entities in language or the automatic translation YAGO2 are anchored in time by their of a text into multiple languages. This is birth and death dates, allowing us to the goal of the “YAGO-NAGA” project at position them in their historical context. the Max Planck Institute for Informatics. For example, one can ask questions about important historical events during the In our next article about URDF, Before a computer can process lifetime of a specific president, emperor we describe how existing knowledge in knowledge, it must be stored in a struc- or pope, or also ask the question of YAGO2 can also be used to deduce new tured fashion. Such a structured know- when the person of interest was actually knowledge and then how to reason with ledge collection is called an ontology. president. this knowledge under different uncer- The major building blocks of an ontology tainty models. ::: are entities and relations among entities. An entity is every type of concrete or abstract object: the physicist Albert Einstein, the year 1879, or the Nobel Prize. Entities are connected by relati- CONTACT ons, for example, Albert Einstein is con- nected to the year 1879 by the relation Johannes Hoffart “born” (see graph). We have developed DEPT. 5 Databases and Information Systems Phone +49 681 9325-5028 an approach to automatically create such an ontology using the online ency- clopedia Wikipedia. Wikipedia contains articles about thousands of personalities, Fabian Suchanek products and organizations. Each of DEPT. 5 Databases and Information Systems these articles becomes an entity in our Phone +49 681 9325-5000 ontology. Gerhard Weikum DEPT. 5 Databases and Information Systems Phone +49 681 9325-5000 ...... Internet http://www.mpi-inf.mpg.de/yago-naga/yago/ bericht_2011_en_korr2 24.11.201116:40UhrSeite58 58 implies anewfact: where aconjunctiveconditionof facts implica-tions, so-called tion, theserulesareoftenformulatedas ledge base.Inalogic-basedrepresenta- be violatedby straint, i.e.,a a represents This locations. have beenbornindifferentgeographic can surelyexcludethatapersoncould also beuncertain.Ontheotherhand,we ledge derivedfromsuchasoftrulewill real world,whichmeansthatanyknow- of instances(inthiscase,people)inthe this rulewillalsobeviolatedbyanumber rather “soft”formofaninferencerule, same placeastheirspouses.Beinga example, thatmanypeopleliveinthe certain RDFknowledgebases. statistical inferencetechniquesforun- focuses onefficient,rule-based,and in ourlatestproject, mited todeterministicdata,theresearch on theSPARQL querylanguage)areli- processing techniquesforRDF(based uncertain data we willheresummarize cise, oreveninconsistentdata,which ness withrespecttoincomplete,impre- bases thereforerequiresahighrobust- Automatic reasoninginlargeknowledge even leadtothelossofcorrectdata. eager removalofinconsistenciescan inference techniques,andanoverly only throughcomplicated,logic-based for inconsistenciescanbeaccomplished millions offacts,anautomatedsearch steps. InlargeRDFknowledgebaseswith cate andpotentiallyexpensiveinference and canonlybeuncoveredthroughintri- sistencies arenotobviousevenforahuman even inconsistency. Often,theseincon- exhibit ahighdegreeofuncertaintyor that theresultingknowledgebasesmay applied extractiontechniquesentails in recentyears,theverynatureof in thefieldofinformationextraction with softandhardrules Reasoning inuncertainknowledgebases URDF –EfficientReasoninginUncertainRDFKnowledgeBases INFORMATION SEARCH&DIGITAL KNOWLEDGE In URDF, wecanexpress,for Despite thevastimprovements “hard” rule,whichmaynot . Whiletraditionalquery any instanceinourknow- URDF Horn clauses under theterm , particularly strict con- ,

...... livesIn(person as inthefollowingexample: purely negatedfactsintoadisjunction, clause canbeformulatedbygrouping in “Berlin-Mitte”: that hiswife,“AngelaMerkel”,alsolives live in“Berlin-Mitte”,giventhatweknow us toinferthat“JoachimSauer”might This formofuncertainreasoningallows Sauer”, andtheplace“Berlin-Mitte”. the persons“AngelaMerkel”,“Joachim ample, applytheabovefirst-orderruleto order rules.Inthiscase,wecan,forex- knowledge basebyapplyingthesefirst- reason aboutindividualinstancesinour way. Conversely, wecanthendeductively rules fromthesepatternsinaninductive we cangeneralize(i.e.,learn)first-order instances intheknowledgebase,and fying commonpatternsbetweenthe our URDFframeworkallowsforidenti- first-order predicatelogic.Moregenerally, marriedTo(person by usingnegationsoftheaboveform. only oneofthetwofactsmaybetrue knowledge base,wecanformulatethat have beeninsertederroneouslyintothe bornIn(Angela_Merkel, Bremen) bornIn(Angela_Merkel, Hamburg) If, in formal meanstoidentifyinconsistencies. in theaboveexample,wenowhavea also in“Bremen”.Byusingnegationsas have beenbornbothin“Hamburg”and express that“AngelaMerkel”cannot marriedTo(Joachim_Sauer,Merkel) Angela_ bornIn(Angela_Merkel, Bremen) bornIn(Angela_Merkel, Hamburg) livesIn(person livesIn(Joachim_Sauer, Berlin-Mitte) livesIn(Angela_Merkel, Berlin-Mitte) Moreover, aspecialtypeofHorn The aboveruleisformulatedin With thisHornclause,wecan the extractionstep,bothfacts 2 , location 1 1 , person , location 1 ) 2 ) 1 ). and

...... vlainsrtge. ::: evaluation strategies. which ares with goodapproximationguarantees, on efficientapproximationalgorithms Wide Web. Ourresearchconcentrates information volumes ofdata,whichweobtainthrough algorithms cannotbeappliedtolarge databases. Therefore,exactevaluation query evaluationstrategiesinrelational combinatorial complexitythantraditional luation strategiesunderlieamuchhigher Both logic-basedandprobabilisticeva- evaluation strategiesforuserqueries. to thedevelopmentofnewandefficient niques certainlyposesmajorchallenges based andprobabilisticinferencetech- the factsviarules. listic interpretationofthederivation weights, whichcorrespondtoaprobabi- false valuestofacts,butalsoconfidence URDF doesnotonlyassignbinarytrue/ babilistic reasoning,ontheotherhand, and hardrulesdescribedabove.Inpro- cally tailoredtothecombinationofsoft the Max-Satproblem,whichisspecifi- method forsolvingageneralizationof we havedevelopedahighlyefficient blem inpropositionallogics.For URDF, ability problem(Max-Sat),aclassicpro- lem isreducedtothemaximumsatisfi- base inresponsetoaquery. Thisprob- of apotentiallyinconsistentknowledge guaranteed toobtainaconsistentoverview In propositionalreasoning,theuseris tion strategiestoansweruserqueries. (propositional) andprobabilisticevalua- Efficient evaluationstrategies hn +496819325-5007 Phone DatabasesandInformationSystems DEPT. 5 Martin Theobald CONTACT This enormousexpressivityofrule- URDF supportsbothlogic-based pecifically tailoredtothese extraction fromtheWorld

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 59 ...... INFORMATION SEARCH & DIGITAL KNOWLEDGE 59

EnBlogue – What is New and Interesting in Web 2.0?

The success of the Web 2.0 has drastically changed the way information is generated on the Web, with millions of users actively contributing content in form of blog entries or micro-news (e.g., Twitter). In order to maintain an over- view of this continuously flowing stream of information, we need methods that extract the essence of current events and display them in a suitable way to users.

With enBlogue, the Max Planck Institute has developed an approach which continuously scans Web 2.0 streams for interesting events, which have received attention in blogs, Twitter, The enBlogue website ...... or other media. The name enBlogue ...... is supposed to sound like “in vogue”, meaning it is popular. The choice of tags to be considered of topics depending on user preferences is a priori unlimited but is primarily based and interests, and addition-ally also con- EnBlogue processes web contents, on annotations generated by users, for sidering the locations of events and users annotated with the time of creation example, when commenting on a blog (if this information is available), is cur- and semantic notes to detect surprising entry or a news paper article. Methods rently being researched. ::: changes in the behavior of topics, which can also be used for classifying infor- can be interpreted as interesting events. mation into topic areas as well as for Not only individual semantic annotations discovering names of persons or places. (so-called tags) are assessed for their The enBlogue system then also enables popularity, but all correlations between a mixture of peoples, places, and regular different tags are considered. Two tags topics. are strongly correlated if the relative num- ber of documents that report about two A model that takes into account the tags at the same time is high. Therefore, quantity of change as well as time-lines the dynamic increase in correlations is is used to extract interesting emergent particularly interesting. topics and to present them in a clear and user-friendly manner. For example, normally the tags “Iceland” and “flight operation” are not EnBlogue is the starting point to very strongly correlated. This changed current information on the web and it Events identified by enBlogue on July 8, 1996 (with abruptly with the vulcanic eruption of helps users being linked to actual con- the assistance of the New York Times Archives) Eyjafjallajokull in the spring of 2010. tents. Personalization, i.e., re-ranking

CONTACT

Sebastian Michel DEPT. 5 Databases and Information Systems Phone +49 681 302-70803 ...... Internet http://blogue.mmci.uni-saarland.de bericht_2011_en_korr2 24.11.201116:40UhrSeite60 60 representation containing allitsne- it istransferredintoacompactlogical the ontologyiscompiled.Moreprecisely, an ontology, thelogicalrepresentationof efficiently answerqueriespertaining to representation ofanontology. Inorderto everything thatis is abletofullyautomaticallydeduce of theappropriatemethods,acomputer a logicalrepresentation.With thehelp this, wefirsttranslatetheontologyinto ledge comprisedinanontology. To do can beansweredbymeansoftheknow- is toquicklyanswerthequestionsthat 57) consisting ofseveralmillionentries. keywords children?” were borninthesameplaceasalltheir words. For thequery a purelysyntacticsearchbasedonkey- ever, thesesearchenginesperformonly the assistanceofsearchengines.How- to manyquestionsintheInternetwith INFORMATION SEARCH&DIGITAL KNOWLEDGE – aCollection example the YAGO ontology(see from differentsourcesofknowledge,for gies canevenbeextractedautomatically knowledge iscalledanontology. Ontolo- The aggregationofthiskindgeneral and thatpeoplehaveonlyonebirthplace. that physicistsandchildrenarepeople, the abovequeryinvolves processable byacomputer. For ex which mustbeexplicitlyavailableand sentences, certainknowledgeisrequired, understand themeaningofwordsand a strings, on words, inparticularthecharacter the for documents. Itonlyperformsasearch the wordsinbothqueryand engine isunawareofthemeaning answer. may returna query isreformulated,thesearchengine not theanswertoquestion.If the that containthekeywordsbuttypically delivers avastnumberofdocuments For theabovequery, asearchengine these keywordsoccurexactlyasastring. and Decision ProceduresforOntologies “born” One oftoday’s researchchallenges Today, wecanalreadyfindanswers In ordertoenableacomputer The problemisthatthesearch “physicists” a searchengineextractsthe and returnsdocumentswhere purely syntacticlevel. docu of Digital Knowledge” implied bythelogical ment containingthe , “Which physicists “children” knowledge , “all” “YAGO ample, , page ,

...... several ordersofmagnitudesfaster. This resultsinproceduresthatare the particularstructuresofontologies. deduction proceduressuchthattheysuit Therefore, weareadjustingthelogical logies ofthissizeinreasonabletime. dures donotsuccessfullycompileonto- considered becausethenexistingproce- with severaltensofmillionsentriesare This becomesanissuewhenontologies tensive logicalcalculationsareperformed. ledge intheontology. consequences resultingfromtheknow- find answerstoquestionsthatarelogical ter usesthisrepresentationtoefficiently not containacontradiction.Thecompu- cessary logicalconsequencesanddoes During thecompilationprocess,in-

...... ontology inlessthanonesecond.::: of queriesfromthecompiled YAGO We arealreadyabletoanswerthesekind example, a cestor whowasalsoascientist?” German physicisthaveatleastonean- alternations. Thequery queries witharbitrarymanyquantifier pute. Inprinciple,ourapproachsupports alternations areparticularlyhardtocom- The answersforquerieswithquantifier “there are(physicists)–forall(children)” tains asocalledquantifieralternation: can answer 10 million logies likethe YAGO ontology, withupto Our procedurescurrentlycompileonto- hn +496819325-2918 Phone AutomationofLogic RG. 1 Patrick Wischnewski +496819325-2900 Phone AutomationofLogic RG. 1 Christoph Weidenbach CONTACT The above-mentionedquerycon- entries inaboutonehourand queries withinseconds. “for all–thereare” “Does every query. is, for ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 61 ...... 61 bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 62 ...... 62 RESEARCH AREAS

MULTIMODAL INFORMATION

The volume, but also the importance of multi- Sensors such as cameras, micro- In a second area, we are research- phones, GPS and accelerometers are in- ing the use of portable sensors that are modal data has virtually exploded in recent creasingly embedded in equipment and already embedded in cell phones, or environments and are already useful in a even in clothing. In this case, the focus years. Research on handling such data is variety of ways. The computer-controlled is on capturing and understanding the thus becoming ever more important. This processing of sensor information has context of a person playing an important made enormous progress but is generally role in man-computer interaction. extends from the modeling and indexing to

...... limited to simple matters. This means, ...... Context awareness may enable natural ...... the understanding of multimodal sensor data. in particular, that devices and computers communication, for example, with robots with access to this sensor information that understand the users’ goals and offer The Max Planck Institute for Informatics do not fully interpret it and thus cannot support at the right time. The work de- truly understand their environment. scribed here concentrates on a specific accommodates this development and is type of context: the recognition of human Two work areas will be presented activity. ::: pursuing some of the most important issues below that address different problems arising from this area. involved in multimodal sensor processing. The first area is concerned with the auto- matic processing of music data. A central topic here is the efficient finding of in- dividual audio segments or other music fragments in large music data collections. The processes being researched are not only useful for processing search queries, but also for content-based music analy- sis. This involves, among other things, the recognition and linking of semantic relationships in various versions of diffe- rent modalities of a piece of music...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 63 ...... 63 CONTRIBUTIONS

Recognizing Human Activity ...... 64

Automated Music Processing ...... 65 ......

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite64 64 body Collecting humanactivitydata Embedded SensorSystemsTUDarmstadt Figure 1:Wrist Sensor, complex andlong-termactivities. of humanactivityrecognitionrelatedto thus focusonseveralimportantaspects assembly task)isfarlessmature.We hours (suchasamorningroutineoran human activitylastingforminutesor walking), researchonmorecomplex activities (suchasshakinghandsand made inrecognizingshortandatomic activity recognition. cular typeofcontext,namelyhuman right time.Ourworkfocusesonaparti- the user'sgoalsandoffersupportat for instance,withrobotsthatunderstand It mayenablenaturalcommunication, role inhuman-computer-interaction. context oftheuserplaysanessential wearable sensors Recognizing humanactivityfrom Recognizing HumanActivity MULTIMODAL INFORMATION using machinelearningtechniques. analyzed foractivityunderstanding, Motion datacanthenbecollectedand watches, cellphones,andevenclothing. ready todaybecomingwidelyavailablein technology, inexpensive sensorsareal- perspective. Giventheadvancesinmicro- given timeorplace,fromafirst-person sensing whattheuserisdoing,atany Wearable sensorsattachedtothe While impressiveprogresshasbeen Sensing andunderstandingthe [Figure 1] have greatpotentialfor

...... Considering aconstructionmanualfor archical activityoffersseveralbenefits. activities Hierarchical modelforcomposite algorithms. activity andallowefficientrecognition recognize thehigherlevelcomposite of relevantpartscanbesufficientto observe thatasurprisinglysmallfraction activity. Inadiscriminativeanalysis,we without evenobservingtheactual zed by ample Figure 2:Relevantpartsforcompositeactivity allow theirrecognition only afewunderlyingactivityeventsto composite activities,itissufficienttospot fore confusetherecognition.For many observation canbesuboptimalandthere- of unrelatedactivity, usingthe posite activitiescancontainlargeamounts observing atomicactivities.Sincecom- ways toinferthehigher activities aredirected. rather thehigherlevelgoalatwhichthese atomic activitiesthatisinteresting,but activity events Identifying andcombiningrelevant Preserving thestructureofahier- It isnotusuallythesequenceof having lunch walking atacertaintimeofday can becharacteri- There areseveral [Figure 2] level goalfrom . For ex- complete eating ,

...... model thatobserves Therefore, weproposeahierarchical prohibitive amountsoftrainingdata. vities canbesuboptimal,astheserequire same algorithmsrecognizingatomicacti- can happenindifferentorder. Usingthe ferent users,ortheunderlyingactivities the durationcanvarystronglyacrossdif- Composite activitiescanbeinterrupted, posite activitiesaddsignificantvariation. steps, anditbecomesobviousthatcom- mingly simpletaskconsistsofvarious to fixtheframepanel.Thissee- mirror infigure3,oneofseveraltasksis iiaty ::: nificantly. effort fornewcompositeactivitiessig- ferring sharedpartsreducesthetraining composite activitiesfromscratch,trans- like vocabulary. Insteadofre-learning composite activitiescanbeshared,much Transferring knowledge usually usedforactivityrecognition. compared tothestandardapproaches ments showindeedsuperiorperformance in whichletterscreatewords. composite activities,similartotheway events andcombinesthemtore Figure 3:Compositeactivityforconstructiontask hn +496819325-2000 Phone ComputerVision andMultimodalComputing DEPT. 2 Bernt Schiele +496819325-2000 Phone ComputerVision andMultimodalComputing DEPT. 2 Ulf Blanke CONTACT Parts thataresimilarindifferent relevant activity Experi- cognize

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 65 ...... MULTIMODAL INFORMATION 65

Automated Music Processing

The digital revolution has brought For example, two performances may Audio identification and audio a massive increase in the availability and exhibit significant non-linear global and matching are instances of fragment-level distribution of music-related documents. local differences in tempo, articulation, retrieval scenarios, where time-sensitive Therefore, the development of techniques and phrasing as well as variations in exe- similarity measures are needed to locally and tools for organizing, structuring, re- cuting ritardandi, accelerandi, fermatas, compare the query with subsections of a trieving, navigating, and presenting music- or ornamentations. Furthermore, one has document. In contrast, in document-level related data has become a major strand to deal with considerable dynamical and retrieval, a single similarity measure is of research – the field is often referred spectral deviations, which are due to considered to globally compare entire to as music information retrieval (MIR). differences in instrumentation, loudness, documents. One recently studied instance Major challenges arise because of the tone color, accentuation, and so on. of document-level retrieval is referred richness and diversity of music in form Recent matching procedures, which can to as cover song identification, where the and content, leading to novel and exciting deal with some of these variations, are goal is to identify different versions of research problems. Exemplarily, we dis- based on so-called chroma features. Such the same piece of music within a data- cuss three current research tasks in the features closely correlate to the musical base (including cover, remake, and remix context of content-based audio retrieval. aspect of harmony and have turned out versions). Also in this context, chroma- to be a powerful mid-level representation based audio features in combination with ...... The task of audio identification (also ...... applicable to a variety of multimodal ...... local alignment techniques (e.g. Smith- called audio fingerprinting) is to identify retrieval scenarios. As illustration for such Waterman algorithm) have been applied a particular audio recording within a a scenario, the figure below shows an successfully. By using so-called shingling given music collection using a small audio interface for simultaneous presentation techniques in combination with locality fragment as query input. Even for large of visual data (sheet music) and acoustic sensitive hashing (LSH), the document scale music collections and in the pre- data (audio recording). The first measures search can be significantly accelerated. ::: sence of signal distortions such as back- of the third movement (Rondo) of Beet- ground noise, MP3 compression artifacts, hoven’s Piano Sonata Op. 13 (Pathétique) and uniform temporal distortions, recent are shown. Using a visual query (measures algorithms for audio identification yield marked in green; theme of the Rondo), good recognition rates and are used in all audio documents that contain some commercial products such as Shazam. matches are retrieved. Here, one audio However, existing identification algorithms recording may contain several matches cannot deal with strong non-linear tem- (green rectangles; the theme occurs four poral distortions or with other musically times in the Rondo). motivated variations that concern, for example, the articulation or instrumen- tation.

The task of audio matching can be seen as an extension of audio identifica- tion. Here, giving a short query audio fragment, the goal is to automatically retrieve all musically related fragments contained in the documents (e. g., audio recordings, video clips) of a given music collection. Here, as opposed to traditional audio identification, one allows seman- tically motivated variations as they typic- ally occur in different performances and arrangements of a piece of music.

User interface for multimodal music retrieval.

CONTACT

Meinard Müller DEPT. 4 Computer Graphics Phone +49 681 9325-4005 ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 66 ...... 66 RESEARCH AREAS

OPTIMIZATION

Optimization procedures are today of central Efficient optimization procedures optimal solution quickly, we develop are of core importance in various areas. procedures that can at least find a significance for companies’ effectiveness. For large companies, they have a decisive solution which is close to the optimum. influence on their competitiveness. In addition, we are investigating how we They are employed, for example, to reduce With careful planning, a large amount of can employ random decisions to obtain the need for expensive resources such as resources in industrial projects can be more efficient and simpler optimization saved, leading to lower costs. However, procedures. In this context, we are also work or raw materials. The challenge to ...... such planning problems are usually very ...... studying procedures that are inspired ...... complex, and different requirements have by optimization processes in nature. science is to develop efficient procedures to be taken into account. This makes it Such methods often allow good solutions for solving optimization problems. These difficult for computers to find optimal, to be found for a given problem without or at least very good, solutions. spending too much effort on designing a procedures should quickly find an optimal specific optimiziation procedure. At the Max Planck Institute for solution, or at least a solution that is close Informatics, we are working on difficult Since optimization plays a signifi- optimization problems arising in very cant role in many different areas, scien- to the optimum. different applications such as industrial tists in all research areas of the institute optimization or medicine. On the one are working on optimization problems. hand, we are developing elaborate pro- Optimization is nowadays a crucial cedures that find optimal solutions effi- technique for the efficient design of ciently. On the other hand, if the under- planning processes. Its importance will lying problem is too difficult to find an continue to grow in the future. ::: ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 67 ...... 67 CONTRIBUTIONS

The Theory of Bio-inspired Algorithms ...... 68

Approximation Algorithms for Hard Optimization Problems ...... 69 ...... Placement of Wind Turbines ...... 70

Combinatorial Optimization in Information Science ...... 71

Model-Based Product Configuration ...... 72

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite68 68 a goodsolution is repeatedaslongnecessarytofind the parentsandoffspring.Thisprocess is createdbyselectingindividualsfrom the dates aspossible. After havingproduced is toretainasmanygoodsolutioncandi- population, thegoalforagivenproblem properties. Beginningwithastarting mutation providesthechildwithnew produces achildfromtwoadults,while over portant operatorsare,inthiscase, inherited bythechildren.Themostim- geneticmaterialis which theparents’ through so-calledvariationoperators,in to thebiologicalprinciple.Thisoccurs uced fromaparentpopulationaccording The offspringpopulationcanbeprod- blems, assessesthesolutioncandidates. function, whichdependsupongivenpro- is called vidual special solutioncandidateiscalled upon thisnaturalevolutionprinciple,a combination ofmanygoodcomponents. hope todevelopanoptimalsolutionasa with higher mones, andareusedinlatersolutions chemical messenger, so-called phero- in “good”solutionsaremarkedwitha components; componentsthatparticipate lutions foragivenproblemfromvarious algorithms constructmanypossibleso- munication mechanismsofants.These uses theimageofpathfindingandcom- rithms isantcolonyoptimization,which The TheoryofBio-inspiredAlgorithms OPTIMIZATION ple of tionary modelandtheDarwinianprinci- solution processfollowsfromtheevolu- lutionary algorithms(EA).Thisclassical An exampleofsuchalgorithmsareevo- the areaofcombinatorialoptimization. ways inengineeringdisciplinesand search algorithms,thatareusedinmany offspring anewparentpopulation and Another classofbio-inspiredalgo- Bio-inspired algorithmsaregeneral “survival ofthefittest.” and anumberofsuchcandidates population mutation probability. Indoingso, we [see Figure] . Crossoverusually . A so-called fitness so-called . A . Depending cross- indi-

...... classic analyticalmethods.Furthermore, rithms, onecanusealargenumberof are aspecialclassofrandomizedalgo- problem. Becausebio-inspiredalgorithms generate anoptimalsolutionforagiven how muchtimethealgorithmsneedto or badintheiruse.Themainfocusison problems, andwhatstructuresaregood processes arecapableofsolvingspecific We areexaminingifbio-inspiredsearch classic algorithmsisstillatanearlystage. ing ofthesealgorithmsincomparisonto many cases,thetheoreticalunderstand- already beensuccessfullyemployedin Research focus this newprocess. to understandtheworkingmethodsof specific algorithms.Rather, thefocusis bio-inspired algorithmssurpassproblem- goal ofourresearchtodemonstratethat solution process.Itisthereforenotthe that theycansurpassaspecially-designed is available.Onecannotthereforeexpect blem, nogood,problem-specificalgorithm cially employedif,foragiven(new)pro- Flowchart ofanevolutionaryalgorithm While bio-inspiredprocesseshave Bio-inspired algorithmsareespe-

...... piiainpolm. ::: optimization problems. processes withdifferentcombinatorial directions, whichleadtomoreefficient evolutionary algorithmsadditionalsearch approaches. Thisgivesthesearchfor equal targetfunctionsinmulti-criteria conditions areperceivedasadditional through manysideconditions.These a targetfunction,whichmustbeoptimized optimization problemsaregiventhrough benefits toanefficientsearch.Many inspired algorithmsbringadditional for multi-criteriaoptimizationofbio- use mutationasavariationoperator. onstrable benefitsagainstEAsthat combination andmutationshowdem- Another exampleisthatEAswithre- path betweenallnodesinagivengraph. can efficientlycalculatetheshortest onstrated thatbio-inspiredalgorithms For example,ithasbeenformallydem- wide classofcombinatorialproblems. mimic problem-specificalgorithmsfora lem specificknowledge.Theyareableto well-known problemswithoutusingprob- methods oftenfindgoodsolutionsfor teristics ofbio-inspiredmethods. to order new hn +496819325-1000 Phone AlgorithmsandComplexity DEPT. 1 Frank Neumann +496819325-1004 Phone AlgorithmsandComplexity DEPT. 1 Benjamin Doerr CONTACT analytical methodsaredevelopedin Other studiesshowthatapproaches We haveshownthatbio-inspired respond tothespecificcharac- only

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 69 ...... OPTIMIZATION 69

Approximation Algorithms for Hard Optimization Problems

One of the most basic methods for dealing with NP-hard optimization problems is to design polynomial-time algorithms that find a provably near- optimal solution. Formally, an algorithm is said to be an x-approximation algorithm for a maximization (or minimization) problem if it runs in polynomial time in the size of the input instance and produ- ces a solution whose objective value is within a factor x of the optimum solution. Under this general framework, we have been working on problems from different areas, including graph problems, compu- tational economics, computational geo- metry and mathematical programming...... We give an example below......

Example of a feasible matching between two segment sets in two frames Matching of live cell video frames In joint work with Stefan Canzar, One approach to solve this prob- While the problem was originally Gunnar W. Klau and Julian Mestre, we lem is to match sets of segments between formulated in the Computational Biology have considered a problem that arises na- neighboring frames, where the segment community, our contribution was to study turally in the computational analysis of sets are obtained by a certain hierarchical it from an approximation-algorithms point live cell video data. Studying cell motility clustering procedure. See the example, of view. We first showed that solving the using live cell video data helps under- which illustrates the hierarchical cluster- problem exactly is NP-hard; in fact, we stand important biological processes such ing of two consecutive frames. Matching showed that there is a constant value as tissue repair, the analysis of drug per- here means identifying one set of seg- x > 1 such that it is even NP-hard to formance and immune system responses. ments at a certain level in the hierarchy obtain an x-approximation. On the other Segmentation based methods for cell in one frame with one set of segments hand, we developed a 2-approximation tracking typically follow a two stage in the other frame. Since segment sets algorithm for that problem. ::: approach: The goal of the first detection representing different cells in the same step is to identify individual cells in each frame must be disjoint, no two nodes on frame of the video independently. In a any root-to-leaf path in any of the two second step, the linkage of consecutive hierarchies can be matched at the same frames, and thus the tracking of a cell, is time. In the figure, we show a feasible achieved by assigning cells identified in matching using dashed lines. Typically, one frame to cells identified in the next a non-negative weight is assigned to each frame. However, limited contrast and pair of segment sets (one from the first noise in the video sequence often leads frame and the other from the second), to over-segmentation in the first stage: a indicating how likely this pair is to re- single cell is comprised of several seg- present the same cell. The objective is ments. A major challenge in this applica- to find a feasible matching between the tion domain is therefore the ability to segment sets in the two frames, with distinguish biological cell division from maximum total weight. over-segmentation.

CONTACT

Khaled Elbassioni DEPT. 1 Algorithms and Complexity Phone +49 681 9325-1007 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite70 70 Placement ofWind Turbines OPTIMIZATION they achievesanoptimalenergyoutput. wind farmscanbeoptimizedsuchthat placement ofwindturbinesonlarge In thisproject,weconsiderhowthe methods inthecontextofwindenergy. that exploittheuseofoptimization Currently, thereareonlyinitialattempts that determinethesuccessofitsusage. demands involvesalotofcomplextasks using windasaresourcetosatisfyenergy wind farms.Buildingfarmsand help tooptimizetheperformanceof it isnecessarytoexploitmethodsthat of windfarms(onshoreandoffshore), attention. 300 MW)has F Thanet Wind windfarms (for examplethe large offshore Recently, theinstallationandstart-upof of CO-2sincethebeginning2009. which helpedtosave180Miotonnes roughly 8800windturbinesinEurope, was basedonwind.Currently, thereare 39% ofallnewenergycapacityinstalled 40000 MWto74000MW).In2009, within theEUhasalmostdoubled(from tive installedcapacityofwindenergy climate change.Since2005,thecumula- wide andplaysamajorroleintackling ing roleinthesupplyofenergyworld- To furtherincreasetheproductivity Renewable energyplaysanincreas- gained increasingmedia arm withacapcacityof

...... mization problems. to tacklehighlycomplexnon-linearopti- kind ofalgorithmsisinparticularsuitable with evolutionaryalgorithms(EAs).This output. We tackletheoptimizationtask dering acomplexmodelfortheenergy solution canonlybeevaluatedbyconsi- of ourproblem,thequalityacertain turbines shouldbeplaced.Inthecase conditions oftheareawherewind speed anddirectionsaswellother wake effects,temperature,pressure,wind Accurate modelsmaytakeintoaccount model usedtopredicttheenergyoutput. uation dependsontheaccuracyof certain siting.Thecostofsuchaneval- model topredicttheenergyoutputfora in theirwebsiteFAQ. has differentweaknesses,aspointedout conducts twostrategiesforsamplingbut “optimizer” sampling isrequired.For example,the tive and,instead,somekindofstochastic imply thatexhaustivesamplingisprohibi- Its non-linearityandsearchspacesize can besolvedbytraditionalmethods. the classofoptimizationproblems,which Layout sitingistoocomplextofitinto siting ofturbinesforlargewindfarms. approaches foroptimizingthemicro- The optimizationtaskinvolvesa We developandevaluatenew component ofOpenWind

...... ubro ubns ::: number ofturbines. loss scaleswithenergycaptureand how itallowsonetoexaminewake well itmaximizesenergyyieldandshow loss calculation.We demonstratehow gates andonthecomplexityof how manycandidate ted. rent wakeeffectmodelstobeincorpora- It ismodularandthereforeallowsdiffe- hundreds, eventually1000,turbines. optimization algorithmforthelayoutof accurate, efficient,andparallelizable weak performance.We demonstratean and imprecisemodels,showrather with asmallnumberofturbines,simple ever, currentapproachescanonlydeal order tospeedupthecomputation.How- solutions canbeevaluatedin of solutionscalledapopulation. These EAs workateachtimestep hn +496819325-1000 Phone AlgorithmsandComplexity DEPT. 1 Frank Neumann CONTACT Its computationalcost layouts itinvesti- depends on with aset parallel in its wake

...... bericht_2011_en_korr2 25.11.2011 10:59 Uhr Seite 71 ...... OPTIMIZATION 71

Combinatorial Optimization in Information Science

We now have the ability to gather Despite this negative result, we the problems). We therefore try to find enormous amounts of data, and there are would still like to find a way to rank approximation algorithms. An approxima- numerous challenges in organizing this candidates based on the preferences of tion algorithm is an efficient algorithm data into usable information: Research- different voters. The computer science for solving such a problem, i.e., an algo- ers are working on developing better web community became especially interested rithm for which the running time grows search engines. In biology, microarray in this problem because of its application polynomially rather than exponentially. experiments supply information about in meta search engines for web search. The solution returned by an approxima- expression levels of thousands of genes, Different search engines use different tion algorithm is not necessarily optimal; and this data needs to be organized and algorithms, and they may perform better however, it is guaranteed not to be too analyzed. Internet resellers would like in certain aspects and worse in other far from optimal. to give customized recommendations to aspects. In addition, algorithms are sus- users based on known preferences, user ceptible to spamming when websites In the past few years, algorithms input, and previous purchases. attempt to manipulate the algorithm to have been developed for the meta search achieve a higher position in the search engine problem. These not only work One of the questions that arises results. Finally, search engines them- well in theory, but computational studies when trying to solve these problems is the selves may accept payment to display have shown that they provide optimal ...... question of how to combine information ...... certain websites in a prominent position...... solutions in many practical instances. from different sources: for example, how The goal of a meta search engine is In our current research we are investi- do we rank a list of candidates based on therefore to combine the ranked search gating whether such good results can the different preference orders given by results from different search engines also be obtained for other objectives. ::: the members of a search committee? (the “voters”) into a new ranked list of What is a good clustering of a set of genes results which minimizes the number of if different microarray experiments all disagreements with the input lists, i.e., suggest different clusterings? Or, how do the number of pairs of results that are in we cluster webpages based on similarity reverse order in the new list compared to scores? the original lists from the search engines.

Of course, other objectives than Rank Aggregation minimizing the number of disagreements Some of these questions have a long are possible, too, and – by Arrow’s im- history, such as the question of how to possibility result – no objective is good in rank candidates based on the preferences all situations. We may want our ranking of different voters. The famous Arrow’s to be fair, for example, by ensuring that impossibility theorem states that no vot- we do not contradict any of the input ing system can simultaneously achieve preference orders by too much. Or, we three natural criteria: non-dictatorship may want to guard against selfish voters (we do not simply want to output the and try to ensure that any majority of the preference order of a particular voter), voters cannot secede and obtain a better independence of irrelevant alternatives outcome. (if the output order ranks Anna before Bob, then this should also be the case Many of these objectives give rise if one or more voters change their minds to NP-hard problems; this means that about the qualifications of Charlie), and the running time of the only algorithms finally, Pareto efficiency or unanimity that we know for optimally solving these (if all voters prefer alternative Anna to problems grows exponentially with the alternative Bob, then so does the output size of the problem (and it is widely be- ordering). lieved that this behavior is inherent to

CONTACT

Anke van Zuylen DEPT. 1 Algorithms and Complexity Phone +49 681 9325-1000 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite72 72 Model-Based ProductConfiguration OPTIMIZATION properties. and toautomatically to haveanintegratedviewofaproduct parately. to Ourgoal is not typicallygeneratedtogether, butse- Today, theabove-mentioned viewsare the sameproductlineofVWgroup. the Audi A3 andtheVWGolfbelongto fraction ofcommon ofthehighestpossible ducts onthebasis which areusedto rather theyhaveso-called‘productlines’, on thebasisofindividualproducts,but addition, m What profit financial view (How doIbuildtheproduct?) product?) design view more, manufacturersdifferentiatethe down toitsindividualscrews.Further- for sale,butultimatelytheentirecar cludes optionsandpartsthatarerelevant as themanufacturer’s viewnotonlyin- more complexthanthecustomer’s view, manufacturer’s viewofacarismuch figuration automatically. and for as cars,toshowtheircompletestructure, composed fromindividualparts,such generic methodsformodelingproducts power engines.Ourgroupresearches the tow-barmayexcludechoosinglower- possible higherspeeds,whilechoosing a specificwheelcombinationduetothe the options:astrongerenginemusthave module. Therearedependenciesbetween tional, suchasatowbarortelephone of carcolorortheengine.Othersareop- selection isobligatory, suchasthechoice ing variousoptions.For someoptions, selecting abasicmodelandthenchoos- together theirownpersonalvehicleby the Internet.Typically, customersput offer configuratorsfortheirproductson It shouldbeborneinmindthatthe Today, manyautomotivecompanies answering queriesontheircon- from themanufacturer’s view anufacturers nolongerwork margin doesitprovide?) (Which partscomposethe (What doestheproductcost? produce differentpro- parts. For example, compute relevant make itpossible and the . In

...... Sales viewofacar of thesepropertiesareingeneral configuration viewoftheproduct. All costs ortheaforementionedsales-oriented a productwithminimummanufacturing bilities couldbeanenquiryregarding bute toaproductatall.Furtherpossi- all products,i.e.,whetherthesecontri- sistency ofsubassembliesinrelationto of subassembliestobeused,orthecon- one productlinewithrespecttoaset the numberofpossibleproductsfrom process productmodelsuptoasizeof For thecarexample,wecancurrently within anacceptableperiodoftime. desired productbuiltfrommanyparts can computetheabovepropertiesforany i.e., thereiscurrentlynoprocedurethat Examples ofsuchpropertiesare “hard” , ...... more thantheestimatednumber2 forcretrsac. ::: of ourcurrentresearch. for problemsofthissizearetheobject needs about50,000parts. Algorithms product-line inautomobileconstruction lying dependencystructures. A complete and intelligentexplorationoftheunder- can onlybeprocessedthroughautomatic atoms intheuniverse.Suchlargemodels hn +496819325-2900 Phone AutomationofLogic RG. 1 Christoph Weidenbach CONTACT Design viewofanengine presents 2 of time.Themodelthenpotentiallyre- about 6,000partsinareasonableperiod 6000 different products,much 1300

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 73 ...... 73 bericht_2011_en_korr2 24.11.201116:40UhrSeite74 74 link ininformaticsresearch. is thusaninherentpartandaconnecting a concreteapplicationproblem.Software of anengineeringeffortforthesolution basic research,aswelltheproductout of theimplementationnewresultsfrom the subjectofbasicresearch,outcome fulfills manyfunctionsinthiscontext.Itis applications. Thedevelopmentofsoftware that supportsagreatvarietyofdifferent it alsoresemblesanengineeringscience as correctnessandcomplexity. Secondly, investigates fundamentalpropertiessuch tation andproblem-solvingmethods research thatdealswithuniversalcompu- Informatics isfirstlyadisciplineofbasic SOFTWARE RESEARCH AREAS

...... systems aretobesuccessfullyapplied considered inengineeringifsoftware ristics ofrealdatamustbeappropriately magnetic disksaswellthecharacte- communication, processors,storageand ted computerhardware.Propertiesof puters, particularlyoncurrentdistribu- com- cient whenimplementedonmodern effi- xity dimensions)arenotnecessarily properties (so-calledasymptotic cally analyzableruntimeandstoragespace that show, ingeneral,nicemathemati- an importantresearchsubject. Algorithms cutable softwareis,asmentionedabove, matical modelsandalgorithmsintoexe- software. develop, distribute,andsellthe further to start-up companieshavebeenfounded source software.Insomecases,however, of oursoftwareisdistributedasfreeopen in manyplacesaroundtheworld.Most community andthatareusedinresearch have foundtheirwayintothescientific ments andgroupsattheinstitute,that ware systems,developedinalldepart- There isaconsiderablenumberofsoft- them availabletoscienceandindustry. into practicalsoftwaresystemsandmake the resultsoftheirbasicresearchefforts ments andgroupsworktoimplement the foundingofinstitute. All depart- implemented withgreatsuccesssince Informatics thisphilosophyhasbeen The cleverimplementationofmathe- At theMaxPlanckInstitutefor comple-

...... otaedvlpr ::: software developer. eases theuseofparallelismfor on thisarchitecture.Theadditionallayer a specificproblemeventuallyrunning tecture andasoftware,aimedatsolving forms alayerbetweenparallelarchi- Data” of Adaptive The article which havemeanwhilebecomestandard. mance on to deve these problems.Itisagreatchallenge two softwaresystemsarepresentedfor sistance” In thearticles to automaticallygeneratedrugtherapies. In bioinformatics,weworkonsoftware software solutionsrelatedtothistopic. Searches forSemanticData” in DigitalLibraries The articles (WWW) withsemanticinformation. possibilities intheWorld Wide Web is theenhancementofsyntacticsearch our activities. A currentresearchtopic of “software”areacross-sectionthrough the methodsused. oftheprobleminvestigatedor zations special casesormeaningfulgenerali- provides valuableinformationonrelevant developed method,thesysteminturn ding asoftwaresystemfornewly in practice.Whenwesucceedbuil- The articlesontheresearchfocus lop algorithmsthatincreaseperfor- and parallel hardwarearchitectures “GBACE: Parallel Processing “TopX 2.0–EfficientSearch “Analysis ofHBVResistance” “Analysis ofHIVDrugRe- describes softwarethat and “RDF-3X –Fast describe two , ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 75 ...... 75 CONTRIBUTIONS

Analysis of HBV Resistance ...... 76

Analysis of HIV Drug Resistance ...... 77 ...... GBACE: Parallel Processing of Adaptive Data ...... 78

TopX 2.0 – Efficient Search in Digital Libraries ...... 79

RDF-3X – Fast Searches for Semantic Data ...... 80

UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite76 76 normal nucleotides;therefore,theyare property oftheirchemistrymodeledafter the treatmentofhepatitisBhave Reverse transcriptaseinhibitorsusedin found onbothsidesofthenucleotide. The relevantchemicalbindingsitesare are normallyconnectedinalongchain. building blocksofDNA,thenucleotides, through asimple,buteleganttrick.The prevent thereproductionofviralDNA reverse transcriptaseinhibitors.These Reverse transcriptaseinhibitors purpose. cells. Severaldrugsareavailableforthis prevention ofviralreplicationinliver The goaloftreatmentisthepermanent cancer, antiviraltherapymustbeginearly. sequelae suchaslivercirrhosisand this disease.Inordertopreventthelater people peryeardieasaconsequenceof chronic hepatitisBandaboutonemillion million tion (hepatitis) an acuteorchronicliverinflammation human livercells,whichcanresultin (Image source: Antonio Siber) Antonio source: (Image Hepatitis Bvirusparticles Hepatitis BVirus Analysis ofHBVResistance SOFTWARE (WHO) Of greatmedicalimportancearethe The hepatitisBvirus people worldwidesufferfrom . TheWorld HealthOrganiza- estimates that300to420 ˇ (HBV) infects

...... this model.Thisrequiresseveral genome, wecanalwayscomebackto mine theresistancefactorofviral in itsgenome.Ifwefindawaytodeter- characteristics ofthevirusareencrypted Resistance landscape immunodeficiency virus(HIV). successfully testedwiththehuman fore liketodescribeamethodthatwe to establishtheresults.We wouldthere- the otherhand,ittakesseveralweeks laboratory testingisexpensive,andon variety ofreasons.Ontheonehand, Unfortunately, thisisnotpossiblefora patient sampleandanyavailabledrugs. peutic decisionismaderegardinga ideally beestimatedbeforeeachthera- rical value.Thisresistancefactormust the formofaneasilyinterpretablenume- resistancein one candetermineavirus’ resistance. Inextensivelaboratorytests, thehepatitisBvirusinterms of sure biology. Ourtaskistocompletelymea- with ourcollaboratorsinmedicineand cisions, wehavesetahighgoal,together Resistance factor hepatitis B. difficulty inthelong-termtreatmentof The developmentofresistanceisamajor the treatmentanddevelopedresistance. long asthevirushasnotadjustedto tial diseasesoftheliver, butonlyas way, thuspreventingboundconsequen- reproduction ofthevirusinadecisive completed. Thistherapyslowsdownthe is causedandtheviralDNAcannotbe thening ismissingsothatachainbreak chemical groupsthatcauseschainleng- viral polymerase.However, oneofthe incorporated intotheDNAchainby The basisofourresearchisthatthe In ordertooptimizetherapyde- hun-

...... most effectiveindividualtherapy. ::: in thefuturesupport for ustodevelopanalgorithmwhichcan diversity ofthevirus,andmakeitpossible Europe. Theseprototypesreflectthe thousand hepatitisBvirusesfrom ral hundredviralprototypesfromsev developed asystemwhichchoosesseve- the relevantconclusions.We havenow to inferthegenotypiccausesderive our measurements,andsecond,wehave contexts thatwehavepreviouslyseenin for ourproject.First,wecanonly choice ofthesemeasurementsis for differentviralgenomes.Theclever dred measurementsofresistancefactors nenthttp://www.geno2pheno.org Internet +496819325-3016 Phone ComputationalBiologyand DEPT. 3 Bastian Beggel CONTACT presentation) (schematic Hepatitis Bresistancelandscape Applied Algorithmics choice ofthe reflect crucial all of eral

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 77 ...... SOFTWARE 77

Analysis of HIV Drug Resistance

HIV Therapy sistance. HIV mutations have to be About thirty years ago, the Human interpreted in order to find out which Immunodeficiency Virus (HIV) was dis- drugs are still effective against the variant. covered as the causal agent of AIDS. To address this problem, a database with Since then, a series of drugs against HIV the genotype and the phenotype (the have been developed. The drugs attack drug resistance) of several HIV variants the virus at four different steps of its was created. Using this database and life cycle. In order to treat HIV-infected statistical learning techniques, the com- persons, 24 different drugs are used. The puter program geno2pheno[resistance] reason for the high number of available elucidates the relationships between viral drugs is HIV’s ability to develop drug mutations and drug resistance. These resistance, which is caused by drug- relationships are used by the program to Figure 2: THEO's assessment of different drug combinations resistance mutations, i.e., persistent interpret the genotype of an HIV variant changes in the viral genetic material and assess its degree of resistance against that protect the virus against the drugs. single drugs [Figure 1]. The system has EuResist This characteristic of HIV makes the been proven successful: two thirds of all ...... treatment of an HIV infection consider- HIV therapies in Germany are selected The predictive power of statistical ably difficult. Moreover, the development with aid of geno2pheno[resistance]. models, including the model used by of certain resistance mutations leads to a THEO, depends basically on the size of cross-resistance, affecting several drugs the database it uses. EuResist, an EU- at once (which may include drugs the THEO funded project, was started for this reason. patient has never taken). Department 3 Effective HIV Therapies must deploy The goal of the project is to provide its of the Max Planck Institute for Informa- at least three different drugs. The choice users with a big Europe-wide database tics has developed a software suite to of the drugs must consider drug resistance for clinical HIV data, and to use this data aid virologists and clinicians in choosing and synergistic effects between the drugs to predict the success of HIV therapies. adequate HIV-Therapies. This software (among other things). The Software THEO At the beginning of the project, the suite is continuously kept up to date and (THerapy Optimizer) is an important EuResist database was fed with data also extended. addition to and further development from three local databases (Germany, of geno2pheno[resistance]. THEO uses Sweden and Italy). The project’s success a database as well as statistical learning has led to the incorporation of five addi- geno2pheno[resistance] techniques. In contrast to geno2pheno tional databases and to the provision The resistance of an HIV variant [resistance], THEO’s database contains of data by the pharmaceutical industry. against a certain drug can be measured clinical data on past HIV therapies, e.g. Further data collaborations will follow. in a laboratory by using a time and cost- the viral mutations, the administered A therapy evaluation system to support intensive method. In contrast, the deter- drugs, and the therapeutic success. attending physicians in decision making mination of HIV mutations is quick and THEO uses this data to learn the deter- was created within the project as well, comparatively inexpensive, but has the minants of therapeutic success of a drug which is based on three statistical models. disadvantage that it does not directly combination, given a viral genotype. One of the statical models was created provide information on the viral re- In order to be able to propose therapies in Department 3. This model arose from that will work over a longer period of time, a further development of THEO. The THEO estimates which mutations could EuResist Prediction Engine takes the be developed by the virus to escape the viral genome as an input and outputs drugs. When choosing a therapy for a a list of therapies. These therapies are patient, a physician has to consider a ranked by their success probability and series of factors. This proce-dure can be by the variance of the three different supported by THEO, as it can propose predictions. ::: several therapies for a patient and predict their success probability [Figure 2].

CONTACT

Figure 1: geno2pheno[resistance] interprets Alejandro Pironti HIV mutations to predict the susceptibility of the DEPT. 3 Computational Biology and virus to different drugs, indicated by the resistance Applied Algorithmics factor. If the latter lies on the green zone, the virus Phone +49 681 9325-3015 is susceptible, whereas the red zone indicates a possible resistance. Internet http://www.euresist.org ...... http://www.geno2pheno.org bericht_2011_en_korr2 24.11.201116:40UhrSeite78 78 one wouldliketouseseveralco-processors applications. For largerscientificproblems, allows asignificantaccelerationformany Parallelism onmanylevels core CPUs. as graphicsprocessorsandfuturemany- and highly-parallelarchitecturessuch for softwareisinscalingtomany-core traditional methods,themainchallenge architectures canbeapproachedwith cores. parallel 64 to res with 16 with 2to8cores,many-corearchitectu- tiates multi-core era.Ingeneral,onedifferen- found ourselvesatthebeginningof cores overtheyears. At thattime,we the exponentiallyincreasingnumberof on software,whichhavecomeupwith not yetpresentthesamerequirements cessors werereleased,however, thesedid Parallel co-processors performance hardware. cannot becomeanyfasteronnewhigh- provements. Without thiseffort,software to benefitfromproceeding which nowrequiresexplicitparallelization poses anextremechallenge change inthedesignofmicro cores withinthesamechip. growing hardwareparallelizationinmany and tradedinagainstanexponentially and fasterprocessorcoreswas tations, theleadingdevelopmentoflarger microprocessors. Duetophysicallimi- occurred concerningthefabricationof The parallelrevolution GBACE: ParallelProcessingofAdaptiveData SOFTWARE The useofmany-coreco-processors In 2004thefirstdouble-corepro- Around 2004,aradicalchange between multi-corearchitectures While parallelizationonmulti-core architectures with100andmore cores, andmassively hardware im- for software, This basic processors given up

...... rougher theresolution. the illustration:Thelargerblock, sizes forthedifferentresolutionlevelsin domain. We usevaryingcolorsandblock resolutions indifferentregionsofthe function isrepresentedwithdifferent Depending onaprecisionthreshold,the grid representationoftwofunctions. of theillustrationweshowanadaptive representation iseasier. Intheleftcolumn of aroughresolutioninplaceswherethe to representthefunctions,anduse resolution inareaswhereitisdifficult main ideaistheprincipalofusingafine functions uptoagivenprecision.The used inordertoapproximatecontinuous Adaptive data levels areusedefficiently. important thatalloftheseparallelism an efficientsoftwareexecution,itisvery which formathirdparallelismlevel.For are severalcomputingunitsineach within eachco-processor. Finally, rallelism levelexistsduetomanycores the firstparallelismlevel. A secondpa- several co-processorcards.Thispresents can bedistributedacrossthememoryof within acomputer, sothattheproblem parallel recursionongraphicsprocessors Adaptive gridgenerationandrefinementusing Adaptive datapresentationisoften there core

...... otaepoet. ::: software projects. impact ontheperformanceofcomplex revolution willfinallyalsohaveapositive In thisway, thebenefits oftheparallel great potentialofmany-coreprocessors. matically, offeringasimpleraccesstothe many ofthehardwarerequirementsauto- cessing ofadaptivedataandincorporates Environment) (General Block Adaptive Concurrent used uptonow. TheframeworkGBACE larger softwareprojects,hasbeenlittle many-core co-processor, especiallyfor Due tothiscomplexity, thepotentialof coding posesanadditionalchallenge. tations andthehardware-independent creases greatlyforadaptivedatarepresen- tion everywhere.Softwarecomplexityin- on problemsthatutilizethesameresolu- cult tousethemanylevelsofparallelism of hardware.Itis,however, alreadydiffi- could thereforerunwithdifferenttypes the specificmany-coreco-processor, and this wouldbeachievedindependentlyof parallelism onallthreelevels.Ideally, the adaptivedatarepresentationwith efficiently, onewouldliketocombine Parallel andadaptive the leftcolumn. between theresolutionsascomparedto the illustration.We seeasoftertransition grids canbefoundintherightcolumnof illustration. Theappropriatelyrefined of manyblocksintheleftcolumn condition leadstoarecursiverefinement neighboring blocks.Therealizationofthis transitions betweentheresolutionsof such adaptivegridsallowonlyonelevel nenthttp://www.mpi-inf.mpg.de/~strzodka/ Internet +496819325-4157 Phone ComputerGraphics DEPT. 4 Robert Strzodka CONTACT In ordertosolvescientificproblems Many applicationswhichworkon software/GBACE abstracts theparallelpro-

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 79 ...... SOFTWARE 79

TopX 2.0 – Efficient Search in Digital Libraries

Full text search in semistructured the query language (which supports the the past years. In particular, the internal documents international XPath 2.0 and XQuery 1.0 index structure used by TopX was trans- TopX is a search engine for efficient Full-Text standards defined by the W3C) formed from a conventional database full-text search in semi-structured docu- allows for a gradual clarification of the system into a completely new, highly ments, a data format which is frequently requests. optimized index structure, which was used, for example, in digital libraries. specifically designed for TopX. This new These documents are called semi-struc- Following the above example, the index structure supports highly efficient tured as they combine rich semantic user may start searching the annotated forms of index compression and the annotations and structural markup with Wikipedia corpus with a pure keyword distribution of indexes across multiple extensive text passages in a unified format, query for “Max Planck”. This query would computers (i.e., compute cluster), which the popular XML (Extensible Markup provide (amongst others) links to the contributes significantly to the scalabi- Language). In particular semantic anno- person Max Planck as well as to the lity and efficiency of the search engine. tations are of interest for query evalua- various Max Planck {institutes} among Our experimental results and the regular, tion because they can contribute to dis- its best matches. Taking advantage of very successful participation in interna- ambiguating the textual contents and the document structure, a simple refine- tional benchmark competitions support thus to better understanding the textual ment of the query might then look like the globally strong position of TopX for ...... information. Let us consider, for example, the following: information retrieval over semi-structu- the keyword query “Who were doctoral red data that significantly outperforms students of Max Planck, who themselves //article[.//person ftcontains „Max Planck“], commercial products. became scientists?” Assuming that appro- priate annotations exist in the collection, which would tell the search engine that this keyword query can be translated articles about the person “Max Planck” Applications and open-source into a very specific path query over a would be preferred. availability corresponding document structure: TopX targets both the direct user Moreover, different forms of query application as well as the API-based //article[.//person ftcontains „Max Planck“] relaxations and ontological query expan- integration with other systems. With //doctoral_students//scientist sions are supported by TopX. On the one the international XQuery 2.0 Full-Text (according to XPath 2.0 Full-Text hand, overly specific queries that would standard, TopX also offers improved Standard) not yield any exact matches can be dy- support for imperative programming namically relaxed at query time by auto- language concepts, such as loops and The results of this query over a matically switching from a conjunctive conditional statements, which increases Wikipedia-based, annotated XML corpus, query evaluation mode into a disjunctive TopX’s usability with automatically ge- will then be returned by the engine as evaluation strategy. On the other hand, nerated queries. direct references to “Gustav Ludwig Hertz” through ontological query expansions, or “Erich Kretschmann” as compact links, TopX can also find matches to semanti- The original prototype of TopX was for example, to their own encyclopedia cally similar concepts. already made available as an open-source entries. package in 2006. A respective open-source distribution of the current TopX 2.0 Efficiency and scalability version with support for distributed in- Exploratory search Much of the implementation of dexing, incremental index updates, and The prerequisite for a successful TopX has been continuously improved the XQuery 2.0 Full-Text standard is application of this query language is and in part completely redesigned over planned for 2011. ::: however the user’s detailed knowledge of the document structure, the so-called XML schema. In order to support the user in formulating structured queries over an initially unknown document schema, TopX also allows for an explora- tory form of search. Starting with simple keyword queries, the user can employ the search engine to gradually learn more about the document’s schema and then CONTACT refine the queries step by step via struc- Martin Theobald tural conditions. The expressiveness of DEPT. 5 Databases and Information Systems Phone +49 681 9325-5007 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite80 80 that EdinburghisinScotland. id1, thatid1isborninEdinburgh,and strate thattheobject and authorOf, id2> in 1859.Furtheredges,suchas the name Arthur ConanDoyle,born that thesubject object. Thereareedgesintheexample “predicate” the graph,thiscorrespondswith subject, apredicateandanobject.In A tripletconsistsinRDFwritingofa an associatednodepairinthedatagraph. each ofwhichcorrespondstoanedgeand An RDFdatacollectionconsistsoftriplets, Resource DescriptionFramework (RDF) designed forgraph-structureddata,isthe reasoning. i.e., networkswhichallowforautomatic data graphsformsemanticnetworks, formal semanticsassociatedwithit,the form anetworkorgraphstructure.Using together withtheentitiesthemselves, things, or, moreabstractly:Entities, his friends.Theserelationshipsbetween of web2.0pageshasconnectionsto takes partinspecificreactionsandauser a bookhasoneormoreauthors,protein relationships areubiquitous.For example, about relationshipsbetweenobjects.Such that is,datawhichcontainsinformation manage andsearchforsemanticdata, Semantic Data Example RDFgraph RDF-3X –FastSearchesforSemanticData SOFTWARE A dataformat,whichwasspecially RDF-3X isadatabasesystemto labeled edgefromsubjectto , “id1” “id2” is associatedwith , whichexpress is writtenby ?titel. ?author ?book ?city Scotland ?author ?city pattern: can describethiswiththefollowing the titleofbooksbyScottishauthors,one data. Ifonesearches,forexample, pattern thatmustbesearchedforinthe SPARQL, forexample,anddescribea l formulated inaquery expensive. Search fact thatRDFsearchesarerelatively graph structureofthedata,leadsto flexibility, whichisachievedthroughthe kind ofnotation.Thesimplicityand be formulatedrelativelysimplywiththis Search inRDFGraphs The databasesystemwehavede- The partsbeginningwithquestion Also, complexrelationshipscan order toanswerthisquery, the queries arenormally anguage suchas conditions by the conditions by Because triple that

...... ytm.::: systems. frequently muchfasterthanother with manytriplepatterns,RDF-3Xis tremely well.Evenforcomplexqueries systems, RDF-3Xusuallyperformsex- experimental comparisonwithother in ordertoscalesuchdatasizes.In Passing niques, suchas have thereforeincorporatedmanytech- stop thewholedatabase.Overtime,we removing olddata,allwithouthavingto i.e., insertingnewdataand,ifnecessary, challenge isupdatingsuchlargedatasets, data setswithbillionsofedges.Onebig sure thatitscalesefficientlytoverylarge for smalldatagraphs,butalsotomake important forustouseRDF-3Xnotonly follow noschema).Despitethis,itwas show noknownstructure(i.e.,they graphs becausethefrequently ready difficultforevenrelativelysmall Efficiency andScalability processing byafactorof10ormore. strategy canoftenacceleratethequery time. A carefulchoiceoftheexecution data set,andhowthisaffectstheexecution are moreauthorsorcitiesinthe query, onemustestimatewhetherthere of statisticalmodels.For theexample alternative ischosenwiththeassistance assessed, andthenthemostefficient natives. Thedifferentalternativesare time fromthedifferentexecutionalter- be expectedtohavethelowestexecution operators andchoosetheonesthatcan execution strategiesusingalgebraic we mapsuchinterrelatedqueriesinto with linkedtriplepatterns.Therefore, interrelationships andthusposequeries hn +496819325-5007 Phone DatabasesandInformationSystems DEPT. 5 Thomas Neumann CONTACT Searching insemanticdataisal- and Triple Versioning Sideways Information in RDF-3X

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 81 ...... 81 bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 82 ...... 82 RESEARCH AREAS

VISUALIZATION

Pictures are the fastest access to human The basis of qualitatively high- In order to show lifelike models of value, computer-generated images are virtual worlds we have also developed new consciousness. Algorithms for appropriate accurate scene models. At the Max simulation methods for light propagation Planck Institute for Informatics, auto- in scenes, so-called global lighting. In visualization of digital information therefore mated methods to digitize real objects global illumination, we simulate (in addi- play a significant role in informatics. The are being researched, which next to their tion to incident light, which is reflected own scene geometry, also exactly measure directly from the objects) especially the requirements for visualization algorithms ...... lighting and reflection properties. Thus, ...... indirect light, which bounces back and ...... the reconstruction of dynamic scenes forth over different surfaces. For this, have dramatically increased: Artificial gains more meaning and finds many we focus on the development of real- and natural worlds must be presented in applications in computer animation time algorithms. as well as in the areas of 3D TV and an ever more realistic and fast manner – 3D video (see also here research reports The images created through image from the area “Understanding Images synthesis or image construction typically in flight simulators, surgical operation and Videos”). have a brightness area (dynamic range) which is similar to the real world. We planning systems, computer games or with Another object of our research is research algorithms for image processing the illustration of large amounts of data the reconstruction of statistical models of these so-called High Dynamic Range of specific object categories such as images (HDR) as well as methods for in natural and engineering sciences. faces or human bodies from 3D scans. their portrayal on standard displays. These models make it possible for us to simulate a movement of the face or the Photographs and videos also repre- body, and also to simulate the changing sent forms of visual media. These are face or body form (for example, after both aesthetically appealing and impor- gaining weight). The solutions of many tant tools for data analysis. We are there- problems in image processing, computer fore working on new optical systems for animation, and 3D movement measure- cameras and turn the classical camera ment are dramatically simplified with into a calculation instrument, which can these models. One can discover underly- extract much more information from in- ing similarities and symmetries between dividual images than pure light intensity, different 3D models as well as within such as 3D geometry, for example. ::: individual 3D models using statistical analysis of 3D geometry. In this way, the basic building blocks of 3D shapes can be seen, and rules can be automatically derived as new versions of these forms are created...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 83 ...... 83 CONTRIBUTIONS

HDR – Images and Video with Enhanced Contrast ...... 84

Advanced Real-Time Rendering ...... 85 ...... Computational Photography ...... 86

Statistical Models of People ...... 87

Correspondences, Symmetries and Automatic Modeling ...... 88 ...... UNDERSTANDING IMAGES & VIDEOS

BIOINFORMATICS

GUARANTEES

INFORMATION SEARCH & DIGITAL KNOWLEDGE

MULTIMODAL INFORMATION

OPTIMIZATION

SOFTWARE ...... VISUALIZATION max planck institut

...... informatik bericht_2011_en_korr2 24.11.201116:40UhrSeite84 84 HDR –ImagesandVideo withEnhancedContrast VISUALIZATION the capabilitiesofhumanperception. which providesaccuracythatexceeds on colorwithmuchhigherprecision, ditional imagingbyperformingoperations (HDRI) overcomesthelimitationoftra- range thantheirCRTancestors. much broadercolorgamutanddynamic LCD andPlasmadisplayscanvisualize longer valid,asthenewgenerationsof was developed.Thisassumptionisno the timewhenJPEGcompression Ray Tube (CRT)monitorsorTVsetsat majority ofdisplays,whichwereCathode formation ascouldbedisplayedonthe efficiency reasons,tostoreasmuchin- popular JPEGformatwasdesigned,for of displaydevices.For instance,the reproduce themonthenewgeneration does notevenoffersufficientqualityto ness levelsvisibletothehumaneyeand only asmallportionofcolorsandbright- and videomaterialusedtodaycancapture regions ofdifferentluminancelevels by theanchoringtheoryoflightnessperception. behavior ofthehumanvisualsystemaspredicted image Tone mappingreducesdynamicrange inanHDR blue and red in the left inset) left the in red and blue (the right inset) inset) right (the The HighDynamicRangeImaging The vastmajorityofdigitalimages by separatelyprocessing , whichmodelsthe (marked in in (marked

...... going tobeavailableonthemarketsoon. level HDRdisplayandcamerasare effects inmovieproductions. A consumer in videogamesandastoundingspecial more realisticvirtual-worldexperience and realisticimagerendering.Itdelivers pularity inphotography, imageediting, strongly strongly highlights, darkshadowsaswellvivid, nous surfaces(sun,shininglamps),bright common visualphenomenaasself-lumi- images, HDRimagerycanrepresentsuch traditional imaging.Unlike cues, whicharenotachievablewith store, andvisualizearangeofperceptual precision, butalsoenablestosynthesize, HDRI doesnotonlyprovidehigher with nounder- orover-exposed regions. scenes withallperceivablecolorsand HDR imagescanfaithfullyrepresentreal dynamic range Rendering HDRimagesforthedevicesofalimited HDRI hasalreadygainedvastpo- saturated colors.

...... n cdmahsbe prvd ::: and academiahasbeenapproved. partners fromtheEuropeanindustry Research) incooperationwithmany in thefieldofScientificandTechnical COST Action (EuropeanCooperation our proposaltolaunchtheso-called fertile groundintheEUCommission; imaging. perception thatgreatlyinfluencedigital tent andresearchtheaspectsofhuman and algorithmsforprocessingHDRcon- of HDRimaging,developstandardtools Our missionistopopularizetheconcept definition ofnewimagingstandards. contentrequiresalotof HDR software andhardwaretoworkwith However, redesigningexistingimaging bilities ofthehuman required, belimitedonlybythecapa- technology and,if be restrictedbyany of animagingframeworkthatwouldnot In ourwork,wetry [http://www.mpi-sb.mpg.de/ resources/pfstools/] nenthttp://www.mpi-inf.mpg.de/resources/hdr/ Internet +496819325-4029 Phone ComputerGraphics DEPT. 4 Karol Myszkowski CONTACT projects but also used by usedby projects butalso images, intended software forimageprocessingonHDR sizes. We havealso necessary to achieve goodcom not onlystoreHDRimagery, butalso tionally highfidelity. Theformatscan encoding real-worldsceneswithexcep- and imageformatsthatarecapableof Informatics, wehavedesignednewvideo Recently, ourideashavefallenon At theMaxPlanckInstitutefor reduce datatomanageable mostly forresearch pression performance, developed asuiteof storage efficiencyis to realize the concept particular imaging visual perception. photographers effort and ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 85 ...... VISUALIZATION 85

Advanced Real-Time Rendering

Rendering is one of the fundamen- Figure 1: tal research areas in computer graphics, The depth-of-field effects rendered with a thin lens model (the upper- which is striving to bring photorealistic left half) greatly improves photo- imagery to digital media. Despite the re- realism, in comparison to the image cent strides made in graphics algorithms rendered with a standard pinhole lens model (the lower-left half). and hardware, real-time simulation of light propagation remains challenging. This article presents our attempts to realize photorealistic effects like the ones in normal photographs. A more advanced simulation can video playing abilities. Such server-client be handled with a spherical lens. One graphics platforms have many other ad- Image-based ray tracing example is aberration, resulting from the vantages, e.g., potentially huge data sets A typical approach to trace rays in deviation of rays at the focusing point. remain on the server, which simplifies a scene relies on the 3D mesh represen- Although it is considered as an artifact maintenance, improves security, and en- tation. Hierarchical data structures often in reality, the presence of aberration ables consistent updates in collaborative ...... help to accelerate the intersection test...... adds dramatic realism to a static scene...... work scenarios. In our research, we focus Nonetheless, the typical performance of Figure 2 demonstrates the curvature of on reducing the server load by perception- such approaches has been inappropriate a field, one of the common aberration based adjustment of the rendering qua- for real-time applications. Our approach effects. lity to the video streaming bandwidth. based on image-space scene representa- This way, image details, which would tion precludes invisible data and thereby otherwise removed due to the com- relieves costly computation. Moreover, pression, are not rendered on the server. its spatio-temporal coherency is appro- priate to take advantage of parallel GPUs. In another approach, to reduce server Our rays are defined in a 2D image domain. workload, we capitalize on the often sig- The complexity of depth dimension is nificant computation capabilities of the reduced with the layered data structure. client, who, for the pure-video-streaming scenario, stays mostly idle. To transfer high-precision auxiliary depth data to the Simulation of optical lens effects Figure 2: Curvature of a field makes an illusion of still velocity. client, our solution uses an efficient edge Optical simulation of a lens system encoding and a new fast diffusion pro- is an example of efficient application for cess. The availability of such additional the image-space ray tracing. Unlike the Server-based rendering information makes it possible to employ pinhole lens model common in graphics, Server-client platforms that enable many applications, such as spatio-temporal real cameras have a finite size of aperture. remote 3D graphics interaction are of upsampling (superresolution). Conse- The traditional approach, which was the high importance in the age of mobility quently, costly per-pixel shaders are only only accurate approach before ours, yet and constant hardware change. By rele- applied to low resolution frames on the very costly, was to distribute multiple gating rendering to powerful servers and server, which are streamed to the client rays sampled in a thin lens. The results streaming the resulting frames, the re- where the corresponding high-resolution rendered with the image-based ray trac- quirements on the client side can be frames are reconstructed, hereby lower- ing [Figure 1] are hardly distinguishable reduced to standard web browsing and ing server workload and bandwidth. ::: from the traditional rendering, yet the speedup of rendering performance rea- ches up to an order of two magnitudes.

CONTACT

Sungkil Lee DEPT. 4 Computer Graphics Phone +49 681 9325-4000

Karol Myszkowski DEPT. 4 Computer Graphics Phone +49 681 9325-4029 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite86 86 Computational Photography VISUALIZATION are knownascolorfilterarrays(CFAs). green andblue,forexample.These filters formation channel–forthecolors red, use ofseparatesensorchipsfor each in- pensive productionincomparisontothe sensor. Thisprocessallowsforlessex- optical filtersindifferentpixelsofasingle such ascolor, aredigitizedbysuitable mode, i.e.,variousformsofinformation, system Theoretical analysisofmultiplexcamera less expensivemodels. lenses gainamaximumof1f-stopover cameras. Incomparison,professional dynamic rangethanstandarddigital prototype whichhasa1.5f-stophigher light. Ourgrouphasdevelopedacamera of onef-stopbeingdoubletheamount depict upto12-14f-stops–thedifference stops, film-basedcamerascanaccurately have adynamicrangeofuptoonly10f- image. Whilethebestdigitalcameras est anddarkestperceivableobjectsinthe contrast relationshipbetweenthebright- pertains tothedynamicrange,i.e., compared tofilm-basedapproaches, as of thedigitalcamerasbeingusedtoday, every way. Themostsignificantlimitation are notsuperiortofilm-basedcamerasin outside thefilmindustry, digitalcameras mostly replacedanalogrecordingprocesses A camerawithhighdynamicrange(HDR) applications, suchas3Dreconstruction. in ordinarycamerasanddevelopnew in ordertocorrectcommonweakpoints group investigatesinnovativealgorithms proved bycomputationalmethods.Our of howconventionalcamerascanbeim- tional photographydealswiththequestion The newlyemergingfieldofcomputa- a computerunitlocatedonthecamera. values. Thisconversionisperformedby incident lightdirectlyintonumerical a chemicalbasis,digitalcamerasconvert While thetraditionalprocessoperateson placed byitsdigitalcounterpartyearsago. Cameras areoftenusedinmultiplex Although digitalphotographyhas Traditional photographywasre-

...... range oftheimage The cameraprototype tal problemwith3Dscanningsystems above allinfilmproduction. A fundamen- but alsointheentertainmentindustry, digital preservationofculturalartifacts, measurement, qualitycontrol,inthe is usedmoreandcommonlyin Digitization, alsocalled3Dscanning, tion areafornewcameratechnologies. of objectsisanotherimportantapplica- hemispherical viewpointdistributions Kaleidoscopic camerasforgenerating prototypes intermsoftheirnoisepatterns. ables tocomparethedifferentcamera the applicationofnewmodelen- tional imagereproduction.Moreover, struction algorithmsforcoloranddirec- can beusedtocreateimprovedrecon- single overarchingmodel.Thisknowledge recent yearsarespecialvariationsofone of thetypesnewcamerasstudiedin theoretical work,establishedthatmany ing. Ourgrouphas,inthecourseofour the focusofanimage possible, amongotherthings,tochange The recordingoflightfieldsmakesit can berecordedindifferentchannels. tribution oflight,theso-calledlightfield, to colorinformation,thedirectionaldis- Recent analysisshowsthat,inaddition utilizing thepriorencoding The three-dimensionaldigitization (top right) (top (left) (left) (bottom right) (bottom uses speciallydesignedfilters,whichcodethedynamicrangeinfrequency after . Computationaloptimizationmethodscanrestorethelostdynamicrange, the record- .

...... simultaneously points (>200).Thesecanberecorded generate ahighnumberofvirtualview- our groupthatusesamirrorsystemto quality. We havedevelopedamethodin and yieldsanincreasedreconstruction for anobjectprovidesmorerobustresults greater numberofmeasurementpositions in computedtomographytechnology, a the qualityofdigitization.Justlike between thenumberofcamerasand In addition,thereisadirectcorrelation by employingmulti-camerasystems. be achievedatconsiderableexpense: tization of a time.Thisimpliesthatall-rounddigi- only beviewedfromoneperspectiveat to haveanall-aroundview;objectcan is theinabilityofconventionalcameras view-points is possibletodividetheimageintoitsindividual tions. Withanalgorithminventedbyourgroup,it struction consistsinthemutualmaskingofreflec- The difficultyofusingtheseimagesfor3Drecon- delivers arough3Dmodelofthescene nenthttp://www.mpi-inf.mpg.de/ Internet +49681302-70756 Phone ComputerGraphics DEPT. 4 Ivo Ihrke CONTACT serves manifoldviewsofthesameobject One camerainakaleidoscopicmirrorsystemob- rent viewpoints) rent departments/d4/areas/giana/ (center, different colors indicate diffe- indicate colors different (center, dynamic . Aninterimstepinourprocedure with a objects canonly single camera. ::: (right) (left) . .

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 87 ...... VISUALIZATION 87

Statistical Models of People

Statistical face models By using a so-called Lightstage (a sphere covered with light sources, in the middle of which is a subject’s face), the properties of a face can be captured much better than in the past. Now, it is possible to precisely capture the 3D geo- Figure 1: From left to right: Input image, recommended makeup, non-recommended makeup, metry of a face, in which each individual detail enlargements pore is visible. Further measurable pro- perties are, for example, the diffuse and of the human skeleton, and therefore the This new model can be adapted to specular components of the reflected movement of the body. The main inno- image video data and thus measures the light, the shininess of the skin, or the vation is the second space, which allows movement and the constitution of a cer- strength of the scattering of light into modifying the body’s shape and constitu- tain person in a far more accurate way. the skin. This improved measurement tion through a few intuitive parameters. Using another newly-developed approach device enables us, for the first time, to Examples of such parameters are: Weight, – MovieReshape – we can also measure capture the influence of makeup on the body size, leg length, waist girth, mus- human motion in 2D video streams, that ...... properties of the skin. cularity, etc. The model was generated are, normal films and videos, using one by applying a machine learning method single camera perspective. After capturing We have developed a software based to laser scans of over 120 real people of the movement of a person, the body shape on such a Lightstage to recommend an various ages and both sexes. It thereby parameters can be modified throughout appropriate makeup for a subject and reflects not only the range of motions the entire video sequence [Figure 2]. So present her with a convincing 3D pre- a person can do, but also the wide span actors can, for example, be made taller, view. Up to now, literature only reports of variability in body shape among the smaller or more muscular without need- on approaches which enable transferring population. ing to reshoot the scene [Figure 3]. ::: makeup from a photo of a person to the photo of another person. For our new system, we established a database with 56 women, which captures their faces with and without makeup. If a suitable makeup is needed for a woman who is not yet in the database, her skin char- Figure 2: MovieReshape: The movement of the acteristics and typical facial features are actress in a video can be captured with a statistical compared with women in the database body model. Her body size can then be modified in the entire film segment. Figure 3: MovieReshape: The muscularity of the actor so that a appropriate makeup match can from the above original sequence was increased in be found. the lower sequence.

Statistical body models A detailed model of the human body is highly important in many applica- tions for image recognition and computer graphics, such as, for example, in marker- free motion analysis, man-machine inter- action, or computer animation. Previously used models, as a rule, could only show the geometry and movement of a specific CONTACT person with great accuracy. The need to produce a person-specific model preven- Christian Theobalt ted, however, applying these image pro- DEPT. 4 Computer Graphics Phone +49 681 9325-4028 cessing methods to any person. We have therefore developed a new body model of people, which is described by two low- dimensional parameter spaces. The first Thorsten Thormählen parameter space defines the joint angles DEPT. 4 Computer Graphics Phone +49 681 9325-4017 ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite88 88 put togetherinadifferentway. of thesamebuildingblocks,butperhaps is equivalenttoallothermodelsmadeup madeupofdifferentbuildingblocks del able. Inparticular, thismeansthatamo- of thesamestructureshouldbenotice- an object,nodifferencebetweenobjects look atasmall,locallyrestrictedpartof the sameonalocallevel.Ifweonlyever is structurallysimilartoanotherifitlooks has itsoriginsinimageanalysis: An object assumptions. Here,weuseamodelthat important aspectswithsimpleformal the humanbrain;trickistocapture with thecomplexcognitiveprocessesin must naturallyloweroursightscompared line withthismodel.For themodel,we automatically recognizesstructuresin of. Secondly, weneedanalgorithmthat up what thestructureofanobjectismade a problem:Firstly, weneedacriterionfor What isstructure? VISUALIZATION models usingthecomputer? eye. Canwerecognizethestructureofgeometric king structureimmediatelyrecognizedbythehuman Figure 1:Threeexamplesof3Dmodelswithastri- Correspondences, SymmetriesandAutomaticModeling tional modelingwork. to theparkinggaragewithoutanyaddi- example, enablenewfloorstobeadded zed toacertaindegree.Thiswould,for models couldbeautomaticallyrecogni- be extremelyusefulifstructuresof3D for thecomputer. Nevertheless,itwould this typeisconsiderablymoredifficult entrance andexitramps. An analysisof up ofrepeatedsegments,togetherwith various floorsthatarethemselvesmade can immediatelytellthatitconsistsof parking garage,asshowninFigure1,we example, whenwelookatthemodelofa the structuresinobjectsintuitively. For Two thingsareneededtosolvesuch As humans,wecanrecognize

...... between objectparts We calculateallpartialcorrespondences dence analysistoanindividualobject: mation. We thenapplythiscorrespon- these piecesunderarigidbodytransfor- we havefound identical exceptforarotationorshift, we findtwopiecesofgeometrythatare by usingacorrespondenceanalysis:When Correspondences andsymmetries various combinations. thus formedcanthenbeassembledin cut inthebluearea.Thethreepieces area inFigure2generatesasymmetrical symmetrical areas: A cutthroughthered by cuttingthemodelapartwithin metries an objectarereferredtoas in Figure2] (red on blue) on (red the inputobjectispartiallyprojectedontoitself Figure 2:Apartialsymmetry:Intheillustratedshift, We calculatesuchbuildingblocks . We thenfindourbuildingblocks . Suchcorrespondenceswithin . correspondences [an exampleisgiven partial sym- between

...... display asimilarstructureto red. Thenewlygeneratedmodels Figure 1 examples forhowthemodelsshownin were calculated.Figure3 model fromwhichthebuildingblocks guaranteed tobesimilartheinput to createnewmodels.Thesemodelsare kit ofbuildingblocksthatcanbeused a of thiskindandusesthemtobuildup Automatic modeling aino Dmdl. ::: of3Dmodels. cation functions intheprocessingandmodifi- work isfacilitatedbysemi-automatic supporting 3Dmodeling:Theuser's practice, thisisanimportanttoolfor build similarmodelsindependently. In models andtousethisknowledge little aboutthestructureofgeometric taught thecomputertounderstanda examples. With thisprocess,wehave nenthttp://www.mpi-inf.mpg.de/departments/d4/ Internet +496819325-4008 Phone ComputerGraphics DEPT. 4 Michael Wand +496819325-4008 Phone ComputerGraphics DEPT. 4 Martin Bokeloh CONTACT similar objectsarecalculatedautomatically. example objectsmarkedinredareanalyzedand Figure 3:Resultsoftheautomaticmodeling.The Our algorithmthendefinesallcuts areas/statgeoprocessing (red) can beautomatically gives some the input (gray) alte-

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 89 ...... 89 bericht_2011_en_korr2 24.11.201116:40UhrSeite90 90 strengthen therecruitmentandtrainingofyoungscientists. within astructuredprogramthatprovidesexcellentconditionsforresearch.Theaimisto They offerespeciallygiftedGermanandforeignstudentsthepossibilitytoearnadoctorate initiative topromoteyoungscientists:theInternationalMaxPlanckResearchSchools(IMPRS). in Germany. TheMaxPlanckSociety, incooperationwithGermanuniversities,haslaunchedan The trainingofyoungscientistsisfundamentaltothefuturescience,research,andinnovation School forComputerScience The InternationalMaxPlanckResearch IMPRS-CS

...... (IMPRS-CS)

...... bericht_2011_en_korr2 24.11.201116:40UhrSeite91

...... Poland. coming fromBulgaria,China,Indiaand from abroad,withthelargestcontingent percent ofourdoctoralcandidatescome German researchinstitutions.Over50 future workatorincooperationwith stitutions, andarousetheirinterestin familiarize themwiththeresearchin- to doingdoctoralstudiesinGermany, especially toattractforeignapplicants cooperation: TheIMPRS-CSstrives and theiracademicadvisors. collaboration betweendoctoralstudents ing ofindividualdoctorates,andclose specialization, oftenwiththematiclink- first-class trainingprograms,academic degree andthePhD.Thisincludes are betweenthebachelor’s ormaster’s for youngscientistsandscholarswho Promotion ofyoungscientists One focusisoninternational The IMPRS-CSisanopportunity

...... lish isrequiredforallcandidates. versity. OutstandingknowledgeofEng- of ComputerScienceatSaarlandUni- and theircolleaguesintheDepartment scientists oftheMaxPlanckInstitutes The projectsarejointlysupervisedbythe puter ScienceatSaarlandUniversity. Systems andtheDepartmentofCom- the MaxPlanckInstituteforSoftware Max PlanckInstituteforInformatics, are offeredinclosecooperationwiththe doctoral degree. All graduateprograms level competencyandalsotoachievea offers programstoestablishgraduate of ComputerScience,theIMPRS-CS and theSaarbrückenGraduateSchool IMPRS-CS programs Together withSaarlandUniversity

...... IMPRS-CS xusos ::: excursions. at severallevels,jointactivities,and We offerEnglishandGermanclasses administrative mattersofallkinds. in findingaccommodationandwith them. Inaddition,weassistourstudents spouses andchildrenwhoaccompany age forthemselvesaswelltheir expenses, andhealthinsurancecover- scholarship whichcoversfees,living Financial support nenthttp://www.imprs-cs.de Internet +496819325-1801 Phone IMPRS-CS Jennifer Gerling CONTACT IMPRS-CS studentsreceivea 91 bericht_2011_en_korr2 24.11.201116:40UhrSeite92 92

...... of anMPIproject The calendaroftheMaxPlanckSociety Images fromScience Minister Hartmann Participation oftheMaxPlanckInstituteforInformaticsatCeBITfair CeBIT 2010 NEWS Department ofSaarlandUniversity, the cooperation withtheComputerScience sent attheSaarlandresearchbooth,in Institute forInformaticswasagainpre- the public. form forpresentinginnovationinITto tacts. Inaddition,itisawell-usedplat- vides amajorforumforeconomiccon- addresses andcorporateevents,pro- a tradefairwithconferences,keynote events fortheITindustry. Itcombines nally oneoftheworld'smostimportant be foundinthe2011calendar. ::: Planck InstituteforInformaticsand can tributions werechosenfromtheMax organized. from the80MaxPlanckInstitutesis images, acompetitionamong week oftheyear. In current researchtoaccompanyevery surprising, andunusual ima produces ayearplannerwith In March2010,theMaxPlanck The CeBITinHanoveristraditio- As inthepreviousyear, twocon- The MaxPlanckSocietyannually (middle right) (middle order toselectthe at thepresentation ges from researchers aesthetic,

...... Editing] [http://www.mpi-inf.mpg.de/resources/ Reflection Emeliyanenko andMichaelKerber have twofurthersurfacesbyEricBerberich, Pavel Illustration ofaringcyclidewhosesection curves developed byDr. ThorstenThormählen Saarland researchstandisatechnique Saarland (KIS). Competence CenterComputerScience Computing andInteraction”, Cluster ofExcellence as partofthe ted W Prof. Dr. Christoph Processing” synchronizes them( pieces ofmusic,andcompares searches notes,audioandtextfrom is aMultimodalMusicPlayerwhich Dr. MeinardMüller’s researchproject freedom inthecompositionofascene. in aninteractiveway, givinggreatartistic tions incomputer-generated 3Dscenes cec,D.CrsohHrmn. ::: Science, Dr. ChristophHartmann. with theMinisterforEconomicsand ness visitedtheresearchstandtogether of representativesfrompoliticsandbusi- “Version ManagementintheFuture” One oftheprojectsshownat On March4,aSaarlanddelegation , whichenablesonetodivertreflec- , page65). “Future Talk” “Automated Music eidenbach presen- “Multimodal series. and the

...... motor byMartinCadikandGlennLawyer Combination ofmultiplerecordingsasunken ship Saarland researchstand

...... bericht_2011_en_korr2 25.11.2011 11:05 Uhr Seite 93 ...... NEWS 93

Business Location Saarland Saarland – a top location for national and international investors

On August 18, 2010, the media Following Prof. Seidel’s speech, the project “Business Location Saarland” was audience, consisting of more than 150 presented in the auditorium of the Saar- representatives from politics, business brücken city hall. For this trilingual loca- and science, as well as many represen- tion marketing project that aims to high- tatives of the press, took part in the light the benefits of the business location movie premiere of the promotional film Saarland, the European Economic Pub- [http://ebn24.com/index.php?id=35888]. At the lishing House presented a book, a film, concluding get-together, the book and a website created in collaboration “Business Location Saarland”, still hot with companies and institutes from the of the press, was viewed and discussed. Mayor Ralf Latz, Director Prof. Hans-Peter Seidel, region [http://ebn24.com/index.php?id=34983]. Prime Minister Peter Müller and Publisher Christian The Max Planck Institute for Informa- Kirk (from left to right) Produced with the participation tics as well as the Cluster of Excellence of renowned companies and authors from “Multimodal Computing and Interaction” Peter Müller, Mayor Ralf Latz and pub- the worlds of politics, business, science, are representatives of the quality and lisher Christian Kirk. In his speech, Prof. art, culture and sports, and released in attractiveness of this scientific location Seidel emphasized the importance of German, English and French, the book and its world-wide reputation as a cen- informatics for Saarland and pointed out and film together with the free website ter for cutting-edge research. an impending paradigm shift in informa- aim to attract the interest of national and tion technology. The collection, prepara- international investors by emphasizing Prof. Dr. Hans-Peter Seidel was tion and processing of multimodal con- Saarland’s strengths. ::: invited as a keynote speaker. Other pro- tent will become one of the main topics

...... minent guests included Prime Minister ...... of informatics in the future......

Intel Visual Computing Institute (Intel VCI) A new research center on the campus of Saarland University

In May 2009, the new Intel Visual experiences and more natural ways for Computing Institute (Intel VCI) was people to interact with computers and opened in the presence of Saarland other devices. Intel’s vision on visual Prime Minister Peter Müller, University computing is to create computer appli- President Prof. Volker Linneweber, and cations that look real, act real and feel Intel’s CTO Justin Rattner. Intel VCI real. is located on the campus of Saarland University and is carried jointly by Intel, The Intel VCI will be led by Pro- Saarland University, the German Re- fessors Thorsten Herfet (Chair for Tele- search Centre for Artificial Intelligence communications) and Philipp Slusallek Signing the Intel VCI founding contract: Justin Rattner (CTO Intel), Prof. Peter Druschel (MPI-SWS), (DFKI), the Max Planck Institute for (Chair for Computer Graphics) at Saar- Prof. Wolfgang Wahlster (DFKI), University President Software Systems and the Max Planck land University. The MPG representative Prof. Volker Linneweber (University of Saarland), Institute for Informatics. Intel Corpora- on the Governance Board of the Intel Prof. Hans-Peter Seidel (MPI-INF) (from left to right) tion has invested $12 Mio to create the VCI is Hans-Peter Seidel. Christian new research center. Theobalt is a member of the Steering Committee. The Intel VCI is a part of Visual computing is the analysis, Intel Labs Europe and, according to enhancement and display of visual in- Intel, the largest research cooperation

...... formation to create life-like, real-time ...... by Intel in Europe. ::: ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite94 94

...... possible researchdirections. to discussandevaluateprogress overview oftheongoingresearchand for anextendedon-sitevisittoget Advisory Boardmeetseverytwoyears versity foraperiodoffiveyears.The by thePresidentofSaarlandUni- visory Boardthathasbeenappointed vised byaninternationalScientific Ad- Excellence isProf.Hans-Peter Seidel. Scientific coordinatoroftheCluster stitute directorsareamongthe13PIs. established withintheCluster. Four in- 100 researchersintotalhavebeennewly 20 JuniorResearchGroupswithalmost addresses thischallenge.Morethan of theGermanExcellenceInitiative, Foundation (DFG)withintheframework established bytheGermanResearch Multimodal ComputingandInteraction, and humanexpression. ted, particularlythroughvision,hearing, the wayitisperceivedandcommunica- speech, images,video,andgraphics, rent kindsofinformationsuchastext, The termmultimodaldescribesthediffe- ral andintuitivemultimodalinteraction. ate dependablesystemsthatallownatu- efficient andintelligentway, andtocre- this multimodalinformationinarobust, face, organize,understand,andsearch cal data.Thechallengeisnowtointer- ded toincludeaudio,video,andgraphi- content wastextual.Today, ithasexpan- tion Society. Ten yearsago,mostdigital racterized astheadventofInforma- work. Thisphenomenoniswidelycha- dramatic changesinthewayweliveand Successful ScientificAdvisoryBoardreview The ClusterofExcellenceonMultimodalComputingandInteraction NEWS The ClusterofExcellenceisad- The ClusterofExcellenceon The pastthreedecadeshavebrought

...... Cluster. Thisisworkingwellandstandsas both theworkandpeoplein increasingly deepmultidisciplinarityof and acrossareas,thegrowing noteworthy istheintegrationacrossgroups performed iscommendable.Especially The diversityanddepthofresearchbeing impressive progresstowardsthispurpose. multiple media.The Advisory Boardfinds Cluster istoperformexcellentresearchin Board stated: dual projects.” to achievemuchmorethanisolatedindivi- framework ofthiscluster. Thisallowsthem tists worktogetherverycloselywithinthe nature oftheresearcheffort: larly highlightedtheinterdisciplinary Thomas Gross(ETHZürich),particu- The ChairmanoftheCommittee,Prof. Excellence washeldinNovember2010. Advisory BoardreviewoftheCluster in November2008,thesecondregular Participants intheScientificAdvisoryBoardMeeting In itswrittenreport,the Advisory Following thekick-offworkshop “The primarypurposeofthe “The scien-

...... successfully findingacademicpositions.” of theresearchersinClusteralready ing humanpotentialisevidentinmany [...] Thesuccessofthismodelindevelop- a diversityoftopicsintheresearchgroups quality whilesimultaneouslymaintaining attracting youngresearchersofthehighest row. TheClusterhasbeensuccessfulin ment oftheleadingresearcherstomor- Research GroupLeadersforthedevelop- is theestablishmentofmodelJunior An impressiveachievementoftheCluster a modeltoothercenters[...] ::: ...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 95 ...... NEWS 95

Promotion of Talent Attracting more pupils to informatics

In 2010, the Max Planck Institute as well as at the “Apprentice and Student for Informatics, as a partner of the Com- Days” in Frankfurt, were organized by petence Center Computer Science Saar- KIS and supported by staff, demonstra- land (KIS), further extended its activities tions and informational material from for promoting young talent. The compe- the institute. tence center unites the skills of about 460 scientists in informatics and related Even closer is the cooperation at faculties of Saarland University and Saar- workshops and lectures, particularly in land University of Applied Sciences the framework of long-term programs, (HTW) with two Max Planck research since they can benefit greatly from the institutes and the German research center excellent technical and staff infrastruc- for artificial intelligence. One of the goals ture of the Max Planck Institute. The ...... of this unique group is to create an en- ...... following events were supported by the vironment where highly qualified young institute: scientists can further develop their skills. April 22, 2010: Girls’ Day 2010 Workshops for university freshmen and A variety of activities was offered high-school students, information con- [http://www.girls-day.de] for these events: In one workshop, for gresses and school cooperations are com- At Girls’ Day, female high-school example, the procedure, the opportunities monly used instruments to recognize and students meet professionals from areas and the limits of laser scanning were support young talents as early as possible. in engineering, science, as well as women explained and then tested in practice. A special focus here is the promotion in leading positions in business and poli- In another workshop, pupils could learn of girls, who, as experience in Germany tics. Girls get an insight into these often to build and program robots and test their has shown, despite sufficient qualifica- unknown areas and can make first contact new knowledge using an operational tions rarely choose a career in technical with professionals. Workshops, focusing robot. In addition, the basic concepts subjects such as informatics . on the girls’ own activities complete the of informatics were communicated program of this day. through exercises from the “CS Un- plugged Activities” program.

June 18, 2010: MINToring 2010 In 2010, the Max Planck Institute [http://www.sdw.org/schuelerakademie/mintoring/] for Informatics also participated in com- MINToring = MINT (mathema- piling and organizing new concepts for tics, informatics, science and engineer- teaching informatics to pupils and high- ing) + mentoring. The training program school students: In May, the institute “MINToring” is a joint initiative of the was visited by the youth project Klick- Foundation of German Business (sdw), Clic [http://www.klick-clic.eu], where child- the Federal Ministry of Education and ren were presented with informatics Research (BMBF) as well as regional problems in the form of games. In Sep- sponsors. tember and November, two events, lec- tures, experiments, and workshops have Booth of the Competence Center Computer Science taken place within a seminar cooperation Saarland August 4, 2010: UniCamp 2010 with a local high-school. This cooperation [http://www.uni-saarland.de/verwalt/beauftr/ has been planned for an entire school The collaboration between the KIS frauen/unicamp2010] year and aims to use the concept of the and the Max Planck Institute for Infor- In addition to the Germany wide seminar subject to bring participating matics‘ public relations department is organized Girls’ Day, the UniCamp has high-school students closer to scientific particularly close in this respect. The a similar purpose with a more regional work and the field of informatics. ::: center‘s participation at the “First Girl’s focus. It is organized by Saarland Uni- Technology Congress” in the Pirmasens versity and the Ministry of Education, Dynamikum museum, at the information Culture and Science of Saarland. The fair “Graduation – what’s next?” in the goal is to encourage female high-school Saarbrücken Congress Hall, particularly students age 13–15 to discover their ...... dedicated to local high-school students, ...... science/technical talents. bericht_2011_en_korr2 24.11.201116:40UhrSeite96 96

...... and RupakMajumdar(MPI-SWS)from logy (DST).KurtMehlhorn(MPI-INF) Indian MinistryforScienceandTechno- cation andResearch(BMBF)the Society, theFederal MinistryofEdu- the firstfiveyearsbyMaxPlanck and schoolings. Planck guestprofessor, andworkshops doctoral candidatesandpostdocs,aMax in upto10projects,theexchangeof This cooperationincludesjointresearch tween IndianandGermanscientists. in informaticsclosecooperationbe- support first-classfundamentalresearch at IITDelhi.Thegoalofthecentreisto Dr. HorstKöhleronFebruary 3,2010 was openedbythePresidentofGermany Center forComputerScience”(IMPECS) Prof. Dr. ReinholdAchatzandProf.Dr. Kurt Mehlhorn Cooperation intheformofdoctoralscholarships Cooperation withSiemensAG Fundamental researchincollaborationbetweenIndianandGermanresearchers Indo-German MaxPlanckCenterforComputerScience(IMPECS) NEWS Planck InstituteforInformatics, signed current ManagingDirectoroftheMax nologies andProf.Dr. KurtMehlhorn, of SiemensCorporateResearchandTech- IMPECS willbefinancedforatleast The Prof. Dr. Reinhold Achatz, Director “Indo-German MaxPlanck

...... of ajointworkshop. was heldin April 2010inDelhiform Planck visitingprofessor. A kick-offevent several monthsatIITDehliasaMax Agarwal fromDukeUniversityspent students onbothsides.Prof.Pankaj are nowinoperation. tute forSoftwareSystems.Eightgroups for InformaticsortheMaxPlanckInsti- must befromtheMaxPlanckInstitute side; fromtheGermanside,onepartner institute canparticipatefromtheIndian Indian researchgroups. Any university/ the projectleaders. (IIT Kanpur)fromtheIndianside,are Garg (IITDelhi)and the MaxPlanckSocietyside,andNaveen and Optimization.Furtherareas ofco- Technology FieldModeling,Simulation and ChristophMollfromSiemens AG, Department headedbyKurtMehlhorn between the Algorithm andComplexity back onalongtradition,particularly tute andSiemens AG. thecooperationbetweenourinsti- pen of doctoralscholarshipholdersanddee- tion ofjointresearchprojectsbymeans agreement willfacilitatetheimplementa- the institute’s CuratorshipBoard.The Hartmut Raffler, alongtimememberof about wasthecommitmentofProf.Dr. main factorsthatbroughtthisagreement brücken onMay10,2010.Oneofthe form ofdoctoralscholarshipsinSaar- an agreementforcollaborationinthe There isexchangeofpostdocsand IMPECS financesupto10German- This cooperationcanalreadylook Manindra Agrawal

...... a and, togetherwithMicrosoftResearch, “Geometric Computing” and aworkshop(November2010)on cember Computation” a addtsi 01 ::: candidates in2011. will alsoshareresearch AG, and HerwigSchreinerfromSiemens Max PlanckInstituteforInformatics by ChristophWeidenbach fromthe based configuration,eachrepresented of logicautomationandconstraint- Friedrich fromSiemens AG. Theareas ment area,representedbyHermann cooperates withtheKnowledgeManage- the MaxPlanckInstituteforInformatics, represented byGerhardWeikum from Munich. Theareaofinformationsystems, of ajointworkshoponJune21,2010in and optimizationweredefinedaspart operation beyondthetopicsofalgorithms Federal PresidentHorstKöhlerattheopeningceremony aglr Jnay21) ::: Bangalore (January2011). “School on Approximability” “School onParameterized andExact Furthermore, IMPECSco-financed We expectthefirstjointdoctoral 2010), aschool(October2010) at IMScChennai(De- at IITDelhi efforts. at IISc

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 97 ...... NEWS 97

Alumni Informatics Saarbrücken - hotbed for outstanding researchers

One of the goals of the Max Planck has created outstanding researchers professor at the Humboldt University Institute for Informatics is to promote for computer science talent is Prof. Dr. in Berlin. Susanne Albers has received junior researchers who will eventually Susanne Albers. Susanne Albers works numerous awards: In 1993, she was take up leading positions in academia or on the design and analysis of algorithms, awarded with the Max Planck Society's industry. Since our founding in 1990, in particular online- and approximation Otto Hahn medal for her outstanding 403 scientists have been doing research algorithms. She is also interested in doctorate; in 2008, she received the at our institute and then moving on to algorithmic game theory and algorithm Gottfried Wilhelm Leibniz Prize of the continue their professional development. engineering. After completing her studies German Research Foundations (DFG) These young scientists typically come to in mathematics, informatics and business for her contributions on efficient algo- us either after completing their diploma administration in her home town of Osna- rithms. With its 2.5 million Euros, the or master's degree or following their brück, Susanne Albers earned her docto- Leibniz prize is Germany’s most presti- doctorate. We have so far hosted 231 rate in the first DFG graduate college for gious scientific award. In 2010, Susanne doctoral students, of which 153 moved informatics in 1993 under the supervision Albers was accepted into the national to other research institutes after com- of Kurt Mehlhorn at Saarland University. academy of sciences, the Leopoldina. ::: pleting their PhD at Saarland University. She then worked at the Max Planck In- The remaining 78 took employment in stitute for Informatics until 1999 and industry. Of the 172 postdoc researchers, made several research stays to the US, 131 went on to other research institutes Japan and around Europe. after spending two to five years with us, and 41 went to industry. Since the time At the age of just 33, Susanne the institute was founded, a total of 84 Albers qualified as a professor in 1999 of its researchers have gone on to pro- and immediately received three offers fessorial positions. for professorships; she ultimately decided on the one in Dortmund. In 2001, she An excellent example for the fact moved to take the professorship for algo- that the institute, and thus also the over- rithms and complexity at the University

...... all Saarbrücken location for informatics, ...... of Freiburg. Since 2009, she has been a ......

Lise Meitner Award for Outstanding Female Computer Scientists Award winner Dr. Anke van Zuylen

Female computer scientists are rare. scholarship and a small research budget. Less than 20 percent of the freshmen The first winner is Dr. Anke van Zuylen. in informatics are women. At higher Anke van Zuylen received her doctorate career levels, the percentage of women in 2008 at Cornell University in the area is even lower. Therefore, the promotion of Operations Research. Afterwards, she of women in informatics is important. was a postdoc with Andy Yao at the Insti- Saarbrücken has already achieved that tute for Theoretical Computer Science the percentage of women over the in Beijing, China. In September 2010, different qualification levels (bachelor, she joined the Max Planck Institute for master, doctorate, professorship) is con- Informatics. stant. However, constant at a much too low level. She primarily works on the design Dr. Anke van Zuylen and analysis of algorithms for NP-hard In 2010, the institute introduced optimization problems, especially for net- costs, and other factors. Anke van Zuylen the Lise Meitner Award for the promotion work design. The aim is to design net- has developed approximation algorithms of outstanding female postdoctoral stu- works that fulfill specific requirements for a wide variety of network design

...... dents. The prize consists of a two-year ...... with respect to robustness, bandwidth, ...... problems. ::: ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite98 98

...... will notbenefitappropriatelyfromthe world, whereastheprovidersof on thebasisofthesedatainWestern cause researchwillthenbe relevant viralgenomesequences,be- their hesi these geographicregionshavemounted and Africa). Inrecentyears,countriesin as birdsandpigsintheThirdWorld the Western world,butinanimals tial areemerging,notinhumans many viralvariantswithahighriskpoten- recent yearsregardinginfluenza.Here currently being Virus” titis C (see of HBVResistance” A databaseandcommunicationplatformforinfluenzaresearch GISAID –GlobalInitiativeonSharingAllInfluenzaData NEWS page 41).For and lysis ofHIVDrugResistance” taking placeregardingHIV(see For abouttenyears,thisprocesshasbeen which areascomprehensivepossible. cient volumesofsuchdataindatabases, the pathogenandcollectionofsuffi- measurement ofthegenomicchanges The basisforthisresearchisthecontinued continued evolutionofthepathogens. based onresearchholdingpacewiththe as thedevelopmentofeffectivedrugs,is infections, e.g.,viavaccination,aswell renews itself.Preventivecareagainst threat posedbypathogenscontinuously well asdrugsappliedfortherapy. So,the evading thehumanimmunesystemas constantly therebyfindingnewwaysof great threattomankind.Pathogens change “Prediction ofHIVCoreceptorUsage” A specialproblemhasoccurredin Infectious diseasescontinuetobea , page39)suchdatabasesare tation toprovidedataonthe “Fighting theHepatitisC Hepatitis B(see initiated. , page76)andHepa- performed , page77, “Analysis and in the data “Ana- such (Asia , ...... their data. value ofthe have secured the data.Thus,providersofdata scientists tofairandsustainableuseof Agreement, thesigningofwhichobliges ble throughanewtypeofData Access tries. Thesedatacouldbemadeaccessi- from thedevelopingandemergingcoun- the world,includingimportantsequences collection ofinfluenzavirussequencesin database, whichcontainsthelargest the initiativefreelyoffersEpiFlu™ On theinternetpage,www.gisaid.org, biological aswellmethodicaldisciplines. from allregionsoftheworldand was formedin2006.Itjoins Initiative onSharing All Influ international initiativeGISAID bird fluvirusin evident withthe of influenzadatabecameparticularly results. Thisbottleneckintheavailability Influenza virus Max Planck Institute for Infection Biology, Berlin) Biology, Infection for Institute Planck Max In responsetothissituation,the The MaxPlanckInstitutefor (Image: Volker Brinkmann, Brinkmann, Volker (Image: results ofresearchbasedon their shareintheadded 2005 inSoutheast Asia. emergence oftheH5N1 enza Data) scientists (Global

...... eerh ::: research. tial resourceforinternationalinfluenza GISAID promisestobecomeanessen- agreement withtheGISAIDFoundation. of theyear2011,basedonacooperation of theGISAIDportalasbeginning Protection hasbecometheofficialhost Nutrition, Agriculture andConsumer influenza. ment oftheseasonalvaccineagainst selection ofviralstrainsforthedevelop- in ordertoperformthesemiannual by allWHO of 2008,thisdatabasehasalsobeenused the EpiFlu™database.Sincemiddle the GISAIDportal,andespeciallyfor Informatics The GermanFederal Ministryof develops thesoftwarefor collaborating laboratories

...... bericht_2011_en_korr2 25.11.2011 11:00 Uhr Seite 99 ...... 99

Promoting Young Computer Science Talent in Germany: BWINF Supporting the German computer science high-school competition

Since April 2010, the Max Planck Institute for Informatics, in conjunction with the Fraunhofer Group for IUK Technology and the German Computer Science Association GI, has been the fourth sponsor of the BWINF initiative. The program is given basic funding by the German Federal Ministry for Edu- cation and Research and includes the Informatics Beaver [http://www.informatik- biber.de/], the Federal Contest for In- formatics [http:// www.bundeswettbewerb- informatik.de/], the training for the Inter- national Olympiad in Informatics [http:// www.informatik-olympiade.de/], and the Infor- matics Entry Portal [http:// www.einstieg- Federal Competition for Informatics and participants assist in the workshops informatik.de/]. The idea is to attract young the training of the Olympiad team. Every that prepare the students for the federal people to informatics right after elemen- year, we invite the successful participants competition and the Olympiad. In a tary school. The first encounter is then the from the second competition round of second step, four participants from this Informatics Beaver Competition. In the the federal contest to a research day in group are chosen for the Olympiad team. Federal Informatics Competition, further Saarbrücken. The motto is “teach and These students are then given final trai- talents are identified and promoted, and challenge”. The program consists of work ning. In 2010, this training was carried the best of them are accepted and trained in small groups on current informatics out by the Max Planck Institute for on the Olympiad team for Informatics. questions, lectures, discussions and a Informatics and the computer science These activities are accompanied by a framework program where participants department, by professors Markus portal that offers in-depth support. can get to know one another. Bläser, Sebastian Hack, and Christoph Weidenbach in Saarbrücken. And with The Max Planck Institute for There is a development program great success! Two of the four students Informatics is particularly committed, for the best students from each year that won gold medals at the 2010 Olympiad together with the computer science de- provides the first step towards participa- in Waterloo, Canada. ::: ...... partment of Saarland University, to the ...... tion in European competitions; former ......

Open House STEP 2010 „Informatics Saarland“ Introduction phase for students

At the open house of Saarland One week before the beginning University, the Max Planck Institute for of the lecture period of the computer Informatics, along with the Max Planck science winter term at Saarland Uni- Institute for Software Systems, the Ger- versity, the student council of the com- man Research Center for Artificial In- puter science department organized a telligence (DFKI), the computer science student introduction phase (called department of Saarland University, and “STEP”) [http://cs.fs.uni-saarland.de/step]. the student council presented their In three days, about 130 freshmen from research projects and results to inter- the departments of computer science, ested visitors. Scientists and students bioinformatics, and computing and digi- gave lectures and demonstrations of tal media were given all the information current scientific achievements from they need for a successful start at the running projects, providing insight university. The Max Planck Institute for into the current state of informatics Informatics supported the event with

...... research. ::: ...... informative literature and talks. ::: ...... bericht_2011_en_korr2 24.11.201116:40UhrSeite100 100 n 1,000Euro 10,000 11,000 The InstituteinFigures THE INSTITUTEINFIGURES 1,000 3,000 5,000 2,000 4,000 6,000 7,000 8,000 9,000 Budget withoutthird-partyfundsfrom2007to2010 Staff 230 Total 117 Current asof1/1/2011 77 25 11 2007 Postdoctoral staff and Masterstudents including Bachelor Scientific staffnot Non-scientific staff Guests

...... Personnel expenses 0820 2010 2009 2008 staff members Ratio ofscientifictonon-scientific Material expenses Current asof1/1/2011 10.9% 89.1% Promotion ofyoungscientists Non-scientific staff Scientific staff Investments bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 101 ...... 101

Third-party funding from 2007 to 2010 Numbers and distribution

Other Other Other Other Machine Learning EU EU EU EU Machine Learning Ind. partnergroup Machine Learning Machine Learning Ind. partnergroup TN McHardy TN McHardy Ind. partnergroup Ind. partnergroup IMPRS IMPRS TN McHardy TN McHardy Industry IMPRS Industry IMPRS Industry Industry DFG DFG DFG DFG Federal government Federal government Federal government Federal government

2007 45 projects 2008 49 projects 2009 58 projects 2010 53 projects

Third-party funding from 2007 to 2010 Revenues

5,000

4,500

4,000

3,500

3,000

2,500

2,000

1,500

1,000

500

in 1,000 Euro 2007 2008 2009 2010

CONTACT

Volker Maria Geiß Joint Administration Phone +49 681 9325-5700 bericht_2011_en_korr2 24.11.201116:40UhrSeite102 102

...... this research. and reliabilityoftheequipmentaswellitsusabilitymakeadecisivecontributionto basis foraninstitutewiththeaimofproducingfirst-classresearch.Theflexibility, quality Unhindered globalcooperationandcommunicationinamotivatingenvironmentformthe Technology Information Servicesand INFORMATION SERVICES&TECHNOLOGY

......

...... bericht_2011_en_korr2 24.11.2011 16:40 Uhr Seite 103 ...... 103

This aim can be transferred to our Reliability of the installations zones (DMZs), which are differentiated IT infrastructure: We operate a versatile Regular updates and upgrades with according to their meaning and their system that can adapt to rapidly evolving software and hardware for all supported hazard potential. requirements and provide the user with platforms pose high demands on the re- consistency and reliability. It does not liability of the installation and admini- The network is structured in a way neglect security despite the openness stration. that promotes the integration of guest necessary to support international co- scientists and students; they can connect ...... operation...... notebooks they have brought with them Cooperation and communication by wire or via WiFi, all without additional Our network is divided into various software installations. The network infra- Multiplicity of tools areas according to organizational and structure automatically treats them as We deploy systems from various security-relevant points. The end-user external machines. manufacturers, depending on the pro- equipment in network sections in indivi- jects’ requirements. Linux and Windows dual areas is powered via its floor switch International cooperation demands dominate in both notebook and work- with Gigabit Ethernet. This equipment external access possibilities to internal station use. MacOS is gaining favor. is connected in a fail-safe manner with resources in the infrastructure (Intranet). Most servers run on Linux. If the appli- 10 GB Ethernet at the central station Here, we offer, among other things, cations or services require it, however, (two redundant backbone switches). From secure login and access to e-mail and Windows or Solaris are used. We use there, the central servers are supplied other important databases and services. Solaris in file servers due to its stability with this bandwidth as their services Cooperation in software development and other performance characteristics, require it. Server farms and computer is supported by protected access to a for example. clusters are also connected over redun- revision control database (software re- dant switches in a fail-safe manner. pository: Subversion). Dynamics and innovation The external area includes the Research on the front line also necessary 1GB and 10GB connections means constantly using innovative tech- to different facilities on the campus of nologies. Sometimes, we even have to Saarland University in Saarbrücken and deploy and support prototypes. on the university campus in Kaiserslautern. Internet connectivity is realized with a 2.6 GB XWIN DFN association connec- Extensively uniform user experience tion that is used jointly with the univer- The research projects are often sity. Our external connections are also cross-platform, as they are either inten- delivered in pairs, as a rule, in order to ded for a heterogeneous composite, or implement the required fail-safe relia- the benefits of various computer and bility. operating system architectures must be used. Externally accessible, but anony- mous services (DNS (Internet address The typical workstations are com- book), WWW, FTP (data transfer), SMTP pletely exchangeable, because the user (e-mail) etc.) are combined at the insti- data and most important software packa- tute’s firewall in several demilitarized ges are available independently of equip- ment and platforms. This homogeneity makes it easier to work with the entire system and promotes its acceptance...... bericht_2011_en_korr2 24.11.201116:40UhrSeite104 104 INFORMATION SERVICES&TECHNOLOGY tions basedondiskimages. and heterogeneousforsimpleinstalla- for thesystemsinvolved. with therequiredreliabilityandprecision work withinareasonableamountoftime can allowustoresolvethenecessary to workwithindividualneeds.Onlythis automated processesthataredesigned Instead, weuseextensive,advanced, approach whensupportingeachdevice. software componentsforbidsanad-hoc tasks. geneity andreliability, areourcentral security concept,whileretaininghomo- demands oftheresearchprojectsand structure andadjustingtothenewest Automation asguarantee ously updatedlocalvirusscanners. current softwarestatusesandcontinu- systems, mustbefoughtagainstthrough net orfaultsinexternallyactingsoftware to virus-infectedcomputersintheIntra- Indirect hazards,suchastheconnection access, orvirusscannersinthemailserver. the firewall,encryptionforexternal ted throughthestructureofnetwork, follows therequirements. more thanacompromisethatflexibly rity guidelinescanthereforebenothing usability totoogreatadegree.Thesecu- possible inopensystems.Theylimit against sabotageandespionagearenot Security andprotection Our environmentistoodynamic The merevarietyofhardwareand Continually developingtheinfra- Some directhazardscanbedeflec- Fixed mechanismswhichprotect

...... is onlyusedfor of theinstallationsystem.However, it ments) quicklythroughspecificextensions requirements (suchashardwarerequire- so itcanbecustomizedtosuit disadvantages. ready, ses andtimeuntillargerchangesare tages foroperatingsecurity, totalexpen- slows work costseffortand,insomecases, infrastructure. Thisimplementation and usedveryquicklyacrossthewhole achieved canberepeatedasrequired and Windows. Resultsthathavebeen operating systemversionsofDebian administration systemforthecurrent we haveimplementedaninstallationand (netinstall). with theassistanceofadditionalsoftware have introducedsuchapackagesystem packages. For Windows platforms,we sider dependenciesbetweentheinstalled based onpackagesystemsthatalsocon- the Linuxenvironment,isbydefault nisms. Solaris,butaboveallDebianin with package-orientedinstallationmecha- uniform accesstoalargemainmemory. tions whichrequirehighparallelism and used forscientificresearchand applica- of mainmemory. Thesemachinesare processors andhaveuptooneterabyte that workwithupto64closelycoupled Compute Service associated with is notpracticalduetothehighexpense gration ofshort-livedspecialinstallations The systemisflexiblystructured, Following theexampleofSolaris, We thereforefavoroperatingsystems We operateseverallargersystems down reactiontimes.Theadvan- however, clearlyoutweighthese its automation. repetitive tasks.Theinte- changing

...... resources acrossallthewaitingjobs. appropriate andfairdistributionof of specificprocesseshelpswiththe available hardware.Theprioritization can distributethemoptimallyonthe for thecluster, sothattheSGEsoftware ing torequirementsbeforetheirrelease this system.Jobsarecategorizedaccord- the above-mentionedlargersystemsinto stallations allowsustoflexiblyintegrate homogeneity oftheoperatingsystemin- utilization onthewholesystem.The dual computersallowsustoreachhigh tion ofprocessesonthecluster’s indivi- Engine (SGE).Theautomaticdistribu- decessor, operatedundertheSunGrid operation inmid-2010.Itis,likeitspre- main memoryapiece,wasbroughtinto 128 systemswith8coresand48GB Our largestclustertodate,with

...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 105 ...... 105

File service and data security source system BackupPC). The numbers Special systems The data of the institute is made for this online data security are impressive. For special research tasks, especially available over the current 20 file servers Because no data in this system is actually in the area of computer graphics, diverse via NFS and CIFS (SMB). The now held with unnecessary redundancy, about special systems are required. Available roughly 147 TB of central data is dis- 615 TB of gross data can be reduced to systems include a video editing system, tributed across 52 RAID systems with about 36 TB in the different backup runs. several 3D scanners, multivideo record- 2843 disks, which are connected to a In order to combine the benefits of disk ing systems and 3D projection systems.

...... redundant Storage Area Network (SAN)...... and tape technologies, we will combine ...... External communication and collabora- All RAID systems are operated in a paired this system with the tape robot in the tion is supported through the operation mirroring architecture, which is also pro- future. of several videoconference installations. tected against the failure of an entire RAID system. This mirroring by the file The tape robot currently has the system software utilizes test sums to ability to access 700 tapes with a total Operational reliability protect against creeping data changes capacity of 700 TB (uncompressed). A monitoring system based on (ZFS). Any detected data errors on a The robot is located in a specially-pre- Nagios (open source) provides information mirrored half are automatically repaired pared room to maximize the protection on critical states of the server systems during running operation, whereby the of its high-value data. In the next con- and the network, and on malfunctions false data is overwritten with the correct struction phase, the system will be ex- of complex processes via e-mail and text data. The two mirrored halves are ac- tended to a second robot that can then messaging. commodated in at least two different fire operate at a remote site to protect the compartments, so that a fire would have data against disaster situations. This is at Power supply and cooling are de- little chance of causing data loss, even least currently achieved offline through signed in a way that server operation can at the deepest level. Three file servers the storage of copies in a special fire-safe be maintained, even during a power apiece share a view of the disk status data safe or at a second site. outage. As the last link in the chain, a and can substitute for one another within generator with a power output of roughly a very short period of time using virtual one megawatt supplies the core infra- network addresses and SAN technology. structure and the cooling systems with This also allows for server updates with- sufficient power to guarantee uninter- out any perceivable interruptions to rupted operation. Our new computer service. room, which was realized as part of the renovation and enlarging of infrastructure, Our data security is based on two is now operating. It hosts, among other different systems: a conventional tape things, the aforementioned redundant backup, which secures the data directly systems, which could thus finally be in- using a Tivoli Storage Manager TSM on stalled in two different locations. a tape robot (StorageTek), and an online disk backup system that minimizes space requirements by using data comparison Responsibilities and always keeps the data online (open Procurement, installation, admini- stration, operation, applications support and continuation of the described systems and techniques are the duty of IST (In- formation Services and Technology), the joint IT department for the Institutes for Software Systems and Informatics...... bericht_2011_en_korr2 24.11.201116:41UhrSeite106 106 [http://www.cs.miami.edu/~tptp/CASC/22/] “CADE ATP SystemCompetition,” international cooperationssuchasthe different scenarios,sometimesevenin for manyscientificprojectsinvery and computingclustersaredeployed Flexible supportofscientificprojects (MMCI)”. “Multimodal ComputingandInteraction library andtheClusterofExcellence also responsiblefortheentirecampus and institutesoftheUniversity, ISTis tutes incooperationwithdepartments As aresultoftheworktwoinsti- INFORMATION SERVICES&TECHNOLOGY library andMMCI. administrative taskstosupportthejoint also one Institute forSoftwareSystems.Thereis scientific staffmembersfromMaxPlanck technician. ISTissupplementedbyfive six scientificstaffmembersandone Planck InstituteforInformaticsprovides procurement (twopositions),theMax Staff structure IST onlyprovides self-administered systems,however, the ISTportfolio.For problemswith intersection ofprojectrequirementsand tasks isinmostcasesdictatedbythe applications support.Thedivisionof server hostinginindividualcasesto tailored supportthatextendsfrompure requirements, ISToffersspecifically In ordertomeetthevariousresulting The describedservices,servers In additiontomanagementand employee apieceforadditional consultation. .

...... therefore activeatmultiplesites. The membersofthefrontgroupsare cover thespecificneedsofinstitutes. together. Thefrontgroupsaccordingly for bothinstitutesorareevenoperated responsible forservicesthatareidentical simplified fashion,thecoregroupis several frontgroups.Presentedina IST isdividedintoonecoreand

...... nenthttp://www.mpi-inf.mpg.de/services/ist/ Internet +496819325-5800 Phone IT Department Jörg Herrmann CONTACT FQ,adablei or.::: (FAQ), andabulletinboard. of interestingquestionsandanswers systems, includinginteralia,asum this groupalsomaintainsinformation concerning theuseofinfrastructure, hours. Inadditiontoprocessingquestions interface, orinpersonduringbusiness reachable eitherbye-mailoronaweb munications interfaceforusers,andare one-year studentapprentices,thecom- helpdesk. Theyform,togetherwithtwo a 10-memberteamofstudentsfortheir The frontgroupsaresupportedby mary

...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 107 ...... 107 bericht_2011_en_korr2 24.11.201116:41UhrSeite108 108

...... Selected Cooperations COOPERATIONS ::: GISAID Foundation, Friedrich Loeffler Institute, ::: ::: University of Adelaide, UniversityCollegeLondon, ::: ::: Toyota, Technical UniversityofDarmstadt, ::: : Federal Officefor Agriculture and ::: EuropeanBioinformaticsInstitute, ::: CSIROLivestock Industries, ::: ::: Christian-Albrechts-University, CentrumWiskunde &Informatica, ::: ::: BioSolveIT GmbH, SwissFederal Instituteof ::: ::: Stanford University, ::: Microsoft Research, LeibnizUniversityHanover, ::: IntelVisual ComputingInstitute, ::: ::: Intel Corporation, IndianInstituteofTechnology Delhi, ::: CaliforniaInstituteofTechnology, ::: ::: Bosch, Food, Amsterdam, Netherlands Washington DC,USA Insel Riems,Germany London, UK Darmstadt, Germany Hinxton, UK St. Lucia,Queensland, Australia Kiel, Germany Sankt Augustin,Germany Australia Technology,, Germany Hanover, Saarbrücken, Germany Delhi, India Pasadena, USA IMAGES &VIDEOS UNDERSTANDING BIOINFORMATICS Bonn, Germany Hildesheim, Germany Brussels, Belgium uih Switzerland Zurich, Santa Clara,USA Stanford, USA Cambridge Adelaide, , UK

...... : HarvardUniversityandBroad ::: : UniversityofNijmegen, ::: UniversityofCologne, ::: UniversityofHeidelberg, ::: UniversityofFreiburg, ::: UniversityofDüsseldorf, ::: UniversityofCalifornia, SantaCruz, ::: UniversityofBritish Columbia, ::: UniversityofBarcelona, ::: Technical UniversityDortmund, ::: ::: Ruprecht-Karls-University, ::: Roche Diagnostics, ::: Stanford University, ::: Saarland University, MaxPlanckInstituteforNeuro- ::: MaxPlanckInstituteforMolecular ::: LaboratoryDr. Thiele, ::: ::: Karolinska Institute, ::: Informa SRL, ::: IBM Haifa, : GoetheUniversityFrankfurt am ::: GladstoneInstituteofCardiovascula ::: : MaxPlanckInstituteforBiophysical ::: Germany California, USA Vancouver, Canada Germany Sweden Main, Disease, Chemistry, Germany Institute, Nijmegen, Netherlands Heidelberg, Germany Freiburg, Germany Düsseldorf, Germany Barcelona, Spain Dortmund, Germany Heidelberg, Germany Penzberg, Germany logical Research, Physiology, Frankfurt, Germany San Francisco, USA Cambridge, USA Göttingen, Germany Dortmund, Germany Haifa, Israel Rome, Italy Cologne, Germany tnod USA Stanford, Saarbrücken, Stockholm, Kaiserslautern, Cologne,

...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 109 ...... 109

::: University of Padua, Padua, Italy ::: Tel Aviv University, Tel Aviv, Israel ::: Hebrew University, Jerusalem, Israel ::: University of Salzburg, ::: Università degli Studi di Udine, ::: Indian Institute of Technology Salzburg, Austria Udine, Italy Bombay, Bombay, India ::: University of Siena, Siena, Italy ::: University of Freiburg, ::: International Computer Science Freiburg, Germany Institute, Berkeley, USA ::: University Tor Vergata, Rome, Italy ::: University of Haifa, Haifa, Israel ::: Japan Advanced Institute of Science ::: Wellcome Trust Sanger Institute, and Technology, Ishikawa, Japan

...... Hinxton, UK ...... ::: University of Liverpool, ...... Liverpool, UK ::: Jülich Supercomputing Center, ::: WHO Collaborating Centers for Jülich, Germany Reference & Research on Influenza, ::: University of Milan, Milan, Italy Melbourne, Australia, Peking, China, ::: Karlsruher Institute for Technology Tokyo, Japan, Atlanta, USA, London, ::: University of Oldenburg, (KIT), Karlsruhe, Germany UK Oldenburg, Germany ::: Magyar Tudomanyos Akademia ::: University of Twente, Szamitastechnikai Es Automatizalasi Twente, Netherlands Kutato Intezet, Hungary GUARANTEES ::: Microsoft Research, Cambridge, UK ::: Centrum Wiskunde & Informatica, Amsterdam, Netherlands INFORMATION SEARCH ::: Microsoft Research, Redmond, USA & DIGITAL KNOWLEDGE ::: Efi Arazi School of Computer ::: National Library of the Czech Science and Engineering, ::: Centrum Wiskunde & Informatica, Republic, Prague, Czech Republic Herzliya, Israel Amsterdam, Netherlands ::: New York University, New York, USA ::: Enterpreneurial University of ::: Computer and Automation Research Munich, Munich, Germany Institute, Hungarian Academy of ::: Polish Academy of Science, Sciences, Budapest, Ungarn Warsaw, Poland ::: Indian Institute of Science Bangalore, Bangalore, India ::: Consorzio Nazional Interuniversitario ::: Rutgers Center for Operations per le Telecomunicazioni, Parma, Research, New Jersey, USA ::: Indian Institute of Technology Delhi, Italy Delhi, India ::: RWTH Aachen, Aachen, Germany ::: Documentation Research Training ::: International Computer Science Center, Indian Statistical Institute, ::: School of IT - The University of Institute, Berkeley, USA Bangalore, India Sydney, Sydney, Australia ::: Japan Advanced Institute of Science ::: Ecole Polytechnique Fédérale de ::: Siemens AG, Munich, Germany and Technology, Ishikawa, Japan Lausanne, Lausanne, Switzerland ::: Simon Fraser University, Burnaby, ::: Laboratoire Lorrain de Recherche ::: Emory University Atlanta, Canada en Informatique et ses Applications, Atlanta, USA Nancy, France ::: SORA Institute for Social Research ::: European Archive Foundation, and Analysis Ogris & Hofinger ::: Max Planck Institute for Software Amsterdam, Netherlands GmbH, Vienna, Austria Systems, Saarbrücken, Germany ::: Federal Library of Niedersachsen, ::: Stanford University, Stanford, USA ::: National Information and Göttingen, Germany Communications Technology ::: Stichting European Archive Australia, Canberra, Australia ::: German-Israeli Foundation for Amsterdam, Netherlands Scientific Research and ::: Politecnico di Milan, Milan, Italy Development, Israel ::: Technical University of Dresden, Dresden, Germany ::: Purdue University, ::: Google, Zurich, Switzerland West Lafayette, USA ::: Tel Aviv University, Tel Aviv, Israel ::: Leibniz University Hanover, ::: Rutgers Center for Operations Hanover, Germany ::: T-Systems Solutions for Research Research, New Jersey, USA GmbH, Germany ::: Hanzo Archives Limited, London, ::: School of IT - The University of UK ::: Università degli Studi di Pavia, Sydney, Sydney, Australia Pavia, Italy ...... bericht_2011_en_korr2 24.11.201116:41UhrSeite110 110

...... ::: Yahoo! Research, UniversityofWürzburg, ::: UniversityofTwente, ::: UniversityofTübingen, ::: UniversityofTrento, ::: UniversityofSouthampton, ::: UniversityofRome„LaSapienza“, ::: : INRIASaclayÎle-de-France, ::: IndianInstituteofTechnology ::: IndianInstituteofTechnology Dehli, ::: FutureSOC LaboftheHasso- ::: ::: Cornell University, ::: Aberystwyth University, UniversityofBonn, ::: Technical UniversityofDarmstadt, ::: SaarlandUniversityofMusic, ::: SaarlandUniversity, Excellence ::: COOPERATIONS ::: Southampton, UK Rome, Italy Kanpur, Plassner-Institute, University Barcelona, Spain Würzburg, Germany Netherlands Twente, Tübingen, Germany Greece Patras, Orsay Dehli, India Aberystwyth, UK Bonn, Germany Darmstadt, Germany Saarbrücken, Germany Interaction, Cluster MultimodalComputingand OPTIMIZATION INFORMATION MULTIMODAL , France Kanpur, India of Patras, Saarbrücken, Germany Potsdam, Germany Ithaca, USA Trento, Italy

...... : UniversityofIowa, ::: UniversityofFreiburg, ::: UniversityofBirmingham, ::: ::: University of Adelaide, UniversitàdiRoma„LaSapienza“, ::: UniversidaddeChile, ::: TheCommonwealthScientificand ::: Technische UniversitätWien, ::: Technical UniversityofDortmund, ::: Technical UniversityofDenmark, ::: Technical UniversityofBerlin, ::: SwissFederal InstituteofTechnology ::: ::: PROSTEP AG, Darmstadt ::: New York University, MITComputerScienceand ::: ::: Maastricht University, JosipJurajStrossmayer, University ::: : Warwick MathematicsInstitute, ::: ::: Vrije Universiteit Amsterdam, UniversityofTokyo, ::: UniversityofKiel, ::: Rome, Italy Melbourne, Australia Industrial ResearchOrganisation, Dortmund, Germany Copenhagen, Denmark Darmstadt, Germany of Osijek, Warwick, UK Amsterdam, Netherlands Freiburg, Germany Birmingham, UK Adelaide, Australia Santiago, Chile Vienna, Austria Berlin, Germany Zurich, New York,USA Cambridge, USA Artificial IntelligenceLab, Maastricht, Netherlands Zurich, Switzerland Osijek, Croatia Kiel, Germany Iowa Tokyo, Japan , USA

...... ::: UniversityofHanover, ::: UniversityofCalifornia, ::: UniversityofBonn, ::: Technical UniversityofDortmund, ::: Los Alamos NationalLaboratory, ::: ::: Linköping University, ::: Fraunhofer Institutes, : UniversityofBritishColumbia, ::: UniversityofBristol, ::: UniversityCollegeLondon, ::: ::: Telecom ParisTech, Technical UniversityofDenmark, ::: ::: Stanford University, SaarlandUniversity, Excellence ::: InternationalDigitalLaboratory, ::: IndianInstituteofTechnology Delhi, ::: SOFTWARE VISUALIZATION Germany University ofSiegen, Hanover, Germany Dortmund, Germany USA Los Alamos, Sweden Germany Vancouver, Canada London, UK Copenhagen, Denmark Interaction, Cluster MultimodalComputingand UK Warwick, University ofWarwick, Delhi, India Saarbrücken, Germany ai,France Paris, Bonn, Germany Stanford, USA Bristol Siegen, Linköping Wachtberg, Davis, USA , UK , bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 111 ...... 111 bericht_2011_en_korr2 24.11.201116:41UhrSeite112 112 PUBLICATIONS Selected Publications

...... roch forsublatticesoftherootlatticea [5] tics biased featureimportancemeasure. T. Lengauer. [4] Therapy troviral therapyfromHIVgenotype. models ininferringvirologicalresponsetoantire- of predictedphenotypesandstatisticallearning H. Van VlijmenandT. Lengauer. F. Incardona,M.Zazzi,L.Bacheler, Fessel, R. Kaiser, Y. Peres, A. Sönnerborg,W. J. Borght, S.-Y. Rhee,R.W. Shafer,E.Schülter, Winters, E.Van Craenenbroeck,K.Van der [3] 2009. 1006, April The Journal ofInfectiousDiseases geno2pheno-THEO onalargeclinicaldatabase. retroviral therapy:Retrospectivevalidationof Predicting theresponsetocombinationanti- Shafer, M.Zazzi,R.KaiserandT. Lengauer. Rhee, A. Sönnerborg,W. J.Fessel, R.W. winkel, Y. Peres,E.Schülter, J.Büch,S.-Y. [2] ber 2009,LNAI5749,pp.84–99.Springer. Symposium, FroCoS 2009, Frontiers ofCombiningSystems,7thInternational InS.GhilardiandR.Sebastiani,eds., sup(la). bach [1] Weikum. [7] 630. IEEE. Pt. 1,SanFrancisco, USA,June2010,pp.623– puter Vision andPattern Recognition (CVPR) detection. In Monocular 3Dposeestimationandtrackingby [6] R124, 50,September2010. tronic Journal ofCombinatorics 7(4):27:1–27: 15, Transactions ACM on Applied Perception and H.-P. Seidel. [8] J. Savoy, eds., and-Maillet, H.-H.Chen,E.N.Efthimiadisand cation ofpoliticaltext.InF. Crestani,S.March- , 26(10):1340–1347,2010. . Superpositionmodulolineararithmetic O. Amini andM.Manjunath. A. Altmann, L.Tolosi, O.Sander and A. Altmann, T. Sing,H.Vermeiren, B. A. Altmann, M.Däumer,N.Beeren- E. Althaus, E.KruglovandC.Weiden- R. Awadallah, M.RamanathandG. M. Andriluka, S.Rothand B.Schiele. T. O. Aydin, M.Cadik,K.Myszkowski , 14(2):273–283,2009. Language-model-based pro/con 2010 IEEEConferenceonCom- SIGIR, 2010 Permutation importance:anun- July 2010. Visually significantedges. Trento, Italy, , pp.747–748. ACM. , 17(1):R124,1– , 199(7):999– Advantages Bioinforma- Antiviral Riemann- , n . Septem- Elec- classifi- . –

...... the Twentieththe Symposium AnnualACM-SIAM not unique.InC.Mathieu,ed., C.-C. Huang. [16] pp. 13–25.Springer. (ECIR 2010) European ConferenceonInformationRetrieval and U.Kruschwitz,eds., for temporalinformationneeds.InC.Gurrin and G.Weikum. [15] 4(1):45–66, 2010. infrastructure. parametric surfacesI:Generalframeworkand K. MehlhornandR.Wein. [14] tics. Interesting-phrase miningforad-hoctextanaly- J. Dittrich,N.MamoulisandG.Weikum. [13] 2009, LNAI5663,pp.17–34.Springer. Automated Deduction CADE-22, 22ndInternationalConferenceon In R. A. Schmidt,ed., Superposition andmodelevolutioncombined. [12] Orlando, USA,2009,pp.59–66. ACM. SIGEVO Foundations ofGenetic Algorithms X FOGA'09, revisedselectedpapersfrom ACM I. Garibay, R.Wiegand and A. S.Wu, eds., single-objective fitnessfunctions.InT. Jansen, Computing singlesourceshortestpathsusing T. Friedrich, P. P. KururandF. Neumann. [11] LNCS 6347,pp.255–266.Springer. Symposium. –Pt.I Algorithms –ESA2010,18th Annual European graphs. InM.deBergandU.Meyer, eds., D. Rawitz. [10] 218–229. Springer. Pt. II ESA 2010,18th Annual EuropeanSymposium.– In M.deBergandU.Meyer, eds., for stochasticmatchings(extendedabstract). cure foryourmatchingwoes:Improvedbounds V. Nagarajanand A. Rudra. [9] PVLDB , Liverpool,UK,2010,LNCS6347,pp. U. Bhaskar,L.Fleischer,D.Hoyand K. Berberich,S.Bedathur, O. Alonso E. Berberich,Fogel, D.Halperin, S. J.Bedathur, K.Berberich, P. Baumgartner andU.Waldmann. B.Doerr, S.Biswas, S. Baswana, R. Bar-Yehuda, D.Hermelinand N. Bansal, A. Gupta,J.Li,Mestre, Minimum vertexcoverinrectangle , 3(1):1348–1357,2010. , MiltonKeynes,UK,2010, Equilibria ofatomicowgamesare Mathematics inComputerScience A languagemodelingapproach , Liverpool,England,2010, , Montreal,Canada, August Automated Deduction– Proceedings ofthe32nd Arrangements on When LPisthe Proceedings of Algorithms – , , ...... bericht_2011_en_korr2 24.11.201116:41UhrSeite113

...... [20] Bioinformatics assessing molecularinteractiondata. Albrecht. Schelhorn, J.Büch,T. LengauerandM. A. M.Jenkinson,F. Ramírez,D.Emig,S.-E. [19] ISWC Wearable Computers2010,the14th Annual Laerhoven, eds., orientation estimation.InH.-J. Yoo andK.Van human motioncapturingusinggyroscopeless [18] 2010, pp.1–8.IEEE. 2010, the14th Annual ISWC national SymposiumonWearable Computers In H.-J. Yoo andK.Van Laerhoven,eds., composite activitiesbasedonactivityspotting. and transferwhatyouhavelearned–recognizing [17] U.S.A., January2009,pp.748–757. ACM. on Discrete Algorithms (SODA 2009) ment. U. Meyer, eds., processor tasksystems.InM.deBerg and Feasibility analysisofsporadicreal-timemulti- [25] Austin (TX),USA,2010,pp.1350–1359. SIAM. Symposium onDiscrete Algorithms (SODA 2010) In M.Charikar, ed., complexity forperiodicreal-timescheduling. Spaccamela andN.Megow [24] on Graphics(Proc.Siggraph) inverse proceduralmodeling. A connectionbetweenpartialsymmetryand [23] Bioinformatics ting outcomesofhivcombinationtherapies. T. Lengauer. [22] 628:275–296, 2010. EpiGRAPH andGalaxy. Web-based analysisof(epi-)genomedatausing J. Taylor, A. NekrutenkoandT. Lengauer. [21] , Seoul,Korea,2010,pp.1–2.IEEE. C. Bock. H. Blankenburg,R.D.Finn, A. Prli´ U. BlankeandB.Schiele. U. BlankeandB.Schiele. V. Bonifaci and A. Marchetti-Spaccamela. V. Bonifaci, H.-L.Chan, A. Marchetti- M. Bokeloh,Wand andH.-P. Seidel. J. Bogojeska,S.Bickel, A. Altmann and C. Bock,G.Von Kuster,K.Halachev, Epigenomics, Dasmi: exchanging,annotatingand , 25(10):1321–1328,2009. , 26(17):2085–2092,July2010. Dealing withsparsedatainpredic- Algorithms –ESA2010, 18th Epigenetic biomarkerdevelop- International Symposiumon 1(1):99–110, 2009. 21st AnnualACM-SIAM Methods MolBiol , 29(3),2010. . Algorithms and . Algorithms , Seoul,Korea, ACM Transactions Towards Remember , New York, Inter- , c, , ...... Recognition (CVPR) Conf. onComputer Vision andPattern time-of-flight camera.In and C.Theobalt. [34] 1005–1016. SIAM. (SODA 2010) ACM-SIAM SymposiumonDiscrete Algorithms ted machines.InM.Charikar, ed., deterministic truthfulPTAS forschedulingrela- [33] nisms. A. Vidali. [32] 5893, pp.86–97.Springer. 2009 Algorithms, 7thInternationalWorkshop, WAOA and K.Jansen,eds., lity forundirectednetworkdesign.InE.Bampis E. PyrgaandR.van Stee. [27] 3424–3439, 2010. dualization. Left-to-right multiplicationformonotoneBoolean [26] pp. 230–241.Springer. United Kingdom,November2010,LNCS6347, Annual EuropeanSymposium.–Pt.II [29] Algorithms multipleinterval graphs. stein andD.Rawitz. [28] 10(1):186, 1–186,15,2010. of lentivirusgenes. selection ofhivhostfactorsandtheevolution [31] 2010, LNCS6346,pp.23–35.Springer. Symposium. –Pt.I Algorithms –ESA2010,18th Annual European time. InM.deBergandU.Meyer, eds., Non-clairvoyant speedscalingforweightedow [30] Israel, 2010,pp.181–193.COLT. The 23rdConferenceonLearningTheory In A. T. KalaiandM.Mohri,eds., shaped learningresultsbygeneraltechniques. , Copenhagen,Denmark,June2010,LNCS Y. Cui,S.Schuon,D.Chan,Thrun G. Christodoulouand A. Kovács. G. Christodoulou,E.Koutsoupiasand K. BozekandT. Lengauer. E. Boros,K.ElbassioniandMakino. J. CaseandT. Kötzing. A. Butman, D.Hermelin,M.Lewen- G. Christodoulou,C.Chung,K.Ligett, S.-H. Chan,T.-W. LamandL.-K.Lee. Algorithmica , 6(2):40,1–40,18,2010. A lowerboundforschedulingmecha- SIAM Journal onComputing , Austin, Tx.,USA,2010, pp. , Liverpool,UnitedKingdom, 3D shapescanningwitha BMC EvolutionaryBiology , 85(4):729–740,2009. , 2010,pp.1173– 1180. Approximation andOnline Optimization problemsin ACM Transactions on Proc. ofIEEEIntl. On thepriceofstabi- Strongly non-u- COLT 2010, 21st Annual Positive , Liverpool, , Haifa, , 39(7): A ,

...... [39] 2009, LNCS5555,pp.366–377.Springer. Colloquium, ICALP 2009 Languages andProgramming,36thInternational S. NikoletseasandW. Thomas,eds., In S. Albers, A. Marchetti-Spaccamela, Y. Matias, ders, pushvs.pull,androbustness. wald. [38] 29(2):713–722, May2010. Eurographics 2010,Norrköpping,Sweden) Computer GraphicsForum (Proceedings 3D contentforhigh-refresh-ratedisplays. motivated real-timetemporalupsamplingof K. MyszkowskiandH.-P. Seidel [37] 113:8, July2010. ACM Transactions onGraphics play resolutionenhancementformovingimages. K. MyszkowskiandH.-P. Seidel [36] Computer Programming with largediscretestatespaces. symbolic verificationoflinearhybridautomata U. Waldmann andB.Wirtz. W. Hagemann,F. Pigorsch,C.Scholl, [35] pp. 1449–1456. ACM. BestPaper Award. (GECCO-2010) ference onGeneticandEvolutionaryComputation J. Branke,eds., Multiplicative driftanalysis.InM.Pelikan and [41] pp. 184–193.Springer. Conference from Nature–PPSNXI,11thInternational and G.Rudolph,eds., problem. InR.Schaefer, C.Cotta,J.Kolodziej crossover operatorsfortheall-pairsshortestpath F. NeumannandM.Theile. [40] Best Paper Award. 2010, 11th InternationalConference, Parallel ProblemSolvingfromNature–PPSNX: Cotta, J.KolodziejandG.Rudolph,eds., analysis withtailbounds.InR.Schaefer, C. LNCS 6238 B. DoerrandL. A. Goldberg. B. Doerr,T. Friedrich andT. Sauer- P. Didyk,E.Eisemann,T. Ritschel, P. Didyk,E.Eisemann,T. Ritschel, W. Damm,H.Dierks,S.Disch, B. Doerr,D.JohannsenandC.Winzen. B. Doerr,D.Johannsen,T. Kötzing, Quasirandom rumorspreading:Expan- , Krakow, Poland, 2010, Proceedings of12th Annual Con- , Portland, USA,2010, , pp.174–183.Springer. Parallel ProblemSolving , toappear, 2011. , Rhodes,Greece, More effective Exact andfully , 29(4):113:1– Krakow, Poland, Science of . Perceptually- dis- . Apparent LNCS 6238 Automata, Drift , , 113 bericht_2011_en_korr2 24.11.201116:41UhrSeite114 114 PUBLICATIONS 2010. TRAF3IP2. identifies apsoriasissusceptibilitylocus at A. Franke. Elder, S.Schreiber,M.Weichenthal and Kabelitz, U.Mrowietz,G.R. Abecasis, J.T. T. H.Karlsen,G.Mayr,M. Albrecht, D. D. Gladman,C.Gieger,H.E.Wichmann, Weidinger, B.Eberlein,M.Kunz,P. Rahman, S. W. Stoll,J. J. Voorhees, S.Lambert, J. Ding, Y. Li,T. Tejasvi, J.Gudjonsson, M. Belouchi,H.Fournier, C.Reinhard, Stuart, R.P. Nair,S.Debrus,J.V. Raelson, [48] Eng. Bull. graphs withSPARQL andkeywords. R. SchenkelandG.Weikum. [47] 3(2):1589–1592, 2010. ments toexploreextractedinformation. R. Schenkel. [46] NY, USA,2009,pp.1210–1219. ACM. on Discrete Algorithms (SODA 2009) Twentieththe Symposium AnnualACM-SIAM efficients. InC.Matthieu,ed., mum feasiblesubsystemproblemwith0/1-co- R. Sitters. [45] 2010, pp.256–267.SIAM. Algorithms (SODA 2010) First Annual ACM-SIAM SymposiumonDiscrete in doublingmetrics.In for TSPwithfatweaklydisjointneighborhoods [44] pp. 720–733.Springer. Vision 2010, 11thEuropeanConferenceonComputer and N.Paragios, eds., object recognition.InK.Daniilidis,P. Maragos Extracting structuresinimagecollectionsfor [43] May 2009. of Experimental Algorithms to reconstructa2-manifoldinR3. N. Milosavljevic. [42] , Crete,Greece,2010, E. Ellinghaus,D.P. E. S. Elbassuoni,M.Ramanath, S. Elbassuoni,K.Hose,Metzgerand K. Elbassioni,R.Raman,S.Rayand K. ElbassioniandH.Chan. S. Ebert,D.LarlusandB.Schiele. D. Dumitriu,S.Funke,M.Kutzand , 33(1):16–24,2010. Genome-wide associationstudy On theapproximabilityofmaxi- Nature Genetics ROXXI: Revivingwitnessdocu- How muchgeometryittakes Computer Vision –ECCV Proceedings oftheTwenty- , Austin, USA,January , 14:2.2:1–2.2:17, , 42(11):991–995, LNCS 6311 Searching RDF Proceedings of ACM Journal A QPTAS , New York, IEEE Data PVLDB , , ...... [49] [51] Research visualizing exonexpressiondata. AltAnalyze andDomainGraph:analyzing T. Lengauer,B.R.ConklinandM. Albrecht. [50] 2010) (SIGMOD Conference onManagementofData Proceedings ofthe ACM SIGMODInternational and P. G.Spirakis,eds., C. Gavoille,Kirchner, F. MeyeraufderHeide power ofmultiplechoices.InS. Abramsky, Orientability ofrandomhypergraphsand the [56] 2010, pp.2552–2560.IEEE. Proceedings IEEEINFOCOM networks andtheeffectofdensity. In K. Panagiotou [55] 6397, pp.302–316.Springer. 17 Reasoning, 17thInternationalConference,LPAR- Logic forProgramming, Artificial Intelligence,and In C.G.Fermüller and A. Voronkov, eds., first-order probabilistictimedautomata. Weidenbach. [54] Korea, 2010,LNCS6507,pp.327–338.Springer. Symposium, ISAAC 2010.-Pt.II Algorithms andComputation,21stInternational In O.Cheong,K.-Y. ChwaandK.Park, eds., Entropy-bounded representationofpointgrids. [53] LNCS 6396,pp.152–167.Springer. Conference, IFM2010 Integrated Formal Methods,8thInternational topologies. InD.MeryandS.Merz,eds., on ofparametricspecificationswithcomplex V. Sofronie-Stokkermans. [52] pp. 336–347.Springer. Bordeaux, France, July2010,LNCS6198, International Colloquium,ICALP 2010 Automata, LanguagesandProgramming,37th F. MeyeraufderHeideandP. G.Spirakis,eds., buffer. InS. Abramsky, C.Gavoille,Kirchner, Max-min onlineallocationswithareordering , Yogyakarta, Indonesia, October2010,LNCS , 2010. ACM. A. K.ElmagarmidandD. Agrawal, L. Epstein, A. LevinandR.van Stee D. Emig,N.Salomonis, J.Baumbach, N. Fountoulakis andK.Panagiotou. N. Fountoulakis, A. Huberand A. Fietzke, H.HermannsandC. A. Farzan, T. GagieandG.Navarro. J. Faber, C.Ihlemann,S.Jacobs and , 38(Suppl.2):W755–W762,2010. Superposition-based analysisof . Reliablebroadcastinginrandom , Nancy, France, 2010, Automata, Languages Automatic verificati- , SanDiego,USA, Nucleic Acids , JejuIsland, 2010 , eds. .

...... to referees. K. MehlhornandJ.Mestre. [61] ch. 4,pp.131–164.Springer, Berlin,2010. Systems, CognitiveSystemsMonographs G.- Categorical perception.InH.I.Christensen, M. Stark, A. LeonardisandB.Schiele [60] 114(5):564–573, 2010. Computer Vision andImageUnderstanding, categories usingdifferentlevelsofsupervision. B. Schiele. [59] Austin, Texas, 2010,pp.1620–1629.SIAM. Symposium onDiscrete Algorithms (SODA 2010) In M.Charikar, ed., T. Sauerwald. [58] (IL17REL). colitis identifiesrisklociat7q22and22q13 Genomewide associationstudyforulcerative Rosenstiel, T. H.KarlsenandS.Schreiber. S. Vermeire, M.H.Vatn, M.Krawczak, P. W. L.McArdle,C.G.Mathew, P. Rutgeerts, M. Gazouli,N.P. Anagnou, D.Strachan, J. Sventoraityte,L.Kupcinskas,C.M.Onnie, S. Nikolaus,C.Gieger,H.E.Wichmann, M. Albrecht, M.Wittig, E.Buchert, D. Ellinghaus,R.Häsler,G.Mayr, [57] LNCS 6198,pp.348–359.Springer. ICALP 2010.-Pt.1 and Programming,37thInternationalColloquium, October 2010,pp.499–508. ACM. Management (CIKM2010) Conference onInformationandKnowledge In of shortestpathsinlargenetworks. and G.Weikum [63] core and Accelerators J. Kurzak,eds., smoothers. InJ.Dongarra,D. A. Baderand precision GPU-multigridsolverswithstrong [62] D Proceedings ofthe19th ACM International J. M.KruijffandL.Wyatt, eds.,Cognitive N. Garg,T. Kavitha, A. Kumar, M. Fritz, A. Mykhaylo,S.Fidler, M. Fritz, G.-J.M.Kruijffand T. Friedrich, M.Gairingand A. Franke, T. Balschun, C.Sina, A. Gubichev, S.Bedathur, S.Seufert . GöddekeandR.Strzodka. Algorithmica Nature Genetics Tutor-based learningofvisual Quasirandom loadbalancing. Scientific ComputingwithMulti- . Fast andaccurateestimation , Bordeaux,France, 2010, . CRCPress,Dec.2010. 21st AnnualACM-SIAM , 58(1):119–136,2010. , 42(4):292–294,2010. , Toronto, Canada, Assigning papers Mixed , vol.8, . , ...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 115 ...... PUBLICATIONS 115

[64] N. Hasler, T. Thormählen and H.-P. [73] S. Jain, S. Seufert and S. J. Bedathur. Düsseldorf, Germany, September 2010, vol. 10, Seidel. Multilinear pose and body shape Antourage: mining distance-constrained trips pp. 19–21. Universität Düsseldorf. Short contri- estimation of dressed subjects from image sets. from Flickr. In M. Rappa, P. Jones, J. Freire and bution. In IEEE Computer Society Conference on Com- S. Chakrabarti, eds., WWW, 2010, pp. 1121– puter Vision and Pattern Recognition (CVPR 1122. ACM. [82] Y. Mass, M. Ramanath, Y. Sagiv and G. 2010), 2010, pp. 1823–1830. IEEE. Weikum. IQ: The case for iterative querying for [74] O. Jurca, S. Michel, A. Herrmann and knowledge. In Proceedings of the 5th Biennial [65] M. Horbach and C. Weidenbach. Super- K. Aberer. Continuous query evaluation over Conference on Innovative Data Systems Research position for fixed domains. ACM Transactions on distributed sensor networks. In F. Li, M. M. (CIDR 2011), Asilomar, USA, 2011, p. 38–44 Computational Logic, 11(4):27,1–27,35, Moro, S. Ghandeharizadeh, J. R. Haritsa, G. November 2010. Weikum, M. J. Carey, F. Casati, E. Y. Chang, [83] R. A. McGovern, A. Thielen, T. Mo, W. I. Manolescu, S. Mehrotra, U. Dayal and V. J. Dong, C. K. Woods, D. Chapman, M. Lewis, I. [66] M. B. Hullin, J. Hanika, B. Ajdin, Tsotras, eds., ICDE, 2010, pp. 912–923. IEEE. James, J. Heera, H. Valdez and P. R. Harrigan. H.-P. Seidel, J. Kautz and H. P. A. Lensch. Population-based v3 genotypic tropism assay: a Acquisition and analysis of bispectral bidirec- [75] S. Kratsch and M. Wahlström. Prepro- retrospective analysis using screening samples tional reflectance and reradiation distribution cessing of min ones problems: A dichotomy. from the a4001029 and motivate studies. AIDS, ...... functions. ACM Trans. Graph. (Proc. SIG- ...... In S. Abramsky, C. Gavoille, C. Kirchner, F...... 24(16):2517–2525, October 2010. GRAPH 2010), 29(4):97:1–97:7, 2010. Meyer auf der Heide and P. G. Spirakis, eds., Automata, Languages and Programming, 37th [84] A. C. McHardy and B. Adams. The role [67] C. Ihlemann and V. Sofronie-Stokker- International Colloquium, ICALP 2010, of genomics in tracking the evolution of influen- mans. On hierarchical reasoning in combinations Bordeaux, France, 2010, LNCS 6198, za a virus. PLoS Pathog, 5(10):e1000566, of theories. In J. Giesl and R. Hähnle, eds., pp. 653–665. Springer. October 2009. Automated Reasoning, 5th International Joint Conference, IJCAR 2010, Edinburgh, 2010, [76] A. Kuroishi, K. Bozek, T. Shioda and [85] K. Mehlhorn and M. Sagraloff. LNAI 6173, p. 30–45. Springer. E. E. Nakayama. A single amino acid substituti- A deterministic descartes algorithm for real on of the human immunodeficiency virus type 1 polynomials. Accepted for ISSAC 2009, [68] I. Ihrke, K. N. Kutulakos, H. P. A. capsid protein affects viral sensitivity to TRIM5 January 2010. Lensch, M. Magnor and W. Heidrich. Trans- alpha. Retrovirology, 7:58,1–58,10, 2010. parent and specular object reconstruction. Com- [86] G. de Melo and G. Weikum. MENTA: puter Graphics Forum, 29(8):2400–2426, 2010. [77] C. M. Lange, K. Roomp, A. Dragan, Inducing multilingual taxonomies from Wikipe- J. Nattermann, M. Michalk, U. Spengler, dia. In J. Huang, N. Koudas, G. Jones, X. Wu, [69] I. Ihrke, G.Wetzstein and W. Heidrich. V. Weich, T. Lengauer, S. Zeuzem, T. Berg and K. Collins-Thompson and A. An, eds., Procee- A theory of plenoptic multiplexing. In IEEE C. Sarrazin. HLA class I allele associations with dings of the 19th ACM Conference on Information Conference on Computer Vision and Pattern HCV genetic variants in patients with chronic and Knowledge Management (CIKM 2010), Recognition (CVPR), 2010, pp. 1–8. HCV genotypes 1a or 1b infection. Journal of Toronto, Canada, 2010, pp. 1099–1108. ACM. Hepatology, 53(6):1022–1028, December 2010. [70] S. Jacobs. Incremental instance generation [87] J. Meyer, P. Schnitzspan, S. Kohlbrecher, in local reasoning. In A. Bouajjani and O. Maler, [78] S. Lee, E. Eisemann and H.-P. Seidel. K. Petersen, O. Schwahn, M. Andriluka, eds., Computer Aided Verification, 21st Internatio- Real-Time Lens Blur Effects and Focus Control. U. Klingauf, S. Roth, B. Schiele and O. von nal Conference, CAV 2009, Grenoble, France, ACM Transactions on Graphics (Proc. ACM SIG- Stryk. A semantic world model for urban search June 2009, LNCS 5643, pp. 368–382. Springer. GRAPH'10), 29(4):65:1–7, 2010. and rescue based on heterogeneous sensors. In RoboCup 2010: Robot Soccer World Cup XIV, [71] A. Jain, C. Kurz, T. Thormählen and [79] T. Lengauer, A. Altmann, A. Thielen and Singapore, Singapore, 2011 to appear, pp. 1–12. H.-P. Seidel. Exploiting global connectivity R. Kaiser. Chasing the aids virus. Communica- Springer. constraints for reconstruction of 3D line segment tions of the ACM, 53(3):66–74, 2010. from images. In IEEE Conference on Computer [88] M. Müller, M. Clausen, V. Konz, S. Vision and Pattern Recognition (CVPR 2010), San [80] T. Lengauer and R. Kaiser. Computer- Ewert and C. Fremerey. A multimodal way of Francisco, USA, June 2010, pp. 1586–1593. IEEE. jagd auf das AIDSVIRUS. Spektrum der Wissen- experiencing and exploring music. Interdiscipli- schaft, 08:62–67, August 2009. nary Science Reviews (ISR), 35(2):138–153, 2010. [72] A. Jain, T. Thormählen, H.-P. Seidel and C. Theobalt. Moviereshape: Tracking and [81] T. Lu, S. Merz and C. Weidenbach. [89] M. Müller and S. Ewert. Towards tim- reshaping of humans in videos. ACM Trans- Model checking the pastry routing protocol. bre-invariant audio features for harmony-based actions on Graphics (Proc. of SIGGRAPH Asia), In J. Bendisposto, M. Leuschel and M. Roggen- music. IEEE Transactions on Audio, Speech, 29(6):148–157, 2010. bach, eds., 10th Intl. Workshop Automatic and Language Processing (TASLP), 18(3):649– Verification of Critical Systems (AVOCS 2010), 662, 2010...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 116

116 PUBLICATIONS

[90] T. Mütze, T. Rast and R. Spöhel. Transfer. In 2010 IEEE Conference on Computer [107] V. Sofronie-Stokkermans. Locality Coloring random graphs online without creating Vision and Pattern Recognition (CVPR). – Pt. 2, results for certain extensions of theories with monochromatic subgraphs. In D. Randall, ed., San Francisco, USA, June 2010, pp. 910–917. bridging functions. In R. A. Schmidt, ed., Proceedings of the Twenty-Second Annual IEEE. Automated Deduction – CADE-22, 22nd Inter- ACM-SIAM Symposium on Discrete Algorithms national Conference on Automated Deduction, (SODA 11), San Francisco, USA, January 2011, [99] A. Rybalchenko and V. Sofronie- Montreal, Canada, 2009, LNAI 5663, p. 67–83. pp. 145–158. Society for Industrial and Applied Stokkermans. Constraint solving for interpo- Springer. Mathematics. lation. Journal of Symbolic Computation, 45(11): 1212–1233, 2010. Accepted for publication. [108] M. Sozio and A. Gionis. The community- [91] N. Nakashole, M. Theobald and search problem and how to plan a successful G. Weikum. Scalable knowledge harvesting with [100] A. Schlicker and M. Albrecht. cocktail party. In B. Rao, B. Krishnapuram, high precision and high recall. In 4th International FunSimMat update: new features for exploring A. Tomkins and Q. Yang, eds., KDD, 2010, Conference on Web Search and Data Mining functional similarity. Nucleic Acids Research, pp. 939–948. ACM. (WSDM 2011), Hong Kong, 2011, ACM. To 38(Database issue):D244–D248, 2010. appear. [109] M. Stark, M. Goesele and B. Schiele. [101] A. Schlicker, T. Lengauer and M. Back to the future: Learning shape models from ...... [92] G. Pons-Moll, A. Baak, T. Helten, Albrecht. Improving disease gene prioritization 3D CAD data. In F. Labrosse, R. Zwiggelaar, ...... M. Müller, H.-P. Seidel and B. Rosenhahn. using the semantic similarity of Gene Ontology Y. Liu and B. Tiddeman, eds., 21st British Ma- Multisensorfusion for 3D full-body human terms. Bioinformatics, 26(18):i561–i567, 2010. chine Vision Conference (BMVC), Aberystwyth, motion capture. In IEEE Conference on Compu- Wales, September 2010, Proceedings of the ter Vision and Pattern Recognition (CVPR), San [102] P. Schnitzspan, S. Roth and B. Schiele. British Machine Vision Conference, vol. 21, Francisco, CA, USA, June 2010, pp. 663–670. Automatic discovery of meaningful object parts pp. 106,1–106,11. BMVA. with latent CRFs. In 2010 IEEE Conference on [93] N. Preda, G. Kasneci, F. M. Suchanek, Computer Vision and Pattern Recognition [110] U. Steinhoff and B. Schiele. Dead T. Neumann, W. Yuan and G. Weikum. (CVPR). – Pt. 1, San Francisco, USA, June reckoning from the pocket – an experimental Active knowledge: dynamically enriching RDF 2010, Proceedings of the IEEE Conference on study. In IEEE 2010 International Conference knowledge bases by web services. Computer Vision and Pattern Recognition on Pervasive Computing and Communications In Elmagarmid and Agrawal [49], pp. 399–410. (CVPR), vol. 23, pp. 121–128. IEEE. (PerCom), Mannheim, Germany, 2010, pp. 162–170. IEEE. [94] F. Ramírez and M. Albrecht. Finding [103] V. Setty, S. J. Bedathur, K. Berberich scaffold proteins in interactomes. Trends in Cell and G. Weikum. InZeit: Efficiently identifying [111] C. Stoll, J. Gall, E. de Aguiar, S. Thrun Biology, 20(1):2–4, 2010. insightful time points. PVLDB, 3(2):1605–1608, and C. Theobalt. Video-based reconstruction of 2010. animatable human characters. ACM Transactions [95] E. Reinhard, G. Ward, S. Pattanaik, on Graphics (Proc. of SIGGRAPH Asia), P. Debevec, W. Heidrich and K. Myszkowski. [104] S. Seufert, S. Bedathur, J. Mestre and 29(6):139–149, 2010. High Dynamic Range Imaging: Acquisition, G. Weikum. Bonsai: Growing interesting small Display, and Image-based Lighting. Elsevier trees. In 10th IEEE International Conference on [112] R. Strzodka, M. Shaheen, D. Pajak and (Morgan Kaufmann), Burlington, MA, 2010. Data Mining (ICDM 2010), Sydney, Australia, H.-P. Seidel. Cache oblivious parallelograms in December 2010, pp. 1013–1018. IEEE iterative stencil computations. In ICS '10: Pro- [96] S.-Y. Rhee, S. Margeridon-Thermet, Computer Society. ceedings of the 24th ACM International Confer- M. Nguyen, T. Liu, R. Kagan, B. Beggel, ence on Supercomputing, 2010, pp. 49–59. ACM. J. Verheyen, R. Kaiser and R. Shafer. [105] M. Shmueli-Scheuer, C. Li, Y. Mass, Hepatitis b virus reverse transcriptase sequence H. Roitman, R. Schenkel and G. Weikum. [113] M. Suda, C. Weidenbach and variant database for sequence analysis and muta- Best-effort top-k query processing under budgetary P. Wischnewski. On the saturation of YAGO. tion discovery. Antiviral Research, 88(3):269–275, constraints. In ICDE, 2009, pp. 928–939. IEEE. In J. Giesl and R. Hähnle, eds., Automated December 2010. Reasoning, 5th International Joint Conference, [106] D. Skocaj, K. Matej, A. Vrecko, IJCAR 2010, Edinburgh, UK, 2010, LNAI 6173, [97] T. Ritschel, T. Thormählen and A. Leonardis, M. Fritz, M. Stark, B. Schiele, pp. 441–456. Springer. H.-P. Seidel. Interactive on-surface signal S. Hongeng and J. L. Wyatt. Multi-modal deformation. ACM Transactions on Graphics learning. In H. I. Christensen, G.-J. M. Kruijff [114] C. Theobalt, E. de Aguiar, C. Stoll, (TOG), 29(4):1–8, July 2010. and J. L. Wyatt, eds., Cognitive Systems, Cogni- H.-P. Seidel and S. Thrun. Image and Geo- tive Systems Monographs, vol. 8, ch. 7, pp. 265– metry Procesing for 3D-Cinematography, ch. [98] M. Rohrbach, M. Stark, G. Szarvas, 309. Springer, Berlin, 2010. Performance Capture from Multi-view Video. I. Gurevych and B. Schiele. What helps Where Springer, 2010. - and Why? Semantic Relatedness for Knowledge ...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 117 ...... PUBLICATIONS 117

[115] A. Thielen, N. Sichtig, R. Kaiser, [123] G. Wetzstein, I. Ihrke and W. Heidrich. J. Lam, P. R. Harrigan and T. Lengauer. Sensor saturation in Fourier multiplexed imaging. Improved prediction of HIV-1 coreceptor usage In IEEE Conference on Computer Vision and with sequence information from the second Pattern Recognition (CVPR), 2010, pp. 1–8. hypervariable loop of gp120. Journal of Infectious Diseases, 202(9):1435–1443, November 2010. [124] S. E. Whang, D. Menestrina, G. Koutrika, M. Theobald and H. Garcia- [116] L. H. U, N. Mamoulis, K. Berberich Molina. Entity resolution with iterative and S. J. Bedathur. Durable top-k search in blocking. In U. Çetintemel, S. B. Zdonik, document archives. In Elmagarmid and Agrawal D. Kossmann and N. Tatbul, eds., SIGMOD [49], pp. 555–566. Conference, 2009, pp. 219–232. ACM.

[117] L. Valgaerts, A. Bruhn, H. Zimmer, [125] T. Wittkop, D. Emig, S. Lange, J. Weickert, C. Stoll and C. Theobalt. S. Rahmann, M. Albrecht, J. H. Morris, Joint estimation of motion, structure and geo- S. Böcker, J. Stoye and J. Baumbach. metry from stereo sequences. In Proc. of Partitioning biological data with transitivity ...... European Conference on Computer Vision ...... clustering. Nature Methods, 7(6):419–420, 2010...... (ECCV), 2010, pp. 1–12. [126] I. Wohlers, F. S. Domingues and G. W. [118] S. Walk, N. Majer, K. Schindler and Klau. Towards optimal alignment of protein B. Schiele. New features and insights for structure distance matrices. Bioinformatics, pedestrian detection. In 2010 IEEE Conference 26(18):2273–2280, September 2010. on Computer Vision and Pattern Recognition (CVPR). – Pt. 2, San Francisco, USA, 2010, [127] C. Wojek, S. Roth, K. Schindler and pp. 1030–1037. IEEE. B. Schiele. Monocular 3D scene modeling and inference: Understanding multi-object traffic [119] S. Walk, K. Schindler and B. Schiele. scenes. In K. Daniilidis, P. Maragos and Disparity statistics for pedestrian detection: N. Paragios, eds., Computer Vision – ECCV Combining appearance, motion and stereo. 2010, 11th European Conference on Computer In K. Daniilidis, P. Maragos and N. Paragios, Vision. – Pt. IV, Heraklion, Crete, Greece, eds., Computer Vision – ECCV 2010, 11th September 2010, LNCS 6341, pp. 467–481. European Conference on Computer Vision. – Pt. Springer. VI, Heraklion, Crete, Greece, September 2010, LNCS 6341, pp. 182–195. Springer.

[120] M. Wand, B. Adams, M. Ovsjanikov, A. Berner, M. Bokeloh, P. Jenke, L. Guibas, H.-P. Seidel and A. Schilling. Efficient re- construction of non-rigid shape and motion from real-time 3D scanner data. ACM Transactions on Graphics, 28(2), 2009.

[121] G. Weikum, N. Ntarmos, M. Spaniol, P. Triantafillou, A. Benczúr, S. Kirkpatrick, P. Rigaux and M. Williamson. Longitudinal analytics on web archive data: It’s about time! In Proceedings of the 5th Biennial Conference on Innovative Data Systems Research (CIDR 2011), Asilomar, USA, 2011, pp. 199–202

[122] G. Weikum and M. Theobald. From information to knowledge: harvesting entities and relationships from web sources. In J. Paredaens and D. V. Gucht, eds., PODS, 2010, pp. 65–76. ACM...... bericht_2011_en_korr2 24.11.201116:41UhrSeite118 118

...... Directions STANDORT The campusisaccessible... Metz, Nancy, andParis. Mannheim, Frankfurt, Luxemburg,Trier, Köln,andStraßburgaswell intervals. Theexpresshighways–or Autobahns –providefastroutesto all overGermany, andtrainstoMetz,NancyParis leaveatregular or car. TherearegoodhourlytrainconnectionsfromSaarbrückentocities connected totheFrankfurt andLuxemburgairportsbyshuttlebus,train, Saarbrücken hasitsownairport(Saarbrücken-Ensheim)andiswell- downtown Saarbrücken,inawoodedareanearDudweiler. the campusofSaarlandUniversity, approximately5kmnorth-eastof The MaxPlanckInstituteforInformatics(BuildingE1 follow thewhitesignsfor exit from ParisonAutobahnA4 follow thewhitesignsfor exit from FrankfurtorMannheimonAutobahnA6 alternatively exit exit direction by businapproximately20minutes by taxiinapproximately15minutes from Saarbrückencentraltrainstation by taxiinapproximately20minutes from theairportatSaarbrücken-Ensheim „St.Ingbert-West“ „St.Ingbert-West“ „Universität Mensa“ „Dudweiler-Dudoplatz“ „Universität Campus“ , or „Universität“ „Universität“ or „Universität Campus“ ...... , leadingtothecampus , leadingtothecampus 4 ) islocatedon

...... Max PlanckInstituteforInformatics multi-storey carparkeast university east north bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 119 ...... 119

IMPRINT

Publisher Max Planck Institute for Informatics Campus E1 4 D-66123 Saarbrücken

Editors & Coordination Volker Maria Geiß ...... Samir Hammann Manuel Lamotte-Schubert Thomas Lengauer Kurt Mehlhorn Jennifer Müller Marcus Rohrbach Thomas Sauerwald Bernt Schiele Hans-Peter Seidel ...... Ingolf Sommer Martin Theobald Christian Theobalt

...... Uwe Waldmann Christoph Weidenbach Gerhard Weikum

Proof-reading Krista Ames Kamila Kolesniczenko Ebba Schröder Nathalie Strassel

Photos & Photo Editing ...... Martin Cadik Ivo Ihrke Manuel Lamotte-Schubert Kristina Scherbaum

Contact Max Planck Institute for Informatics phone +49 681 9325-0 fax +49 681 9325-5719 email [email protected] internet http://www.mpi-inf.mpg.de

Report Period 1st of January, 2009 to 31st of December, 2010

Design Behr Design | Saarbrücken

Printing

...... Kubbli GmbH bericht_2011_en_korr2 24.11.201116:41UhrSeite120 120

...... bericht_2011_en_korr2 24.11.2011 16:41 Uhr Seite 121

max planck institut informatik Max Planck Institute for Informatics Campus E1 4 D-66123 Saarbrücken phone +49 681 9325-0 fax +49 681 9325-5719 email [email protected] internet http://www.mpi-inf.mpg.de

max planck institut informatik