INRIA, Evaluation of Theme Sym B
Total Page:16
File Type:pdf, Size:1020Kb
INRIA, Evaluation of Theme Sym B Project-team Algorithms November 14-15, 2006 Project-team title: Algorithms (ALGO) Scientific leader: Bruno Salvy Research center: Rocquencourt 1 Personnel Personnel (March 2002) Misc. INRIA CNRS University Total DR (1) / Professors 3 3 CR (2) / Assistant Professors 1 1 Permanent Engineers (3) Temporary Engineers (4) PhD Students 5 1 6 Post-Doc. 1 1 Total 6 4 1 11 External Collaborators 2 2 2 6 Visitors (> 1 month) 1 1 (1) “Senior Research Scientist (Directeur de Recherche)” (2) “Junior Research Scientist (Charg´ede Recherche)” (3) “Civil servant (CNRS, INRIA, ...)” (4) “Associated with a contract (Ing´enieur Expert or Ing´enieurAssoci´e)” Personnel (November 2006) Misc. INRIA CNRS University Total DR / Professors 3 3 CR / Assistant Professor 2 2 Permanent Engineer Temporary Engineer 1 1 PhD Students 1 2 3 Post-Doc. 1 1 Total 3 5 2 10 External Collaborators 1 1 2 Visitors (> 1 month) 1 Changes in staff DR / Professors Misc. INRIA CNRS University total CR / Assistant Professors Arrival 1 Leaving Comments: Alin Bostan joined our team in September 2004. Current composition of the project-team in November 2006: – DR: Philippe Flajolet (DR 0), Mireille R´egnier and Bruno Salvy (DR 2); – CR: Fr´ed´eric Chyzak (CR 1), Alin Bostan (CR 2); – PhD Students: Eric´ Fusy (deputation from Corps des T´el´ecoms), Fr´ed´eric Giroire and Carine Pivoteau (with scholarships from the Ministry of Research and Higher Education); – Temporary Engineer: Andr´eDerrick Balsa; – Post-doc: Fr´ed´ericMeunier (deputation from Corps des Ponts); – External collaborators: Philippe Dumas (Mathematics Teacher in a classe pr´eparatoire aux grandes Ecoles´ , Lyc´eeJean-Baptiste Say, Paris; Pierre Nicod`eme (CR CNRS, Ecole´ polytechnique). Current position of former project-team members (including PhD stu- dents during the March 2002 to November 2006 period): Former PhD students: – Alin Bostan, researcher, Algorithms project, Inria; – Marianne Durand, analyst in computational financial mathematics, BNP-Paribas, Paris, France; – Julien Fayolle, post-doc, LRI, University of Paris-Sud, France; – Ludovic Meunier, software engineer, Murex (software development for trading, risk management and processing), Paris, France; – Vincent Puyhaubert, mathematics teacher, classe pr´eparatoire aux grandes Ecoles´ , Lyc´ee F´enelon, Paris, France; – Mathias Vandenbogaert, researcher in biology, Institut Pasteur, Paris, France. Former post-docs: – Ha Le, half-time math developer, Waterloo Maple Inc., and half-time post-doc, Uni- versity of Waterloo, Canada; – Jos´eLuis Martins, administrator at the Commission of the European Communities, Luxembourg. – Alexandre Sedoglavic, maˆıtre de conf´erences in computer science, University of Lille, France; – Markus V¨oge, researcher in computer science, department of media technology, EMPA research institute, D¨ubendorf, Switzerland. 2 Last INRIA enlistments Alin Bostan, CR 2 since September 1st, 2004. 2 Work progress 2.1 Keywords Analysis of algorithms, analytic combinatorics, random discrete structures, limit laws, computer algebra, special functions, Gr¨obner bases, combinatorics on words, biological sequences, genome. 2.2 Context and overall goal of the project The primal objective of the project, inherited from the former century, is the field of anal- ysis of algorithms. By this is meant a precise quantification of complexity issues associated to the most fundamental algorithms and data structures of computer science. Departing from traditional approaches that, somewhat artificially, place emphasis on worst-case sce- narii, the project focuses on average-case and probabilistic analyses, aiming as often as possible at realistic data models. As such, our research is inspired by the pioneering works of Knuth. The need to analyze, dimension, and finely optimize algorithms requires an in-depth study of random discrete structures, like words, trees, graphs, and permutations, to name a few. Indeed, a vast majority of the most important algorithms in practice either “make bets” on the likely shape of input data or even base themselves of random choices. In this area we are developing a novel approach based on recent theories of combinatorial analysis together with the view that discrete models connect nicely with complex-analytic and asymptotic methods. The resulting theory has been called “Analytic combinatorics”. Applications of it have been or are currently being worked out in such diverse areas as communication protocols, multidimensional search, data structures for fast retrieval on external storage, data mining applications, the analysis of genomic sequences, and data compression, for instance. The analytic-combinatorial approach to the basic processes of computer science is very systematic. It appeared early in the history of the project that its development would greatly benefit from the existence of symbolic manipulation systems and computer algebra. This connection has given rise to an original research programme that we are currently carrying out. Some of the directions pursued include automating the manipulation of combinatorial models (counting, generating function equations, random generation), the development of “automatic asymptotics”, and the development of a unified view of the theory of special functions. In particular, the project has developed the Maple library Algolib, that addresses several of these issues. 2.3 Objectives for the evaluation period Copy-pasted and translated from the previous evaluation document. One of the first objectives is the completion of a vast synthesis on methods in ana- lytic combinatorics (by P. Flajolet and R. Sedgewick, a Princeton success writer). Then the development of analysis of algorithms, along the natural evolution: worst-case (often obsolete) → average-case (current) → probabilistic analysis (i.e., in distribution, the fu- ture). We aim at: various hashing schemes; the dynamical analysis of algorithms (with 3 B. Vall´ee, Caen); random generation (thanks to a new idea with a physical origin: Boltz- mann generators); the automation of the analysis of combinatorial structures by computer algebra; tree models and random graphs; problems exhibiting a phase transition in relation with analytic methods and coalescent saddle-points; elucidating the role of “holonomic” functions in the combinatorics of “hard” models, etc. Our research projects in computer algebra are guided by two objectives. First, com- plete and make more systematic the automation of the analysis of combinatorial struc- tures, in particular towards ordered structures and analyses in distribution. Second, take advantage of our specific strength in analysis to bring progress at the interface between computer algebra and numerical analysis. The idea is to bring together our work on spe- cial functions, recurrences and on asymptotic analysis so as to develop tools producing automatically efficient numerical code. This is the topic of L. Meunier’s PhD thesis. It will be validated and focused by applications in physics. It will also be fueled by more research on the analysis of non-linear differential equations and on efficiency issues in commutative and non-commutative elimination. Concerning the analysis of sequences, the recent trends, which will be pursued in the next years, concern the study of algorithms that search for sets of words or regular ex- pression, and the extension to the (more realistic) Markovian model of analysis performed under the Bernoulli model. Biological applications reveal how crucial time and memory complexity of the algorithms are. Thus part of our research will aim at the efficient con- struction of minimal automata (with M. Raffinot) in the case of regular expressions and on the set of words neighboring a given word (with U. Bordeaux). 2.4 Evolution of the objectives The long-term directions have not changed. The unexpected discovery of the LogLog algorithm for approximate counting has fo- cused some of our energy. The hiring of Alin Bostan has allowed us to concentrate more on efficiency issues in computer algebra. 2.5 Objective 1 (Analysis of algorithms): Executive summary The planned development of the book Analytic Combinatorics has been progressing at a fair pace, though the task of synthesizing a domain of such amplitude is quite labour- intensive. Random generation by means of Boltzmann models has been advancing and is gradually finding applications developed by partner teams in computational biology and software testing. Progress has been made in the analysis of phase transitions, in particular in connection with percolation on the random graph, which provides an entry point to the analysis of several hard optimization problems (e.g., constraint satisfaction). The perfection of the standard theory of analytic combinatorics has benefited from side studies related to urn processes, tree models, continued fractions, interconnection graphs, complex asymptotics, etc. New applications have been developed in the area of data com- pression (classical and new algorithms, e.g., by antidictionary), hashing, boolean function complexity, coding theory, and pattern occurrences in the context of intrusion detection. An unexpected development has been the revival of interest for probabilistic counting al- gorithms, which are currently the object of intense study by the network monitoring and data mining communities: the LogLog algorithm is currently the absolutely best algorithm available on the market (in terms of efficiency and robustness) for the fundamental task of estimating cardinalities