LIDS – a BRIEF HISTORY Alan S
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Reinforcement Learning and Optimal Control
Reinforcement Learning and Optimal Control by Dimitri P. Bertsekas Massachusetts Institute of Technology WWW site for book information and orders http://www.athenasc.com Athena Scientific, Belmont, Massachusetts Athena Scientific Post Office Box 805 Nashua, NH 03060 U.S.A. Email: [email protected] WWW: http://www.athenasc.com Cover photography: Dimitri Bertsekas c 2019 Dimitri P. Bertsekas All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. Publisher’s Cataloging-in-Publication Data Bertsekas, Dimitri P. Reinforcement Learning and Optimal Control Includes Bibliography and Index 1. Mathematical Optimization. 2. Dynamic Programming. I. Title. QA402.5.B4652019 519.703 00-91281 ISBN-10: 1-886529-39-6, ISBN-13: 978-1-886529-39-7 ABOUT THE AUTHOR Dimitri Bertsekas studied Mechanical and Electrical Engineering at the National Technical University of Athens, Greece, and obtained his Ph.D. in system science from the Massachusetts Institute of Technology. He has held faculty positions with the Engineering-Economic Systems Department, Stanford University, and the Electrical Engineering Department of the Uni- versity of Illinois, Urbana. Since 1979 he has been teaching at the Electrical Engineering and Computer Science Department of the Massachusetts In- stitute of Technology (M.I.T.), where he is currently McAfee Professor of Engineering. Starting in August 2019, he will also be Fulton Professor of Computational Decision Making at the Arizona State University, Tempe, AZ. Professor Bertsekas’ teaching and research have spanned several fields, including deterministic optimization, dynamic programming and stochas- tic control, large-scale and distributed computation, and data communi- cation networks. -
Sampling and Inference in Complex Networks Jithin Kazhuthuveettil Sreedharan
Sampling and inference in complex networks Jithin Kazhuthuveettil Sreedharan To cite this version: Jithin Kazhuthuveettil Sreedharan. Sampling and inference in complex networks. Other [cs.OH]. Université Côte d’Azur, 2016. English. NNT : 2016AZUR4121. tel-01485852 HAL Id: tel-01485852 https://tel.archives-ouvertes.fr/tel-01485852 Submitted on 9 Mar 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. École doctorale STIC Sciences et Technologies de l’Information et de la Communication Unité de recherche: INRIA (équipe Maestro) Thèse de doctorat Présentée en vue de l’obtention du grade de Docteur en Sciences de l’UNIVERSITE COTE D’AZUR Mention : Informatique par Jithin Kazhuthuveettil Sreedharan Sampling and Inference in Complex Networks (Échantillonnage et Inférence dans Réseaux Complexes) Dirigé par Konstantin Avrachenkov Soutenue le 2 décembre 2016 Devant le jury composé de: Konstantin Avrachenkov - Inria, France Directeur Nelly Litvak - University of Twente, The Netherlands Rapporteur Don Towsley - University of Massachusetts, USA Rapporteur Philippe Jacquet - Nokia Bell Labs, France Examinateur Alain Jean-Marie - Inria, France Président Abstract The recent emergence of large evolving networks, mainly due to the rise of Online Social Networks (OSNs), brought out the difficulty to gather a complete picture of a network and it opened up the development of new distributed techniques. -
Controlo'2000
VENUE AND SCOPE (1 000 PTE ≈ 5 Euros) CONTROLO’2000 Before August 15, 2000 After August 15, 2000 The Portuguese Association of Automatic Control Member of APCA: Member of APCA: (APCA) invites researchers, professionals, students, Full fee 35 000 PTE Full fee 45 000 PTE th Reduced fee* 17 500 PTE Reduced fee* 22 500 PTE 4 PORTUGUESE CONFERENCE industrialists and businessmen to participate in the th ON AUTOMATIC CONTROL CONTROLO’2000: 4 PORTUGUESE-INTERNATIONAL Non-member of APCA: Non-member of APCA: CONFERENCE ON AUTOMATIC CONTROL, which will be held Full fee 45 000 PTE Full fee 55 000 PTE at the Campus of Azurém of the University of Minho, Reduced fee* 22 500 PTE Reduced fee* 27 500 PTE in the town of Guimarães. Final Announcement The conference is intended as an international forum *Reduced fee: students under 25. and where an effective exchange of knowledge and experience amongst researchers active in various An online pre-registration form is available at the internet (http://www.dei.uminho.pt/controlo2000) Call for Papers theoretical and applied areas of systems and control can take place. This is also an excellent opportunity LOCAL ORGANIZING COMMITTEE to promote the Portuguese control community and stimulate the establishment of new international C. Couto (Chairman) (Dep. Industrial Electronics) links. E. Bicho (Dep. Industrial Electronics) 4-6 October 2000 The program will include an ample space for E. Ferreira (Dep. Biological Eng.) promoting the implementation of new technologies, J. Ferreira da Silva (Dep. Mechanical Eng.) emphasizing simultaneously the real-world F. Fontes (Dep. Mathematics) challenges in their applications. -
Madeleine Udell's Thesis
GENERALIZED LOW RANK MODELS A DISSERTATION SUBMITTED TO THE INSTITUTE FOR COMPUTATIONAL AND MATHEMATICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY Madeleine Udell May 2015 c Copyright by Madeleine Udell 2015 All Rights Reserved ii I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Professor Stephen Boyd) Principal Adviser I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Professor Ben Van Roy) I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy. (Professor Lester Mackey) Approved for the Stanford University Committee on Graduate Studies iii Abstract Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. This dissertation extends the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compress- ing, denoising, and imputing missing entries across all data types simultaneously. -
My Berkeley Years: Control at Cal, 1955-1961 by Michael Athans
TALESof AUTOMATIC CONTROL My Berkeley Years: Control at Cal, 1955-1961 By Michael Athans beauty of physics without having to struggle with multi- Tales of Automatic Control features personal reminis- ple integrals and other horrid math stuff. cences of personalities and events in modern control Toward the middle of my freshman year, I started hav- science and engineering. Readers who have stories to ing doubts about being a chemistry major. Apart from share are encouraged to get in touch with the Edi- causing a couple of small explosions in the chemistry lab, tor-in-Chief, [email protected]. I felt there were too many chemists in Greece. So I decided to become an engineer. I certainly did not have the foggi- was fortunate to have been an undergraduate and est idea what kind of an engineer I wanted to be, but me- chanical engineer sounded like a good choice that would graduate student at the University of California at be marketable in Greece. In their infinite wisdom, the peo- Berkeley (Cal for short) during the initial period of the I ple in the College of Engineering assigned an academic ad- modern control theory revolution. I shall recount some of visor to help me with course selection. He was Hans the events that took place at Cal during that time and rem- Albert Einstein, a professor of hydraulic engineering (and inisce about my teachers and colleagues in the early days the son of Albert Einstein). I was flabbergasted, to say the of optimal control theory. I have tried to interlace my per- least, but it turned out that he was an ordinary and kind sonal experiences with the educational, research, and so- human being. -
Nash Q-Learning for General-Sum Stochastic Games
Journal of Machine Learning Research 4 (2003) 1039-1069 Submitted 11/01; Revised 10/02; Published 11/03 Nash Q-Learning for General-Sum Stochastic Games Junling Hu [email protected] Talkai Research 843 Roble Ave., 2 Menlo Park, CA 94025, USA Michael P. Wellman [email protected] Artificial Intelligence Laboratory University of Michigan Ann Arbor, MI 48109-2110, USA Editor: Craig Boutilier Abstract We extend Q-learning to a noncooperative multiagent context, using the framework of general- sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restric- tions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learn- ing performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance. -
Mathematics People
NEWS Mathematics People or up to ten years post-PhD, are eligible. Awardees receive Braverman Receives US$1 million distributed over five years. NSF Waterman Award —From an NSF announcement Mark Braverman of Princeton University has been selected as a Prizes of the Association cowinner of the 2019 Alan T. Wa- terman Award of the National Sci- for Women in Mathematics ence Foundation (NSF) for his work in complexity theory, algorithms, The Association for Women in Mathematics (AWM) has and the limits of what is possible awarded a number of prizes in 2019. computationally. According to the Catherine Sulem of the Univer- prize citation, his work “focuses on sity of Toronto has been named the Mark Braverman complexity, including looking at Sonia Kovalevsky Lecturer for 2019 by algorithms for optimization, which, the Association for Women in Math- when applied, might mean planning a route—how to get ematics (AWM) and the Society for from point A to point B in the most efficient way possible. Industrial and Applied Mathematics “Algorithms are everywhere. Most people know that (SIAM). The citation states: “Sulem every time someone uses a computer, algorithms are at is a prominent applied mathemati- work. But they also occur in nature. Braverman examines cian working in the area of nonlin- randomness in the motion of objects, down to the erratic Catherine Sulem ear analysis and partial differential movement of particles in a fluid. equations. She has specialized on “His work is also tied to algorithms required for learning, the topic of singularity development in solutions of the which serve as building blocks to artificial intelligence, and nonlinear Schrödinger equation (NLS), on the problem of has even had implications for the foundations of quantum free surface water waves, and on Hamiltonian partial differ- computing. -
Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming
May 2015 Report LIDS-P-3174 Value and Policy Iteration in Optimal Control and Adaptive Dynamic Programming Dimitri P. Bertsekasy Abstract In this paper, we consider discrete-time infinite horizon problems of optimal control to a terminal set of states. These are the problems that are often taken as the starting point for adaptive dynamic programming. Under very general assumptions, we establish the uniqueness of solution of Bellman's equation, and we provide convergence results for value and policy iteration. 1. INTRODUCTION In this paper we consider a deterministic discrete-time optimal control problem involving the system xk+1 = f(xk; uk); k = 0; 1;:::; (1.1) where xk and uk are the state and control at stage k, lying in sets X and U, respectively. The control uk must be chosen from a constraint set U(xk) ⊂ U that may depend on the current state xk. The cost for the kth stage is g(xk; uk), and is assumed nonnnegative and real-valued: 0 ≤ g(xk; uk) < 1; xk 2 X; uk 2 U(xk): (1.2) We are interested in feedback policies of the form π = fµ0; µ1;:::g, where each µk is a function mapping every x 2 X into the control µk(x) 2 U(x). The set of all policies is denoted by Π. Policies of the form π = fµ, µ, : : :g are called stationary, and for convenience, when confusion cannot arise, will be denoted by µ. No restrictions are placed on X and U: for example, they may be finite sets as in classical shortest path problems involving a graph, or they may be continuous spaces as in classical problems of control to the origin or some other terminal set. -
The ACC 2009 Awards Ceremony Program, Including Lists of Past
AAOO The 2009 AACC Awards Ceremony AMERICAN AUTOMATIC CONTROL COUNCIL AIAA, AIChE, AIST, ASCE, ASME, IEEE, ISA, SCS 2009 American Control Conference St. Louis, Missouri, USA June 11, 2009 AWARDS PROGRAM 2009 ACC BEST STUDENT-PAPER AWARD Finalists and Winner Announced during Awards Ceremony O. HUGO SCHUCK BEST PAPER AWARD Robert D. Gregg and Mark W. Spong, “Reduction-based Control with Application to Three-Dimensional Bipedal Walking Robots” K. Stegath, N. Sharma, C. M. Gregory, and W. E. Dixon, “Nonlinear Tracking Control of a Human Limb via Neuromuscular Electrical Stimulation” DONALD P. ECKMAN AWARD Paulo Tabuada CONTROL ENGINEERING PRACTICE AWARD Suresh M. Joshi JOHN R. RAGAZZINI EDUCATION AWARD George Stephanopoulos RICHARD E. BELLMAN CONTROL HERITAGE AWARD George Leitmann PAST RECIPIENTS DONALD P. ECKMAN AWARD 1964 Michael Athans 1965 John Bollinger 1966 Roger Bakke 1967 Roger Brockett 1968 Robert E. Larson 1969 W. Harmon Ray 1970 John Seinfeld 1971 Raman Mehra 1972 Cecil L. Smith 1973 Edison Tse 1974 Timothy L. Johnson 1975 Alan S. Willsky 1976 Robert W. Atherton 1977 Nils R. Sandell, Jr. 1978 Narendra K. Gupta 1979 Joe Hong Chow 1980 Manfred Morari 1981 Rajan Suri 1982 Bruce Hajek 1983 John C. Doyle 1984 Mark A. Shayman 1985 P. R. Kumar 1986 Yaman Arkun 1987 R. Shoureshi 1988 Bijoy K. Ghosh 1989 P. P. Khargonekar 1990 Shankar S. Sastry 1991 Carl N. Nett 1992 Stephen P. Boyd 1993 Munther Dahleh 1994 Kameshwar Poolla 1995 Andrew Packard 1996 Jeff S. Shamma 1997 R. M. Murray 1998 I. Kanellakopoulos 1999 Andrew R. Teel 2000 Richard D. Braatz 2001 Dawn M. -
SURVEY of DECENTRALIZED CONTROL METHODS* Michael
SURVEY OF DECENTRALIZED CONTROL METHODS* Michael Athans** Massachusetts Institute of Technology Cambridge, Massachusetts INTRODUCTION This paper presents an overview of the types of problems that are being considered by control theorists in the area of dynamic large-scale systems with emphasis on decentralized control strate- gies. A similar paper by Varaiya (ref. 1) indicated the interplay between static notions drawn from the mathematical economics, management, and programming areas and the attempts by control theorists to extend the static notions into the stochastic dynamic case. In this paper, we shall not elaborate upon the dynamic or team aspects of large-scale systems. Rather we shall concentrate on approaches that deal directly with decentralized decision making for large-scale systems. Although a survey paper, the number of surveyed results is relatively small. This is due to the fact that there is still not a unified theory for decentralized control. What is available is a set of individual contributions that point out both "blind alleys" as well as potentially fruitful approaches. What we shall attempt to point out is that future advances in decentralized system theory are intimately connected with advances in the soK;alled stochastic control problem with nonclassical information pattern. To appreciate how this problem differs from the classical stochastic control problem, it is useful to briefly summarize the basic assumptions and mathematical tools associated with the latter. This is done in section 2. Section 3 is concerned with certain pitfalls that arise when one attempts to impose a decentralized structure at the start, but the mathematics "wipes out" the original intent. -
INFORMS OS Today 5(1)
INFORMS OS Today The Newsletter of the INFORMS Optimization Society Volume 5 Number 1 May 2015 Contents Chair’s Column Chair’s Column .................................1 Nominations for OS Prizes .....................25 Suvrajeet Sen Nominations for OS Officers ...................26 University of Southern California [email protected] Featured Articles A Journey through Optimization Dear Fellow IOS Members: Dimitri P. Bertsekas .............................3 It is my pleasure to assume the role of Chair of Optimizing in the Third World the IOS, one of the largest, and most scholarly Cl´ovis C. Gonzaga . .5 groups within INFORMS. This year marks the From Lov´asz ✓ Function and Karmarkar’s Algo- Twentieth Anniversary of the formation of the rithm to Semidefinite Programming INFORMS Optimization Section, which started Farid Alizadeh ..................................9 with 50 signatures petitioning the Board to form a section. As it stands today, the INFORMS Local Versus Global Conditions in Polynomial Optimization Society has matured to become one of Optimization the larger Societies within INFORMS. With seven Jiawang Nie ....................................15 active special interest groups, four highly coveted Convergence Rate Analysis of Several Splitting awards, and of course, its strong presence in every Schemes INFORMS Annual Meeting, this Society is on a Damek Davis ...................................20 roll. In large part, this vibrant community owes a lot to many of the past officer bearers, without whose hard work, this Society would not be as strong as it is today. For the current health of our Society, I wish to thank my predecessors, and in particular, my immediate predecessor Sanjay Mehrotra under whose leadership, we gathered the momentum for our conferences, a possible new journal, and several other initiatives. -
(URS). EDRS Is Not * Responsible for the Quality of the Original Document
DOCUMENT RESUME ED 119 723 IR 003 195 TITLE 16mm Film and Videotape'Lectures and Demonstrations. 1976/1977 COalog. INSTITUTION Massachusetts Inst. of Tech., Cambridge. Center for Advanced Engineering Study. PUB DATE 76 NOTE 101p. AVAILABLE FROM Massachusetts Institute of Technology, Center for Advanced Engineering Study, Cambridge, Massachusetts 02139 EDRS PRICE MF-$0.83 HC -$6.O1 Plus Postage DESCRIPTORS Artificial Intelligence; Calculus; *Catalogs; Chemistry; Computer Science; Economics; Electronics; Engineering; Experiments; Higher Education; *Instructional Films; Mathematics; Physics; Probability Theory; *Sciences; Statistics; Systems Analysis; Thermodynamics; *Video Tape Recordings IDENTIFIERS Massachusetts Institute of Technology ABSTEACT The Massachusetts Institute of Technology provides a catalog of 16mm filmed and videotaped lectures and demonstrations. Each listing includes title, short description, length of presentation, catalog number, purchase and rental prices, and indications as to whether the item is film or videotape and black-and-white or color. The catalog is divided into 17 categories: artificial intelligence; calculus; colloid and surface chemistry; computer languages; digital signal processing; economics; engineering economy; friction, wear, and lubrication; introduction to experimentation; mechanics of polymer processing; modern control theory; network analysis and design; nonlinear vibrations; probability; random processes; thermostatics and thermodynamics; and special programs. Ordering information and forms