INFORMS OS Today The Newsletter of the INFORMS Optimization Society

Volume 1 Number 1 March 2011

Contents

Chair's Column ...... 1
Nominations for OS Prizes ...... 13
Nominations for OS Officers ...... 14
Announcing the 4th OS Conference ...... 15

Featured Articles

Hooked on Optimization ...... 2
First Order Methods for Large Scale Optimization: Error Bounds and Convergence Analysis, Zhi-Quan Luo ...... 7
Probability Inequalities for Sums of Random Matrices and Their Applications in Optimization, Anthony Man-Cho So ...... 10
Fast Multiple Splitting Algorithms for Convex Optimization, Shiqian Ma ...... 11

Send your comments and feedback to the Editor:
Shabbir Ahmed
School of Industrial & Systems Engineering
Georgia Institute of Technology, Atlanta, GA 30332
([email protected])

Chair's Column

Jon Lee
IBM T.J. Watson Research Center
Yorktown Heights, NY 10598
([email protected])

With this inaugural issue, the INFORMS Optimization Society is publishing a new newsletter: "INFORMS OS Today." It is a great pleasure to welcome Shabbir Ahmed ([email protected]) as its Editor. Our plan is to start by publishing one issue each Spring. At some point we may add a Fall issue. Please let us know what you think about the newsletter.

The current issue features articles by the 2010 OS prize winners: George Nemhauser (Khachiyan Prize for Life-time Accomplishments in Optimization), Zhi-Quan (Tom) Luo (Farkas Prize for Mid-career Researchers), Anthony Man-Cho So (Prize for Young Researchers), and Shiqian Ma (Student Paper Prize). Each of these articles describes the prize-winning work in a compact form.

Also in this issue, we have announcements of key activities for the OS: calls for nominations for the 2011 OS prizes, a call for nominations of candidates for OS officers, and the 2012 OS Conference (to be held at the University of Miami, February 24–26, 2012). I want to emphasize that one of our most important activities is our participation at the annual INFORMS meetings, the next one being at the Charlotte Convention Center, Charlotte, North Carolina, November 13–16, 2011. Our participation at that meeting is centered on the OS sponsored clusters. The clusters and cluster chairs for that meeting mirror our list of Vice Chairs:

• Steven Dirkse, Computational Optimization and Software ([email protected])
• Oleg A. Prokopyev, Global Optimization ([email protected])
• Oktay Günlük, Integer Programming ([email protected])
• Miguel Anjos, Linear Programming and Complementarity ([email protected])
• Mauricio Resende, Networks ([email protected])
• Frank E. Curtis, Nonlinear Programming ([email protected])
• Huseyin Topaloglu, Stochastic Programming ([email protected])

Their hard work is the means by which we have a strong presence within INFORMS — and we should have a strong presence which reflects the OS membership of about 1000! Please contact any of the Vice Chairs to get involved.

Additionally for the Charlotte meeting, jointly with the INFORMS Computing Society, the OS is co-sponsoring a mini-cluster on "Surrogate and derivative-free optimization." Please contact either of the co-chairs for this cluster, Nick Sahinidis ([email protected]) and Christine Shoemaker ([email protected]), to get involved.

Pietro Belotti is the new OS webmaster, and he will be pleased to get your feedback on our revamped website: www.informs.org/Community/Optimization-Society.

Finally, I would like to take this opportunity to thank our Most-Recent Past Chair Nick Sahinidis, and our Secretary/Treasurer Marina Epelman, for making my job as Chair an enjoyable experience. All of the OS officers and I look forward to seeing you next at Charlotte in November — in particular, at the OS Business Meeting, which is always a great opportunity to have a couple of glasses of wine and catch up with members of our community.

Hooked on Optimization

George L. Nemhauser
Industrial and Systems Engineering
Georgia Institute of Technology, Atlanta GA 30332
([email protected])

This article is based on a presentation I gave at a session sponsored by the Optimization Society at the INFORMS 2010 annual meeting, where I had the great honor of being chosen as the first recipient of the Khachiyan Prize for Life-time Accomplishments in Optimization. I would like to thank the prize committee members Martin Grötschel, …, Panos Pardalos and Tamás Terlaky, and my friends and colleagues Bill Cook, Gérard Cornuéjols and Bill Pulleyblank, who nominated me.

I was first introduced to optimization in a course in operations research taught by Jack Mitten when I was an M.S. student in chemical engineering in 1958. Chemical processes are a great source of optimization problems, but at that time not much was being done except for the use of linear programming for blending problems in the petro-chemical industry. When I learned about dynamic programming in this course and saw the schematic diagram of a multi-time period inventory problem, I realized that it sort of looked like a flow diagram for a multi-stage chemical process, which I would now think of as a path in a graph. The difference was that the serial structure would not suffice for many chemical processes. We needed more general graph structures that included feedback loops or directed cycles. So Jack Mitten, also a chemical engineer by training, and I embarked on how to extend dynamic programming type decomposition to more general structures. I realized that I needed to learn more about optimization than fluid flow, so after completing my Master's degree I switched to the I.E. department, with its fledgling program in operations research, for my PhD, and this work on dynamic programming became my thesis topic. I quickly learned about the trials and tribulations of research and publication.
As my work was about to be completed, I submitted a paper on it and was informed rather quickly that there was another paper in process on the same topic written by the very distinguished chemical engineer/mathematician Rutherford (Gus) Aris, a professor at the University of Minnesota. Aris was an Englishman, and a true scholar and gentleman. He was also a noted calligrapher and a professor of classics. After recovering from the initial shock of this news, I carefully read Aris' paper on dynamic programming with feedback loops, and discovered a fundamental flaw in his algorithm. It took some time to convince him of this, and we ultimately co-authored a paper [2], which was essentially my algorithm. This was a good introduction to the fact that the academic world was not an ivory tower. It also motivated my first book, Introduction to Dynamic Programming [19].

In 1961 I became an Assistant Professor in the department of Operations Research and Industrial Engineering at Johns Hopkins University. My first student, Bill Hardgrave, and I were trying to understand why the traveling salesman problem (TSP) seemed so much harder to solve than the shortest path problem (SPP). We thought we might have had a breakthrough when we figured out how to reduce the TSP to the longest path problem (LPP). Perhaps this was an early example of a polynomial reduction. But in our minds it was not, as is currently done, for the purpose of proving NP-hardness, a concept that would only appear more than a decade later. Our idea was that the LPP was closer to the SPP, so maybe we could solve the TSP by finding an efficient algorithm for the LPP.

Being engineers, we liked the notion of physical models, and especially the string model of Minty for solving the SPP: pull from both ends, and the first path to become tight is a shortest path. For the longest path, build the graph with rubber bands instead of string, pull from both ends, and the last rubber band to remain slack is an edge of the longest path. Now you could proceed recursively. We actually built these rubber band models for graphs with up to about eight nodes, and our conjecture seemed to hold for our examples. However, we were never able to prove the conjecture, as we were unable to describe the physical model mathematically. Nevertheless, the work was published in Operations Research [15], my first of many papers in the journal of which I became the editor several years later. We also submitted the paper to the 4th International Symposium on Mathematical Programming held in 1962, but it was rejected. So I didn't get to one of these meetings until the 8th one in 1973. But I haven't missed one since then, and I wonder if I hold the record for consecutive attendance at these conferences.

Being an engineer, I tended to learn about optimization techniques and the underlying mathematics only as I needed to solve problems. My first real involvement with integer programming (IP), other than teaching a little about cutting planes and branch-and-bound, came from a seminar presentation on school districting. This was around the time when congressional districting at the state level was being examined by the federal courts. Many states (my current home state of Georgia was one of the worst) had congressional districts that were way out of balance with respect to the principle of "one man, one vote." The average congressional district should have had a population of around 400,000 then, but in Georgia they ranged from about 100,000 to nearly 800,000. The Supreme Court had just declared such districting plans to be unconstitutional, and redistricting was a hot topic. I had recently read a paper on solving vehicle routing problems using a set partitioning model, and it was easy to see that set partitioning was the right model for districting. The difficulty was to describe feasible districts by population, compactness, natural boundaries and possibly political requirements. These districts would be constructed from much smaller population units, such as counties in rural areas and smaller geographical units in urban areas. The cost of a district would be its population deviation from the mean given by the state's population divided by the number of congressional districts, i.e., the population of a perfect district. Given the potential districts, we then have the set partitioning problem of choosing a minimum cost set of districts such that each population unit is in exactly one district.
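In today's notation, the model just described can be written schematically as the following set partitioning integer program; this display is an editorial illustration, and the exact formulation used in [9] may differ:

$$\min \ \sum_{j} c_j x_j \quad \text{s.t.} \quad \sum_{j:\, i \in D_j} x_j = 1 \ \ \text{for every population unit } i, \qquad x_j \in \{0,1\},$$

where $D_j$ is the $j$-th pre-generated feasible district, $c_j$ is the absolute deviation of its population from the ideal district population, and $x_j = 1$ means district $j$ is selected.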
[Photo: George L. Nemhauser]

My student Rob Garfinkel worked on this districting problem for his doctoral research. Rob had been a programmer before beginning his graduate studies, and he made great use of his machine language programming skills in implementing our implicit enumeration algorithm [9]. Of course, our solutions were suboptimal since we generated all of our candidate districts up front. Branch-and-price hadn't entered my mind then.

Soon after that work was completed, in 1969 I left Hopkins for Cornell, determined to learn more and do more IP. I was convinced that IP modeling was very robust for operations research applications but the models were hard to solve. One of the difficulties in getting students involved in integer programming was the absence of an IP book that could serve as a text for graduate students and also as a reference for researchers. Fortunately, when I moved to Cornell, Garfinkel took a position at the University of Rochester, only 90 miles from Ithaca, and we began collaborating on our book Integer Programming, which was published in 1972 [10].

Also, motivated by the set partitioning work, I got my first NSF grant in IP, which was to study theory and algorithms for set packing, partitioning and covering. I worked with my student Les Trotter on polyhedral structure and algorithms for the node packing and independent set problem [22]. Our work was influenced by Ray Fulkerson, who joined the Cornell faculty a couple of years after I did and taught us, among other things, the beautiful theory connecting perfect graphs and the class of IPs known as node packing. While Ray was not computationally oriented, he had insight on what made integer programs hard and gave us the family of Steiner triple set covering problems that are now part of the MIPLIB library of IP test problems and still remain difficult to solve [8]. Ray also started the cooperation between Cornell and Waterloo, where Jack Edmonds and his students were doing exciting work on matroids, matching and other combinatorial optimization problems. I first met Bill Cunningham, Bill Pulleyblank and Vašek Chvátal when they were graduate students at Waterloo.

I became interested in transportation and especially using optimization to solve equilibrium problems of traffic assignment. Together with my student Deepak Merchant, we wrote the first papers on dynamic traffic assignment [18].
Marshall Fisher visited Cornell in 1974, and together with my student Gérard Cornuéjols, we worked on a facility location problem that was motivated by a problem of locating bank accounts so that companies could maximize the time they could hold money they owed, and hence their cash flow. I think this was some of the early work on the analysis of approximation algorithms that is now so popular, and we were fortunate to receive the Lanchester prize for a paper we published on it [5].

I began my long and very fruitful collaboration with Laurence Wolsey in the fall of 1975 when I was the research director of the Center for Operations Research and Econometrics (CORE) at the University of Louvain. In generalizing the facility location analysis, Laurence, Marshall and I figured out that submodularity was the underlying principle [7]. Laurence and I wrote several papers during this period, but nothing on computational integer programming. I think that was typical of the 1970s. We just didn't have the software or the hardware to experiment with computational ideas. At the time I think it was fair to say that from a practical standpoint branch-and-bound was it.

However, beginning in the late 1970s and early 1980s, a new era of computational integer programming, based on integrating polyhedral combinatorics with branch-and-bound, was born. As Laurence Wolsey and I saw, and participated in, these exciting developments, we thought that the field needed a new integer programming book so that researchers would have good access to the rapid changes that were taking place. I think we first talked about this project in the late 1970s, but I was chairman of the OR department at Cornell and didn't have the time until I had my sabbatical leave in 1983-84 at CORE. We worked very hard on the book that academic year, not realizing that it would take us five more years to complete it [23]. We were honored when in 1989 it received the Lanchester prize. We tried not to get too distracted by all of the unsolved problems that arise in the process of putting results from papers into a book. But we couldn't resist studying the basic question of finding, for mixed-integer problems, the analog of Chvátal's rounding for pure integer programming. Solving that question led to what we called mixed-integer rounding, which has become a key tool for deriving cuts based on structure for mixed-integer programs [24].
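For readers who have not seen it, the basic mixed-integer rounding inequality can be stated as follows (a standard textbook form, quoted here for context rather than taken from [24]): for the simple set $\{(x, y) \in \mathbb{Z} \times \mathbb{R}_+ : x + y \ge b\}$ with fractional part $f = b - \lfloor b \rfloor > 0$, every feasible $(x, y)$ satisfies

$$x + \frac{y}{f} \ \ge\ \lceil b \rceil.$$

Applied to suitable single-row relaxations, this one inequality generates the structured cuts for mixed-integer programs mentioned above.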
After returning from CORE, I spent one more year at Cornell and then moved to Georgia Tech. By this time I really wanted to work on some real problems and to apply integer programming in practice. Ellis Johnson came to Tech soon thereafter, having played a leadership role in the development of IBM's OSL code for mixed integer programming. Thanks to Ellis, IBM provided support for us to teach MIP short courses to people from industry using OSL. This opened up tremendous research opportunities with industry and led to long-term relations with several airlines and other companies, mainly with supply chain optimization problems [1, 14, 16, 26]. These projects with industry generated many PhD dissertations, including those of Pam Vance, Dan Adelman, Greg Glockner, Ladislav Lettowsky, Diego Klabjan, Andrew Schaefer, Jay Rosenberger and Jeff Day. Martin Savelsbergh and Cindy Barnhart joined our faculty and participated actively in our collaborations with industry. Some of this work led us to thinking about column generation in integer programming and to branch-and-price [4]. During this period, as Georgia Tech Faculty Representative for Athletics, I got involved with Mike Trick in sports scheduling. Our first work was scheduling basketball for the Atlantic Coast Conference [21]. Later on we formed a company called the Sports Scheduling Group, and we now, together with my student Kelly Easton, schedule major league baseball as well as several other sports conferences.

OSL had a much better MIP solver than earlier commercial codes, using some of the new technology like lifted cover cuts. But the ability to experiment with new ideas for cuts, branching, preprocessing and primal heuristics was still missing. My student Gabriel Sigismondi and I began the development of a MIP research code that was a branch-and-bound shell that would call an LP solver and would give the user all of the LP information needed to write subroutines for cuts, branching, etc. Martin Savelsbergh joined us from the Netherlands as a postdoc and became the key person in the creation of our research code called MINTO [20], which for many years was the code widely used by researchers in integer programming to evaluate the implementation of their theoretical results. Many of the ideas from MINTO, developed with Martin, were incorporated in the earlier releases of CPLEX's MIP code. In fact, our student Zhonghau Gu became the key person in the development of the MIP code for subsequent releases of CPLEX and GUROBI. Several of my Georgia Tech students, including Alper Atamtürk, Ismael de Farias, Zhonghau Gu, Andrew Miller, Jean-Philippe Richard and Juan Pablo Vielma, produced many new results in integer programming [3, 6, 11, 25, 27].

More recently, I have been collaborating with my colleague Shabbir Ahmed. Together with students Yongpei Guan, Jim Luedtke and Juan Pablo Vielma, we have been making some progress on polyhedral aspects of stochastic integer programming [12, 13, 17].

I have been extremely fortunate throughout my academic career to have worked with an incredible group of PhD students and postdocs. I owe them a great debt, and they help keep me a young 73. So I would like to end this note by recognizing all of my graduated PhD students and postdoctoral associates by listing all of their names.

PHD STUDENTS

1961-69 Johns Hopkins (11): Bill Hardgrave, Hugh Bradley, Mike Thomas, Zev Ulmann, Gil Howard, Hank Nuttle, Dennis Eklof, Robert Garfinkel, Gus Widhelm, Joe Bowman, Sherwood Frey.

1970-85 Cornell (15): Les Trotter, Jim Fergusson, Deepak Merchant, Pierre Dejax, Mike Ball, Glenn Weber, Gérard Cornuéjols, Wen-Lian Hsu, Yoshi Ikura, Gerard Chang, Ronny Aboudi, Victor Masch, Russ Rushmeier, Gabriel Sigismondi, Sung-Soo Park.

1985-2010 Georgia Tech (29): Heesang Lee, Anuj Mehrotra, Pam Vance, Erick Wikum, John Zhu, Zhonghau Gu, Y. Wang, Ismael de Farias, Dasong Cao, Dan Adelman, Greg Glockner, Alper Atamtürk, Ladislav Lettowsky, Diego Klabjan, Dieter Vandenbussche, Ahmet Keha, Yongpei Guan, Yetkin Ileri, Jay Rosenberger, Jean Philippe Richard, Kelly Easton, Jeff Day, James Luedtke, Faram Engineer, Renan Garcia, Michael Hewitt, Gizem Keysan, Alejandro Toriello, Byungsoo Na.

POST DOCTORAL STUDENTS (6): Ram Pandit, Natashia Boland, Petra Bauer, Emilie Danna, Bram Verweij, Menal Güzelsoy.

REFERENCES

[1] D. Adelman and G.L. Nemhauser. Price-directed control of remnant inventory systems. Operations Research, 47:889–898, 1999.

[2] R. Aris, G.L. Nemhauser and D.J. Wilde. Optimization of multistage cycle and branching systems by serial procedures. AIChE Journal, 10:913–919, 1964.

[3] A. Atamtürk, G.L. Nemhauser and M. Savelsbergh. The mixed vertex packing problem. Mathematical Programming, 89:35–54, 2000.

[4] C. Barnhart, E.L. Johnson, G.L. Nemhauser, M. Savelsbergh and P. Vance. Branch-and-price: Column generation for solving huge integer problems. Operations Research, 46:316–329, 1998.

[5] G. Cornuéjols, M.L. Fisher and G.L. Nemhauser. Location of bank accounts to optimize float: An analytic study of exact and approximate algorithms. Management Science, 23:789–810, 1977.

[6] I. de Farias and G.L. Nemhauser. A polyhedral study of the cardinality constrained knapsack problem. Mathematical Programming, 96:439–467, 2003.

[7] M.L. Fisher, G.L. Nemhauser and L.A. Wolsey. An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming, 14:265–294, 1978.

[8] D.R. Fulkerson, G.L. Nemhauser and L.E. Trotter, Jr. Two computationally difficult set covering problems that arise in computing the 1-width of incidence matrices of Steiner triple systems. Mathematical Programming Study, 2:72–81, 1974.

[9] R.S. Garfinkel and G.L. Nemhauser. Optimal political districting by implicit enumeration techniques. Management Science, 16:495–508, 1970.

[10] R.S. Garfinkel and G.L. Nemhauser. Integer Programming. Wiley, 1972.

[11] Z. Gu, G.L. Nemhauser and M. Savelsbergh. Lifted flow cover inequalities for mixed 0-1 integer programs. Mathematical Programming, 85:439–467, 1999.

[12] Y. Guan, S. Ahmed and G.L. Nemhauser. Cutting planes for multi-stage stochastic integer programs. Operations Research, 57:287–298, 2009.

[13] Y. Guan, S. Ahmed, G.L. Nemhauser and A. Miller. A branch-and-cut algorithm for the stochastic uncapacitated lot-sizing problem. Mathematical Programming, 105:55–84, 2006.

[14] C. Hane, C. Barnhart, E.L. Johnson, R. Marsten, G.L. Nemhauser and G. Sigismondi. The fleet assignment problem: Solving a large-scale integer program. Mathematical Programming, 70:211–232, 1995.

[15] W.W. Hardgrave and G.L. Nemhauser. On the relation between the traveling-salesman and the longest-path problems. Operations Research, 10:647–657, 1962.

[16] D. Klabjan, E.L. Johnson and G.L. Nemhauser. Solving large airline crew scheduling problems: Random pairing generation and strong branching. Computational Optimization and Applications, 20:73–91, 2001.

[17] J. Luedtke, S. Ahmed and G.L. Nemhauser. An integer programming approach to linear programs with probabilistic constraints. Mathematical Programming, 122:247–272, 2010.

[18] D. Merchant and G.L. Nemhauser. A model and an algorithm for the dynamic traffic assignment problem. Transportation Science, 12:183–199, 1978.

[19] G.L. Nemhauser. Introduction to Dynamic Programming. Wiley, 1966.

[20] G.L. Nemhauser, M. Savelsbergh and G. Sigismondi. MINTO: A Mixed INTeger Optimizer. Operations Research Letters, 15:47–58, 1994.

[21] G.L. Nemhauser and M. Trick. Scheduling a major college basketball conference. Operations Research, 46:1–8, 1998.

[22] G.L. Nemhauser and L.E. Trotter, Jr. Vertex packings: Structural properties and algorithms. Mathematical Programming, 8:232–248, 1975.

[23] G.L. Nemhauser and L.A. Wolsey. Integer and Combinatorial Optimization. Wiley, 1988.

[24] G.L. Nemhauser and L.A. Wolsey. A recursive procedure to generate all cuts for 0-1 mixed integer programs. Mathematical Programming, 46:379–390, 1990.

[25] J.P. Richard, I. de Farias and G.L. Nemhauser. Lifted inequalities for 0-1 mixed integer programming: Basic theory and algorithms. Mathematical Programming, 98:89–113, 2003.

[26] J. Rosenberger, A. Schaefer, D. Goldsman, E. Johnson, A. Kleywegt and G.L. Nemhauser. A stochastic model of airline operations. Transportation Science, 36:357–377, 2002.

[27] J.P. Vielma, S. Ahmed and G.L. Nemhauser. Mixed-integer models for nonseparable piecewise linear optimization: Unifying framework and extensions. Operations Research, 58:303–315, 2010.

First Order Methods for Large Scale Optimization: Error Bounds and Convergence Analysis

Zhi-Quan Luo
Department of Electrical and Computer Engineering
University of Minnesota, Minneapolis, MN 55455, USA
([email protected])

1. Preamble

Needless to say, I am deeply honored by the recognition of the 2010 Farkas prize. Over the years, my research has been focused on various algorithmic and complexity issues in optimization that are strongly motivated by engineering applications. I think the 2010 Farkas prize is, more than anything, a recognition of this type of interdisciplinary optimization work.

[...] of our collaboration. It is to the fond memories of Paul that I dedicate this article.

In 1989, while I was finishing up my Ph.D. thesis at MIT, Paul Tseng, then a postdoc with Bertsekas, posed an open question to me as a challenge: establish the convergence of the matrix splitting algorithm for convex QP with box constraints. We worked really hard for a few months and eventually settled this question. The set of techniques we developed for solving this problem turned out to be quite useful for a host of other problems and algorithms. This then led to a series of joint papers [1-7] published in the early part of the 1990s in which we developed a general framework for the convergence analysis of first order methods for large scale optimization. This framework is broad, and allows one to establish a linear rate of convergence in the absence of strong convexity. Most interestingly, some of this work has a direct bearing on the $\ell_1$-regularized sparse optimization found in many contemporary applications such as compressive sensing and image processing. In the following, I will briefly outline this algorithmic framework and the main analysis tools (i.e., error bounds), and mention their connections to the compressive sensing applications.

2. A General Framework

Let f : [...]

[...] (e.g., compressive sensing), A is a fat matrix (i.e., $m \ll n$) and the objective function f(x) may not be strictly convex, so there can be multiple optimal solutions for (4).

The main difference between (4) and the problem in Section 2 is the inclusion of a non-smooth term $\|x\|_1$ in the objective function. Can we extend the rate of convergence analysis framework described in Section 2 to this non-smooth setup? The answer is positive. In particular, Paul Tseng in 2001 (before $\ell_1$-minimization became a hot research topic) showed a general result [8] for the coordinate descent method as long as the non-smooth term is separable across variables. Specialized to (4), Paul's result implies that each limit point generated by the coordinate descent algorithm for solving (4) is a global optimal solution. Recently, we have further strengthened this result to show that the entire sequence of iterates converges linearly to a global optimal solution of (4) without any nondegeneracy assumption.

More generally, Paul Tseng considered [9] a class of proximal gradient methods for minimizing over a convex set X an objective function $f(x) = f_1(x) + f_2(x)$, where both $f_1$ and $f_2$ are convex, but $f_2$ might be non-smooth. He showed that as long as the local error bound holds around the optimal solution set, global linear convergence can be expected even in this non-smooth setup. Remarkably, Paul was able to establish a local error bound for the Group Lasso case, where $f(x) = \lambda \sum_{J\in\mathcal{J}} \|x_J\|_2 + \tfrac12\|Ax - b\|^2$. Here $\mathcal{J}$ is a partition of $\{1, 2, \ldots, n\}$ and $x_J$ is a subvector of $x$ with components taken from $J \in \mathcal{J}$. It is likely that this framework can be further extended to other types of non-smooth problems/algorithms used for recovering sparse or low-rank solutions.
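Problem (4) is not reproduced in this excerpt. Purely as an illustration of the kind of method these results cover, here is a minimal sketch (ours, not taken from [8]) of cyclic coordinate descent with soft-thresholding applied to the standard $\ell_1$-regularized least-squares problem $\min_x \tfrac12\|Ax - b\|_2^2 + \lambda\|x\|_1$, whose non-smooth term is separable across variables:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t*|.|: sign(v) * max(|v| - t, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_coordinate_descent(A, b, lam, iters=100):
    """Cyclic coordinate descent for 0.5*||Ax - b||^2 + lam*||x||_1."""
    m, n = A.shape
    x = np.zeros(n)
    r = b - A @ x                      # residual, kept up to date
    col_sq = (A ** 2).sum(axis=0)      # ||A_j||^2 for each column
    for _ in range(iters):
        for j in range(n):
            if col_sq[j] == 0.0:
                continue
            r += A[:, j] * x[j]        # remove coordinate j's contribution
            rho = A[:, j] @ r          # partial correlation with column j
            x[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= A[:, j] * x[j]        # add the updated contribution back
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 200))          # "fat" matrix, m << n
    x_true = np.zeros(200)
    x_true[:5] = 1.0
    b = A @ x_true
    x_hat = lasso_coordinate_descent(A, b, lam=0.1, iters=200)
    print("nonzeros recovered:", np.flatnonzero(np.abs(x_hat) > 1e-3))
```

Each coordinate update is exact for its one-dimensional subproblem, which is what makes the separability assumption in [8] natural for this problem class.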

REFERENCES

[1] Z.-Q. Luo and P. Tseng. Error bound and reduced-gradient projection algorithms for convex minimization over a polyhedral set. SIAM Journal on Optimization, 3:43–59, 1993.

[2] Z.-Q. Luo and P. Tseng. On the convergence rate of dual ascent methods for strictly convex minimization. Mathematics of Operations Research, 18:846–867, 1993.

[3] Z.-Q. Luo and P. Tseng. Error bounds and convergence analysis of feasible descent methods: A general approach. Annals of Operations Research, 46:157–178, 1993.

[4] Z.-Q. Luo and P. Tseng. On the linear convergence of descent methods for convex essentially smooth minimization. SIAM Journal on Control & Optimization, 30(2):408–425, 1992.

[5] Z.-Q. Luo and P. Tseng. Error bound and convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM Journal on Optimization, 2(1):43–54, 1992.

[6] Z.-Q. Luo and P. Tseng. On the convergence of the coordinate descent method for convex differentiable minimization. Journal of Optimization Theory and Applications, 72(1):7–35, 1992.

[7] Z.-Q. Luo and P. Tseng. On the convergence of a matrix splitting algorithm for the symmetric linear complementarity problem. SIAM Journal on Control & Optimization, 29(5):1037–1060, 1991.

[8] P. Tseng. Convergence of block coordinate descent method for nondifferentiable minimization. Journal of Optimization Theory and Applications, 201:473–492, 2001.

[9] P. Tseng. Approximation accuracy, gradient methods, and error bound for structured convex optimization. Mathematical Programming, Series B, 125(2):263–295, 2010.

Probability Inequalities for Sums of Random Matrices and Their Applications in Optimization

Anthony Man-Cho So
Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N. T., Hong Kong
([email protected])

In this article, I will highlight the main ideas in my recent work Moment Inequalities for Sums of Random Matrices and Their Applications in Optimization (accepted for publication in Mathematical Programming, 2009) and discuss some of the recent developments.

The story begins with a groundbreaking work of Nemirovski [4], which revealed a close relationship between the theoretical properties of a host of optimization problems and the behavior of a sum of certain random matrices. Indeed, Nemirovski showed that the construction of so-called safe tractable approximations of certain chance constrained linear matrix inequalities, as well as the analysis of a semidefinite relaxation of certain non-convex quadratic optimization problems, can be achieved by answering the following question:

Question (Q). Let $\xi_1, \ldots, \xi_h$ be independent mean zero random variables, each of which is either (i) supported on [−1, 1], or (ii) normally distributed with unit variance. Furthermore, let $Q_1, \ldots, Q_h$ be arbitrary m × n matrices satisfying $\sum_{i=1}^h Q_iQ_i^T \preceq I$ and $\sum_{i=1}^h Q_i^TQ_i \preceq I$. Under what conditions on t > 0 will we have an exponential decay of the tail probability $\Pr\big(\big\|\sum_{i=1}^h \xi_iQ_i\big\|_\infty \ge t\big)$? Here, $\|A\|_\infty$ denotes the spectral norm of the m × n matrix A.

In a technical tour de force, Nemirovski showed that whenever $t \ge \Omega((m + n)^{1/6})$, the aforementioned tail probability will have an exponential decay. Moreover, he conjectured that the same result would hold for much smaller values of t, namely, for $t \ge \Omega(\sqrt{\ln(m + n)})$. As argued in [4], the threshold $\Omega(\sqrt{\ln(m + n)})$ is in some sense the best one could hope for. Moreover, if the above conjecture is true, then it would immediately imply improved performance guarantees for various optimization problems. Therefore, there is great interest in determining the validity of this conjecture.

As it turns out, the behavior of the matrix-valued random variable $S_h \equiv \sum_{i=1}^h \xi_iQ_i$ has been extensively studied in the functional analysis and probability theory literature. One of the tools that is particularly useful for addressing Nemirovski's conjecture is the so-called Khintchine-type inequalities. Roughly speaking, such inequalities provide upper bounds on the p-norm of the random variable $\|S_h\|_\infty$ in terms of suitable normalizations of the matrices $Q_1, \ldots, Q_h$. Once these bounds are available, it is easy to derive tail bounds for $\|S_h\|_\infty$ using Markov's inequality. In my work, I showed that the validity of Nemirovski's conjecture is in fact a simple consequence of the so-called non-commutative Khintchine's inequality in functional analysis [3] (see also [1]). Using this result, I was then able to obtain the best known performance guarantees for a host of optimization problems.

Of course, this is not the end of the story. In fact, it is clear from recent developments that the behavior of a sum of random matrices plays an important role in the design and analysis of many optimization algorithms. One representative example is the analysis of the nuclear norm minimization heuristic (see, e.g., [6] and the references therein).
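The following small Monte Carlo experiment is an editorial illustration of Question (Q), not material from [4] or the paper being described: it draws Rademacher signs $\xi_i$, normalizes the matrices $Q_i$ as required, and estimates the tail probability of the spectral norm of $S_h = \sum_{i=1}^h \xi_i Q_i$ at multiples of $\sqrt{\ln(m+n)}$.

```python
import numpy as np

def normalized_matrices(h, m, n, rng):
    """Random m x n matrices Q_1..Q_h rescaled so that
    sum_i Q_i Q_i^T <= I and sum_i Q_i^T Q_i <= I in the PSD order."""
    G = [rng.standard_normal((m, n)) for _ in range(h)]
    s1 = np.linalg.norm(sum(g @ g.T for g in G), 2)
    s2 = np.linalg.norm(sum(g.T @ g for g in G), 2)
    scale = np.sqrt(max(s1, s2))
    return [g / scale for g in G]

def tail_estimate(Q, t, trials, rng):
    """Monte Carlo estimate of Pr(||sum_i xi_i Q_i|| >= t), Rademacher xi_i."""
    hits = 0
    for _ in range(trials):
        xi = rng.choice([-1.0, 1.0], size=len(Q))
        S = sum(x * q for x, q in zip(xi, Q))
        hits += np.linalg.norm(S, 2) >= t   # spectral norm
    return hits / trials

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    m, n, h = 30, 40, 200
    Q = normalized_matrices(h, m, n, rng)
    for c in (1.0, 2.0, 3.0):
        t = c * np.sqrt(np.log(m + n))
        print(f"t = {c:.0f}*sqrt(ln(m+n)) = {t:.2f}: "
              f"Pr(||S_h|| >= t) ~ {tail_estimate(Q, t, 200, rng):.3f}")
```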
Furthermore, there has been some effort in developing easy-to-use recipes for deriving tail bounds on the spectral norm of a sum of random matrices [5, 7]. In a recent work [2], we applied such recipes to construct safe tractable approximations of chance-constrained linear matrix inequalities with dependent perturbations. Our results generalize the existing ones in several directions and further demonstrate the relevance of probability inequalities for matrix-valued random variables in the context of optimization.

In summary, recent advances in probability theory have yielded many powerful tools for analyzing matrix-valued random variables. These tools will be invaluable as we tackle matrix-valued random variables that arise quite naturally in many optimization problems.

[Photo: Nick Sahinidis presenting the 2010 Young Researcher Prize to Anthony Man-Cho So]

REFERENCES

[1] A. Buchholz. Operator Khintchine inequality in non-commutative probability. Mathematische Annalen, 319:1–16, 2001.

[2] S.-S. Cheung, A. M.-C. So, and K. Wang. Chance-constrained linear matrix inequalities with dependent perturbations: A safe tractable approximation approach. Preprint, 2011.

[3] F. Lust-Piquard. Inégalités de Khintchine dans C_p (1 < p < ∞). Comptes Rendus de l'Académie des Sciences de Paris, Série I, 303(7):289–292, 1986.

[4] A. Nemirovski. Sums of random symmetric matrices and quadratic optimization under orthogonality constraints. Mathematical Programming, Series B, 109(2–3):283–317, 2007.

[5] R. I. Oliveira. Sums of random Hermitian matrices and an inequality by Rudelson. Electronic Communications in Probability, 15:203–212, 2010.

[6] B. Recht. A simpler approach to matrix completion. Preprint, available at http://arxiv.org/abs/0910.0651, 2009.

[7] J. A. Tropp. User-friendly tail bounds for sums of random matrices. Preprint, available at http://arxiv.org/abs/1004.4389, 2010.

Fast Multiple Splitting Algorithms for Convex Optimization

Shiqian Ma
Department of IEOR, Columbia University
New York, NY 10027
([email protected])

Our paper "Fast Multiple Splitting Algorithms for Convex Optimization" [3] considers alternating direction type methods for minimizing the sum of K (K ≥ 2) convex functions $f_i(x)$, $i = 1,\ldots,K$:

$$\min_{x\in\mathbb{R}^n} \ F(x) \equiv \sum_{i=1}^K f_i(x). \qquad (1)$$

Many problems arising in machine learning, medical imaging and signal processing, etc., can be formulated as minimization problems of the form of (1). For example, the robust principal component analysis problem can be cast as minimizing the sum of two convex functions (see [2]):

$$\min_{X\in\mathbb{R}^{m\times n}} \ \|X\|_* + \rho\,\|M - X\|_1, \qquad (2)$$

where $\|X\|_*$ is the nuclear norm of X, which is defined as the sum of the singular values of X, $\|Y\|_1 := \sum_{ij}|Y_{ij}|$, $M \in \mathbb{R}^{m\times n}$ and ρ > 0 is a weighting parameter. As another example, one version of the compressed sensing MRI problem can be cast as minimizing the sum of three convex functions (see [5]):

$$\min_{x\in\mathbb{R}^n} \ \alpha\, TV(x) + \beta\,\|Wx\|_1 + \tfrac12\|Ax - b\|^2, \qquad (3)$$

where TV(x) is the total variation function, W is a wavelet transform, $A \in \mathbb{R}^{m\times n}$, $b \in \mathbb{R}^m$ and α > 0, β > 0 are weighting parameters. Although these problems can all be reformulated as conic programming problems (e.g., (2) can be reformulated as a semidefinite programming problem and (3) can be reformulated as a second-order cone programming problem), polynomial-time methods such as interior-point methods are not practical for these problems because they are usually of huge size. However, these problems have a common property: although it is not easy to minimize the sum of the $f_i$'s, it is relatively easy to solve problems of the form

$$\min_{x\in\mathbb{R}^n} \ \tau f_i(x) + \tfrac12\|x - z\|_2^2$$

for each of the functions $f_i(x)$, for any τ > 0 and $z \in \mathbb{R}^n$. Thus, alternating direction type methods that split the objective function and minimize each function $f_i$ alternatingly can be effective for these problems.
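To make "relatively easy" concrete: for the two non-smooth terms appearing in (2) and (3), the proximal subproblem above has a well-known closed-form solution. The sketch below is ours, not code from [3]; it shows the two standard formulas, entrywise soft-thresholding for an $\ell_1$ term and singular value thresholding for a nuclear-norm term.

```python
import numpy as np

def prox_l1(Z, tau):
    """argmin_X  tau*||X||_1 + 0.5*||X - Z||_F^2  (entrywise soft-thresholding)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

def prox_nuclear(Z, tau):
    """argmin_X  tau*||X||_* + 0.5*||X - Z||_F^2  (singular value thresholding)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Z = rng.standard_normal((5, 4))
    print(prox_l1(Z, 0.5))
    print(np.linalg.svd(prox_nuclear(Z, 1.0), compute_uv=False))  # shrunken singular values
```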

The main contribution of [3] is two classes of alternating direction type methods, which are called multiple splitting algorithms (MSA) in [3], with provable iteration complexity bounds for obtaining an ε-optimal solution. For the unconstrained problem

$$\min_x \ f(x), \qquad (4)$$

x is called an ε-optimal solution to (4) if $f(x) \le f(x^*) + \epsilon$, where $x^*$ is an optimal solution to (4). Iteration complexity results for gradient methods have been well studied in the literature. It is known that if one applies the classical gradient method with step length $\tau_k$,

$$x^{k+1} := x^k - \tau_k \nabla f(x^k), \qquad (5)$$

to solve (4), an ε-optimal solution is obtained in O(1/ε) iterations under certain conditions. In [6], Nesterov proposed techniques to accelerate (5). One of his acceleration techniques implements the following at each iteration:

$$x^{k+1} := y^k - \tau_k \nabla f(y^k), \qquad y^{k+1} := x^{k+1} + \tfrac{k-1}{k+2}\,(x^{k+1} - x^k). \qquad (6)$$

Nesterov proved that the iteration complexity of (6) is $O(1/\sqrt{\epsilon})$ for obtaining an ε-optimal solution [6, 7]. He also proved that $O(1/\sqrt{\epsilon})$ is the best bound one can get if one uses only first-order information.

Our MSA algorithms are based on the idea of splitting the objective function into K parts and applying a linearization technique to minimize each function $f_i$ plus a proximal term, alternatingly. The basic version of MSA iterates the following updates:

$$x^i_{(k+1)} := p_i\big(w^1_{(k)}, \ldots, w^{i-1}_{(k)}, w^{i+1}_{(k)}, \ldots, w^K_{(k)}\big), \quad \forall i = 1,\ldots,K, \qquad w^i_{(k+1)} := \frac{1}{K} \sum_{j=1}^K x^j_{(k+1)}, \quad \forall i = 1,\ldots,K, \qquad (7)$$

with initial points $w^1_{(0)} = \cdots = w^K_{(0)} = x_0$, where

$$p_i(v^1, \ldots, v^{i-1}, v^{i+1}, \ldots, v^K) := \arg\min_u \ Q_i(v^1, \ldots, v^{i-1}, u, v^{i+1}, \ldots, v^K),$$

$$Q_i(v^1, \ldots, v^{i-1}, u, v^{i+1}, \ldots, v^K) := f_i(u) + \sum_{j=1,\, j\neq i}^K \tilde f_j(u, v^j)$$

and

$$\tilde f_i(u, v) := f_i(v) + \langle \nabla f_i(v), u - v\rangle + \frac{1}{2\mu}\|u - v\|^2.$$

Notice that $Q_i(v^1, \ldots, v^{i-1}, u, v^{i+1}, \ldots, v^K)$ is an approximation to F(u) that keeps the function $f_i(u)$ unchanged and linearizes the other functions $f_j(u)$, $j \neq i$, while adding proximal terms.

The following theorem gives the iteration complexity result for Algorithm MSA.

Theorem 1. Suppose $x^*$ is an optimal solution to problem (1). Assume the $f_i$'s are smooth functions and their gradients are Lipschitz continuous with Lipschitz constants $L(f_i)$, $i = 1,\ldots,K$. If $\mu \le 1/\max_i\{L(f_i)\}$, then the sequence $\{x^i_{(k)}, w^i_{(k)}\}_{i=1}^K$ generated by MSA (7) satisfies:

$$\min_{i=1,\ldots,K} F\big(x^i_{(k)}\big) - F(x^*) \ \le\ \frac{(K-1)\,\|x_0 - x^*\|^2}{2\mu k}.$$

Thus (7) produces a sequence that converges in objective function value. Also, when $\mu \ge \beta/\max_i\{L(f_i)\}$ where 0 < β ≤ 1, to get an ε-optimal solution, the number of iterations required is O(1/ε).

The proof of this theorem basically follows an argument used by Beck and Teboulle in [1] for proving the iteration complexity of the Iterative Shrinkage/Thresholding Algorithm.

[Photo: (Left to right) Miguel Anjos, Sven Leyffer, Shiqian Ma and Nick Sahinidis]
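As a toy illustration of the basic update (7) (our own sketch, not code from [3]), the following instantiates MSA for a sum of convex quadratics, for which the subproblem defining $p_i$ reduces to a linear system; the step size obeys the condition $\mu \le 1/\max_i L(f_i)$ from Theorem 1.

```python
import numpy as np

def msa_quadratics(A_list, b_list, iters=500, x0=None):
    """Basic MSA (7) for F(x) = sum_i f_i(x) with f_i(x) = 0.5 x'A_i x - b_i'x, A_i PSD.

    Each x^i update minimizes Q_i: f_i kept exact, the other f_j linearized at w^j
    plus proximal terms, which for quadratics is the linear system solved below."""
    K = len(A_list)
    n = A_list[0].shape[0]
    mu = 1.0 / max(np.linalg.norm(A, 2) for A in A_list)   # mu <= 1/max_i L(f_i)
    w = [np.zeros(n) if x0 is None else x0.copy() for _ in range(K)]
    I = np.eye(n)
    for _ in range(iters):
        xs = []
        for i in range(K):
            g = sum(A_list[j] @ w[j] - b_list[j] for j in range(K) if j != i)  # sum of grad f_j(w^j)
            s = sum(w[j] for j in range(K) if j != i)
            # stationarity of Q_i:  (A_i + (K-1)/mu * I) u = b_i - g + s/mu
            xs.append(np.linalg.solve(A_list[i] + (K - 1) / mu * I,
                                      b_list[i] - g + s / mu))
        avg = sum(xs) / K
        w = [avg.copy() for _ in range(K)]   # w^i_(k+1) = (1/K) * sum_j x^j_(k+1)
    return w[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, K = 20, 3
    A_list, b_list = [], []
    for _ in range(K):
        M = rng.standard_normal((n, n))
        A_list.append(M @ M.T / n + 0.1 * np.eye(n))        # PSD blocks
        b_list.append(rng.standard_normal(n))
    x_msa = msa_quadratics(A_list, b_list)
    x_star = np.linalg.solve(sum(A_list), sum(b_list))      # minimizer of the smooth sum
    print("distance to optimum:", np.linalg.norm(x_msa - x_star))
```

Because each $x^i$ update depends only on the previous round's $w$'s, the K subproblems are independent of one another, which reflects the Jacobi character of the scheme noted in the closing remarks below.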

A fast version of MSA (FaMSA) is also presented in [3]. Under the same assumptions as in Theorem 1, the iteration complexity bound of FaMSA is improved to $O(1/\sqrt{\epsilon})$, while the computational effort in each iteration is almost the same as in MSA.

To the best of our knowledge, the iteration complexity results for MSA and FaMSA are the first ones of this type that have been given for alternating direction methods involving three or more directions (i.e., functions).

Our theorems require the functions to be smooth to prove the complexity. However, if any of the functions are not smooth, as is the case in (2) and (3), we can use the smoothing technique proposed by Nesterov in [8] to smooth them. This smoothing technique also guarantees that the gradients of the smoothed functions are Lipschitz continuous.

The numerical results in [3] show that the multiple splitting algorithms are much faster than the classical interior-point method for solving some randomly created Fermat-Weber problems. Also, since the multiple splitting algorithms in [3] are all Jacobi type algorithms, their performance should be even better if they can be implemented in a parallel computing environment. Gauss-Seidel type alternating linearization methods are studied in another paper [4].

REFERENCES

[1] A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sciences, 2(1):183–202, 2009.

[2] E. J. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Preprint, available at http://arxiv.org/abs/0912.3599, 2009.

[3] D. Goldfarb and S. Ma. Fast multiple splitting algorithms for convex optimization. Technical report, Department of IEOR, Columbia University. Preprint available at http://arxiv.org/abs/0912.4570, 2009.

[4] D. Goldfarb, S. Ma, and K. Scheinberg. Fast alternating linearization methods for minimizing the sum of two convex functions. Technical report, Department of IEOR, Columbia University. Preprint available at http://arxiv.org/abs/0912.4571, 2010.

[5] S. Ma, W. Yin, Y. Zhang, and A. Chakraborty. An efficient algorithm for compressed MR imaging using total variation and wavelets. IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, 2008.

[6] Y. E. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k²). Dokl. Akad. Nauk SSSR, 269:543–547, 1983.

[7] Y. E. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization 87, Kluwer Academic Publishers, 2004.

[8] Y. E. Nesterov. Smooth minimization of non-smooth functions. Math. Program. Ser. A, 103:127–152, 2005.
Nominations for Society Prizes Sought

The Society awards four prizes, now annually, at the INFORMS annual meeting. We seek nominations and applications for each of them, due by June 30, 2011. Details for each of the prizes, including eligibility rules and past winners, can be found by following the links from http://www.informs.org/Community/Optimization-Society/Prizes.

Each of the four awards includes a cash amount of US$ 1,000 and a citation certificate. The award winners will be invited to give a presentation in a special session sponsored by the Optimization Society during the INFORMS annual meeting in Charlotte, NC in November 2011 (the winners will be responsible for their own travel expenses to the meeting).

The Khachiyan Prize is awarded for outstanding life-time contributions to the field of optimization by an individual or team. The topic of the contribution must belong to the field of optimization in its broadest sense. Recipients of the INFORMS John von Neumann Theory Prize or the MPS/SIAM Dantzig Prize in prior years are not eligible for the Khachiyan Prize. The prize committee for the Khachiyan Prize is as follows:

• George Nemhauser (Chair) [email protected]
• Tamás Terlaky
•
• Lex Schrijver

Nominations and applications for the Khachiyan Prize should be made via EasyChair https://www.easychair.org/conferences/?conf=ioskp2011. Please direct any inquiries to the prize-committee chair.

The Farkas Prize is awarded for outstanding contributions to the field of optimization by a researcher or a team of researchers. The contribution may be a published paper, a submitted and accepted paper, a series of papers, a book, a monograph, or software. The author(s) must have been awarded their terminal degree within twenty five calendar years preceding the year of the award. The prize committee for the Farkas Prize is as follows:

• Gérard Cornuéjols (Chair) [email protected]
• Alexander Shapiro
• David Shmoys
• Stephen Wright

Nominations and applications for the Farkas Prize should be made via email to the prize-committee chair. Please direct any inquiries to the prize-committee chair.

The Prize for Young Researchers is awarded to one or more young researcher(s) for an outstanding paper in optimization that is submitted to and accepted, or published, in a refereed professional journal. The paper must be published in, or submitted to and accepted by, a refereed professional journal within the four calendar years preceding the year of the award. All authors must have been awarded their terminal degree within eight calendar years preceding the year of award. The prize committee for the Prize for Young Researchers is as follows:

• Jim Renegar (Chair) [email protected]
• Dan Bienstock
• Endre Boros
• Tom McCormick

Nominations and applications for the Prize for Young Researchers should be made via email to the prize-committee chair. Please direct any inquiries to the prize-committee chair.

The Student Paper Prize is awarded to one or more student(s) for an outstanding paper in optimization that is submitted to and accepted, or published, in a refereed professional journal within three calendar years preceding the year of the award. Every nominee/applicant must be a student on the first of January of the year of the award. Any co-author(s) not nominated for the award should send a letter indicating that the majority of the nominated work was performed by the nominee(s). The prize committee for the Student Paper Prize is as follows:

• Matthias Köppe (Chair) [email protected]
• Katya Scheinberg
• Jean-Philippe Richard

Nominations and applications for the Student Paper Prize should be made via email to the prize-committee chair. Please direct any inquiries to the prize-committee chair.

Nominations of Candidates for Society Officers Sought

Nick Sahinidis will have completed his term as Most-Recent Past-Chair of the Society at the conclusion of the 2011 annual INFORMS meeting. Jon Lee is continuing as Chair through 2012. Marina Epelman will have completed her (extended) term as Secretary/Treasurer, also at the conclusion of the INFORMS meeting. The Society is indebted to Nick and Marina for their work.

We would also like to thank four Society Vice-Chairs who will be completing their two-year terms at the conclusion of the INFORMS meeting: Miguel Anjos, Steven Dirkse, Oktay Günlük and Mauricio Resende.

We are currently seeking nominations of candidates for the following positions:

• Chair-Elect
• Secretary/Treasurer
• Vice-Chair for Computational Optimization and Software
• Vice-Chair for Integer Programming
• Vice-Chair for Linear Programming and Complementarity
• Vice-Chair for Networks

Self nominations for all of these positions are encouraged.

To ensure smooth transition of the chairmanship of the Society, the Chair-Elect is elected and serves a one-year term before assuming a two-year position as Chair; thus this is a three-year commitment. As stated in the Society Bylaws, "The Chair shall be the chief administrative officer of the OS and shall be responsible for the development and execution of the Society's program. He/she shall (a) call and organize meetings of the OS, (b) appoint ad hoc committees as required, (c) appoint chairs and members of standing committees, (d) manage the affairs of the OS between meetings, and (e) preside at OS Council meetings and Society membership meetings."

According to Society Bylaws, "The Secretary/Treasurer shall conduct the correspondence of the OS, keep the minutes and records of the Society, maintain contact with INFORMS, receive reports of activities from those Society Committees that may be established, conduct the election of officers and Members of Council for the OS, make arrangements for the regular meetings of the Council and the membership meetings of the OS. As treasurer, he/she shall also be responsible for disbursement of the Society funds as directed by the OS Council, prepare and distribute reports of the financial condition of the OS, help prepare the annual budget of the Society for submission to INFORMS. It will be the responsibility of the outgoing Secretary/Treasurer to make arrangements for the orderly transfer of all the Society's records to the person succeeding him/her." The Secretary/Treasurer shall serve a two-year term.

According to Society Bylaws, "The main responsibility of the Vice Chairs will be to help INFORMS Local Organizing committees identify cluster chairs and/or session chairs for the annual meetings. In general, the Vice Chairs shall serve as the point of contact with their sub-disciplines." Vice Chairs shall serve two-year terms.

Please send your nominations or self-nominations to Marina Epelman ([email protected]), including contact information for the nominee, by Sunday, July 31, 2011. Online elections will begin in mid-August, with new officers taking up their duties at the conclusion of the 2011 annual INFORMS meeting.

Announcing the Fourth INFORMS Optimization Society Conference

The School of Business Administration at the University of Miami is pleased to host the Fourth INFORMS Optimization Society conference, to take place on the Coral Gables campus of the University of Miami, February 24–26, 2012. The theme of the conference is "Optimization and Analytics: New Frontiers in Theory and Practice." The Conference Co-Chairs are Anuj Mehrotra and Michael A. Trick. The Organizing Committee includes Edward Baker, Hari Natarajan and Tallys Yunes. Further details will gradually appear on the conference web site at http://www.bus.miami.edu/events/ios2012/index.html.

Previous editions of the conference were at San Antonio (2006), Atlanta (2008) and Gainesville (2010). Some of you may know/remember that the School of Business Administration at the University of Miami hosted the wonderful Mixed-Integer Programming meeting, MIP 2006. Anyone who was at that meeting knows that it is a delightful venue – especially in February.