Copyright © 2008 IEEE. Reprinted from IEEE Computational Intelligence Magazine, 2008; 3 (1):54-63

This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of 's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to [email protected].

By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

Research Frontier

Zbigniew Michalewicz, University of Adelaide, AUSTRALIA, and Matthew Michalewicz, SolveIT Software, AUSTRALIA

Machine Intelligence, Adaptive Business Intelligence, and Natural Intelligence

Digital Object Identifier 10.1109/MCI.2007.913389

Machine Intelligence

Most people recognize Larry Fogel for his work on evolutionary programming; today, evolutionary programming is considered one of the early branches of evolutionary algorithms, together with genetic algorithms, evolution strategies, genetic programming, and many other (sometimes unnamed) population-based techniques (Baeck et al., 1997). However, it is important to remember that Larry’s main interest at that time was in machine intelligence, and his work on evolutionary programming was just to address some issues of machine intelligence.

One of the key observations of Larry Fogel was that machine intelligence might be defined as the capability of a system to adapt its behavior to meet desired goals in a range of environments. Consequently, intelligent behavior requires prediction. An additional component of intelligent behavior is adaptation (which is based on prediction, as adaptation to future circumstances requires predicting those circumstances). The final component of intelligent behavior is the capability of taking appropriate action (Fogel et al., 1966). Consequently, the foundation for evolutionary programming research was to generate machine intelligence by simulating the evolutionary process on a class of predictive algorithms (as opposed to the approach of generating machine intelligence by replicating humans by directly creating rules to follow or creating the neural connections).

It is interesting to observe that Larry Fogel identified three key elements of intelligence, namely:
❏ ability to predict,
❏ ability to adapt, and
❏ ability to take appropriate action.

Clearly, there is no need to argue that prediction is important: without this capability no system (including natural systems) can be called intelligent. The concept of adaptability is certainly gaining popularity. Adaptability has already been introduced in everything from automatic car transmissions (which adapt their gear-change patterns to a driver’s driving style), to running shoes (which adapt their cushioning to a runner’s size and stride), to Internet search engines (which adapt their search results to a user’s preferences and prior search history). These products are very appealing to individual consumers, because, despite their mass production, they are capable of adapting to the preferences of each unique user after some period of time. Finally, the ability to take appropriate action is probably the most important component of an intelligent system. After all, if a system can predict and adapt, but all the decisions made are silly, the system hardly deserves to be called “intelligent!” Note also that the term “appropriate action” implies optimization, as usually the system should take (or recommend) the best course of action.

Interestingly, the three components of prediction, adaptation, and optimization constitute the core modules of adaptive business intelligence systems. When we discuss adaptive business intelligence in the next section of the paper, the connection with evolutionary programming will become apparent.

One additional aspect of Larry Fogel’s research was connected with the concept of so-called “Valuated State Space”®. The “Valuated State Space”® approach provides a convenient structure for assessing various decision-making parameters in terms that people are familiar with. Further, it allows individuals to apply subjective relative importance weights and provides a mechanism for dealing with degrees of criticality of parameters. The “Valuated State Space”® approach provides a rank ordering of all possible outcomes and rapid comparison of two or more potential decisions to determine which is better.

For a high-level overview of valuated state spaces the reader is referred to (Michalewicz and Fogel, 2004); however,

54 IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE | FEBRUARY 2008 1556-603X/08/$25.00©2008IEEE

Authorized licensed use limited to: University of Adelaide Library. Downloaded on December 2, 2009 at 00:38 from IEEE Xplore. Restrictions apply.

it is worthwhile to look at this approach as a general problem-solving approach. After all, the major steps in “Valuated State Space”® imply some general problem-solving steps, which require understanding of the problem, rejection of intuition, building a model of the problem by defining the variables, constraints, and the objectives. Thus the systematic approach proposed for the Valuated State Spaces can be translated into a general problem solving methodology. When we discuss Puzzle-Based Learning in the third section of the paper, the connection with “Valuated State Space”® will be visible.

These links with Larry Fogel’s work define the organization of this paper. The next part of the paper presents the main concepts behind adaptive business intelligence, and the following part discusses the current state of a new approach to learning, called Puzzle-Based Learning. A short section on Larry’s and authors’ business experience concludes the paper.

Adaptive Business Intelligence

Since the computer age dawned on mankind, one of the most important areas in information technology has been that of “decision support.” Today, this area is more important than ever. Working in dynamic and ever-changing environments, modern-day managers are responsible for an assortment of far reaching decisions: Should the company increase or decrease its workforce? Enter new markets? Develop new products? Invest in research and development? The list goes on. But despite the inherent complexity of these issues and the ever-increasing load of information that business managers must deal with, all these decisions boil down to two fundamental questions:
❏ What is likely to happen in the future?
❏ What is the best decision right now?

Whether we realize it or not, these two questions pervade our everyday lives, both on a personal and professional level. When driving to work, for instance, we have to make a traffic prediction before we can choose the quickest driving route. At work, we need to predict the demand for our product before we can decide how much to produce. And before investing in a foreign market, we need to predict future exchange rates and economic variables. It seems that regardless of the decision being made or its complexity, we first need to make a prediction of what is likely to happen in the future, and then make the best decision based on that prediction. This fundamental process underpins the basic premise of adaptive business intelligence.

Simply put, adaptive business intelligence is the discipline of combining prediction, optimization, and adaptability into a system capable of answering these two fundamental questions: What is likely to happen in the future? And, what is the best decision right now? (Michalewicz et al. 2007). To build such a system, we first need to understand the methods and techniques that enable prediction, optimization, and adaptability (Dhar and Stein, 1997). At first blush, this subject matter is nothing new, as hundreds of books and articles have already been written on business intelligence (Vitt et al, 2002; Loshin, 2003), data mining and prediction methods (Weiss and Indurkhya, 1998; Witten and Frank, 2005), forecasting methods (Makridakis et al., 1988), optimization techniques (Deb 2001; Coello et al. 2002; Michalewicz and Fogel, 2004), and so forth. However, none of these has explained how to combine these various technologies into a software system that is capable of predicting, optimizing, and adapting. Adaptive business intelligence addresses this very issue.

Clearly, the future of the business intelligence industry lies in systems that can make decisions, rather than tools that produce detailed reports (Loshin 2003). As most business managers now realize, there is a world of difference between having good knowledge and detailed reports, and making smart decisions. Michael Kahn, a technology reporter for Reuters in San Francisco, makes a valid point in his January 16, 2006 story titled, “Business Intelligence Software Looks to Future”:

“But analysts say applications that actually answer questions rather than just present mounds of data is the key driver of a market set to grow 10 percent in 2006 or about twice the rate of the business software industry in general.
‘Increasingly you are seeing applications being developed that will result in some sort of action,’ said Brendan Barnacle, an analyst at Pacific Crest Equities. ‘It is a relatively small part now, but it is clearly where the future is. That is the next stage of business intelligence.’”

Business Intelligence vs. Adaptive Business Intelligence

“The answer to my problem is hidden in my data … but I cannot dig it up!” This


popular statement has been around for years as business managers gathered and stored massive amounts of data in the belief that they contain some valuable insight. But business managers eventually discovered that raw data are rarely of any benefit, and that their real value depends on an organization’s ability to analyze them. Hence, the need emerged for software systems capable of retrieving, summarizing, and interpreting data for end-users (Moss and Atre, 2003).

This need fueled the emergence of hundreds of business intelligence companies that specialized in providing software systems and services for extracting knowledge from raw data. These software systems would analyze a company’s operational data and provide knowledge in the form of tables, graphs, pies, charts, and other statistics. For example, a business intelligence report may state that 57 percent of customers are between the ages of 40 and 50, or that product X sells much better in Florida than in Georgia.1

1 Note that business intelligence can be defined both as a “state” (a report that contains knowledge) and a “process” (software responsible for converting data into knowledge).

Consequently, the general goal of most business intelligence systems was to: (1) access data from a variety of sources; (2) transform these data into information, and then into knowledge; and (3) provide an easy-to-use graphical interface to display this knowledge. In other words, a business intelligence system was responsible for collecting and digesting data, and presenting knowledge in a friendly way (thus enhancing the end-user’s ability to make good decisions). The following diagram illustrates the processes that underpin a traditional business intelligence system:

FIGURE 1 The process of Business Intelligence. (Diagram elements: Data, Data Preparation, Information, Data Mining, Knowledge.)

Although different texts have illustrated the relationship between data and knowledge in different ways (e.g., Davenport and Prusak, 2006; Prusak, 1997; Shortliffe and Cimino, 2006), the commonly accepted distinction between data, information, and knowledge is:
❏ Data are collected on a daily basis in the form of bits, numbers, symbols, and “objects.”
❏ Information is “organized data,” which are preprocessed, cleaned, arranged into structures, and stripped of redundancy.
❏ Knowledge is “integrated information,” which includes facts and relationships that have been perceived, discovered, or learned.

Because knowledge is such an essential component of any decision-making process (as the old saying goes, “Knowledge is power!”), many businesses have viewed knowledge as the final objective. But it seems that knowledge is no longer enough. A business may “know” a lot about its customers (it may have hundreds of charts and graphs that organize its customers by age, preferences, geographical location, and sales history), but management may still be unsure of what decision to make! And here lies the difference between “decision support” and “decision making:” all the knowledge in the world will not guarantee the right or best decision.

Moreover, recent research in psychology indicates that widely held beliefs can actually hamper the decision-making process. For example, common beliefs like “the more knowledge we have, the better our decisions will be,” or “we can distinguish between useful and irrelevant knowledge,” are not supported by empirical evidence. Having more knowledge merely increases our confidence, but it does not improve the accuracy of our decisions. Similarly, people supplied with “good” and “bad” knowledge often have trouble distinguishing between the two, proving that irrelevant knowledge decreases our decision-making effectiveness.

FIGURE 2 The process of Adaptive Business Intelligence. (Diagram elements: Data, Data Preparation, Information, Data Mining, Knowledge, Prediction, Optimization, Decision, Adaptability.)

Today, most business managers realize that a gap exists between having the right knowledge and making the right decision. Because this gap affects management’s ability to answer fundamental business questions (such as


“What should be done to increase profits? Reduce costs? Or increase market share?”), the future of business intelligence lies in systems that can provide answers and recommendations, rather than mounds of knowledge in the form of reports. The future of business intelligence lies in systems that can make decisions! As a result, there is a trend emerging in the marketplace called adaptive business intelligence. In addition to performing the role of traditional business intelligence (transforming data into knowledge), adaptive business intelligence also includes the decision-making process, which is based on prediction and optimization.

While business intelligence is often defined as “a broad category of application programs and technologies for gathering, storing, analyzing, and providing access to data,” adaptive business intelligence can be defined as “the discipline of using prediction and optimization techniques to build self-learning ‘decisioning’ systems” (as the above diagram shows). Adaptive business intelligence systems include elements of data mining, predictive modeling, forecasting, optimization, and adaptability, and are used by business managers to make better decisions.

This relatively new approach to business intelligence is capable of recommending the best course of action (based on past data), but it does so in a very special way: An adaptive business intelligence system incorporates prediction and optimization modules to recommend near-optimal decisions, and an “adaptability module” for improving future recommendations. Such systems can help business managers make decisions that increase efficiency, productivity, and competitiveness. Furthermore, the importance of adaptability cannot be overemphasized. After all, what is the point of using a software system that produces sub par schedules, inaccurate demand forecasts, and inferior logistic plans, time after time? Would it not be wonderful to use a software system that could adapt to changes in the marketplace? A software system that could improve with time?

Current Trends

Adaptability is a vital component of any intelligent system, as it is hard to argue that a system is intelligent if it does not have the capacity to adapt. For humans, the importance of adaptability is obvious: our ability to adapt was a key element in the evolutionary process. In the case of artificial intelligence, consider a chess program capable of beating the world chess master: Should we call this program intelligent? Probably not. We can attribute the program’s performance to its ability to evaluate the current board situation against a multitude of possible “future boards” before selecting the best move. However, because the program cannot learn or adapt to new rules, the program will lose its effectiveness if the rules of the game are changed or modified. Consequently, because the program is incapable of learning or adapting to new rules, the program is not intelligent.

The growing popularity of adaptability is also underscored by a recent publication of the U.S. Department of Defense. This lists 19 important research topics for the next decade and many of them include the term “adaptive:” Adaptive Coordinated Control in the Multi-agent 3D Dynamic Battlefield, Control for Adaptive and Cooperative Systems, Adaptive System Interoperability, Adaptive Materials for Energy-Absorbing Structures, and Complex Adaptive Networks for Cooperative Control.

For sure, adaptability was recognized as an important component of intelligence quite some time ago: Alfred Binet (born 1857), French psychologist and inventor of the first usable intelligence test, defined intelligence as “...judgment, otherwise called good sense, practical sense, initiative, the faculty of adapting one’s self to circumstances.” In psychology, a behavior or trait is adaptive when it helps an individual adjust and function well within a changing social environment.

The same holds true for any expert system. No one questions the usefulness of expert systems in some environments (which are usually well defined and static), but expert systems that are incapable of learning and adapting should not be called “intelligent.” Some expert knowledge was programmed in, that is all.

So, what are the future trends for adaptive business intelligence? In the words of Jim Goodnight, the CEO of SAS Institute (Collins et al. 2007):

“Until recently, business intelligence was limited to basic query and reporting, and it never really provided that much intelligence…”

However, this is about to change. Keith Collins, chief technology officer of SAS Institute (Collins et al. 2007) believes that:

“A new platform definition is emerging for business intelligence, where BI is no longer defined as simple query and reporting. … In the next five years, we’ll also see a shift in performance management to what we’re calling predictive performance management, where analytics play a huge role in moving us beyond just simple metrics to more powerful measures.”

Further, Jim Davis, vice president of marketing at SAS Institute (Collins et al. 2007) stated:

“In the next three to five years, we’ll reach a tipping point where more organizations will be using BI to focus on how to optimize processes and influence the bottom line…”

Research Issues

Every problem has an objective. Usually, this is a general statement describing what we are looking for. The objective defines the goal (or set of goals) for a particular problem. These goals are translated into evaluation functions, which provide mappings from the solution space to a set of numbers.
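
To make the idea of an evaluation function concrete, the following Python sketch treats a candidate solution as a list of numbers and maps it to a single quality value. The objective `f`, the helper names, and the Gaussian noise and perturbation models are illustrative assumptions, not taken from the article; `eval_averaged` and `eval_robust` anticipate the averaging over repeated measurements and over perturbed solutions that the article's discussion of noise and robustness describes.

```python
import random
from typing import Callable, List, Sequence

Solution = Sequence[float]


def f(x: Solution) -> float:
    """Hypothetical objective: squared distance from the origin (lower is better)."""
    return sum(v * v for v in x)


def best_of(candidates: List[Solution], evaluate: Callable[[Solution], float]) -> Solution:
    """Compare candidates by their evaluation values and return the smallest."""
    return min(candidates, key=evaluate)


def eval_averaged(x: Solution, measure: Callable[[Solution], float], n: int) -> float:
    """Average n repeated (possibly noisy) measurements of the same solution."""
    return sum(measure(x) for _ in range(n)) / n


def eval_robust(x: Solution, n: int, sigma: float, rng: random.Random) -> float:
    """Monte Carlo estimate: average f over n randomly perturbed copies of x.

    The Gaussian perturbation with standard deviation sigma is an assumption.
    """
    total = 0.0
    for _ in range(n):
        perturbed = [v + rng.gauss(0.0, sigma) for v in x]
        total += f(perturbed)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = [[3.0, 4.0], [1.0, 1.0], [0.5, -2.0]]
    print("best candidate:", best_of(candidates, f))

    # A noisy measurement of f: only f(x) + z is observable.
    def noisy(x: Solution) -> float:
        return f(x) + rng.gauss(0.0, 0.1)

    print("averaged noisy evaluation:", eval_averaged([1.0, 1.0], noisy, n=1000))
    print("robust evaluation:", eval_robust([1.0, 1.0], n=1000, sigma=0.1, rng=rng))
```

Lower values are treated as better in this sketch; a maximization problem would use `max` in `best_of` instead.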


Thus, evaluation functions assign numeric values for each solution for each specified goal.

Evaluation functions (for single-objective problems) or a set of evaluation functions (for multi-objective problems) are key components of any heuristic method (whether genetic algorithms, tabu search, simulated annealing, ant system, or even simple hill-climbers), as they define the connection between the method and the problem. By assigning a numeric quality measure to each solution, evaluation functions allow comparison between the qualities of various candidate solutions. Note that evaluation functions may return just the rank of a candidate solution among a set of solutions, a precise number (when the evaluation function is defined as a closed formula), or they may include various components (as penalty expressions for cases when a candidate solution violates some problem-specific constraints).

Many real world problems are set in uncertain (possibly changing) environments. There is a general agreement (Jin & Branke, 2005) that such uncertainties can be categorized into four classes: (1) noise, (2) robustness, (3) approximation, and (4) time-varying environments. Consequently, evaluation functions should be modified accordingly to deal with each particular case. However, it seems that the above classification misses the most important (and probably most frequent) real world scenario: namely, where the evaluation functions are based on predictions of the future values of some variables. Before we present and discuss the fifth category, and argue that this fifth category is the most common in real world situations, let’s discuss the main features of these four categories.

Noise

Sometimes evaluation functions are subject to noise. This happens when evaluation functions return sensory measurements or results of randomized simulations. In other words, the evaluation procedure for the same solution (i.e., the solution defined as a vector of some design variables) may return different values. The common approach in such scenarios is to approximate a noisy evaluation function eval by an averaged sum of several evaluations:

$$\mathrm{eval}(x) = \frac{1}{n}\sum_{i=1}^{n}\bigl(f(x) + z_i\bigr),$$

where $x$ is a vector of design variables (i.e., variables controlled by a method), $f(x)$ is the evaluation function, $z_i$ represents additive noise, and $n$ is the sample size. Note that the only measurable (returned) values are $f(x) + z$.

Robustness

Sometimes design variables, other variables, or constraints of the problem are subject to perturbations after the solution is determined. The general idea is that such (slightly modified) solutions should still have good quality evaluations (thus making the original solution robust). This is important in scenarios involving manufacturing tolerances, or when it is necessary to modify the original solution because of employee illness or machine failure. The common approach to such scenarios is to use evaluation function eval based on the probability distribution of possible disturbances $\delta$, which is approximated by Monte Carlo integration:

$$\mathrm{eval}(x) = \frac{1}{n}\sum_{i=1}^{n} f(x + \delta_i).$$

Note that $\mathrm{eval}(x)$ depends on the shape of $f(x)$ at point $x$; in other words, the neighborhood of $x$ determines the value of $\mathrm{eval}(x)$.

Approximation

Sometimes it is too expensive to evaluate a candidate solution. In such scenarios, evaluation functions are often approximated based on experimental or simulation data (the approximated evaluation function is often called the meta-model). In such cases, evaluation function eval becomes:

$$\mathrm{eval}(x) = f(x) + E(x),$$

where $E(x)$ is the approximation error of the meta-model. Note that the approximation error is quite different from noise, as it is usually deterministic and systematic.

Time-Varying Environments

Sometimes evaluation functions depend on an additional variable: time. In such cases, evaluation function eval becomes:

$$\mathrm{eval}(x) = f(x, t),$$

where $t$ represents the time variable. Clearly, the landscape defined by the function $f$ changes over time; consequently, the best solution may change its location over time. There are two main approaches for handling such scenarios: (1) to restart the method after a change, or (2) to require that the method be capable of chasing the changing optimum.

However, it seems the largest class of real world problems is not included in the above four categories. From our business/industry experience of the last decade, it is clear that in many real world problems the evaluation functions are based on the predicted future values of some variables. In other words, evaluation function eval is expressed as:

$$\mathrm{eval}(x) = f(x, P(x, y, t)),$$

where $P(x, y, t)$ represents an outcome of some prediction for solution vector $x$ and additional (environmental, beyond our control) variables $y$ at time $t$. Let’s compare this category with the four categories defined earlier to see the differences between them.

First of all, noise may or may not be involved. If the prediction model is deterministic, then there is no noise in the scenario: every solution vector $x$ is evaluated to the same value. On the other hand, if the prediction model involves simulations, noise might be present. Second, the meaning of robustness is quite different. Unexpected disturbances (e.g., delays) influence the outcomes of the prediction model, and should be handled accordingly. Third, the concept of approximation is different. Note that in some cases we can evaluate a candidate solution


precisely (e.g., when the evaluation function is not expensive); however, approximation is connected with uncertainties of the predictions. Finally, the time-changing environment also has a different meaning. As the real world changes, the prediction model needs constant updates and/or parameter adjustments, thus changing the problem landscape in an implicit way.

There are some recent, successful implementations of adaptive business intelligence systems reported (e.g., Michalewicz et al. 2005), which provide daily decision support for large corporations and result in multi-million dollar returns on investment. There are also companies (e.g., www.solveitsoftware.com) that specialize in development of adaptive business intelligence tools. However, further research effort is required. For example, most of the research in machine learning has focused on using historical data to build prediction models. Once the model is built and evaluated, the goal is accomplished. However, because new data arrive at regular intervals, building and evaluating a model is just the first step in adaptive business intelligence. Because these models need to be updated regularly (something that the adaptability module is responsible for), we expect to see more emphasis on this updating process in machine learning research. Also, the frequency of updating the prediction module, which can vary from seconds (e.g., in real-time currency trading systems), to weeks and months (e.g., in fraud detection systems), may require different techniques and methodologies. In general, adaptive business intelligence systems would include the research results from control theory, statistics, operations research, machine learning, and modern heuristic methods, to name a few. We also expect that major advances will continue to be made in modern optimization techniques. In the years to come, more and more research papers will be published on constrained and multi-objective optimization problems, and on optimization problems set in dynamic environments. This is essential, as most real world business problems are constrained, multi-objective, and set in a time-changing environment.

Natural Intelligence

What is missing in most curricula, from elementary school all the way through to university education, is coursework focused on the development of problem-solving skills. Most students never learn how to think about solving problems; throughout their education, they are constrained to concentrate on specific questions at the back of textbooks. So, without much thinking, they apply the material from each chapter to solve a few problems given at the end of each chapter (why else would a problem be at the end of the chapter?). With this type of approach to “problem solving,” it is unsurprising that students are ill prepared for framing and addressing real world problems. When they finally enter the real world, they suddenly find that problems do not come with instructional textbooks.

Although many educators are interested in teaching “thinking skills” rather than “teaching information and content,” the fact remains that young people often have serious difficulties in independent thinking (or problem-solving skills) regardless of the nature of a problem. As Alex Fisher wrote in his book, “Critical Thinking:”

“… though many teachers would claim to teach their students ‘how to think,’ most would say that they do this indirectly or implicitly in the course of teaching the content, which belongs to their special subject. Increasingly, educators have come to doubt the effectiveness of teaching ‘thinking skills’ in this way, because most students simply do not pick up the thinking skills in question.”

This approach has dominated the educational arena (whether in history, physics, geography, or any other subject), almost ensuring that students never learn how to think about solving problems in general.

The Approach

Over the past few decades, various people and organizations have attempted to address this educational gap by teaching “thinking skills” based on some structure (e.g. critical thinking, constructive thinking, creative thinking, parallel thinking, vertical thinking, lateral thinking, confrontational and adversarial thinking). However, all these approaches are characterized by a departure from mathematics as they concentrate more on “talking about problems” rather than “solving problems.” It is our view that the lack of problem solving skills in general is the consequence of decreasing levels of mathematical sophistication in modern societies.

Hence, we believe that a different approach is needed. To address this gap in the educational curriculum, we have created a new course (based on our new book, “Puzzle Based Learning: An Introduction to Critical Thinking, Mathematics, and Problem Solving”) that focuses on getting students to think about how to frame and solve unstructured problems (those that are not encountered at the end of some textbook chapter …). The idea is to increase the student’s mathematical awareness and problem solving skills by discussing a variety of puzzles. In other words, we believe that the course should be based on the best traditions introduced by Gyorgy Polya and Martin Gardner during the last 60 years. In one of our favorite books, “Entertaining Mathematical Puzzles,” Martin Gardner wrote:

“Perhaps in playing with these puzzles you will discover that mathematics is more delightful than you expected. Perhaps this will make you want to study the subject in earnest, or less hesitant about taking up the study of a science for which a knowledge of advanced mathematics will eventually be required.”

Many other mathematicians have expressed similar views. For example, Peter Winkler in his book “Mathematical Puzzles: A Connoisseur’s Collection,” wrote: “I have a feeling that understanding and appreciating puzzles,

FEBRUARY 2008 | IEEE COMPUTATIONAL INTELLIGENCE MAGAZINE 59

Authorized licensed use limited to: University of Adelaide Library. Downloaded on December 2, 2009 at 00:38 from IEEE Xplore. Restrictions apply. University of Virginia) and received Interestingly, the three components of prediction, adaptation, the most positive textbook reviews I and optimization constitute the core modules of adaptive have seen in my fifteen years of teaching.” business intelligence systems. ❏ “Most importantly, it does so in a way that no other book I’ve seen does—it makes it fun and it makes even those with one-of-a-kind solutions, is to around 2,500 BC! Yet the best evi- you think!” good for you.” dence of the puzzle-based learning This new course is the result of approach can be found in the works of Importance of Mathematics many years of experience in educating Alcuin, an English scholar born around Over the years, two primary approaches young engineers, mathematicians, and A.D. 732 whose main work was to problem solving have emerged. One computer scientists on many levels at “Problems to Sharpen the Young”—a is the technical approach (represented in many universities in many countries text which included over 50 puzzles. many textbooks), which concentrates (USA, Mexico, Argentina, New Some twelve hundred years later, one on specific problem-solving techniques. Zealand, Australia, South Korea, Japan, of his puzzles is still used by countless The other is the psychological approach, China, Poland, Sweden, Germany, artificial intelligence textbooks!2 which is based on structural thinking— Spain, Italy, France and the UK). Limit- The first author is a member of the meaning that some structure is imposed ed experiments using the puzzle-based editorial board of the “International on the thinking process during the learning approach with these students Journal Teaching Mathematics and problem solving activity. 
have already produced outstanding Computer Science.” It is clear that Let’s discuss these two approaches in course evaluations and countless com- new methods of teaching (especially a bit more detail; for that purpose we ments that praise the problem-solving engineers) are sought and experiment- have selected two popular texts. The orientation of the course. We believe ed with. Further, one of the earlier first one is “Operations Research: An that the main reasons behind most stu- books by the first author, “How to Introduction,” by Hamdy A. Taha, and dents’ enthusiasm for the puzzle-based Solve It: Modern Heuristics” (written the other is a book by Edward de Bono, learning approach are: together with Larry’s son, David), “Six Thinking Hats.” The first book ❏ Puzzles are educational, as they illus- included a selection of puzzles to illus- illustrates the technical approach very trate many useful (and powerful) trate some problem solving activities. well, as it is loaded with mathematical problem-solving rules in a very Despite the fact that the book aimed at techniques for a variety of different entertaining way. graduate students interested in genetic problems. On the other hand, the sec- ❏ Puzzles are engaging and thought- algorithms, neural networks, fuzzy sys- ond book presents a particular struc- provoking. tems, and many other traditional and tured way of thinking. Let us have a ❏ Contrary to many textbook prob- modern techniques, the readers— closer look at these two books. lems, puzzles are not attached to any because of puzzles—got much more “Operations Research: An Introduc- chapter (as is the case with real world than just information of particular tion,” by Hamdy A. Taha consists of problems). techniques. Some comments (still several chapters, each of which relate to ❏ It is possible to talk about different available on www.amazon.com) were: a specific problem type. For example, techniques (e.g. 
simulation, opti- ❏ “This book teaches you how to there is a chapter on linear program- mization), disciplines (e.g. probabili- think of a solution for the problem ming, which is a particular technique ty, statistics), or application areas you face…” for solving problems with many vari- (e.g. scheduling, finance) and illus- ❏ “…anyone interested in […] human ables and where the objective and the trate their significance by discussing a thinking should read and understand values of these variables are expressed as few simple puzzles. At the same this book.” linear expressions. Another chapter of time, the students are aware that ❏ “I used this book in a Master’s class Taha’s book discusses a transportation many conclusions are applicable to on heuristics (Systems Engineering, model and its variants, while another the broader context of solving real presents a series of techniques applicable

world problems. 2The puzzle is the “river crossing problem” (we will to network models. There are chapters return to this puzzle in chapter 12 of this book): A on goal programming, integer linear man has to take a wolf, a goat, and some cabbage Some Supporting Evidence across a river. His rowboat has enough room for the programming, dynamic programming, As a matter of fact, the puzzle-based man plus either the wolf or the goat or the cabbage. If inventory models, forecasting models, he takes the cabbage with him, the wolf will eat the learning approach has a much longer goat. If he takes the wolf, the goat will eat the cab- etc. Each chapter includes selected ref- tradition than just 60 years. The first bage. Only when the man is present are the goat and erences and a problem set. For example, the cabbage safe from their enemies. All the same, the mathematical puzzles were encoun- man carries wolf, goat, and cabbage across the river. the chapter on inventory models tered in Sumerian texts that date back How has he done it? includes the following exercise:
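The river crossing puzzle of footnote 2 is small enough to be solved by exhaustive search, which may be one reason artificial intelligence textbooks still use it. The following is a minimal sketch in Python, not the authors' own treatment: the state encoding (a 4-tuple of bank positions), the helper names `is_safe`, `moves`, and `solve`, and the choice of breadth-first search are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical encoding: state = (man, wolf, goat, cabbage),
# where 0 means the starting bank and 1 means the far bank.
ITEMS = ("wolf", "goat", "cabbage")


def is_safe(state):
    """Return True if nothing gets eaten on a bank the man has left."""
    man, wolf, goat, cabbage = state
    if wolf == goat and man != goat:
        return False  # wolf eats goat
    if goat == cabbage and man != goat:
        return False  # goat eats cabbage
    return True


def moves(state):
    """Yield (cargo, next_state) pairs for every possible crossing."""
    man = state[0]
    other = 1 - man
    # The man may cross alone.
    yield "nothing", (other,) + state[1:]
    # Or he may take one item that is on his bank.
    for i, name in enumerate(ITEMS, start=1):
        if state[i] == man:
            nxt = list(state)
            nxt[0] = other
            nxt[i] = other
            yield name, tuple(nxt)


def solve():
    """Breadth-first search for a shortest sequence of safe crossings."""
    start, goal = (0, 0, 0, 0), (1, 1, 1, 1)
    parent = {start: None}  # state -> (previous state, cargo carried)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            plan = []
            while parent[state] is not None:
                state, cargo = parent[state]
                plan.append(cargo)
            return plan[::-1]
        for cargo, nxt in moves(state):
            if nxt not in parent and is_safe(nxt):
                parent[nxt] = (state, cargo)
                queue.append(nxt)
    return None  # no safe plan exists


if __name__ == "__main__":
    for step, cargo in enumerate(solve(), start=1):
        print(f"crossing {step}: man takes {cargo}")
```

Under these assumptions, `solve()` returns a shortest plan of seven crossings that begins and ends by ferrying the goat. Breadth-first search is used here only because it guarantees a shortest plan on such a tiny state space (16 states); other search strategies would work as well.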


“McBurger orders ground meat at the start of each week to cover the week’s demand of 300 lb. The fixed cost per order is $20. It costs about $0.03 per lb per day to refrigerate and store the meat. (a) Determine the inventory cost per week of the present ordering policy. (b) Determine the optimal inventory policy that McBurger should use, assuming zero lead time between the placement and receipt of an order. (c) Determine the difference in the cost per week between McBurger’s current and optimal ordering policy.”

Clearly, the problem is well-defined and very specific. Earlier parts of the chapter on inventory models discussed, of course, a general inventory model (where the total inventory cost is given as a total of purchasing cost, setup cost, holding cost, and shortage cost) and the classic economic order quantity models. The formula is derived in the chapter to provide the optimum value of the order quantity y (number of units) as a function of setup cost K associated with the placement of an order (in dollars per order), demand rate D (in units per time unit), and holding cost h (in dollars per inventory unit per time unit). The model suggests ordering

$$y = \sqrt{\frac{2KD}{h}}$$

units every y/D time units. Again, it is not our goal to scare you by providing a formula in the introductory part of this text (especially as the derivation of this formula requires some calculus …) but rather to point out the specific nature of the problem and the specific (and very precise) solution. This is a good illustration of the technical approach.

It seems that Taha’s text is similar to many other texts from disciplines such as engineering, mathematics, finance, and business, in that it has two main characteristics:
(a) the problem types and corresponding techniques are very specific; and
(b) mathematics is used extensively.

However, there is usually no discussion on “how to solve a problem”; the text gives some recipes on how to arrive at a solution once the problem has already been reduced to the problem type defined in the text. As indicated in the preface, students are constrained to concentrate on textbook questions at the back of each chapter, using the information learned in that chapter … So all these specialized texts (whether on probability, statistics, simulations, etc.) that represent the technical approach for problem solving do not present a problem-solving methodology. They just provide (very useful) information on particular techniques for particular classes of problems.

Let us now turn our attention to the other book, Edward de Bono’s “Six Thinking Hats,” which represents the psychological approach. As we have indicated earlier, the book suggests some structure for the thinking process during the problem solving activity. In particular, each of six hats represents a particular function of the thinking process:
White Hat: collection of objective facts and figures
Red Hat: presentation of emotional view
Black Hat: discussion of weaknesses in an idea
Yellow Hat: discussion on benefits of the idea
Green Hat: generation of new ideas
Blue Hat: imposition of control of the whole process

The general idea is that instead of thinking simultaneously along many directions, a thinker should do one thing at a time. Edward de Bono explains it very clearly:

“The main difficulty of thinking is confusion. We try to do too much at once. Emotions, information, logic, hope and creativity all crowd in on us. It is like juggling with too many balls. What I am putting forward in this book is a very simple concept, which allows a thinker to do one thing at a time. He or she becomes able to separate emotion from logic, creativity from information, and so on. The concept is that of the six thinking hats. Putting on any one of these hats defines a certain type of thinking.”

It seems that “Six Thinking Hats” is characterized by two facts (as are many other texts on thinking processes, which include texts on critical thinking, constructive thinking, creative thinking, parallel thinking, vertical thinking, lateral thinking, confrontational, and adversarial thinking, to name a few):
(a) the problem types and corresponding “techniques” are not very specific. The approach is very general and it applies to most problems (as opposed to specific problem types); and
(b) the approach is mathematics-free.

Indeed, the examples given in “Six Thinking Hats” vary from house selling activities, through advertising and marketing issues, to pricing products. Further, the mathematics is nonexistent despite the fact that some problems may require more precise mathematics. There is no question that the approach proposed by Edward de Bono is very useful and that many corporations benefited from the methodology of “Six Thinking Hats.” On the other hand, the rejection of mathematics in “Six Thinking Hats” expresses itself even in the author’s statements, such as: “In a simple experiment with three hundred senior public servants, the introduction of the Six Hats method increased thinking productivity by 493 percent.” Well, this is very impressive, but any person with any “critical thinking” skills (or some fancy for precision) may ask for clarifications:

What is the definition of productivity (especially in cases of senior public servants) and how has such productivity (and the improvement in productivity) been measured?

Indeed, these are very important questions. In the case of the public servants, did three hundred employees fill out forms that evaluated their (increased) productivity? If so, then this can be compared to an example provided by Darrell Huff in his book “How to Lie with Statistics.” The San Francisco Chronicle published an article titled “British He’s Bathe More Than She’s,” and the story supported the title with the following facts (based on a hot-water survey of 6,000 representative British homes): “The British male over five years of age soaks himself in a hot tub on an average of 1.7 times a week in the winter and 2.1 times in the summer. British women average 1.5 baths a week in the winter and 2.0 in the summer.” Darrell Huff, discussing this case, made an excellent (and very important) observation. He wrote:

“…the major weakness is that the subject has been changed. What the Ministry really found out is how often these people said they bathed, not how often they did so. When a subject is as intimate as this one is, with the British bath-taking tradition involved, saying and doing may not be the same after all.”

It seems that the same argument can be applied to the public servants … Most likely, their productivity was measured in hours (i.e., the shorter the time to make a decision, the better). Edward de Bono explains:

“A major corporation used to spend twenty days on their multinational project team discussion. Using the parallel thinking of the Six Hats method, the discussions can now take as little as two days.”

However, if this is the case, then it seems there is something fundamentally very wrong with the whole picture, as the quality of the decisions reached is completely ignored and not measured! We should not care so much whether the problem solving process took x or y hours, as the quality of the solution is the most important aspect.

There is an excellent book (on science and education, one can say) by Eliyahu M. Goldratt and Jeff Cox, “The Goal.” The book describes the struggle of a plant manager who tries to improve factory performance. He worries about productivity, excess inventories, throughput, balancing capacities, and many other measurements. Only with the help of a consultant does he realize that there is only one goal and one measurement: “The goal of a manufacturing organization is to make money and everything else we do is means to achieve the goal.” Similarly, in the problem solving process there is only one goal: to find the best possible solution. Of course, very often there is a trade-off between the time needed to find a solution and the quality of the solution (this is often discussed in computer science courses on analysis of algorithms), but it seems that the “Six Thinking Hats” method is concerned with only the secondary aspect of problem solving: time efficiency. Precise evaluation of the solution is of lesser importance.

Thus the psychological approach looks like the opposite extreme of the technical approach in the spectrum of problem-solving methodologies, as the former focuses on organizational issues of “thinking” for general problems, rather than specific techniques on how to arrive at a solution. Furthermore, the psychological approach uses natural language to describe its mechanisms, whereas the technical approach uses mathematics as a problem solving language.

Which of these two approaches (technical versus psychological) should be used in the real world? Well, each of these two approaches has a crowd of enthusiasts and supporters; however, it seems that the technical approach is based on the solid fundamentals of science. Even some philosophers and psychologists tend to agree. One of the pearls of wisdom taught by Anthony de Mello in his famous book, “One Minute Wisdom,” was the following observation:

“Weeks later, when a visitor asked him what he taught his disciples, he said, ‘To get their priorities right: Better have the money than calculate it; better have the experience than define it.’”

It is easy to extend the above statements (while preserving their spirit) by stating that:

It is better to know how to solve problems than to have the ability to talk about them!

On the other hand, representatives of the technical approach admit that “although mathematics is a cornerstone of Operations Research, one should not ‘jump’ into using mathematical models until simpler approaches have been explored. In some cases, one may encounter a ‘commonsense’ solution through simple observations. Indeed, since the human element invariably affects most decision problems, a study of the psychology of people may be key to solving the problem,” (Hamdy A. Taha, “Operations Research: An Introduction”). These comments are followed by a delightful example, where the problem of slow elevator service in a large office building was solved not by the use of mathematical queuing analysis or simulation, but by installing full-length mirrors at the entrance to the elevators: the complaints disappeared as people were kept occupied watching themselves (and others) while waiting for the elevator!
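To make the precision of the technical approach concrete, the McBurger exercise quoted earlier can be worked through with the economic order quantity formula, y = sqrt(2KD/h). The sketch below is an illustration under stated assumptions rather than Taha's own solution: it assumes the classic model with constant demand and no shortages, so that the cost per time unit is the setup cost KD/y plus the holding cost hy/2 (purchasing cost is constant and omitted), and it assumes a 7-day week so that the weekly demand of 300 lb becomes 300/7 lb per day. The function names are hypothetical.

```python
import math


def eoq(setup_cost, demand_rate, holding_cost):
    """Economic order quantity y = sqrt(2 * K * D / h)."""
    return math.sqrt(2.0 * setup_cost * demand_rate / holding_cost)


def cost_per_time_unit(order_qty, setup_cost, demand_rate, holding_cost):
    """Setup plus holding cost per time unit for a given order quantity.

    Assumes constant demand, no shortages, and an average inventory
    of order_qty / 2 (the classic EOQ setting).
    """
    return (setup_cost * demand_rate / order_qty
            + holding_cost * order_qty / 2.0)


# Data from the exercise; the per-day conversion assumes a 7-day week.
K = 20.0          # dollars per order
D = 300.0 / 7.0   # lb per day
h = 0.03          # dollars per lb per day

# (a) Present policy: order 300 lb once a week.
current_weekly = 7.0 * cost_per_time_unit(300.0, K, D, h)

# (b) Optimal policy under the EOQ model.
y_opt = eoq(K, D, h)
cycle_days = y_opt / D
optimal_weekly = 7.0 * cost_per_time_unit(y_opt, K, D, h)

# (c) Weekly saving from switching policies.
saving = current_weekly - optimal_weekly

if __name__ == "__main__":
    print(f"(a) current policy: ${current_weekly:.2f} per week")
    print(f"(b) order {y_opt:.1f} lb every {cycle_days:.2f} days, "
          f"${optimal_weekly:.2f} per week")
    print(f"(c) saving: ${saving:.2f} per week")
```

Under these assumptions the present policy costs $51.50 per week, while the model suggests ordering roughly 239 lb about every 5.6 days for about $50.20 per week, a difference of about $1.30 per week. The point is the same one the text makes: the answer is specific and can be evaluated precisely.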


Clearly, there are many merits in concepts related to critical, vertical, lateral, and other thinking paradigms. However, mathematics, the queen of all sciences, must remain the universal language of problem solvers. Otherwise, as we saw, there is a danger of making imprecise statements, and what is worse, there is a danger of finding (and implementing) poor solutions!

Numerous mathematicians have put a lot of effort into finding a middle ground between the technical and psychological approaches to problem solving. The best known work, without a doubt, is Gyorgy Polya’s “How to Solve It,” which stands out as one of the most important contributions to problem-solving literature of the twentieth century. Even now, as we have moved into the new millennium, the book continues to be a favorite among teachers and students for its instructive methods. Other works include “I Hate Mathematics,” written by Marilyn Burns, which is full of tips and methods for solving problems.

Current State

Our new course (which aims at getting engineering students to think about how to frame and solve unstructured problems) has been approved by the University of Adelaide for the faculty of engineering, computer science, and mathematics (altogether seven schools). The course will be offered in two versions: (a) a full-semester course and (b) a unit within a general course (e.g., Introduction to Engineering). Many other universities are in the preliminary phase of introducing such a course. All teaching materials (PowerPoint slides, assignments) are being prepared. The new textbook (“Puzzle Based Learning: An Introduction to Critical Thinking, Mathematics, and Problem Solving”) will be available in July 2008.

We believe that besides being a lot of fun, the puzzle-based learning approach will also do a remarkable job of convincing engineering students that (a) science is useful and interesting, (b) the basic courses they take are relevant, (c) mathematics is not that scary (no need to hate it!), and (d) it is worthwhile to stay in school, get a degree, and move into the real world, which is loaded with interesting problems (problems perceived as real world puzzles…). These points are important, as most students are unclear about the significance of the topics covered during their studies. Oftentimes, they do not see a connection between the topics taught (e.g. linear algebra) and real world problems, and they lose interest with predictable outcomes.

Conclusions

This paper has presented some thoughts on how Larry Fogel has impacted the authors’ lives in a variety of different ways. However, this paper would not be complete without an additional observation. Larry Fogel was one of the few scientists who also created a business to implement his ideas in the real world. Indeed, the authors of this paper have done the same, by starting and selling out of a company in the United States, and a few years later establishing another company in Australia. The authors have described their business experiences in their recent book: “Winning Credibility: A guide for building a business from rags to riches,” (see www.WinningCredibility.com).

References

[1] T. Baeck, D. Fogel, and Z. Michalewicz (Editors), Handbook of Evolutionary Computation, joint publication of Oxford University Press and Institute of Physics, New York and London, 1997.
[2] M. Burns, I Hate Mathematics, Little, Brown, and Company, New York, 1975.
[3] C.A.C. Coello, A.A. Van Veldhuizen, and G.B. Lamont, Evolutionary Algorithms for Solving Multi-Objective Problems, Kluwer Academic, 2002.
[4] K. Collins, J. Goodnight, M. Hagström, and J. Davis, The Future of Business Intelligence: Four Questions, Four Views, SASCOM, First quarter, 2007.
[5] K. Deb, Multi-Objective Optimization using Evolutionary Algorithms, Wiley, 2001.
[6] E. de Bono, Six Thinking Hats, MICA Management Resources, New York, 1999.
[7] A. de Mello, One Minute Wisdom, Doubleday, New York, 1986.
[8] V. Dhar and R. Stein, Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
[9] A. Fisher, Critical Thinking: An Introduction, Cambridge University Press, Cambridge, 2001.
[10] L.J. Fogel, A.J. Owens, and M.J. Walsh, Artificial Intelligence through Simulated Evolution, Wiley, New York, 1966.
[11] M. Gardner, Entertaining Mathematical Puzzles, Dover Publications, New York, 1961.
[12] M. Gardner, My Best Mathematical and Logic Puzzles, Dover Publications, New York, 1994.
[13] D. Huff, How to Lie with Statistics, W. W. Norton, New York, 1993.
[14] Y. Jin and J. Branke, “Evolutionary optimization in uncertain environments - A survey,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 3, pp. 303–317, June 2005.
[15] D. Loshin, Business Intelligence: The Savvy Manager’s Guide, Morgan Kaufmann, 2003.
[16] S. Makridakis, S.C. Wheelwright, and R.J. Hyndman, Forecasting: Methods and Applications, Wiley, 1998.
[17] Z. Michalewicz and D.B. Fogel, How to Solve It: Modern Heuristics, 2nd edition, Springer, Berlin, 2004.
[18] M. Michalewicz and Z. Michalewicz, Winning Credibility: A guide for building a business from rags to riches, 2nd edition, Hybrid Publishers, Australia, 2007.
[19] Z. Michalewicz and M. Michalewicz, Puzzle Based Learning: An Introduction to Critical Thinking, Mathematics, and Problem Solving, Hybrid Publishers, Melbourne, 2008.
[20] Z. Michalewicz, M. Schmidt, M. Michalewicz, and C. Chiriac, “A Decision-support system based on computational intelligence: A case study,” IEEE Intelligent Systems, vol. 20, no. 4, pp. 44–49, 2005.
[21] Z. Michalewicz, M. Schmidt, M. Michalewicz, and C. Chiriac, Adaptive Business Intelligence, Springer, Berlin, 2007.
[22] L.T. Moss and S. Atre, Business Intelligence Roadmap, Addison Wesley, 2003.
[23] J.A. Paulos, Innumeracy: Mathematical Illiteracy and Its Consequences, Hill and Wang, New York, 1988.
[24] G. Polya, How to Solve It: A New Aspect of Mathematical Method, Princeton University Press, Princeton, 1945.
[25] H.A. Taha, Operations Research: An Introduction, 7th edition, Pearson Education, Upper Saddle River, 2003.
[26] S.M. Weiss and N. Indurkhya, Predictive Data Mining, Morgan Kaufmann, 1998.
[27] E. Vitt, M. Luckevich, and S. Misner, Business Intelligence: Making Better Decisions Faster, Microsoft Press, 2002.
[28] I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd edition, Morgan Kaufmann, 2005.
[29] P. Winkler, Mathematical Puzzles: A Connoisseur’s Collection, A.K. Peters, Wellesley, 2004.

