Solving Mathematical Puzzles: A Challenging Competition for AI

Federico Chesani, Paola Mello, Michela Milano

AI Magazine, Fall 2017. Copyright © 2017, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602.

Recently, a number of noteworthy results have been achieved in various fields of artificial intelligence, and many aspects of the problem-solving process have received significant attention by the scientific community. In this context, the extraction of comprehensive knowledge suitable for problem solving and reasoning, from textual and pictorial problem descriptions, has been less investigated, but recognized as essential for autonomous thinking in artificial intelligence. In this work we present a challenge where methods and tools for deep understanding are strongly needed for enabling problem solving: we propose to solve mathematical puzzles by means of computers, starting from text and diagrams describing them, without any human intervention. We are aware that the proposed challenge is hard and difficult to solve nowadays (and in the foreseeable future), but even studying and solving only single parts of the proposed challenge would represent an important step forward for artificial intelligence.

Since its origin, the holy grail of artificial intelligence has been to understand the nature of intelligence and to engineer systems that exhibit such intelligence through vision, language, emotion, motion, and reasoning. In this context, AI researchers have always looked for challenges to push forward the limit of what computers can do autonomously and to measure the level of "intelligence" achieved. Competitions have been and are currently run on conversational behavior (for example, the Loebner prize¹), automatic control (for example, the International Aerial Robotics Competition² or the DARPA Grand Challenge on driverless cars³), cooperation and coordination in robotics (for example, the RoboCup⁴), logic reasoning and knowledge (for example, the CADE ATP System competition for theorem provers⁵), and natural language (for example, the EVALITA⁶ competition for the Italian language). Historically, games have also raised the interest of the AI community: a number of competitions are still being held nowadays (for example, the World Computer Chess Championship,⁷ the Mario Championship⁸ [Togelius et al. 2013] and its successor, the Platformer Competition,⁹ and the General Game Playing competition¹⁰). For a high-level overview of the field of AI in games see the paper by Yannakakis and Togelius (2014).

Some of these challenges have indeed brought many insights and advancements in various artificial intelligence fields. We mention in the following three of them, notable for the achieved results, as well as for their impact on the media and the consequent dissemination of the AI discipline to the general audience.

In the last century, a famous challenge of AI concerned the game of chess, considered a symbol of complexity, problem solving, and strategic ability. As is widely known, the competition between humans and the Deep Blue computer was definitively won by computers in 1997, when the world chess champion Garry Kasparov was defeated. A key lesson learned from Deep Blue's strength is that efficient heuristic search can be much more effective than sophisticated reasoning-guided search. In fact, heuristic search was so successful that it led Kasparov to exclaim, "I could feel, I could smell, a new kind of intelligence across the table" (Time Magazine).¹¹ Despite a very good result in terms of problem solving, the search-intensive nature of the solution approach (in terms of number of configurations explored, memory, and processing power) was in fact the main criticism of the experiment. Noam Chomsky commented that Deep Blue defeating Kasparov in chess was "as interesting as a bulldozer that can win the Olympic weight-lifting competition."¹² Very recently, computers won against humans at the game of Go, famous for being one of the last games where humans were still better players. In March 2016, the AlphaGo (Silver et al. 2016) prototype by Google won¹³ against Lee Sedol, one of the world's best players. The distinctive feature of AlphaGo is the adoption of deep learning techniques.

A second challenge is the Robot Soccer World Cup (RoboCup),¹⁴ the international robotics competition and challenge launched in 1997. The official goal of the RoboCup challenge is the following:

"By the middle of the 21st century, a team of fully autonomous humanoid robot soccer players shall win a soccer game, complying with the official rules of FIFA, against the winner of the most recent World Cup."¹⁵

Even if we are still far from reaching such an ambitious goal, the RoboCup challenge has promoted AI techniques in robotics, planning, self-adaptive systems, and machine vision, showing that even extremely ambitious and visionary challenges can bring huge benefits to a research field by approaching the goal step by step.

The third challenge is about Watson,¹⁶ an AI system developed by the IBM research team. Watson was originally developed to answer questions on a quiz show called Jeopardy. In 2011, Watson competed against former human winners and won. Watson can be considered a very advanced question-answering system that applies and integrates AI techniques such as natural language processing (NLP), knowledge extraction and representation, automated reasoning, and machine learning. The system is able to access "millions of pages" of structured and unstructured content in a very efficient way, thanks to the massive use of parallel processing. A huge amount of knowledge is learned, processed, and used together with a number of inference tasks, to perform question answering in an effective and efficient way. However, how to integrate more complex problem-solving capabilities and reasoning techniques is still a matter of research.

The challenge we propose here goes in the direction of extracting specific knowledge from general text and diagrams that is useful for reasoning and problem solving, and can be stated as follows:

By the middle of the 21st century, (a team of) fully autonomous agent(s) shall win a mathematical puzzle competition against primary school students, winners of the most recent competitions.

Mathematical puzzles are recreational games where a single human player is challenged with a problem, described by text and diagrams. To solve them, a human player uses understanding and intuition, as well as common sense, simple logical and mathematical or geometry knowledge, causal relations, and many other reasoning-related capabilities. The main research question can then be formulated as "What are the requirements, functionalities, and main software components needed for autonomously solving a mathematical puzzle, with no human intervention?" The focus is on deep reasoning featuring (1) the extraction of comprehensive knowledge from multimodal descriptions (texts, diagrams, sound and speech, gesture, and others); (2) the ability to determine both the proper model and the corresponding reasoning capability; and (3) the capability of effectively solving the problem (possibly with feedback to the two previous steps). All these steps require considerable integration of many artificial intelligence techniques such as natural language and diagram understanding, problem solving and search, modeling, and machine learning. In particular, recent advances in machine learning open new research avenues for deep reasoning, as they enable generalization over manually collected knowledge.

Recently, a number of similar challenges have been proposed. We mention here the Aristo Project (Clark 2015), a challenge with the goal of having the computer pass elementary school math and science exams. Natural language comprehension and diagram understanding are required as fundamental steps, together with some form of inference and algebraic or mathematical solving techniques: for this reason, the Aristo challenge is closely related to our proposal. We will discuss Aristo and other challenges extensively in the related works and challenges section. While the large majority of existing systems that provide end-to-end problem solvers stick to one specific solution method (for example, mathematical equations, logic), our challenge is broader, as the choice of the appropriate model and solution method is considered a fundamental capability.

We are aware that the proposed challenge is hard and difficult to solve nowadays, but we strongly believe that even studying and solving only single parts of the problem would bring important steps forward in artificial intelligence. In addition, on the road to agent autonomy, it would be interesting to study which level of human intervention and interaction with the machine is needed to effectively collaborate to solve the problem.

Mathematical Puzzles

Recreational mathematics is a powerful source of inspiration. Michel Criton said:

It is a tradition that comes to us with a history of [...]

[...]rial can be found on the web, where a large instance set of mathematical and logic puzzles is available.

Challenge Description

In a computer-aided problem-solving process, there is always a substantial human intervention that enables the encoding of a problem described by text and diagrams in a model and a solution algorithm. Human intervention is essential for identifying problem components (decision variables, constraints,
