The Power of Team Exploration: Two Robots Can Learn Unlabeled Directed Graphs

Michael A. Bender*
Aiken Computation Laboratory, Harvard University
Cambridge, MA 02138
[email protected]

Donna K. Slonim†
MIT Laboratory for Computer Science, 545 Technology Square
Cambridge, MA 02139
[email protected]

September 6, 1995

Abstract

We show that two cooperating robots can learn exactly any strongly-connected directed graph with n indistinguishable nodes in expected time polynomial in n. We introduce a new type of homing sequence for two robots, which helps the robots recognize certain previously-seen nodes. We then present an algorithm in which the robots learn the graph and the homing sequence simultaneously by actively wandering through the graph. Unlike most previous learning results using homing sequences, our algorithm does not require a teacher to provide counterexamples. Furthermore, the algorithm can efficiently use any additional information available that distinguishes nodes. We also present an algorithm in which the robots learn by taking random walks. The rate at which a random walk on a graph converges to the stationary distribution is characterized by the conductance of the graph. Our random-walk algorithm learns in expected time polynomial in n and in the inverse of the conductance, and is more efficient than the homing-sequence algorithm for high-conductance graphs.

* Supported by NSF Grant CCR-93-13775.
† Supported by NSF Grant CCR-93-10888 and NIH Grant 1 T32 HG00039-01.

1 Introduction

Consider a robot trying to construct a street map of an unfamiliar city by driving along the city's roads. Since many streets are one-way, the robot may be unable to retrace its steps. However, it can learn by using street signs to distinguish intersections. Now suppose that it is nighttime and that there are no street signs. The task becomes significantly more challenging.

In this paper we present a probabilistic polynomial-time algorithm to solve an abstraction of the above problem by using two cooperating learners. Instead of learning a city, we learn a strongly-connected directed graph G with n nodes. Every node has d outgoing edges labeled from 0 to d-1. Nodes in the graph are indistinguishable, so a robot cannot recognize whether it is placed on a node that it has previously seen. Moreover, since the graph is directed, a robot is unable to retrace its steps while exploring.

For this model, one might imagine a straightforward learning algorithm with a running time polynomial in a specific property of the graph's structure, such as cover time or mixing time. Any such algorithm could require an exponential number of steps, however, since the cover time and mixing time of directed graphs can be exponential in the number of nodes. In this paper, we present a probabilistic algorithm for two robots to learn any strongly-connected directed graph in O(d^2 n^5) steps with high probability.

The two robots in our model can recognize when they are at the same node and can communicate freely by radio. Radio communication is used only to synchronize actions. In fact, if we assume that the two robots move synchronously and share a polynomial-length random string, then no communication is necessary. Thus, with only minor modifications, our algorithms may be used in a distributed setting.
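To make the model concrete, the following minimal sketch (in Python, with invented names; it is not code from the paper and implements only the environment, not the learning algorithm) represents an unlabeled digraph with out-degree d and the two observations the robots actually receive: the result of following an edge label, and whether the two robots currently occupy the same node.

    import random

    class UnlabeledDigraph:
        """A strongly-connected digraph on n indistinguishable nodes.
        Node identities exist only inside the simulator; a robot that moves
        observes nothing about where it lands."""
        def __init__(self, n, d, seed=0):
            rng = random.Random(seed)
            self.d = d
            # succ[v][a] = node reached from v by the edge labeled a (0..d-1).
            self.succ = [[rng.randrange(n) for _ in range(d)] for _ in range(n)]
            # Route the label-0 edges around a random cycle through all nodes
            # so the graph is guaranteed to be strongly connected.
            order = list(range(n))
            rng.shuffle(order)
            for i, v in enumerate(order):
                self.succ[v][0] = order[(i + 1) % n]

    class TwoRobots:
        """The only feedback the pair receives is whether they are co-located."""
        def __init__(self, graph, start=0):
            self.g = graph
            self.pos = [start, start]          # both robots start together

        def move(self, robot, edge_label):
            self.pos[robot] = self.g.succ[self.pos[robot]][edge_label]

        def same_node(self):
            return self.pos[0] == self.pos[1]  # the one observable quantity

    # Example: robot 1 waits while robot 0 repeatedly follows edge label 0;
    # co-location is detected exactly when robot 0 completes the hidden cycle.
    g = UnlabeledDigraph(n=6, d=2)
    pair = TwoRobots(g)
    steps = 0
    while True:
        pair.move(0, 0)
        steps += 1
        if pair.same_node():
            break
    print("robot 0 returned to its partner after", steps, "steps")

The waiting-robot trick above is only meant to show what co-location detection buys; the paper's actual algorithms coordinate the two robots through homing sequences and random walks rather than by leaving one robot parked.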
Our main algorithm runs without prior knowledge of the number of nodes in the graph, n, in time polynomial in n. We show that no probabilistic polynomial-time algorithm for a single robot with a constant number of pebbles can learn all unlabeled directed graphs when n is unknown. Thus, our algorithms demonstrate that two robots are strictly more powerful than one.

1.1 Related Work

Previous results showing the power of team learning are plentiful, particularly in the field of inductive inference (see Smith [Smi94] for an excellent survey). Several team-learning papers explore the problem of combining the abilities of a number of different learners. Cesa-Bianchi et al. [CBF+93] consider the task of learning a probabilistic binary sequence given the predictions of a set of experts on the same sequence. They show how to combine the prediction strategies of several experts to predict nearly as well as the best of the experts. In a related paper, Kearns and Seung [KS93] explore the statistical problems of combining several independent hypotheses to learn a target concept from a known, restricted concept class. In their model, each hypothesis is learned from a different, independently drawn set of random examples, so the learner can combine the results to perform significantly better than any of the hypotheses alone.

There are also many results on learning unknown graphs, but most previous work has concentrated on learning undirected graphs or graphs with distinguishable nodes. For example, Deng and Papadimitriou consider the problem of learning strongly-connected, directed graphs with labeled nodes, so that the learner can recognize previously-seen nodes. They provide a learning algorithm whose competitive ratio versus the optimal time to traverse all edges in the graph is exponential in the deficiency of the graph [DP90, Bet92]. Betke, Rivest, and Singh introduce the notion of piecemeal learning of undirected graphs with labeled nodes. In piecemeal learning, the learner must return to a fixed starting point from time to time during the learning process. Betke, Rivest, and Singh provide linear algorithms for learning grid graphs with rectangular obstacles [BRS93], and with Awerbuch [ABRS95] extend this work to give nearly-linear algorithms for general graphs.

Rivest and Schapire [RS87, RS93] explore the problem of learning deterministic finite automata whose nodes are not distinguishable except by the observed output. We rely heavily on their results in this paper. Their work has been extended by Freund et al. [FK+93] and by Dean et al. [DA+92]. Freund et al. analyze the problem of learning finite automata with average-case labelings by the observed output on a random string, while Dean et al. explore the problem of learning DFAs with a robot whose observations of the environment are not always reliable. Ron and Rubinfeld [RR95] present algorithms for learning "fallible" DFAs, in which the data is subject to persistent random errors. Recently, Ron and Rubinfeld [RR95b] have shown that a teacher is unnecessary for learning finite automata with small cover time.

In our model a single robot is powerless because it is completely unable to distinguish one node from any other. However, when equipped with a number of pebbles that can be used to mark nodes, the single robot's plight improves. Rabin first proposed the idea of dropping pebbles to mark nodes [Rab67].
This suggestion led to a body of work exploring the searching capabilities of a finite automaton supplied with pebbles. Blum and Sakoda [BS77] consider the question of whether a finite set of finite automata can search a 2- or 3-dimensional obstructed grid. They prove that a single automaton with just four pebbles can completely search any 2-dimensional finite maze, and that a single automaton with seven pebbles can completely search any 2-dimensional infinite maze. They also prove, however, that no collection of finite automata can search every 3-dimensional maze. Blum and Kozen [BK78] improve this result to show that a single automaton with two pebbles can search a finite, 2-dimensional maze. Their results imply that mazes are strictly easier to search than planar graphs, since they also show that no single automaton with pebbles can search all planar graphs.

Savitch [Sav73] introduces the notion of a maze-recognizing automaton (MRA), which is a DFA with a finite number of distinguishable pebbles. The mazes in Savitch's paper are n-node 2-regular graphs, and the MRAs have the added ability to jump to the node with the next higher or lower number in some ordering. Savitch shows that maze-recognizing automata and log n space-bounded Turing machines are equivalent for the problem of recognizing threadable mazes (i.e., mazes in which there is a path between a given pair of nodes).

Most of these papers use pebbles to model memory constraints. For example, suppose that the nodes in a graph are labeled with log n-bit names and that a finite automaton with k log n bits of memory is used to search the graph. This situation is modeled by a single robot with k distinguishable pebbles. A robot dropping a pebble at a node corresponds to a finite automaton storing the name of that node. In our paper, by contrast, we investigate time rather than space constraints. Since memory is now relatively cheap but time is often critical, it makes sense to ask whether a robot with any reasonable amount of memory can use a constant number of pebbles to learn graphs in polynomial time.
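As a rough illustration of this pebble/memory correspondence (a sketch with hypothetical names, not taken from any of the cited papers), a robot with k distinguishable pebbles can test whether it has returned to a previously marked node by dropping a pebble there and later checking for it, which gives it exactly the information a k log n-bit memory of stored node names would provide.

    class PebbleRobot:
        """A single robot with k distinguishable pebbles on an unlabeled digraph.
        succ[v][a] is the node reached from v by edge label a; the robot never
        sees v itself, only which of its own pebbles lie at its current node."""
        def __init__(self, succ, k, start=0):
            self.succ = succ
            self.k = k
            self.pos = start
            self.pebble_at = {}                # pebble id -> node where it sits

        def move(self, edge_label):
            self.pos = self.succ[self.pos][edge_label]

        def drop(self, pebble):
            # Dropping a pebble plays the role of storing the node's name.
            assert pebble in self.pebble_at or len(self.pebble_at) < self.k
            self.pebble_at[pebble] = self.pos

        def pebbles_here(self):
            # Observing which pebbles are present is the analogue of comparing
            # a stored name against the current node's name.
            return [p for p, v in self.pebble_at.items() if v == self.pos]

    # Example: mark the start node, take two steps along edge label 0, and
    # check whether we are back at the marked node (label 0 forms a 2-cycle).
    succ = [[1, 0], [0, 1]]                    # 2 nodes, out-degree 2
    r = PebbleRobot(succ, k=1)
    r.drop(0)
    r.move(0)
    r.move(0)
    print("back at the marked node:", r.pebbles_here() == [0])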