Wang Ting, Ryad Chellali, Yi Yang, Wei Mingzhu, Qin Wen

Journal of Computer and Communications, 2014, *, ** Published Online **** 2014 in SciRes. http://www.scirp.org/journal/jcc http://dx.doi.org/10.4236/jcc.2014.*****

A control strategy for multi-robot system navigating in a dynamic environment* Wang Ting, Ryad Chellali, Yi Yang, Wei Mingzhu, Qin Wen (Affiliation): College of Electric Engineering and Control Science, Nanjing Tech University, Nanjing, China Email: [email protected]

Received **** 2014

Copyright © 2014 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract

This paper introduces a new control strategy for heterogeneous multi-robots systems dedicated to industrial logistic setups. This control strategy is based on both distributed intelligence and machine learning and involves three parts: the rigid formation controller, the perception system and the path planner. Our controller is event-based and thus its control- coordination strategy can be self-adaptive and applied to real dynamic environment. During the navigating process, the multi-robots system derive the environment model, perform the path planning process that guaranties both the transportation constraints and the obstacle avoidance. For the validation, both simulation and real robot experiments are performed. The results show that the developed control strategy can be well used for realistic logistics applications .

Keywords

Multi-robots system; Distributed intelligence; Machine learning; Dynamic Environment

1. Introduction (Heading 1) Since 1980s, researcher interests in multi-robots systems (MRS) are increasing as it becomes clear that they offer more benefits compared single robot systems [1],[2]: MRS are more robust and they support higher mobility. Indeed, their inherent redundancy reduces the effects of individual malfunctions while having a high number of degrees of freedom for the full system. These advantages are counter-balanced by a higher complexity, i.e., one has more challenging control and path planning issues to solve. In addition to the classical control/oath planning problems, MRS should consider the interactions among robots' group under four

How to cite this paper: Author 1, Author 2 and Author 3 (2014) Paper Title. *********, *, **-**. http://dx.doi.org/10.4236/jcc.2014.***** Author’s nameAuthor’s name

umbrellas: collective, cooperative collaborative and coordinative behaviors[13]. For some applications, only part these behaviors are necessary. For the more general case, all of them are needed. In [3] for instance, an MRS is used for exploration task and the LUNARES (Lunar Exploration Scenario) system is used to gather information about its environment. In logistics applications for instance, all behaviors are necessary: the group of robots handles an object and move it for a given area to another area. In such cases, it is necessary to solve a multi-objective problem: 1) deriving the right formation (relative placement of robots) allowing transporting rigid objects, 2) performing accordingly path planning, including obstacles avoidance for individual robots. For more MRS generality, one has to consider heterogeneous groups rather than homogeneous ones. Indeed, to guaranty more mobility (e.g. more solutions to the path planning problem), MRS systems include robots with different types of locomotion. In late 1980s, several projects about heterogeneous robotic systems have been presented. The first project was ACTRESS (Actor-based Robot and Equipment’s Synthetic System) project [5], [6], [7], which is inspired by the Universal Modular ACTOR Formalism [8]. The second project was GOFFER [9], [10]. In GOFFER, a central task planning and scheduling system (CTPS) can communicate with all robots and can get a global view. Likewise, the ALLIANCE architecture was developed by Parker in [11], [12] to study the cooperation in a heterogeneous, small-to-medium-sized team of largely independent, loosely coupled robots. More recently, an UAV was combined with wheeled robots to be used in Search&Rescue operations. The UAV gathers visual information to build filed environment models thanks to the acquired top views and distributes it to the ground robots and let them perform the search task. As for stepping from one to more robots, involving heterogeneity in MRS has a cost: the design of the control strategy is harder due to the multiplicity of locomotion. Indeed controlling simultaneously a walking robot together with a wheeled robot to achieve a given formation requires combining two different state spaces for which solutions are not straightforward (moving a non-holonomic system and maintaining the equilibrium of a biped). To cope with these difficulties, we addressed the issue of unifying the four previous requirements to obtain a single formalism allowing solving a multi-objective optimization problem where both robots as a group and as individuals should obey to simultaneous constraints dealing with rigid formation and obstacle avoidance. In this paper, we present an application related to logistics. Our goal is allow the MRS transporting different kinds of objects from one point to another point in unknown and varying environments. The size, the shape and the weight of the transported objects may vary. As well, the working space is partially unknown and may vary itself. In this perspective, we used machine-learning techniques to make the MRS self-adaptive and reacting effectively and safely to any unexpected change in the working environment. The contribution is organized as follows. Section 2 introduces the control strategy. Section 3 presents the rigid formation control of multi-robots system. Section 4 details the path planning algorithm as well as the perception system. Simulations and real robot experiments are given in section 5. Conclusions and future work were given in the section 6.

2. System Overview 2.1. Task description In real transportation, it usually needed to carry large and heavy loads in many situations, both in indoor and outdoor environments. In such cases, it is crucial to maintain the rigid formation of the multi-robots system in order to keep the load not falling down during the transport. In addition, the obstacle avoidance should be solved for individual robots as well as the multiple robots as a whole. Accordingly, the MRS should perform tasks as

*Special description of the title. (dispensable) Author’s name

depicted in Fig.1. It consists of maintaining rigid formation to carry objects. Meanwhile, the perception system establishes an occupation grid map allowing planning safe and feasible paths. The perception system is responsible for gathering the image and information of the current environment. In the paper, a camera or a flying robot materializes the perception system. With the occupancy grid and given the final destination, an optimal path is determined thanks to the Q-learning algorithm.

Figure 1.Schematic diagram of the control strategy. Figure 2.Even-based coordination strategy and Petri-net.

2.2. Perception module In order to get environment model, we developed a computer vision based module to extract changes. Our system compares successive images taken by a UAV (or a moveable camera). After background substraction, we obtain a binary matrix E(i,j).

(1) In which 1 represents obstacles and 0 is free path. Here, in order to adapt to our experimental robot (KheperaIII), we reshape the size of matrix E to fit to the real dimension of the working space (see in Fig.7). In this example, the environment is divided into 100 states where each state is a square of 30cmx30cm.

2.3. Rigid Formation Control of the Multi-robots Systems

The rigid formation control method can be simply expressed as follows: a group of robots learns the unknown environment as an entirety, individually and synchronously moving and avoiding obstacles. In the process, there are no feedback or communications among homogeneous robots. According to the previous definition given by Parker [13], we can consider this rigid formation as a swarm robotics. Moreover, from an industrial point of view, the method may be very interesting to apply to the practical delivery in the indoor or the outdoor environment. In some situations, to carry some special objects, the factory needs to design specific vehicles relying on constraints coming from the objects. It is both time consuming and waste power. The use of a multi-robots system is a very flexible solution to solve this kind of problem. The model used to describe the formation is illustrated in Fig.3. Robot 1 is a reference robot in the formation in Fig.3. The reference robot is not necessarily a leader robot. Both the position and the orientation of the reference robot correspond to both the position and the orientation of Author’s nameAuthor’s name

the formation according to an absolute frame. The position of the robot i is defined according to the position of robot i-1. Two parameters are used to specify the position. There are the relative distance between two robots (e.g. and in Fig.3) and the absolute angle between the orientation of the robot and a normal direction (e.g. and in Fig.3). The rigid formation means that the relative distance between two robots is a constant value and all of the robots have the same orientation. It must be noticed that the orientation of the formation () is independent of the angle used to describe the relative orientation between two robots (= 0◦ in Fig.3(a), = 45◦ in Fig.3(b) ).

(a) (b)

Figure 3.Schematic description of a rigid (b)). formation with three two-wheeled robots. The Figure 4. Description of the control strategy. (a) relative distance between two robots is a Schematic description of 8 positions of a constant value and all robots are the same formation comprising three robots. (b) Example orientation (= 0° in Fig. 3 (a), = 45° in Fig. 3 of actions used for control of a formation. The definition of the formation (see in Fig. 4 (a)) and action (see in Fig.4 (b)) are abstracted from the physical movements in real environment (more details can be found in our published work [15],[16]). It is greatly convenient and reduced the workload of research from the theory to the practical research. It is a common model and can be used to the general study for the team robot with a rigid formation. The definition can be applied to any environment and can be amplified to more formations and actions depending on the specific requirements. Generally, in all previous publications about formation control, researches focused on the control of all robots in order to maintain the formation. In this paper, we still focus on how to control the formation and consider the formation as a virtual robot. Furthermore, assuming that there are virtual rigid links between robots, all of robots perform the same task in synchronous manner. As a result, multi-robots team forming a rigid formation can move between two different points and avoid obstacles. More information about the control method used to the trajectory control and the position control of the single non-holonomic mobile robot can be found in [14]. It should be noticed that the control is studied from the conceptualization aspect. The multi-robots system within a rigid formation can be seen as a virtual single robot and applied the ANFIS (Adaptive Neuro Fuzzy Inference System) rigid formation control. The control of a rigid formation has two forms: position control and changing the formation control.

The position control of rigid formation is in Fig. 5. The three robots all use the position control separately at the same time. The inputs are the coordinate differences between the initial position A and final position B respectively along with the X-axis and Y-axis. The output is the speed of left and right wheel. Robots maintain the rigid formation by adapting parameters of ANFIS. In the paper, the multi-robots system is moving within a rigid line shape formation. Author’s name

Figure 5. Position control of formation of robot team.

In some cases, as robots encounter obstacles, they need to change their formation in order to go across the current environment. An example is shown in Fig. 8, the diagram displayed that multi-robots system transformed their formation from horizontal formation to final formation (N (-45°)). Three robots are moving simultaneously as well as keeping the rigid formation. Taking robot 1 as the reference robot, it rotates around itself from initial N (-180°) to the final N (-45°) (N is the direction north of the robot). G1 and G2 are respectively trajectories of robot 2 and robot 3. G1 and G2 are quarter sections of arc. Their common center is the robot 1’s geometry center. Radii are respectively 0.3 cm and 0.6 cm, which are distances between their geometry center and the geometry center of reference robot. Synchronized with the reference robot, at first, robot 2 and robot 3 rotate from initial N (-180°) to the opposite direction N (0°) by the orientation ANFIS control. Then they track their trajectories G1 and G2 separately. Finally, robots arrive at the final formation at the same time. The example in Fig. 6 showed a clockwise rotation action. The anti-clockwise rotation action is similar with the process of clockwise rotation action. In the process, the reference robot rotated from beginning to the end. The other two robots firstly rotate to the direction opposite to the formation (clockwise or anti- clockwise). Then they continue to follow their arc trajectories. Robots move synchronize in the movement. The transport application by the rigid formation control can be found in [15], [16].

Figure 6. Diagram of changing formation control. (a) Three robots rotate around themselves synchronous from -180°to 0°; (b)The reference robot rotate itself from 0° to -45°, the other robots follow respectively the trajectory G1 and G2.

2.4 Q-Learning Based Path Determination The full state information should contain the position of the reference robot as well as the kind of the formation. Consequently, the size of the set S is equal to 10 × 10 × 8(10×10 is the environment matrix, and 8 is the possible formations in each state). The purpose of path planning is to look for the succession of actions and corresponding states in order to move the formation from the point A to another point B. These sequence of actions and states are the optimal result of path planning. It must ensure that any robot in the multi-robots system will not collide with the obstacles.

In the paper, we use the Q-Learning (proposed by Watkins [17]) to build path planning of multi-robots system. The Q-Learning algorithm developed by Watkins is based on (2). Assume that the whole state space can be enumerated and stored in a memory.

(2) Author’s nameAuthor’s name

In Eq.(2), a set of triples (state_x, state_y, state_f) form a three-dimensional array. (state_x and state_y) indicate robot’s coordinate information, and the formation information is represented by (state_f). A is a random chosen action in the current S. The best action A chooses the one to get the max Q value and to reach the fastest to the goal point by the current formation and action A in the current state S, and the new state is the . The formation has two types of action, the translation action and the rotation actions. If the best action is 9 or 10 (see in Fig.6 (b)), the formation needs to transform clockwise or anti-clockwise. Otherwise, the formation moves directly with the action’s orientation.

In Eq (2), parameter can be chosen in [0, 1]. is the learning rate. And r indicates the rewards for the actions with respect to the current S. The regulations of rewards are: (1) If the selected action makes any robot in the rigid formation exceed the world, r = −100; (2) If the selected action makes any robot in the rigid formation meet obstacles, r = −100; (3) Otherwise, r is the distance between the position of standard robot and the position of final position.

2.5 Example of execution

An example illustrated in Fig.7, it introduced how the control strategy works. It is a model of the virtual environment used in the simulation. The heterogeneous multi-robots team must cross a wide unknown area from A to B. The “unknown” means that they cannot get the complete environmental information at the initial time step. Since the area is huge and unknown, the heterogeneous multi-robots team must pass the area steps by steps. Supposed that, in each time, they cross an area () which we called a “room”. In this virtual environment, the multi-robots system is composed of three KheperaIII (a kind of two-wheel non-holonomic robot), a humanoid robot Nao (http://www.aldebaran-robotics.com/), a camera (role as perception system) and a virtual robot supervisor (data processing). The environment model is decomposed into 21 rooms (see Fig. 7, the room is indicated by a two-dimension array, respectively along the axis-x and axis-y). Each room is divided into 100 states, and each state is a square with side of 30 centimeters. For example, the initial A locates in room (2, 2) and the final B locates in room (1, 7) (see Fig. 8). The total area of the environment is equal to.

Figure 7. Virtual environment of the multi-robot system.

Figure 8. Illustration of the path planning task carried out by the top camera and remote computer: start point (left), goal point (right). 3. Real robot experiment The real experiment consisted firstly to compute the path planning by the supervisor and secondly to control the rigid formation of multi-robots system. In the real experiment, we use two KhperaIII robots, a humanoid robot Nao played a role of local supervisor, a camera and a computer acted as a supervisor for remote control of both camera and robots. The camera is hanging on the 2 m height. The real environment (in Fig. 9 (a)) constitutes by a 1.5 m wide and 2.1 m long rectangular floor. Some paper used to act as obstacles. The environment comprises 35 states and each state is a 30 cm side square. The image processing executed in C code. The path planning and the robots control execute in Matlab. The connection between robots and computer is WIFI, and robots can receive real- time command transmission from the supervisor. 3.1. Environment and path planning The path planning of the unknown environment is displayed in Fig. 9 (b). The initial position (the green dot and circle) and final position (the blue dot and circle) respectively are [4; 2] and [1; 6]. In the experiment, multi-robots system maintained the vertical formation 2 from the beginning to the end. The initial position marked with green dot (the reference robot) and green circle. The final position marked with blue dot and blue circle.

Author’s name

(a) (b) Figure 9. (a) Unknown environment and (b) Path Planning results

3.2. Experiment results Let us suppose that the multi-robots system entered the current room k (see in Fig. 10 (a)) from the previous room k-1. They aim to go across the room k and enter into next room k+1. The process includes two parts. In the first part, Nao led two KheperaIII robots move ahead, because there is a free pass way in front of the multi- robots system on initial time step. In the second part, we put a barrier to block the pass way in front of the multi- robots system. Based on an event-based coordination strategy, HMRS can be self-adaptive to the sudden disturbance of the environment. The detailed process is as follows: (1) Nao explored the front area at each time step. As soon as it observed the obstacle, it enabled the top camera to send an image of current unknown environment. Then, through calculating the path planning by a supervisor, two KheperaIII robots (multi-robots system) apply ANFIS rigid formation control to avoid obstacles. Nao executed the corresponding actions according to the results of path planning. Finally, multi-robots system crossed the current room k as well as they arrived at the initial position of room k+1. The results and description of go forwards part and path planning part are as follows.

(a) (b)

Figure 10. Go forwards part, (a) Initial state of the experiment, (b) Multi-robots team goes forwards.

The pictures of the first part displayed in Fig. 10. At the beginning of the experiment, HMRS went into room k from room k-1. Nao led two KheperaIII robots (MRS) keep going ahead as there is a pass way. The initial state of experiment showed in Fig. 10 (a) and the segment of the first part showed in Fig. 10 (b).

After the short go forwards part, we set an obstacle in order to stop HMRS and removed it after few seconds (see in Fig. 11 (a)). In this case, as Nao updated the current image and detected the obstacle, it demanded the external camera to give the image of the current room K to the supervisor. Using Q learning, the optimal path and actions obtained by supervisor (see in Fig. 11 (b)). Multi-robots System was interrupted and exchanged commands with the supervisor. The supervisor switched the go forwards control to rigid formation control.

(a) (b)

(c) (d)

(e) (f)

Figure 11. Path planning part

The path planning demonstrated in Fig. 11(b). Actions are 1 1 3 3 3 for the rest part of environment. In the part, two KheperaIII robots (MRS) formed a vertical line rigid formation. By the ANFIS rigid formation control, MRS crossed the area without colliding with obstacles. After two continuous go forwards translational actions (see in Fig. 11 (b) and Fig. 11 (c)), multi-robots system turned left to finish the rest three left translational actions. Nao followed the MRS to turn left (see in the Fig. 11 (d) and Fig. 11(e)) after the two continuous forwards translational actions. The rest three actions illustrated respectively in Fig. 11 (d), Fig. 11 (e) and Fig. 11 (f). The final state of the experiment showed in Fig. 11 (f). They will continue to move in the room k+1.

4. Conclusion Comparing with existed control research of the heterogeneous multi-robots system, the paper proposed a control strategy based on the machine learning and an event-based coordination strategy. The obvious advantage of the strategy is that it makes the heterogeneous multi-robots team be self-adaptive to any disturbance of the current environment. Due to the coordination and cooperation among different types of robots, the HMRS can Author’s nameAuthor’s name

accomplish a common task by the unified distributed intelligence. Besides the simulation in a very widely virtual environment, the paper also validated the strategy with a real robots experiment. The strategy may be used to the practical logistic industry, such as environment exploring, transportation tasks and so on. In the future, the work will continue to focus on the adaptation of formation versus type of load aiming to apply in the logistic industry. References

[1] Arai. T, Pagello. E, Parker. L.E(2002) Editorial: Advances in multi-robot systems, IEEE Trans. Robot. Autom.18(5), 655–661. [2] Cao.Y, Fukunaga. A, Kahng. A(1997) Cooperative mobile robotics: Antecedents and directions, Auton. Robot. 4, 1–23. [3] Cordes. F, Ahrns, I, Bartsch. S, Birnsch ein. T, Dettmann. A, Estable. S, Haase. S, Hilljegerdes. J, Koebel. D, Planthaber. S, Roehr. T, Scheper. M, Kirchner. F. LUNARES (2010) lunar crater exploration with heterogeneous multi robot systems. Intelligent Service Robotics (6 December 2010), 1-29. [4] Parker, L. E.(2008) Multiple Mobile Robot System”. Handbook of Robotics, S. Bruno, K.S Oussama (Eds), Springer-Verlag Berlin Heidelberg, 921-941. [5] Asama. H, Ishida. Ozaki. Y, K., Habib. M. K., Matsumoto. A, Kaetsu. H, and Endo. I. (1992) A communication system between multiple robotic agents. In M. Leu, editor, Proc. the Japan U.S.A. Symposium on Flexible Automation, 647–654. [6] Alami. R, Fleury. S, Herrb. M, Ingrand. F, Robert. F (1998) Multi-robot cooperation in the MARTHA project, Robot Automation. Mag. 5(1), 36–47. [7] Ishida. Y, Endo. I, and Matsumoto A(1991) Communication and cooperation in an autonomous and decentralized robot system. In IFAC int. Symp. on Distributed Intelligent Systems, 299–304. [8] Hewitt. C, Bishop. P, Greif. I, Smith. B, Matson. T, and Steiger. R (1973) A universal modular actor formalism for artificial intelligence. In Proc. Intl. Joint Conf. Artificial Intelligence, 235–245. [9] Caloud. P, Choi. W, Latombe. J.C, Le Pape. C, Yim. M (1990) Indoor automation with many mobile robots, Proc. IEEE Int. Workshop Intell. Robot. Syst. (IEEE, Tsuchiura 1990), 67–72. [10] LePape. C. (1990) A combination of centralized and distributed methods for multi-agent planning and scheduling. In IEEE ICRA, 488–493. [11] Parker. L.E(1994) ALLIANCE: An architecture for fault tolerant, cooperative control of heterogeneousmobile robots, Proc. IEEE/RSJ/GI Int. Conf. Intell. Robot. Syst. (IEEE, Munich 1994) , 776–783. [12] Parker L.E. (1994) Heterogeneous multi-robot cooperation. Ph.D. Thesis, M.I.T. Department of Electrical Engineering and Computer Science, 1994. [13] Parker. L. E (2008) Distributed Intelligence: Overview of the Field and its Application in Multi-Robot Systems. Electrical Engineering, 2 (1), 5-14. [14] Wang. T, Gautero. F, Sabourin. C, Madani. K (2011) A Neural Fuzzy Inference based Adaptive Controller Using Learning Process for Nonholonomic Robots, International Work-conference on Artificial Neural Networks (IWANN 2011), CD-Rom Proceedings, 65-72. [15] Wang. T, Ramik. D. M, Sabourin. C, Madani. K. (2012) Intelligent systems for industrial robotics: application in logistic field, Industrial Robot: An International Journal, Vol. 39 N°3, 251 – 259. [16] Wang. T, Ramik. D.M, Sabourin. C and Madani. K. (2011) Machine-Learning for Heterogeneous Multi- Robots Systems in Logistic Application Frame, proceedings of the 14th International Conference on Climbing and Walking Robots and the Support Technologies for Mobile Machines (CLAWAR2011), paris, france, septemper 2011, 207-222. [17] Watkins. C. J. C. H.: Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, England.1989.