
Artificial Ecological Pyramid Model and Its Application in Autonomous Robot Strategy System


Xue Fangzheng, Fang Shuai, Xu Xinhe
School of Information Science & Engineering, Northeastern University, Shenyang, PRC
[email protected]

Abstract—The artificial ecological pyramid model (AEPM) is proposed to solve complex problems. It is a layered evolutionary multi-agent system model with three layers. The lowest layer is composed of “stimulation-response” agents called base agents. The second layer is composed of combined agents, which are combinations of base agents. The top layer is composed of advanced agents that have reasoning capabilities. Agents in all layers must stand the test of the environment. Agents on the lowest layer can improve their fitness evaluations only by evolutionary methods, while agents on upper layers can also improve their fitness evaluations by “eating” agents on lower layers. We also construct a strategy system based on the artificial ecological pyramid model to solve the problem of rivalry between autonomous robots. Simulation results on a soccer robot system demonstrate the effectiveness of the model.

Keywords—autonomous robot system, HGA, MAS, soccer robot, strategy system

I. INTRODUCTION

Many people have been looking for a method intelligent enough to solve all kinds of complex problems. The evolutionary method is one of the most outstanding ones. It is a combination of natural genetics and computer science. In 1967, Holland proposed the conception of the genetic algorithm [1]; generally speaking, Holland’s work is called the standard genetic algorithm. Many scholars proposed similar algorithms, for example, Rechenberg and Schwefel’s evolution strategy [2][3], L. J. Fogel’s evolutionary programming [4], and Potter and De Jong’s co-evolution model [5]. The development and application of these methods have won the evolutionary method more and more attention in the AI area. Even so, attempts to realize real intelligence by the evolutionary method incur much criticism; after all, such methods took many years to produce intelligence in nature. So people try to combine evolutionary methods with other AI methods, such as evolutionary neural networks [6].

The multi-agent theory has been an active theory in distributed artificial intelligence in recent years. It affords us a new way to realize hybrid intelligence. The proposed artificial ecological pyramid model is a multi-agent model: a layered, evolutionary, pagoda-like model. It is called the artificial ecological pyramid model (for short, AEPM) because its design is enlightened by the ecological pyramid theory in biology. The main difference between the AEPM and traditional evolutionary and hybrid evolutionary methods is that the AEPM considers the hierarchy of intelligence: large numbers of agents with different intelligence levels collect together to form an agent pyramid like an ecological pyramid. The elements of traditional evolutionary methods are genes that encode some information. These genes have the simplest intelligence and sit on the lowest layer of the pyramid, so a traditional evolutionary method cannot reach advanced intelligence. An evolutionary neural network system is equivalent to an intelligent system composed of a neural network agent that has some advanced intelligence and some gene agents that encode the information of the nodes and structure of the neural network; the two kinds of agents sit on the top and bottom layers of the pyramid. Apparently, if creatures on the top layer of an ecological pyramid only eat creatures on the lowest layer, the creatures on the top layer need to eat a great deal to get enough food; perhaps the food is not easy to catch, or the diet causes malnutrition. The AEPM keeps the great adaptive capability that is the advantage of traditional evolutionary methods, and if the “eating” algorithm is well designed, the AEPM can combine many intelligent methods and become more intelligent than other methods.

Fig. 1. An ecological pyramid in biology. The pyramid has four layers and describes the creatures on a grassland.

II. ECOLOGICAL PYRAMID AND ARTIFICIAL ECOLOGICAL PYRAMID

The ecological pyramid is a conception in biology that describes an ecological system composed of creatures at different trophic levels. In the ecological system, creatures become more intelligent and their numbers decrease as the layer becomes higher. Creatures on the same layer are rivals, and creatures on neighboring layers are hunters and food (see Fig. 1).

If we look on the creatures in the ecological pyramid as agents, we get a multi-agent model called the artificial ecological pyramid model. A three-layer artificial ecological pyramid model is described in Fig. 2. The model has three layers: the layer of base agents, the layer of combined agents, and the layer of advanced agents.

The layer of base agents (for short, LBA) is composed of traditional genes. Each agent on this layer is a simple “stimulation-response” agent that describes an “if <state> then <action>” rule; it is called a base agent (for short, BA). Traditional genetic algorithms can be used in this layer.

The layer of combined agents (for short, LCA) has “combined intelligence”. Agents on this layer are combinations of base agents, such as rule tree agents and rule network agents; such agents are called combined agents (for short, CA). Some custom-built genetic algorithms can be used here. For example, if a rule tree agent is described as a “variable length chromosome” agent, then the “variable length chromosome genetic algorithm” can be used.

The layer of advanced agents (for short, LAA) has agents with advanced intelligence; agents on this layer are called advanced agents (for short, AA). Traditional genetic algorithms cannot be used here because agents on this layer often have complex structures.

Agents on neighboring layers can learn and communicate by the “eating” algorithm. The said “eating” algorithm describes the action in which an agent on a higher layer tries to “eat” an agent with some advantages on a lower layer. An “eating” operation has two results. The first is that the agent on the higher layer acquires part of or all the functions of the agent on the lower layer. The second is that the hunter agent’s fitness value becomes higher and the prey agent’s becomes lower. A concrete “eating” algorithm is not prescribed here, because such algorithms must be variable and relative to the structure of the relevant agents; a minimal illustrative sketch is given at the end of this section.

In the artificial ecological pyramid, the procedure by which intelligence comes into being can be looked on as a growing-up procedure: simple reactive intelligence grows up into advanced intelligence through the “eating” among agents. The procedure can also be looked on as a learning procedure. Base agents discover the unknown world by genetic algorithm; combined agents learn from base agents with high fitness values; advanced agents learn from combined agents with high fitness values. Compared with traditional hybrid genetic methods, the AEPM is more flexible and effective because of the existence of the LCA. The detailed ideas and methods of the AEPM are described below through a soccer robot strategy system based on the AEPM.
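Since the paper leaves the concrete “eating” algorithm open, the following is only a minimal sketch of one possible implementation of the two effects described above (function transfer and fitness adjustment). The `Agent` class, the `eat` function, and the fitness-transfer fraction are illustrative assumptions, not part of the model’s specification.

```python
class Agent:
    """A pyramid agent: a fitness value plus the functions it can perform."""
    def __init__(self, fitness=0.0):
        self.fitness = fitness
        self.functions = []

def eat(hunter, prey, transfer=0.5):
    """One possible "eating" step between neighboring layers.
    Result 1: the hunter acquires part of or all the prey's functions.
    Result 2: the hunter's fitness rises and the prey's falls.
    The transfer fraction 0.5 is an assumption; the paper leaves it open."""
    hunter.functions.extend(prey.functions)   # here: all of the functions
    delta = transfer * prey.fitness
    hunter.fitness += delta
    prey.fitness -= delta

# For example, a combined agent eats a well-tested base agent (a rule):
ba = Agent(fitness=4.0)
ba.functions.append("if <state> then <action>")
ca = Agent(fitness=1.0)
eat(ca, ba)
assert ca.fitness == 3.0 and ba.fitness == 2.0
```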

III. SOCCER ROBOT STRATEGY SYSTEM

The soccer robot system is a multi-robot oppositional system that has arisen in recent years. It has become a standard test-bed for AI and robotics problems and is becoming more popular for its antagonistic and entertaining character. A typical soccer robot system (MIROSOT [7]) has four subsystems: the computer vision subsystem, the strategy subsystem, the communication subsystem, and the car-like robot subsystem. Its operation mechanism is that the strategy subsystem makes decisions by analyzing the positions of the moving objects on the ground afforded by the computer vision subsystem and transmits velocity commands to the home robots via the communication subsystem; the home robots then act according to the intentions of the system designer.

The strategy subsystem is the kernel of a soccer robot system. The strategy procedure of the soccer robot system is described as below:

    D: P → V,
    P = {(x, y, θ) | x, y, θ ∈ R}^(2N+1),
    V = {(lv, rv) | lv, rv ∈ R}^N    (1)

where x, y, and θ are the x coordinate, y coordinate, and direction of an object, respectively; lv and rv are the velocities of the left and right wheels, respectively; P is the position space; V is the velocity space; and N is the number of robots on one team (so P covers the 2N robots of both teams plus the ball).

We can get a state S describing the situation of the game from the P space. According to the characteristics of the soccer robot system, this paper proposes some key decision-making variables to construct the state. The said S is a 5-tuple, described as below:

    S = <bPosition, bDirection, bSpeed, wControl, oFormation>    (2)

where bPosition is the ball’s position area, bDirection is the ball’s moving direction, bSpeed is the ball’s moving speed, wControl indicates which side controls the ball, and oFormation is the formation of the adversary team.
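As a concrete illustration of (1) and (2), the sketch below derives the 5-tuple state from ball and opponent poses taken from the P space. The `Pose` type, the field dimensions, the 8 × 4 area grid, and the formation feature are all assumptions made for illustration; the paper does not specify how the five variables are discretized.

```python
import math
from dataclasses import dataclass

FIELD_LENGTH = 150.0  # assumed pitch size in cm; not given in the paper
FIELD_WIDTH = 130.0

@dataclass
class Pose:
    x: float      # x coordinate
    y: float      # y coordinate
    theta: float  # direction

def make_state(ball, ball_prev, we_control, opponents, dt=1 / 30):
    """Map poses from the P space of (1) to the 5-tuple S of (2)."""
    # bPosition: discretize the pitch into an 8 x 4 grid (an assumption),
    # giving the 32 ball position areas that index the LBA islands in
    # Section IV.
    col = min(7, max(0, int(ball.x / (FIELD_LENGTH / 8))))
    row = min(3, max(0, int(ball.y / (FIELD_WIDTH / 4))))
    b_position = row * 8 + col
    # bDirection and bSpeed from the ball displacement across two frames.
    dx, dy = ball.x - ball_prev.x, ball.y - ball_prev.y
    b_direction = math.atan2(dy, dx)
    b_speed = math.hypot(dx, dy) / dt
    # wControl: which side controls the ball.
    w_control = 1 if we_control else 0
    # oFormation: a crude stand-in for the adversary formation, here the
    # number of opponents positioned in their own half.
    o_formation = sum(1 for p in opponents if p.x > FIELD_LENGTH / 2)
    return (b_position, b_direction, b_speed, w_control, o_formation)
```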

The robot actions are the base of the strategy subsystem, including shooting, passing the ball, defending, goalkeeping, etc. The pivotal attribute of an action is its target; for example, the target of a pass-ball action is where to pass the ball. If we discretize the pivotal attribute, an action is divided into multiple actions with specific targets. Each decision-making course of the strategy subsystem then yields an n-tuple of discrete actions, described as below:

    A = <RA_1, RA_2, ..., RA_i, ..., RA_n>    (3)

where RA_i is the action of home robot i.

So a simple strategy subsystem is a “state-action” table combining multiple “state-action” couples, each of which indicates a rule like “if <S> then <A>”. This paper constructs a strategy subsystem for a 3 vs 3 game. We make a specific robot the goalkeeper, so the strategy subsystem only needs to decide the actions of the other two robots.
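Such a “state-action” table can be pictured as a plain dictionary from states to action couples. In the sketch below, the rule shape (two field-player actions per state, with the goalkeeper scripted separately) follows the 3 vs 3 setup above, while the concrete entries and the fallback rule are invented for illustration.

```python
# A rule is "if <S> then <A>": S is the 5-tuple of (2) with discretized
# components, A the action couple of (3) for the two field robots.
rule_table = {
    # (bPosition, bDirection, bSpeed, wControl, oFormation): (RA_1, RA_2)
    (12, 0, 0, 1, 2): ("shoot", "support"),   # illustrative entries only
    (3, 1, 2, 0, 3): ("defend", "mark"),
}

def decide(state):
    """Look up the action couple for the current state; fall back to a
    safe default when no rule fires (cf. the default agents of Section V)."""
    return rule_table.get(state, ("defend", "defend"))

assert decide((12, 0, 0, 1, 2)) == ("shoot", "support")
assert decide((0, 0, 0, 0, 0)) == ("defend", "defend")
```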

IV. SOCCER ROBOT STRATEGY SYSTEM BASED ON AEPM

The strategy subsystem constructed from the “state-action” table obviously cannot have good adaptive capability and cannot meet the demand for high intelligence in autonomous robots’ oppositional environments. So we construct a new strategy subsystem based on the AEPM (described in Fig. 2).

Fig. 2. The soccer robot strategy system based on the artificial ecological pyramid model.

A. LBA

In the LBA, we construct 32 islands according to the ball position area bPosition. Each island has 30 genes. A standard genetic algorithm is used within the islands, described as below:

    GA = (P(0), N, l, s, g, p, f, t)    (4)

where P(0) is the initial population, N is the population size, l is the code length, s is the selection strategy, g is the genetic operators, p is the probabilities of the genetic operators, f is the evaluation function, and t is the terminal condition.

1) Coding strategy: The paper uses a binary coding strategy, described in Fig. 3. An “if <S> then <A>” rule is an 18-bit binary code, where S occupies 10 bits and A occupies 8 bits.

Fig. 3. Genetic codes of a rule.
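A minimal sketch of the 18-bit rule code of Fig. 3 follows (10 state bits, 8 action bits). The paper does not give the sub-field layout within S and A, so the packing below treats them as opaque 10-bit and 8-bit integers, with S assumed to occupy the high bits.

```python
def encode_rule(s, a):
    """Pack an "if <S> then <A>" rule into an 18-bit code: S in the high
    10 bits, A in the low 8 bits (the ordering is an assumption)."""
    assert 0 <= s < (1 << 10) and 0 <= a < (1 << 8)
    return (s << 8) | a

def decode_rule(code):
    """Unpack an 18-bit rule code back into its state and action parts."""
    return code >> 8, code & 0xFF

rule = encode_rule(s=0b1100100101, a=0b00101101)
assert decode_rule(rule) == (0b1100100101, 0b00101101)
```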

2) Evaluation method: The system does not always evaluate a rule right after it is used, because the evaluation of a single rule relies not only on the result after it is executed but also on how it ties in with the neighboring rules. We define some specific states at which the evaluation events happen. The evaluation process covers all the rules used during a period of time. The evaluation function is described as below:

    f_k = (C_e − C_s) / (N_e − N_s),  C_k = g(S_k)    (5)

where C_e is the score at the terminal point of an evaluation period, C_s is the score at the start point of an evaluation period, S_k is the state of the game, and N_e − N_s is the number of rules used in the evaluation period.

The evaluation after a single training pass cannot denote the real fitness value of a rule, so the paper takes the average of the recent 3 evaluations as the real fitness value of a rule, as in (6):

    f_r = (1/3) ∑_{i=k−2..k} f_i    (6)

The selection strategy of the LBA is to select the best gene. The genetic operators are the crossover and mutation operators. The terminal condition is that the higher layers no longer demand that the LBA evolve.
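The sketch below implements (5) and (6) as written, assuming that C is the score g(S) observed at the special evaluation states and that every rule used in a period is credited with the period’s per-rule score. The `RuleEvaluator` bookkeeping class is an illustrative assumption; rules are any hashable keys, e.g. the 18-bit codes of Fig. 3.

```python
from collections import defaultdict, deque

class RuleEvaluator:
    """Period-based rule evaluation per (5) and (6)."""
    def __init__(self):
        self.history = defaultdict(lambda: deque(maxlen=3))  # rule -> recent f_k
        self.used = []        # rules fired since the current period opened
        self.c_start = 0.0    # C_s: the score when the period opened

    def open_period(self, c_now):
        self.c_start, self.used = c_now, []

    def record(self, rule):
        self.used.append(rule)

    def close_period(self, c_now):
        """An evaluation event: per (5), f_k = (C_e - C_s) / (N_e - N_s)
        is credited to every rule used during the period."""
        if self.used:
            f_k = (c_now - self.c_start) / len(self.used)
            for rule in self.used:
                self.history[rule].append(f_k)

    def fitness(self, rule):
        """f_r of (6): the average of the recent (up to 3) evaluations."""
        h = self.history[rule]
        return sum(h) / len(h) if h else 0.0
```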

B. LCA

The LCA is composed of variable-length chromosomes whose size is 20: each CA is an array of up to 20 BAs. Agents on this layer compete with agents in the LBA to survive, and all the agents from the LBA and LCA are evaluated together. Genetic algorithms are not used in this layer. The number of CAs is equal to the number of islands in the LBA, so that each CA is relative to one island. A CA tries to get evaluations much higher than all the agents on its relative island; until then, the CA demands that the LBA evolve and “eats” agents on its relative island to enhance its fitness value. A CA eats a BA when the conditions below are satisfied (a sketch of the update follows this list):

• the evaluation of the relative BA is steadily high;
• the fitness value of the relative CA is not much higher than the relative BA’s.

After the “eating” procedure, the fitness value of the CA increases and the fitness value of the BA decreases. The eaten BA is placed in the first place of the BA array, and the oldest BA in the array is removed.
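A minimal sketch of the CA update just described follows, assuming the chromosome is a fixed-capacity array of BAs in which the newest eaten BA enters at the front and the oldest drops off the end. The paper states the two eating conditions only qualitatively, so the thresholds below (and the `BaseAgent` fields they read) are illustrative assumptions.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class BaseAgent:
    fitness: float
    recent_evals: list   # recent period evaluations of this BA
    island_mean: float   # mean evaluation on the BA's island

class CombinedAgent:
    """A CA: a variable-length chromosome holding up to 20 base agents."""
    def __init__(self, capacity=20):
        self.chromosome = deque(maxlen=capacity)  # newest BA at the front
        self.fitness = 0.0

    def should_eat(self, ba, margin=0.1):
        # The two conditions, made concrete only for illustration:
        # "steadily high" -> all recent evaluations above the island mean;
        # "not much higher" -> CA fitness within 10% above the BA's.
        steadily_high = bool(ba.recent_evals) and \
            min(ba.recent_evals) > ba.island_mean
        not_much_higher = self.fitness < ba.fitness * (1 + margin)
        return steadily_high and not_much_higher

    def eat(self, ba, transfer=0.5):
        """Place the eaten BA in the first place of the array; the oldest
        BA is dropped automatically once capacity is exceeded. Fitness
        moves from prey to hunter as in Section II (fraction assumed)."""
        self.chromosome.appendleft(ba)
        delta = transfer * ba.fitness
        self.fitness += delta
        ba.fitness -= delta
```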

C. LAA

The LAA is composed of a single BDI [8][9] agent. A BDI agent is described as below:

    AA = <P, B, D, I>,
    P = {p_i}, p_i = {a_i1, a_i2, ..., a_iN},
    D = {d_i}, d_i ∈ P,
    I = {i_i}, i_i ∈ D,
    B = {f_i}, f_i: E → R    (7)

where P is the plan set, B is the belief set, D is the desire set, I is the intention set, a_ij is an action, and E is the environment, i.e., the states described before.

The operation mechanism of the BDI agent is described below:

• Step 1: initialize all the elements in AA;
• Step 2: get E from the soccer robot environment, and get D from P;
• Step 3: evaluate the plans in D by B;
• Step 4: select a plan as the intention (namely, I) to execute;
• Step 5: if there is no terminal command, return to Step 2; otherwise stop running.

The LAA has only one advanced agent, the BDI agent. The BDI agent competes and is evaluated together with the agents on the LBA and LCA. It communicates with CAs by the “eating” algorithm; traditional genetic algorithms are not used. The AA “eats” CAs and places them in 32 places, each of which can accommodate only 5 agents and is relative to an island in the LBA. The AA “eats” a CA when the conditions below are satisfied:

• the evaluation of the AA is not much higher than those of the BAs on the relative island and of the relative CA;
• the evaluation of the relative CA is steadily high.

After the “eating” procedure, the fitness of the relative CA decreases and the new CA replaces the old one in the specific place of B. The 5 agents in a place then evaluate the plans in D. The evaluation strategy is to treat the 5 agents as experts with different experiences: the 5 experts evaluate every plan, and the final evaluation of a plan is the average of their evaluations.

D. The operation mechanism of the strategy subsystem based on the AEPM

The operation mechanism of the strategy system is described as below (a sketch of this loop follows the list):

• Step 1: initialize the LBA and LCA;
• Step 2: design the BDI agent of the LAA and initialize all the elements of the AA;
• Step 3: when a message from the environment arrives, select an agent from all the agents on the three layers to make the decision of the strategy system;
• Step 4: at the specific states, evaluate all the agents used during the period of time;
• Step 5: AAs “eat” CAs when all of the relative conditions are satisfied;
• Step 6: CAs “eat” BAs when all of the relative conditions are satisfied;
• Step 7: start the evolutionary operators in all islands of the LBA every specific period of time;
• Step 8: start the evolutionary operators between agents on neighboring islands of the LBA every specific period of time;
• Step 9: turn to Step 3.
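The sketch below strings Steps 1–9 together as a single control loop. Every method it calls on `env`, `lba`, `lca`, and `laa` is a placeholder standing for the corresponding mechanism above (the island GAs, the two eating procedures, the period-based evaluation); none of these names is an API defined in the paper, and the selection rule in Step 3 is likewise only one plausible choice.

```python
def select_decision_maker(lba, lca, laa, state):
    """Step 3: one plausible selection rule (the paper does not fix it):
    take the currently fittest agent across the three layers."""
    candidates = lba.agents_for(state) + lca.agents_for(state) + [laa.agent]
    return max(candidates, key=lambda a: a.fitness)

def run_strategy_system(env, lba, lca, laa, evolve_every=100):
    """Steps 1-9 of the operation mechanism as one loop (illustrative)."""
    lba.initialize()                              # Step 1
    lca.initialize()
    laa.initialize()                              # Step 2 (BDI agent designed offline)
    tick = 0
    while not env.terminated():
        state = env.next_message()                # a vision message arrives
        agent = select_decision_maker(lba, lca, laa, state)   # Step 3
        env.execute(agent.decide(state))
        if env.at_evaluation_state():             # Step 4: the specific states
            for layer in (lba, lca, laa):
                layer.evaluate_used_agents()
            laa.eat_qualifying_cas(lca)           # Step 5
            lca.eat_qualifying_bas(lba)           # Step 6
        tick += 1
        if tick % evolve_every == 0:
            lba.evolve_within_islands()           # Step 7
            lba.evolve_between_near_islands()     # Step 8
        # Step 9: loop back to Step 3
```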

V. EXPERIMENTAL RESULTS

In order to verify the validity of the proposed algorithm, we played a long game between the strategy based on the AEPM (for short, S_AEP) and an experienced strategy (for short, S_0) that won 2nd place in the international competition FiraCup2002 [10], on Newsim Studio V1.0, a soccer robot simulation test-bed of the NewNeu team. In order to avoid the situation in which none of the agents fits the environment, we define a default agent on each island, described as below:

    DA = <*, *, *, *, a_1, a_2>    (8)

where * is the wildcard character and a_i is a robot action. The default agents do not evolve with the other agents. We sampled the score over 5-minute intervals at some specific times; the results are described in Table I.

TABLE I. RESULTS OF THE EXPERIMENT

Time                  | After 5 minutes | After 1 hour | After 5 hours | After 10 hours
Scores (S_AEP : S_0)  | 1:10            | 4:9          | 5:4           | 5:2

VI. CONCLUSIONS

It is very difficult to realize real intelligence by computer. We consider that intelligence has different levels and evolves step by step, and that it may be more effective to combine existing results in AI fields with a custom evolutionary algorithm. Enlightened by the ecological pyramid theory in biology, we propose the AEPM and try to realize intelligence with it.

The experiment with the soccer robot strategy system based on the AEPM proves that the AEPM can realize some complex intelligence. But there is much work to do before the AEPM matures, such as:

• to find a reasonable method to evaluate and design the “eating” algorithm;
• to study the population size of each level and its influence on the effect of the AEPM;
• to find a mathematical theory with which to study the AEPM.

REFERENCES

[1] Holland J H. Adaptation in Natural and Artificial Systems. Ann Arbor: University of Michigan Press, 1975.
[2] Rechenberg I. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart: Frommann-Holzboog, 1973.
[3] Klockgether J, Schwefel H-P. Two-phase nozzle and hollow core jet experiments. In: Elliott D (ed.), Proc. 11th Symp. on Engineering Aspects of Magnetohydrodynamics. Pasadena, CA: California Institute of Technology, March 24-26, 1970, pp. 141-148.
[4] Fogel L J, Owens A J, Walsh M J. Artificial Intelligence Through Simulated Evolution. Chichester, UK: John Wiley, 1966.
[5] Potter M, De Jong K. A cooperative coevolutionary approach to function optimization. In: Davidor Y, Schwefel H-P (eds.), Proceedings of the Third Conference on Parallel Problem Solving from Nature. Springer-Verlag, 1994, pp. 249-257.
[6] Goldberg D. Genetic Algorithms in Search, Optimization and Machine Learning. Massachusetts: Addison-Wesley, 1989.
[7] Kim Jong-Hwan. Lecture Notes on Multi-Robot Cooperative System Development. Seoul, Korea: Green Publishing Company, 1998.
[8] Haddadi A, Sundermeyer K. Belief-desire-intention agent architectures. In: O'Hare G M P, Jennings N R (eds.), Foundations of Distributed Artificial Intelligence. New York: John Wiley & Sons, 1996, pp. 169-185.
[9] Rao A S. Multi-agent mental-state recognition and its application to air-combat modelling. In: Workshop on Distributed Artificial Intelligence. Tokyo, 1994, pp. 283-304.
[10] FIRA official site. Results of FiraCup 2002. Available: www.fira.net/firacup/2002.html.


