
Combining Machine Learning with Evolutionary Computation: Recent Results on LEM

Guido Cervone, Ryszard S. Michalski*, Kenneth K. Kaufman, Liviu A. Panait
Machine Learning and Inference Laboratory
George Mason University
Fairfax, VA, 22030
*Also with the Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland

Abstract

The Learnable Evolution Model (LEM), first presented at the Fourth International Workshop on Multistrategy Learning, employs machine learning to guide evolutionary computation. Specifically, LEM integrates two modes of operation: Machine Learning mode, which employs a machine learning algorithm, and Darwinian Evolution mode, which employs a conventional evolutionary algorithm. The central new idea of LEM is that in Machine Learning mode, new individuals are "genetically engineered" by a repeated process of hypothesis formation and instantiation, rather than created by random operators of mutation and/or recombination, as in Darwinian-type evolutionary algorithms. At each stage of evolution, hypotheses are induced by a machine learning system from examples of high and low performance individuals. New individuals are created by instantiating the hypotheses in different ways. In recent experiments concerned with complex function optimization problems, LEM has significantly outperformed selected evolutionary computation algorithms, sometimes achieving speed-ups of the evolutionary process by two or more orders of magnitude (in terms of the number of generations). In another recent application, involving a problem of optimizing heat exchangers, LEM produced designs equal or superior to the best expert designs. The recent results have confirmed earlier findings that LEM is able to significantly speed up evolutionary processes (in terms of the number of generations) for certain problems. Further research is needed to determine the classes of problems for which LEM is most advantageous.

1 Introduction

The idea that machine learning can be used to directly guide evolutionary computation was first presented at the Fourth International Workshop on Multistrategy Learning (Michalski, 1998). This presentation described the Learnable Evolution Model (LEM), which integrates a machine learning algorithm with a conventional evolutionary algorithm, and reported initial results from LEM's application to selected function optimization problems. The presented results were very promising but tentative. They were obtained using LEM1, a rudimentary implementation of the proposed method, and the experiments were performed only on a few problems.

Subsequently, a more advanced implementation, LEM2, was developed, and many more experiments were performed with it (Cervone, 1999). The original methodology was also substantially extended and improved (Michalski, 2000). One of the important improvements is the development of the adaptive anchoring discretization method, ANCHOR, for handling continuous variables (Michalski and Cervone, 2000). This paper presents recent results from the application of LEM2 to a range of function optimization problems and to a problem of designing optimal heat exchangers. To provide the reader with sufficient background information, the next section briefly reviews the current version of the Learnable Evolution Model.

2 A Brief Overview of the Learnable Evolution Model

The Learnable Evolution Model (LEM) represents a fundamentally different approach to evolutionary processes than Darwinian-type evolutionary algorithms. In Darwinian-type evolutionary algorithms, new individuals are generated by processes of mutation and/or recombination. These are semi-blind operators that take into consideration neither the experience of individuals in a given population (as in the Lamarckian type of evolution), nor the past history of evolution. In LEM, the evolution is guided by hypotheses derived from the current and, optionally, also past generations of individuals. These hypotheses identify the areas of the search space (landscape) that most likely contain the global optimum (or optima). The machine learning program is used in LEM either as the sole engine of evolutionary change (the uniLEM version), or in combination with the Darwinian type of evolution process (the duoLEM version).

The duoLEM version integrates two modes of operation: Machine Learning mode and Darwinian Evolution mode. The Darwinian Evolution mode implements a conventional evolutionary algorithm, which employs mutation and/or recombination operators to generate new individuals. The Machine Learning mode generates new individuals by a process of hypothesis generation and instantiation. Specifically, at each step of evolution, it selects two groups of individuals from the current population: high-performing individuals (H-group), which score high on the fitness function, and low-performing individuals (L-group), which score low on the fitness function. These groups are selected from the current population or from some combination of the current and past populations. These two groups are then supplied to a learning program that generates hypotheses distinguishing between the H-group and the L-group. New individuals are generated by instantiating the hypotheses in various ways. These new individuals compete with the existing individuals for inclusion in the new population.
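The selection-learning-instantiation cycle described above can be sketched as follows. This is a minimal illustration, not the LEM2 implementation; the `learn` and `instantiate` callables (with made-up signatures) stand in for the machine learning program (AQ18 in LEM2) and its rule-instantiation procedure.

```python
import random

def ml_mode_step(population, fitness, learn, instantiate, n_new=20):
    """One hypothetical Machine Learning mode step of LEM (a sketch):
    split the population into H- and L-groups, learn a hypothesis
    separating them, and instantiate it into new individuals that then
    compete with the existing ones for inclusion in the next population."""
    ranked = sorted(population, key=fitness, reverse=True)
    h_group = ranked[: len(ranked) // 3]           # high-performing individuals
    l_group = ranked[-(len(ranked) // 3):]         # low-performing individuals
    hypothesis = learn(h_group, l_group)           # rules separating H from L
    newcomers = [instantiate(hypothesis) for _ in range(n_new)]
    # New individuals compete with existing ones for the population slots.
    return sorted(population + newcomers, key=fitness, reverse=True)[: len(population)]
```

As a toy usage, a "learner" that returns the interval spanned by the H-group, and an "instantiator" that samples uniformly from it, already drive the population toward high-fitness regions.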

In the duoLEM version, LEM alternates between the two modes of operation, switching to the other mode when a mode termination condition is met (e.g., when there is insufficient improvement of the fitness function after a certain number of populations). In the uniLEM version, the evolution process is guided solely by the machine learning program. When the mode termination condition is met, a Start-over operation is performed. In such an operation, the system generates a new population randomly, or according to certain rules (Michalski, 2000).

Figure 1 presents a flowchart of the uniLEM and duoLEM versions of LEM. For a comprehensive description of the LEM methodology, refer to (Michalski, 1998; Cervone, 1999; Michalski, 2000).

[Figure 1 shows the two flowcharts side by side. The uniLEM version loops through: Start-over; select H and L groups; generate new individuals via hypothesis creation and instantiation; evaluate individuals; generate new population; adjust parameters. The duoLEM version adds a Switch-mode step and, in Darwinian Evolution mode, selects parents and generates new individuals via mutation and/or crossover instead of via hypothesis creation and instantiation.]

Figure 1. A flowchart of the uniLEM and duoLEM versions.

Below is a brief description of the individual steps, with an indication of how they are implemented in the LEM2 system.

Start-over: This operator generates a new population randomly or according to certain rules. In LEM2, a new population is generated randomly, with the proviso that a number of the best-performing individuals from the past populations are added to the newly generated population (elitism).

Select H-group and L-group: This selection can be done in LEM2 using one of two methods: Fitness-Based Selection (FBS) or Population-Based Selection (PBS). In FBS, the H-group (L-group) consists of individuals whose fitness is above the HFT% from the top value (below the LFT% from the lowest value). In PBS, the H-group (L-group) consists of the HPT% highest-fitness (LPT% lowest-fitness) individuals in the population. Figure 2 illustrates these two selection methods and the parameters HFT (high fitness threshold), LFT (low fitness threshold), HPT (high population threshold), and LPT (low population threshold).
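The two selection methods might be sketched as below. This is an illustrative reconstruction from the description above, not LEM2 code; the fraction parameters `hft`, `lft`, `hpt`, and `lpt` are hypothetical stand-ins for the HFT%, LFT%, HPT%, and LPT% thresholds.

```python
def select_groups_fbs(pop, fitness, hft=0.25, lft=0.25):
    """Fitness-Based Selection (sketch): H-group holds individuals whose
    fitness lies within hft * (fitness range) of the best value; L-group
    holds those within lft * (fitness range) of the worst value."""
    scores = [fitness(x) for x in pop]
    best, worst = max(scores), min(scores)
    span = (best - worst) or 1.0                      # avoid a zero span
    h = [x for x, s in zip(pop, scores) if s >= best - hft * span]
    l = [x for x, s in zip(pop, scores) if s <= worst + lft * span]
    return h, l

def select_groups_pbs(pop, fitness, hpt=0.3, lpt=0.3):
    """Population-Based Selection (sketch): top hpt and bottom lpt
    fractions of the population, ranked by fitness."""
    ranked = sorted(pop, key=fitness, reverse=True)
    k_h = max(1, int(hpt * len(pop)))
    k_l = max(1, int(lpt * len(pop)))
    return ranked[:k_h], ranked[-k_l:]
```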


Figure 2. An example of the fitness profile function, and an illustration of how the parameters HFT, LFT, HPT, and LPT select the H and L groups.

Select parents: The selection of parents is related to the Darwinian mode. It selects representative individuals (parents) from the current population that will be mutated and/or recombined. LEM2 implements two types of parent selection: deterministic and uniform. In the first, every individual in the population is selected, while in the latter, every individual has the same chance of being selected, independently of its fitness.
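A minimal sketch of the two parent-selection schemes, under the assumption (from the description above) that "deterministic" means every individual is selected exactly once:

```python
import random

def select_parents(pop, mode="deterministic", n=None, rng=random):
    """Sketch of LEM2's two parent-selection schemes: 'deterministic'
    selects every individual in the population; 'uniform' picks each
    parent uniformly at random, independently of fitness."""
    if mode == "deterministic":
        return list(pop)
    return [rng.choice(pop) for _ in range(n or len(pop))]
```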

Generate new individuals via hypothesis creation and instantiation: The LEM methodology is not constrained to any particular learning algorithm, but can be used, in principle, with any concept learning method. LEM2 employs the AQ18 rule learning program, which is highly suitable for LEM due to its various characteristics, such as the ability to learn rules with different levels of generality, the use of the internal disjunction operator, and a powerful knowledge representation.

Figures 3 and 4 show an example of the input to and output from AQ18, respectively (after small editing).

[The AQ18 input and output listings are not legible in this copy.]

Note: The values in the conditions of the rules are symbols representing ranges of original values of these variables, not the original values. These ranges have been determined in the process of adaptive anchoring quantization (Michalski and Cervone, 2000).

Figure 3. AQ18 input. Figure 4. AQ18 output.

AQ18 takes as input the H-group and L-group, a specification of the types and domains of the variables, plus, optionally, control parameters [see (Kaufman and Michalski, 2000a) for a detailed explanation], and produces a set of attributional rules with annotations characterizing the rules. Each learned rule is a conjunction of conditions that specify ranges of values an attribute may take (in the case when AQ18 runs without invoking constructive induction). A rule is instantiated by selecting values satisfying the rule conditions. The learned rules are used to generate new individuals by randomizing variables within the ranges of values defined by the rule conditions. If a rule does not include some variable, it means that this variable was found irrelevant for distinguishing between the H-group and the L-group. Variables that are not included in the rule are instantiated by randomly choosing a value from their domain, or by choosing a value that is present in a randomly selected individual from the current population.
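The instantiation procedure described in this paragraph might look roughly like this. It is a sketch, with a hypothetical rule representation mapping variable names to allowed (low, high) ranges, not AQ18's actual data structures:

```python
import random

def instantiate_rule(rule, domains, population=None, rng=random):
    """Instantiate a learned rule into one new individual (sketch).
    Variables mentioned in the rule are randomized within the rule's
    range; variables the rule omits (irrelevant for separating the
    H- and L-groups) are filled either from their full domain or copied
    from a randomly chosen existing individual."""
    individual = {}
    for var, (lo, hi) in domains.items():
        if var in rule:
            r_lo, r_hi = rule[var]
            individual[var] = rng.uniform(r_lo, r_hi)      # within rule range
        elif population is not None and rng.random() < 0.5:
            individual[var] = rng.choice(population)[var]  # copy from an individual
        else:
            individual[var] = rng.uniform(lo, hi)          # from the full domain
    return individual
```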

Generate new individuals via mutation and/or crossover: Individuals in the parent population are mutated and/or recombined. Research on Darwinian-type evolutionary algorithms has investigated many different forms of mutation and recombination.

Evaluate individuals: For each new individual, its fitness is evaluated according to a given fitness function or by some process, e.g., simulation. In the latter case, this operation may be costly and time-consuming.

Generate new population: This step involves creating a new population that combines individuals from the previous population with new individuals generated according to the rules learned. Different methods can be used for this purpose. These methods can be divided into intergenerational and generational. In the methods of the first group, both newly generated and previous individuals compete for inclusion in the new population. In the methods of the second group, only newly generated individuals compete for inclusion.
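The two families of population-formation methods can be illustrated as follows (a sketch, not LEM2 code):

```python
def next_population(old, new, fitness, scheme="intergenerational"):
    """Form the next population (sketch of the two method families):
    'intergenerational' lets old and new individuals compete for the
    popsize slots; 'generational' admits only the newly generated ones."""
    popsize = len(old)
    pool = old + new if scheme == "intergenerational" else new
    return sorted(pool, key=fitness, reverse=True)[:popsize]
```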

Adjust parameters: LEM keeps statistics regarding the number of successful births, the change in the highest-so-far fitness score, and others. Using these statistics, it can adjust its behavior in the evolutionary process. For example, it may find that at a given step generating more general or more specific rules may be more desirable, that parameters controlling the selection of the H-group and L-group need to be changed, or that the mutation rate for the Darwinian Evolution mode needs to be adjusted.

3 LEM Implementations: LEM2, LEM1, and ISHED1

LEM2 is the newest general-purpose implementation of LEM, and represents a significant improvement over LEM1, the first, rudimentary implementation (Michalski and Zhang, 1999). LEM1, presented at MSL98, employs the AQ15c machine learning program in Machine Learning mode and GA1 and GA2, two simple evolutionary algorithms, in Darwinian Evolution mode. GA1 and GA2 use a deterministic selection mechanism and a real-value representation of the variables. The main differences between the two are that GA1 generates new individuals only through a uniform Gaussian mutation operator, while GA2 also uses a uniform crossover operator. Continuous variables are discretized into a fixed number of values. LEM1 was applied to function optimization (Michalski and Zhang, 2000) and to a problem in designing non-linear digital filters (Coletti et al., 1999).

LEM2 was programmed using EC++, a generic Evolutionary Computation Library (Cervone and Coletti, 2000). In Machine Learning mode, it employs the AQ18 rule learning program (Kaufman and Michalski, 2000a). The main features or improvements introduced in LEM2 in relation to LEM1 include:

A. A new method for discretizing continuous variables has been developed and implemented. The method, called Adaptive Anchoring Discretization, briefly ANCHOR (Michalski and Cervone, 2000), gradually and adaptively increases the resolution of continuous variables in the process of evolution. The method has drastically improved the efficiency of LEM in the case when individuals are described by continuous variables.

B. New individuals are generated by instantiating multiple rules, rather than only the strongest rule in the ruleset generated by the learning program. This allows the system to explore several subareas of the search space in parallel, which is important in the case of multi-modal landscapes.

C. The number of new individuals generated from a single rule is not fixed, but is proportional to the rule fitness, defined as the sum of fitnesses of the examples covered by the rule.

D. In addition to the population-based method for selecting the H-group and L-group, LEM2 can also use the fitness-based method.

E. The cost of variables is adjusted dynamically in the evolution process. Each time a variable is included in a ruleset generated by the learning program, its cost is increased. This way, the system gives preference to variables that were not included in the previously learned ruleset. This feature has proven to be useful in optimizing functions with very large numbers of variables.

F. The uniLEM version has been implemented; that is, the evolution process repetitively executes only Machine Learning mode. There is no separate Darwinian Evolution mode.

G. A simple version of the Start-over operation has been implemented for the uniLEM version. Specifically, when the fitness profile function is flat for a controlled number of generations, new individuals are created randomly and inserted into the current population.

H. Parameters controlling the creation of the H-group and L-group in each step of evolution, the population lookback and the description lookback, have been implemented in LEM2 (Michalski, 2000).
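As an illustration of improvement C, the allocation of births to rules in proportion to rule fitness might be sketched as follows (hypothetical helper, not LEM2 code; `covered_fitnesses` holds, per rule, the fitness values of the examples that rule covers):

```python
def allocate_births(rules, covered_fitnesses, total_births):
    """Sketch of improvement C: the number of new individuals generated
    from each rule is proportional to the rule fitness, i.e., the sum of
    fitnesses of the examples the rule covers."""
    weights = [sum(f) for f in covered_fitnesses]
    total = sum(weights) or 1.0
    counts = [int(total_births * w / total) for w in weights]
    # Give any rounding remainder to the highest-fitness rule.
    counts[max(range(len(weights)), key=weights.__getitem__)] += total_births - sum(counts)
    return dict(zip(rules, counts))
```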

LEM2 was applied to a range of optimization problems, and its performance was compared to that of conventional Darwinian-type evolutionary algorithms (Cervone, 1999). ISHED1 is an implementation of the LEM methodology tailored toward a specific application domain, namely, the design of heat exchanger systems. Specifically, it conducts an evolutionary optimization process to determine the best arrangement of the evaporator tubes in the heat exchanger of an air conditioning system under given technical and environmental constraints (see Section 4.2). Special structure-modifying operators have been implemented that modify structures according to the expert domain knowledge. A detailed description of ISHED1 is in (Kaufman and Michalski, 2000b).

4 Experiments

This section presents selected results from testing and validating the LEM methodology using LEM2. To maximize the objectivity of LEM2 testing, the results from conventional evolutionary algorithms on the same problem, which were found in the literature or on the web, were compared with the corresponding results from LEM2.

For problems for which we were unable to find such results in the literature or on the web, we applied a conventional evolutionary algorithm, ES, that we re-implemented in C++ from an existing version in C (that was obtained from Ken De Jong). ES uses a real-valued representation and deterministic selection (i.e., each parent is selected and then mutated a fixed number of times, defined by the brood parameter). The mutation is done according to the Gaussian distribution, in which the mean is the value being mutated and the standard deviation is a controllable parameter. Each variable has a 1/L probability of being mutated, where L is the total number of variables defining an individual. The new individuals and their parents are sorted according to their fitness, and the popsize highest-fitness individuals are included in the next generation, where popsize is a fixed population size.
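The ES variant described above can be sketched as a single generation step. This is an illustrative reconstruction from the description, not the actual C++ re-implementation:

```python
import random

def es_step(pop, fitness, brood=3, sigma=1.0, rng=random):
    """One generation of the ES described above (sketch): every parent is
    selected deterministically and mutated `brood` times; each variable
    mutates with probability 1/L by adding Gaussian noise (mean = current
    value, std = sigma); parents and offspring are sorted by fitness and
    the popsize best survive."""
    L = len(pop[0])                      # number of variables per individual
    offspring = []
    for parent in pop:
        for _ in range(brood):
            child = [x + rng.gauss(0.0, sigma) if rng.random() < 1.0 / L else x
                     for x in parent]
            offspring.append(child)
    return sorted(pop + offspring, key=fitness, reverse=True)[: len(pop)]
```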

For some of the problems, we found on the web results from the application of a Parallel GA (PGA). PGA is a standard genetic algorithm (that uses a binary-string representation, mutation and crossover operators, and fitness-proportional selection) that simultaneously maintains separate subpopulations of individuals (the number of subpopulations and their sizes are specified by user-provided parameters).

4.1 Application to Function Optimization

This section presents a selection of results from the application of LEM2, ES, and PGA (when available) to three well-known function optimization problems, namely, the Rosenbrock function, the Rastrigin function, and the Gaussian Quartic function.

Problem 1. Find the minimum of the Rosenbrock function (Rosenbrock, 1960), in which the number of arguments, n, is set to 100, and each argument ranges between -5.12 and 5.12:

Ros(x_1, x_2, ..., x_n) = \sum_{i=1}^{n-1} \left[ 100 \, (x_{i+1} - x_i^2)^2 + (x_i - 1)^2 \right]

This is a rather complex optimization problem because the function has a very narrow and sharp ridge that runs around a parabola, so the variables are interrelated (Figure 5).
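For reference, the Rosenbrock function as defined above can be written directly:

```python
def rosenbrock(xs):
    """Rosenbrock function: sum over i = 1..n-1 of
    100*(x_{i+1} - x_i^2)^2 + (x_i - 1)^2; minimum 0 at (1, ..., 1)."""
    return sum(100.0 * (xs[i + 1] - xs[i] ** 2) ** 2 + (xs[i] - 1) ** 2
               for i in range(len(xs) - 1))
```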

Figure 5. An inverted graph of a two-dimensional projection of the Rosenbrock function.

For comparison, the ES algorithm was also applied to the same problem. The results of this experiment are graphically presented in Figure 6. Two different population sizes were used, 100 and 150, for both LEM2 and ES. Each experiment was repeated 10 times and the results averaged.

[Figure 6 plots the best fitness (logarithmic scale) against the number of births for the Rosenbrock function with 100 variables; each curve is the average of 10 runs. Curves 1-8 are LEM2 runs (population sizes 150 and 100, with HPT/LPT settings of 1 or 3), and curves 9-18 are ES runs (population sizes 150 and 100, with mutation rates 0.1 to 0.9).]

Figure 6. Results from LEM2 and ES for the Rosenbrock function optimization.

In Figure 6, LEMa.b.c means that the method was LEM2, the population size was a, and the HPT and LPT parameters were b and c, respectively. ESa.b means that the population size was a and the mutation rate was b. As shown in Figure 6, LEM2 was significantly less dependent on the input parameters than ES, and also converged to the function minimum much faster. It is possible to notice that some of the LEM curves (e.g., 5, 6, 7, 8) show a long horizontal line, meaning that for several births the algorithm did not improve the global optimum, and then a steep vertical line. This behavior is the result of the Start-over operator, which introduced new individuals into the population, and thereby allowed LEM to discover those areas of the space most favorable to direct the evolution.

LEM2's results were also compared with the best available results previously published for this function (CHC). These results concern the Rosenbrock function with a much smaller number of variables (only 2 and 4). They are summarized in Table 1, which shows the number of evaluations needed to come δ-close to the global optimum, and the relative speedups.

The value of δ-close specifies the number of generations after which the relative distance from the solution produced by an algorithm to the target (global optimum) becomes smaller than δ. The speedup of algorithm A over B for a given δ is defined as the ratio, expressed as a percentage, of the number of births required by B to the number of births required by A to achieve the δ-close result.
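These two measures might be computed as follows. This is a sketch; `history` is a hypothetical list of (births, best-so-far value) pairs, and the normalization of the relative distance when the target is zero is an assumption on our part:

```python
def births_to_delta(history, target, delta):
    """Number of births after which the best-so-far value first comes
    delta-close to the target (sketch; relative distance is normalized
    by max(|target|, 1) to handle a zero target)."""
    for births, best in history:
        if abs(best - target) <= delta * max(abs(target), 1.0):
            return births
    return None

def speedup(births_b, births_a):
    """Speedup of algorithm A over B: births required by B divided by
    births required by A, expressed as a percentage."""
    return 100.0 * births_b / births_a
```

For example, the 2-variable Rosenbrock comparison below (101 vs. 4893 evaluations) corresponds to a speedup of about 4845%.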

In the case of two variables, the best result was achieved using the CHC+BLX algorithm (briefly, CHC), which required 4893 evaluations (Eshelman and Schaffer, 1993). In contrast, LEM2 found the global minimum using only 101 evaluations (a speedup of nearly 5000%).

Rosenbrock function minimization, 2 variables (δ = 0)

  LEM2 (uniLEM)       101
  CHC                 4893
  Speedup LEM2/CHC    4800%

Table 1. Results for the Rosenbrock function of 2 variables.

In the case of four variables, the best published result was achieved by a breeder GA, which required about 250,000 evaluations (births) to achieve a result with δ=0.1 (Schlierkamp-Voosen and Muhlenbein, 1994). LEM2 found the global optimum (δ=0) with only 281 evaluations; that is, the speedup of LEM2 over the GA was at least 75,000% (since the result published for the GA referred to δ=0.1 rather than δ=0). Table 2 summarizes the results. These results indicate that LEM2 was able to rapidly locate the portion of the landscape containing the global optimum.

Rosenbrock function minimization, 4 variables

  LEM2 (uniLEM)      δ=0: 281
  GA                 δ=0.1: 77,000
  Speedup LEM2/GA    ≥ 27,500%

Table 2. Results for the Rosenbrock function of 4 variables.

Figure 7 illustrates sample rules that AQ generated when LEM was applied to find the minimum of the Rosenbrock function with four variables, and also how they match the H-group individuals. The variables are discretized using the values shown in Table 3.

  Value 0:  -2 .. -1.2
  Value 1:  -1.2 .. -0.4
  Value 2:  -0.4 .. 0.4
  Value 3:  0.4 .. 1.2
  Value 4:  1.2 .. 2

Table 3. A correspondence of the symbolic values to real values of variables in Figure 7.

The minimum is found when all the x's are equal to 1, and this is represented in the diagram by value 3, since 3 describes the range between 0.4 and 1.2, which includes 1. The global solution is indicated in Figure 7 by a circle.

Figure 7. Learned hypotheses and H-group individuals.

The learned hypotheses (attributional rules) shown in Figure 7 are:

  Rule 1: [x1=1..3] & [x2=1..4] & [x3=1..4]
  Rule 2: [x2=3..4] & [x4=1..3]

Both rules include the individual that represents the function minimum: [x1=3] & [x2=3] & [x3=3] & [x4=3], in short notation: (3,3,3,3).
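The claim that both rules cover the discretized minimum (3,3,3,3) can be checked mechanically; the dictionary encoding of the rules below is our own illustrative representation, not AQ18's:

```python
def satisfies(rule, individual):
    """Check whether a discretized individual satisfies an attributional
    rule, where a rule maps variables to inclusive value ranges."""
    return all(lo <= individual[var] <= hi for var, (lo, hi) in rule.items())

rule1 = {"x1": (1, 3), "x2": (1, 4), "x3": (1, 4)}   # Rule 1 above
rule2 = {"x2": (3, 4), "x4": (1, 3)}                 # Rule 2 above
minimum = {"x1": 3, "x2": 3, "x3": 3, "x4": 3}       # (3,3,3,3)
```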


Problem 2. Find the minimum of the Rastrigin function:

Ras(x_1, x_2, ..., x_n) = 10 \, n + \sum_{i=1}^{n} \left( x_i^2 - 10 \cos(2 \pi x_i) \right)

in which the number of arguments, n, was set to 100, and each x was bounded between -5.12 and 5.12.
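The Rastrigin function as defined above:

```python
import math

def rastrigin(xs):
    """Rastrigin function: 10*n plus the sum of x_i^2 - 10*cos(2*pi*x_i);
    global minimum 0 at the origin."""
    return 10.0 * len(xs) + sum(x * x - 10.0 * math.cos(2.0 * math.pi * x)
                                for x in xs)
```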

The Rastrigin function has many local optima, and it is easy to miss the global solution (Figure 8). In this experiment, both the uniLEM and duoLEM versions were employed, and their results were compared with the best available result from a conventional evolutionary method, which was obtained by a parallel GA with 16 subpopulations and 20 individuals per subpopulation (Muhlenbein, Schomisch, and Born, 1991). This result is shown by the point PGA in Figure 9. LEM2's results were also compared with the performance of ES.

Figure 8. A 2D projection of the Rastrigin function.

The results of uniLEM, duoLEM, ES, and the Parallel GA are shown in Figure 9.

[Figure 9 plots fitness (logarithmic scale) against the number of births for the Rastrigin function with 100 variables; each curve is the average of 10 runs. The curves show ES, uniLEM, and duoLEM; the point PGA marks the best published parallel GA result at 109,072 births.]

Figure 9. Results obtained by ES, LEM2's uniLEM and duoLEM versions, and a Parallel GA for the Rastrigin function with 100 variables.

Figure 9 illustrates the evolutionary process conducted by uniLEM, duoLEM, and ES. It also shows a point indicating the best result obtained by the parallel GA. Each curve represents an average of 10 runs. The y-axis represents the fitness using a logarithmic scale, and the x-axis represents the number of births. As one can see, both uniLEM and duoLEM converged relatively quickly. DuoLEM reached the global minimum with δ=0.0001 in all 10 runs after about 26,000 evaluations. UniLEM found the global minimum 7 times out of 10 (hence the average of the fitness function is higher than in the case of duoLEM). The parallel GA, which achieved the best result found in the literature on this problem, required 109,072 evaluations to achieve δ=0.001 (it used 8 subpopulations, each with 20 individuals). Thus, the speedup of duoLEM over the parallel GA (PGA) was more than 420%. We also investigated the rate of convergence to the optimum obtained by these four algorithms by repeating the experiment for 20, 50, and 100 variables.

[Figure 10 is a bar chart, "How Many Births are Needed as the Number of Variables Increases," comparing ES, Parallel GA, uniLEM, and duoLEM at 20, 50, and 100 variables, with the vertical axis running from 0 to 120,000 births.]

Figure 10. The number of births needed to reach the optimum as a function of the number of arguments.

Figure 10 shows the dependence of the evolution duration (measured by the number of births required to reach the near-optimal solution) on the number of function arguments for different methods. As seen in the figure, the evolution duration in the case of LEM2 has only slightly increased with the number of arguments, while in the case of ES and Parallel GA it has increased much faster.

Problem 3. Find the minimum of the Gaussian Quartic function:

Gauss(x_1, x_2, ..., x_n) = \sum_{i=1}^{n} i \, x_i^4 + Gauss(0, 1)

in which the number of arguments, n, was set to 10, 50, and 100, and each x was bounded between -5.12 and 5.12. This is a simple unimodal function padded with noise (Figure 11). The Gaussian noise ensures that the algorithm never gets the same value on the same point. Algorithms that do not do well on this test function will do poorly on noisy data. In this experiment, uniLEM was compared with ES.

Figure 11. A 2D projection of the Gaussian Quartic function.
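The Gaussian Quartic function as defined above; because of the additive noise term, repeated evaluations at the same point generally differ:

```python
import random

def gaussian_quartic(xs, rng=random):
    """Gaussian Quartic function: sum over i = 1..n of i * x_i^4,
    padded with Gauss(0, 1) noise."""
    return sum(i * x ** 4 for i, x in enumerate(xs, start=1)) + rng.gauss(0.0, 1.0)
```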

Table 4 presents the results of comparing LEM2 in the uniLEM version with ES using different population sizes. Results are shown for different values of δ.

(# vars)              10                        50                          100
                 δ=0    δ=0.01  δ=0.1     δ=0     δ=0.01  δ=0.1     δ=0      δ=0.01   δ=0.1
LEM100.3.3       1000   800     700       4900    4400    3900      21400    10500    10100
ES100.7          3600   3200    2900      40100   40100   36700     432860   391979   92455
Speedup LEM/ES   300%   400%    400%      800%    900%    900%      2000%    3700%    900%

Table 4. Results and relative speedups of LEM2 over ES for the Gaussian Quartic function.

This experiment confirms the results of Problem 2, where it was shown that the speedup of LEM vs. Darwinian evolutionary algorithms increases as the number of dimensions increases. This is to be attributed to the fact that blind operators such as mutation and recombination tend to be less effective with large search spaces. This experiment also shows the ability of LEM to work with noisy functions.

4.2 Design of Heat Exchangers

In order to test LEM on a practical problem, we applied it to the optimization of heat exchanger designs under various technical and environmental constraints. To this end, we developed a specialized system, ISHED1, that customized LEM to this problem. To explain this application, let us briefly explain the problem. In an air conditioning unit, the refrigerant flows through a loop. It is superheated and placed in contact with cooler outside air in the condenser unit, where it transfers heat out and liquefies. Coming back inside to the evaporator, it comes into contact with the warmer interior air that is being pushed through the heat exchanger, as a result cooling the air while heating and evaporating the refrigerant. The heat exchanger consists of an array of parallel tubes through which the refrigerant flows back and forth. Different orderings of the flow of the refrigerant through the individual tubes may have a profound effect on the air conditioner's cooling ability.

ISHED1 applies a version of duoLEM tailored to this problem. Individuals in a population represent designs (structures) of heat exchangers. Each design is defined by a vector that characterizes the arrangement of tubes on the path from the input to the output. In Darwinian Evolution mode, ISHED1 employs eight structure-modifying operators, which make changes in the structures (analogous to mutation operators in evolutionary algorithms). For example, one operator may create a split in a refrigerant path by moving the source of a tube's refrigerant closer to the inlet tube; a second operator may swap the tubes in the structure; and another operator may graft a path of tubes onto another path, etc. (Kaufman and Michalski, 2000b). The application of these operators is domain-knowledge driven; that is, operators are applied according to known technical constraints.

Machine Learning mode in ISHED1 is also tailored to the heat exchanger design task. The hypotheses generated describe abstractions of the individual structures. They specify only the location of inlet, outlet, and split tubes. Beyond that, the instantiation module may choose among the different structures that fit the learned template, and generate the most plausible one according to the real-world background knowledge. ISHED1 uses high and low fitness thresholds of 25% to select the H-group and L-group. Once rules are generated, an elitist strategy is used to form the next generation of proposed architectures. The best architecture found so far, as well as all members of the H-group, are passed to the next generation, along with various instantiations of the learned rules.

An ISHED1 run proceeds as follows. Given instructions characterizing the environment for the target heat exchanger, an initial population of designs (specified by the user or randomly generated), and parameters for the evolutionary process, ISHED1 evolves populations of designs using a combination of Darwinian and machine learning operators for a specified number of generations. ISHED1 returns a report that includes the best designs found and their estimated quality (capacity). Throughout the execution, design capacities are determined by a heat exchanger simulator (Domanski, 1989).

Many experiments have been conducted. Initial experiments concerned a problem with a known expert solution (design) regarding the heat exchanger size and airflow pattern. The best design found by ISHED1 was comparable to the expert designs widely used by industry. Further experiments utilized different exchanger sizes and airflow patterns. The latter changes were especially significant, because commercially built air conditioners typically do not take into account an uneven airflow. When confronted with such situations, their cooling ability suffers. In the case of non-uniform airflow, the ISHED1-designed heat exchangers performed significantly better than the currently-used expert-designed structures (Kaufman and Michalski, 2000b).

An example of the output from an ISHED1 run is shown in Figure 12. This run was done in a verbose mode, and as such, the log details every structure tested, every operator applied, and the rules learned. The figure only shows a very small sample of the full output in order to give the reader a flavor of ISHED1 in action. Added comments are given in italics.

Exchanger Size: 16x3
Population Size: 15   Generations: 40
Operator Persistence: 5
Mode Persistence: GA-probe = 2   SL-probe = 1

Initial population:
Structure #0.3: 171234567891213291531I1833203622382440264211274514471634351937213923412543442846304832 : 5.5376
Structure #0.8: 1712034226248261028271516323321819538740942114413463048343536I213723392541274329453147 : Capacity = 5.2099
and 13 others
Selected Members: 3, 2, 3, 7, 9, 3, 9, ...
Operations: NS(23,39), SWAP(8), SWAP(28), ..., SWAP(29), SWAP(25), SWAP(1)

Below is one of the structures created by the application of an SM operator in Darwinian mode (by swapping the two tubes following tube 29 in Structure #0.8):

Generation 1:
Structure #1.13: 171203422624826102827151632332181953874094211413453048343536I213723392541274346293147 : Capacity = 5.2093
and 14 others.
Selected Members: 6, 15, 11, 3, 13, 1, ...
......

The program soon shifts into Symbolic Learning mode:

Generation 5: Learning mode
Learned rule:
[x1.x2.x3.x4.x5.x6.x7.x8.x9.x11.x12.x13.x14.x15.x17.x18.x19.x20.x21.x22.x23.x24.x25.x26.x27.x28.x29.x30.x31.x32.x33.x34.x35.x36.x37.x38.x39.x40.x41.x42.x43.x44.x45.x46.x47.x48=regular] & [x10=outlet] & [x16=inlet] (t:7, u:7, q:1)
An example of a generated structure:
Structure #5.1: 171234567891229453031I1833203622382440264211271315474834351937213923412543442846143216 : Capacity = 5.5377
.........

Below is a structure from the 21st generation:

Generation 21: Learning mode
Structure #21.15: 218416357891213451531I3317353622392440422511443046324734192037212338412643282729144816 : 5.5387
and 14 others
Selected Members: 11, 4, 4, 13, 15, 10, 12, 13, 15, 15, 12, 2, 3, 5, 10.
.........

ISHED1 continues to evolve structures, and finally achieves:

Generation 40:
Structure #40.15: 33172414569781229464547I134203622382434243442713153216181119372132232540262835​30144831 : Capacity = 6.3686

Figure 12. An excerpt from the log of an ISHED1 run.

5 Summary

This paper presented a selection of recent results from the research on the Learnable Evolution Model, which employs machine learning to speed up evolutionary computation. The results were obtained by the system LEM2, which can operate in two versions, uniLEM and duoLEM. The uniLEM version repeatedly executes Machine Learning mode, which uses rule learning and instantiation as basic operators (the Start-over operator allows the system to switch the population). The duoLEM version alternates between Machine Learning and Darwinian Evolution modes. The latter mode applies a conventional evolutionary computation algorithm.

LEM2, described in this paper, is an improvement on the earlier system, LEM1. It implements multiple rule instantiation (catering to multiple global optima), adaptive anchoring discretization (an automatic adjustment of the precision in discretizing continuous attributes), and a uniLEM version. The results from LEM2 have confirmed previous strong results (Michalski and Zhang, 1999), and demonstrate that the LEM methodology can be highly useful for some problems.

ISHED1 is a version of LEM tailored to a class of problems in engineering design (optimization of heat exchangers). It applies task-specific operators in a LEM-type control environment. Results have confirmed significant benefits from integrating Darwinian Evolution and Machine Learning modes of evolution. By doing so, ISHED1 was able to achieve or exceed the cooling capacity of commercially designed systems under a variety of conditions, with only a moderate amount of domain knowledge.

The process of generating new individuals by hypothesis generation and instantiation used in LEM is computationally more intensive than the process of generating new individuals by mutation and/or recombination. This is offset by the reduction, sometimes very significant, in the number of evaluations needed to reach the evolution goal. Thus, LEM appears to be particularly well-suited for evolutionary computation problems in which evaluating the fitness of individuals in a population is a costly and/or time-consuming operation (as it is, e.g., in the case of heat-exchanger design).

The LEM methodology is at an early stage of development and opens many interesting problems for further research. They include a theoretical and experimental investigation of the trade-offs inherent in LEM, an implementation of more advanced versions of LEM, experimentation with different combinations of conventional evolutionary algorithms and machine learning algorithms, and testing in a variety of practical domains.

Acknowledgments

TheauthorsthankJeffBassettandPaulWiegandfortheirhelpinplottingtheRastriginfunction.

This research has been conducted in the Machine Learning and Inference Laboratory at George Mason University. The Laboratory's research that enabled the work presented in this paper has been supported in part by the National Science Foundation under Grants No. IIS-9904078 and IRI-9510644.

References

Baeck, T., Fogel, D.B. and Michalewicz, Z. (eds.) (1997). Handbook of Evolutionary Computation. Oxford: Oxford University Press.

Cervone, G. (1999). An Experimental Application of the Learnable Evolution Model to Selected Optimization Problems. Master's Thesis, Dept. of Computer Science, George Mason University, Fairfax, VA.

Cervone, G. and Michalski, R.S. (2000). Design and Experiments with the LEM2 Implementation of the Learnable Evolution Model. Reports of the Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA (to appear).

Cervone, G. and Coletti, M. (2000). EC++, a Generic C++ Library for Evolutionary Computation. Reports of the Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA (to appear).

Coletti, M., Lash, T., Mandsager, C., Michalski, R.S., and Moustafa, R. (1996). Comparing Performance of the Learnable Evolution Model and Genetic Algorithms on Problems in Digital Signal Filter Design. Proceedings of the 1996 Genetic and Evolutionary Computation Conference.

Domanski, P.A. (1989). EVSIM - An Evaporator Simulation Model Accounting for Refrigerant and One-Dimensional Air Distribution. NISTIR 89-4133.

Eshelman, L.J. and Schaffer, J.D. (1993). Real-Coded Genetic Algorithms and Interval Schemata. Foundations of Genetic Algorithms 2, San Mateo, CA.

Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.

Holland, J. (1975). Adaptation in Natural and Artificial Systems. Ann Arbor: The University of Michigan Press.

Kaufman, K.A. and Michalski, R.S. (2000a). The AQ18 System for Machine Learning and Data Mining: User's Guide. Reports of the Machine Learning Laboratory, MLI 00-3, George Mason University, Fairfax, VA.

Kaufman, K.A. and Michalski, R.S. (2000b). Applying Learnable Evolution Model to Heat Exchanger Design. Proceedings of the Twelfth International Conference on Innovative Applications of Artificial Intelligence, Austin, TX (to appear).

Michalewicz, Z. (1996). Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag.

Michalski, R.S. (1998). Learnable Evolution: Combining Symbolic and Evolutionary Learning. Proceedings of the Fourth International Workshop on Multistrategy Learning, organized by the University of Torino, Desenzano del Garda, Italy, June 11-13, pp. 14-20.

Michalski, R.S. (2000). LEARNABLE EVOLUTION MODEL: Evolutionary Processes Guided by Machine Learning. Machine Learning 38(1-2), pp. 9-40.

Michalski, R.S. and Cervone, G. (2000). Adaptive Anchoring Quantization of Continuous Variables for Learnable Evolution. Reports of the Machine Learning and Inference Laboratory, George Mason University, Fairfax, VA (to appear).

Michalski, R.S. and Zhang, Q. (1999). Initial Experiments with the LEM1 Learnable Evolution Model: An Application to Function Optimization and Evolvable Hardware. Reports of the Machine Learning and Inference Laboratory, MLI 99-4, George Mason University, Fairfax, VA.

Mitchell, M. (1996). An Introduction to Genetic Algorithms. Cambridge, MA: MIT Press.

Muhlenbein, H., Schomisch, M. and Born, J. (1991). The Parallel Genetic Algorithm as Function Optimizer. Proceedings of the Fourth Int'l Conference on Genetic Algorithms and their Applications.

Ravise, C. and Sebag, M. (1996). An Advanced Evolution Should Not Repeat Its Past Errors. Proceedings of the Thirteenth International Conference on Machine Learning.

Reynolds, R.G. (1994). An Introduction to Cultural Algorithms. Proceedings of the Third Annual Conference on Evolutionary Programming.

Rosenbrock, H.H. (1960). An automatic method for finding the greatest or least value of a function. Computer Journal 3: 175.

Schlierkamp-Voosen, D. and Muhlenbein, H. (1994). Strategy Adaptation by Competing Subpopulations. Parallel Problem Solving from Nature, Proceedings of the Third Workshop, PPSN III, Jerusalem.

Sebag, M. and Schoenauer, M. (1994). Controlling Crossover Through Inductive Learning. Proceedings of the Third Conference on Parallel Problem Solving from Nature, Springer-Verlag.

Sebag, M., Schoenauer, M., and Ravise, C. (1997). Inductive Learning of Mutation Step-size in Evolutionary Parameter Optimization. Proceedings of the Sixth Annual Conference on Evolutionary Programming.
