SLAC-TN-03-028

Neural Networks

Patrick Smith

Office of Science, Science Undergraduate Laboratory Internship (SULI)

Stanford University

Stanford Linear Accelerator Center

Menlo Park, California

August 14, 2003

Prepared in partial fulfillment of the requirements of the Office of Science, Department of Energy's Science Undergraduate Laboratory Internship under the direction of Tony Johnson.

Participant: Signature

Research Advisor: Signature

Work supported in part by the Department of Energy contract DE-AC03-76SF00515.

INTRODUCTION

Physicists use large detectors to measure particles created in high-energy collisions at particle accelerators. These detectors typically produce signals indicating either where ionization occurs along the path of a particle, or where energy is deposited by the particle. The data produced by these signals is fed into pattern recognition programs to try to identify which particles were produced, and to measure the energy and direction of these particles. There are many techniques used in this pattern recognition. One technique, neural networks, is particularly suitable for identifying what type of particle caused a given set of energy deposits.

Neural networks can derive meaning from complicated or imprecise data, extract patterns, and detect trends that are too complex to be noticed by either humans or other computer-based processes.

To assist in the advancement of this technology, physicists use a tool kit to experiment with several neural network techniques. The goal of this research is to interface a neural network tool kit into Java Analysis Studio (JAS3), an application that allows data from any experiment to be analyzed. As the final result, a physicist will have the ability to train, test, and implement a neural network with the desired output while using JAS3 to analyze the results.

Before an implementation of a neural network can take place, a firm understanding of what a neural network is and how it works is beneficial. A neural network is an artificial representation of the human brain that tries to simulate the learning process [5]. The word artificial in that definition means computer programs that use calculations during the learning process. In short, a neural network learns by representative examples.

Perhaps the easiest way to describe the way neural networks learn is to explain how the human brain functions. The human brain contains billions of neural cells that are responsible for processing information [2]. Each one of these cells acts as a simple processor. When individual cells interact with one another, the complex abilities of the brain are made possible. In neural networks, the input data are processed by a propagation function that adds up the values of all the incoming data. The resulting value is then compared with a threshold or specific value, and must exceed the activation function's value in order to become output. The activation function is a mathematical function that a neuron uses to produce an output based on its input value [8]. Figure 1 depicts this process. Neural networks usually have three layers: an input, a hidden, and an output layer. These layers create the end result of the neural network. A real-world example is a child associating the word dog with a picture. The child says dog and simultaneously looks at a picture of a dog. The input is the spoken word "dog", the hidden layer is the brain processing, and the output is the category of the word dog based on the picture. This illustration describes how a neural network functions.
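As a rough illustration of the propagation and activation steps just described, the following minimal Java sketch computes the output of a single artificial neuron. The weights, threshold, and choice of a sigmoid activation are illustrative assumptions, not the functions of any particular editor.

    public class Neuron {
        private final double[] weights;
        private final double threshold;

        public Neuron(double[] weights, double threshold) {
            this.weights = weights;
            this.threshold = threshold;
        }

        // Propagation function: adds up the weighted values of all incoming data.
        private double propagate(double[] inputs) {
            double sum = 0.0;
            for (int i = 0; i < inputs.length; i++) {
                sum += weights[i] * inputs[i];
            }
            return sum;
        }

        // Activation function: a sigmoid applied to the net input minus the
        // threshold, so the output only approaches 1 once the weighted sum
        // exceeds the threshold.
        public double fire(double[] inputs) {
            double net = propagate(inputs) - threshold;
            return 1.0 / (1.0 + Math.exp(-net));
        }

        public static void main(String[] args) {
            // Example weights and threshold (hypothetical values).
            Neuron n = new Neuron(new double[] {0.5, -0.3, 0.8}, 0.2);
            System.out.println(n.fire(new double[] {1.0, 0.0, 1.0}));
        }
    }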

MATERIALS AND METHODS

Before the implementation of a neural network tool kit into JAS3, JAS3 and all of its components had to be installed. An editor capable of the required processing had to be obtained and successfully downloaded on the workstation. The neural network editor that we originally used was the Java Object Oriented Neural Engine (JOONE). This editor would eventually have to be replaced, for reasons explained in the conclusion of this paper.

There were several factors involved in choosing the editor. It had to be user-friendly, free, flexible, and capable of handling the amount of data for our purpose. The tool kit will be used by physicists who are analyzing their specific data, so the application must be easy to use and understand. An equally important factor is that the editor be written in the Java programming language, because the editor will be interfaced with JAS3, a Java-based application.

Many companies sell neural network software, but these software packages are very expensive.

For the purpose of our research there was no need for one of those packages; a more general editor would be sufficient. An editor that was flexible was also important: functionality that includes graphs and other easy-to-use features would improve usability. There are many editors available on the World Wide Web that met our needs and were free to download along with open source code. Finally, the editor had to be able to handle the specified amount of data.

Searching for a neural network editor involved exploring the internet. Many networks were available; however, five met most of the qualifications. The Stuttgart Neural Network Simulator had met the qualifications for a researcher at SLAC on a previous project. NeuroSolutions met all the requirements except that it was not free; in fact, it was quite expensive. Cortex also met all the requirements except that it was written in C++. JOONE satisfied all requirements, so originally JOONE was chosen.

Now that the editor was chosen, understanding its components was important. JOONE is a GUI editor: it provides a graphical user interface that allows you to create, modify, and train a neural network. JOONE is for the most part self-explanatory, and there is a tutorial that is somewhat helpful. In addition, the creator of JOONE, Paolo Marrone, responded to questions about aspects of his editor that were unclear to us. Figure 2 shows the JOONE editor.

For this study we have used a neural network to identify calorimeter clusters in a possible linear collider detector. In such a detector hundreds of particles are created in each collision, each of which deposits its energy into a set of calorimeter cells. Software groups the energy deposits into clusters, each representing the energy deposited by a single particle. By measuring various properties of a cluster and feeding these measurements into a neural network, we can attempt to estimate what type of particle created each cluster. To train the network we use simulated events, in which we already know which type of particle created each energy deposit. The data that we used in the neural net came from Gary Bower, a SLAC physicist who had created a neural network using a different editor. His training and validation data was valuable in a couple of ways. When training a neural network, the data should be trained on more than one net to contrast and compare results; Bower's data gave us that element for comparison. We could also use it to train in different modes just to see the varying outputs.

We constructed a neural network that included 3 layers. The input layer consisted of 15 neurons, the hidden layer 25, and the output layer 6. This means that in each row of the data set, the first 15 numbers are the input and the last 6 are the desired output corresponding to that combination of inputs. The hidden layer size can vary; according to Kevin Swingler [8], the hidden layer should contain no more than twice as many neurons as the input layer. It is practical to experiment with the hidden layer size.

When the layers were finished, another component, referred to as the file input, was used to read the data into the network. This component simply read the first 15 columns into the network. Another component, called the teacher, read the last 6 columns, the desired results, into the network.

The data that we used to train our network consisted of rows of 21 numbers. The first 15 inputs, named NE[0], NE[1], NE[2], NE[0]/NE[1], NE[1]/NE[2], firstL, lastL, length, firstDdiff, aveLE5, angsep*10000, aveLHits2, nhits, ClusEtot, and CE[2], were the quantities that described the clusters. The last 6 desired outputs were nngamma, nnchhad, nneuhad, nngammafrag, nnhadfrag, and nnnone; these quantities identified what type of cluster it was. For example, if the first 15 numbers describe a cluster whose last 6 values are "1 0 0 0 0 0", it is considered an nngamma cluster, because "1 0 0 0 0 0" encodes nngamma, just as "0 0 1 0 0 0" encodes an nneuhad cluster. See Table 1 for the complete translations. We set the initial learning rate η (eta) and momentum term α (alpha). Eta dictates the proportion of the calculated error that contributes to the weight change; alpha relates the change to the size of the previous update for each weight [8]. Although we left these values at the editor defaults, they were important to monitor in order to prevent overfitting the data. We then set the number of cycles (epochs) for the network to train on the data to 10,000. The network does the rest.
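To make the preceding construction concrete, the sketch below assembles a 15-25-6 network with a file input and a teacher using the JOONE engine classes. It is a minimal illustration patterned on JOONE's published examples, not the exact code behind our runs: the file name train.txt, the learning parameters, the pattern count, and some method names (which vary between JOONE versions) are assumptions.

    import org.joone.engine.FullSynapse;
    import org.joone.engine.LinearLayer;
    import org.joone.engine.Monitor;
    import org.joone.engine.SigmoidLayer;
    import org.joone.engine.learning.TeachingSynapse;
    import org.joone.io.FileInputSynapse;
    import org.joone.net.NeuralNet;

    public class ClusterNetSketch {
        public static void main(String[] args) {
            // 15-25-6 topology: linear input layer, sigmoid hidden and output layers.
            LinearLayer input = new LinearLayer();
            input.setRows(15);
            SigmoidLayer hidden = new SigmoidLayer();
            hidden.setRows(25);
            SigmoidLayer output = new SigmoidLayer();
            output.setRows(6);

            // Fully connect input -> hidden -> output.
            FullSynapse inToHidden = new FullSynapse();
            input.addOutputSynapse(inToHidden);
            hidden.addInputSynapse(inToHidden);
            FullSynapse hiddenToOut = new FullSynapse();
            hidden.addOutputSynapse(hiddenToOut);
            output.addInputSynapse(hiddenToOut);

            // "File input" component: feeds columns 1-15 (the cluster quantities).
            FileInputSynapse fileIn = new FileInputSynapse();
            fileIn.setFileName("train.txt");           // hypothetical file name
            fileIn.setAdvancedColumnSelector("1-15");
            input.addInputSynapse(fileIn);

            // "Teacher" component: supplies columns 16-21 (the desired outputs).
            FileInputSynapse desired = new FileInputSynapse();
            desired.setFileName("train.txt");
            desired.setAdvancedColumnSelector("16-21");
            TeachingSynapse teacher = new TeachingSynapse();
            teacher.setDesired(desired);
            output.addOutputSynapse(teacher);

            NeuralNet net = new NeuralNet();
            net.addLayer(input, NeuralNet.INPUT_LAYER);
            net.addLayer(hidden, NeuralNet.HIDDEN_LAYER);
            net.addLayer(output, NeuralNet.OUTPUT_LAYER);
            net.setTeacher(teacher);

            Monitor mon = net.getMonitor();
            mon.setLearningRate(0.8);      // eta: illustrative value; we used the defaults
            mon.setMomentum(0.3);          // alpha: illustrative value
            mon.setTrainingPatterns(1000); // rows in the training file (assumed)
            mon.setTotCicles(10000);       // 10,000 epochs, as in our training runs
            mon.setLearning(true);
            net.go();                      // starts the training
        }
    }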

The purpose of training the net is to minimize the error which the network makes at each output unit over the entire data set [8]. So while training we looked for our error to move closer and closer to zero. It will never actually reach zero; if it did, the network would simply have memorized the training data. So if the training error reached a predetermined target value, flattened out, or started to rise, training should be stopped. The last step was to read the output into a file for comparison. The output was read into AIDA, whose histogram interfaces allowed for a simple and easy-to-interpret presentation of the output.

RESULTS

Table 1 presents sample data that was used to train the net. Each row represents a different data set: the first 15 values were used as inputs and the last 6 as desired outputs. The input values determine what the last 6 (desired output) should be. Table 2 is the validation data, the actual data against which we check what the net has learned. Note that the training input is not identical to the validation input. This is because we do not want to train the net to memorize the data; instead we want to teach the net how to categorize the data. So when training, the error should continue to decrease as the epochs increase. We tried JOONE on the complex set of data with 15 inputs, but we were unable to get it to reach our desired outcome. We studied the theory of neural networks, contacted the author, and tried different neural configurations. The results were never successful. We then thought that the data we were using was too complex, so we tried simpler data. That experiment did not work either. See Figure 3 for the graphical output that JOONE plotted. As you can see in the plot, the error does not continue to decrease; instead it fluctuates. We then decided to try another neural editor named CJNN. We tested the same simple data set that we used with JOONE. The output looked much better, so we proceeded to test CJNN on the complex data set. Again, the output was much better. See Figure 4 for the graphical representation of the data trained on CJNN. Notice that the error decreases as the epochs increase. In Figure 5 we used AIDA to analyze our results from training the net. The plot in0 represents what desired data was used for training.

Out0 displays the results that we obtained after the net was trained. If out0 is satisfactory, it will closely resemble in0. In a closer analysis, the bottom left and right plots show out0 when in0==1 and out0 when in0==0; if the net was working properly we would get all 1s in the left plot and all 0s in the right plot.
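As an illustration of how such histograms can be produced, the sketch below fills the four Figure 5 distributions through the standard hep.aida interfaces available in JAS3. The histogram names and binning mirror Figure 5, while the desired/produced arrays are hypothetical stand-ins for the validation outputs.

    import hep.aida.IAnalysisFactory;
    import hep.aida.IHistogram1D;
    import hep.aida.IHistogramFactory;
    import hep.aida.ITree;

    public class OutputHistograms {
        public static void main(String[] args) {
            // Standard AIDA factory chain, as used from within JAS3.
            IAnalysisFactory af = IAnalysisFactory.create();
            ITree tree = af.createTreeFactory().create();
            IHistogramFactory hf = af.createHistogramFactory(tree);

            // 50 bins between 0 and 1; names and binning are illustrative.
            IHistogram1D in0 = hf.createHistogram1D("in0", 50, 0.0, 1.0);
            IHistogram1D out0 = hf.createHistogram1D("out0", 50, 0.0, 1.0);
            IHistogram1D out0In1 = hf.createHistogram1D("out0 when in0==1", 50, 0.0, 1.0);
            IHistogram1D out0In0 = hf.createHistogram1D("out0 when in0==0", 50, 0.0, 1.0);

            // Hypothetical arrays holding, for each validation row, the desired
            // first output and the value the trained net actually produced.
            double[] desired = {1.0, 0.0, 1.0};
            double[] produced = {0.91, 0.07, 0.84};

            for (int i = 0; i < desired.length; i++) {
                in0.fill(desired[i]);
                out0.fill(produced[i]);
                if (desired[i] == 1.0) out0In1.fill(produced[i]);
                else out0In0.fill(produced[i]);
            }
            // In JAS3 the histograms in the tree can then be plotted interactively.
        }
    }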

DISCUSSION AND CONCLUSION

Although the creation and implementation of a neural net initially seemed simple, I learned that there are many factors that one must take into account. We faced many obstacles while attempting to successfully construct the network. At one point during our research it seemed as though when one barrier was crossed, another would be put up. The problems ranged from input data organization to the network not learning the data.

One of the first problems we encountered was that the JOONE editor required semicolons between each data element, so we had to add semicolons to the 21 columns of over 1,000 rows of data. This was an easily fixable problem; nevertheless, it was time consuming to run a find and replace function (a sketch of such a conversion appears after this paragraph). The JOONE editor included a tutorial that explained the basics of the editor's functionality. The problem with the tutorial was that it did not completely explain the application's capabilities. Training the neural network was also extremely demanding on the workstation: we would have to leave the computer on after we left the lab in the evening to allow the networks to complete training. At one point we analyzed trained data and realized the data was not being trained at all. For about 3 days we kept running tests but continued to get flawed results.
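Because the conversion is mechanical, a small utility along the following lines could perform the find and replace in one pass, turning whitespace-delimited rows into the semicolon-delimited format JOONE expected. The file names are placeholders.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.List;
    import java.util.stream.Collectors;

    public class SemicolonConverter {
        public static void main(String[] args) throws IOException {
            Path in = Paths.get("train_raw.txt"); // placeholder input file
            Path out = Paths.get("train.txt");    // placeholder output file
            List<String> converted = Files.readAllLines(in).stream()
                    // Replace each run of whitespace between values with a semicolon.
                    .map(line -> line.trim().replaceAll("\\s+", ";"))
                    .collect(Collectors.toList());
            Files.write(out, converted);
        }
    }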

Finally, my mentor and I concluded that JOONE, despite its great flexibility and functionality, was not working properly. We contacted Pathak Saurav, a scientist from the University of Pennsylvania, who had created a neural network editor called CJNN. This editor lacks the flexibility of JOONE, but when we ran the same data that we had trained in JOONE, the results looked much better. We migrated away from JOONE and towards CJNN.

Before this research I had little knowledge of this subject. During this research I have learned a lot about how neural networks learn, and we have moved farther towards our goal.

With further work, physicists will have the option to build a network and train data for their personal research. Because of the many problems that we encountered with JOONE and the 8-week time limit that I had, we were unable to completely interface a neural network editor into JAS3. While JOONE looks impressive and very flexible, we had limited success in making it complete the task we wanted. Although CJNN is less flexible, it meets our needs at the current time. We should continue to interface CJNN into JAS3.

ACKNOWLEDGEMENTS

I would like to take this time to thank the Department of Energy for the opportunity to participate in the SLAC SULI internship. Special thanks go to my mentor Tony Johnson for an excellent project that held my interest. I thank Tony for inspiration and a newfound dedication to figuring out the unknowns. I would also like to thank Max Turri for all of his efforts in assisting me. Lastly, I would like to thank my new friends for the fulfilling experience at SLAC/SULI.

REFERENCES

[1] "Artificial Neural Networks in High Energy Physics" HEP. http://neuralnets.web.cern.ch/NeuralNets/nnwInHep.html

[2] Grossberg, Stephen. Neural Networks and Natural Intelligence. Cambridge: MIT Press, fourth printing (1989).

[3] "Overview of Neural Networks" Jeff Heaton. www.jeffheaton.com/ai/javaneural/ch1.html

[4] "Neural Networks at Work" IEEE Spectrum, June 1993. http://www.ece.ogi.edu/~strom/papers/spectrum1.PDF

[5] "Neural Networks with Java" Fachhochschule Regensburg. http://rfhs8012.fh-regensburg.de/~saj39122/jfroehl/diplom/e-index.html

[6] "Java Object Oriented Neural Engine" JOONE. http://joone.sourceforge.net

[7] "Stuttgart Neural Network Simulator" University of Stuttgart. http://www-ra.informatik.uni-tuebingen.de/SNNS/

[8] Swingler, Kevin. Applying Neural Networks: A Practical Guide. San Francisco: Morgan Kaufmann Publishers, Inc. (2001).

TABLE 1. Training data used in the network. Each row holds the 15 input quantities (columns 1-15) followed by the 6 desired outputs (columns 16-21), of which exactly one is 1.00000 and the rest are 0.00000.

TABLE 2. The validation data (actual values), in the same 21-column format as Table 1.

Figure 1. Biological network (model from Jochen Fröhlich).

Dendrites - accept the inputs.
Core - processes the inputs.
Axon - turns the processed inputs into outputs.

Figure 2. The JOONE editor.

Figure 3. The JOONE error plot of an unsuccessful 15-25-6 multilayer perceptron training run.

Figure 4. The CJNN output plot.

FIGURE 5. Plots comparing the desired output with the resulting output. Panels: in0 (the desired output used for training), out0 (the trained network's output), out0 when in0==1, and out0 when in0==0.