
Computational Methods for Data Analysis – 2014/15

Lab 5: Neural Networks in R

In this lab we'll see how to train a neural net for classification, using the package nnet.¹

¹ The lab follows closely the presentation in http://www.louisaslett.com/Courses/Data_Mining/ST4003-Lab5-Introduction_to_Neural_Networks.pdf

Starting

In this lab we'll use the nnet library; you should start by loading it into your R environment using library() (and installing the package using install.packages() if you haven't installed it yet).

Loading the cheese dataset

In this exercise we'll use the cheese dataset, containing information about the taste, acetic component, lactic component, etc. of several types of cheese.

Exercise: Download the cheese data from the Lab5 directory on the course website.

Exercise: Load the cheese data into a variable called cheese in your workspace.

Exploring the data

Exercise: Use the commands

    cheese
    names(cheese)
    summary(cheese)

to get a feel for the data.

Create and analyze a perceptron

Fit a simple neural net to predict taste, with no hidden layer and a linear activation function, using the function nnet(). nnet() takes as arguments:

- a formula specifying output and input variables (just as in linear regression)
- a parameter data specifying the data frame to be used
- a parameter size specifying the number of hidden units (0 in this case)
- a parameter skip specifying whether the inputs are directly connected with the outputs or not
- a parameter linout specifying whether a linear activation function is desired.

(See http://cran.r-project.org/web/packages/nnet/nnet.pdf for full details of all the arguments.)

For example, the following command creates a perceptron with three input variables and one output variable, using a linear activation function instead of the logistic function.

    > fitnn1 = nnet(taste ~ Acetic + H2S + Lactic, cheese, size=0, skip=TRUE, linout=TRUE)
    > summary(fitnn1)

In the output of summary(), i1, i2 and i3 are the input nodes (Acetic, H2S and Lactic respectively); o is the output node (taste); and b is the bias. The values under the links are the weights.
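If you want the weights programmatically rather than reading them off the summary output, the following should work (a minimal sketch, assuming fitnn1 has been fitted as above; coef() and the wts component are provided by nnet):

    > coef(fitnn1)   # weights as a named vector, names matching the links printed by summary()
    > fitnn1$wts     # the same weights as a plain numeric vector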
Evaluating the predictions of a perceptron

Let us now use the J cost function implemented in Lab 1 to evaluate the model:

    J = function(model) {
      return(mean(residuals(model) ^ 2) / 2)
    }

Exercise: Compute J of fitnn1.

Let us now compare this to the fit we get by doing standard least squares regression:

    fitlm = lm(taste ~ Acetic + H2S + Lactic, cheese)
    summary(fitlm)

Exercise: Compute J of fitlm.

Adding a Hidden Layer With a Single Node

We now add a hidden layer, with a single node, in between the input and output neurons.

    > fitnn3 = nnet(taste ~ Acetic + H2S + Lactic, cheese, size=1, linout=TRUE)

During training you should see output like the following:

    # weights:  6
    initial  value 25372.359476
    final  value 7662.886667
    converged

You can visualize the structure of the network created by nnet using the function plot.nnet(). Getting hold of it requires the devtools library:

    library(devtools)

The following incantation then defines plot.nnet():

    source_url('https://gist.githubusercontent.com/fawda123/7471137/raw/466c1474d0a505ff044412703516c34f1a4684a5/nnet_plot_update.r')

We can now plot the architecture of fitnn3 with the following command:

    plot.nnet(fitnn3)

(NB plot.nnet() doesn't seem to work with perceptrons.)

Train and Test subsets

In this exercise we are going to use part of our data for training and part for testing.

First of all, download the Titanic.csv dataset and load it into a variable called data. This dataset provides information on the fate of passengers on the maiden voyage of the ocean liner 'Titanic'. Explore the dataset.

Then split the dataset into a training set consisting of 2/3 of the data and a test set consisting of the other 1/3, using the method seen in Lab 4:

    n <- length(data[,1])
    indices <- sort(sample(1:n, round(2/3 * n)))
    train <- data[indices,]
    test <- data[-indices,]

The nnet package requires that the target variable of the classification (in this case, Survived) be a two-column matrix: one column for No and one for Yes, with a 1 in the No column for the items to be classified as No and a 1 in the Yes column for the items to be classified as Yes:

         No  Yes
    Yes   0    1
    Yes   0    1
    No    1    0
    Yes   0    1

Etc. Luckily, the class.ind utility function in nnet does this for us:

    train$Surv = class.ind(train$Survived)
    test$Surv = class.ind(test$Survived)

(Use head to see what the result is on train, for instance.)

We can now use train to fit a neural net. The softmax parameter specifies a softmax activation function on the output (equivalent to the logistic function for two classes):

    fitnn = nnet(Surv ~ Sex + Age + Class, train, size=1, softmax=TRUE)
    fitnn
    summary(fitnn)

And then test its performance on test:

    table(data.frame(predicted=predict(fitnn, test)[,2] > 0.5,
                     actual=test$Surv[,2] > 0.5))
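To reduce that confusion table to a single accuracy figure, a sketch along the following lines should work (the variable names conf and accuracy are just illustrative, and it assumes both TRUE and FALSE occur among the predictions so that the table is 2 by 2):

    # store the confusion table, then divide the correctly classified counts by the total
    conf <- table(data.frame(predicted=predict(fitnn, test)[,2] > 0.5,
                             actual=test$Surv[,2] > 0.5))
    accuracy <- sum(diag(conf)) / sum(conf)
    accuracy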