In my last post I said I wasn’t going to write any more about neural networks (i.e., multilayer feedforward perceptron, supervised ANN, etc.). That was a lie. I’ve received several requests to update the neural network plotting function described in the original post. As previously explained, R does not provide many options for visualizing neural networks. The only option I know of is a plotting method for objects from the neuralnet package. This may be just my opinion, but I think this plot leaves much to be desired (see below). Also, no plotting methods exist for neural networks created in other packages, i.e., nnet and RSNNS. These three packages are the only ones listed on the CRAN task view, so I’ve updated my original plotting function to work with all three. Additionally, I’ve added a new option for plotting a raw weight vector to allow use with neural networks created elsewhere. This post describes these changes, as well as some new arguments added to the original function.
As usual, I’ll simulate some data to use for creating the neural networks. The dataset contains eight input variables and two output variables. The final dataset is a data frame with all variables, as well as separate data frames for the input and output variables. I’ve retained separate datasets based on the syntax for each package.
    library(clusterGeneration)

    seed.val<-2
    set.seed(seed.val)

    num.vars<-8
    num.obs<-1000

    #input variables
    cov.mat<-genPositiveDefMat(num.vars,covMethod=c("unifcorrmat"))$Sigma
    rand.vars<-mvrnorm(num.obs,rep(0,num.vars),Sigma=cov.mat)

    #output variables
    parms<-runif(num.vars,-10,10)
    y1<-rand.vars %*% matrix(parms) + rnorm(num.obs,sd=20)
    parms2<-runif(num.vars,-10,10)
    y2<-rand.vars %*% matrix(parms2) + rnorm(num.obs,sd=20)

    #final datasets
    rand.vars<-data.frame(rand.vars)
    resp<-data.frame(y1,y2)
    names(resp)<-c('Y1','Y2')
    dat.in<-data.frame(resp,rand.vars)
The various neural network packages are used to create separate models for plotting.
    #nnet function from nnet package
    library(nnet)
    set.seed(seed.val)
    mod1<-nnet(rand.vars,resp,data=dat.in,size=10,linout=T)

    #neuralnet function from neuralnet package, notice use of only one response
    library(neuralnet)
    form.in<-as.formula('Y1~X1+X2+X3+X4+X5+X6+X7+X8')
    set.seed(seed.val)
    mod2<-neuralnet(form.in,data=dat.in,hidden=10)

    #mlp function from RSNNS package
    library(RSNNS)
    set.seed(seed.val)
    mod3<-mlp(rand.vars, resp, size=10,linOut=T)
I’ve noticed some differences between the functions that could lead to some confusion. For simplicity, the above code represents my interpretation of the most direct way to create a neural network in each package. Be aware that direct comparison of results is not advised, given that the default arguments differ between the packages. A few key differences are as follows, although many others should be noted. First, the functions differ in the methods for passing the primary input variables. The nnet function can take separate (or combined) x and y inputs as data frames or as a formula, the neuralnet function can only use a formula as input, and the mlp function can only take data frames as input, with the variables either combined or separate. As far as I know, the neuralnet function is not capable of modelling multiple response variables, unless the response is a categorical variable that uses one node for each outcome. Additionally, the default output for the neuralnet function is linear, whereas the opposite is true for the other two functions.
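As a minimal sketch of the first difference, the same small network can be fit with nnet using either separate x and y inputs or a formula. The `toy` data frame, its variable names, and the choice of four hidden nodes are made up for illustration; `trace = FALSE` only suppresses the fitting log.

```r
library(nnet)  # recommended package, included with most R installations

set.seed(2)

# hypothetical toy data: three inputs and one continuous response
toy <- data.frame(X1 = rnorm(50), X2 = rnorm(50), X3 = rnorm(50))
toy$Y1 <- 2 * toy$X1 - toy$X2 + rnorm(50, sd = 0.5)

# separate x and y inputs, both supplied as data frames
fit_xy <- nnet(toy[, c('X1', 'X2', 'X3')], toy[, 'Y1', drop = FALSE],
               size = 4, linout = TRUE, trace = FALSE)

# formula input with a combined data frame
fit_form <- nnet(Y1 ~ X1 + X2 + X3, data = toy,
                 size = 4, linout = TRUE, trace = FALSE)

# both calls describe the same architecture: input, hidden, output nodes
fit_xy$n
fit_form$n
```

Note that `linout = TRUE` has to be set explicitly here because nnet does not use a linear output by default.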
Specifics aside, here’s how to use the updated plot function. Note that the same syntax is used to plot each model.
    #import the function from Github
    library(devtools)
    source_url('https://gist.githubusercontent.com/fawda123/7471137/raw/466c1474d0a505ff044412703516c34f1a4684a5/nnet_plot_update.r')

    #plot each model
    plot.nnet(mod1)
    plot.nnet(mod2)
    plot.nnet(mod3)
The neural networks for each model are shown above. Note that only one response variable is shown for the second plot. Also, neural networks created using mlp do not show bias layers, causing a warning to be returned. The documentation about bias layers for this function is lacking, although I have noticed that the model object returned by mlp does include information about ‘unitBias’ (see the output from mod3$snnsObject$getUnitDefinitions()). I wasn’t sure what this was so I excluded it from the plot. Bias layers aren’t all that informative anyway, since they are analogous to intercept terms in a regression model. Finally, the default variable labels differ for the mlp plot from the other two. I could not find any reference to the original variable names in the mlp object, so generic names returned by the function are used.
I have also added five new arguments to the function. These include options to remove bias layers (bias), remove variable labels (var.labs), supply your own variable labels (x.lab and y.lab), and include the network architecture if using weights directly as input (struct). The full list of arguments is as follows.
mod.in | neural network object or numeric vector of weights; if a model object, it must be from the nnet, mlp, or neuralnet functions
nid | logical value indicating if neural interpretation diagram is plotted, default T
all.out | character string indicating names of response variables for which connections are plotted, default all
all.in | character string indicating names of input variables for which connections are plotted, default all
bias | logical value indicating if bias nodes and connections are plotted, not applicable for networks from mlp function, default T
wts.only | logical value indicating if connection weights are returned rather than a plot, default F
rel.rsc | numeric value indicating maximum width of connection lines, default 5
circle.cex | numeric value indicating size of nodes, passed to cex argument, default 5
node.labs | logical value indicating if labels are plotted directly on nodes, default T
var.labs | logical value indicating if variable names are plotted next to nodes, default T
x.lab | character string indicating names for input variables, default from model object
y.lab | character string indicating names for output variables, default from model object
line.stag | numeric value that specifies distance of connection weights from nodes
struct | numeric value of length three indicating network architecture (no. nodes for input, hidden, output), required only if mod.in is a numeric vector
cex.val | numeric value indicating size of text labels, default 1
alpha.val | numeric value (0-1) indicating transparency of connections, default 1
circle.col | character string indicating color of nodes, default ‘lightblue’, or two element list with first element indicating color of input nodes and second indicating color of remaining nodes
pos.col | character string indicating color of positive connection weights, default ‘black’
neg.col | character string indicating color of negative connection weights, default ‘grey’
max.sp | logical value indicating if space between nodes in each layer is maximized, default F
... | additional arguments passed to generic plot function
The plotting function can also now be used with an arbitrary weight vector, rather than a specific model object. The struct argument must also be included if this option is used. I thought the easiest way to use the plotting function with your own weights was to have the input weights as a numeric vector, including bias layers. I’ve shown how this can be done using the weights directly from mod1 for simplicity.
    wts.in<-mod1$wts
    struct<-mod1$n
    plot.nnet(wts.in,struct=struct)
Note that wts.in is a numeric vector with length equal to that expected given the architecture (i.e., for an 8-10-2 network, 100 connection weights plus 12 bias weights). The plot should look the same as the plot for the neural network from nnet.
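The arithmetic behind that length can be checked with a small helper. This is only a sketch: `n_weights` is a hypothetical name, not part of the plotting function, and it assumes every non-input layer receives one bias weight per node.

```r
# hypothetical helper: expected number of weights for a given architecture
# struct gives the number of nodes in each layer, e.g. c(8, 10, 2)
n_weights <- function(struct) {
  n_from <- head(struct, -1)  # nodes sending connections into each layer
  n_to <- tail(struct, -1)    # nodes receiving connections in each layer
  c(connections = sum(n_from * n_to),
    bias = sum(n_to),
    total = sum((n_from + 1) * n_to))
}

# 8 * 10 + 10 * 2 connection weights, 10 + 2 bias weights
n_weights(c(8, 10, 2))
```

A weight vector whose length differs from the `total` value cannot match the architecture given in struct.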
The weights in the input vector need to be in a specific order for correct plotting. I realize this is not clear by looking directly at wts.in but this was the simplest approach I could think of. The weight vector shows the weights for each hidden node in sequence, starting with the bias input for each node, then the weights for each output node in sequence, starting with the bias input for each output node. Note that the bias layer has to be included even if the network was not created with biases. If this is the case, simply input a random number where the bias values should go and use the argument bias=F. I’ll show the correct order of the weights using an example with plot.nn from the neuralnet package since the weights are included directly on the plot.
If we pretend that the above figure wasn’t created in R, we would input the mod.in argument for the updated plotting function as follows. Also note that struct must be included if using this approach.
    mod.in<-c(13.12,1.49,0.16,-0.11,-0.19,-0.16,0.56,-0.52,0.81)
    struct<-c(2,2,1) #two inputs, two hidden, one output
    plot.nnet(mod.in,struct=struct)
Note the comparability with the figure created using the neuralnet package. That is, larger weights have thicker lines and color indicates sign (+ black, – grey).
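To make the required order easier to see, the sketch below labels each position of a weight vector for a given architecture. `weight_labels` is a hypothetical helper written for this illustration, and the label scheme (B = bias, I = input, H = hidden as layer then node, O = output) is an assumption about naming only; the order itself follows the description above.

```r
# hypothetical helper: label each element of a weight vector in plotting order
weight_labels <- function(struct) {
  n_layers <- length(struct)

  # node names for each layer, e.g. I1, I2 / H11, H12 / O1
  node_names <- lapply(seq_len(n_layers), function(i) {
    prefix <- if (i == 1) 'I' else if (i == n_layers) 'O' else paste0('H', i - 1)
    paste0(prefix, seq_len(struct[i]))
  })

  labs <- character(0)
  for (i in 2:n_layers) {
    # each receiving node gets its bias first, then the previous layer's nodes
    from <- c(paste0('B', i - 1), node_names[[i - 1]])
    for (to in node_names[[i]]) {
      labs <- c(labs, paste(from, to, sep = '-'))
    }
  }
  labs
}

# pair the labels with the example weights used above
mod.in <- c(13.12, 1.49, 0.16, -0.11, -0.19, -0.16, 0.56, -0.52, 0.81)
setNames(mod.in, weight_labels(c(2, 2, 1)))
```

Read this way, the first three values feed the first hidden node (bias first), the next three feed the second hidden node, and the last three feed the output node.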
One of these days I’ll actually put these functions in a package. In the meantime, please let me know if any bugs are encountered.
Cheers,
Marcus
Update:
I’ve changed the function to work with neural networks created using the train function from the caret package. The link above is updated but you can also grab it here.
    library(caret)
    mod4<-train(Y1~.,method='nnet',data=dat.in,linout=T)
    plot.nnet(mod4,nid=T)
Also, factor levels are now correctly plotted if using the nnet function.
    fact<-factor(sample(c('a','b','c'),size=num.obs,replace=T))
    form.in<-formula('cbind(Y2,Y1)~X1+X2+X3+fact')
    mod5<-nnet(form.in,data=cbind(dat.in,fact),size=10,linout=T)
    plot.nnet(mod5,nid=T)
Update 2:
More updates… I’ve now modified the function to plot multiple hidden layers for networks created using the mlp function in the RSNNS package and neuralnet in the neuralnet package. As far as I know, these are the only neural network functions in R that can create multiple hidden layers. All others use a single hidden layer. I have not tested the plotting function using manual input for the weight vectors with multiple hidden layers. My guess is it won’t work but I can’t be bothered to change the function unless it’s specifically requested. The updated function can be grabbed here (all above links to the function have also been changed).
    library(RSNNS)

    #neural net with three hidden layers, 9, 11, and 8 nodes in each
    mod<-mlp(rand.vars, resp, size=c(9,11,8),linOut=T)
    par(mar=numeric(4),family='serif')
    plot.nnet(mod)
Here’s an example using the neuralnet function with binary predictors and categorical outputs (credit to Tao Ma for the model code).
    library(neuralnet)

    #response
    AND<-c(rep(0,7),1)
    OR<-c(0,rep(1,7))

    #response with predictors
    binary.data<-data.frame(expand.grid(c(0,1),c(0,1),c(0,1)),AND,OR)

    #model
    net<-neuralnet(AND+OR~Var1+Var2+Var3, binary.data,hidden=c(6,12,8),rep=10,err.fct="ce",linear.output=FALSE)

    #plot output
    par(mar=numeric(4),family='serif')
    plot.nnet(net)
Update 3:
The color vector argument (circle.col) for the nodes was changed to allow a separate color vector for the input layer. The following example shows how this can be done using relative importance of the input variables to color-code the first layer.
    #example showing use of separate colors for input layer
    #color based on relative importance using 'gar.fun'

    ##
    #create input data
    seed.val<-3
    set.seed(seed.val)

    num.vars<-8
    num.obs<-1000

    #input variables
    library(clusterGeneration)
    cov.mat<-genPositiveDefMat(num.vars,covMethod=c("unifcorrmat"))$Sigma
    rand.vars<-mvrnorm(num.obs,rep(0,num.vars),Sigma=cov.mat)

    #output variables
    parms<-runif(num.vars,-10,10)
    y1<-rand.vars %*% matrix(parms) + rnorm(num.obs,sd=20)

    #final datasets
    rand.vars<-data.frame(rand.vars)
    resp<-data.frame(y1)
    names(resp)<-'Y1'
    dat.in<-data.frame(resp,rand.vars)

    ##
    #create model
    library(nnet)
    mod1<-nnet(rand.vars,resp,data=dat.in,size=10,linout=T)

    ##
    #relative importance function
    library(devtools)
    source_url('https://gist.github.com/fawda123/6206737/raw/2e1bc9cbc48d1a56d2a79dd1d33f414213f5f1b1/gar_fun.r')

    #relative importance of input variables for Y1
    rel.imp<-gar.fun('Y1',mod1,bar.plot=F)$rel.imp

    #color vector based on relative importance of input values
    cols<-colorRampPalette(c('green','red'))(num.vars)[rank(rel.imp)]

    ##
    #plotting function
    source_url('https://gist.githubusercontent.com/fawda123/7471137/raw/466c1474d0a505ff044412703516c34f1a4684a5/nnet_plot_update.r')

    #plot model with new color vector
    #separate colors for input vectors using a list for 'circle.col'
    plot(mod1,circle.col=list(cols,'lightblue'))
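The `cols` line above works by building a green-to-red palette with one color per input and then reordering it by the rank of each importance value. A minimal sketch of that indexing, using made-up importance values in a hypothetical `imp` vector rather than real output from gar.fun:

```r
# made-up importance values for four inputs (not output from gar.fun)
imp <- c(X1 = 0.2, X2 = -0.5, X3 = 0.9, X4 = 0.1)

# one color per input, running from green (lowest) to red (highest)
pal <- colorRampPalette(c('green', 'red'))(length(imp))

# rank() gives each input's position in the sorted values,
# so indexing the palette by rank assigns colors in importance order
cols <- pal[rank(imp)]
names(cols) <- names(imp)
cols
```

The input with the smallest value (X2) gets pure green and the one with the largest (X3) gets pure red. Tied values would produce fractional ranks, so this sketch assumes there are no ties.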
I really like this function for plotting Neural networks. Thanks Marcus.
I just wanted to ask you about two things.
It seems to run into a problem when you train an nnet with caret using train(…, method="nnet"): when the function tries to evaluate the resulting train object, it falls over. The train object from caret has a property modFormula that your function cannot evaluate. I have updated your function in lines 101 and 106 to use my formula so it works for me, but I think it would be a good addition to your function to get it to work with caret out of the box.
The second thing is when I train a function with caret or just with the nnet package and use "WL~." as my formula, the latest version of the plot.nnet function seems to fall over with an error message
"Error in terms.formula(forms) : '.' in formula and no 'data' argument".
I think it needs me to specify the formula explicitly; something like WL~HA + DATE …. It worked fine in your previous version. There might be a better way to do this?
Thanks again.
Thanks for pointing that out Ed. I wasn’t aware that you could average nnet models in caret. I wish I would have known this about a year ago…. I can update the function this weekend based on your two points. The error you refer to in your second point has to do with how the function gets default names for the variables. This part of the code is a hot mess that needs to be reworked.
one other point Marcus, I was trying to fit a neural network with nnet and I use a mixture of factor and numeric variables. In that case the nodes (when I implement your plot function) are labelled incorrectly. In line 109 of your code where it says x.names <- colnames(facts) doesn't work for me. I needed to change the code to be x.names<- mod.in$coefnames.
Alright, I think I’ve got it working correctly now. Let me know if you still have trouble.
That’s great, it works for all the things that I pointed out. Very useful. Thanks Marcus.
Thanks Marcus. I’m new to R, but I know you’ve done a great work. I like your plot function. Here is my question:
I want to plot a NN architecture with multiple hidden layers (e.g., 2 hidden layers with 6 nodes in the first layer and 8 in the second), however, the function can only plot the first hidden layer with 6 nodes, doesn’t show the second layer. Would you please show me how to make it? Thanks.
For example, I tried to specify 2 hidden layers with size=c(6,8) in RSNNS mlp function as follows, I want to plot this NN architecture diagram.
R code:
> mod3 plot.nnet(mod3)
Hey Tao, check your email. I’ll try to include this in the function.
See the last update and let me know if you have any problems.
Hi,
Could you please modify the function so that the colour of the input nodes (circles) will represent the sign of their total contribution to the output?
Hi Aminu, please see my update above… it should work now though I didn’t update the function to do this automatically, I just modified the input argument for the node colors.
Thanks Marcus, I tried the coloring and it’s working fine, but my expectation was that I could get two colors, one for positive contribution and the other for negative contribution.
Easy, just create a color vector like this:
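A minimal sketch of such a two-color vector, assuming a hypothetical named vector `rel_imp` of signed importance values (the numbers below are made up) and an arbitrary choice of black for positive and grey for negative:

```r
# made-up signed importance values for eight inputs
rel_imp <- c(X1 = 0.8, X2 = -0.3, X3 = 0.1, X4 = -0.6,
             X5 = 0.4, X6 = -0.1, X7 = 0.7, X8 = -0.9)

# one color for positive contributions, another for negative
cols <- ifelse(rel_imp > 0, 'black', 'grey')
cols

# the vector would then be supplied as the first element of the list
# given to the circle.col argument, e.g. list(cols, 'lightblue')
```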
Hi,
Thanks for your useful blog, I learnt how to use NNets mostly from here.
Is there a way to use the gar.fun function simply from raw vectors of weights? I’ve been using the function with models created by the nnet package and everything was fine, but now I’d like to obtain the relative importance of the inputs for a model that I created using a Java library. I managed to plot the neural network, and was wondering if I could do the same using the gar.fun function.
Is there something I could adapt from the function? (as I am pretty new to R, I might have missed it).
Thanks a lot for your kind help.
Best,
IT
Hi there,
I’m always happy when someone finds these posts useful! As you know, the gar.fun function doesn’t work with raw vector inputs. I plan on extending the functionality in the plot.nnet function to all the other neural network functions on the blog. I will work a bit on this over the weekend. Stay tuned.
-Marcus
See the update to the function here: https://beckmw.wordpress.com/2013/08/12/variable-importance-in-neural-networks/
That’s exactly what I needed. Thanks a lot for helping!
Cheers,
IT
Hello, Marcus.
Congratulations for the great work.
I have to plot a neural network with 81 inputs, 1 hidden layer and 1 output.
But the 81 inputs overlaps in the graph and the image doesn’t come out right.
Can you help me to put some space between the nodes?
Thanks!
Rafael Coelho
Hi Rafael,
Overlapping will be a challenge with so many variables, but you can try modifying the circle and text size with the circle.cex and cex.val arguments. I also made a quick change to the plotting function to include a new argument to maximize space between nodes, max.sp. Maybe something like this will work for you?
-Marcus
I would also like to offer my gratitude to becky
Nice tutorial. Thanks!
You’re welcome, thanks for reading!
Dear Marcus,
I frequently use the multinom() function when working with multiple categorical response variables.
Is there any way to visualize how nnet deals with such models?
Many thanks for this great site and R code!
Best regards,
Christoph
I have never used the multinom function but I think I can make sense of it after looking at the help documentation. If I understand correctly, there is no hidden layer in multinomial model fit with nnet. You can verify this by looking at the ‘n’ attribute in the resulting model, e.g., my_mod$n. This attribute shows the structure of the neural network as three values – number of input nodes, number of hidden nodes, and number of output nodes. The number of hidden nodes is zero, which cannot be changed. A neural network without a hidden layer is basically just a linear model, so it wouldn’t make sense to plot like a neural network. Hope that helps!
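A quick way to run that check, sketched here with the iris data that ships with R (the choice of dataset and the name `my_mod` are only for illustration):

```r
library(nnet)

# multinomial model with a three-level factor response
my_mod <- multinom(Species ~ ., data = iris, trace = FALSE)

# structure of the underlying network: input, hidden, and output nodes
my_mod$n

# the second value is the number of hidden nodes
my_mod$n[2]
```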
if I use this for project I’m working on, how do I cite you?
Hi there, you can just cite the package on CRAN (or the current version number on the development repo on GitHub):
Beck MW. 2015. NeuralNetTools: Visualization and Analysis Tools for Neural Networks. Version 1.3.7. http://cran.r-project.org/web/packages/NeuralNetTools/
Dear Marcus
I would like to implement the last plot with colors in the predictors. I have trained a nnet model for classification with a structure 4-16-4 network with 148 weights.
I just wonder if you know how can extract the right name of the output to add in gar.fun using the model fitted with the nnet function?
recmod$lev returns me:
"de" "di" "g" "mo"
when I used some of those levels names (outputs) in the gar.fun, it returns:
gar.fun('de',recmod,col=cols,ylab='Rel. importance',ylim=c(-1,1))
Error in best.wts[[grep(paste("out", out.ind), names(best.wts))]] :
recursive indexing failed at level 2
The following lines summary my model:
    model$terms
    pattern ~ PC1 + PC2 + PC3 + PC4
    attr(,"variables")
    list(pattern, PC1, PC2, PC3, PC4)
    attr(,"factors")
            PC1 PC2 PC3 PC4
    pattern   0   0   0   0
    PC1       1   0   0   0
    PC2       0   1   0   0
    PC3       0   0   1   0
    PC4       0   0   0   1
    attr(,"term.labels")
    [1] "PC1" "PC2" "PC3" "PC4"
    attr(,"order")
    [1] 1 1 1 1
    attr(,"intercept")
    [1] 1
    attr(,"response")
    [1] 1
    attr(,".Environment")
    attr(,"predvars")
    list(pattern, PC1, PC2, PC3, PC4)
    attr(,"dataClasses")
      pattern       PC1       PC2       PC3       PC4
     "factor" "numeric" "numeric" "numeric" "numeric"
Thanks!!
Alejandro
Hi Alejandro,
Try installing the NeuralNetTools package and replacing your model with the code below (and change colors as you want).
You should install the development version of the package as I’ve made some changes since the version on CRAN went online.
Let me know if you still have problems.
-Marcus
Hi Marcus,
Thanks for providing that information, However, due to my ANN architecture has 4 outputs, I am having the following error using your code:
Error in garson.nnet(recmod, bar_plot = FALSE) :
Garson only applies to neural networks with one output node
What variable importance analysis do you suggest for my ANN for classification?
Best,
Alejandro
Hi Alejandro,
Yes, the Garson method was developed for only one output variable. Try using the olden method. It works for more than one output and gives relative importance as a continuous variable rather than absolute. You should be able to input the ‘importance values’ from the olden method with my code above.
-Marcus
Dear Marcus,
Thank you so much for this great resource!
I was wondering if you have any suggestion on which R neural network packages are best for continuous data, example, for rainfall prediction?
Hi Fish, thanks for reading. The packages that create neural networks in R are all pretty similar, but they of course differ in the default arguments for the model inputs. The most difficult part is finding an ‘optimal’ model that best fits your data, so be sure to use separate training and test datasets. Make sure you look at different network architectures as well, e.g., vary the number of nodes in the hidden layer, check which input variables are redundant, etc. You might start with the train function from the caret package, using method = "nnet". It’s a nice way to test multiple models without writing a custom script. Also make sure to use linout = TRUE for continuous output data. Hope that helps…
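As a rough sketch of that workflow without caret, the code below splits a made-up dataset into training and test rows and compares test error across a few hidden layer sizes using nnet directly. The data, the 150/50 split, and the candidate sizes are all assumptions for illustration.

```r
library(nnet)

set.seed(3)

# made-up continuous data: two inputs, one response
n <- 200
toy <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
toy$Y <- 3 * toy$X1 - 2 * toy$X2 + rnorm(n, sd = 0.5)

# separate training and test datasets
train_rows <- sample(n, 150)
train_dat <- toy[train_rows, ]
test_dat <- toy[-train_rows, ]

# fit one model per hidden layer size and get test RMSE for each
sizes <- c(1, 3, 5)
rmse <- sapply(sizes, function(s) {
  fit <- nnet(Y ~ X1 + X2, data = train_dat, size = s,
              linout = TRUE, trace = FALSE)
  pred <- predict(fit, newdata = test_dat)
  sqrt(mean((test_dat$Y - pred)^2))
})
names(rmse) <- paste0('size_', sizes)
rmse
```

Because nnet starts from random weights, the values change with the seed, so a single run like this is only a starting point for comparing architectures.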
-Marcus
Dear Marcus,
Thank you so much!!
I was wondering if it is possible to plot the final neural net found using the train command in caret. In the code below, I am trying to find the best decay and size values (number of hidden units) using CV. I am using model averaging using avNNet. Is there a way to plot it?
Thanks,
nnetGrid <- expand.grid(.decay = seq(0, 0.1, .01),
.size = c(1:10), .bag = FALSE)
ctrl <- trainControl(method = "cv", number = 10)
nnetTune <- train(trainXnnet, training[,11],
method = "avNNet", repeats = 5, tuneGrid = nnetGrid, trControl = ctrl, preProc = c("center", "scale"), linout = TRUE, trace = FALSE,
MaxNWts = 10 * (ncol(trainXnnet) + 1) + 10 + 1,
maxit = 500)
Hi fish,
This has come up before… the problem is that the ‘avNNet’ method of train creates a final model that averages multiple nnet models to create the final output. As far as I can tell, the output doesn’t include enough information to use the plotting function. My suggestion is to recreate the individual models that were used to make the averaged models, then look at each model separately. Try this code, it isolates individual models from the output, recreates them, then uses the NeuralNetTools functions.
Really, helpful!! Great, thanks!!
Great work first of all. Just wondering if there is a way to threshold some of the weights? Or if possible just to show the top 5% of the edges, ranked by absolute value of weight.
Cheers,
Sachin
Hi Sachin, no this is not possible with the current version. However, the RSNNS package can create pruned neural networks and I have included a plotting method for these models. This is kind of similar to what you are trying to do.
If you really want to plot only weights with a given value, you can request that feature here: https://github.com/fawda123/NeuralNetTools/issues
Hope that helps.
-Marcus
Hello,
I just discovered your package (great work by the way) as I was looking for a way to plot different kinds of neural networks for pedagogical purposes.
SO, I have a model with 252 input variables and 1 output variable.
Hence I have 2 weight matrices:
w1: 252×10
w2: 10×10
And two bias vectors (1×10) b1 and b2.
According to your description on the requirements for a numerical input and the struct parameter, I have flatten the weights and biases as follows:
as.vector(cbind(t(b1),t(w1),t(b2),t(w2))), ending with a vector of length 2640.
However, I’m unable to describe correctly the structure. The only way to make it work is with c(252,10,10), which will display only 1 hidden layer, not 2.
I understand that with such a number of variables the plot will be cluttered, but my intention is to prune later once the hidden layers are correctly displayed.
Thanks,
Hi Alexis,
Sorry for the delay in response. It can get tricky when manually entering the weight vectors. Maybe this example will help. I tried to recreate a simpler version of a neural network with two input variables, one output variable, and two hidden layers with two nodes each. Bias nodes, three total, are also connected to the hidden and output layers. With this structure (2, 2, 2, 1), the correct weight order is as follows (B = bias, H = hidden, I = input, O = output):
B1-H11, I1-H11, I2-H11, B1-H12, I1-H12, I2-H12, B2-H21, H11-H21, H12-H21, B2-H22, H11-H22, H12-H22, B3-O1, H21-O1, H22-O1
The hidden nodes are denoted using a layer, node construct, i.e. H21 is the first node in the second layer. The correct weight vector and structure argument for the plotnet function should look like this:
Hope that helps.
-Marcus
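When the weights start out as one matrix and one bias vector per layer, one way to get them into the order listed above is to stack each bias vector on top of its weight matrix and read the result column by column. This is only a sketch: `flatten_weights` is a hypothetical helper, it assumes each matrix has source nodes in rows and receiving nodes in columns, and the numbers are made up.

```r
# hypothetical helper: flatten per-layer weights into plotting order
# W: list of matrices (rows = source nodes, columns = receiving nodes)
# b: list of bias vectors, one value per receiving node
flatten_weights <- function(W, b) {
  unlist(Map(function(w, bias) {
    # rbind puts the bias in the first row, and as.vector reads by column,
    # so each receiving node contributes: bias, then its incoming weights
    as.vector(rbind(bias, w))
  }, W, b))
}

# made-up weights for a 2-2-2-1 network
W <- list(matrix(1:4 / 10, 2, 2),    # inputs to first hidden layer
          matrix(5:8 / 10, 2, 2),    # first to second hidden layer
          matrix(c(0.9, 1), 2, 1))   # second hidden layer to output
b <- list(c(-1, -2), c(-3, -4), -5)

wts <- flatten_weights(W, b)
struct <- c(2, 2, 2, 1)
wts
```

The resulting vector has 15 elements, matching the 15 connections in the order listed above, and would be passed together with `struct` to the plotting function.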