Tuesday, June 25, 2019

Building a Neural Network in Python

Having a clear goal is important when building a neural network, as we need to decide what we want it to learn. Our objective is to implement a three-input XOR gate (for our purposes, this Exclusive Or function returns a 1 only when all the inputs are the same, either all 0s or all 1s) represented by the truth table below:

X1  X2  X3 | Y1
 0   0   0 |  1
 0   0   1 |  0
 0   1   0 |  0
 0   1   1 |  0
 1   0   0 |  0
 1   0   1 |  0
 1   1   0 |  0
 1   1   1 |  1

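If you want to double-check the gate's behaviour before involving any neural network at all, a throwaway snippet like the following (purely illustrative, not part of the network program) reproduces the table in plain Python:

# Illustrative only: enumerate the same truth table in plain Python.
# The output is 1 only when X1 == X2 == X3.
for x1 in (0, 1):
    for x2 in (0, 1):
        for x3 in (0, 1):
            y1 = 1 if x1 == x2 == x3 else 0
            print(x1, x2, x3, "->", y1)
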
Make sure you have NumPy installed, as we will be using this Python library: it provides a great set of functions to help us organize our neural network and also simplifies the calculations. Our program for the two-layer neural network is as follows:

# 2 Layer Neural Network

import numpy as np

# X = input of our 3 input XOR gate
# set up the inputs of the neural network (right from the table)
X = np.array(([0,0,0],[0,0,1],[0,1,0], \
    [0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]), dtype=float)
# y = the output of our neural network
y = np.array(([1], [0],  [0],  [0],  [0], \
     [0],  [0],  [1]), dtype=float)

# what value we want to predict
xPredicted = np.array(([0,0,1]), dtype=float)

X = X/np.amax(X, axis=0) # maximum of X input array
# maximum of xPredicted (our input data for the prediction)
xPredicted = xPredicted/np.amax(xPredicted, axis=0)

# set up our Loss file for graphing

lossFile = open("SumSquaredLossList.csv", "w")

class Neural_Network (object):
  def __init__(self):
    #parameters
    self.inputLayerSize = 3  # X1,X2,X3
    self.outputLayerSize = 1 # Y1
    self.hiddenLayerSize = 4 # Size of the hidden layer

    # build weights of each layer
    # set to random values
    # look at the interconnection diagram to make sense of this
    # 3x4 matrix for input to hidden
    self.W1 = \
            np.random.randn(self.inputLayerSize, self.hiddenLayerSize)
    # 4x1 matrix for hidden layer to output
    self.W2 = \
            np.random.randn(self.hiddenLayerSize, self.outputLayerSize)

  def feedForward(self, X):
    # feedForward propagation through our network
    # dot product of X (input) and first set of 3x4  weights
    self.z = np.dot(X, self.W1)

    # the activationSigmoid activation function - neural magic
    self.z2 = self.activationSigmoid(self.z)

    # dot product of hidden layer (z2) and second set of 4x1 weights
    self.z3 = np.dot(self.z2, self.W2)

    # final activation function - more neural magic
    o = self.activationSigmoid(self.z3)
    return o

  def backwardPropagate(self, X, y, o):
    # backward propagate through the network
    # calculate the error in output
    self.o_error = y - o

    # apply derivative of activationSigmoid to error
    self.o_delta = self.o_error*self.activationSigmoidPrime(o)

    # z2 error: how much our hidden layer weights contributed to output error
    self.z2_error = self.o_delta.dot(self.W2.T)

    # applying derivative of activationSigmoid to z2 error
    self.z2_delta = self.z2_error*self.activationSigmoidPrime(self.z2)

    # adjusting first set (inputLayer --> hiddenLayer) weights
    self.W1 += X.T.dot(self.z2_delta)
    # adjusting second set (hiddenLayer --> outputLayer) weights
    self.W2 += self.z2.T.dot(self.o_delta)

  def trainNetwork(self, X, y):
    # feed forward the loop
    o = self.feedForward(X)
    # and then back propagate the values (feedback)
    self.backwardPropagate(X, y, o)


  def activationSigmoid(self, s):
    # activation function
    # simple activationSigmoid curve as in the book
    return 1/(1+np.exp(-s))

  def activationSigmoidPrime(self, s):
    # First derivative of activationSigmoid
    # calculus time!
    return s * (1 - s)


  def saveSumSquaredLossList(self,i,error):
    lossFile.write(str(i)+","+str(error.tolist())+'\n')
   
  def saveWeights(self):
    # save this in order to reproduce our cool network
    np.savetxt("weightsLayer1.txt", self.W1, fmt="%s")
    np.savetxt("weightsLayer2.txt", self.W2, fmt="%s")

  def predictOutput(self):
    print ("Predicted XOR output data based on trained weights: ")
    print ("Expected (X1-X3): \n" + str(xPredicted))
    print ("Output (Y1): \n" + str(self.feedForward(xPredicted)))

myNeuralNetwork = Neural_Network()
trainingEpochs = 1000
#trainingEpochs = 100000

for i in range(trainingEpochs): # train myNeuralNetwork 1,000 times
  print ("Epoch # " + str(i) + "\n")
  print ("Network Input : \n" + str(X))
  print ("Expected Output of XOR Gate Neural Network: \n" + str(y))
  print ("Actual  Output from XOR Gate Neural Network: \n" + \
          str(myNeuralNetwork.feedForward(X)))
  # mean sum squared loss
  Loss = np.mean(np.square(y - myNeuralNetwork.feedForward(X)))
  myNeuralNetwork.saveSumSquaredLossList(i,Loss)
  print ("Sum Squared Loss: \n" + str(Loss))
  print ("\n")
  myNeuralNetwork.trainNetwork(X, y)

myNeuralNetwork.saveWeights()
myNeuralNetwork.predictOutput()


Let's go through the program and understand the implementation. We start by importing the NumPy library (if NumPy is not installed, this line will raise an ImportError). Next, we define all eight possibilities of our X1–X3 inputs and the Y1 output from the truth table:

# X = input of our 3 input XOR gate
# set up the inputs of the neural network (right from the table)
X = np.array(([0,0,0],[0,0,1],[0,1,0], \
    [0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]), dtype=float)
# y = the output of our neural network
y = np.array(([1], [0], [0], [0], [0], \
    [0], [0], [1]), dtype=float)


Next we pick a value to predict (the network predicts outputs for all eight input combinations, but this is the particular answer we want at the end), and then normalize both the training inputs and the prediction input by their column-wise maxima:

# what value we want to predict
xPredicted = np.array(([0,0,1]), dtype=float)
X = X/np.amax(X, axis=0) # maximum of X input array
# maximum of xPredicted (our input data for the prediction)
xPredicted = xPredicted/np.amax(xPredicted, axis=0)

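For this particular truth table the column-wise maxima are all 1, so the division leaves the data unchanged; it only matters if you feed the network inputs on a larger scale. A quick stand-alone check (illustrative only):

import numpy as np

X = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],
              [1,0,0],[1,0,1],[1,1,0],[1,1,1]], dtype=float)

print(np.amax(X, axis=0))                          # [1. 1. 1.] - maximum of each column
print(np.array_equal(X, X / np.amax(X, axis=0)))   # True: normalization is a no-op here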

Then we open the file in which the Sum Squared Loss for each epoch will be saved, so we can graph the results in Excel later:

# set up our Loss file for graphing
lossFile = open("SumSquaredLossList.csv", "w") 


Next we build the Neural_Network class based on the truth table. The class stores the size of each layer (three inputs X1–X3, four hidden nodes, and one output Y1) and builds a randomly initialized weight matrix between each pair of adjacent layers.

class Neural_Network (object):
  def __init__(self):
    #parameters
    self.inputLayerSize = 3  # X1,X2,X3
    self.outputLayerSize = 1 # Y1
    self.hiddenLayerSize = 4 # Size of the hidden layer

    # build weights of each layer
    # set to random values
    # look at the interconnection diagram to make sense of this
    # 3x4 matrix for input to hidden
    self.W1 = \
            np.random.randn(self.inputLayerSize, self.hiddenLayerSize)
    # 4x1 matrix for hidden layer to output
    self.W2 = \
            np.random.randn(self.hiddenLayerSize, self.outputLayerSize)

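Because the weights start out as random numbers, every run of the program behaves a little differently. If you want repeatable runs while you experiment, one option (not in the original listing) is to seed NumPy's random number generator once, before the Neural_Network object is created:

np.random.seed(42)   # any fixed integer; makes np.random.randn return the same values each run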

The feedForward function shown below implements the feed-forward path through the neural network: it multiplies the inputs by the weight matrix connecting each layer to the next and applies the sigmoid activation function to the result at each layer.

  def feedForward(self, X):
    # feedForward propagation through our network
    # dot product of X (input) and first set of 3x4 weights
    self.z = np.dot(X, self.W1)
    # the activationSigmoid activation function - neural magic
    self.z2 = self.activationSigmoid(self.z)
    # dot product of hidden layer (z2) and second set of 4x1 weights
    self.z3 = np.dot(self.z2, self.W2)
    # final activation function - more neural magic
    o = self.activationSigmoid(self.z3)
    return o

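One way to keep the matrix dimensions straight is to trace the shapes through the forward pass. The following stand-alone check (zero matrices, purely for illustration) mirrors the sizes used above:

import numpy as np

X_demo  = np.zeros((8, 3))    # 8 training rows, 3 inputs each
W1_demo = np.zeros((3, 4))    # input  -> hidden weights
W2_demo = np.zeros((4, 1))    # hidden -> output weights

z  = np.dot(X_demo, W1_demo)  # (8,3) dot (3,4) -> (8,4) hidden-layer inputs
z3 = np.dot(z, W2_demo)       # (8,4) dot (4,1) -> (8,1) output-layer inputs

print(z.shape, z3.shape)      # (8, 4) (8, 1)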

Next we add the backwardPropagate function, which implements the actual learning of our neural network: it propagates the output error back through the layers and adjusts the weights to reduce it.

  def backwardPropagate(self, X, y, o):
    # backward propagate through the network
    # calculate the error in output
    self.o_error = y - o
    # apply derivative of activationSigmoid to error
    self.o_delta = self.o_error*self.activationSigmoidPrime(o)
    # z2 error: how much our hidden layer weights contributed to output error
    self.z2_error = self.o_delta.dot(self.W2.T)
    # applying derivative of activationSigmoid to z2 error
    self.z2_delta = self.z2_error*self.activationSigmoidPrime(self.z2)
    # adjusting first set (inputLayer --> hiddenLayer) weights
    self.W1 += X.T.dot(self.z2_delta)
    # adjusting second set (hiddenLayer --> outputLayer) weights
    self.W2 += self.z2.T.dot(self.o_delta)

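Notice that the two weight updates add the full correction with no explicit learning rate (effectively a rate of 1.0), which is fine for this tiny problem. If you experiment with other problems and find training unstable, a common variation (not part of the original listing, shown purely as a sketch) is to scale both updates by a learning-rate factor inside backwardPropagate:

    # hypothetical variant of the two update lines above, with a tunable rate
    learningRate = 0.5    # assumed value; tune by experiment
    self.W1 += learningRate * X.T.dot(self.z2_delta)
    self.W2 += learningRate * self.z2.T.dot(self.o_delta)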

To train the network for a particular epoch, trainNetwork simply calls feedForward and then backwardPropagate on the full set of training data.

  def trainNetwork(self, X, y):
    # feed forward the loop
    o = self.feedForward(X)
    # and then back propagate the values (feedback)
    self.backwardPropagate(X, y, o)


The sigmoid activation function and its first derivative follow:

  def activationSigmoid(self, s):
    # activation function
    # simple activationSigmoid curve as in the book
    return 1/(1+np.exp(-s))

  def activationSigmoidPrime(self, s):
    # First derivative of activationSigmoid
    # calculus time!
    return s * (1 - s)

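One detail worth noting: activationSigmoidPrime is always called with values that are already sigmoid outputs (o and z2), which is why the simple expression s * (1 - s) gives the correct slope. A quick stand-alone numerical check (illustrative only):

import numpy as np

x = 0.5
s = 1.0 / (1.0 + np.exp(-x))             # sigmoid(x)
analytic = s * (1.0 - s)                 # what activationSigmoidPrime(s) returns
h = 1e-6
numeric = (1.0 / (1.0 + np.exp(-(x + h))) - s) / h   # finite-difference slope at x
print(analytic, numeric)                 # both come out to roughly 0.2350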

Next, we save each epoch's loss value to the CSV file (for graphing in Excel) and save the trained weights to text files.

  def saveSumSquaredLossList(self,i,error):
    lossFile.write(str(i)+","+str(error.tolist())+'\n')

  def saveWeights(self):
    # save this in order to reproduce our cool network
    np.savetxt("weightsLayer1.txt", self.W1, fmt="%s")
    np.savetxt("weightsLayer2.txt", self.W2, fmt="%s")

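If you later want to reuse the trained network without retraining it, the two text files can be read back with NumPy's loadtxt. A minimal sketch (assuming the Neural_Network class and xPredicted from the listing are in scope, and the weight files are in the current directory):

restoredNetwork = Neural_Network()
# overwrite the random starting weights with the saved, trained ones
restoredNetwork.W1 = np.loadtxt("weightsLayer1.txt").reshape(3, 4)
restoredNetwork.W2 = np.loadtxt("weightsLayer2.txt").reshape(4, 1)
print(restoredNetwork.feedForward(xPredicted))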

Next we add the predictOutput function, which prints the network's prediction for xPredicted using the trained weights. Below the class we then create the network instance and set the number of training epochs:

  def predictOutput(self):
    print ("Predicted XOR output data based on trained weights: ")
    print ("Expected (X1-X3): \n" + str(xPredicted))
    print ("Output (Y1): \n" + str(self.feedForward(xPredicted)))

myNeuralNetwork = Neural_Network()
trainingEpochs = 1000
#trainingEpochs = 100000


The following is the main training loop that goes through all the requested epochs. Change the variable trainingEpochs above to vary the number of epochs you would like to train your network.

for i in range(trainingEpochs): # train myNeuralNetwork 1,000 times
  print ("Epoch # " + str(i) + "\n")
  print ("Network Input : \n" + str(X))
  print ("Expected Output of XOR Gate Neural Network: \n" + str(y))
  print ("Actual Output from XOR Gate Neural Network: \n" + \
          str(myNeuralNetwork.feedForward(X)))
  # mean sum squared loss
  Loss = np.mean(np.square(y - myNeuralNetwork.feedForward(X)))
  myNeuralNetwork.saveSumSquaredLossList(i,Loss)
  print ("Sum Squared Loss: \n" + str(Loss))
  print ("\n")
  myNeuralNetwork.trainNetwork(X, y)


Finally, we save the trained weights for reuse and predict the output for our requested value.

myNeuralNetwork.saveWeights()
myNeuralNetwork.predictOutput()


As we run the program, we see it step through 1,000 epochs of training, printing the results of each epoch (the tail end of the run is shown below), and then finally showing the final input and output:

Epoch # 991

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88681523]
 [0.02836525]
 [0.05311786]
 [0.06632363]
 [0.05310541]
 [0.06650775]
 [0.0559176 ]
 [0.9063089 ]]
Sum Squared Loss:
0.004997997009787768


Epoch # 992

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88694184]
 [0.02832068]
 [0.05305645]
 [0.06625177]
 [0.05304377]
 [0.06643567]
 [0.05586747]
 [0.90641198]]
Sum Squared Loss:
0.00498696546787876


Epoch # 993

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88706805]
 [0.02827627]
 [0.05299522]
 [0.06618013]
 [0.05298231]
 [0.06636379]
 [0.05581748]
 [0.90651475]]
Sum Squared Loss:
0.0049759796156153184


Epoch # 994

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88719387]
 [0.02823202]
 [0.05293419]
 [0.06610869]
 [0.05292105]
 [0.06629213]
 [0.05576763]
 [0.90661722]]
Sum Squared Loss:
0.004965039178392612


Epoch # 995

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.8873193 ]
 [0.02818792]
 [0.05287334]
 [0.06603746]
 [0.05285998]
 [0.06622068]
 [0.05571791]
 [0.90671937]]
Sum Squared Loss:
0.004954143883756911


Epoch # 996

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88744435]
 [0.02814398]
 [0.05281268]
 [0.06596644]
 [0.05279909]
 [0.06614944]
 [0.05566832]
 [0.90682122]]
Sum Squared Loss:
0.004943293461384907


Epoch # 997

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88756901]
 [0.02810019]
 [0.0527522 ]
 [0.06589563]
 [0.05273839]
 [0.0660784 ]
 [0.05561886]
 [0.90692276]]
Sum Squared Loss:
0.004932487643063279


Epoch # 998

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88769329]
 [0.02805655]
 [0.05269192]
 [0.06582503]
 [0.05267788]
 [0.06600757]
 [0.05556954]
 [0.907024  ]]
Sum Squared Loss:
0.004921726162668404



Epoch # 999

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[0.88781718]
 [0.02801307]
 [0.05263181]
 [0.06575463]
 [0.05261756]
 [0.06593695]
 [0.05552035]
 [0.90712494]]
Sum Squared Loss:
0.004911008756146379


Predicted XOR output data based on trained weights:
Expected (X1-X3):
[0. 0. 1.]
Output (Y1):
[0.02796974]
------------------
(program exited with code: 0)

Press any key to continue . . .





It also creates the following files of interest:

1. weightsLayer1.txt: This file contains the final trained weights for the input-layer to hidden-layer connections (a 3x4 matrix).

2. weightsLayer2.txt: This file contains the final trained weights for the hidden-layer to output-layer connections (a 4x1 matrix).

3. SumSquaredLossList.csv: This is a comma-delimited file containing the epoch number and the loss value at the end of each epoch. We use this to graph the results across all epochs (see the plotting sketch below).

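Excel is not the only option for graphing the loss file; if you happen to have matplotlib installed (an assumption, it is not used by the original program), a small stand-alone script along these lines will plot the loss curve:

import matplotlib.pyplot as plt

epochs, losses = [], []
with open("SumSquaredLossList.csv") as f:
    for line in f:
        epoch, loss = line.strip().split(",")
        epochs.append(int(epoch))
        losses.append(float(loss))

plt.plot(epochs, losses)
plt.xlabel("Epoch")
plt.ylabel("Sum Squared Loss")
plt.title("Training loss per epoch")
plt.show()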

Now let's focus on the output. At the bottom we see that the predicted output for our chosen input [0, 0, 1] is 0.02796974, which is quite close to the expected value of 0. If we compare each of the expected outputs to the actual outputs from the network, we see they all match fairly closely. Every time we run the program the results will be slightly different, because we initialize the weights with random numbers at the start of each run.

The goal of neural-network training is not to get every output exactly right, only right within a stated tolerance of the correct result. For example, if we say that any output above 0.9 is a 1 and any output below 0.1 is a 0, our network is essentially there after 1,000 epochs: seven of the eight outputs clear those cut-offs, and the output for [0, 0, 0] sits at about 0.89 and is still climbing with every epoch. In code terms, this tolerance test is just a thresholding step on the raw outputs, as the sketch below shows.
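A minimal, illustrative thresholding sketch using the rounded final-epoch outputs from the run above:

import numpy as np

# rounded outputs from Epoch # 999 above, in truth-table order
outputs = np.array([0.888, 0.028, 0.053, 0.066, 0.053, 0.066, 0.056, 0.907])

as_one  = outputs > 0.9          # confidently a logical 1
as_zero = outputs < 0.1          # confidently a logical 0
print(np.where(as_one, 1, np.where(as_zero, 0, -1)))
# [-1  0  0  0  0  0  0  1]  -> only the first output is still undecided at 1,000 epochs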

The Sum Squared Loss is a single measure of the error across all eight possible inputs. If we increase the number of epochs to 100,000, most of the individual outputs get dramatically closer to their targets, as shown in the output below. Note, though, that in this particular 100,000-epoch run the network got stuck on the all-ones input (its output stays near 0 instead of rising toward 1), which is why the final Sum Squared Loss comes out at about 0.125; because the weights start from random values, individual runs can occasionally stall like this. In any case, our results according to our accuracy criteria (> 0.9 = 1 and < 0.1 = 0) were already good enough in the 1,000-epoch run.


Epoch # 99999

Network Input :
[[0. 0. 0.]
 [0. 0. 1.]
 [0. 1. 0.]
 [0. 1. 1.]
 [1. 0. 0.]
 [1. 0. 1.]
 [1. 1. 0.]
 [1. 1. 1.]]
Expected Output of XOR Gate Neural Network:
[[1.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [1.]]
Actual  Output from XOR Gate Neural Network:
[[9.97830436e-01]
 [2.33644763e-11]
 [2.29263769e-11]
 [1.54902142e-10]
 [2.36241540e-11]
 [1.54821551e-10]
 [1.54875769e-10]
 [1.27870648e-09]]
Sum Squared Loss:
0.1250005880560837


Predicted XOR output data based on trained weights:
Expected (X1-X3):
[0. 0. 1.]
Output (Y1):
[2.33646223e-11]
------------------
(program exited with code: 0)

Press any key to continue . . .


So this is how a simple two-layer neural network is implemented using the NumPy Python library. In the next post we'll build the same neural network using TensorFlow. Till we meet again, keep practicing and learning Python, as Python is easy to learn!

