Monday, June 10, 2019

Multi Layer Perceptron (with One Hidden Layer) with TensorFlow

We know that an MLP neural network differs from an SLP neural network in that it can have one or more hidden layers. Thus we must write parameterized code that allows us to work in the most
general way possible, establishing at definition time how many hidden layers the neural network has and how many neurons each of them is composed of.

Now let's define new parameters that specify the number of neurons present in each hidden layer. The n_hidden_1 parameter will indicate how many neurons are present in the first hidden layer, while n_hidden_2 would indicate how many neurons are present in a second hidden layer (which we'll add in the next post).

First we'll start with an MLP neural network with only one hidden layer, consisting of only two neurons. The n_input and n_classes parameters have the same values as in the SLP neural network example discussed in the previous post and shown below:

n_hidden_1 = 2  # 1st layer number of neurons
n_input = 2     # size of the data input (size of each element of x)
n_classes = 2   # number of classes
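
This example also reuses the imports, the hyperparameters, and the training data (inputX, inputY and n_samples) introduced in the previous SLP post. As a reminder, a setup consistent with the output shown later in this post might look like the following sketch (the number of epochs and the display step can be read off the output, while the learning rate value is an assumption):

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

# Hyperparameters (epoch count and display step consistent with the output below;
# the learning rate is an assumed value)
learning_rate = 0.001
training_epochs = 2000
display_step = 50
total_batch = 1          # the whole training set is processed as a single batch

# inputX, inputY and n_samples = len(inputX) are the training data
# defined in the previous SLP post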

The definition of the placeholders is also the same as the previous example:

X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_classes])

Next we have to deal with the definition of the various weights W and biases b for the different connections. The neural network is now much more complex, having several layers to take into account. An efficient way to parameterize them is to define them as dictionaries, as follows:

# Store layers weight & bias

weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
}

biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}


To create a neural network model that dynamically takes into account all the parameters we've specified, we need to define a convenient function, which we'll call multilayer_perceptron():

# Create model

def multilayer_perceptron(x):
    # Hidden fully connected layer with n_hidden_1 neurons
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.matmul(layer_1, weights['out']) + biases['out']
    return out_layer



Now we'll build the model by calling the multilayer_perceptron() function on the input placeholder X:

# Construct model

evidence = multilayer_perceptron(X)
y_ = tf.nn.softmax(evidence)


Next we define the cost function and choose an optimization method. For MLP neural networks, a good choice of optimization method is tf.train.AdamOptimizer().

# Define cost and optimizer

cost = tf.reduce_sum(tf.pow(Y - y_, 2)) / (2 * n_samples)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

Now we have completed the definition of the model of the MLP neural network. Next we move on to creating a session to implement the learning phase.

Learning Phase

Here we'll define two lists that will contain the epoch numbers and the cost values measured at each of them. We will also initialize all the variables before starting the session:

#Learning Phase

avg_set = []
epoch_set = []
init = tf.global_variables_initializer()


In order to implement the learning session, open the session with the following instructions:

with tf.Session() as sess:
    sess.run(init)


Now we'll implement the code to execute for each epoch and, inside it, a loop over each batch belonging to the training set. In this case we have a training set consisting of a single batch, so we will have only one iteration, in which we directly assign inputX and inputY to batch_x and batch_y. In other cases we would need to implement a function, such as next_batch(batch_size), which subdivides the entire training set (for example, inputdata) into different batches, returning them progressively as its return value; a minimal sketch of such a function is shown below.
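
This sketch is only an illustration of what such a helper might look like (it is not needed for this single-batch example); it assumes inputX and inputY are array-like objects of the same length and that batch_size divides the training set evenly:

# Illustrative sketch only: a simple sequential batcher (hypothetical helper)
# Assumes inputX and inputY are array-like and batch_size divides them evenly
class DataSet:
    def __init__(self, data_x, data_y):
        self.data_x = data_x
        self.data_y = data_y
        self.pos = 0

    def next_batch(self, batch_size):
        # Return the next batch_size elements and advance the position,
        # restarting from the beginning after the last batch
        start = self.pos
        end = start + batch_size
        self.pos = end % len(self.data_x)
        return self.data_x[start:end], self.data_y[start:end]

# Hypothetical usage:
# inputdata = DataSet(inputX, inputY)
# batch_x, batch_y = inputdata.next_batch(batch_size)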

At each batch cycle the cost function will be minimized with sess.run([optimizer, cost]), which returns the partial cost for that batch. All batches contribute to the calculation of avg_cost, the average cost over all batches. However, in this case, since we have only one batch, avg_cost is equivalent to the cost of the entire training set.

    for epoch in range(training_epochs):
        avg_cost = 0.
        # Loop over all batches
        for i in range(total_batch):
            # batch_x, batch_y = inputdata.next_batch(batch_size)  TO BE IMPLEMENTED
            batch_x = inputX
            batch_y = inputY
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, Y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch


Every display_step epochs we will display the value of the current cost on the terminal and add these values to the avg_set and epoch_set lists, as in the previous case with the SLP.

        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost={:.9f}".format(avg_cost))
            avg_set.append(avg_cost)
            epoch_set.append(epoch + 1)

    print("Training phase finished")


Before running the session, it's better to add a few lines of instructions to view the results of the learning phase:

    last_result = sess.run(y_, feed_dict = {X: inputX})
    training_cost = sess.run(cost, feed_dict = {X: inputX, Y: inputY})
    print("\nTraining cost = ", training_cost)
    print("\nLast result = \n", last_result)


Now let's execute the session and see the results of the learning phase which should be as shown in the output below:

Epoch: 0001 cost=0.513027489
Epoch: 0051 cost=0.198177129
Epoch: 0101 cost=0.090004168
Epoch: 0151 cost=0.077234499
Epoch: 0201 cost=0.073419131
Epoch: 0251 cost=0.071573555
Epoch: 0301 cost=0.070778824
Epoch: 0351 cost=0.070493281
Epoch: 0401 cost=0.070406877
Epoch: 0451 cost=0.070384279
Epoch: 0501 cost=0.070379063
Epoch: 0551 cost=0.070378006
Epoch: 0601 cost=0.070377789
Epoch: 0651 cost=0.070377782
Epoch: 0701 cost=0.070377775
Epoch: 0751 cost=0.070377767
Epoch: 0801 cost=0.070377745
Epoch: 0851 cost=0.070377767
Epoch: 0901 cost=0.070377767
Epoch: 0951 cost=0.070377760
Epoch: 1001 cost=0.070377760
Epoch: 1051 cost=0.070377775
Epoch: 1101 cost=0.070377775
Epoch: 1151 cost=0.070377767
Epoch: 1201 cost=0.070377782
Epoch: 1251 cost=0.070377775
Epoch: 1301 cost=0.070377775
Epoch: 1351 cost=0.070377767
Epoch: 1401 cost=0.070377760
Epoch: 1451 cost=0.070377775
Epoch: 1501 cost=0.070377775
Epoch: 1551 cost=0.070377767
Epoch: 1601 cost=0.070377760
Epoch: 1651 cost=0.070377782
Epoch: 1701 cost=0.070377745
Epoch: 1751 cost=0.070377752
Epoch: 1801 cost=0.070377782
Epoch: 1851 cost=0.070377752
Epoch: 1901 cost=0.070377767
Epoch: 1951 cost=0.070377760

Training phase finished

Training cost =  0.07037775

Last result =
 [[0.9968413  0.00315871]
 [0.98411244 0.01588754]
 [0.96484137 0.03515864]
 [0.93628126 0.06371877]
 [0.9466923  0.05330772]
 [0.26811445 0.73188555]
 [0.40623432 0.5937656 ]
 [0.03707294 0.962927  ]
 [0.16398564 0.83601433]
 [0.00905036 0.9909497 ]
 [0.19163646 0.8083635 ]]
------------------
(program exited with code: 0)

Press any key to continue . . .


To view the data collected in the avg_set and epoch_set lists to analyze the progress of the learning phase, add the following lines to the program:

plt.plot(epoch_set,avg_set,'o',label = 'MLP Training phase')
plt.ylabel('cost')
plt.xlabel('epochs')
plt.legend()
plt.show()


Now run the program; the output should be a chart of the cost values plotted against the training epochs, as shown below:


From the output we can see that during the learning epochs there is a huge initial improvement in the cost; then, in the final part, the per-epoch improvements become smaller and converge to zero.

From the analysis of the graph, however, we can ascertain that the learning cycle of the neural network has been completed within the assigned number of epochs, so we can consider the neural network trained.

Next we move on to the evaluation phase, which includes testing and accuracy calculation.

Test Phase and Accuracy Calculation

To test this MLP neural network model, we will use the same testing set used in the SLP neural network example:

#Testing set
testX = np.array([[1.,2.25],[1.25,3.],[2.,2.5],[2.25,2.75],[2.5,3.],
                  [2.,0.9],[2.5,1.2],[3.,1.25],[3.,1.5],[3.5,2.],[3.5,2.5]])
testY = [[1.,0.]]*5 + [[0.,1.]]*6


Now we'll launch the session, re-run the training, and evaluate the correctness of the results obtained by calculating the accuracy on the testing set:

with tf.Session() as sess:
    sess.run(init)
   
    for epoch in range(training_epochs):
        for i in range(total_batch):
            batch_x = inputX
            batch_y = inputY
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, Y: batch_y})
  
    # Test model
    pred = tf.nn.softmax(evidence) # Apply softmax to logits
    result = sess.run(pred, feed_dict = {X: testX})
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(Y, 1))
   
   
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({X: testX, Y: testY}))

print("Result = ", result)


When we run the entire session we get the output shown below:

Accuracy: 1.0
Result =  [[0.9893686  0.01063142]
 [0.9935361  0.0064639 ]
 [0.88723236 0.11276766]
 [0.85200953 0.14799052]
 [0.8081636  0.1918364 ]
 [0.36763468 0.63236535]
 [0.18352026 0.8164797 ]
 [0.05467962 0.9453204 ]
 [0.07995363 0.9200463 ]
 [0.04446228 0.9555377 ]
 [0.09504609 0.90495396]]
------------------
(program exited with code: 0)

Press any key to continue . . .

In the case of the MLP neural network we have obtained 100% accuracy (11 points correctly classified out of 11 total). Next we show the classification obtained by drawing the points on the Cartesian plane:

yc = result[:,1]
print(yc)
plt.scatter(testX[:,0],testX[:,1],c=yc, s=50, alpha=1)
plt.show()

When we run the program we get a chart as shown below:


We get a chart of the points distributed on the Cartesian plane, with the color going from blue to yellow indicating the probability of each point belonging to one of the two classes.

Here I am ending today's post. In the next post we shall discuss Multi Layer Perceptron (with Two Hidden Layers) with TensorFlow. So till we meet again keep learning and practicing Python as Python is easy to learn!