Tuesday, June 11, 2019

Multi Layer Perceptron (with Two Hidden Layers) with TensorFlow

For the Multi Layer Perceptron (with Two Hidden Layers) we will extend the previous structure by adding two more neurons to the first hidden layer (bringing it to four) and by adding a second hidden layer with two neurons:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

#Training set
inputX = np.array([[1.,3.],[1.,2.],[1.,1.5],[1.5,2.],[2.,3.],[2.5,1.5],[2.,1.],[3.,1.],[3.,2.],[3.5,1.],[3.5,3.]])
inputY = [[1.,0.]]*6 + [[0.,1.]]*5

learning_rate = 0.001
training_epochs = 2000
display_step = 50
n_samples = 11
batch_size = 11
total_batch = int(n_samples/batch_size)

# Network Parameters
n_hidden_1 = 4 # 1st layer number of neurons
n_hidden_2 = 2 # 2nd layer number of neurons

n_input = 2 # size of each input sample (two coordinates)
n_classes = 2 # number of classes

# tf Graph input
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_classes])

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Create model
def multilayer_perceptron(x):
    # First hidden fully connected layer (4 neurons)
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Second hidden fully connected layer (2 neurons)
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output fully connected layer with a neuron for each class
    out_layer = tf.add(tf.matmul(layer_2, weights['out']), biases['out'])
    return out_layer

# Construct model
evidence = multilayer_perceptron(X)
y_ = tf.nn.softmax(evidence)

# Define cost and optimizer
cost = tf.reduce_sum(tf.pow(Y-y_,2))/ (2 * n_samples)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

avg_set = []
epoch_set = []
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
   
    for epoch in range(training_epochs):
        avg_cost = 0.
        # Loop over all batches
        for i in range(total_batch):
            #batch_x, batch_y = inputdata.next_batch(batch_size) TO BE IMPLEMENTED
            batch_x = inputX
            batch_y = inputY
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, Y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost={:.9f}".format(avg_cost))
            avg_set.append(avg_cost)
            epoch_set.append(epoch + 1)
    
    print("Training phase finished")
    last_result = sess.run(y_, feed_dict = {X: inputX})
    training_cost = sess.run(cost, feed_dict = {X: inputX, Y: inputY})
    print("Training cost = ", training_cost) 
    print("Last result = ", last_result)

The changes with respect to the previous program are the network parameters (the first hidden layer now has four neurons and a second hidden layer with two neurons has been added), the additional 'h2' and 'b2' weight and bias variables, and the extra layer inside multilayer_perceptron(). By running the session, the following results are obtained:

Epoch: 0001 cost=0.498109400
Epoch: 0051 cost=0.455157906
Epoch: 0101 cost=0.454785973
Epoch: 0151 cost=0.454688132
Epoch: 0201 cost=0.454640657
Epoch: 0251 cost=0.454613864
Epoch: 0301 cost=0.454597086
Epoch: 0351 cost=0.454585880
Epoch: 0401 cost=0.454577982
Epoch: 0451 cost=0.454572141
Epoch: 0501 cost=0.454567760
Epoch: 0551 cost=0.454564273
Epoch: 0601 cost=0.454561651
Epoch: 0651 cost=0.454559416
Epoch: 0701 cost=0.454557598
Epoch: 0751 cost=0.454556137
Epoch: 0801 cost=0.454554915
Epoch: 0851 cost=0.454553783
Epoch: 0901 cost=0.454552919
Epoch: 0951 cost=0.454552114
Epoch: 1001 cost=0.454551458
Epoch: 1051 cost=0.454550833
Epoch: 1101 cost=0.454550326
Epoch: 1151 cost=0.454549789
Epoch: 1201 cost=0.454549402
Epoch: 1251 cost=0.454549074
Epoch: 1301 cost=0.454548687
Epoch: 1351 cost=0.454548419
Epoch: 1401 cost=0.454548150
Epoch: 1451 cost=0.454547882
Epoch: 1501 cost=0.454547673
Epoch: 1551 cost=0.454547465
Epoch: 1601 cost=0.454547256
Epoch: 1651 cost=0.454547107
Epoch: 1701 cost=0.454546928
Epoch: 1751 cost=0.454546869
Epoch: 1801 cost=0.454546690
Epoch: 1851 cost=0.454546541
Epoch: 1901 cost=0.454546422
Epoch: 1951 cost=0.454546332


Training phase finished


Training cost =  0.4545462


Last result =  [[9.9297780e-01 7.0222081e-03]
 [9.9873334e-01 1.2666314e-03]
 [9.9946326e-01 5.3678628e-04]
 [9.9970561e-01 2.9431804e-04]
 [9.9961901e-01 3.8094111e-04]
 [9.9999332e-01 6.7183983e-06]
 [9.9998772e-01 1.2256095e-05]
 [9.9999928e-01 6.6046067e-07]
 [9.9999630e-01 3.6828012e-06]
 [9.9999988e-01 1.5331719e-07]
 [9.9999523e-01 4.7671210e-06]]


To view the progress of the learning phase, add the following code:

plt.plot(epoch_set,avg_set,'o',label = 'MLP Training phase')
plt.ylabel('cost')
plt.xlabel('epochs')
plt.legend()
plt.show()


We get the following chart, from which we can analyze the learning phase of the neural network by following the trend of the cost value:



We can see from the chart that learning in this case is much faster than in the previous case (at 1,000 epochs we should be fine). The optimized cost is almost the same as in the previous neural network (here it settles at about 0.4545, as shown in the training output above).
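
If you want a rough numeric confirmation of where the curve flattens, a small optional check like the one below can help. It assumes the epoch_set and avg_set lists filled by the training loop above; the helper name plateau_epoch is just an illustrative choice:

def plateau_epoch(epochs, costs, tol=1e-4):
    # Return the first logged epoch at which the cost improved by less than tol
    for i in range(1, len(costs)):
        if costs[i - 1] - costs[i] < tol:
            return epochs[i]
    return epochs[-1]

print("Cost roughly plateaus around epoch", plateau_epoch(epoch_set, avg_set))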

We will also use the same testing set here to evaluate how accurately the MLP neural network classifies the samples under analysis:

#Testing set
testX = np.array([[1.,2.25],[1.25,3.],[2,2.5],[2.25,2.75],[2.5,3.],[2.,0.9],[2.5,1.2],[3.,1.25],[3.,1.5],[3.5,2.],[3.5,2.5]])
testY = [[1.,0.]]*5 + [[0.,1.]]*6


Now we'll launch the session, train the network on the training set, and evaluate the correctness of the results obtained by calculating the accuracy on the testing set:

with tf.Session() as sess:
    sess.run(init)
   
    # Re-train the network on the same training set
    for epoch in range(training_epochs):
        for i in range(total_batch):
            batch_x = inputX
            batch_y = inputY
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, Y: batch_y})

    # Test model
    pred = tf.nn.softmax(evidence) # Apply softmax to logits
    result = sess.run(pred, feed_dict = {X: testX})
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(Y, 1))
   
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({X: testX, Y: testY}))

print("Result = ", result)


When we run the entire session we get the output shown below:

Accuracy: 1.0
Result =  [[0.9878533  0.01214671]
 [0.9925166  0.0074834 ]
 [0.87989074 0.12010925]
 [0.84384656 0.15615338]
 [0.79945153 0.20054851]
 [0.36648932 0.6335107 ]
 [0.1864605  0.81353945]
 [0.05755609 0.94244385]
 [0.08324576 0.9167542 ]
 [0.04708495 0.952915  ]
 [0.09848009 0.90151983]]

In this case, too, we obtain 100% accuracy.
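
As an optional sanity check, the same accuracy value can be reproduced in plain NumPy from the result array returned above (the softmax output on testX) together with testY; the snippet below is just an equivalent reformulation of the tf.argmax/tf.equal computation:

import numpy as np

predicted = np.argmax(result, axis=1)          # class with the highest probability
expected = np.argmax(np.array(testY), axis=1)  # true class from the testing set
print("Accuracy:", np.mean(predicted == expected))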

Showing the test set points on the Cartesian plane with the usual color-gradient system gives results very similar to the previous examples. Add the following code to plot the graph:

yc = result[:,1]
plt.scatter(testX[:,0],testX[:,1],c=yc, s=50, alpha=1)
plt.show()


The resulting plot will be as shown below.

Now let's move on to a proper classification task: we pass the neural network a large amount of data (points on the Cartesian plane) without knowing which class they belong to, and let the network tell us the most likely class for each point.

To this end, the program simulates experimental data by creating completely random points on the Cartesian plane. For example, we generate an array containing 1,000 random points and then submit them to the neural network to determine which class each point belongs to:


# 1,000 random points with coordinates in [0, 3)
test = 3*np.random.random((1000,2))
with tf.Session() as sess:
    sess.run(init)
   
    for epoch in range(training_epochs):
        for i in range(total_batch):
            batch_x = inputX
            batch_y = inputY
            _, c = sess.run([optimizer, cost], feed_dict={X: batch_x, Y: batch_y})

  
    # Test model
    pred = tf.nn.softmax(evidence)
    result = sess.run(pred, feed_dict = {X: test})


Now let's visualize the experimental data based on the classification probabilities evaluated by the neural network:

yc = result[:,1]
plt.scatter(test[:,0],test[:,1],c=yc, s=50, alpha=1)
plt.show()


We will get a chart as shown below.

As we can see from the shades, two classification areas are delimited on the plane, with the greenish parts in between indicating the zones of uncertainty. The classification results can be made clearer by deciding, on the basis of the probability, whether a point belongs to one class or the other: if the probability of a point belonging to a class is greater than 0.5, the point is assigned to that class. To do so, add the following code:

yc = np.round(result[:,1])
plt.scatter(test[:,0],test[:,1],c=yc, s=50, alpha=1)
plt.show()


Now we'll get the following chart.

In this chart we can clearly see the two regions of the Cartesian plane that characterize the two classes.
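
As an optional follow-up, we can also count how many of the 1,000 random points end up in each region; this assumes the result array computed for the test points above:

labels = np.round(result[:, 1]).astype(int)   # 0 = first class, 1 = second class
counts = np.bincount(labels, minlength=2)
print("Points assigned to the first class: ", counts[0])
print("Points assigned to the second class:", counts[1])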

Here I am ending today's post about the Multi Layer Perceptron (with Two Hidden Layers) with TensorFlow. TensorFlow is a broad topic and further reading is suggested. So till we meet again, keep learning and practicing Python, as Python is easy to learn!

