We'll begin with the implementation of a simple single layer perceptron (SLP) neural network. The data we will study is a set of 11 points on the Cartesian plane, divided into two classes of membership: the first six belong to the first class, the other five to the second. The (x, y) coordinates of the points are contained in the numpy array inputX, while the class each point belongs to is indicated in inputY. This is a list of two-element arrays, with one element for each possible class; a value of 1 in the first or second element indicates the class the point belongs to.
If the element has value [1, 0], it belongs to the first class; if it has value [0, 1], it belongs to the second class. The fact that these are float values is due to the optimization calculation used in deep learning: the test results of the neural network will be floating-point numbers, indicating the probability that an element belongs to the first or the second class.
Suppose, for example, that for one element the neural network returns the following result: [0.910, 0.090]
This result means that the network considers the element under analysis to belong to the first class with 91% probability and to the second class with 9% probability. So, based on the values taken from the SVM example in the machine learning posts, we can define the values shown in the program below:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
#Training set
inputX = np.array([[1., 3.], [1., 2.], [1., 1.5], [1.5, 2.], [2., 3.], [2.5, 1.5],
                   [2., 1.], [3., 1.], [3., 2.], [3.5, 1.], [3.5, 3.]])
inputY = [[1., 0.]]*6 + [[0., 1.]]*5
print('\ninputX\n')
print(inputX)
print('\ninputY\n')
print(inputY)
yc = [0]*6 + [1]*5
print('\ninputyc\n')
print(yc)
plt.scatter(inputX[:,0],inputX[:,1],c=yc, s=50, alpha=0.9)
plt.show()
The output of the program is shown below:
inputX
[[1. 3. ]
[1. 2. ]
[1. 1.5]
[1.5 2. ]
[2. 3. ]
[2.5 1.5]
[2. 1. ]
[3. 1. ]
[3. 2. ]
[3.5 1. ]
[3.5 3. ]]
inputY
[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
inputyc
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
The scatter plot produced by the program shows that the training set is a set of Cartesian points divided into two classes of membership (yellow and purple). To help with the color assignment in the graphical representation (as shown in the Figure above), the inputY array has been replaced with the yc array, which contains a single class index (0 or 1) per point.
As we can see, the two classes are easily identifiable in two opposite regions: the first covers the upper-left part of the plane, the second the lower-right part. All this would seem to be simply separable by an imaginary diagonal line, but to make the system more complex there is an exception: point number 6, [2.5, 1.5], belongs to the first class yet lies inside the region occupied by the points of the second class.
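Before moving on to the model definition, note that once the trained network produces a probability pair like [0.910, 0.090] for a point, the predicted class is simply the index of the largest value. A minimal numpy sketch (the probability values here are made up for illustration):
import numpy as np
# Hypothetical softmax output for one point: [P(first class), P(second class)]
probs = np.array([0.910, 0.090])
predicted_class = np.argmax(probs)  # index of the largest probability
print(predicted_class)  # 0 -> the point is assigned to the first class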
The SLP Model Definition phase
To do a deep learning analysis, the first thing to do is define the neural network model we want to implement. So we should already have in mind the structure to be implemented: how many neurons and layers it comprises (in this case only one layer), the weights of the connections, and the cost function to be applied.
Following the TensorFlow practice, we start by defining a series of parameters that characterize the execution of the calculations during the learning phase. The learning rate is a parameter that regulates the learning speed of each neuron and plays a very important role in the efficiency of a neural network during the learning phase. Establishing the optimal value of the learning rate a priori is impossible, because it depends very much on the structure of the neural network and on the particular type of data to be analyzed. It is therefore necessary to adjust this value through different learning tests, choosing the value that guarantees the best accuracy.
We start with a generic value of 0.01, assigning this value to the learning_rate parameter. Another parameter to be defined is training_epochs. This defines how many epochs (learning cycles) will be applied to the neural network for the learning phase.
During program execution, it will be necessary in some way to monitor the progress of learning, and this can be done by printing values on the terminal. We can decide after how many epochs to display a printout of the results, and store this interval in the display_step parameter. A reasonable value is every 50 or 100 epochs.
To make the implemented code reusable, it is necessary to add parameters that specify the number of elements that make up the training set and into how many batches it must be divided. In this case we have a small training set of only 11 items, so we can use them all in a single batch.
Finally, we'll add two more parameters that describe the size of the input data and the number of classes it belongs to.
Now that we have defined the parameters of the method, let's move on to building the neural network. First, we define the inputs and outputs of the neural network through the use of placeholders. By defining an input placeholder x with two values and an output placeholder y with two values, we have implicitly defined an SLP neural network with two neurons in the input layer and two neurons in the output layer (see Figure below). Explicitly, we have defined two tensors: the tensor x, which will contain the values of the input coordinates, and the tensor y, which will contain the probabilities of belonging to the two classes for each element.
Now that we have defined the placeholders, we can deal with the weights and the bias, which, as we saw, define the connections of the neural network. These tensors W and b are defined as variables with the Variable() constructor and initialized to all zeros with tf.zeros().
The variables W and b we just defined are used to compute the evidence x * W + b, which characterizes the neural network in mathematical form. The tf.matmul() function performs the multiplication between the tensors x * W, while the tf.add() function adds the value of the bias b to the result.
From the value of the evidence, we can directly calculate the probabilities of the output values with the tf.nn.softmax() function. The tf.nn.softmax() function performs two steps:
• It calculates the evidence that a certain input point xi belongs to a particular class.
• It converts the evidence into probability of belonging to each of the two possible classes and returns it as y_.
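To make these two steps concrete, here is a minimal numpy sketch (with made-up evidence values for a single point) of what tf.nn.softmax() computes: it exponentiates each evidence value and normalizes so that the outputs sum to 1.
import numpy as np
evidence = np.array([2.0, -0.3])  # hypothetical evidence values for one point
exp_evidence = np.exp(evidence)  # exponentiate each value
probs = exp_evidence / exp_evidence.sum()  # normalize so the two outputs sum to 1
print(probs)  # approximately [0.909, 0.091] -> first class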
Continuing with the construction of the model, we must establish the rules by which the parameters W and b will be optimized, and we do so by defining a cost (or loss) function. In this phase we can choose among many functions; one of the most common is the mean squared error loss.
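As a reference, the cost used in the listing below is the sum of the squared differences between the expected labels y and the predicted probabilities y_, divided by 2 * n_samples. A minimal numpy sketch of the same computation, using two made-up predictions:
import numpy as np
y = np.array([[1., 0.], [0., 1.]])  # expected one-hot labels
y_ = np.array([[0.9, 0.1], [0.4, 0.6]])  # hypothetical predicted probabilities
n_samples = len(y)
cost = np.sum((y - y_) ** 2) / (2 * n_samples)
print(cost)  # 0.085 for these values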
Once the cost (or loss) function has been defined, an algorithm must be established to perform the minimization at each learning cycle (optimization).
We can use the tf.train.GradientDescentOptimizer() function as an optimizer that bases its operation on the Gradient Descent algorithm.
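Conceptually, at each learning cycle the optimizer moves every parameter a small step in the direction that reduces the cost, scaled by the learning rate. A rough numpy sketch of a single update step (the gradient values are made up; in TensorFlow the gradients are computed automatically from the graph):
import numpy as np
learning_rate = 0.01
W = np.zeros((2, 2))  # current weights
grad_W = np.array([[0.3, -0.3],  # hypothetical gradient of the cost with respect to W
                   [0.1, -0.1]])
W = W - learning_rate * grad_W  # one gradient descent update
print(W)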
After incorporating these definitions into our program, it should look like this:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
#Training set
inputX = np.array([[1., 3.], [1., 2.], [1., 1.5], [1.5, 2.], [2., 3.], [2.5, 1.5],
                   [2., 1.], [3., 1.], [3., 2.], [3.5, 1.], [3.5, 3.]])
inputY = [[1., 0.]]*6 + [[0., 1.]]*5
print('\ninputX\n')
print(inputX)
print('\ninputY\n')
print(inputY)
yc = [0]*6 + [1]*5
print('\ninputyc\n')
print(yc)
plt.scatter(inputX[:,0],inputX[:,1],c=yc, s=50, alpha=0.9)
plt.show()
learning_rate = 0.01      # learning rate for gradient descent
training_epochs = 2000    # number of learning cycles (epochs)
display_step = 50         # print the progress every 50 epochs
n_samples = 11            # number of elements in the training set
batch_size = 11           # all 11 elements in a single batch
total_batch = int(n_samples/batch_size)
n_input = 2 # size data input (# size of each element of x)
n_classes = 2 # n of classes
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])
# Set model weights
W = tf.Variable(tf.zeros([n_input, n_classes]))
b = tf.Variable(tf.zeros([n_classes]))
evidence = tf.add(tf.matmul(x, W), b)  # evidence = x * W + b
y_ = tf.nn.softmax(evidence)           # probabilities of belonging to each class
cost = tf.reduce_sum(tf.pow(y - y_, 2)) / (2 * n_samples)  # mean squared error loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)
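Note that tf.placeholder() and tf.train.GradientDescentOptimizer() belong to the TensorFlow 1.x graph API. If you are running TensorFlow 2.x, one way (a sketch, not the only option) to execute this listing unchanged is to replace the first import with the compatibility module and disable the 2.x behavior:
import tensorflow.compat.v1 as tf  # TF 1.x compatibility module
tf.disable_v2_behavior()  # enables placeholders and graph-based execution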
With the definition of the cost optimization method (minimization), we have completed the definition of the neural network model and are ready to implement its learning phase.
Here I am ending today's discussion, wherein we covered the Single Layer Perceptron with TensorFlow. In the next post I'll focus on the implementation of the learning phase. So till we meet again, keep learning and practicing Python, as Python is easy to learn!