Thursday, June 27, 2019

More on Machine Learning

We have developed algorithms and programs that can learn things about data and sensory input and apply that knowledge to new situations. However, our machines do not “understand” anything about what they have learned. They have simply accumulated data about their inputs and transformed that input into some kind of output.

Even if the machine does not “understand” what it has learned, that does not mean we cannot do impressive things with the machine-learning techniques discussed in this post.

Have you ever thought about what it means for a machine to learn something?

Well, if a machine can take inputs and, by some process, transform those inputs into useful outputs, then we can say the machine has learned something. That is a very broad definition: by writing a simple program to add two numbers, you have taught that machine something. It has learned to add two numbers.

We’re going to focus in this post on machine learning in the sense of the use of algorithms and statistical models that progressively improve their performance on a specific task. Most of our goal setting (training the machine) will be done with known solutions to a problem: first training our machine and then applying the training to new, similar examples of the problem.

Although I've covered the types of machine-learning algorithms in an earlier post on machine learning, it's worth repeating here that there are three main types of machine-learning algorithms:

1. Supervised learning: This type of algorithm builds a model from data that contains both the inputs and the desired outputs. The data is known as training data. This is the kind of machine learning we use in this post.

2. Unsupervised learning: For this type of algorithm, the data contains only the inputs, and the algorithms look for structure and patterns in the data.

3. Reinforcement learning: This area is concerned with software that takes actions to maximize some kind of cumulative reward. These algorithms do not assume knowledge of an exact mathematical model and are used when exact models are unavailable. This is the most complex area of machine learning, and the one that may see the most use in the future.

Creating a Machine-Learning Network for Detecting Clothes Types

It's time to build a TensorFlow/Keras machine-learning application using the freely available Fashion-MNIST (Modified National Institute of Standards and Technology) database, which contains a training set of 60,000 fashion products from ten categories. The images are 28x28 pixel grayscale, with 6,000 items in each category. The categories are:

0 T-shirt/top
1 Trouser
2 Pullover
3 Dress
4 Coat
5 Sandal
6 Shirt
7 Sneaker
8 Bag
9 Ankle boot

Our first task is getting the data: the Fashion-MNIST dataset. The first run of the program takes a while because it has to download the data to our computer; after that, the program uses the local copy of the Fashion-MNIST data.

The next step is to train our machine-learning neural network using all 60,000 images of clothes: 6,000 images in each of the ten categories. After training comes testing. Our trained network will be tested in three different ways:

1) the set of 10,000 test photos from the Fashion_MNIST data set;
2) a selected image from the Fashion_MNIST data set; and
3) a photo of a woman’s dress.

The code for our network is shown below:

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg 
import seaborn as sns
import tensorflow as tf
from PIL import Image

# Import Fashion MNIST from the Keras datasets
fashion_mnist = tf.keras.datasets.fashion_mnist

(train_images, train_labels), (test_images, test_labels) \
        = fashion_mnist.load_data()




class_names = ['T-shirt/top', 'Trouser',
        'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


train_images = train_images / 255.0

test_images = test_images / 255.0


# Define the two-layer network: flatten each 28x28 image, one hidden layer, ten outputs
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))


model.compile(optimizer=tf.train.AdamOptimizer(),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])


model.fit(train_images, train_labels, epochs=5)

# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)

print('10,000 image Test accuracy:', test_acc)

#run test image from Fashion_MNIST data

img = test_images[15]
img = (np.expand_dims(img,0))
singlePrediction = model.predict(img,steps=1)
print ("Prediction Output")
print(singlePrediction)
print()
NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)

print ("Our Network has concluded that the image number '15' is a "
        +class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")



As usual, we start by importing all the libraries needed to run our example two-layer model. Next we load our data, as shown by the code below:

# Import Fashion MNIST from the Keras datasets
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) \
        = fashion_mnist.load_data()
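
As a quick sanity check (not part of the original listing), we can print the array shapes to confirm that we received 60,000 training images and 10,000 test images, each 28x28 pixels:

# Check what load_data() returned (shapes only; the data itself is untouched)
print(train_images.shape, train_labels.shape)   # (60000, 28, 28) (60000,)
print(test_images.shape, test_labels.shape)     # (10000, 28, 28) (10000,)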


We also give some descriptive names to the ten classes within the Fashion_MNIST data.

class_names = ['T-shirt/top', 'Trouser',
        'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']


Then we rescale the image pixel values so they run from 0.0 to 1.0 rather than from 0 to 255.

train_images = train_images / 255.0
test_images = test_images / 255.0


The next step is to define our neural-network model and its layers. It is very simple to add more neural layers and to change their sizes and their activation functions. Each Dense layer applies weights and a bias followed by its activation function: relu for the hidden layer and softmax for the final output layer, which turns the outputs into probabilities across the ten classes.

model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
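
If you want to inspect the resulting network, Keras can print a summary of the layers and parameter counts (this call is an addition, not part of the original listing); for this model it should report roughly 101,770 trainable parameters:

# Print the layer shapes and parameter counts of the two-layer model
model.summary()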


Then comes compiling our model. Here we use the loss function sparse_categorical_crossentropy, which applies when each clothes category is labeled with a different integer, as in this example. ADAM (a method for stochastic optimization) is a good default optimizer; it is well suited to problems that are large in terms of data and/or parameters. Sparse categorical crossentropy measures the error between the predicted category probabilities and the true category across the data set. Categorical refers to the fact that the data has more than two (binary) categories; sparse refers to labeling each class with a single integer (0–9 in our example) rather than a one-hot vector; and crossentropy (related to entropy, a measure of disorder) measures how far the predicted probability distribution is from the true one.

model.compile(optimizer=tf.train.AdamOptimizer(),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
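
To make the loss concrete, here is a minimal NumPy sketch, not part of the original program, of what sparse categorical crossentropy computes for a single image; the probabilities below are made up for illustration:

import numpy as np

# Suppose the network outputs these ten softmax probabilities and the true label is 1 (Trouser)
predicted = np.array([0.01, 0.90, 0.01, 0.02, 0.01, 0.01, 0.02, 0.01, 0.005, 0.005])
true_label = 1                           # the "sparse" integer label

loss = -np.log(predicted[true_label])    # cross-entropy for this one example
print(loss)                              # about 0.105; a perfect prediction would give 0.0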


To fit and train our model, we chose only 5 epochs because of the time it takes to run the model for our examples; feel free to increase it. Here we feed the NumPy arrays for the training images and labels (train_images and train_labels) into our network.

model.fit(train_images, train_labels, epochs=5)
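
Although the program above does not keep it, model.fit returns a History object; here is a small sketch of how one could store it for later analysis (the accuracy key is 'acc' in the TF 1.x used here and 'accuracy' in newer TF/Keras):

# Keep the History object that model.fit returns
history = model.fit(train_images, train_labels, epochs=5)

print(history.history['loss'])   # one loss value per epoch
print(history.history['acc'])    # one accuracy value per epoch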

To evaluate the trained model, we use the model.evaluate function. It runs the trained network over the 10,000-image test set and returns test_loss and test_acc for our information.

# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('10,000 image Test accuracy:', test_acc)


When we run the program we get the following output:

Epoch 1/5
60000/60000 [==============================] - 44s 726us/step - loss: 0.5009 - acc: 0.8244
Epoch 2/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.3751 - acc: 0.8652
Epoch 3/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.3359 - acc: 0.8767
Epoch 4/5
60000/60000 [==============================] - 42s 701us/step - loss: 0.3124 - acc: 0.8839
Epoch 5/5
60000/60000 [==============================] - 42s 703us/step - loss: 0.2960 - acc: 0.8915
10000/10000 [==============================] - 4s 404us/step
10,000 image Test accuracy: 0.873


The test results show that, with our two-layer machine-learning neural network, we classify 87 percent of the 10,000-image test database correctly. We upped the number of epochs to 50 and increased this to only 88.7 percent accuracy: lots of extra computation for little gain in accuracy.

Testing a single test image

The next task is to test a single image (shown in the figure below) from the Fashion_MNIST database.


This is implemented as shown below:

#run test image from Fashion_MNIST data
img = test_images[15]
img = (np.expand_dims(img,0))
singlePrediction = model.predict(img,steps=1)
print ("Prediction Output")
print(singlePrediction)
print()
NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)
print ("Our Network has concluded that the image number '15' is a "
+class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")
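
To see the image being classified (the figure mentioned earlier), a short matplotlib sketch such as the following, run inside the same program, displays test image 15 along with its true class name:

# Display Fashion_MNIST test image 15 and its true class name
plt.figure()
plt.imshow(np.squeeze(test_images[15]), cmap='gray')
plt.title(class_names[test_labels[15]])
plt.show()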

Here are the results from a five-epoch run:

Prediction Output
[[1.2835168e-05 9.9964070e-01 6.2637120e-08 3.4126092e-04 4.4297972e-06
7.8450663e-10 6.2759432e-07 9.8717527e-12 1.2729484e-08 1.1002166e-09]]


Our Network has concluded that the image number '15' is a Trouser
99% Confidence Level


The result shows that the network correctly identified the picture as a trouser. Remember, however, that we only had an overall accuracy level on the test data of 87 percent.

Testing on external pictures

To accomplish this test, we took a dress, hung it up on a wall (see the figure below), and took a picture of it with a phone.



Next we converted it from the 3024x3024-pixel image straight off the phone down to a resolution of 28x28 pixels (see the figure below).
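
The conversion itself could be done with Pillow, which the program already imports; here is a rough sketch, where the file name DressPhoto.JPG is just a placeholder for the original phone photo:

# Hypothetical resizing step: shrink the original phone photo to 28x28 pixels
from PIL import Image

photo = Image.open("DressPhoto.JPG")      # placeholder name for the 3024x3024 phone photo
small = photo.resize((28, 28))            # downsample to the Fashion-MNIST resolution
small.save("Dress28x28.JPG")              # the file the code below expects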



The following code arranges the data from our JPG picture into the format required by TensorFlow.

# run Our test Image
# read test dress image
imageName = "Dress28x28.JPG"

testImg = Image.open(imageName)
testImg.load()
data = np.asarray( testImg, dtype="float" )
data = tf.image.rgb_to_grayscale(data)
data = data/255.0
data = tf.transpose(data, perm=[2,0,1])
singlePrediction = model.predict(data,steps=1)
print ("Prediction Output")
print(singlePrediction)
print()
NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)
print ("Our Network has concluded that the file '"
+imageName+"' is a "+class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")


Let's run the program to see the results. We put the Dress28x28.JPG file in the same directory as our program and ran a five-epoch training run. Here are the results:

Prediction Output
[[1.2717753e-06 1.3373902e-08 1.0487850e-06 3.3525557e-11 8.8031484e-09
7.1847245e-10 1.1177938e-04 8.8322977e-12 9.9988592e-01 3.2957085e-12]]


Our Network has concluded that the file 'Dress28x28.JPG' is a Bag
99% Confidence Level


The result shows that our neural-network machine-learning program, after training on 60,000 pictures, including 6,000 pictures of dresses, concluded at a 99 percent confidence level that the dress is a bag.

Let's increase the training epochs to 50 and rerun the program. Here are the results from that run:

Prediction Output
[[3.4407502e-33 0.0000000e+00 2.5598763e-33 0.0000000e+00 0.0000000e+00
  0.0000000e+00 2.9322060e-17 0.0000000e+00 1.0000000e+00 1.5202169e-39]]

Our Network has concluded that the file 'Dress28x28.JPG' is a Bag
100% Confidence Level


The dress is still a bag, but now our program is 100 percent confident of it. This illustrates one of the problems with machine learning: being 100 percent certain that a picture is of a bag when it is actually a dress is still 100 percent wrong. What is the real problem here?

Probably the neural-network configuration is just not good enough to distinguish the dress from a bag. We saw that additional training epochs didn't seem to help at all, so the next things to try are increasing the number of neurons in our hidden layer, switching to a CNN (convolutional neural network), and using data augmentation (increasing the number of training samples by rotating, shifting, and zooming the pictures, as sketched below).
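
Here is a minimal sketch of the data-augmentation idea using Keras's ImageDataGenerator; the parameter values are illustrative assumptions, and train_images and train_labels are the arrays from the program above:

import tensorflow as tf

# Generate rotated, shifted, and zoomed variants of the training images on the fly
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rotation_range=10,       # rotate up to 10 degrees
        width_shift_range=0.1,   # shift horizontally up to 10 percent
        height_shift_range=0.1,  # shift vertically up to 10 percent
        zoom_range=0.1)          # zoom in or out up to 10 percent

# The generator expects a channel axis, so reshape the images to (60000, 28, 28, 1)
augmented = datagen.flow(train_images.reshape(-1, 28, 28, 1), train_labels, batch_size=32)
# 'augmented' can then be passed to model.fit_generator (TF 1.x) or model.fit (newer TF)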

We changed the model layers in our program to use the following four-level convolutional-layer model. CNNs work by scanning images and analyzing them chunk by chunk, say with a 5x5 window that moves by a stride length of two pixels each time until it spans the entire image. It's like looking at an image through a microscope; we only see a small part of the picture at any one time, but eventually we see the whole picture. (A tiny sketch of this windowing idea follows below.)
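
As a tiny illustration of that windowing idea (using the 5x5 window and two-pixel stride from the description above; the model below actually uses 3x3 windows), a single Conv2D layer applied to one blank 28x28 image produces a 12x12 output map:

import numpy as np
import tensorflow as tf

# One Conv2D filter with a 5x5 window that moves 2 pixels at a time over a 28x28 image
conv = tf.keras.layers.Conv2D(1, kernel_size=(5, 5), strides=(2, 2))
output = conv(np.zeros((1, 28, 28, 1), dtype=np.float32))   # one blank grayscale image
print(output.shape)   # (1, 12, 12, 1): each of the 12x12 positions saw one 5x5 chunk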

The CNN model code has the same structure as the last program. The only significant changes are reshaping the images to add a channel dimension and the new layers for the CNN network, as shown below:

#import libraries
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import seaborn as sns
import tensorflow as tf

from PIL import Image

# Import Fashion MNIST from the Keras datasets
fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) \
        = fashion_mnist.load_data()
class_names = ['T-shirt/top', 'Trouser',
        'Pullover', 'Dress', 'Coat',
        'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
train_images = train_images / 255.0
test_images = test_images / 255.0
# Prepare the training images
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
# Prepare the test images
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)
model = tf.keras.Sequential()
input_shape = (28, 28, 1)
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu',
        input_shape=input_shape))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.25))
model.add(tf.keras.layers.Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.25))

model.add(tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Dropout(0.25))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(128, activation='relu'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer=tf.train.AdamOptimizer(),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)
# test with 10,000 images
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('10,000 image Test accuracy:', test_acc)
#run test image from Fashion_MNIST data
img = test_images[15]
img = (np.expand_dims(img,0))
singlePrediction = model.predict(img,steps=1)
print ("Prediction Output")
print(singlePrediction)
print()
NumberElement = singlePrediction.argmax()
Element = np.amax(singlePrediction)
print ("Our Network has concluded that the image number '15' is a "
+class_names[NumberElement])
print (str(int(Element*100)) + "% Confidence Level")


When we ran this program, the results were as follows:

10,000 image Test accuracy: 0.8601
Prediction Output
[[5.9128129e-06 9.9997270e-01 1.5681641e-06 8.1393973e-06 1.5611777e-06
7.0504888e-07 5.5174642e-06 2.2484977e-07 3.0045830e-06 5.6888598e-07]]


Our Network has concluded that the image number '15' is a Trouser


The key number here is the 10,000-image test accuracy. At 86 percent, it was actually lower than our previous, simpler machine-learning neural network (87 percent). Why did this happen?

This is probably a case of “overfitting” the training data. A CNN model such as this has a very large number of internal parameters to train (many millions of possibilities), which can lead to overfitting: the trained network fits the training set better and better but loses some of its ability to recognize new test data. One way to watch for this is sketched below.
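
As a small sketch of one way to watch for overfitting (an addition, not something the original program does), Keras can hold out part of the training data and report validation accuracy after every epoch; model, train_images, and train_labels are from the program above:

# Hold out 10 percent of the training data and compare training vs. validation accuracy
history = model.fit(train_images, train_labels,
                    epochs=5,
                    validation_split=0.1)

# If training accuracy keeps rising while validation accuracy stalls or falls,
# the model is probably starting to overfit.
print(history.history['acc'])       # 'accuracy' in newer TF/Keras
print(history.history['val_acc'])   # 'val_accuracy' in newer TF/Keras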

Choosing the machine-learning neural network to work with your data is one of the major decisions you will make in your design. However, understanding activation functions, dropout management, and loss functions will also deeply affect the performance of your machine-learning program.

Optimizing all these parameters at once is a difficult task that requires research and experience. I'm ending this post here; in the next post we'll run our base code again and do some analysis of the run using Matplotlib. Till we meet again, keep practicing and learning Python, as Python is easy to learn!