Tuesday, May 28, 2019

Pandas - 45 Supervised Learning with scikit-learn (Nonlinear SVC)

In the previous post we saw the linear SVC algorithm, which defines a line of separation intended to split the two classes. More complex SVC algorithms can establish curves (2D) or curved surfaces (3D) based on the same principle of maximizing the distance between the surface and the points closest to it. Let’s see the system using a polynomial kernel. As the name implies, we can define a polynomial curve that separates the decision area into two portions. The degree of the polynomial is set with the degree option. In this case too, C is the regularization coefficient.
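
To make the role of the degree option more concrete, here is a minimal sketch of the quantity a polynomial kernel computes for a pair of points, (gamma * <a, b> + coef0) ** degree. The values of gamma and coef0 are assumptions chosen only for illustration; the post itself sets only degree and C.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Two sample points taken from the small training set used in this post
a = np.array([[1.0, 3.0]])
b = np.array([[3.0, 2.0]])

# Assumed kernel parameters, chosen only for illustration
degree, gamma, coef0 = 3, 0.5, 0.0

# Polynomial kernel written out by hand: (gamma * <a, b> + coef0) ** degree
manual = (gamma * np.dot(a[0], b[0]) + coef0) ** degree

# The same quantity computed by scikit-learn's pairwise helper
library = polynomial_kernel(a, b, degree=degree, gamma=gamma, coef0=coef0)[0, 0]

print(manual, library)
```

Raising the degree makes the kernel value grow faster with the dot product, which is what lets the separating curve bend more.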

In the following program we'll try to apply an SVC algorithm with a polynomial kernel of third degree and with a C coefficient equal to 1:

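import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
# Training points: the first six belong to class 0, the last five to class 1
x = np.array([[1,3],[1,2],[1,1.5],[1.5,2],[2,3],[2.5,1.5],[2,1],[3,1],[3,2],[3.5,1],[3.5,3]])
y = [0]*6 + [1]*5
svc = svm.SVC(kernel='poly',C=1, degree=3).fit(x,y)
# Evaluate the decision function on a 200x200 grid covering the plane
X,Y = np.mgrid[0:4:200j,0:4:200j]
Z = svc.decision_function(np.c_[X.ravel(),Y.ravel()])
Z = Z.reshape(X.shape)
plt.contourf(X,Y,Z > 0,alpha=0.4)
plt.contour(X,Y,Z,colors=['k','k','k'], linestyles=['--','-','--'],levels=[-1,0,1])
plt.scatter(x[:,0],x[:,1],c=y,s=50,alpha=0.9)
plt.show()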

The output of the program, shown below, is the decision space produced by an SVC with a polynomial kernel:

There is another type of nonlinear kernel, the Radial Basis Function (RBF). In this case the separation curves tend to define the zones radially with respect to the observation points of the training set. See the following program:

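import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
# Same training points as before: six of class 0, five of class 1
x = np.array([[1,3],[1,2],[1,1.5],[1.5,2],[2,3],[2.5,1.5],[2,1],[3,1],[3,2],[3.5,1],[3.5,3]])
y = [0]*6 + [1]*5
svc = svm.SVC(kernel='rbf', C=1, gamma=3).fit(x,y)
# Evaluate the decision function on a 200x200 grid covering the plane
X,Y = np.mgrid[0:4:200j,0:4:200j]
Z = svc.decision_function(np.c_[X.ravel(),Y.ravel()])
Z = Z.reshape(X.shape)
plt.contourf(X,Y,Z > 0,alpha=0.4)
plt.contour(X,Y,Z,colors=['k','k','k'], linestyles=['--','-','--'],levels=[-1,0,1])
plt.scatter(x[:,0],x[:,1],c=y,s=50,alpha=0.9)
plt.show()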

The output of the program is shown below. We can see the two portions of the decision space, with all points of the training set correctly positioned:

Now let us tackle a more complex classification problem with SVC, using a dataset we have already worked with: the Iris Dataset.

The SVC algorithm used before learned from a training set containing only two classes, but now we will extend the case to three classes, as the Iris Dataset is split into three classes corresponding to three different species of flowers.

In this case the decision boundaries intersect each other, subdividing the decision area (in the 2D case) or the decision volume (3D) into several portions.
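
As a rough illustration of the three-class case, the sketch below fits an SVC on the first two Iris features and inspects what the model exposes. It relies on the library's default decision_function_shape='ovr' setting, which is an assumption about the installed scikit-learn version rather than something stated in this post.

```python
from sklearn import datasets, svm

iris = datasets.load_iris()
x = iris.data[:, :2]   # first two features, as in this post
y = iris.target

svc = svm.SVC(kernel='linear', C=1.0).fit(x, y)

# The class labels the model has learned
print(svc.classes_)

# With decision_function_shape='ovr' there is one score per class and sample
scores = svc.decision_function(x[:5])
print(scores.shape)

# predict() returns one of the learned labels for every sample
labels = svc.predict(x[:5])
print(labels)
```

Each point of the plane is assigned to one of the three labels, which is why the plane ends up divided into several regions.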

Linear models have linear decision boundaries (intersecting hyperplanes), while models with nonlinear kernels (polynomial or Gaussian RBF) have nonlinear decision boundaries. These boundaries are more flexible, with shapes that depend on the type of kernel and its parameters.
See the following program:

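import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
x = iris.data[:,:2]  # keep only the first two features (sepal length and width)
y = iris.target
svc = svm.SVC(kernel='linear',C=1.0).fit(x,y)
x_min,x_max = x[:,0].min() - .5, x[:,0].max() + .5
y_min,y_max = x[:,1].min() - .5, x[:,1].max() + .5
h = .02  # step size of the mesh
X, Y = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min,y_max,h))
# Predict the class of every point of the mesh
Z = svc.predict(np.c_[X.ravel(),Y.ravel()])
Z = Z.reshape(X.shape)
# Color the decision regions and overlay the training points
plt.contourf(X,Y,Z,alpha=0.4)
plt.contour(X,Y,Z,colors='k')
plt.scatter(x[:,0],x[:,1],c=y)
plt.show()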

The output of the program is shown below, where we can see that the decision space is divided into three portions separated by decision boundaries:

Let's apply a nonlinear kernel to generate nonlinear decision boundaries, such as the polynomial kernel, as shown in the following program:

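import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
x = iris.data[:,:2]  # keep only the first two features (sepal length and width)
y = iris.target
svc = svm.SVC(kernel='poly',C=1.0,degree=3).fit(x,y)
x_min,x_max = x[:,0].min() - .5, x[:,0].max() + .5
y_min,y_max = x[:,1].min() - .5, x[:,1].max() + .5
h = .02  # step size of the mesh
X, Y = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min,y_max,h))
# Predict the class of every point of the mesh
Z = svc.predict(np.c_[X.ravel(),Y.ravel()])
Z = Z.reshape(X.shape)
# Color the decision regions and overlay the training points
plt.contourf(X,Y,Z,alpha=0.4)
plt.contour(X,Y,Z,colors='k')
plt.scatter(x[:,0],x[:,1],c=y)
plt.show()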

The output of the program, shown below, illustrates how the polynomial decision boundaries split the area in a very different way compared to the linear case:

Notice that in the polynomial case the blue portion is not directly connected with the purple portion. To see the difference in the distribution of areas, we can apply the RBF kernel as shown in the following program:

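import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
x = iris.data[:,:2]  # keep only the first two features (sepal length and width)
y = iris.target
svc = svm.SVC(kernel='rbf', gamma=3, C=1.0).fit(x,y)
x_min,x_max = x[:,0].min() - .5, x[:,0].max() + .5
y_min,y_max = x[:,1].min() - .5, x[:,1].max() + .5
h = .02  # step size of the mesh
X, Y = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min,y_max,h))
# Predict the class of every point of the mesh
Z = svc.predict(np.c_[X.ravel(),Y.ravel()])
Z = Z.reshape(X.shape)
# Color the decision regions and overlay the training points
plt.contourf(X,Y,Z,alpha=0.4)
plt.contour(X,Y,Z,colors='k')
plt.scatter(x[:,0],x[:,1],c=y)
plt.show()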

The output of the program, shown below, illustrates how the RBF kernel generates radial areas:

The SVC method can be extended to solve regression problems as well. This method is called Support Vector Regression (SVR). The model produced by SVC does not actually depend on the complete training set; it uses only a subset of elements, i.e., those closest to the decision boundary.
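
The subset of training points that the fitted classifier relies on can be inspected directly. The following is a minimal sketch that reuses the RBF settings from the Iris example above; how many points end up in the subset depends on the data and the parameters, so no particular count should be expected.

```python
import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
x = iris.data[:, :2]
y = iris.target

svc = svm.SVC(kernel='rbf', gamma=3, C=1.0).fit(x, y)

# Indices (into x) of the training points kept as support vectors
print(len(svc.support_), "support vectors out of", len(x), "training points")

# Number of support vectors contributed by each class
print(svc.n_support_)

# support_vectors_ holds the coordinates of those same training points
print(np.allclose(svc.support_vectors_, x[svc.support_]))
```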

In a similar way, the model produced by SVR also depends only on a subset of the training set. Let's see how the SVR algorithm handles the diabetes dataset that we have already seen in the previous posts. By way of example, we will refer only to the third physiological feature. We will perform three different regressions, one linear and two nonlinear (polynomial). The linear case will produce a straight line, as the linear predictive model is very similar to the linear regression seen previously, whereas the polynomial regressions will be built with the second and third degrees.
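
To see which subset an SVR model depends on, the sketch below counts the support vectors for two widths of the tolerance band, controlled by the epsilon parameter of SVR. Both epsilon values and C=1.0 are assumptions chosen for illustration (0.1 is the library default for epsilon; C is smaller than the value used in this post only to keep the sketch quick).

```python
import numpy as np
from sklearn import datasets, svm

diabetes = datasets.load_diabetes()
# Third feature only, rescaled by 100 as in this post
x0_train = diabetes.data[:-20, 2][:, np.newaxis] * 100
y_train = diabetes.target[:-20]

# Count the support vectors for a narrow and a wide tolerance band
counts = {}
for eps in (0.1, 50.0):
    svr = svm.SVR(kernel='linear', C=1.0, epsilon=eps).fit(x0_train, y_train)
    counts[eps] = len(svr.support_)
    print("epsilon =", eps, "->", counts[eps],
          "support vectors out of", len(x0_train))
```

Training points that fall inside the band do not become support vectors, so a wider band tends to leave the model depending on fewer points.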

The SVR() function is almost identical to the SVC() function seen previously. The only aspect to consider is that the test set of data must be sorted in ascending order, so that the regression curves are drawn from left to right. See the following program:

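import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

diabetes = datasets.load_diabetes()
x_train = diabetes.data[:-20]
y_train = diabetes.target[:-20]
x_test = diabetes.data[-20:]
y_test = diabetes.target[-20:]
x0_test = x_test[:,2]
x0_train = x_train[:,2]
# Sort the test set in ascending order, keeping each target paired with its input
order = np.argsort(x0_test)
x0_test = x0_test[order]
y_test = y_test[order]
x0_test = x0_test[:,np.newaxis]
x0_train = x0_train[:,np.newaxis]
x0_test = x0_test*100
x0_train = x0_train*100
# Each model must be fitted on the training data before it can predict
svr = svm.SVR(kernel='linear',C=1000).fit(x0_train,y_train)
svr2 = svm.SVR(kernel='poly',C=1000,degree=2).fit(x0_train,y_train)
svr3 = svm.SVR(kernel='poly',C=1000,degree=3).fit(x0_train,y_train)
y = svr.predict(x0_test)
y2 = svr2.predict(x0_test)
y3 = svr3.predict(x0_test)
# Test points in black; linear fit in blue, degree 2 in red, degree 3 in green
plt.scatter(x0_test,y_test,color='k')
plt.plot(x0_test,y,color='b')
plt.plot(x0_test,y2,color='r')
plt.plot(x0_test,y3,color='g')
plt.show()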

The output of the program is shown below:

As shown in the output, the three regression curves are drawn in three colors. The linear regression is blue; the polynomial of second degree, that is, a parabola, is red; and the polynomial of third degree is green.

Here I am ending today’s post. In the next post we shall start with Deep Learning with TensorFlow. Until we meet again, keep practicing and learning Python, as Python is easy to learn!

