Friday, September 4, 2020

MNIST dataset

MNIST Reborn, Restored and Expanded: Additional 50K Training Samples | by  Synced | SyncedReview | Medium

Every machine learning student knows about the MNIST dataset. Fewer know what the acronym stands for. In fact, we had to look it up to be able to tell you that the M stands for Modified, and NIST stands for National Institute of Standards and Technology. Now you probably know something that an average machine learning expert doesn’t!

Handwritten digits are a classic case that is often used when discussing why we use machine learning, and we will make no exception.

Below you can see examples of handwritten images from the very commonly used MNIST dataset.


The correct label (what digit the writer was supposed to write) is shown above each image. Note that some of the "correct” class labels are questionable: see for example the second image from left: is that really a 7, or actually a 4?

In the most common machine learning problems, exactly one class value is correct at a time. This is also true in the MNIST case, although as we said, the correct answer may often be hard to tell. In this kind of problem, it is not possible that an instance belongs to multiple classes (or none at all) at the same time. What we would like to achieve is an AI method that can be given an image like the ones above, and automatically spits out the correct label (a number between 0 and 9).

The roots of machine learning are in statistics, which can also be thought of as the art of extracting knowledge from data. Especially methods such as linear regression and Bayesian statistics, which are both already more than two centuries old (!), are even today at the heart of machine learning.

The area of machine learning is often divided in subareas according to the kinds of problems being attacked. A rough categorization is as follows:

Supervised learning: We are given an input, for example a photograph with a traffic sign, and the task is to predict the correct output or label, for example which traffic sign is in the picture (speed limit, stop sign, etc.). In the simplest cases, the answers are in the form of yes/no (we call these binary classification problems).

Unsupervised learning: There are no labels or correct outputs. The task is to discover the structure of the data: for example, grouping similar items to form “clusters”, or reducing the data to a small number of important “dimensions”. Data visualization can also be considered unsupervised learning.

Reinforcement learning: Commonly used in situations where an AI agent like a self-driving car must operate in an environment and where feedback about good or bad choices is available with some delay. Also used in games where the outcome may be decided only at the end of the game.

The categories are somewhat overlapping and fuzzy, so a particular method can sometimes be hard to place in one category. For example, as the name suggests, so-called semisupervised learning is partly supervised and partly unsupervised.

When it comes to machine learning, we will focus primarily on supervised learning, and in particular, classification tasks. In classification, we observe in input, such as a photograph of a traffic sign, and try to infer its “class”, such as the type of sign (speed limit 80 km/h, pedestrian crossing, stop sign, etc.). Other examples of classification tasks include: identification of fake Twitter accounts (input includes the list of followers, and the rate at which they have started following the account, and the class is either fake or real account) and handwritten digit recognition (input is an image, class is 0,...,9).

We shall continue our discuss in the next post which will focus on supervised learning.

Share:

0 comments:

Post a Comment