Wilfried Haensch


Recognizing Handwritten Digits with Machine Learning

Introduction

 

Using the DeepLearning  package, this application trains a neural network to recognize the numbers in images of handwritten digits. The trained neural network is then applied to a number of test images.

 

The training and testing images are a very small subset of the MNIST database of handwritten digits. Each one is a 28 x 28 pixel image of a single handwritten digit from 0 to 9. [image: sample of the digit zero]

 

Ultimately, this application generates a vector of weights for each digit; think of the weights as a marking grid for a multiple-choice exam. When reshaped into a 28 x 28 matrix, a weight vector for the digit 0 might look like this: [image: weight matrix for the digit 0, with red and blue areas]

When attempting to recognize the number in an image:

• If a pixel with a high intensity lands in the red area, the evidence is high that the handwritten digit is zero
• Conversely, if a pixel with a high intensity lands in the blue area, the evidence is low that the handwritten digit is zero

 

The DeepLearning package is a partial interface to TensorFlow, an open-source machine learning framework. To learn about the machine learning techniques used in this application, please consult these references (the next section, however, features a brief overview):

• https://www.tensorflow.org/versions/r1.1/get_started/mnist/beginners
• https://www.oreilly.com/learning/not-another-mnist-tutorial-with-tensorflow

Notes

 

Introduction

 

We first build a computational (or dataflow) graph. Then, we create a TensorFlow session to run the graph.

TensorFlow computations operate on tensors; think of tensors as multidimensional arrays.

Images

 

Each 28 x 28 image is flattened into a list with 784 elements.

 

Once flattened, the training images are stored in a tensor x with a shape of [none, 784]. The first dimension is the number of training images ("none" means that an arbitrary number of training images can be used); the second is the number of pixels per image.
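As a minimal sketch of the flattening step, the following uses a synthetic 28 x 28 Matrix instead of a real MNIST image. The names img and flat are illustrative only and do not appear in the worksheet.

```maple
# Synthetic 28 x 28 "image" with intensities between 0 and 1 (made-up data)
img := Matrix(28, 28, (r, c) -> evalf((r + c)/56)):

# Flatten into a plain list, in the same way the worksheet applies convert(..., list) to each image
flat := convert(img, list):

nops(flat);   # 784 elements, one per pixel
```

A list of such flattened images is what gets fed to the placeholder x, one list of 784 values per image.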

Labels

 

Each training image is associated with a label.

 

• Each label is a 10-element list, where each element is either 0 or 1
• All elements apart from one are zero
• The location of the non-zero element is the "value" of the image

So for an image that displays the digit 5, the label is [0,0,0,0,0,1,0,0,0,0]. This is known as a one-hot encoding.
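The encoding can be sketched in a few lines of Maple. The helper name oneHot is hypothetical and is not used in the worksheet, which builds its labels with ListTools:-Rotate instead.

```maple
# Hypothetical helper: one-hot label for a digit d in 0 .. 9
oneHot := d -> [seq(`if`(k = d, 1, 0), k = 0 .. 9)]:

oneHot(5);                             # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]

# Recover the digit from a label: position of the 1, minus 1
ListTools:-Search(1, oneHot(5)) - 1;   # 5
```

The subtraction of 1 is needed because Maple lists are indexed from 1 while the digits start at 0.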

 

All the labels are stored in a tensor y_ with a shape of [none, 10].

Training

 

The neural network is trained via multinomial logistic regression (also known as softmax regression).

 

Step 1

Calculate the evidence that each image is in a selected class by computing a weighted sum of the pixel intensities of the flattened image, plus a bias.

e[i] = sum(W[i,j]*x[j], j = 1 .. 784) + b[i]

where

• W[i,j] and b[i] are the weight and the bias for digit i and pixel j. In the code, W is stored as a matrix with 784 rows (one for each pixel) and 10 columns (one for each digit), so W[i,j] above corresponds to the entry in row j, column i; b is a vector with 10 elements (one for each digit)
• x[j] is the intensity of pixel j
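To make the weighted sum concrete, here is a toy sketch with 4 "pixels" and 3 classes instead of 784 and 10. All numbers are made up, and the names xs, Ws, bs and ev are chosen so that they do not clash with x, W and b in the worksheet.

```maple
# Toy example: 4 "pixels" and 3 classes (made-up numbers)
xs := Vector[row]([0.0, 1.0, 0.5, 0.0]):     # pixel intensities
Ws := Matrix([[ 0.2, -0.1,  0.0],
              [ 0.5,  0.3, -0.2],
              [-0.4,  0.1,  0.6],
              [ 0.0,  0.0,  0.1]]):           # one row per pixel, one column per class
bs := Vector[row]([0.1, 0.0, -0.1]):          # one bias per class

# Evidence for each class: weighted sum of intensities plus bias
ev := xs . Ws + bs;
```

The result should be approximately [0.4, 0.35, 0], i.e. the first class has the most evidence. This mirrors the x.W + b expression used later in the worksheet.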

 

Step 2

Normalize the evidence into a vector of probabilities with softmax.

 

y[i] = softmax(e)[i], where softmax(e)[i] = exp(e[i])/sum(exp(e[j]), j = 1 .. 10)
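A plain-Maple sketch of this normalization, independent of the DeepLearning package, might look as follows. The procedure name softmaxList is hypothetical; subtracting the maximum before exponentiating is a common numerical safeguard and does not change the result.

```maple
# Hypothetical helper: softmax of a list of evidence values
softmaxList := proc(e::list)
    local m, ex, s, v;
    m  := max(e);                        # subtract the maximum for numerical stability
    ex := [seq(exp(v - m), v in e)];     # exponentiate each shifted evidence value
    s  := add(v, v in ex);               # normalizing constant
    return [seq(v/s, v in ex)];
end proc:

p := softmaxList([0.4, 0.35, 0.0]);
add(v, v in p);                          # should be 1 up to rounding
```

Larger evidence values map to larger probabilities, and the outputs always sum to 1.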

 

Step 3

For each image, calculate the cross-entropy between the vector of predicted probabilities and the actual probabilities (i.e. the labels)

H[y_](y) = -sum(y_[i]*log(y[i]), i = 1 .. 10)

where

• y_ is the true distribution of probabilities (i.e. the one-hot encoded label)
• y is the predicted distribution of probabilities

 

The smaller the cross entropy, the better the prediction.
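Because the label is one-hot, only the term for the true digit survives in the sum, so the cross-entropy reduces to minus the logarithm of the probability assigned to the true digit. A small sketch with made-up probabilities (crossEntropy, lbl, good and poor are hypothetical names):

```maple
# Hypothetical helper: cross-entropy between a one-hot label and predicted probabilities
crossEntropy := (yTrue, yPred) -> -add(yTrue[i]*ln(yPred[i]), i = 1 .. nops(yTrue)):

lbl  := [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]:                                  # true digit is 2
good := [0.05, 0.05, 0.6, 0.05, 0.05, 0.05, 0.05, 0.05, 0.025, 0.025]:   # 0.6 on the true digit
poor := [0.6, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.025, 0.025]:   # 0.05 on the true digit

crossEntropy(lbl, good);   # equals -ln(0.6), about 0.51
crossEntropy(lbl, poor);   # equals -ln(0.05), about 3.0
```

The confident, correct prediction gives the smaller value, as expected.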

 

Step 4

The mean cross-entropy across all training images is then minimized to find the optimum values of W and b.
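The worksheet performs this minimization with a gradient descent optimizer. As a rough illustration of the idea only (not of what the DeepLearning package does internally), the following applies the gradient descent update to a one-variable toy cost function. The function f, the starting value and the learning rate are all made up.

```maple
# Toy cost function with its minimum at w = 3 (illustrative only)
f  := w -> (w - 3)^2:
df := D(f):                     # derivative of f

w0  := 0.0:                     # starting guess
eta := 0.1:                     # learning rate (hypothetical value)

for k from 1 to 50 do
    w0 := w0 - eta*df(w0):      # step against the gradient
end do:

w0;                             # close to 3
```

In the worksheet the same kind of update is applied to every entry of W and b at once, with the mean cross-entropy as the cost.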

Testing

 

For each test image, we will generate 10 ordered probabilities that sum to 1. The location of the highest probability is the predicted value of the digit.

Miscellaneous

 

This application consists of

• this worksheet
• a very small subset of images from the MNIST handwritten digit database

in a single zip file. The images are stored in folders; extract the folders to the same location as this worksheet.

Load Packages and Define Parameters

 

restart:
with(DeepLearning):
with(DocumentTools):
with(DocumentTools:-Layout):
with(ImageTools):

LEARNING_RATE := 0.01:
TRAIN_STEPS   := 40:

 

Number of training images to load for each digit (maximum of 100)

N := 22:


Number of labels (there are 10 digits, so this is always 10)

L := 10:

 

Number of test images

T := 50:

Import Training Images and Generate Labels

 

Import the training images, where images[n] is a list containing the images for digit n.

path := "C:/Users/Wilfried/Documents/Maple/Examples/ML/":
for j from 0 to L - 1 do
    images[j] := [seq(Import(cat(path, j, "/", j, " (", i, ").PNG")), i = 1 .. N)];
end do:

Generate the labels for each digit j, where labels[n] holds the N one-hot labels for the images in images[n].

for j from 0 to L - 1 do
   labels[j] := ListTools:-Rotate~([[1,0,0,0,0,0,0,0,0,0]$N],-j)[]:
end do:

 

Display training images

Embed([seq(images[i-1], i = 1 .. L)]);

Training

 

Flatten and collect images

x_train := convert~([seq(images[i - 1][], i = 1 .. L)], list):

 

Collect labels

y_train := [seq(labels[i - 1], i = 1 .. L)]:

 

Define placeholders x and y_ to feed the training images and labels into

SetEagerExecution(false):
x  := Placeholder(float[4], [none, 784]):
y_ := Placeholder(float[4], [none, L]):

Define weights and bias

W := Variable(Array(1 .. 784, 1 .. L), datatype = float[4]):
b := Variable(Array(1 .. L), datatype = float[4]):

 

Define the classifier using multinomial logistic regression

y := SoftMax(x.W + b):

 

Define the cross-entropy (i.e. the cost function)

cross_entropy := ReduceMean(-ReduceSum(y_ * log(y), reduction_indicies = [1])):

 

Get a TensorFlow session

sess := GetDefaultSession():

 

Initialize the variables

init := VariablesInitializer():
sess:-Run(init):

 

Define the optimizer to minimize the cross entropy

optimizer := Optimizer(GradientDescent(LEARNING_RATE)):
training  := optimizer:-Minimize(cross_entropy):

 

Run the optimizer TRAIN_STEPS times

for i from 1 to TRAIN_STEPS do

   sess:-Run(training, {x in x_train, y_ in y_train}):

   # Print the loss every 10 steps (an interval of 200 would never be reached with TRAIN_STEPS = 40)
   if i mod 10 = 0 then
      print(cat("loss = ", sess:-Run(cross_entropy, {x in x_train, y_ in y_train})));    
   end if:

end do:

Import Test Images and Predict Numbers

 

Randomize the order of the test images.

i_rand := combinat:-randperm([seq(i, i = 1 .. 100)]);

[13, 71, 67, 52, 81, 37, 46, 6, 39, 77, 36, 21, 49, 95, 62, 26, 44, 65, 90, 72, 70, 5, 4, 54, 31, 23, 63, 18, 22, 38, 27, 53, 50, 17, 47, 51, 78, 79, 92, 20, 28, 34, 60, 80, 58, 87, 86, 93, 84, 12, 59, 98, 97, 56, 75, 10, 29, 61, 7, 66, 100, 42, 91, 43, 89, 76, 11, 74, 8, 96, 64, 94, 68, 48, 33, 24, 40, 30, 57, 73, 99, 15, 19, 1, 3, 41, 85, 83, 35, 14, 45, 2, 88, 9, 16, 32, 69, 25, 55, 82]

(6.1)


Load and flatten test images.

path:= "C:/Users/Wilfried/Documents/Maple/Examples/ML/test_images":
x_test_images := [seq(Import(cat(path,"/","test (", i, ").png")), i in i_rand[1 .. T])]:
x_train:= convert~(x_test_images, list):

 

For each test image, generate 10 probabilities, one for each possible digit from 0 to 9

pred := sess:-Run(y, {x in x_train})

_rtable[36893490552617628484]

(6.2)


For each test image, find the predicted digit associated with the greatest probability

predList := seq( max[index]( pred[i, ..] ) - 1, i = 1 .. T )

9, 1, 0, 5, 3, 4, 5, 2, 8, 3, 8, 2, 5, 2, 4, 6, 8, 4, 1, 1, 1, 6, 7, 5, 7, 7, 4, 7, 7, 8, 7, 5, 5, 7, 5, 5, 3, 3, 0, 6, 7, 7, 4, 3, 4, 0, 3, 2, 3, 7

(6.3)

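# Highest probability for each test image
maxProb := [seq(max(pred[i, ..]), i = 1 .. T)]:

# Group those probabilities by predicted digit; Val[k] and Val_mean[k] refer to digit k - 1
Val := Vector(10, 0):
Val_mean := Vector(10, 0):
for k from 1 to 10 do
   L1 := [seq(`if`(predList[i] = k - 1, maxProb[i], NULL), i = 1 .. T)]:
   Val[k] := evalf(L1, 3):
   # Leave the mean at 0 when no test image was assigned to this digit
   if nops(L1) > 0 then Val_mean[k] := Statistics:-Mean(L1) end if:
end do: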

Val,Val_mean

 

Vector[column](%id = 36893490552619428548), Vector[column](%id = 36893490552619428668)

(6.4)

Consider the first test image

Embed(x_test_images[1])

The ten probabilities associated with this image are

pred[1, ..]

Vector[row](10, {(1) = 0.1176837849925505e-4, (2) = .3597199022769928, (3) = 0.8788742707110941e-3, (4) = 0.14628235250711441e-1, (5) = .16885940730571747, (6) = 0.10462711565196514e-1, (7) = 0.16997022554278374e-1, (8) = 0.5874206870794296e-1, (9) = 0.20698020234704018e-2, (10) = .3676302134990692})

(6.5)

 

Confirm that the probabilities add up to 1

add(i, i in pred[1, ..])

HFloat(1.0000000058325895)

(6.6)

 

The maximum probability occurs at this index

maxProbInd := max[index](pred[1, ..])

10

(6.7)

Hence the predicted number is

maxProbInd - 1

9

(6.8)

Embed(x_test_images[1 .. 25])

Embed(x_test_images[26 .. 50])

We now display all the predictions

T1 := Table(Row(seq(predList[k], k = 1 .. 25)),
            Row(seq(predList[k], k = 26 .. 50))):
InsertContent(Worksheet(T1)):

Visualize Weights


I have problems running this file with Maple 2024. It runs fine with Maple 2020.2 (except the very last part, which is not essential). The problem occurs at the SoftMax command, even if I use Softmax instead. It seems to be a Python conversion problem in Maple 2024. Please let me know what the remedy is. You will need to modify the data path because it is set for my computer.

Wilfried

Download HDR.mw

When I calculate the edge values of a matrix, the result is a lengthy expression that could be simplified if evaluated numerically. Why is that not done?
