TP-NN
Posted on Thu 23 January 2020 in posts
TP Neural Network
In the first part of this session, you will implement your own version of the perceptron and become more familiar with machine-learning practice.
Multiclass Perceptron
In order to build your own perceptron, you will have to implement the following:

- softmax: compute the softmax function
- PerceptronOut: return the output of each of the $K$ neurons
- updateWeights: update the weights of the perceptron for a mini-batch
- OneHotVec: given a label and the number of possible classes, return a vector with a one at the correct class and zeros elsewhere
- normalize_X: from the train set, return a new set where a component with the constant 1 is added to every sample
- ComputeScore: return the fraction of correctly classified samples
```python
import numpy as np
import matplotlib.pyplot as plt

def softmax(y):
    # subtract the column-wise max before exponentiating, for numerical stability
    exp_y = np.exp(y - np.max(y, axis=0))
    return exp_y / np.sum(exp_y, axis=0)

def PerceptronOut(w, X):
    # output of the K neurons: column-wise softmax of w @ X
    return softmax(np.matmul(w, X))
```
Update rule:

$$ w_{ki}^{(t+1)} = w_{ki}^{(t)} + \frac{\eta}{{\rm BS}} \sum_{m} x_i^{(m)} \left( \tilde{y}_k^{(m)} - y(\vec{x}^{(m)})_k\right) $$

```python
# X should be D x n_mb, and y_true K x n_mb
def updateWeights(w, X, y_true, η=0.1):
    y = PerceptronOut(w, X)   # K x n_mb
    # gradient step averaged over the mini-batch
    w = w + η * (np.matmul(X, (y_true - y).T) / X.shape[1]).T
    return w
```
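As a quick sanity check (not part of the TP), the update rule above is a gradient-ascent step on the per-batch log-likelihood $\mathcal{L} = \frac{1}{\rm BS}\sum_m \sum_k \tilde{y}_k^{(m)} \log y_k^{(m)}$; the analytic gradient can be compared against finite differences on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, BS = 3, 5, 8
w = rng.normal(size=(K, D))
X = rng.normal(size=(D, BS))
y_true = np.eye(K)[:, rng.integers(0, K, BS)]   # random one-hot labels, K x BS

def softmax(y):
    e = np.exp(y - np.max(y, axis=0))
    return e / np.sum(e, axis=0)

def loglike(w):
    y = softmax(w @ X)
    return np.sum(y_true * np.log(y)) / BS

# analytic gradient: (1/BS) sum_m x_i (y~_k - y_k), i.e. the update direction
y = softmax(w @ X)
grad = (X @ (y_true - y).T).T / BS

# numerical gradient by central differences
eps = 1e-6
num = np.zeros_like(w)
for k in range(K):
    for i in range(D):
        wp, wm = w.copy(), w.copy()
        wp[k, i] += eps
        wm[k, i] -= eps
        num[k, i] = (loglike(wp) - loglike(wm)) / (2 * eps)

print(np.max(np.abs(grad - num)))   # should be tiny (numerical noise)
```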
```python
def OneHotVec(y, Nc):
    NS = y.shape[0]
    y_true = np.zeros((Nc, NS))   # one-hot encoded labels, Nc x NS
    for i in range(NS):
        y_true[y[i], i] = 1
    return y_true
```
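An equivalent vectorized version (a sketch; the loop above is all the TP requires): NumPy integer-array indexing sets all the ones in a single assignment.

```python
import numpy as np

def OneHotVecFast(y, Nc):
    # y_true[y[i], i] = 1 for every i, done in one fancy-indexing assignment
    y_true = np.zeros((Nc, y.shape[0]))
    y_true[y, np.arange(y.shape[0])] = 1
    return y_true

print(OneHotVecFast(np.array([2, 0, 1, 2]), 3))
```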
```python
def normalize_X(train_s, Ns):
    # prepend a constant-1 component (bias term) to each sample, transpose to D x Ns
    un = np.ones((Ns, 1))
    x = np.concatenate((un, train_s), axis=1)
    return x.T
```
```python
def ComputeScore(w, X, y):
    # fraction of samples whose most active neuron matches the true label
    pr = np.argmax(PerceptronOut(w, X), 0)
    return np.sum(pr == y) / X.shape[1]

def ComputeLike(w, X, y):
    # mean log-likelihood per sample (y is one-hot, K x NS)
    return np.mean(np.log(PerceptronOut(w, X)) * y) * w.shape[0]
```
Now check that your perceptron is working; you should reach ~0.98 correct classification on the training set (and something similar on the test set). Then:

- Compare the likelihood on the train and test sets in order to decide at which epoch the learning should be stopped
- Find the worst-classified images (wrongly classified images with the highest response)
- Show the average image for each class (the mean of all correctly classified samples of that class)
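The first point can be sketched with made-up likelihood values (purely illustrative, not the TP's data): learning would be stopped at the epoch where the test log-likelihood peaks, before it starts decreasing again.

```python
import numpy as np

# hypothetical per-epoch test log-likelihoods (illustrative numbers only)
like_test = np.array([-2.0, -1.2, -0.8, -0.6, -0.55, -0.57, -0.60])

# stop at the epoch where the test log-likelihood is highest
best_epoch = int(np.argmax(like_test))
print("stop at epoch", best_epoch)
```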
```python
import pickle
import gzip

# load the MNIST dataset (pickled with latin1 encoding, a Python 2 legacy)
f = gzip.open('../../M1Pro-ML/mnist.pkl.gz', 'rb')
u = pickle._Unpickler(f)
u.encoding = 'latin1'
p = u.load()
train_set, valid_set, test_set = p
```
```python
Ns = 5000
X = normalize_X(train_set[0][:Ns, :], Ns)   # 785 x Ns (bias component first)
Xtest = normalize_X(test_set[0], 10000)
y = OneHotVec(train_set[1][:Ns], 10)        # 10 x Ns
ytest = OneHotVec(test_set[1][:10000], 10)
```
```python
# Full-batch update
w = np.random.random((10, X.shape[0]))
for t in range(10):
    w = updateWeights(w, X, y, η=0.1)
    print("t=", t, " score=", ComputeScore(w, X, train_set[1][:Ns]))
print("score=", ComputeScore(w, X, train_set[1][:Ns]))
```
```python
# implement mini-batches
def getMiniBatches(X, y, m, bs):
    # return the m-th mini-batch of size bs (columns m*bs ... (m+1)*bs - 1)
    return X[:, m*bs:(m+1)*bs], y[:, m*bs:(m+1)*bs]
```
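Note that getMiniBatches always visits the samples in the same order and silently drops the last incomplete batch. A common variant (a sketch, not required by the TP) reshuffles the columns at every epoch:

```python
import numpy as np

def shuffledMiniBatches(X, y, bs, rng=np.random.default_rng(0)):
    # yield mini-batches of bs columns each, in a fresh random order
    perm = rng.permutation(X.shape[1])
    for m in range(X.shape[1] // bs):
        idx = perm[m*bs:(m+1)*bs]
        yield X[:, idx], y[:, idx]
```

An epoch would then iterate `for Xb, yb in shuffledMiniBatches(X, y, bs): ...` instead of indexing by batch number.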
```python
# With mini-batch updates
w = np.random.random((10, X.shape[0]))
bs = 20
NB = int(X.shape[1] / bs)   # number of mini-batches per epoch
sc_train = np.array([])
sc_test = np.array([])
like_train = np.array([])
like_test = np.array([])
for t in range(50):
    for m in range(NB):
        Xb, yb = getMiniBatches(X, y, m, bs)
        w = updateWeights(w, Xb, yb, η=0.1)
    sc_train = np.append(sc_train, ComputeScore(w, X, train_set[1][:Ns]))
    sc_test = np.append(sc_test, ComputeScore(w, Xtest, test_set[1][:]))
    like_train = np.append(like_train, ComputeLike(w, X, y))
    like_test = np.append(like_test, ComputeLike(w, Xtest, ytest))
    print("t=", t, " score=", ComputeScore(w, X, train_set[1][:Ns]))
```
```python
# error rate (left) and log-likelihood (right) on the train and test sets
f, ax = plt.subplots(1, 2, figsize=(20, 5))
ax[0].plot(1 - sc_train)
ax[0].plot(1 - sc_test)
ax[1].plot(like_train)
ax[1].plot(like_test)
```
```python
# For each class, display the misclassified image with the highest (wrong) response
f, ax = plt.subplots(1, 10, figsize=(15, 5))
y_all = np.argmax(PerceptronOut(w, X), 0)
for k in range(10):
    id_ = np.where(train_set[1][:Ns] == k)[0]   # samples whose true label is k
    Pout_ = PerceptronOut(w, X)[:, id_]
    yout_ = y_all[id_]                          # predicted labels for those samples
    wrong_idx = np.where(yout_ != k)[0]         # misclassified samples of class k
    wrong = Pout_[:, wrong_idx]
    Xwrong = X[:, id_[wrong_idx]]
    id_max = np.where(np.max(wrong) == wrong)   # position of the highest response
    print(id_max[0][0])                         # class it was (wrongly) assigned to
    ax[k].imshow(Xwrong[1:, id_max[1][0]].reshape(28, 28))
```
```python
# Average image of the correctly classified samples for each class
f, ax = plt.subplots(1, 10, figsize=(15, 5))
for k in range(10):
    id_ = np.where(train_set[1][:Ns] == k)
    X_ = X[:, id_[0]]
    correct = np.where(np.argmax(PerceptronOut(w, X)[:, id_[0]], 0) == k)[0]
    mean_im = np.mean(X_[:, correct], 1)
    ax[k].imshow(mean_im[1:].reshape(28, 28))
```
```python
# visualizing the weights of each output neuron (skipping the bias component)
f, ax = plt.subplots(1, 10, figsize=(20, 10))
for i in range(10):
    ax[i].imshow(w[i, 1:].reshape(28, 28))
```
Adversarial Examples

Neural networks can show strong weaknesses against small but well-chosen perturbations. Let's consider the following loss function

$$ \mathcal{L} = \left\| \vec{y}_{target} - \vec{y}(\vec{x}) \right\|^2 $$

where $\vec{y}_{target}$ is the targeted class (the one toward which we want to bias the result). Now:

- Compute the gradient of this loss w.r.t. the image pixels
- Implement the gradient step and look at its effect on a randomly generated image
- Look at the obtained images... do they look like what you would expect?
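For the first step, one way to write out the gradient (a sketch, following the perceptron above where $\vec{y}(\vec{x}) = \mathrm{softmax}(W\vec{x})$): the softmax Jacobian is

$$ \frac{\partial y_k}{\partial x_i} = y_k \left( w_{ki} - \sum_j y_j \, w_{ji} \right) $$

so by the chain rule

$$ \frac{\partial \mathcal{L}}{\partial x_i} = -2 \sum_k \left( y_{target,k} - y_k(\vec{x}) \right) \frac{\partial y_k}{\partial x_i} $$

and the image is updated by descending this gradient, $x_i \leftarrow x_i - \eta \, \partial \mathcal{L}/\partial x_i$, with the factor 2 absorbed into the learning rate $\eta$.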
It is possible to obtain an even worse effect by considering the following loss, which tries to force the obtained image $\vec{x}$ to stay as close as possible to some chosen image:

$$ \mathcal{L} = \left\| \vec{y}_{target} - \vec{y}(\vec{x}) \right\|^2 + \lambda\left\| \vec{x}_{chosen} - \vec{x} \right\|^2 $$

where $\lambda$ is a parameter balancing the two terms. Now:
- Implement the new gradient
- Test your gradient by choosing $\vec{y}_{target}$ to be a different class from that of $\vec{x}_{chosen}$
```python
def UpdateClass(w, X, k, η=0.2):
    # one gradient step on the image X, increasing the response of class k
    y_target = np.zeros(w.shape[0])
    y_target[k] = 1
    y_x = PerceptronOut(w, X)
    act = np.exp(np.matmul(w, X))   # unnormalized activations
    Norm = np.sum(act, 0)
    # softmax Jacobian w.r.t. the pixels: dy_k/dx_i = y_k (w_ki - sum_j y_j w_ji)
    grad = w*(act.reshape(10, 1))/Norm - np.tensordot(act/(Norm**2), np.sum(w*act.reshape(10, 1), 0), 0)
    X = X + η*np.matmul(grad.T, (y_target - y_x))
    return X
```
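Optionally (not part of the TP), the softmax Jacobian used in UpdateClass, $\partial y_k/\partial x_i = y_k (w_{ki} - \sum_j y_j w_{ji})$, can be verified against finite differences on random data:

```python
import numpy as np

rng = np.random.default_rng(1)
K, D = 4, 6
w = rng.normal(size=(K, D))
x = rng.normal(size=D)

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

y = softmax(w @ x)
# analytic Jacobian: dy_k/dx_i = y_k (w_ki - sum_j y_j w_ji)
jac = y[:, None] * (w - (y @ w)[None, :])

# numerical Jacobian by central differences
eps = 1e-6
num = np.zeros((K, D))
for i in range(D):
    xp, xm = x.copy(), x.copy()
    xp[i] += eps
    xm[i] -= eps
    num[:, i] = (softmax(w @ xp) - softmax(w @ xm)) / (2 * eps)

print(np.max(np.abs(jac - num)))   # should be tiny (numerical noise)
```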
```python
# push a random image toward class k = 7
x = np.random.random(785)/100
k = 7
for t in range(50):
    x = UpdateClass(w, x, k, η=0.1)
plt.plot(PerceptronOut(w, x))         # response of the 10 neurons
plt.imshow(x[1:].reshape(28, 28))     # the obtained adversarial image
plt.imshow(w[3, 1:].reshape(28, 28))  # compare with the weights of one class
```
```python
def UpdateClass_withIm(w, X, xt, k, η=0.2, λ=0.1):
    # same gradient step, plus a penalty pulling X toward the chosen image xt
    y_target = np.zeros(w.shape[0])
    y_target[k] = 1
    y_x = PerceptronOut(w, X)
    act = np.exp(np.matmul(w, X))
    Norm = np.sum(act, 0)
    grad = w*(act.reshape(10, 1))/Norm - np.tensordot(act/(Norm**2), np.sum(w*act.reshape(10, 1), 0), 0)
    X = X + η*np.matmul(grad.T, (y_target - y_x)) - η*λ*(X - xt)
    return X
```
```python
# start from noise and push toward class 2 while staying close to the first training image
x = np.random.random(785)/100
for t in range(20):
    x = UpdateClass_withIm(w, x, X[:, 0], 2, η=0.1, λ=0.12)
plt.plot(np.arange(0, 10), PerceptronOut(w, x))
plt.imshow(x[1:].reshape(28, 28))
plt.colorbar()
```