Mnistauto 4
1. An Analysis of backprop.m of Hinton's
mnistdeepauto example
by Ali Riza SARAL
arsaral((at))yahoo.com
References:
Hinton’s «Lecture 12C _ Restricted Boltzmann Machines»
Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference»
Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»
2. @copyright
• % Version 1.000
• % Code provided by Ruslan Salakhutdinov and Geoff Hinton
• % Permission is granted for anyone to copy, use, modify, or distribute this
• % program and accompanying programs and documents for any purpose, provided
• % this copyright notice is retained and prominently displayed, along with
• % a note saying that the original programs are available from our web page.
• % The programs and documents are distributed without any warranty, express
• % or implied. As the programs were written for research purposes only, they
• % have not been tested to the degree that would be advisable in any important
• % application. All use of these programs is entirely at the user's own risk.
7. Initialization 4.
• %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
• l1=size(w1,1)-1; % 784
• l2=size(w2,1)-1; % 1000
• l3=size(w3,1)-1; % 500
• l4=size(w4,1)-1; % 250
• l5=size(w5,1)-1; % 30
• l6=size(w6,1)-1; % 250
• l7=size(w7,1)-1; % 500
• l8=size(w8,1)-1; % 1000
• l9=l1; % 784
• test_err=[];
• train_err=[];
• The weights are bidirectional: when the autoencoder is unrolled, the 4 encoder layers become 8 (encoder plus decoder), and the decoder layer sizes mirror the encoder sizes for the reverse (reconstruction) pass.
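The unrolling can be sketched numerically (a Python sketch, not the original MATLAB; the names mirror the l1..l9 variables above):

```python
# Encoder layer sizes of the 784-1000-500-250-30 autoencoder.
encoder = [784, 1000, 500, 250, 30]

# Unrolling mirrors the encoder to build the decoder, so 4 weight
# matrices become 8; the decoder sizes are the encoder sizes reversed.
layers = encoder + encoder[-2::-1]   # [784,1000,500,250,30,250,500,1000,784]
l1, l2, l3, l4, l5, l6, l7, l8, l9 = layers
```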
8. Epoch loop
• for epoch = 1:maxepoch
• %%%%%%%%%%%% COMPUTE TRAINING RECONSTRUCTION ERROR
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• %%%%%%%%%%% COMPUTE TEST RECONSTRUCTION ERROR
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• PERFORM CONJUGATE GRADIENT LOOP
• end
9. Conjugate Gradient Loop
• for batch = 1:numbatches/10
• %%%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• %%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8
• save mnist_error test_err train_err;
• end (of the epoch loop)
10. Call mnistdisp.m
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• fprintf(1,'Displaying in figure 1: Top row - real data, Bottom row -- reconstructions\n');
• output=[]; % concatenate the digits into output
• for ii=1:15 % take only the first 15 digits (30 columns in fact: data + reconstruction)
• output = [output data(ii,1:end-1)' dataout(ii,:)']; % append one real digit and its reconstruction, 784x1 each
• end % the training digit comes first, then the corresponding reconstruction
• if epoch==1 %Manage figure positioning etc.
• close all
• figure('Position',[100,600,1000,200]);
• else
• figure(1)
• end
• mnistdisp(output); % prepare data to be displayed and display
• drawnow;
11. Mnistdisp.m 1
• function [err] = mnistdisp(digits); % 784x30
• % display a group of MNIST images
• col=28;row=28;
• [dd,N] = size(digits); % 784x30 N=30;
• imdisp=zeros(2*28,ceil(N/2)*28);
• % 56 x 420 pixel picture: 56 rows = two digits stacked over each other (28+28), 420 columns = 15 digits adjacent to each other (15*28 = 420)
12. Mnistdisp.m 2
• for nn=1:N % 1:30
• ii=rem(nn,2);
• if(ii==0) ii=2; end % ii is the tile row: nn=1->1, 2->2, 3->1, 4->2
• jj=ceil(nn/2); % jj is the digit's column position in the picture: nn=1->1, 2->1, 3->2, 4->2
• img1 = reshape(digits(:,nn),row,col);
• % reshape((784x1),28,28) = 28x28; reshapes digit nn in the loop, there are 30 digit columns of length 784
• img2(((ii-1)*row+1):(ii*row),((jj-1)*col+1):(jj*col))=img1';
• % img2(row range, column range) = img1' -- the row and column ranges to be updated with the reshaped digit image
• % nn=1 -> img2(1:row, 1:col) = img1'
• % nn=2 -> img2(row+1:2*row, 1:col) = img1'
• end
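The tiling loop above can be mirrored in numpy (a sketch, with random stand-in data; `order='F'` reproduces MATLAB's column-major reshape, and the `ii`/`jj` arithmetic copies the MATLAB expressions):

```python
import numpy as np

np.random.seed(0)
row = col = 28
N = 30                                   # 15 real digits + 15 reconstructions
digits = np.random.rand(row * col, N)    # stand-in for the 784x30 input

# 56 x 420 canvas: 2 tile rows, ceil(N/2) tile columns.
canvas = np.zeros((2 * row, ((N + 1) // 2) * col))
for nn in range(1, N + 1):               # 1-based, like the MATLAB loop
    ii = nn % 2 or 2                     # rem(nn,2), with 0 mapped to 2 (tile row)
    jj = -(-nn // 2)                     # ceil(nn/2) (tile column)
    img1 = digits[:, nn - 1].reshape(row, col, order='F')  # column-major reshape
    canvas[(ii - 1) * row:ii * row, (jj - 1) * col:jj * col] = img1.T
```

Digit 1 lands in the top-left tile, digit 2 directly below it, digit 3 in the next tile column, and so on.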
14. Minimize.m
• function [X, fX, i] = minimize(X, f, length, varargin)
• % Minimize a differentiable multivariate function.
• %
• % Usage: [X, fX, i] = minimize(X, f, length, P1, P2, P3, ...)
• %
• % where the starting point is given by "X" (D by 1), and the function named in
• % the string "f" must return a function value and a vector of partial
• % derivatives of f wrt X.
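The calling convention can be illustrated with a toy objective (a Python sketch; minimize.m's actual conjugate-gradient line searches are not reproduced here, only the "f returns value and gradient" contract, exercised with plain gradient descent):

```python
import numpy as np

def quad(X):
    """Toy objective in minimize.m's style: return f(X) and df/dX."""
    f = float(np.sum((X - 3.0) ** 2))    # minimum at X = 3
    df = 2.0 * (X - 3.0)                 # partial derivatives of f wrt X
    return f, df

# Stand-in for the optimizer loop: plain gradient descent on (f, df).
X = np.zeros(4)
for _ in range(200):
    f, df = quad(X)
    X -= 0.1 * df
```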
15. Backprop.m calling minimize.m
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• max_iter=3;
• VV = [w1(:)' w2(:)' w3(:)' w4(:)' w5(:)' w6(:)' w7(:)' w8(:)']';
• Dim = [l1; l2; l3; l4; l5; l6; l7; l8; l9];
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
• VV is the starting point "X" passed to minimize.m.
• The function named in the string "f" is 'CG_MNIST'.
16. Minimize.m
• There is an older version of this program in the coursera-ml-master package's mlclass-ex5 (and others) under the name fmincg:
• % Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13
• Minimize.m is the newer and better-documented version:
• % Copyright (C) 2001 - 2006 by Carl Edward Rasmussen (2006-09-08).
17. Minimize.m 1
• function [X, fX, i] = minimize(X, f, length, varargin)
• The function returns the found solution "X", a vector of function values "fX" indicating the progress made, and "i", the number of iterations used.
23. CG_MNIST 3.2
• w1probs, w2probs and w3probs are computed with the sigmoid function, whereas the code layer w4probs is linear (Gaussian units).
• A forward pass and a backward pass are done for the reconstruction.
• Processing is done for 1000 batch input items (10 minibatches of 100).
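The forward reconstruction pass can be sketched in numpy (toy layer sizes stand in for 784-1000-500-250-30-...-784; the bias-column handling and the linear fourth layer follow the description above, the names are mine):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(1)
sizes = [6, 5, 4, 3, 2, 3, 4, 5, 6]      # stand-in for the unrolled l1..l9
ws = [np.random.randn(a + 1, b) * 0.1    # +1 row per matrix for the bias
      for a, b in zip(sizes[:-1], sizes[1:])]

N = 10
XX = np.random.rand(N, sizes[0])
acts = np.hstack([XX, np.ones((N, 1))])  # append a bias column, as in CG_MNIST
for k, w in enumerate(ws):
    z = acts @ w
    # the 4th layer (the code layer) is linear ("Gaussian"); the rest sigmoid
    out = z if k == 3 else sigmoid(z)
    acts = np.hstack([out, np.ones((N, 1))])
XXout = acts[:, :-1]                     # reconstruction of XX
```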
25. CG_MNIST 4.1
• f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout)));
• % (1000x784) .* (1000x784) --> 1x1
• This is the function value that the 'function' CG_MNIST returns to minimize for this data batch of 1000 cases.
• Note the similarity of this algorithm to lrCostFunction.m of coursera-ml-master mlclass-ex3.
26. CG_MNIST 4.2
• f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout)));
• vs.
• h_of_x = sigmoid(X * theta);
• J = 1 / m * sum( -1 * y' * log(h_of_x) - (1-y') * log(1 - h_of_x) );
• This is practically a cost function.
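The parallel can be checked numerically (a numpy sketch with random stand-in data; the CG_MNIST cost is the logistic-regression cost summed over all 784 output pixels instead of a single output):

```python
import numpy as np

np.random.seed(2)
N = 1000
XX = (np.random.rand(N, 784) > 0.5).astype(float)         # stand-in binary data
XXout = np.clip(np.random.rand(N, 784), 1e-6, 1 - 1e-6)   # stand-in reconstruction

# Cross-entropy reconstruction error, as in CG_MNIST (bias column already dropped):
f = -1.0 / N * np.sum(XX * np.log(XXout) + (1 - XX) * np.log(1 - XXout))

# The logistic-regression cost has the same shape for one output column:
y, h_of_x = XX[:, 0], XXout[:, 0]
J = 1.0 / N * np.sum(-y * np.log(h_of_x) - (1 - y) * np.log(1 - h_of_x))
```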
27. CG_MNIST 5.1
• IO = 1/N*(XXout-XX(:,1:end-1)); % 1000x784
• Ix8=IO; % 1000x784
• dw8 = w7probs'*Ix8; % 1001x1000 * 1000x784 = 1001x784
• The difference between XXout and XX (the data) is divided by the number of cases, and this error signal is multiplied backward with w7probs, which gives dw8, the gradient contributed by this layer. The error signal of each step is similarly multiplied backward with each layer's activations and weights, and the gradient of that layer is found.
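That (output − target)/N times the previous layer's activations is indeed the cross-entropy gradient for a sigmoid output layer can be verified against a numerical gradient (a numpy sketch with tiny stand-in sizes; the variable names echo w7probs, Ix8 and dw8 above):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(3)
N, d_prev, d_out = 8, 5, 4
w7probs = np.hstack([np.random.rand(N, d_prev), np.ones((N, 1))])  # activations + bias
w8 = np.random.randn(d_prev + 1, d_out) * 0.1
XX = (np.random.rand(N, d_out) > 0.5).astype(float)                # targets

def cost(w):
    XXout = sigmoid(w7probs @ w)
    return -1.0 / N * np.sum(XX * np.log(XXout) + (1 - XX) * np.log(1 - XXout))

# Analytic gradient, exactly as in CG_MNIST:
XXout = sigmoid(w7probs @ w8)
Ix8 = (XXout - XX) / N
dw8 = w7probs.T @ Ix8

# Numerical gradient for one weight, to confirm the formula:
eps = 1e-6
wp, wm = w8.copy(), w8.copy()
wp[0, 0] += eps
wm[0, 0] -= eps
num = (cost(wp) - cost(wm)) / (2 * eps)
```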
30. CG_MNIST 5.4
• df = [dw1(:)' dw2(:)' dw3(:)' dw4(:)' dw5(:)' dw6(:)' dw7(:)' dw8(:)']'; % 2837314x1
• Finally, the df return parameter of CG_MNIST is set.
31. Minimize.m to backprop.m connection
• Minimize.m returns the minimized version of VV in the variable X:
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
32. End of backprop.m
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); xxx = (l1+1)*l2;
• w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3); xxx = xxx+(l2+1)*l3;
• w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4); xxx = xxx+(l3+1)*l4;
• w4 = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); xxx = xxx+(l4+1)*l5;
• w5 = reshape(X(xxx+1:xxx+(l5+1)*l6),l5+1,l6); xxx = xxx+(l5+1)*l6;
• w6 = reshape(X(xxx+1:xxx+(l6+1)*l7),l6+1,l7); xxx = xxx+(l6+1)*l7;
• w7 = reshape(X(xxx+1:xxx+(l7+1)*l8),l7+1,l8); xxx = xxx+(l7+1)*l8;
• w8 = reshape(X(xxx+1:xxx+(l8+1)*l9),l8+1,l9);
• Resetting the weight values according to the X value returned by minimize.m.
• save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8
• save mnist_error test_err train_err;
• Save the w values and the error values for this epoch.
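The pack/unpack pair (VV = [w1(:)' ... w8(:)']' on the way in, the reshape chain on the way out) can be checked as a roundtrip in numpy (a sketch with toy layer sizes; `order='F'` mirrors MATLAB's column-major `(:)` and `reshape`):

```python
import numpy as np

np.random.seed(4)
sizes = [6, 5, 4, 3, 2, 3, 4, 5, 6]                # stand-in for l1..l9
ws = [np.random.randn(a + 1, b)                    # +1 row for the bias
      for a, b in zip(sizes[:-1], sizes[1:])]

# Pack: w(:) in MATLAB is column-major, i.e. order='F' in numpy.
X = np.concatenate([w.ravel(order='F') for w in ws])

# Unpack, mirroring the reshape chain at the end of backprop.m:
unpacked, xxx = [], 0
for a, b in zip(sizes[:-1], sizes[1:]):
    n = (a + 1) * b
    unpacked.append(X[xxx:xxx + n].reshape(a + 1, b, order='F'))
    xxx += n
```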