An Analysis of backprop.m of Hinton’s
mnistdeepauto example
by Ali Riza SARAL
arsaral((at))yahoo.com
References:
Hinton’s «Lecture 12C _ Restricted Boltzmann Machines»
Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference»
Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»
@copyright
• % Version 1.000
• % Code provided by Ruslan Salakhutdinov and Geoff Hinton
• % Permission is granted for anyone to copy, use, modify, or distribute this program and accompanying programs and documents for any purpose, provided this copyright notice is retained and prominently displayed, along with a note saying that the original programs are available from our web page.
• % The programs and documents are distributed without any warranty, express or implied. As the programs were written for research purposes only, they have not been tested to the degree that would be advisable in any important application. All use of these programs is entirely at the user's own risk.
Initialization 1.
• maxepoch=1; % maxepoch=200;
• fprintf(1,'\nFine-tuning deep autoencoder by minimizing cross entropy error. \n');
• fprintf(1,'60 batches of 1000 cases each. \n');
• load mnistvh
• load mnisthp
• load mnisthp2
• load mnistpo
mnistdeepauto
• hidrecbiases=hidbiases; % 1x1000
• save mnistvh vishid hidrecbiases visbiases; %784x1000 1x1000 1x784
• fprintf(1,'\nPretraining Layer 2 with RBM: %d-%d \n',numhid,numpen); % 1000, 500
• ...
• hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases;
• save mnisthp hidpen penrecbiases hidgenbiases; % 1000x500 1x500 1x1000
• fprintf(1,'\nPretraining Layer 3 with RBM: %d-%d \n',numpen,numpen2); % 500, 250
• ...
• hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases;
• save mnisthp2 hidpen2 penrecbiases2 hidgenbiases2; % 500x250 1x250 1x500
• fprintf(1,'\nPretraining Layer 4 with RBM: %d-%d \n',numpen2,numopen); % 250, 30
• ...
• hidtop=vishid; toprecbiases=hidbiases; topgenbiases=visbiases;
• save mnistpo hidtop toprecbiases topgenbiases; % 250x30 1x30 1x250
Backprop Initialization 2.
• makebatches;
• [numcases numdims numbatches]=size(batchdata); % 100x784x600
• N=numcases;
Initialization 3.
• %%%% PREINITIALIZE WEIGHTS OF THE AUTOENCODER
• w1=[vishid; hidrecbiases]; % 784x1000 ; 1x1000 = 785x1000
• w2=[hidpen; penrecbiases]; % 1000x500 ; 1x500 = 1001x500
• w3=[hidpen2; penrecbiases2]; %500x250 ; 1x250 = 501x250
• w4=[hidtop; toprecbiases]; % 250x30 ; 1x30 = 251x30
• w5=[hidtop'; topgenbiases]; % 30x250 ; 1x250 = 31x250
• w6=[hidpen2'; hidgenbiases2]; % 250x500 ; 1x500 = 251x500
• w7=[hidpen'; hidgenbiases]; % 500x1000 ; 1x1000 = 501x1000
• w8=[vishid'; visbiases]; % 1000x784 ; 1x784 = 1001x784
• %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
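The pattern w = [weights; biases] appends the bias row as the last row of each weight matrix, so that a single matrix multiply with inputs padded by a column of ones applies both weights and biases at once. A minimal NumPy sketch of the idea (toy shapes for illustration; vishid and hidrecbiases stand in for the pretrained parameters, whose real shapes are 784x1000 and 1x1000):

```python
import numpy as np

# Toy stand-ins for the pretrained RBM parameters.
vishid = np.ones((3, 2))          # visible-to-hidden weights
hidrecbiases = np.zeros((1, 2))   # hidden biases as a row vector

# Stack the bias row under the weights, as in w1=[vishid; hidrecbiases].
w1 = np.vstack([vishid, hidrecbiases])           # (3+1) x 2

# Pad the data with a column of ones so the last weight row acts as the bias.
data = np.ones((5, 3))
act = np.hstack([data, np.ones((5, 1))]) @ w1    # = data@vishid + hidrecbiases
```

This is why every wN in the list has one more row than the layer it maps from.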
Initialization 4.
• %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
• l1=size(w1,1)-1; % 784
• l2=size(w2,1)-1; % 1000
• l3=size(w3,1)-1; % 500
• l4=size(w4,1)-1; % 250
• l5=size(w5,1)-1; % 30
• l6=size(w6,1)-1; % 250
• l7=size(w7,1)-1; % 500
• l8=size(w8,1)-1; % 1000
• l9=l1; % 784
• test_err=[];
• train_err=[];
• The weights are used bidirectionally: unrolling the autoencoder turns the 4 pretrained layers into 8, and the transposed matrices keep the same sizes for the reverse (decoding) pass.
Epoch loop
• for epoch = 1:maxepoch
• %%%%%%%%%%%% COMPUTE TRAINING RECONSTRUCTION ERROR
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• %%%%%%%%%%% COMPUTE TEST RECONSTRUCTION ERROR
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• PERFORM CONJUGATE GRADIENT LOOP
• end
Conjugate Gradient Loop
• for batch = 1:numbatches/10
• %%%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH
• %%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• end
• save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8
• save mnist_error test_err train_err;
• end % of epoch loop
Call mnistdisp.m
• %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS
• fprintf(1,'Displaying in figure 1: Top row - real data, Bottom row -- reconstructions \n');
• output=[]; % concatenate the digits in output
• for ii=1:15 % take only the first 15 digits (30 columns in fact)
output = [output data(ii,1:end-1)' dataout(ii,:)']; % 784x1 ++ 784x1 per digit
• end % the training digit first, then the corresponding reconstruction
• if epoch==1 %Manage figure positioning etc.
• close all
• figure('Position',[100,600,1000,200]);
• else
• figure(1)
• end
• mnistdisp(output); % prepare data to be displayed and display
• drawnow;
Mnistdisp.m 1
• function [err] = mnistdisp(digits); % 784x30
• % display a group of MNIST images
• col=28;row=28;
• [dd,N] = size(digits); % 784x30 N=30;
• imdisp=zeros(2*28,ceil(N/2)*28);
• % the picture is 56 x 420 pixels: 56 rows = two digits stacked on each other (28+28), 420 columns = 15 digits side by side (15*28 = 420)
Mnistdisp.m 2
• for nn=1:N % 1:30
• ii=rem(nn,2);
• if(ii==0) ii=2; end % ii is the row index: rem(1,2)=1 -> 1, rem(2,2)=0 -> 2, 3 -> 1, 4 -> 2
• jj=ceil(nn/2); % jj is the digit's column position in the picture: ceil(1/2)=1, ceil(2/2)=1, ceil(3/2)=2, ceil(4/2)=2
• img1 = reshape(digits(:,nn),row,col);
• % reshape((784x1),28,28) = 28x28; iteration nn reshapes digit nn; there are 30 digit columns of length 784
• img2(((ii-1)*row+1):(ii*row),((jj-1)*col+1):(jj*col))=img1';
• % nn=1 -> img2(0+1:1*row, 0+1:1*col) = img1'
• % img2(row range, column range) = img1' ... the indices select the row and column range to be updated with the digit image in img1
• % nn=2 -> img2(1*row+1:2*row, 0+1:1*col) = img1'
• end
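The rem/ceil bookkeeping above can be mirrored in a short sketch (Python here for illustration; tile_position is a hypothetical helper, kept 1-based like MATLAB):

```python
import math

def tile_position(nn):
    """Row (1 or 2) and column of digit nn in the display grid, 1-based."""
    ii = nn % 2              # rem(nn,2)
    if ii == 0:
        ii = 2               # even nn goes to the second (reconstruction) row
    jj = math.ceil(nn / 2)   # two digits (data + reconstruction) per column
    return ii, jj

# nn=1 -> (1,1), nn=2 -> (2,1), nn=3 -> (1,2), nn=4 -> (2,2)
```

Odd columns of the input land in the top row, even columns directly below them.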
Mnistdisp.m 3
• imagesc(img2,[0 1]);
• colormap gray;
• axis equal;
• axis off;
• drawnow;
• err=0; % not used
Minimize.m
• function [X, fX, i] = minimize(X, f, length, varargin)
• % Minimize a differentiable multivariate function.
• % Usage: [X, fX, i] = minimize(X, f, length, P1, P2, P3, ... )
• % where the starting point is given by "X" (D by 1), and the function named in the string "f" must return a function value and a vector of partial derivatives of f wrt X
Backprop.m calling minimize.m
• %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES
• max_iter=3;
• VV = [w1(:)' w2(:)' w3(:)' w4(:)' w5(:)' w6(:)' w7(:)' w8(:)']';
• Dim = [l1; l2; l3; l4; l5; l6; l7; l8; l9];
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
• VV is the starting point X in minimize.m.
• The function named in the string "f" is 'CG_MNIST'.
Minimize.m
• An older version of this program ships with the coursera-ml-master package's mlclass-ex5 (and other exercises) under the name fmincg:
• % Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13
• minimize.m is the newer and better explained/documented version:
• % Copyright (C) 2001 - 2006 by Carl Edward Rasmussen (2006-09-08)
Minimize.m 1
• function [X, fX, i] = minimize(X, f, length,
varargin)
• The function returns the found solution "X", a vector of function values "fX" indicating the progress made, and "i", the number of iterations used.
Minimize.m 2
• Backprop.m:
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
• Minimize.m:
• function [X, fX, i] = minimize(X, f, length, varargin)
• CG_MNIST.m:
• function [f, df] = CG_MNIST(VV,Dim,XX);
Minimize.m 3
• Backprop.m:
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
• X is the minimized VV.
• fX is the vector of function values f that CG_MNIST returns during the line searches; CG_MNIST also returns the gradient df = [dw1(:)' dw2(:)' dw3(:)' dw4(:)' dw5(:)' dw6(:)' dw7(:)' dw8(:)']'.
• max_iter=3
• Dim in backprop.m = [l1; l2; l3; l4; l5; l6; l7; l8; l9];
• data in backprop.m: data=[]; for kk=1:10 data=[data; batchdata(:,:,(tt-1)*10+kk)]; end
• 10 minibatches combined into 1 larger minibatch.
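The minibatch-combining loop stacks 10 slices of batchdata along the case dimension. A NumPy sketch of the same operation (toy sizes; the real batchdata is 100x784x600, and tt indexes the group of 10 minibatches):

```python
import numpy as np

# Toy stand-in for batchdata (cases x dims x batches); real shape is 100x784x600.
numcases, numdims, numbatches = 4, 6, 20
batchdata = np.random.rand(numcases, numdims, numbatches)

tt = 1  # first group of 10 minibatches (1-based, as in MATLAB)
# MATLAB: data=[]; for kk=1:10 data=[data; batchdata(:,:,(tt-1)*10+kk)]; end
data = np.vstack([batchdata[:, :, (tt - 1) * 10 + kk] for kk in range(10)])
# 10 minibatches of `numcases` rows stacked into one larger batch
```

With the real sizes this yields the 1000x784 batch that CG_MNIST receives as XX.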
CG_MNIST 1
• function [f, df] = CG_MNIST(VV,Dim,XX);
• l1 = Dim(1); % 784
• l2 = Dim(2); % 1000
• l3 = Dim(3); % 500
• l4= Dim(4); % 250
• l5= Dim(5); % 30
• l6= Dim(6); % 250
• l7= Dim(7); % 500
• l8= Dim(8); % 1000
• l9= Dim(9); % 784
• N = size(XX,1); % 1000
• Set the layer sizes. XX is the combined batch data of 10 x 100 = 1000 cases.
CG_MNIST 2
• w1 = reshape(VV(1:(l1+1)*l2),l1+1,l2); % 785x1000
• xxx = (l1+1)*l2; % 785000
• w2 = reshape(VV(xxx+1:xxx+(l2+1)*l3),l2+1,l3); % 1001x500
• xxx = xxx+(l2+1)*l3; % 1285500
• w3 = reshape(VV(xxx+1:xxx+(l3+1)*l4),l3+1,l4); % 501x250
• xxx = xxx+(l3+1)*l4;
• w4 = reshape(VV(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 251x30
• xxx = xxx+(l4+1)*l5;
• w5 = reshape(VV(xxx+1:xxx+(l5+1)*l6),l5+1,l6); % 31x250
• xxx = xxx+(l5+1)*l6;
• w6 = reshape(VV(xxx+1:xxx+(l6+1)*l7),l6+1,l7); % 251x500
• xxx = xxx+(l6+1)*l7;
• w7 = reshape(VV(xxx+1:xxx+(l7+1)*l8),l7+1,l8); % 501x1000
• xxx = xxx+(l7+1)*l8;
• w8 = reshape(VV(xxx+1:xxx+(l8+1)*l9),l8+1,l9); % 1001x784
• Extract the weight matrices between successive layers from the flat parameter vector VV.
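The repeated reshape/offset bookkeeping can be written generically. A NumPy sketch (unpack_weights is an illustrative helper, not part of the original code; order='F' mirrors MATLAB's column-major reshape):

```python
import numpy as np

def unpack_weights(VV, dims):
    """Split a flat parameter vector into (dims[i]+1) x dims[i+1] matrices,
    mirroring CG_MNIST's reshape/offset pattern (column-major like MATLAB)."""
    ws, xxx = [], 0
    for li, lo in zip(dims[:-1], dims[1:]):
        n = (li + 1) * lo
        ws.append(VV[xxx:xxx + n].reshape(li + 1, lo, order='F'))
        xxx += n
    return ws

dims = [784, 1000, 500, 250, 30, 250, 500, 1000, 784]   # l1..l9
total = sum((a + 1) * b for a, b in zip(dims[:-1], dims[1:]))
VV = np.zeros(total)
ws = unpack_weights(VV, dims)
# ws[0] is 785x1000 (w1), ..., ws[7] is 1001x784 (w8)
```

The running offset plays the role of xxx in the MATLAB code, and the total length matches the 2837314x1 df vector noted later.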
CG_MNIST 3.1
• XX = [XX ones(N,1)]; % 1000x785
• w1probs = 1./(1 + exp(-XX*w1));
• w1probs = [w1probs ones(N,1)]; % 1000x785 * 785x1000 = 1000x1000 -> 1000x1001
• w2probs = 1./(1 + exp(-w1probs*w2));
• w2probs = [w2probs ones(N,1)]; % 1000x1001 * 1001x500 = 1000x500 -> 1000x501
• w3probs = 1./(1 + exp(-w2probs*w3));
• w3probs = [w3probs ones(N,1)]; % 1000x501 * 501x250 = 1000x250 -> 1000x251
• w4probs = w3probs*w4;
• w4probs = [w4probs ones(N,1)]; % 1000x251 * 251x30 = 1000x30 -> 1000x31
CG_MNIST 3.2
• w1probs, w2probs and w3probs are computed with the sigmoid function, whereas w4probs, the 30-unit code layer, is linear (Gaussian units).
• A forward pass and a backward pass are performed for the reconstruction.
• Processing is done for the 1000 batch input cases (10x100).
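The forward pass, sigmoid everywhere except the linear code layer, can be sketched in NumPy (toy layer sizes; `forward` is an illustrative helper, and each layer appends a ones column for the bias exactly as the MATLAB code does):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(XX, weights, linear_layer):
    """Propagate XX through the unrolled autoencoder; layer `linear_layer`
    (0-based) is the linear code layer, all the others are sigmoid."""
    N = XX.shape[0]
    h = np.hstack([XX, np.ones((N, 1))])          # XX = [XX ones(N,1)]
    for i, w in enumerate(weights[:-1]):
        z = h @ w
        h = z if i == linear_layer else sigmoid(z)
        h = np.hstack([h, np.ones((N, 1))])       # append the bias column
    return sigmoid(h @ weights[-1])               # XXout, the reconstruction

# Toy net: 4 -> 3 -> 2 (linear code) -> 3 -> 4, bias row in each matrix.
rng = np.random.default_rng(0)
dims = [4, 3, 2, 3, 4]
weights = [rng.standard_normal((a + 1, b)) * 0.1
           for a, b in zip(dims[:-1], dims[1:])]
XXout = forward(rng.random((5, 4)), weights, linear_layer=1)
```

In the real network the linear layer is the fourth (w4, the 30-unit code), and the output XXout is 1000x784.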
CG_MNIST 3.3
• w5probs = 1./(1 + exp(-w4probs*w5));
• w5probs = [w5probs ones(N,1)]; % 1000x31 * 31x250 = 1000x250 -> 1000x251
• w6probs = 1./(1 + exp(-w5probs*w6));
• w6probs = [w6probs ones(N,1)]; % 1000x251 * 251x500 = 1000x500 -> 1000x501
• w7probs = 1./(1 + exp(-w6probs*w7));
• w7probs = [w7probs ones(N,1)]; % 1000x501 * 501x1000 = 1000x1000 -> 1000x1001
• XXout = 1./(1 + exp(-w7probs*w8)); % 1000x1001 * 1001x784 = 1000x784
CG_MNIST 4.1
• f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout)));
• % 1000x784 .* 1000x784 --> 1x1 (the bias column of XX is dropped)
• This is the function value f that the 'function' CG_MNIST returns to minimize for this data batch of 1000 cases.
• Looked at carefully, this expression closely resembles lrCostFunction.m of coursera-ml-master's mlclass-ex3.
CG_MNIST 4.2
• f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout)));
• vs.
• h_of_x = sigmoid(X * theta);
• J = 1 / m * sum( -1 * y' * log(h_of_x) - (1-y') * log(1 - h_of_x) );
• This is, in effect, a cross-entropy cost function.
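Both forms compute an average cross-entropy between targets and predictions. A small NumPy check of the backprop.m expression (a sketch: the eps clipping is an addition here to avoid log(0), which the original code does not guard against):

```python
import numpy as np

def cross_entropy(XX, XXout, eps=1e-12):
    """f = -1/N * sum(sum( XX.*log(XXout) + (1-XX).*log(1-XXout) )):
    summed over all pixels, averaged over the N cases."""
    N = XX.shape[0]
    XXout = np.clip(XXout, eps, 1 - eps)   # added guard, not in the original
    return -np.sum(XX * np.log(XXout) + (1 - XX) * np.log(1 - XXout)) / N

XX = np.array([[1.0, 0.0], [0.0, 1.0]])
perfect = cross_entropy(XX, XX)                  # near 0: perfect reconstruction
worse = cross_entropy(XX, np.full((2, 2), 0.5))  # 2*log(2) per case here
```

A perfect reconstruction drives f toward 0; an uninformative 0.5 output costs log(2) per pixel.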
CG_MNIST 5.1
• IO = 1/N*(XXout-XX(:,1:end-1)); % 1000x784
• Ix8=IO; % 1000x784
• dw8 = w7probs'*Ix8;
• % 1001x1000 * 1000x784 = 1001x784
• The difference between XXout and XX (the data), divided by the number of cases, gives the output error IO. Multiplying it back with w7probs' yields dw8, the gradient contributed by this layer. In the same way, the error Ix at each step is multiplied back through each layer's weights and activations to obtain that layer's gradient.
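One step of this pattern in NumPy (a sketch with toy shapes): the gradient for a layer is the previous layer's activations (including the bias column) transposed, times the incoming error; the error is then pushed back through the weights and the sigmoid derivative, and the bias column is dropped before the next step.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 5                                    # cases
w8 = rng.standard_normal((4, 3))         # (hidden+bias) x visible, toy 3+1 -> 3
w7probs = np.hstack([rng.random((N, 3)), np.ones((N, 1))])  # acts + bias column

XX = rng.random((N, 3))                  # data
XXout = rng.random((N, 3))               # reconstruction

IO = (XXout - XX) / N                    # output error, as in CG_MNIST
dw8 = w7probs.T @ IO                     # gradient for the top weights: 4x3

# Push the error one layer back: through w8', times the sigmoid derivative,
# then drop the bias column before computing the next gradient.
Ix7 = (IO @ w8.T) * w7probs * (1 - w7probs)
Ix7 = Ix7[:, :-1]
```

Repeating this pairing (dwN from activations' x error, then IxN-1 from error x weights') reproduces the whole chain dw8 down to dw1.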
CG_MNIST 5.2
• Ix7 = (Ix8*w8').*w7probs.*(1-w7probs);
• % 1000x784 * 784x1001 .* 1000x1001 .* (1 - 1000x1001) = 1000x1001
• Ix7 = Ix7(:,1:end-1); % 1000x1000
• dw7 = w6probs'*Ix7; % 501x1000 * 1000x1000 = 501x1000
• Ix6 = (Ix7*w7').*w6probs.*(1-w6probs); % 1000x1000 * 1000x501 = 1000x501
• Ix6 = Ix6(:,1:end-1); % 1000x500
• dw6 = w5probs'*Ix6; % 251x500
• Ix5 = (Ix6*w6').*w5probs.*(1-w5probs); % 1000x500 * 500x251 = 1000x251
• Ix5 = Ix5(:,1:end-1); % 1000x250
• dw5 = w4probs'*Ix5; % 31x250
• Ix4 = (Ix5*w5'); % 1000x250 * 250x31 = 1000x31
• Ix4 = Ix4(:,1:end-1); % 1000x30
• dw4 = w3probs'*Ix4; % 251x1000 * 1000x30 = 251x30
CG_MNIST 5.3
• Ix3 = (Ix4*w4').*w3probs.*(1-w3probs); % 1000x30 * 30x251 .* 1000x251 = 1000x251
• Ix3 = Ix3(:,1:end-1); % 1000x250
• dw3 = w2probs'*Ix3; % 501x250
• Ix2 = (Ix3*w3').*w2probs.*(1-w2probs); % 1000x250 * 250x501 .* 1000x501 = 1000x501
• Ix2 = Ix2(:,1:end-1); % 1000x500
• dw2 = w1probs'*Ix2; % 1001x500
• Ix1 = (Ix2*w2').*w1probs.*(1-w1probs); % 1000x500 * 500x1001 = 1000x1001
• Ix1 = Ix1(:,1:end-1); % 1000x1000
• dw1 = XX'*Ix1; % 785x1000
CG_MNIST 5.4
• df = [dw1(:)' dw2(:)' dw3(:)' dw4(:)' dw5(:)' dw6(:)' dw7(:)' dw8(:)' ]'; % 2837314x1
• Finally the df return parameter of CG_MNIST is set.
Minimize.m to backprop.m connection
• Minimize.m returns the minimized version of
VV in the variable X.
• [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
End of backprop.m
• w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); xxx = (l1+1)*l2;
• w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3); xxx = xxx+(l2+1)*l3;
• w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4); xxx = xxx+(l3+1)*l4;
• w4 = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); xxx = xxx+(l4+1)*l5;
• w5 = reshape(X(xxx+1:xxx+(l5+1)*l6),l5+1,l6); xxx = xxx+(l5+1)*l6;
• w6 = reshape(X(xxx+1:xxx+(l6+1)*l7),l6+1,l7); xxx = xxx+(l6+1)*l7;
• w7 = reshape(X(xxx+1:xxx+(l7+1)*l8),l7+1,l8); xxx = xxx+(l7+1)*l8;
• w8 = reshape(X(xxx+1:xxx+(l8+1)*l9),l8+1,l9);
• Resetting the weight values according to the X value returned by
minimize.m
• save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8
• save mnist_error test_err train_err;
• Save the weight values and error values for this epoch.

Mais conteúdo relacionado

Semelhante a Mnistauto 4

Regression Analysis , A statistical approch to analysis data.pptx
Regression Analysis , A statistical  approch to analysis data.pptxRegression Analysis , A statistical  approch to analysis data.pptx
Regression Analysis , A statistical approch to analysis data.pptxTifahInternational
 
curve fitting or regression analysis-1.pptx
curve fitting or regression analysis-1.pptxcurve fitting or regression analysis-1.pptx
curve fitting or regression analysis-1.pptxabelmeketa
 
Introduction to Functional Programming with Haskell and JavaScript
Introduction to Functional Programming with Haskell and JavaScriptIntroduction to Functional Programming with Haskell and JavaScript
Introduction to Functional Programming with Haskell and JavaScriptWill Kurt
 
FAIQ MANUAL.pdf
FAIQ MANUAL.pdfFAIQ MANUAL.pdf
FAIQ MANUAL.pdfFaiqAli57
 
Different Types of Machine Learning Algorithms
Different Types of Machine Learning AlgorithmsDifferent Types of Machine Learning Algorithms
Different Types of Machine Learning Algorithmsrahmedraj93
 
令和から本気出す
令和から本気出す令和から本気出す
令和から本気出すTakashi Kitano
 
Growing a Compiler: Getting to machine learning from a general purpose compiler
Growing a Compiler: Getting to machine learning from a general purpose compilerGrowing a Compiler: Getting to machine learning from a general purpose compiler
Growing a Compiler: Getting to machine learning from a general purpose compilerJulia Computing Inc.
 
Linear models
Linear modelsLinear models
Linear modelsFAO
 
Csci101 lect08b matlab_programs
Csci101 lect08b matlab_programsCsci101 lect08b matlab_programs
Csci101 lect08b matlab_programsElsayed Hemayed
 
stackconf 2022: Are all programming languages in english?
stackconf 2022: Are all programming languages in english?stackconf 2022: Are all programming languages in english?
stackconf 2022: Are all programming languages in english?NETWAYS
 
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfIncorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfaartechindia
 
Programming withmatlab
Programming withmatlabProgramming withmatlab
Programming withmatlabnehanairm
 

Semelhante a Mnistauto 4 (20)

Regression Analysis , A statistical approch to analysis data.pptx
Regression Analysis , A statistical  approch to analysis data.pptxRegression Analysis , A statistical  approch to analysis data.pptx
Regression Analysis , A statistical approch to analysis data.pptx
 
curve fitting or regression analysis-1.pptx
curve fitting or regression analysis-1.pptxcurve fitting or regression analysis-1.pptx
curve fitting or regression analysis-1.pptx
 
Introduction to Functional Programming with Haskell and JavaScript
Introduction to Functional Programming with Haskell and JavaScriptIntroduction to Functional Programming with Haskell and JavaScript
Introduction to Functional Programming with Haskell and JavaScript
 
matlab.docx
matlab.docxmatlab.docx
matlab.docx
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
matlab_tutorial.ppt
matlab_tutorial.pptmatlab_tutorial.ppt
matlab_tutorial.ppt
 
FAIQ MANUAL.pdf
FAIQ MANUAL.pdfFAIQ MANUAL.pdf
FAIQ MANUAL.pdf
 
Different Types of Machine Learning Algorithms
Different Types of Machine Learning AlgorithmsDifferent Types of Machine Learning Algorithms
Different Types of Machine Learning Algorithms
 
令和から本気出す
令和から本気出す令和から本気出す
令和から本気出す
 
Growing a Compiler: Getting to machine learning from a general purpose compiler
Growing a Compiler: Getting to machine learning from a general purpose compilerGrowing a Compiler: Getting to machine learning from a general purpose compiler
Growing a Compiler: Getting to machine learning from a general purpose compiler
 
Las tripas de un sistema solr
Las tripas de un sistema solrLas tripas de un sistema solr
Las tripas de un sistema solr
 
Mit6 094 iap10_lec03
Mit6 094 iap10_lec03Mit6 094 iap10_lec03
Mit6 094 iap10_lec03
 
Linear models
Linear modelsLinear models
Linear models
 
10 chap
10 chap10 chap
10 chap
 
Csci101 lect08b matlab_programs
Csci101 lect08b matlab_programsCsci101 lect08b matlab_programs
Csci101 lect08b matlab_programs
 
stackconf 2022: Are all programming languages in english?
stackconf 2022: Are all programming languages in english?stackconf 2022: Are all programming languages in english?
stackconf 2022: Are all programming languages in english?
 
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdfIncorporate the SOR method in the multigridTest-m and apply the multig.pdf
Incorporate the SOR method in the multigridTest-m and apply the multig.pdf
 
Programming withmatlab
Programming withmatlabProgramming withmatlab
Programming withmatlab
 
Querying solr
Querying solrQuerying solr
Querying solr
 

Mais de Ali Rıza SARAL

On the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxOn the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxAli Rıza SARAL
 
Human assisted computer creativity
Human assisted computer creativityHuman assisted computer creativity
Human assisted computer creativityAli Rıza SARAL
 
20160308 ars writing large music works
20160308 ars writing large music works20160308 ars writing large music works
20160308 ars writing large music worksAli Rıza SARAL
 
AR+S The Role Of Abstraction In Human Computer Interaction
AR+S   The Role Of Abstraction In Human Computer InteractionAR+S   The Role Of Abstraction In Human Computer Interaction
AR+S The Role Of Abstraction In Human Computer InteractionAli Rıza SARAL
 

Mais de Ali Rıza SARAL (8)

On the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptxOn the Role of Design in Creativity.pptx
On the Role of Design in Creativity.pptx
 
Mnistauto 3
Mnistauto 3Mnistauto 3
Mnistauto 3
 
Mnistauto 2
Mnistauto 2Mnistauto 2
Mnistauto 2
 
Mnistauto 1
Mnistauto 1Mnistauto 1
Mnistauto 1
 
Human assisted computer creativity
Human assisted computer creativityHuman assisted computer creativity
Human assisted computer creativity
 
20160308 ars writing large music works
20160308 ars writing large music works20160308 ars writing large music works
20160308 ars writing large music works
 
Komut satırı JAVA
Komut satırı JAVAKomut satırı JAVA
Komut satırı JAVA
 
AR+S The Role Of Abstraction In Human Computer Interaction
AR+S   The Role Of Abstraction In Human Computer InteractionAR+S   The Role Of Abstraction In Human Computer Interaction
AR+S The Role Of Abstraction In Human Computer Interaction
 

Último

Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Servicemeghakumariji156
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwaitjaanualu31
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadhamedmustafa094
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Call Girls Mumbai
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxnuruddin69
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stageAbc194748
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksMagic Marks
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfsmsksolar
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptMsecMca
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARKOUSTAV SARKAR
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdfKamal Acharya
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdfKamal Acharya
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 

Último (20)

Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best ServiceTamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
Tamil Call Girls Bhayandar WhatsApp +91-9930687706, Best Service
 
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills KuwaitKuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
Kuwait City MTP kit ((+919101817206)) Buy Abortion Pills Kuwait
 
kiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal loadkiln thermal load.pptx kiln tgermal load
kiln thermal load.pptx kiln tgermal load
 
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
Bhubaneswar🌹Call Girls Bhubaneswar ❤Komal 9777949614 💟 Full Trusted CALL GIRL...
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
Call Girls in South Ex (delhi) call me [🔝9953056974🔝] escort service 24X7
 
Bridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptxBridge Jacking Design Sample Calculation.pptx
Bridge Jacking Design Sample Calculation.pptx
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
Air Compressor reciprocating single stage
Air Compressor reciprocating single stageAir Compressor reciprocating single stage
Air Compressor reciprocating single stage
 
Learn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic MarksLearn the concepts of Thermodynamics on Magic Marks
Learn the concepts of Thermodynamics on Magic Marks
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
Rums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdfRums floating Omkareshwar FSPV IM_16112021.pdf
Rums floating Omkareshwar FSPV IM_16112021.pdf
 
notes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.pptnotes on Evolution Of Analytic Scalability.ppt
notes on Evolution Of Analytic Scalability.ppt
 
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKARHAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
HAND TOOLS USED AT ELECTRONICS WORK PRESENTED BY KOUSTAV SARKAR
 
Online food ordering system project report.pdf
Online food ordering system project report.pdfOnline food ordering system project report.pdf
Online food ordering system project report.pdf
 
School management system project Report.pdf
School management system project Report.pdfSchool management system project Report.pdf
School management system project Report.pdf
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 

Mnistauto 4

  • 1. An Analysis of RBM.m of Hinton’s mnistdeepauto example’s backprop.m by Ali Riza SARAL arsaral((at))yahoo.com References: Hinton’s «Lecture 12C _ Restricted Boltzmann Machines» Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference» Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»
  • 2. @copyright • % Version 1.000%% Code provided by Ruslan Salakhutdinov and Geoff Hinton%% Permission is granted for anyone to copy, use, modify, or distribute this% program and accompanying programs and documents for any purpose, provided% this copyright notice is retained and prominently displayed, along with% a note saying that the original programs are available from our% web page.% The programs and documents are distributed without any warranty, express or% implied. As the programs were written for research purposes only, they have% not been tested to the degree that would be advisable in any important% application. All use of these programs is entirely at the user's own risk.
  • 3. Initialization 1. • maxepoch=1; %maxepoch=200; • fprintf(1,'nFine-tuning deep autoencoder by minimizing cross entropy error. n'); • fprintf(1,'60 batches of 1000 cases each. n'); • load mnistvh • load mnisthp • load mnisthp2 • load mnistpo
  • 4. mnistdeepauto • hidrecbiases=hidbiases; % 1x1000 • save mnistvh vishid hidrecbiases visbiases; %784x1000 1x1000 1x784 • fprintf(1,'nPretraining Layer 2 with RBM: %d-%d n',numhid,numpen); % 1000, 500 • ... • hidpen=vishid; penrecbiases=hidbiases; hidgenbiases=visbiases; • save mnisthp hidpen penrecbiases hidgenbiases; % 1000x500 1x500 1x1000 • fprintf(1,'nPretraining Layer 3 with RBM: %d-%d n',numpen,numpen2); % 500, 250 • ... • hidpen2=vishid; penrecbiases2=hidbiases; hidgenbiases2=visbiases; • save mnisthp2 hidpen2 penrecbiases2 hidgenbiases2; % 500x250 1x250 1x500 • fprintf(1,'nPretraining Layer 4 with RBM: %d-%d n',numpen2,numopen); %250, 30 • ... • hidtop=vishid; toprecbiases=hidbiases; topgenbiases=visbiases; • save mnistpo hidtop toprecbiases topgenbiases; % 250x30 1x30 1x250
  • 5. Backprop Initialization 2. • makebatches;[numcases numdims numbatches]=size(batchdata); % 100x784x600 • N=numcases;
  • 6. Initialization 3. • %%%% PREINITIALIZE WEIGHTS OF THE AUTOENCODER • w1=[vishid; hidrecbiases]; % 784x1000 ; 1x1000 = 785x1000 • w2=[hidpen; penrecbiases]; % 1000x500 ; 1x500 = 1001x500 • w3=[hidpen2; penrecbiases2]; %500x250 ; 1x250 = 501x250 • w4=[hidtop; toprecbiases]; % 250x30 ; 1x30 = 251x30 • w5=[hidtop'; topgenbiases]; % 30x250 ; 1x250 = 31x250 • w6=[hidpen2'; hidgenbiases2]; % 250x500 ; 1x500 = 251x500 • w7=[hidpen'; hidgenbiases]; % 500x1000 ; 1x1000 = 501x1000 • w8=[vishid'; visbiases]; % 1000x784 ; 1x784 = 1001x784 • %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS
  • 7. Initialization 4. • %%%%%%%%%% END OF PREINITIALIZATION OF WEIGHTS • l1=size(w1,1)-1; % 784 • l2=size(w2,1)-1; % 1000 • l3=size(w3,1)-1; % 500 • l4=size(w4,1)-1; % 250 • l5=size(w5,1)-1; % 30 • l6=size(w6,1)-1; % 250 • l7=size(w7,1)-1; % 500 • l8=size(w8,1)-1; % 1000 • l9=l1; % 784 • test_err=[]; • train_err=[]; • The weights are bidirectional, the 4 layers become 8 and their lengths remain the same for the reverse processing.
  • 8. Epoch loop • for epoch = 1:maxepoch • %%%%%%%%%%%% COMPUTE TRAINING RECONSTRUCTION ERROR • %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS • %%%%%%%%%%% COMPUTE TEST RECONSTRUCTION ERROR • %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS • PERFORM CONJUGATE GRADIENT LOOP • end
  • 9. Conjugate Gradient Loop • for batch = 1:numbatches/10 • %%%%%%%%%%% COMBINE 10 MINIBATCHES INTO 1 LARGER MINIBATCH • %%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES • End • save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8 • save mnist_error test_err train_err; • End (of epoche)
• 10. Call mnistdisp.m • %%%% DISPLAY FIGURE TOP ROW REAL DATA BOTTOM ROW RECONSTRUCTIONS • fprintf(1,'Displaying in figure 1: Top row - real data, Bottom row -- reconstructions \n'); • output=[]; % Concatenate the digits into output • for ii=1:15 % Take only the first 15 digits (30 columns in fact) output = [output data(ii,1:end-1)' dataout(ii,:)']; % append a 784x1 data column and its 784x1 reconstruction • end % The training digit comes first, then its reconstruction • if epoch==1 % Manage figure positioning etc. • close all • figure('Position',[100,600,1000,200]); • else • figure(1) • end • mnistdisp(output); % prepare data to be displayed and display • drawnow;
• 11. Mnistdisp.m 1 • function [err] = mnistdisp(digits); % 784x30 • % display a group of MNIST images • col=28;row=28; • [dd,N] = size(digits); % 784x30, so N=30 • imdisp=zeros(2*28,ceil(N/2)*28); • % 56x420-pixel canvas: 56 = 28+28 (two digits stacked vertically), 420 = 15*28 (15 digits side by side)
• 12. Mnistdisp.m 2 • for nn=1:N % 1:30 • ii=rem(nn,2); • if(ii==0) ii=2; end % ii is the row number: rem(1,2)=1->1, rem(2,2)=0->2, 3->1, 4->2 • jj=ceil(nn/2); % jj is the digit's column position in the picture: 1/2->1, 2/2->1, 3/2->2, 4/2->2 • img1 = reshape(digits(:,nn),row,col); • % reshape((784x1),28,28) = 28x28; each pass of the loop reshapes digit nn; there are 30 columns of length 784 • img2(((ii-1)*row+1):(ii*row),((jj-1)*col+1):(jj*col))=img1'; • % img2(nn=1 -> 0+1:1*row, 0+1:1*col) = img1' • % img2(row range, column range) = img1': the row and column ranges select where the extracted digit image img1 is placed • % img2(nn=2 -> 1*row+1:2*row, 1*col+1:2*col) = img1' • end
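The tiling logic above can be sketched in NumPy. This is an illustrative translation, not the original MATLAB (the dummy `digits` matrix and 1-based loop mirroring are assumptions; MATLAB's column-major reshape is not reproduced exactly, only the grid placement):

```python
import numpy as np

# Place N=30 column vectors of length 784 into a 2-row by ceil(N/2)-column
# grid of 28x28 digit images, as mnistdisp.m does.
row = col = 28
N = 30
digits = np.arange(784 * N, dtype=float).reshape(784, N)  # dummy digits

img2 = np.zeros((2 * row, int(np.ceil(N / 2)) * col))
for nn in range(1, N + 1):            # keep MATLAB's 1-based counter
    ii = nn % 2 or 2                  # rem(nn,2), with 0 mapped to 2
    jj = int(np.ceil(nn / 2))         # digit's column in the grid
    img1 = digits[:, nn - 1].reshape(row, col)
    img2[(ii - 1) * row:ii * row, (jj - 1) * col:jj * col] = img1.T

print(img2.shape)  # (56, 420)
```

Odd nn lands in the top row (real data), even nn in the bottom row (reconstruction), which is why output interleaves data and reconstruction columns.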
  • 13. Mnistdisp.m 3 • imagesc(img2,[0 1]); • colormap gray; • axis equal; • axis off; • drawnow; • err=0; % not used
• 14. Minimize.m • function [X, fX, i] = minimize(X, f, length, varargin) • % Minimize a differentiable multivariate function. • % Usage: [X, fX, i] = minimize(X, f, length, P1, P2, P3, ... ) • % where the starting point is given by "X" (D by 1), and the function named in the string "f" must return a function value and a vector of partial derivatives of f wrt X
• 15. Backprop.m calling minimize.m • %%%%%%%%%%%%%%% PERFORM CONJUGATE GRADIENT WITH 3 LINESEARCHES • max_iter=3; • VV = [w1(:)' w2(:)' w3(:)' w4(:)' w5(:)' w6(:)' w7(:)' w8(:)']'; • Dim = [l1; l2; l3; l4; l5; l6; l7; l8; l9]; • [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data); • VV is the starting point X in minimize.m • the function named in the string "f" is 'CG_MNIST'.
• 16. Minimize.m • There is an older version of this program in the coursera-ml-master package's mlclass-ex5 (and others) under the name fmincg: • % Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13 • minimize.m is the newer and better explained/documented version: • % Copyright (C) 2001 - 2006 by Carl Edward Rasmussen (2006-09-08).
• 17. Minimize.m 1 • function [X, fX, i] = minimize(X, f, length, varargin) • The function returns the found solution "X", a vector of function values "fX" indicating the progress made, and "i", the number of iterations used.
• 18. Minimize.m 2 • Backprop.m • [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data); • Minimize.m • function [X, fX, i] = minimize(X, f, length, varargin) • CG_MNIST.m • function [f, df] = CG_MNIST(VV,Dim,XX);
• 19. Minimize.m 3 • Backprop.m • [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data); • X is the minimized VV. • fX is the vector of cost values f returned by CG_MNIST over the line searches; the gradient df = [dw1(:)' dw2(:)' dw3(:)' dw4(:)' dw5(:)' dw6(:)' dw7(:)' dw8(:)']' is consumed inside minimize.m. • max_iter=3 • Dim in backprop.m = [l1; l2; l3; l4; l5; l6; l7; l8; l9]; • data in backprop.m: data=[]; for kk=1:10 data=[data; batchdata(:,:,(tt-1)*10+kk)]; end • 10 minibatches combined into 1 larger batch
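The key point of the minimize.m interface is that the objective returns both the cost f and the gradient df, and the optimizer only ever calls that one handle. A toy NumPy sketch of the calling convention, with a crude gradient-descent stand-in for Rasmussen's conjugate-gradient line searches (the quadratic objective and `minimize_sketch` are illustrative assumptions, not the real algorithm):

```python
import numpy as np

def cg_mnist_like(x):
    """Toy objective in CG_MNIST's shape: returns (cost, gradient)."""
    f = float(np.sum(x ** 2))   # scalar cost
    df = 2 * x                  # gradient w.r.t. x
    return f, df

def minimize_sketch(x, fn, max_iter, lr=0.1):
    """Crude stand-in for minimize.m: repeatedly query fn, step downhill,
    and record the cost values in fX (the 'progress' vector)."""
    fX = []
    for _ in range(max_iter):
        f, df = fn(x)
        fX.append(f)
        x = x - lr * df
    return x, fX

X, fX = minimize_sketch(np.ones(5), cg_mnist_like, max_iter=3)
print(fX[0] > fX[-1])  # cost decreases: True
```

This mirrors backprop.m's call: VV plays the role of the starting `x`, 'CG_MNIST' the role of `fn`, and max_iter=3 the number of line searches.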
• 20. CG_MNIST 1 • function [f, df] = CG_MNIST(VV,Dim,XX); • l1 = Dim(1); % 784 • l2 = Dim(2); % 1000 • l3 = Dim(3); % 500 • l4= Dim(4); % 250 • l5= Dim(5); % 30 • l6= Dim(6); % 250 • l7= Dim(7); % 500 • l8= Dim(8); % 1000 • l9= Dim(9); % 784 • N = size(XX,1); % 1000 • Set the layer sizes; XX is the batch data of 10 x 100 = 1000 cases.
  • 21. CG_MNIST 2 • w1 = reshape(VV(1:(l1+1)*l2),l1+1,l2); % 785x1000 • xxx = (l1+1)*l2; % 785000 • w2 = reshape(VV(xxx+1:xxx+(l2+1)*l3),l2+1,l3); % 1001x500 • xxx = xxx+(l2+1)*l3; % 1285500 • w3 = reshape(VV(xxx+1:xxx+(l3+1)*l4),l3+1,l4); % 501x250 • xxx = xxx+(l3+1)*l4; • w4 = reshape(VV(xxx+1:xxx+(l4+1)*l5),l4+1,l5); % 251x30 • xxx = xxx+(l4+1)*l5; • w5 = reshape(VV(xxx+1:xxx+(l5+1)*l6),l5+1,l6); % 31x250 • xxx = xxx+(l5+1)*l6; • w6 = reshape(VV(xxx+1:xxx+(l6+1)*l7),l6+1,l7); % 251x500 • xxx = xxx+(l6+1)*l7; • w7 = reshape(VV(xxx+1:xxx+(l7+1)*l8),l7+1,l8); % 501x1000 • xxx = xxx+(l7+1)*l8; • w8 = reshape(VV(xxx+1:xxx+(l8+1)*l9),l8+1,l9); % 1001x784 • Extract weights between layers
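The unpacking above cuts one flat parameter vector into consecutive (l_in+1) x l_out weight matrices. A NumPy sketch of the same bookkeeping, using the real layer sizes (the `arange` contents of VV are a stand-in; `order='F'` matches MATLAB's column-major reshape):

```python
import numpy as np

Dim = [784, 1000, 500, 250, 30, 250, 500, 1000, 784]  # l1..l9

# Each weight block is (l_in + 1) x l_out: +1 for the bias row.
sizes = [(Dim[i] + 1, Dim[i + 1]) for i in range(len(Dim) - 1)]
total = sum(r * c for r, c in sizes)
VV = np.arange(total, dtype=float)  # stand-in for the packed parameters

weights, xxx = [], 0
for r, c in sizes:
    # MATLAB reshape is column-major, hence order='F'
    weights.append(VV[xxx:xxx + r * c].reshape(r, c, order='F'))
    xxx += r * c

print([w.shape for w in weights][:2])  # [(785, 1000), (1001, 500)]
print(total)  # 2837314
```

The total, 2837314, matches the length of df reported on the CG_MNIST 5.4 slide, confirming the layer sizes.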
• 22. CG_MNIST 3.1 • XX = [XX ones(N,1)]; % 1000x785 • w1probs = 1./(1 + exp(-XX*w1)); • w1probs = [w1probs ones(N,1)]; % 1000x785 * 785x1000 = 1000x1000 -> 1000x1001 • w2probs = 1./(1 + exp(-w1probs*w2)); • w2probs = [w2probs ones(N,1)]; % 1000x1001 * 1001x500 = 1000x500 -> 1000x501 • w3probs = 1./(1 + exp(-w2probs*w3)); • w3probs = [w3probs ones(N,1)]; % 1000x501 * 501x250 = 1000x250 -> 1000x251 • w4probs = w3probs*w4; • w4probs = [w4probs ones(N,1)]; % 1000x251 * 251x30 = 1000x30 -> 1000x31
• 23. CG_MNIST 3.2 • w1probs, w2probs and w3probs are computed with the sigmoid function, whereas w4probs (the 30-unit code layer) is linear, corresponding to Gaussian units. • A forward (encoding) pass and a backward (decoding) pass are done for reconstruction. • Processing is done for all 1000 batch input cases (10x100) at once.
• 24. CG_MNIST 3.3 • w5probs = 1./(1 + exp(-w4probs*w5)); • w5probs = [w5probs ones(N,1)]; % 1000x31 * 31x250 = 1000x250 -> 1000x251 • w6probs = 1./(1 + exp(-w5probs*w6)); • w6probs = [w6probs ones(N,1)]; % 1000x251 * 251x500 = 1000x500 -> 1000x501 • w7probs = 1./(1 + exp(-w6probs*w7)); • w7probs = [w7probs ones(N,1)]; % 1000x501 * 501x1000 = 1000x1000 -> 1000x1001 • XXout = 1./(1 + exp(-w7probs*w8)); % 1000x1001 * 1001x784 = 1000x784
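The forward pass in CG_MNIST 3.1-3.3 follows one pattern per layer: multiply by the weight block (which includes the bias row), apply the sigmoid (except at the linear code layer), and append a ones column. A tiny NumPy sketch of that pattern (the layer sizes 6, 5, 3 and random inputs are stand-ins for 784, 1000, ..., 30):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N = 4                        # tiny "batch" standing in for the 1000 cases
sizes = [6, 5, 3]            # stand-ins for the real layer widths
ws = [rng.standard_normal((sizes[i] + 1, sizes[i + 1])) * 0.1
      for i in range(len(sizes) - 1)]

# Append a ones column so the bias row of each w is used by the product.
acts = np.hstack([rng.random((N, sizes[0])), np.ones((N, 1))])
for k, w in enumerate(ws):
    linear = (k == len(ws) - 1)          # last layer linear, like w4probs
    a = acts @ w if linear else sigmoid(acts @ w)
    acts = np.hstack([a, np.ones((N, 1))])

print(acts.shape)  # (4, 4): 3 code units plus the appended ones column
```

The decoding half (w5probs through XXout) repeats the same pattern with the transposed, untied copies of the weights.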
• 25. CG_MNIST 4.1 • f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout))); • % 1000x784 .* 1000x784 -> 1x1 • This is the function value returned by CG_MNIST to minimize.m for this data batch of 1000 cases. • Viewed carefully, this algorithm closely resembles lrcostfunction.m of coursera-ml-master's mlclass-ex3.
• 26. CG_MNIST 4.2 • f = -1/N*sum(sum( XX(:,1:end-1).*log(XXout) + (1-XX(:,1:end-1)).*log(1-XXout))); • vs. • h_of_x = sigmoid(X * theta); • J = 1 / m * sum( -1 * y' * log(h_of_x) - (1-y') * log(1 - h_of_x) ); • This is, in effect, a cross-entropy cost function.
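The cost line can be checked in isolation: it averages the per-pixel cross entropy between the input and its reconstruction over the N cases. A NumPy sketch (the random XX and XXout are stand-ins; the clip guards the log against 0 and 1, which the sigmoid output never reaches exactly in the real code):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
XX = rng.random((N, 784))                               # target pixels in [0,1]
XXout = np.clip(rng.random((N, 784)), 1e-8, 1 - 1e-8)   # "reconstruction"

# f = -1/N * sum(sum( XX.*log(XXout) + (1-XX).*log(1-XXout) ))
f = -1.0 / N * np.sum(XX * np.log(XXout) + (1 - XX) * np.log(1 - XXout))
print(f > 0)  # cross entropy is non-negative: True
```

Minimizing f drives XXout toward XX pixel by pixel, which is exactly the fine-tuning objective of the autoencoder.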
• 27. CG_MNIST 5.1 • IO = 1/N*(XXout-XX(:,1:end-1)); % 1000x784 • Ix8=IO; % 1000x784 • dw8 = w7probs'*Ix8; • % 1001x1000 * 1000x784 = 1001x784 • The difference between XXout and XX (the data), divided by the number of cases, is multiplied from the left by the transposed activations w7probs'. This gives dw8, the gradient of the top weight layer. The error signal Ix of each step is then pushed back through the weights of each layer in turn, and the gradient contributed by that layer is obtained in the same way.
• 28. CG_MNIST 5.2 • Ix7 = (Ix8*w8').*w7probs.*(1-w7probs); • % 1000x784 * 784x1001 .* 1000x1001 .* (1 - 1000x1001) = 1000x1001 • Ix7 = Ix7(:,1:end-1); % 1000x1000 • dw7 = w6probs'*Ix7; % 501x1000 * 1000x1000 = 501x1000 • Ix6 = (Ix7*w7').*w6probs.*(1-w6probs); % 1000x1000 * 1000x501 = 1000x501 • Ix6 = Ix6(:,1:end-1); % 1000x500 • dw6 = w5probs'*Ix6; % 251x500 • Ix5 = (Ix6*w6').*w5probs.*(1-w5probs); • % 1000x500 * 500x251 = 1000x251 • Ix5 = Ix5(:,1:end-1); % 1000x250 • dw5 = w4probs'*Ix5; % 31x250 • Ix4 = (Ix5*w5'); % 1000x250 * 250x31 = 1000x31 • Ix4 = Ix4(:,1:end-1); % 1000x30 • dw4 = w3probs'*Ix4; % 251x1000 * 1000x30 = 251x30
  • 29. CG_MNIST 5.3 • Ix3 = (Ix4*w4').*w3probs.*(1-w3probs); • % 1000x30 * 30x251 .* 1000x251 = 1000x251 • Ix3 = Ix3(:,1:end-1); % 1000x250 • dw3 = w2probs'*Ix3; % 501x250 • Ix2 = (Ix3*w3').*w2probs.*(1-w2probs); • % 1000x250 * 250x501 .*1000x501 = 1000x501 • Ix2 = Ix2(:,1:end-1); % 1000x500 • dw2 = w1probs'*Ix2; % 1001x500 • Ix1 = (Ix2*w2').*w1probs.*(1-w1probs); • %1000x500 * 500x1001 = 1000x1001 • Ix1 = Ix1(:,1:end-1); % 1000x1000 • dw1 = XX'*Ix1; % 785x1000
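Every step in CG_MNIST 5.2-5.3 repeats one pattern: push the delta from the layer above back through the weights, multiply by the sigmoid derivative probs.*(1-probs), drop the bias column, and form the gradient against the activations below. A small NumPy sketch of one such step (the shapes N=8, nin=5, nout=3 and the random matrices are illustrative stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
N, nin, nout = 8, 5, 3
w = rng.standard_normal((nin + 1, nout))                    # weights + bias row
probs = np.hstack([rng.random((N, nin)), np.ones((N, 1))])  # layer activations
below = np.hstack([rng.random((N, 4)), np.ones((N, 1))])    # layer below
Ix_above = rng.standard_normal((N, nout))                   # delta from above

# Ix = (Ix_above*w').*probs.*(1-probs), as in Ix7, Ix6, Ix5, ...
Ix = (Ix_above @ w.T) * probs * (1 - probs)
Ix = Ix[:, :-1]            # drop the bias column, as Ix(:,1:end-1) does
dw = below.T @ Ix          # gradient for the weights below this layer

print(dw.shape)  # (5, 5)
```

The only exception in the chain is Ix4, where the sigmoid-derivative factor is omitted because the code layer is linear.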
  • 30. CG_MNIST 5.4 • df = [dw1(:)' dw2(:)' dw3(:)' dw4(:)' dw5(:)' dw6(:)' dw7(:)' dw8(:)' ]'; • % 2837314x1 • Finally the df return parameter of CG_MNIST is set.
  • 31. Minimize.m to backprop.m connection • Minimize.m returns the minimized version of VV in the variable X. • [X, fX] = minimize(VV,'CG_MNIST',max_iter,Dim,data);
• 32. End of backprop.m • w1 = reshape(X(1:(l1+1)*l2),l1+1,l2); xxx = (l1+1)*l2; • w2 = reshape(X(xxx+1:xxx+(l2+1)*l3),l2+1,l3); xxx = xxx+(l2+1)*l3; • w3 = reshape(X(xxx+1:xxx+(l3+1)*l4),l3+1,l4); xxx = xxx+(l3+1)*l4; • w4 = reshape(X(xxx+1:xxx+(l4+1)*l5),l4+1,l5); xxx = xxx+(l4+1)*l5; • w5 = reshape(X(xxx+1:xxx+(l5+1)*l6),l5+1,l6); xxx = xxx+(l5+1)*l6; • w6 = reshape(X(xxx+1:xxx+(l6+1)*l7),l6+1,l7); xxx = xxx+(l6+1)*l7; • w7 = reshape(X(xxx+1:xxx+(l7+1)*l8),l7+1,l8); xxx = xxx+(l7+1)*l8; • w8 = reshape(X(xxx+1:xxx+(l8+1)*l9),l8+1,l9); • Resetting the weight values according to the X returned by minimize.m. • save mnist_weights w1 w2 w3 w4 w5 w6 w7 w8 • save mnist_error test_err train_err; • Save the weights and the error values for this epoch.