Mnistauto 2

An Analysis of RBM.m of Hinton’s
mnistdeepauto example
by Ali Riza SARAL
References:
Hinton’s «Lecture 12C _ Restricted Boltzmann Machines»
Hugo Larochelle’s «Neural networks [5.2] _ Restricted Boltzmann machine – inference»
Hugo Larochelle’s «Neural networks [5.4] _ Restricted Boltzmann machine - contrastive divergence»

Mnistdeepauto mechanism for calling
RBM
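• mnistdeepauto.m runs the RBM training script four times, once per layer.
• Before each run it sets batchdata and numhid; rbm.m is a script, so it reads them from the workspace.
• The four layers use numhid = 1000, 500, 250, and 30. In Hinton's distributed code the 30-unit layer is trained by rbmhidlinear.m, a variant of rbm.m with linear hidden units.
• Each run leaves vishid, hidbiases, and visbiases in the workspace, and mnistdeepauto.m copies them under layer-specific names.
• mnistdeepauto.m saves these for backprop to use, for example after the first layer:
• save mnistvh vishid hidrecbiases visbiases;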
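The layer-by-layer calling pattern can be sketched as follows. This is only a bookkeeping sketch: rbm.m is not run, the variable layersizes is a name introduced here for illustration, and the hand-over of each layer's hidden probabilities as the next layer's data reflects my reading of mnistdeepauto.m rather than anything shown on these slides.

```octave
% Sketch only: rbm.m itself is not called, only the sizes are tracked.
numdims    = 784;                 % visible units of the first layer
layersizes = [1000 500 250 30];   % hypothetical name: numhid for each layer

for layer = 1:numel(layersizes)
  numhid = layersizes(layer);
  % rbm.m would train here and leave vishid with size numdims x numhid
  fprintf('layer %d: vishid is %d x %d\n', layer, numdims, numhid);
  % assumption: the hidden probabilities of this layer feed the next layer
  numdims = numhid;
end
```

Each layer's hidden size becomes the visible size of the layer above it, which is why the weight matrices shrink from 784x1000 down to 250x30.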

How RBM works
• RBM.m is built around a single epoch loop. Each pass repeats the same processing on the updated weights and biases to approach a better solution.
• for epoch = epoch:maxepoch, % 1 : 10
• ...
• fprintf(1, 'epoch %4i error %6.1f \n', epoch, errsum);
• end;

Data update of RBM epoch loops
• %%%%%%%%% UPDATE WEIGHTS AND BIASES
• vishidinc = momentum*vishidinc + ... % 784x1000
• epsilonw*( (posprods-negprods)/numcases - weightcost*vishid); % 784x1000
•
• visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact); % 1x784 + 1x784 - 1x784
• hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact); % 1x1000 + 1x1000 - 1x1000
• vishid = vishid + vishidinc; % 784x1000 + 784x1000
• visbiases = visbiases + visbiasinc; % 1x784 + 1x784
• hidbiases = hidbiases + hidbiasinc; % 1x1000 + 1x1000
• %%%%%%%%%%%%%%%% END OF UPDATES

Data update at the bottom of the loop body
• vishid, visbiases, and hidbiases are updated at the bottom of each batch iteration inside the RBM epoch loop, and the new values are used by the next iteration.
• The new values are obtained by adding vishidinc, visbiasinc, and hidbiasinc to the old ones.
• The rest of the loop body basically computes vishidinc, visbiasinc, and hidbiasinc.
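To see how the increment carries over from one update to the next, here is a minimal sketch of the weight update on toy numbers. The sizes and the posprods/negprods values are made up for illustration; only the update formula and the constants come from rbm.m.

```octave
% Toy sizes (assumption): 3 visible units, 2 hidden units, 4 cases per batch.
numcases = 4;
epsilonw = 0.1; weightcost = 0.0002; momentum = 0.5;

vishid    = zeros(3,2);     % current weights
vishidinc = zeros(3,2);     % previous increment (none yet)
posprods  = 4*ones(3,2);    % made-up positive-phase statistics
negprods  = 2*ones(3,2);    % made-up negative-phase statistics

% First update: with no previous increment, only the scaled gradient remains.
vishidinc = momentum*vishidinc + epsilonw*((posprods-negprods)/numcases - weightcost*vishid);
vishid    = vishid + vishidinc;
first     = vishidinc(1,1);   % 0.1 * (2/4) = 0.05

% Second update with the same statistics: half of the old increment is added,
% and the weight cost now pulls very slightly towards zero.
vishidinc = momentum*vishidinc + epsilonw*((posprods-negprods)/numcases - weightcost*vishid);
vishid    = vishid + vishidinc;
second    = vishidinc(1,1);

fprintf('first increment %.6f, second increment %.6f\n', first, second);
```

With the same statistics in both steps, the second increment is larger than the first because the momentum term adds half of the previous step.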

RBM constants initialization
• The increments vishidinc, visbiasinc, and hidbiasinc depend on the following constants.
• epsilonw = 0.1; % Learning rate for weights
• epsilonvb = 0.1; % Learning rate for biases of visible units
• epsilonhb = 0.1; % Learning rate for biases of hidden units
• weightcost = 0.0002;
• initialmomentum = 0.5;
• finalmomentum = 0.9;

RBM variables’ definition and init
mechanism
• The init mechanism depends on the restart variable.
• mnistdeepauto.m has:
• restart=1;
• rbm;
• ...
• rbm.m has:
• if restart ==1,
• restart=0;
• epoch=1;
• % Initializing symmetric weights and biases.
• ...
• end
• Because restart is set back to 0 inside this block, the initialization runs only once per call, before the first epoch; later epochs continue from the updated values.

RBM variables definition and init
• % Initializing symmetric weights and biases.
• vishid = 0.1*randn(numdims, numhid); %784x1000
• hidbiases = zeros(1,numhid); % 1x1000
• visbiases = zeros(1,numdims); % 1x784
•
• poshidprobs = zeros(numcases,numhid); %100x1000
• neghidprobs = zeros(numcases,numhid); %100x1000
• posprods = zeros(numdims,numhid); %784x1000
• negprods = zeros(numdims,numhid); %784x1000
• vishidinc = zeros(numdims,numhid); %784x1000
• hidbiasinc = zeros(1,numhid); %1x1000
• visbiasinc = zeros(1,numdims); %1x784
• batchposhidprobs=zeros(numcases,numhid,numbatches); %100x1000x600
• The values of these variables change as training proceeds through the batches and epochs.

Batches loop
• Each repetition of the epoch loop runs over all the batches (600 of them).
• fprintf(1,'epoch %d\n',epoch);
• errsum=0;
• for batch = 1:numbatches,
• fprintf(1,'epoch %d batch %d\n',epoch,batch);

Accumulating the errsum per epoch
• errsum=0;
• % errsum is reset to zero before batch processing begins in each epoch
• ...
• %After the END OF NEGATIVE PHASE
• err= sum(sum( (data-negdata).^2 )); %1x1
• errsum = err + errsum;
•
• % The error is the squared difference between data and negdata (the reconstruction), summed over the batch; errsum accumulates it over all the batches of the current epoch.
• The data/negdata concept comes from contrastive divergence.
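The accumulation can be tried on a tiny made-up data set. In this sketch the batch contents and the constant stand-in for negdata are assumptions; in rbm.m negdata comes from the negative phase.

```octave
% Two made-up batches of 2 cases x 3 visible units each.
batchdata = zeros(2,3,2);
batchdata(:,:,1) = [1 0 1; 0 1 1];
batchdata(:,:,2) = [0 0 1; 1 1 0];
numbatches = size(batchdata,3);

errsum = 0;
for batch = 1:numbatches
  data    = batchdata(:,:,batch);
  negdata = 0.5*ones(2,3);               % stand-in reconstruction (assumption)
  err     = sum(sum((data-negdata).^2)); % 1x1, squared error of this batch
  errsum  = err + errsum;                % running total for the epoch
end
fprintf('errsum over %d batches: %.2f\n', numbatches, errsum);
```

Every entry is off by 0.5 here, so each batch contributes 6 * 0.25 = 1.5 and the epoch total is 3.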

Varying momentum
• Before the weight update in each batch, the momentum variable is set; it controls how much of the previous increment carries over into the current one.
• if epoch>5,
• momentum=finalmomentum;
• else momentum=initialmomentum;
• end;
• A varying momentum may be necessary to approach a converging point quickly and then settle on it slowly, or the other way round.
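The resulting schedule over the ten epochs can be listed with a short sketch. The array name schedule is introduced here for illustration; the constants and the test on epoch are those shown above.

```octave
initialmomentum = 0.5;
finalmomentum   = 0.9;
maxepoch        = 10;

schedule = zeros(1,maxepoch);   % hypothetical name: momentum used in each epoch
for epoch = 1:maxepoch
  if epoch>5,
    momentum=finalmomentum;
  else
    momentum=initialmomentum;
  end;
  schedule(epoch) = momentum;
end
disp(schedule);
```

Epochs 1 to 5 use 0.5 and epochs 6 to 10 use 0.9.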

The pith of Rbm.m
• The pith of rbm.m consists of two sections and the sampling step between them:
• %%%%%%%%% START POSITIVE PHASE
• data = batchdata(:,:,batch); % 100x784
• ...
• %%%%%%%%% END OF POSITIVE PHASE
• poshidstates = poshidprobs > rand(numcases,numhid); % 100x1000 > 100x1000 = 100x1000
• %%%%%%%%% START NEGATIVE PHASE
• negdata = ...
• %%%%%%%%% END OF NEGATIVE PHASE

The positive phase depends on the input x(t), while the negative
phase depends on the system (Hugo Larochelle)

Gibbs sampling, the calculation of x(t), x(1), ..., x(k), takes place inside the epoch loop; rbm.m takes a single step (k = 1) for each batch

Epochs and Gibbs repetition
• for epoch = epoch:maxepoch,
• errsum=0;
• for batch = 1:numbatches,
• ...
• poshidstates = poshidprobs > rand(numcases,numhid); %100x1000 > 100x1000 = 100 x 1000
• ...
• err= sum(sum( (data-negdata).^2 )); %1x1
• errsum = err + errsum;
• if epoch>5,
• momentum=finalmomentum;
• else
• momentum=initialmomentum;
• end;
• %%%%%%%%% UPDATE WEIGHTS AND BIASES
• ...
• %%%%%%%%%%%%%%%% END OF UPDATES
• end %of batch
• fprintf(1, 'epoch %4i error %6.1f \n', epoch, errsum);
• end; %of epoch

Positive Phase
• poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));
• % 100x784 * 784x1000 - 1x1000 repmat->100x1000 = 100x1000
• batchposhidprobs(:,:,batch)=poshidprobs; %100x1000<-batch(600)
• posprods = data' * poshidprobs; %784x100 * 100x1000 = 784x1000
• poshidact = sum(poshidprobs); % 1x1000
• posvisact = sum(data); % 1x784

Calculation of positive hidden
probabilities
• % This section uses a different batch from batchdata for each value of the loop variable batch.
• % batchdata is 100x784x600 and numbatches = 600.
• % The loop for batch = 1:numbatches executes this section 600 times in each epoch.
• % 100x784 * 784x1000 - 1x1000 repmat->100x1000 = 100x1000
• poshidprobs holds the probability that each hidden unit is on, given the data and the vishid weight matrix. The hidden biases are added to every row with repmat.
• After calculating poshidprobs for the current batch, the code saves these values into batchposhidprobs at the index given by the batch loop counter, batch.
• ...
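The positive-phase lines can be checked on a toy batch. The sizes and the values of data, vishid, and hidbiases below are made up for illustration; the four computed lines are those of rbm.m.

```octave
% Toy sizes (assumption): 2 cases, 3 visible units, 2 hidden units.
numcases  = 2;
data      = [1 0 1; 0 1 0];                 % 2x3
vishid    = [0.5 -0.5; 0.2 0.1; -0.3 0.4];  % 3x2
hidbiases = [0 0.1];                        % 1x2

% 2x3 * 3x2 - 1x2 repmat->2x2 = 2x2
poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));
posprods    = data' * poshidprobs;          % 3x2 * 2x2 = 3x2
poshidact   = sum(poshidprobs);             % 1x2
posvisact   = sum(data);                    % 1x3

disp(poshidprobs);
```

For the first case and the second hidden unit the total input is -0.5 + 0.4 + 0.1 = 0, so the logistic function gives a probability of 0.5.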

Example
• octave:6> h=1:10
• h =
• 1 2 3 4 5 6 7 8 9 10
• octave:7> n=2
• n = 2
• octave:8> repmat(h,n,1)
• ans =
• 1 2 3 4 5 6 7 8 9 10
• 1 2 3 4 5 6 7 8 9 10

The probability of h given x is p(h|x), and the
probability of x given h is p(x|h)

Positive phase continued...
• % 100x784 * 784x1000 - 1x1000 repmat->100x1000 = 100x1000
• posprods = data' * poshidprobs; %784x100 * 100x1000 = 784x1000

Positive Phase Product calculation
• ...
• posprods = data' * poshidprobs; % 784x100 * 100x1000 = 784x1000

Where does the product term come from?
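One way to answer this is sketched below, under the usual RBM energy function. The symbols are assumptions for illustration and do not appear in rbm.m: W stands for vishid, b for visbiases, c for hidbiases, and x̃ for the reconstruction negdata.

```latex
E(x,h) = -x^\top W h - b^\top x - c^\top h,
\qquad
p(x) = \frac{1}{Z}\sum_{h} e^{-E(x,h)}

\frac{\partial \log p(x)}{\partial W_{ij}}
  = x_i \, p(h_j = 1 \mid x) - \mathbb{E}_{x',h}\left[ x'_i h_j \right]
  \approx x_i \, p(h_j = 1 \mid x) - \tilde{x}_i \, p(h_j = 1 \mid \tilde{x})
```

Summed over the cases of a batch, the first term is data' * poshidprobs (posprods) and the second is negdata' * neghidprobs (negprods), which is why the weight update uses posprods-negprods.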

The pith of Rbm.m
• The pith of rbm.m consists of two sections and the sampling step between them:
• ...
• poshidstates = poshidprobs > rand(numcases,numhid);
• %100x1000 > 100x1000 = 100 x 1000
• negdata = ...

• debug> size(batchposhidprobs)
• ans =
• 100 1000 600
• debug> size(poshidprobs)
• ans = 100 1000
• debug> poshidprobs(1:10)
• ans = 0.64384 0.40320 0.43777 0.50833 0.53826 0.67048 0.20901
0.53981 0.41825 0.48169
• debug> poshidstates(1:10)
• ans =
• 1 0 0 1 1 1 0 1 0 0

• octave:15> clear all
• octave:16> a=rand(2,3)
• a =
• 0.77448 0.25583 0.51168
• 0.82385 0.54641 0.86008
• octave:17> r=rand(2,3)
• r =
• 0.55236 0.47071 0.49259
• 0.64676 0.21294 0.63988
• octave:18> a>r
• ans =
• 1 0 1
• 1 1 1

Poshidprobs to poshidstates
conversion
• poshidstates = poshidprobs > rand(numcases,numhid);
• The positive-phase hidden probabilities are taken as input and turned into binary hidden states, which are then used to calculate the visible values x(1).
• The probability that each hidden unit is on is taken into account by comparing it with a randomly generated number: the unit is set to 1 when its probability exceeds the random number.
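A quick way to convince oneself that this comparison switches a unit on with the intended probability is to repeat it many times. The probability 0.7 and the sample count below are arbitrary choices for illustration.

```octave
p          = 0.7;       % arbitrary hidden probability (assumption)
numsamples = 100000;    % arbitrary number of repetitions (assumption)

probs  = p*ones(numsamples,1);
states = probs > rand(numsamples,1);   % same comparison as in rbm.m

fraction_on = mean(double(states));
fprintf('fraction of states equal to 1: %.3f\n', fraction_on);
```

Because rand draws uniformly from (0,1), the fraction of ones should come out close to 0.7, varying slightly from run to run.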

Negative Phase
• negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
• % 100x1000 * 1000x784 - 1x784 repmat->100x784 = 100x784
• neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));
• % 100x784 * 784x1000 - 1x1000 repmat->100x1000 = 100x1000
• negprods = negdata'*neghidprobs;
• % 784x100 * 100x1000 = 784x1000
• neghidact = sum(neghidprobs); % 1x1000
• negvisact = sum(negdata); % 1x784

Negative Phase 2
• negdata = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
• % 100x1000 * 1000x784 - 1x784 repmat->100x784 = 100x784
• neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));
• % 100x784 * 784x1000 - 1x1000 repmat->100x1000 = 100x1000
• ...

The Product Term
• ...
• negprods = negdata'*neghidprobs;
• % 784x100 * 100x1000
• neghidact = sum(neghidprobs); % 1x1000
• negvisact = sum(negdata); % 1x784
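Putting the pieces together, one complete batch step (positive phase, sampling, negative phase, update) can be sketched on toy sizes. The sizes and the random binary batch are assumptions for illustration, and the fixed momentum of 0.5 corresponds to the first five epochs; the formulas are those quoted from rbm.m on the earlier slides.

```octave
% Toy sizes (assumption); rbm.m uses 100 cases, 784 visible, 1000 hidden units.
numcases = 5; numdims = 6; numhid = 4;
epsilonw = 0.1; epsilonvb = 0.1; epsilonhb = 0.1;
weightcost = 0.0002; momentum = 0.5;

data       = double(rand(numcases,numdims) > 0.5);  % made-up binary batch
vishid     = 0.1*randn(numdims,numhid);
hidbiases  = zeros(1,numhid);
visbiases  = zeros(1,numdims);
vishidinc  = zeros(numdims,numhid);
hidbiasinc = zeros(1,numhid);
visbiasinc = zeros(1,numdims);

% Positive phase: hidden probabilities driven by the data.
poshidprobs = 1./(1 + exp(-data*vishid - repmat(hidbiases,numcases,1)));
posprods    = data' * poshidprobs;
poshidact   = sum(poshidprobs);
posvisact   = sum(data);

% Sample binary hidden states from the probabilities.
poshidstates = poshidprobs > rand(numcases,numhid);

% Negative phase: reconstruct the visibles, then recompute the hiddens.
negdata     = 1./(1 + exp(-poshidstates*vishid' - repmat(visbiases,numcases,1)));
neghidprobs = 1./(1 + exp(-negdata*vishid - repmat(hidbiases,numcases,1)));
negprods    = negdata' * neghidprobs;
neghidact   = sum(neghidprobs);
negvisact   = sum(negdata);

err = sum(sum((data-negdata).^2));   % reconstruction error of this batch

% Update weights and biases.
vishidinc  = momentum*vishidinc + epsilonw*((posprods-negprods)/numcases - weightcost*vishid);
visbiasinc = momentum*visbiasinc + (epsilonvb/numcases)*(posvisact-negvisact);
hidbiasinc = momentum*hidbiasinc + (epsilonhb/numcases)*(poshidact-neghidact);
vishid     = vishid + vishidinc;
visbiases  = visbiases + visbiasinc;
hidbiases  = hidbiases + hidbiasinc;

fprintf('reconstruction error for this batch: %6.3f\n', err);
```

The numbers differ on every run because both the batch and the sampled hidden states are random; only the shapes and the order of the steps are meant to carry over.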

HINTON Lecture 12C _ Restricted
Boltzmann Machines
