Abstract: This PDSG workshop introduces basic concepts of recurrent neural networks. Concepts covered are feed forward vs. recurrent networks, time progression, memory cells, and short-term vs. long-term memory predictions.
Level: Fundamental
Requirements: No prior programming or statistics knowledge required.
2. Recurrent vs Feed Forward
• Feed Forward Network
• Inputs enter the network and progress forward through the layers.
• There is no retention (memory) of past inputs, states, or time.
• Recurrent Neural Network
• May have as few as one layer.
• Outputs are cycled back in as inputs (for a fixed number of cycles).
• The outputs from the previous state are added to the inputs of the next
state, maintaining memory of a fixed number of past states.
[Diagram: feed-forward network — inputs are fed forward through the layers to the outputs.]
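The contrast above can be sketched in a few lines of Python. This is a minimal illustration, not a real network: the weights (`w`, `w_x`, `w_h`) are made-up placeholder values, and each "network" is reduced to a single scalar step.

```python
import math

def feed_forward_step(x, w=0.5):
    # Feed forward: the output depends only on the current input x.
    return math.tanh(w * x)

def recurrent_step(x, h_prev, w_x=0.5, w_h=0.3):
    # Recurrent: the output depends on the current input x AND the
    # previous state h_prev, which is cycled back in.
    return math.tanh(w_x * x + w_h * h_prev)

sequence = [1.0, 0.0, 0.0]
h = 0.0
for x in sequence:
    ff = feed_forward_step(x)
    h = recurrent_step(x, h)

# After the first input, the feed-forward output for x = 0 is always
# tanh(0) = 0, but the recurrent state h still "remembers" the 1.0.
print(ff, h)
```

The point of the sketch: once the inputs go to zero, the feed-forward step has nothing left, while the recurrent step still carries a trace of the earlier input in its state.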
3. Recurrent Neural Network
Outputs from each step are added to the inputs of the next step.
[Diagram: a single-layer RNN — the outputs of the current state are retained and added to the inputs of the next state (inputs for S0 → outputs for Sx → inputs for Sx+1 → outputs for Sx+1).]
This is what is meant by an RNN retaining memory.
It is useful in NLP for remembering context, and in other
forecasting problems that depend on time series.
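One consequence of feeding each step's output back into the next step is that an RNN is order-sensitive: the same set of inputs in a different order produces a different final state. A small sketch (with illustrative, made-up weights) makes this concrete:

```python
import math

def rnn_step(x, h_prev, w_x=0.8, w_h=0.5):
    # Each step folds the previous output h_prev back into the input.
    return math.tanh(w_x * x + w_h * h_prev)

def run(sequence):
    # Start from an empty state and carry it through the sequence.
    h = 0.0
    for x in sequence:
        h = rnn_step(x, h)
    return h

a = run([1.0, 2.0, 3.0])
b = run([3.0, 2.0, 1.0])
print(a, b)  # different final states: order matters to an RNN
```

A memoryless (feed-forward) mapping applied element-wise would give the same multiset of outputs either way; the retained state is what makes sequence order matter.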
4. Time Series
Time Progression in Recurrent Neural Network
[Diagram: time progression — inputs for T0 produce outputs for T0, which join the inputs for T1, and so on through T2 and T3.]
View of the neural network unrolled, i.e., as if each cycle were a separate layer.
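"Unrolling" can be demonstrated directly: a loop that reuses one recurrent step computes exactly the same thing as the same step written out once per time tick. The weights here are illustrative placeholders.

```python
import math

W_X, W_H = 0.7, 0.4

def step(x, h):
    # One recurrent step, shared across all time ticks.
    return math.tanh(W_X * x + W_H * h)

xs = [0.5, -1.0, 2.0]

# Rolled: a loop that cycles through the same layer.
h = 0.0
for x in xs:
    h = step(x, h)

# Unrolled: the identical computation written as three stacked steps,
# one per time tick, as in the diagram.
h_unrolled = step(xs[2], step(xs[1], step(xs[0], 0.0)))

print(h == h_unrolled)  # True
```

The unrolled view is what makes training tractable: the cycle becomes an ordinary layered network, one copy of the layer per time step, all sharing the same weights.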
5. Prediction – Short Term Memory
Making Predictions in Sequenced Data (i.e., NLP)
[Diagram: prediction at St — a squashing function (tanh) combines the inputs at St with the past prediction at St-1 to produce the prediction at St.]
Example: Predict the next word in a sentence (or search query) based on the last word seen.
This is short-term memory because only the last prediction is remembered.
6. Long Short-Term Memory (LSTM)
• Long Short-Term Memory is a type of RNN.
• It adds a layer, typically between the input layer and the first
hidden layer.
• It retains some memory of past outputs (long) and a means to
forget (short).
[Diagram: the LSTM layer inserted between the inputs and the hidden layer(s), ahead of the outputs.]
7. LSTM Layer
LSTM details (i.e., the memory cell)
[Diagram: LSTM memory cell — inputs Xt (outputs from the input layer) for time Tt and the previous output ht-1 from the LSTM layer enter the cell; the cell's memory carries a constant value Ct (with the previous constant Ct-1); the calculated output ht (hidden state) at time Tt is passed to the next layer.]
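A scalar sketch of one LSTM memory cell can tie the diagram's labels (Xt, ht-1, Ct-1, Ct, ht) to the standard gate equations. All weights below are illustrative placeholders, not values from the workshop.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell(x_t, h_prev, c_prev,
              w_f=0.5, w_i=0.5, w_o=0.5, w_c=0.8,
              u_f=0.3, u_i=0.3, u_o=0.3, u_c=0.4):
    # Forget gate: how much of the previous constant C[t-1] to keep.
    f = sigmoid(w_f * x_t + u_f * h_prev)
    # Input gate: how much new information to write into the memory.
    i = sigmoid(w_i * x_t + u_i * h_prev)
    # Candidate memory content from the current input.
    c_tilde = math.tanh(w_c * x_t + u_c * h_prev)
    # New constant C[t]: kept memory plus gated new content.
    c_t = f * c_prev + i * c_tilde
    # Output gate, then the hidden state h[t] passed to the next layer.
    o = sigmoid(w_o * x_t + u_o * h_prev)
    h_t = o * math.tanh(c_t)
    return h_t, c_t

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_cell(x, h, c)
print(h, c)
```

The cell state `c` is the "constant" line in the diagram: it flows from step to step, the forget gate provides the means to forget (short), and what survives the gate can persist over many steps (long).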
8. LSTM Constant Values
• Examples of constant values in an LSTM memory cell:
• Time – how much time has passed.
• Location – what is my direction?
• Speed – what is my acceleration?
9. Prediction – Long Term Memory
Split Neural Network – Inputs are passed through two duplicated NNs.
[Diagram: split network — the inputs at St and the past prediction at St-1 feed two paths: a short-term path producing a prediction at St without memory, and a long-term path producing a prediction at St through a memory cell.]
Inputs are split across two neural networks run in parallel. In one, the prediction is made without memory (short term); in the second,
the prediction is made with memory (long term). A final prediction is made from the short-term and long-term predictions.
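The split described above can be sketched as two scalar paths plus a combining step. The weights and the blending rule (a simple weighted average) are illustrative assumptions, not details from the workshop.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def short_term(x, pred_prev, w=0.6, u=0.9):
    # Short-term path: depends only on the current input and the
    # last prediction, as in the earlier section.
    return math.tanh(w * x + u * pred_prev)

def long_term(x, h_prev, c_prev, w=0.5, u=0.3):
    # Long-term path: a simplified memory cell (all gates share the
    # same placeholder weights here) carrying a cell value c.
    gate = sigmoid(w * x + u * h_prev)
    c = gate * c_prev + gate * math.tanh(w * x + u * h_prev)
    h = gate * math.tanh(c)
    return h, c

def final_prediction(p_short, p_long, alpha=0.5):
    # One simple way to combine the two paths: a weighted average.
    return alpha * p_short + (1.0 - alpha) * p_long

p_s, h, c = 0.0, 0.0, 0.0
for x in [1.0, 0.0, -1.0]:
    p_s = short_term(x, p_s)       # prediction without memory
    h, c = long_term(x, h, c)      # prediction with memory
    p = final_prediction(p_s, h)   # final blended prediction
print(p)
```

The design choice the slide is making: the memoryless path reacts quickly to the newest input, the memory-cell path preserves older context, and the final stage gets the benefit of both.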