SlideShare uma empresa Scribd logo
1 de 26
Baixar para ler offline
Learning to learn by
gradient descent by
gradient descent
citation: 9 -> 38 Katy, 2016/11/25@DataLab
NIPS 2016
Background
• learn:
1. a task
2. training experience
3. a performance measure
• a computer program is said to learn if its
performance at the task improves with experience.
Mitchell [Mitchell, 1993]
Background
• learning to learn:
1. a family of tasks
2. training experience for each of these tasks
3. a family of performance measures
• an algorithm is said to learn to learn if its
performance at each task improves with
experience and with the number of tasks.
Thrun, Sebastian, and Lorien Pratt, eds. Learning to learn. Springer Science & Business Media, 2012.
Background
• Frequently, tasks in machine learning can be
expressed as the problem of optimizing an
objective function defined over some domain
• The goal is to find the minimizer
• the standard approach for differentiable functions
is some form of gradient descent, resulting in a
sequence of updates
Motivation
• Most of the modern work is based around
designing update rules for specific classes of
problems, it might perform poorly on other class of
problems
Motivation
• In this work we take a different tack and instead
propose to replace hand-designed update rules
with a learned update rule
Outline
• Related work
• Main idea
• Evaluation
• Conclusion
Outline
• Related work
• Main idea
• Evaluation
• Conclusion
Related Work
• C. Daniel, J. Taylor, and S. Nowozin. Learning step
size controllers for robust neural network training. In
Association for the Advancement of Artificial
Intelligence, 2016.
Outline
• Related work or naïve methods
• Main idea
• Evaluation
• Conclusion
Learning to learn with RNN
• In this work, they proposed to replace hand-
designed update rules with a learned update rule,
which we called the optimizer(a LSTM) m, with its
own parameter
• This results in updates to the optimizee f of the form
φ
gt is the output of LSTM
How to train the optimizer
• For training the optimizer, we have an objective that
depends on the trajectory for a time horizon T
• θ the optimizee parameters
• ϕ: the optimizer parameters
• f: the function in question
m is the LSTM
Intuition of Trajectory
old trajectory
trajectory with new φ
Challenge
• too many parameters in LSTM
• solution?
Coordinatewise LSTM
Optimizer
gt
Information Sharing
Between Coordinates
• global average cells(GAC) designate a subset of
the cells in each LSTM layer for communication.
their outgoing activations are averages at each
step across all coordinates.
• allowing different LSTMs to communicate with each
other
Outline
• Related work
• Main idea
• Evaluation
• Conclusion
Experiment 1
Experiment 2: change
structure
Experiment 3: systematically
changing NN architecture
LSTM train on one-hiddent-layer 20-units NN
Experiment 4 on covnet on
CIFAR-10
LSTM-sub: train on only hold out dataset
Experiment 5 on Neural Style, optimizer
train on only one style and 1800 content
image from imageNet
Outline
• Related work or naïve methods
• Main idea
• Evaluation
• Conclusion
Conclusion
• So far the learning process is handcraft, but this
work shows how to train a NN by a NN
• generalize well on different architecture but not on
different activation function
• execution time?
• sometimes, when you are confused for long, try to
email the author(all of them). A typo can kill you.

Mais conteúdo relacionado

Mais procurados

Deep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksDeep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksSangwoo Mo
 
Bayesian Model-Agnostic Meta-Learning
Bayesian Model-Agnostic Meta-LearningBayesian Model-Agnostic Meta-Learning
Bayesian Model-Agnostic Meta-LearningSangwoo Mo
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural networkSopheaktra YONG
 
Understanding cnn
Understanding cnnUnderstanding cnn
Understanding cnnRucha Gole
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applicationsSungjoon Choi
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Seonho Park
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based ClusteringSSA KPI
 
Interpretable machine learning : Methods for understanding complex models
Interpretable machine learning : Methods for understanding complex modelsInterpretable machine learning : Methods for understanding complex models
Interpretable machine learning : Methods for understanding complex modelsManojit Nandi
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksChristian Perone
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Video Transformers.pptx
Video Transformers.pptxVideo Transformers.pptx
Video Transformers.pptxSangmin Woo
 
Inference in Bayesian Networks
Inference in Bayesian NetworksInference in Bayesian Networks
Inference in Bayesian Networksguestfee8698
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsMathias Niepert
 
Denoising autoencoder by Harish.R
Denoising autoencoder by Harish.RDenoising autoencoder by Harish.R
Denoising autoencoder by Harish.RHARISH R
 
Introduction to Visual transformers
Introduction to Visual transformers Introduction to Visual transformers
Introduction to Visual transformers leopauly
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 

Mais procurados (20)

Deep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural NetworksDeep Implicit Layers: Learning Structured Problems with Neural Networks
Deep Implicit Layers: Learning Structured Problems with Neural Networks
 
Bayesian Model-Agnostic Meta-Learning
Bayesian Model-Agnostic Meta-LearningBayesian Model-Agnostic Meta-Learning
Bayesian Model-Agnostic Meta-Learning
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
Image Segmentation (D3L1 2017 UPC Deep Learning for Computer Vision)
 
Feedforward neural network
Feedforward neural networkFeedforward neural network
Feedforward neural network
 
Understanding cnn
Understanding cnnUnderstanding cnn
Understanding cnn
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
Density Based Clustering
Density Based ClusteringDensity Based Clustering
Density Based Clustering
 
Interpretable machine learning : Methods for understanding complex models
Interpretable machine learning : Methods for understanding complex modelsInterpretable machine learning : Methods for understanding complex models
Interpretable machine learning : Methods for understanding complex models
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 
Video Transformers.pptx
Video Transformers.pptxVideo Transformers.pptx
Video Transformers.pptx
 
Inference in Bayesian Networks
Inference in Bayesian NetworksInference in Bayesian Networks
Inference in Bayesian Networks
 
Learning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for GraphsLearning Convolutional Neural Networks for Graphs
Learning Convolutional Neural Networks for Graphs
 
Autoencoders
AutoencodersAutoencoders
Autoencoders
 
Denoising autoencoder by Harish.R
Denoising autoencoder by Harish.RDenoising autoencoder by Harish.R
Denoising autoencoder by Harish.R
 
Introduction to Visual transformers
Introduction to Visual transformers Introduction to Visual transformers
Introduction to Visual transformers
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 

Semelhante a Learning to Learn by Gradient Descent by Gradient Descent

Optimization as a model for few shot learning
Optimization as a model for few shot learningOptimization as a model for few shot learning
Optimization as a model for few shot learningKaty Lee
 
AI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxAI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxMohammadAsim91
 
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...Presentation File of paper "Leveraging Normalization Layer in Adapters With P...
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...dyyjkd
 
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...Jeromy Anglim
 
Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.Wuhyun Rico Shin
 
Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learningDongHyun Kwak
 
Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyNUPUR YADAV
 
Online Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation SystemsOnline Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation SystemsViral Gupta
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningNAVER Engineering
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement LearningDongHyun Kwak
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
Introduction to cyclical learning rates for training neural nets
Introduction to cyclical learning rates for training neural netsIntroduction to cyclical learning rates for training neural nets
Introduction to cyclical learning rates for training neural netsSayak Paul
 
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...ssuser4b1f48
 
ELLA LC algorithm presentation in ICIP 2016
ELLA LC algorithm presentation in ICIP 2016ELLA LC algorithm presentation in ICIP 2016
ELLA LC algorithm presentation in ICIP 2016InVID Project
 
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNINGMLReview
 
human computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppthuman computer Interaction cognitive models.ppt
human computer Interaction cognitive models.pptJayaprasanna4
 
human computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppthuman computer Interaction cognitive models.ppt
human computer Interaction cognitive models.pptJayaprasanna4
 
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsIntroduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsJoonyoung Yi
 

Semelhante a Learning to Learn by Gradient Descent by Gradient Descent (20)

Optimization as a model for few shot learning
Optimization as a model for few shot learningOptimization as a model for few shot learning
Optimization as a model for few shot learning
 
AI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxAI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptx
 
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...Presentation File of paper "Leveraging Normalization Layer in Adapters With P...
Presentation File of paper "Leveraging Normalization Layer in Adapters With P...
 
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...
Presentation based on "Hierarchical Bayesian Models of Subtask Learning. Angl...
 
Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.Paper review: Learned Optimizers that Scale and Generalize.
Paper review: Learned Optimizers that Scale and Generalize.
 
Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learning
 
Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A Survey
 
Online Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation SystemsOnline Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation Systems
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
Learning how to learn
Learning how to learnLearning how to learn
Learning how to learn
 
Introduction to cyclical learning rates for training neural nets
Introduction to cyclical learning rates for training neural netsIntroduction to cyclical learning rates for training neural nets
Introduction to cyclical learning rates for training neural nets
 
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...
NS-CUK Seminar: J.H.Lee, Review on "Task Relation-aware Continual User Repres...
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
ELLA LC algorithm presentation in ICIP 2016
ELLA LC algorithm presentation in ICIP 2016ELLA LC algorithm presentation in ICIP 2016
ELLA LC algorithm presentation in ICIP 2016
 
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
OPTIMIZATION AS A MODEL FOR FEW-SHOT LEARNING
 
human computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppthuman computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppt
 
human computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppthuman computer Interaction cognitive models.ppt
human computer Interaction cognitive models.ppt
 
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with DiscussionsIntroduction to MAML (Model Agnostic Meta Learning) with Discussions
Introduction to MAML (Model Agnostic Meta Learning) with Discussions
 

Mais de Katy Lee

ICML 2017 Meta network
ICML 2017 Meta networkICML 2017 Meta network
ICML 2017 Meta networkKaty Lee
 
Technical interview experience sharing
Technical interview experience sharingTechnical interview experience sharing
Technical interview experience sharingKaty Lee
 
Overcoming catastrophic forgetting in neural network
Overcoming catastrophic forgetting in neural networkOvercoming catastrophic forgetting in neural network
Overcoming catastrophic forgetting in neural networkKaty Lee
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural networkKaty Lee
 
Making neural programming architectures generalize via recursion
Making neural programming architectures generalize via recursionMaking neural programming architectures generalize via recursion
Making neural programming architectures generalize via recursionKaty Lee
 
FinalReport
FinalReportFinalReport
FinalReportKaty Lee
 
Neural_Programmer_Interpreter
Neural_Programmer_InterpreterNeural_Programmer_Interpreter
Neural_Programmer_InterpreterKaty Lee
 

Mais de Katy Lee (7)

ICML 2017 Meta network
ICML 2017 Meta networkICML 2017 Meta network
ICML 2017 Meta network
 
Technical interview experience sharing
Technical interview experience sharingTechnical interview experience sharing
Technical interview experience sharing
 
Overcoming catastrophic forgetting in neural network
Overcoming catastrophic forgetting in neural networkOvercoming catastrophic forgetting in neural network
Overcoming catastrophic forgetting in neural network
 
Meta learning with memory augmented neural network
Meta learning with memory augmented neural networkMeta learning with memory augmented neural network
Meta learning with memory augmented neural network
 
Making neural programming architectures generalize via recursion
Making neural programming architectures generalize via recursionMaking neural programming architectures generalize via recursion
Making neural programming architectures generalize via recursion
 
FinalReport
FinalReportFinalReport
FinalReport
 
Neural_Programmer_Interpreter
Neural_Programmer_InterpreterNeural_Programmer_Interpreter
Neural_Programmer_Interpreter
 

Último

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Último (20)

What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Learning to Learn by Gradient Descent by Gradient Descent

  • 1. Learning to learn by gradient descent by gradient descent citation: 9 -> 38 Katy, 2016/11/25@DataLab NIPS 2016
  • 2. Background • learn: 1. a task 2. training experience 3. a performance measure • a computer program is said to learn if its performance at the task improves with experience. Mitchell [Mitchell, 1993]
  • 3. Background • learning to learn: 1. a family of tasks 2. training experience for each of these tasks 3. a family of performance measures • an algorithm is said to learn to learn if its performance at each task improves with experience and with the number of tasks. Thrun, Sebastian, and Lorien Pratt, eds. Learning to learn. Springer Science & Business Media, 2012.
  • 4. Background • Frequently, tasks in machine learning can be expressed as the problem of optimizing an objective function defined over some domain • The goal is to find the minimizer • the standard approach for differentiable functions is some form of gradient descent, resulting in a sequence of updates
  • 5. Motivation • Most of the modern work is based around designing update rules for specific classes of problems, it might perform poorly on other class of problems
  • 6. Motivation • In this work we take a different tack and instead propose to replace hand-designed update rules with a learned update rule
  • 7. Outline • Related work • Main idea • Evaluation • Conclusion
  • 8. Outline • Related work • Main idea • Evaluation • Conclusion
  • 9. Related Work • C. Daniel, J. Taylor, and S. Nowozin. Learning step size controllers for robust neural network training. In Association for the Advancement of Artificial Intelligence, 2016.
  • 10. Outline • Related work or naïve methods • Main idea • Evaluation • Conclusion
  • 11. Learning to learn with RNN
  • 12. • In this work, they proposed to replace hand- designed update rules with a learned update rule, which we called the optimizer(a LSTM) m, with its own parameter • This results in updates to the optimizee f of the form φ gt is the output of LSTM
  • 13. How to train the optimizer • For training the optimizer, we have an objective that depends on the trajectory for a time horizon T • θ the optimizee parameters • ϕ: the optimizer parameters • f: the function in question m is the LSTM
  • 14. Intuition of Trajectory old trajectory trajectory with new φ
  • 15.
  • 16. Challenge • too many parameters in LSTM • solution?
  • 18. Information Sharing Between Coordinates • global average cells(GAC) designate a subset of the cells in each LSTM layer for communication. their outgoing activations are averages at each step across all coordinates. • allowing different LSTMs to communicate with each other
  • 19. Outline • Related work • Main idea • Evaluation • Conclusion
  • 22. Experiment 3: systematically changing NN architecture LSTM train on one-hiddent-layer 20-units NN
  • 23. Experiment 4 on covnet on CIFAR-10 LSTM-sub: train on only hold out dataset
  • 24. Experiment 5 on Neural Style, optimizer train on only one style and 1800 content image from imageNet
  • 25. Outline • Related work or naïve methods • Main idea • Evaluation • Conclusion
  • 26. Conclusion • So far the learning process is handcraft, but this work shows how to train a NN by a NN • generalize well on different architecture but not on different activation function • execution time? • sometimes, when you are confused for long, try to email the author(all of them). A typo can kill you.