Moving Towards Deep Learning Algorithms on HPCC Systems
Maryam M. Najafabadi
Overview
• L-BFGS
• HPCC Systems
• Implementation of L-BFGS on HPCC Systems
• SoftMax
• Sparse Autoencoder
• Toward Deep Learning
Mathematical optimization
• Minimizing/Maximizing a function
Minimum
Optimization Algorithms in Machine Learning
• Linear Regression: minimize errors
• SVM: maximize margin
• Collaborative filtering
• K-means
• Maximum likelihood estimation
• Graphical models
• Neural Networks
• Deep Learning
Formulate Training as an Optimization Problem
• Training model: finding parameters that minimize some objective function
• Define parameters
• Define an objective function (cost term + regularization term)
• Use an optimization algorithm to find values for the parameters that minimize the objective function
How they work
• Search direction
• Step length
Gradient Descent
• Step length: constant value
• Search direction: negative gradient
• Small step length: many iterations are needed to reach the minimum
• Large step length: the iterates can overshoot the minimum
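The effect of the step length can be tried on a toy problem. The sketch below is only an illustration: the objective f(x) = x**2, the starting point, and the three step lengths are assumptions chosen for the example, not values from the talk.

```python
def gradient_descent(grad, x0, step_length, iterations):
    """Constant-step gradient descent on a one-dimensional function."""
    x = x0
    for _ in range(iterations):
        # Search direction is the negative gradient; the step length is fixed.
        x = x - step_length * grad(x)
    return x


def grad_f(x):
    # Gradient of the toy objective f(x) = x**2, whose minimum is at x = 0.
    return 2.0 * x


# Hypothetical step lengths, picked only to show the three regimes.
small = gradient_descent(grad_f, x0=1.0, step_length=0.01, iterations=50)
moderate = gradient_descent(grad_f, x0=1.0, step_length=0.4, iterations=50)
large = gradient_descent(grad_f, x0=1.0, step_length=1.1, iterations=50)
```

With the small step the iterate is still far from 0 after 50 iterations, with the moderate step it is essentially at 0, and with the large step it moves away from the minimum on every iteration.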
Newton Methods
• Step length: use a line search
• Search direction: use curvature information (inverse of Hessian matrix)
Quasi Newton Methods
• Problem with large n in Newton methods: calculating the inverse of the Hessian matrix is too expensive
• Quasi-Newton methods instead update an approximation of the inverse of the Hessian matrix in each iteration
BFGS
• Broyden, Fletcher, Goldfarb, and Shanno
• Most popular Quasi Newton Method
• Uses Wolfe line search to find step length
• Needs to keep n×n matrix in memory
L-BFGS
• Limited-memory: only a few vectors of length n (m×n instead of n×n)
• m << n
• Useful for solving large problems (large n)
• More stable learning
• Uses curvature information to take a more direct route, giving faster convergence
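A common way to get a search direction from only m stored vector pairs is the two-loop recursion. The sketch below is a plain-Python illustration of that textbook procedure, not the ECL implementation from the talk; the function and argument names are assumptions made for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lbfgs_direction(grad, s_history, y_history):
    """Two-loop recursion: approximate -(inverse Hessian) * grad.

    s_history[i] = x_{i+1} - x_i and y_history[i] = grad_{i+1} - grad_i,
    oldest pair first. Only these m pairs of length-n vectors are stored,
    never an n x n matrix.
    """
    q = list(grad)
    coefficients = []
    # First loop: newest pair to oldest.
    for s, y in zip(reversed(s_history), reversed(y_history)):
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        coefficients.append((rho, alpha))
    # Initial inverse-Hessian guess: a scaled identity matrix.
    if s_history:
        gamma = dot(s_history[-1], y_history[-1]) / dot(y_history[-1], y_history[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest.
    pairs = zip(s_history, y_history)
    for (s, y), (rho, alpha) in zip(pairs, reversed(coefficients)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]
```

With an empty history the function falls back to the negative gradient, so the first iteration behaves like gradient descent.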
How to use
• Define a function that calculates the objective value and the gradient:
ObjectiveFunc(x, ObjectiveFunc_params, TrainData, TrainLabel)
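As an illustration of such a function, the sketch below returns both the objective value and the gradient for L2-regularized least squares. It mirrors the argument order of ObjectiveFunc, but it is written in Python rather than ECL, and the choice of model, the "lambda" key, and all names are assumptions made for the example.

```python
def objective_func(x, params, train_data, train_label):
    """Return (objective value, gradient) for L2-regularized least squares.

    x           -- current parameter vector (one weight per feature)
    params      -- dict holding the regularization strength under "lambda"
    train_data  -- list of feature vectors
    train_label -- list of target values
    """
    lam = params["lambda"]
    m = len(train_data)
    cost = 0.0
    grad = [0.0] * len(x)
    for features, target in zip(train_data, train_label):
        error = sum(w * f for w, f in zip(x, features)) - target
        # Cost term: mean of half squared errors.
        cost += 0.5 * error * error / m
        for j, f in enumerate(features):
            grad[j] += error * f / m
    # Regularization term and its gradient.
    cost += 0.5 * lam * sum(w * w for w in x)
    grad = [g + lam * w for g, w in zip(grad, x)]
    return cost, grad
```

An optimizer only needs to call this function repeatedly at different points x; it does not need to know which model the function describes.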
Why L-BFGS?
• Toward Deep Learning
• Optimization is at the heart of DL and many other ML algorithms
• Popular
• Advantages over SGD
HPCC Systems
• Open source, massively parallel-processing computing platform for big data processing and analytics
• LexisNexis Risk Solutions
• Uses commodity clusters of hardware running on top of the Linux operating system
• Based on DataFlow programming model
• THOR-ROXIE
• ECL
THOR
RECORD base
DataFlow Analysis
• Main focus is how the data is being changed
• A Graph represents a transformation on the data
• Each node is an operation
• Edges show the data flow
A DataFlow example
Input (Id, value): (1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9)
Sorted by Id: (1, 2), (1, 3), (1, 10), (2, 5), (2, 9), (3, 4)
MAX value per Id: (1, 10), (2, 9), (3, 4)
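The flow in this example (sort, group, aggregate) can be sketched in a few lines. This is a Python illustration of the data transformations only, not ECL code; the variable names are assumptions.

```python
from itertools import groupby
from operator import itemgetter

# (Id, value) records in their original order.
records = [(1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9)]

# Step 1: sort by Id so that records with equal Ids become adjacent.
sorted_records = sorted(records, key=itemgetter(0))

# Steps 2 and 3: group adjacent records by Id, then take MAX(value) per group.
max_per_id = [
    (key, max(value for _, value in group))
    for key, group in groupby(sorted_records, key=itemgetter(0))
]
```

Each statement corresponds to one operation in the graph, and the data passed from one statement to the next corresponds to an edge.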
ECL
• Enterprise Control Language
• Compiled into optimized C++ code
• Declarative language that provides parallel and distributed DataFlow-oriented processing
Declarative
• What to accomplish, rather than How to accomplish
• You’re describing what you’re trying to achieve, without instructing how to do it
DataFlow example on two nodes (Node 1, Node 2); "|" separates the two partitions
Input (Id, value): (1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9)
READ: (1, 2), (1, 3), (3, 4), (1, 10) | (2, 5), (2, 9)
LOCAL SORT: (1, 2), (1, 3), (1, 10), (3, 4) | (2, 5), (2, 9)
LOCAL GROUP: (1, 2), (1, 3), (1, 10), (3, 4) | (2, 5), (2, 9)
LOCAL AGG/MAX: (1, 10), (3, 4) | (2, 9)
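The LOCAL steps can be imitated by running the same sort, group, and MAX on each partition independently. The sketch below is a Python illustration, not ECL; the helper name and the list-of-lists layout for the partitions are assumptions.

```python
from itertools import groupby
from operator import itemgetter


def local_max_per_id(partition):
    """Sort, group and aggregate one partition without touching the others."""
    ordered = sorted(partition, key=itemgetter(0))
    return [
        (key, max(value for _, value in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


# Each inner list is the data held by one node after READ.
partitions = [
    [(1, 2), (1, 3), (3, 4), (1, 10)],
    [(2, 5), (2, 9)],
]

# Every node works on its own partition only; no data moves between nodes.
local_results = [local_max_per_id(p) for p in partitions]
```

In this data all records that share an Id sit in the same partition, which is why the local results together already form the final answer.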
Back to L-BFGS
• Minimize $f(x)$
• Start with an initial point $x_0$
• Repeatedly update: $x_{k+1} = x_k + \alpha_k p_k$
• Step length $\alpha_k$ comes from a Wolfe line search; search direction $p_k$ comes from L-BFGS
• If x is too large, it does not fit in the memory of one machine
• Needs m × n memory
• Distribute x on different machines
• Try to do computations locally
• Do global computations as necessary
• Dot Product
First vector: 1, 3, 6, 8, 10, 9, 1, 2, 3, 9, 8
Second vector: 2, 3, 3, 8, 3, 11, 1, 2, 5, 9, 5
Both vectors are partitioned the same way across three nodes:
Node 1: 1, 3, 6, 8 and 2, 3, 3, 8
Node 2: 10, 9, 1, 2 and 3, 11, 1, 2
Node 3: 3, 9, 8 and 5, 9, 5
LOCAL Dot Product: 93 (Node 1), 134 (Node 2), 136 (Node 3)
Global Summation: 363
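The local-then-global pattern for the dot product can be sketched as follows. This is a Python illustration with the three partitions from the example; the function and variable names are assumptions.

```python
def local_dot(a_part, b_part):
    """Dot product of the two partitions that live on the same node."""
    return sum(a * b for a, b in zip(a_part, b_part))


# One entry per node; both vectors are split at the same positions.
a_parts = [[1, 3, 6, 8], [10, 9, 1, 2], [3, 9, 8]]
b_parts = [[2, 3, 3, 8], [3, 11, 1, 2], [5, 9, 5]]

# Local step: each node needs only its own partitions.
local_results = [local_dot(a, b) for a, b in zip(a_parts, b_parts)]

# Global step: a single number per node is sent and summed.
total = sum(local_results)
```

Only one scalar per node crosses the network, no matter how long the vectors are.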
Using ECL for implementing L-BFGS
x = 0.1, 0.3, 0.6, 0.8, 0.2, 0.7, 0.5, 0.5, 0.5, 0.3, 0.4, 0.6, 0.7, 0.7
x is stored as one record per node (Node 1 to Node 4):
Node_id | partition_values
1 | 0.1, 0.3, 0.6, 0.8
2 | 0.2, 0.7, 0.5, 0.5
3 | 0.5, 0.3, 0.4, 0.6
4 | 0.7, 0.7
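One way to produce such (Node_id, partition_values) records is to cut the vector into contiguous chunks. The sketch below is a Python illustration; the contiguous-chunk rule and the function name are assumptions, chosen because they reproduce the layout shown for x.

```python
def partition(values, num_nodes):
    """Split a vector into contiguous chunks, one (node_id, chunk) record per node."""
    chunk = -(-len(values) // num_nodes)  # ceiling division
    return [
        (node_id + 1, values[node_id * chunk:(node_id + 1) * chunk])
        for node_id in range(num_nodes)
    ]


x = [0.1, 0.3, 0.6, 0.8, 0.2, 0.7, 0.5, 0.5, 0.5, 0.3, 0.4, 0.6, 0.7, 0.7]
records = partition(x, 4)
```

The last node receives the two leftover values, as in the table for x.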
Example of LOCAL operations
• Scale: each node multiplies its own partition of x by 10, giving x_10
Node_id | partition_values (x) | partition_values (x_10)
1 | 0.1, 0.3, 0.6, 0.8 | 1, 3, 6, 8
2 | 0.2, 0.7, 0.5, 0.5 | 2, 7, 5, 5
3 | 0.5, 0.3, 0.4, 0.6 | 5, 3, 4, 6
4 | 0.7, 0.7 | 7, 7
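Scaling is a purely local operation because each record can be transformed without looking at any other record. The sketch below is a Python illustration; the function name and the tuple layout of a record are assumptions.

```python
def scale_local(record, factor):
    """Scale the values of one (node_id, partition_values) record."""
    node_id, values = record
    return (node_id, [factor * v for v in values])


x_records = [
    (1, [0.1, 0.3, 0.6, 0.8]),
    (2, [0.2, 0.7, 0.5, 0.5]),
    (3, [0.5, 0.3, 0.4, 0.6]),
    (4, [0.7, 0.7]),
]

# Each record is scaled on its own node; the Node_id column is unchanged.
x_10 = [scale_local(record, 10) for record in x_records]
```

The output keeps the same partitioning as the input, so later operations on x and x_10 can pair records by Node_id.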
Example of Global operation
• Dot Product
Inputs x and x_10, partitioned by Node_id:
Node_id | partition_values (x) | partition_values (x_10)
1 | 0.1, 0.3, 0.6, 0.8 | 1, 3, 6, 8
2 | 0.2, 0.7, 0.5, 0.5 | 2, 7, 5, 5
3 | 0.5, 0.3, 0.4, 0.6 | 5, 3, 4, 6
4 | 0.7, 0.7 | 7, 7
dot_local, one dot_value per node:
Node_id | dot_value
1 | 2.27
2 | 1.39
3 | 1.67
4 | 0.98
L-BFGS based Implementations
• Softmax
• Sparse Autoencoder
SoftMax Regression
• Generalizes logistic regression
• More than two classes
• MNIST -> 10 different classes
Formulate as an optimization problem
• Parameters
• K × f variables
• Objective function
• Generalize logistic regression objective function
• Define a function to calculate objective value and gradient at a given point
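A function of this kind for SoftMax could look like the sketch below. It is a plain-Python illustration that uses the average negative log-likelihood without a regularization term; that choice of objective, the argument layout, and all names are assumptions, not the ECL code from the talk.

```python
import math


def softmax_objective(theta, train_data, train_label):
    """Return (cost, gradient) of the average negative log-likelihood.

    theta       -- K rows of f weights each (the K x f parameter matrix)
    train_data  -- m samples, each a list of f features
    train_label -- m class indices in range(K)
    """
    num_classes, num_features = len(theta), len(theta[0])
    m = len(train_data)
    cost = 0.0
    grad = [[0.0] * num_features for _ in range(num_classes)]
    for features, label in zip(train_data, train_label):
        scores = [sum(w * f for w, f in zip(row, features)) for row in theta]
        # Subtract the largest score before exponentiating for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        cost -= math.log(probs[label]) / m
        for k in range(num_classes):
            indicator = 1.0 if k == label else 0.0
            for j, f in enumerate(features):
                grad[k][j] += (probs[k] - indicator) * f / m
    return cost, grad
```

The gradient has the same K × f shape as the parameter matrix, so an optimizer can treat both as flat vectors of K × f variables.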
SoftMax Results
• Cluster: 400 nodes
• Lshtc-large: 410 GB; 61 iterations, 81 function evaluations; 1 hour
• Wikipedia-medium: 1,048 GB; 12 iterations, 21 function evaluations; half an hour
More Examples
• Parameter matrix in SoftMax: K × f
• Data Matrix: f × m
• Multiply these two matrices
• Result is K × m
If parameter matrix is small
• Every node (Node1, Node2, Node3) works with the full K × f parameter matrix
• The f × m data matrix is split by columns into f × m1, f × m2, f × m3, one block per node
• A LOCAL JOIN on each node produces K×m1, K×m2, K×m3
• Side by side, the three blocks form the K×m result
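The small-parameter-matrix case can be sketched as follows: the parameter matrix is available on every node, each node multiplies it by its own column block of the data, and the blocks are placed side by side. This is a Python illustration with nested lists; the function names and the tiny matrices in the usage lines are assumptions.

```python
def matmul(a, b):
    """Plain matrix product of a (rows x inner) and b (inner x cols)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def multiply_by_column_blocks(params, data_blocks):
    """Multiply a replicated K x f matrix by per-node f x m_i column blocks."""
    # Local step: one K x m_i product per node.
    local_results = [matmul(params, block) for block in data_blocks]
    # Concatenate the K x m_i blocks side by side into the K x m result.
    return [
        [value for block in local_results for value in block[k]]
        for k in range(len(params))
    ]


params = [[1, 2], [3, 4]]                        # K = 2, f = 2
data_blocks = [[[1, 0], [0, 1]], [[2], [3]]]     # f x m1 and f x m2
result = multiply_by_column_blocks(params, data_blocks)
```

No summation across nodes is needed here, because every output column depends on only one data column.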
If both matrices big
• Partition the K × f parameter matrix by columns into K × f1, K × f2, K × f3
• Partition the f × m data matrix by rows into f1 × m, f2 × m, f3 × m
• Each matching pair of blocks produces a partial K×m result
• ROLLUP adds the three partial K×m results into the final K×m result
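When both matrices are split along the shared dimension f, every pair of blocks yields a full-size partial product, and the partial products have to be added. The sketch below is a Python illustration with nested lists; the function names and the tiny matrices in the usage lines are assumptions, and the final loop only plays the role that ROLLUP has in the example.

```python
def matmul(a, b):
    """Plain matrix product of a (rows x inner) and b (inner x cols)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def matadd(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def multiply_by_inner_blocks(param_blocks, data_blocks):
    """param_blocks[i] is K x f_i and data_blocks[i] is f_i x m."""
    # Local step: each pair of blocks gives a partial K x m matrix.
    partials = [matmul(p, d) for p, d in zip(param_blocks, data_blocks)]
    # Combine step: add the partial matrices element by element.
    result = partials[0]
    for partial in partials[1:]:
        result = matadd(result, partial)
    return result


param_blocks = [[[1, 2], [4, 5]], [[3], [6]]]    # K x f1 and K x f2
data_blocks = [[[1, 0], [0, 1]], [[1, 1]]]       # f1 x m and f2 x m
result = multiply_by_inner_blocks(param_blocks, data_blocks)
```

Compared with the column-block approach, this one moves more data in the combine step, since every node contributes a full K × m matrix.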
Sparse Autoencoder
• Autoencoder
• Output is the same as the input
• Sparsity
• Constrain the hidden neurons to be inactive most of the time
• Stacking them up makes a Deep Network
Formulate as an optimization problem
• Parameters
• Weight and bias values
• Objective function
• Difference between output and expected output
• Penalty term to impose sparsity
• Define a function to calculate objective value and gradient at a given point
Sparse Autoencoder results
• 10,000 randomly selected 8×8 patches
• MNIST dataset
Toward Deep Learning
• Provide learned features from one layer to another sparse autoencoder
• ... and stack them up to build a deep network
• Fine tuning
• Use forward propagation to calculate the cost value and back propagation to calculate gradients
• Use L-BFGS to fine tune
SUMMARY
• HPCC Systems allows implementation of large-scale ML algorithms
• Optimization algorithms are an important aspect of advanced machine learning problems
• L-BFGS implemented on HPCC Systems
• SoftMax
• Sparse Autoencoder
• Implement other algorithms by calculating objective value and gradient
• Toward deep learning
• HPCC Systems
• https://hpccsystems.com/
• ECL-ML Library
• https://github.com/hpcc-systems/ecl-ml
• My GitHub
• https://github.com/maryamregister
• My Email
• mmousaarabna2013@fau.edu
Questions?
Thank You
80

Mais conteúdo relacionado

Mais procurados

Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
Turi, Inc.
 
DIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe WorkshopDIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe Workshop
odsc
 

Mais procurados (20)

Language translation with Deep Learning (RNN) with TensorFlow
Language translation with Deep Learning (RNN) with TensorFlowLanguage translation with Deep Learning (RNN) with TensorFlow
Language translation with Deep Learning (RNN) with TensorFlow
 
Urs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksUrs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural Networks
 
The deep learning tour - Q1 2017
The deep learning tour - Q1 2017 The deep learning tour - Q1 2017
The deep learning tour - Q1 2017
 
Deep Learning for Robotics
Deep Learning for RoboticsDeep Learning for Robotics
Deep Learning for Robotics
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNet
 
ODSC West
ODSC WestODSC West
ODSC West
 
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflowNVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
NVIDIA 深度學習教育機構 (DLI): Image segmentation with tensorflow
 
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural NetsPython for Image Understanding: Deep Learning with Convolutional Neural Nets
Python for Image Understanding: Deep Learning with Convolutional Neural Nets
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile Phones
 
Using neon for pattern recognition in audio data
Using neon for pattern recognition in audio dataUsing neon for pattern recognition in audio data
Using neon for pattern recognition in audio data
 
Deep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep FeaturesDeep Learning Made Easy with Deep Features
Deep Learning Made Easy with Deep Features
 
Introduction to Neural Networks in Tensorflow
Introduction to Neural Networks in TensorflowIntroduction to Neural Networks in Tensorflow
Introduction to Neural Networks in Tensorflow
 
[246]reasoning, attention and memory toward differentiable reasoning machines
[246]reasoning, attention and memory   toward differentiable reasoning machines[246]reasoning, attention and memory   toward differentiable reasoning machines
[246]reasoning, attention and memory toward differentiable reasoning machines
 
Introduction to Deep Learning with Will Constable
Introduction to Deep Learning with Will ConstableIntroduction to Deep Learning with Will Constable
Introduction to Deep Learning with Will Constable
 
Intro to Scalable Deep Learning on AWS with Apache MXNet
Intro to Scalable Deep Learning on AWS with Apache MXNetIntro to Scalable Deep Learning on AWS with Apache MXNet
Intro to Scalable Deep Learning on AWS with Apache MXNet
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at Scale
 
DIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe WorkshopDIY Deep Learning with Caffe Workshop
DIY Deep Learning with Caffe Workshop
 
Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)Diving into Deep Learning (Silicon Valley Code Camp 2017)
Diving into Deep Learning (Silicon Valley Code Camp 2017)
 
Improving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsImproving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN Applications
 
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr..."Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
"Trade-offs in Implementing Deep Neural Networks on FPGAs," a Presentation fr...
 

Semelhante a Moving Toward Deep Learning Algorithms on HPCC Systems

Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
HPCC Systems
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
DataStax
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
Edge AI and Vision Alliance
 

Semelhante a Moving Toward Deep Learning Algorithms on HPCC Systems (20)

04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers04 accelerating dl inference with (open)capi and posit numbers
04 accelerating dl inference with (open)capi and posit numbers
 
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
[db analytics showcase Sapporo 2018] B33 H2O4GPU and GoAI: harnessing the pow...
 
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim HunterWeb-Scale Graph Analytics with Apache Spark with Tim Hunter
Web-Scale Graph Analytics with Apache Spark with Tim Hunter
 
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix SchemesOptimizing Set-Similarity Join and Search with Different Prefix Schemes
Optimizing Set-Similarity Join and Search with Different Prefix Schemes
 
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
Cassandra Tools and Distributed Administration (Jeffrey Berger, Knewton) | C*...
 
Feature Engineering
Feature EngineeringFeature Engineering
Feature Engineering
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
 
Naist2015 dec ver1
Naist2015 dec ver1Naist2015 dec ver1
Naist2015 dec ver1
 
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache ApexApache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
Apache Big Data 2016: Next Gen Big Data Analytics with Apache Apex
 
POLARDB for MySQL - Parallel Query
POLARDB for MySQL - Parallel QueryPOLARDB for MySQL - Parallel Query
POLARDB for MySQL - Parallel Query
 
Sista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performanceSista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performance
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
PraveenBOUT++
PraveenBOUT++PraveenBOUT++
PraveenBOUT++
 
SQL Server Deep Drive
SQL Server Deep Drive SQL Server Deep Drive
SQL Server Deep Drive
 
BigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache ApexBigDataSpain 2016: Introduction to Apache Apex
BigDataSpain 2016: Introduction to Apache Apex
 
Dream3D and its Extension to Abaqus Input Files
Dream3D and its Extension to Abaqus Input FilesDream3D and its Extension to Abaqus Input Files
Dream3D and its Extension to Abaqus Input Files
 
Libra : A Compatible Method for Defending Against Arbitrary Memory Overwrite
Libra : A Compatible Method for Defending Against Arbitrary Memory OverwriteLibra : A Compatible Method for Defending Against Arbitrary Memory Overwrite
Libra : A Compatible Method for Defending Against Arbitrary Memory Overwrite
 
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui MengChallenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
Challenging Web-Scale Graph Analytics with Apache Spark with Xiangrui Meng
 
Challenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache SparkChallenging Web-Scale Graph Analytics with Apache Spark
Challenging Web-Scale Graph Analytics with Apache Spark
 
13 risc
13 risc13 risc
13 risc
 

Mais de HPCC Systems

Leveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsLeveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC Systems
HPCC Systems
 

Mais de HPCC Systems (20)

Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...Natural Language to SQL Query conversion using Machine Learning Techniques on...
Natural Language to SQL Query conversion using Machine Learning Techniques on...
 
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
 
Towards Trustable AI for Complex Systems
Towards Trustable AI for Complex SystemsTowards Trustable AI for Complex Systems
Towards Trustable AI for Complex Systems
 
Welcome
WelcomeWelcome
Welcome
 
Closing / Adjourn
Closing / Adjourn Closing / Adjourn
Closing / Adjourn
 
Community Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon CuttingCommunity Website: Virtual Ribbon Cutting
Community Website: Virtual Ribbon Cutting
 
Path to 8.0
Path to 8.0 Path to 8.0
Path to 8.0
 
Release Cycle Changes
Release Cycle ChangesRelease Cycle Changes
Release Cycle Changes
 
Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index Geohashing with Uber’s H3 Geospatial Index
Geohashing with Uber’s H3 Geospatial Index
 
Advancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine LearningAdvancements in HPCC Systems Machine Learning
Advancements in HPCC Systems Machine Learning
 
Docker Support
Docker Support Docker Support
Docker Support
 
Expanding HPCC Systems Deep Neural Network Capabilities
Expanding HPCC Systems Deep Neural Network CapabilitiesExpanding HPCC Systems Deep Neural Network Capabilities
Expanding HPCC Systems Deep Neural Network Capabilities
 
Leveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC SystemsLeveraging Intra-Node Parallelization in HPCC Systems
Leveraging Intra-Node Parallelization in HPCC Systems
 
DataPatterns - Profiling in ECL Watch
DataPatterns - Profiling in ECL Watch DataPatterns - Profiling in ECL Watch
DataPatterns - Profiling in ECL Watch
 
Leveraging the Spark-HPCC Ecosystem
Leveraging the Spark-HPCC Ecosystem Leveraging the Spark-HPCC Ecosystem
Leveraging the Spark-HPCC Ecosystem
 
Work Unit Analysis Tool
Work Unit Analysis ToolWork Unit Analysis Tool
Work Unit Analysis Tool
 
Community Award Ceremony
Community Award Ceremony Community Award Ceremony
Community Award Ceremony
 
Dapper Tool - A Bundle to Make your ECL Neater
Dapper Tool - A Bundle to Make your ECL NeaterDapper Tool - A Bundle to Make your ECL Neater
Dapper Tool - A Bundle to Make your ECL Neater
 
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...
A Success Story of Challenging the Status Quo: Gadget Girls and the Inclusion...
 
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...
Beyond the Spectrum – Creating an Environment of Diversity and Empowerment wi...
 

Último

Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
SayantanBiswas37
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
ranjankumarbehera14
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
Health
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
gajnagarg
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
nirzagarg
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
chadhar227
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
ahmedjiabur940
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Bertram Ludäscher
 

Último (20)

Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 
Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1Lecture_2_Deep_Learning_Overview-newone1
Lecture_2_Deep_Learning_Overview-newone1
 
Statistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbersStatistics notes ,it includes mean to index numbers
Statistics notes ,it includes mean to index numbers
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
Gulbai Tekra * Cheap Call Girls In Ahmedabad Phone No 8005736733 Elite Escort...
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
Top profile Call Girls In Bihar Sharif [ 7014168258 ] Call Me For Genuine Mod...
 
Ranking and Scoring Exercises for Research
Ranking and Scoring Exercises for ResearchRanking and Scoring Exercises for Research
Ranking and Scoring Exercises for Research
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Kings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about themKings of Saudi Arabia, information about them
Kings of Saudi Arabia, information about them
 
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Tumkur [ 7014168258 ] Call Me For Genuine Models We...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf20240412-SmartCityIndex-2024-Full-Report.pdf
20240412-SmartCityIndex-2024-Full-Report.pdf
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...Reconciling Conflicting Data Curation Actions:  Transparency Through Argument...
Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 

Moving Toward Deep Learning Algorithms on HPCC Systems

  • 1. Moving Towards Deep Learning Algorithms on HPCC Systems Maryam M. Najafabadi
  • 2. Overview • L-BFGS • HPCC Systems • Implementation of L-BFGS on HPCC Systems • SoftMax • Sparse Autoencoder • Toward Deep Learning 2
  • 4. Optimization Algorithms in Machine Learning • Linear Regression Minimize Errors 4
  • 5. Optimization Algorithms in Machine Learning • SVM Maximize Margin 5
  • 6. Optimization Algorithms in Machine Learning • Collaborative filtering • K-means • Maximum likelihood estimation • Graphical models • Neural Networks • Deep Learning 6
  • 7. Formulate Training as an Optimization Problem • Training model: finding parameters that minimize some objective function Define Parameters Define an Objective Function Find values for the parameters that minimize the objective function Cost term Regularization term Optimization Algorithm 7
  • 8. How they work Search Direction Step Length 8
  • 9. Gradient Descent • Step length • Constant value • Search direction • Negative gradient 9
  • 10. Gradient Descent • Step length • Constant value • Search direction • Negative gradient Small Step Length 10
  • 11. Gradient Descent • Step length • Constant value • Search direction • Negative gradient Large Step Length 11
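
The effect of the constant step length in slides 9 to 11 can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the test function f(x) = x², the starting point and the three step lengths are all assumed values chosen to show slow progress, fast convergence and divergence.

```python
def gradient_descent(grad, x0, step_length, iterations):
    """Constant step length; search direction is the negative gradient."""
    x = x0
    for _ in range(iterations):
        x = x - step_length * grad(x)
    return x


def grad_f(x):
    # Gradient of the assumed test function f(x) = x**2 (minimum at x = 0).
    return 2.0 * x


# Small step length: moves toward 0, but is still far away after 50 steps.
x_small = gradient_descent(grad_f, x0=1.0, step_length=0.01, iterations=50)
# Moderate step length: converges quickly.
x_good = gradient_descent(grad_f, x0=1.0, step_length=0.4, iterations=50)
# Large step length: every update overshoots the minimum and the iterates diverge.
x_large = gradient_descent(grad_f, x0=1.0, step_length=1.1, iterations=50)

print(x_small, x_good, x_large)
```

On this function each update multiplies x by (1 - 2 × step length), so any step length above 1 makes the iterates grow instead of shrink.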
  • 12. Newton Methods • Step length • Use a line search • Search direction • Use Curvature Information (Inverse of Hessian Matrix) 12
  • 13. Quasi Newton Methods • Problem with large n in Newton methods • Calculation of inverse of Hessian matrix too expensive • Continuously updating an approximation of the inverse of the Hessian matrix in each iteration 13
  • 14. BFGS • Broyden, Fletcher, Goldfarb, and Shanno • Most popular Quasi Newton Method • Uses Wolfe line search to find step length • Needs to keep n×n matrix in memory 14
  • 15. L-BFGS • Limited-memory: only a few vectors of length n (m×n instead of n×n) • m << n • Useful for solving large problems (large n) • More stable learning • Uses curvature information to take a more direct route • faster convergence 15
  • 16. How to use • Define a function that calculates Objective value and Gradient ObjectiveFunc (x, ObjectiveFunc_params, TrainData , TrainLabel) 16
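
As a rough illustration of the contract on slide 16, the sketch below defines a function that returns both the objective value and the gradient at a point. It is written in Python rather than ECL, and the choice of regularized least squares, the `params["lambda"]` key and all variable names are assumptions made for the example; only the argument order mirrors the slide.

```python
def objective_func(x, objective_func_params, train_data, train_label):
    """Return (objective value, gradient) at the point x.

    Assumed objective: mean squared error plus an L2 regularization term.
    objective_func_params["lambda"] is a hypothetical regularization weight.
    """
    lam = objective_func_params["lambda"]
    n = len(train_data)
    value = 0.0
    grad = [0.0] * len(x)
    for row, label in zip(train_data, train_label):
        err = sum(w * v for w, v in zip(x, row)) - label
        value += 0.5 * err * err / n          # cost term
        for j, v in enumerate(row):
            grad[j] += err * v / n
    value += 0.5 * lam * sum(w * w for w in x)  # regularization term
    grad = [g + lam * w for g, w in zip(grad, x)]
    return value, grad


value, grad = objective_func([0.0, 0.0], {"lambda": 0.0},
                             [[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0])
print(value, grad)
```

An optimizer only needs this one function: it never looks inside the model, it just asks for the value and gradient at each candidate point.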
  • 17. Why L-BFGS? • Toward Deep Learning • Optimization is heart of DL and many other ML algorithms • Popular • Advantages over SGD 17
  • 18. HPCC Systems • Open source, massive parallel-processing computing platform for big data processing and analytics • LexisNexis Risk Solutions • Uses commodity clusters of hardware running on top of the Linux operating system • Based on DataFlow programming model • THOR-ROXIE • ECL 18
  • 20. DataFlow Analysis • Main focus is how the data is being changed • A Graph represents a transformation on the data • Each node is an operation • Edges show the data flow 20
  • 21. A DataFlow example 21 • Input (Id, value): (1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9) • Ordered by Id: (1, 2), (1, 3), (1, 10), (2, 5), (2, 9), (3, 4) • MAX per Id: (1, 10), (2, 9), (3, 4)
  • 22. ECL • Enterprise Control Language • Compiled into optimized C++ code • Declarative Language provides parallel and distributed DataFlow oriented processing 22
  • 24. Declarative • What to accomplish, rather than How to accomplish • You’re describing what you’re trying to achieve, without instructing how to do it 24
  • 27. Input (Id, value): (1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9) 27
  • 28. READ • Node 1: (1, 2), (1, 3), (3, 4), (1, 10) • Node 2: (2, 5), (2, 9) 28
  • 29. LOCAL SORT • Node 1: (1, 2), (1, 3), (1, 10), (3, 4) • Node 2: (2, 5), (2, 9) 29
  • 30. LOCAL GROUP • Node 1: (1, 2), (1, 3), (1, 10), (3, 4) • Node 2: (2, 5), (2, 9) 30
  • 31. LOCAL AGG/MAX • Node 1: (1, 10), (3, 4) • Node 2: (2, 9) 31
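
The READ, LOCAL SORT, LOCAL GROUP and LOCAL AGG/MAX steps of slides 27 to 31 can be imitated in plain Python. This is only a single-process sketch of the idea, not ECL: the `NODE_OF_ID` placement table is an assumption chosen so that the per-node contents match the slides, and each "node" is just a list in a dictionary.

```python
from itertools import groupby

records = [(1, 2), (1, 3), (2, 5), (1, 10), (3, 4), (2, 9)]  # (Id, value)

# Assumed placement: all records with the same Id live on the same node,
# so every later step can run locally without moving data.
NODE_OF_ID = {1: 1, 3: 1, 2: 2}

# READ: each node receives its share of the records.
nodes = {1: [], 2: []}
for rec in records:
    nodes[NODE_OF_ID[rec[0]]].append(rec)

# LOCAL SORT: each node sorts only its own records.
local_sorted = {n: sorted(rows) for n, rows in nodes.items()}

# LOCAL GROUP + LOCAL AGG/MAX: group by Id within a node, keep the max value.
local_max = {
    n: [(key, max(value for _, value in group))
        for key, group in groupby(rows, key=lambda r: r[0])]
    for n, rows in local_sorted.items()
}

print(local_max)
```

Because records sharing an Id are co-located, no step needs a global exchange; the per-node results together already form the final answer.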
  • 32. Back to L-BFGS • Minimize f(x) • Start with an initialized x: x_0 • Repeatedly update: x_{k+1} = x_k + α_k p_k, with the step length α_k from the Wolfe line search and the search direction p_k from L-BFGS 32
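
The update x_{k+1} = x_k + α_k p_k can be sketched as a small single-machine L-BFGS loop. This is an illustrative Python sketch, not the HPCC Systems implementation: it uses the standard two-loop recursion for the search direction, and for brevity it swaps the Wolfe line search for a simpler backtracking (Armijo) search. The function names, the history size `m` and the quadratic test objective are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lbfgs(objective, x0, m=5, max_iter=100, tol=1e-8):
    """Minimize objective(x) -> (value, gradient), keeping only m (s, y) pairs."""
    x = list(x0)
    f, g = objective(x)
    s_hist, y_hist = [], []
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        # Two-loop recursion: q ends up approximating (inverse Hessian) * g.
        q = list(g)
        alphas = []
        for s, y in zip(reversed(s_hist), reversed(y_hist)):
            a = dot(s, q) / dot(y, s)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, y)]
        if s_hist:
            gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
            q = [gamma * qi for qi in q]
        for (s, y), a in zip(zip(s_hist, y_hist), reversed(alphas)):
            b = dot(y, q) / dot(y, s)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        p = [-qi for qi in q]  # search direction p_k
        # Backtracking line search (Armijo condition only, not full Wolfe).
        alpha, slope = 1.0, dot(g, p)
        while True:
            x_new = [xi + alpha * pi for xi, pi in zip(x, p)]
            f_new, g_new = objective(x_new)
            if f_new <= f + 1e-4 * alpha * slope or alpha < 1e-12:
                break
            alpha *= 0.5
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # only store pairs with positive curvature
            s_hist.append(s)
            y_hist.append(y)
            if len(s_hist) > m:
                s_hist.pop(0)
                y_hist.pop(0)
        x, f, g = x_new, f_new, g_new
    return x, f


def quadratic(x):
    # Assumed test objective with minimum at (1, -2).
    value = (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2
    return value, [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min, f_min = lbfgs(quadratic, [0.0, 0.0])
print(x_min, f_min)
```

Everything the loop does with x, g, s and y is either an element-wise vector operation or a dot product, which is why the method lends itself to the partitioned layout described in the following slides.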
  • 33. • If x is too large, it does not fit in the memory of one machine • Needs m × n memory • Distribute x across different machines • Try to do computations locally • Do global computations as necessary 33
  • 38. • Dot Product 38 of (1, 3, 6, 8, 10, 9, 1, 2, 3, 9, 8) and (2, 3, 3, 8, 3, 11, 1, 2, 5, 9, 5)
  • 39. • Dot Product 39 • Node 1: (1, 3, 6, 8) and (2, 3, 3, 8) • Node 2: (10, 9, 1, 2) and (3, 11, 1, 2) • Node 3: (3, 9, 8) and (5, 9, 5)
  • 40. • Dot Product 40 • LOCAL Dot Product: Node 1: 120, Node 2: 134, Node 3: 136
  • 41. • Dot Product 41 • LOCAL Dot Product: Node 1: 120, Node 2: 134, Node 3: 136 • Global Summation: 390
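
The local-then-global pattern of slides 38 to 41 can be checked with a short Python sketch. The partition sizes (4, 4, 3) follow the slides; the helper name `partition` is an assumption. With the vectors as transcribed here, Node 1's partial product evaluates to 93 rather than the 120 shown on the slide, so the sketch reports its own totals.

```python
a = [1, 3, 6, 8, 10, 9, 1, 2, 3, 9, 8]
b = [2, 3, 3, 8, 3, 11, 1, 2, 5, 9, 5]
sizes = [4, 4, 3]  # elements held by Node 1, Node 2, Node 3


def partition(vec, sizes):
    """Split vec into consecutive chunks, one per node."""
    parts, start = [], 0
    for size in sizes:
        parts.append(vec[start:start + size])
        start += size
    return parts


parts_a = partition(a, sizes)
parts_b = partition(b, sizes)

# LOCAL dot product: each node multiplies and sums only its own chunk.
local_dots = [sum(x * y for x, y in zip(pa, pb))
              for pa, pb in zip(parts_a, parts_b)]

# Global summation: one number per node is combined into the final result.
global_dot = sum(local_dots)

print(local_dots, global_dot)
```

Only one scalar per node crosses the network, regardless of how long the vectors are.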
  • 42. Using ECL for implementing L-BFGS 42 • x = (0.1, 0.3, 0.6, 0.8, 0.2, 0.7, 0.5, 0.5, 0.5, 0.3, 0.4, 0.6, 0.7, 0.7)
  • 44. Using ECL for implementing L-BFGS 44 • x is stored as records (Node_id, partition_values) • 1: (0.1, 0.3, 0.6, 0.8) • 2: (0.2, 0.7, 0.5, 0.5) • 3: (0.5, 0.3, 0.4, 0.6) • 4: (0.7, 0.7)
  • 45. Using ECL for implementing L-BFGS 45 • Each record is placed on the node matching its Node_id (Node 1 to Node 4)
  • 47. Example of LOCAL operations • Scale 47
  • 48. Example of LOCAL operations • Scale 48 • x as (Node_id, partition_values) • 1: (0.1, 0.3, 0.6, 0.8) • 2: (0.2, 0.7, 0.5, 0.5) • 3: (0.5, 0.3, 0.4, 0.6) • 4: (0.7, 0.7)
  • 49. Example of LOCAL operations • Scale 49 • x_10 as (Node_id, partition_values) • 1: (1, 3, 6, 8) • 2: (2, 7, 5, 5) • 3: (5, 3, 4, 6) • 4: (7, 7)
  • 51. Example of Global operation • Dot Product 51 • x: 1: (0.1, 0.3, 0.6, 0.8), 2: (0.2, 0.7, 0.5, 0.5), 3: (0.5, 0.3, 0.4, 0.6), 4: (0.7, 0.7) • x_10: 1: (1, 3, 6, 8), 2: (2, 7, 5, 5), 3: (5, 3, 4, 6), 4: (7, 7)
  • 52. Example of Global operation • Dot Product 52 • dot_local as (Node_id, dot_value) • 1: 2.27 • 2: 1.39 • 3: 1.67 • 4: 0.98
  • 54. L-BFGS based Implementations • Softmax • Sparse Autoencoder 54
  • 55. SoftMax Regression • Generalizes logistic regression • More than two classes • MNIST -> 10 different classes 55
  • 56. Formulate as an optimization problem • Parameters • K × f variables • Objective function • Generalize logistic regression objective function • Define a function to calculate objective value and Gradient at a given point 56
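
A function of the kind slide 56 asks for might look like the following Python sketch. It is not the ECL code from the talk: it assumes the usual softmax (multinomial logistic) cross-entropy cost with an optional L2 weight-decay term, stores the K × f parameters as a flat row-major list, and every name in it is hypothetical.

```python
import math


def softmax_objective(theta, data, labels, num_classes, weight_decay=0.0):
    """Return (cost, gradient) for softmax regression at the point theta.

    theta: flat list of K * f weights, one row of f weights per class.
    data: list of m samples with f features each; labels: class index per sample.
    """
    f = len(data[0])
    m = len(data)
    cost = 0.0
    grad = [0.0] * (num_classes * f)
    for x, label in zip(data, labels):
        scores = [sum(theta[k * f + j] * x[j] for j in range(f))
                  for k in range(num_classes)]
        top = max(scores)  # subtract the max score for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        cost -= math.log(probs[label]) / m
        for k in range(num_classes):
            coeff = (probs[k] - (1.0 if k == label else 0.0)) / m
            for j in range(f):
                grad[k * f + j] += coeff * x[j]
    cost += 0.5 * weight_decay * sum(t * t for t in theta)
    grad = [g + weight_decay * t for g, t in zip(grad, theta)]
    return cost, grad


samples = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
classes = [0, 2, 1]
cost0, grad0 = softmax_objective([0.0] * 6, samples, classes, num_classes=3)
print(cost0)
```

With all weights at zero every class is equally likely, so the cost equals log(K); that makes a convenient sanity check before handing the function to an optimizer.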
  • 58. SoftMax Results (400 Nodes) • Lshtc-large • 410 GB • 61 iterations, 81 function evaluations • 1 hour • Wikipedia-medium • 1,048 GB • 12 iterations, 21 function evaluations • Half an hour 58
  • 59. More Examples • Parameter matrix in SoftMax: K × f • Data Matrix: f × m • Multiply these two matrices • Result is K × m 59
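  • 61. If parameter matrix is small 61 • (K × f) times (f × m)
  • 65. 65 • Data matrix split by columns across Node1, Node2, Node3 as f × m1, f × m2, f × m3 • LOCAL JOIN with the K × f parameter matrix gives K×m1, K×m2, K×m3
  • 66. 66 • The local results K×m1, K×m2, K×m3 together form the K×m result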
  • 67. If both matrices big 67 • (K × f) times (f × m)
  • 68. If both matrices big 68 • Diagram labels: K, f1, f2, f3, m1, m2, m3
  • 69. If both matrices big 69 • Parameter matrix split along f into K × f1, K × f2, K × f3 • Data matrix split along f into f1 × m, f2 × m, f3 × m • Each pair of blocks yields a K×m partial result
  • 72. If both matrices big 72 • ROLLUP combines the three K×m partial results into the final K×m product
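
Both multiplication strategies from slides 59 to 72 can be sketched in Python on tiny matrices. This is a single-process illustration under assumed data and block boundaries, not ECL: the first variant keeps the small K × f parameter matrix whole and splits the data by columns, the second splits both matrices along f and adds up the K × m partial products, which is the role ROLLUP appears to play on slide 72.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


params = [[1, 2, 3], [4, 5, 6]]                       # K = 2, f = 3
data = [[1, 0, 2, 1], [0, 1, 1, 2], [3, 1, 0, 1]]     # f = 3, m = 4
expected = matmul(params, data)

# Strategy 1 (small parameter matrix): every node holds all of params and a
# column block of data; local products are concatenated column-wise.
column_blocks = [(0, 2), (2, 3), (3, 4)]              # m1, m2, m3
local_products = [matmul(params, [row[lo:hi] for row in data])
                  for lo, hi in column_blocks]
by_columns = [sum((prod[i] for prod in local_products), [])
              for i in range(len(params))]

# Strategy 2 (both matrices big): split along f; each node multiplies a
# K x f_i block by an f_i x m block, giving a full-size K x m partial result.
f_blocks = [(0, 1), (1, 2), (2, 3)]                   # f1, f2, f3
partials = [matmul([row[lo:hi] for row in params], data[lo:hi])
            for lo, hi in f_blocks]
# Combining step: element-wise sum of the partial results.
by_inner_dim = [[sum(p[i][j] for p in partials) for j in range(len(data[0]))]
                for i in range(len(params))]

print(by_columns == expected, by_inner_dim == expected)
```

The first strategy needs no cross-node arithmetic but requires the parameter matrix on every node; the second avoids replicating either matrix at the price of a final combining step over K × m partial results.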
  • 73. Sparse Autoencoder • Autoencoder • Output is the same as the input • Sparsity • Constrain the hidden neurons to be inactive most of the time • Stacking them up makes a Deep Network 73
  • 74. Formulate as an optimization problem • Parameters • Weight and bias values • Objective function • Difference between output and expected output • Penalty term to impose sparsity • Define a function to calculate objective value and Gradient at a given point 74
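
The two parts of the objective on slide 74 can be sketched as follows. The slide does not say which sparsity penalty is used; this Python sketch assumes the common KL-divergence penalty between a target activation `rho` and each hidden unit's mean activation, sigmoid units, and hypothetical parameter names and default values. It computes only the objective value; the gradient would come from back propagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sparse_autoencoder_cost(w1, b1, w2, b2, samples, rho=0.05, beta=3.0):
    """Reconstruction error plus beta * KL sparsity penalty (assumed form).

    w1: hidden x input weights, b1: hidden biases,
    w2: input x hidden weights, b2: output biases.
    """
    hidden_size = len(w1)
    n = len(samples)
    mean_activation = [0.0] * hidden_size
    reconstruction = 0.0
    for x in samples:
        hidden = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                  for j in range(hidden_size)]
        output = [sigmoid(sum(w * hj for w, hj in zip(w2[i], hidden)) + b2[i])
                  for i in range(len(x))]
        # Difference between the output and the expected output (the input).
        reconstruction += 0.5 * sum((o - xi) ** 2 for o, xi in zip(output, x)) / n
        mean_activation = [a + hj / n for a, hj in zip(mean_activation, hidden)]
    # Penalty grows as a hidden unit's mean activation moves away from rho.
    penalty = sum(rho * math.log(rho / a)
                  + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - a))
                  for a in mean_activation)
    return reconstruction + beta * penalty


zeros_w1 = [[0.0, 0.0], [0.0, 0.0]]
zeros_w2 = [[0.0, 0.0], [0.0, 0.0]]
cost = sparse_autoencoder_cost(zeros_w1, [0.0, 0.0], zeros_w2, [0.0, 0.0],
                               samples=[[0.5, 0.5]])
print(cost)
```

With all weights at zero every hidden unit is active half the time, far from the target of 0.05, so the cost in this example is made up entirely of the sparsity penalty.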
  • 75. Sparse Autoencoder results • 10,000 randomly selected 8×8 patches 75
  • 76. Sparse Autoencoder results • MNIST dataset 76
  • 77. Toward Deep Learning • Provide learned features from one layer to another sparse autoencoder • …. Stack up to build a deep network • Fine tuning • Using forward propagation to calculate cost value and back propagation to calculate gradients • Use L-BFGS to fine tune 77
  • 78. SUMMARY • HPCC Systems allows implementation of large-scale ML algorithms • Optimization algorithms are an important aspect of advanced machine learning problems • L-BFGS implemented on HPCC Systems • SoftMax • Sparse Autoencoder • Implement other algorithms by calculating objective value and gradient • Toward deep learning 78
  • 79. • HPCC Systems • https://hpccsystems.com/ • ECL-ML Library • https://github.com/hpcc-systems/ecl-ml • My GitHub • https://github.com/maryamregister • My Email • mmousaarabna2013@fau.edu 79