This talk provides an overview of an important emerging artificial intelligence technology, deep learning neural networks. Deep learning is a branch of computer science focused on machine learning algorithms that model and make predictions about data. A key distinction is that deep learning is not merely a software program, but a new class of information technology that is changing the concept of the modern technology project by replacing hard-coded software with a capacity to learn and execute tasks. In the future, deep learning smart networks might comprise a global computational infrastructure tackling real-time data science problems such as global health monitoring, energy storage and transmission, and financial risk assessment.
2. 6 May 2019
Deep Learning 1
Melanie Swan, Technology Theorist
Philosophy Department, Purdue University,
Indiana, USA
Founder, Institute for Blockchain Studies
Singularity University Instructor; Institute for Ethics and
Emerging Technology Affiliate Scholar; EDGE
Essayist; FQXi Advisor
Traditional Markets Background
Economics and Financial
Theory Leadership
New Economies research group
Source: http://www.melanieswan.com, http://blockchainstudies.org/NSNE.pdf, http://blockchainstudies.org/Metaphilosophy_CFP.pdf
https://www.facebook.com/groups/NewEconomies
3. 6 May 2019
Deep Learning
Deep Learning Smart Network Thesis
2
(1) Deep learning (machine learning) is one of the
latest and most important Artificial Intelligence
technologies.
This is in the bigger context that
(2) Humanity is embarked on a Digital
Transformation Journey, evolving into a
Computation-harnessing Society with Smart
Network Technologies
(Smart networks: autonomous computing networks such as
deep learning nets, blockchains, and UAV fleets)
Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning.
https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning
4. 6 May 2019
Deep Learning
Agenda
Digital Transformation Journey
Artificial Intelligence
Deep Learning
Definition
How does it work?
Technical details
Applications
Near-term
Future
Conclusion
Research and Risks
3
Image Source: http://www.opennn.net
5. 6 May 2019
Deep Learning
Digital Transformation Journey
Digital transformation: digitizing information and processes
$3.8 trillion global IT spend 2019 (Gartner)
$3.9 trillion global business value derived from AI in 2022
$1.3 trillion Digital Transformation Technologies (IDC)
$77.6 billion spend on AI systems in 2022
4
Source: https://www.gartner.com/en/newsroom/press-releases/2019-01-28-gartner-says-global-it-spending-to-reach--3-8-trillio,
https://www.idc.com/getdoc.jsp?containerId=prUS43381817
Digital transformation
Technology was once used to
make existing work more
efficient; now technology
is transforming the work
itself
Blockchain, IoT, AI,
Cloud technologies
6. 6 May 2019
Deep Learning
Philosophy of Economic Theory
Future of the Digital Economy
5
Digital Infrastructure; Physical Infrastructure
Digital
Networks
• Natural Resources
• Electricity
• Data
• Communications
Intelligent
Networks
Transportation
Networks
• Blockchain
• Deep Learning
Smart Infrastructure
Traditional
Economy
Digital Economy
1700-1970 1970-2015 2015-2050
Phase 1: Digitization
Phase 2: Intelligence (now)
7. 6 May 2019
Deep Learning
Philosophy of Economic Theory
Longer-term Economic Futures
6
Traditional
Economy
Digital
Economy
CRISPR
Bioprinting
Cellular Therapies
Natural resources
Electricity
Manufacturing
Atoms Bits Cells Energy
Social Networks
Apps
Payments
Now
Biological
Economy
Space
Economy
Phase 1: Digitization
Phase 2: Intelligence
1700-1970 1970-2015 2015-2050 2020-2080 2025-2100
Value
Mining
Settlement
Exploration
Blockchain
Deep Learning
8. 6 May 2019
Deep Learning
Exascale supercomputing 2021e
Exabyte global data volume 2020e: 40 EB
Scientific, governmental, corporate, and personal
Big Data ≠ Smart Data
Sources: http://www.oyster-ims.com/media/resources/dealing-information-growth-dark-data-six-practical-steps/,
https://www.theverge.com/2019/3/18/18271328/supercomputer-build-date-exascale-intel-argonne-national-laboratory-energy
7
Only 6% of data is protected, and only
42% of companies say they know
how to extract meaningful
insights from the data available
to them (Oxford Economics,
Workforce 2020)
9. 6 May 2019
Deep Learning
Why do we need Learning Technologies?
8
Big data is not smart data (i.e. usable)
New data science methods needed for data growth,
older learning algorithms under-performing
Source: http://blog.algorithmia.com/introduction-to-deep-learning-2016
11. 6 May 2019
Deep Learning
Artificial Intelligence (AI) Argument
Artificial intelligence is using
computers to do cognitive work
(physical or mental) that usually
requires a human
Deep Learning/Machine Learning
is the biggest area in AI
10
Source: Swan, M. Philosophy of Deep Learning Networks: Reality Automation Modules.
Ke Jie vs. AlphaGo AI Go player, Future of
Go Summit, Wuzhen China, May 2017
12. 6 May 2019
Deep Learning
Progression in AI Learning Machines
11
Deep Blue, 1997: Hard-coded AI machine. Single-purpose AI: hard-coded rules
Watson, 2011: Deep Learning prototype. Question-answering AI: natural-language processing
AlphaGo, 2016: Deep Learning machine. Multi-purpose AI: algorithm detects rules, reusable template
13. 6 May 2019
Deep Learning 12
Conceptual Definition:
Deep learning is a computer program that can
identify what something is
Technical Definition:
Deep learning is a class of machine learning
algorithms in the form of a neural network that
uses a cascade of layers of processing units to
extract features from data sets in order to make
predictive guesses about new data
Source: Extending Yann LeCun, http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning
What is Deep Learning?
14. 6 May 2019
Deep Learning
How are AI and Deep Learning related?
13
Source: Machine Learning Guide, 9. Deep Learning
Artificial intelligence:
Using computers to do cognitive work
that usually requires a human
Machine learning:
Computers with the capability to learn
using patterns and inference as
opposed to explicit instructions
Neural network:
A computer system modeled on the
human brain and nervous system
Deep learning:
Program that can recognize objects
Deep
Learning
Neural Nets
Machine Learning
Artificial Intelligence
Computer Science
Within the Computer Science
discipline, in the field of Artificial
Intelligence, Deep Learning is a
class of Machine Learning
algorithms, that are in the form
of a Neural Network
15. 6 May 2019
Deep Learning
What is a Neural Net?
14
Intuition: create an Artificial Neural Network to solve
problems in the same way as the human brain
16. 6 May 2019
Deep Learning
Technophysics and Statistical Mechanics
Deep Learning is inspired by Physics
15
Sigmoid function suggested as a model for neurons,
per statistical mechanical behavior (Cowan, 1972)
Stationary solutions for dynamic models (asymmetric
weights create an oscillator to model neuron signaling)
Hopfield Neural Network: content-addressable
memory system with binary threshold nodes,
converges to a local minimum (Hopfield, 1982)
Can use statistical mechanics (Ising model of
ferromagnetism) for neurons
Boltzmann Machine (Hinton and Sejnowski, 1983)
Statistical mechanics and condensed matter: Boltzmann
distribution, free energy, Gibbs sampling, renormalization;
stochastic processing units with binary output
Source: https://www.quora.com/Is-deep-learning-related-to-statistical-physics-particularly-network-science
18. 6 May 2019
Deep Learning
Why is it called “Deep” Learning?
Hidden layers of processing (2-20 intermediary layers)
“Deep” networks (3+ layers) versus “shallow” (1-2 layers)
Basic deep learning network: 5 layers; GoogleNet: 22 layers
17
Sandwich Architecture:
visible Input and Output layers
with hidden processing layers
GoogleNet:
22 layers
19. 6 May 2019
Deep Learning
Why Deep “Learning”?
System is “dumb” (i.e. mechanistic)
“Learns” by having big data (lots of input examples), and making
trial-and-error guesses to adjust weights to find key features
Creates a predictive system to identify new examples
Usual AI argument: big enough data is what makes a
difference (“simple” algorithms run over large data sets)
18
Input: Big Data (i.e.,
many examples)
Method: Trial-and-error
guesses to adjust node weights
Output: system identifies
new examples
20. 6 May 2019
Deep Learning
Sample task: is that a Car?
Create an image recognition system that determines
which features are relevant (at increasingly higher levels
of abstraction) and correctly identifies new examples
19
Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
21. 6 May 2019
Deep Learning
Two classes of Learning Systems
Supervised and Unsupervised Learning
Supervised
Classify labeled data
Unsupervised
Find patterns in
unlabeled data
20
Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning
22. 6 May 2019
Deep Learning
Early success in Supervised Learning (2011)
YouTube: user-classified data
perfect for Supervised Learning
21
Source: Google Brain: Le, QV, Dean, Jeff, Ng, Andrew, et al. 2012. Building high-level features using large scale unsupervised
learning. https://arxiv.org/abs/1112.6209
23. 6 May 2019
Deep Learning
2 main kinds of Deep Learning neural nets
22
Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ
Convolutional Neural Nets
Image recognition
Convolve: roll up to higher
levels of abstraction to identify
feature sets
Recurrent Neural Nets
Speech, text, audio recognition
Recur: iterate over sequential
inputs with a memory function
LSTM (Long Short-Term
Memory) remembers
sequences and avoids
gradient vanishing
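The two ideas on this slide can be sketched in a few lines of plain Python. This is an illustrative toy, not how a production CNN or LSTM is implemented: `convolve_1d` slides a small kernel over a signal to detect a local feature, and `running_memory` carries a decaying state across a sequence. Both function names, the kernel, and the decay value are assumptions chosen for the example, and the recurrence is a much simpler stand-in for an LSTM cell.

```python
def convolve_1d(signal, kernel):
    """Slide the kernel over the signal and take a weighted sum at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def running_memory(inputs, decay=0.5):
    """Toy recurrence: each new state mixes the previous state with the new input."""
    state = 0.0
    states = []
    for x in inputs:
        state = decay * state + x  # older inputs fade but are not forgotten at once
        states.append(state)
    return states


# An edge-detecting kernel responds where the signal rises (+1) or falls (-1).
edges = convolve_1d([0, 0, 1, 1, 0, 0], [-1, 1])
print(edges)  # [0, 1, 0, -1, 0]

# A single input at the first step still influences later states.
print(running_memory([1, 0, 0]))  # [1.0, 0.5, 0.25]
```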
24. 6 May 2019
Deep Learning
Image Recognition and Computer Vision
23
Source: Quoc Le, https://arxiv.org/abs/1112.6209; Yann LeCun, NIPS 2016,
https://drive.google.com/file/d/0BxKBnD5y2M8NREZod0tVdW5FLTQ/view
Marvin Minsky, 1966
“summer project”
Jeff Hawkins, 2004, Hierarchical
Temporal Memory (HTM)
Quoc Le, 2011, Google
Brain cat recognition
Convolutional net for autonomous driving, http://cs231n.github.io/convolutional-networks
History
Current state of
the art - 2019
25. 6 May 2019
Deep Learning
Image Classification
24
Source: https://cs.stanford.edu/people/karpathy/deepimagesent/?hn
Human-level image recognition and captioning
26. 6 May 2019
Deep Learning
Image Understanding
25
Source: https://cs.stanford.edu/people/karpathy/deepimagesent/?hn
“Understanding” is the system’s three-step process
Image -> internal representation -> text
Labels “tennis racket” = concepts
Machine learning: Kantian-level object recognition, not Hegelian
27. 6 May 2019
Deep Learning
Famous Image Nets
Image recognition (<10% error rate)
AlexNet (2012) - 8 layers (5 convolutional, 3 fully connected)
Top-5 error rate 15.3% versus 26.2% for the runner-up
VGGNet (2014) - 19 layers
GoogleNet (2014) - 22 layers
BatchNorm (between Conv and Pooling)
Microsoft ResNet (2015) - up to 152 layers
26
Sources: https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035,
https://medium.com/coinmonks/paper-review-of-vggnet-1st-runner-up-of-ilsvlc-2014-image-classification-d02355543a11
28. 6 May 2019
Deep Learning
Speed and size of Deep Learning nets?
Google Brain cat recognition, 2011
1 bn connections, 10 mn images (200x200 pixel),
1,000 machines (16,000 cores), 3 days
State of the art, 2016-2019
NVIDIA facial recognition, 100 million images, 10
layers, 1 bn parameters, 30 exaflops, 30 GPU days
Google Net, 11.2 bn parameter system
Lawrence Livermore Lab, 15 bn parameter system
Digital Reasoning, “cognitive computing” (Nashville
TN), 160 bn parameters, trains on three multi-core
computers overnight
27
Parameters: variables that determine the network structure
Sources: https://futurism.com/biggest-neural-network-ever-pushes-ai-deep-learning, Digital Reasoning paper:
https://arxiv.org/pdf/1506.02338v3.pdf
30. 6 May 2019
Deep Learning
Problem: correctly recognize “apple”
29
Source: Michael A. Nielsen, Neural Networks and Deep Learning
31. 6 May 2019
Deep Learning
Modular Processing Units
30
Source: http://deeplearning.stanford.edu/tutorial
1. Input 2. Hidden layers 3. Output
Unit: processing unit, logit (logistic
regression unit), perceptron, artificial neuron
32. 6 May 2019
Deep Learning
Image Recognition
Digitize Input Data into Vectors
31
Source: Quoc V. Le, A Tutorial on Deep Learning, Part 1: Nonlinear Classifiers and The Backpropagation Algorithm, 2015, Google
Brain, https://cs.stanford.edu/~quocle/tutorial1.pdf
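As a minimal sketch of what "digitizing input data into vectors" can mean, the snippet below flattens a small grayscale image into a list of numbers scaled to the range 0 to 1. The function name `image_to_vector` and the 3x3 example image are hypothetical, chosen only for illustration.

```python
def image_to_vector(image, max_value=255):
    """Flatten a 2D grid of pixel intensities into a 1D list scaled to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]


# Hypothetical 3x3 grayscale image (0 = black, 255 = white) showing a plus sign.
tiny_image = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]

vector = image_to_vector(tiny_image)
print(len(vector))  # 9 values, one per pixel
print(vector[:3])   # [0.0, 1.0, 0.0]
```

A 200x200 pixel image handled the same way would give a vector of 40,000 values, which is the kind of input the first layer of the network receives.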
33. 6 May 2019
Deep Learning
Image Recognition
Log features and trial-and-error test
32
1. Input 2. Hidden layers 3. Output
Source: http://deeplearning.stanford.edu/tutorial; MNIST dataset: http://yann.lecun.com/exdb/mnist
Mathematical methods used to update the weights
Linear algebra: matrix multiplications of input vectors
Statistics: logistic regression units (Y/N (0,1)), probability weighting
and updating, inference for outcome prediction
Calculus: optimization (minimization), gradient descent with backpropagation,
where local minima and saddle points can slow or trap the search
Feed-forward pass (0,1): inference produces a guess from Feature 1, Feature 2, and Feature 3
Backward pass to update probabilities per correct guess: the guess is compared with the actual value
34. 6 May 2019
Deep Learning
Image Recognition
Levels of Abstraction Object Recognition
33
Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
Layer 1: Log all features (line, edge, unit of sound)
Layer 2: Identify more complicated features (jaw line,
corner, combination of speech sounds)
Layer 3+: Push features to higher levels of abstraction
until full objects can be recognized
35. 6 May 2019
Deep Learning
Image Recognition
Higher Abstractions of Feature Recognition
34
Source: https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
36. 6 May 2019
Deep Learning
Example: NVIDIA Facial Recognition
35
Source: NVIDIA
First hidden layer extracts all possible low-level features
from data (lines, edges, contours); next layers abstract
into more complex features of possible relevance
37. 6 May 2019
Deep Learning
Deep Learning
36
Source: Quoc V. Le et al, Building high-level features using large scale unsupervised learning, 2011, https://arxiv.org/abs/1112.6209
38. 6 May 2019
Deep Learning
Speech, Text, Audio Recognition
Sequence-to-sequence Recognition + LSTM
37
Source: Andrew Ng
LSTM: Long Short Term Memory
Technophysics technique: each subsequent layer remembers
data for twice as long (fractal-type model)
The “grocery store” not the “grocery church”
40. 6 May 2019
Deep Learning
Logistic regression, Lego-like structure of layers of
processing units, and finding the minimum of the curve
3 Key Technical Aspects of Deep Learning
Perceptron Structure
What: Core processing unit (input-processing-output); levers: weights and bias
Why: “Dumb” system learns by adjusting parameters and checking against outcome
Sigmoid Function
What: Squash values into Sigmoidal S-curve
-Binary values (Y/N, 0/1)
-Probability values (0 to 1)
-Tanh values ((-1) to 1)
Why: Non-linear curve (logistic regression) means manipulability
Loss Function
What: Reduce combinatoric dimensionality
Why: Loss function optimizes efficiency of solution
41. 6 May 2019
Deep Learning
1. Regression
Linear Regression
40
House price vs. Size (square feet)
y=mx+b
House price
Size (square feet)
Source: https://www.statcrunch.com/5.0/viewreport.php?reportid=5647
Regression: how does one variable relate to another
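The line y=mx+b on this slide can be fitted with ordinary least squares. The sketch below is one standard way to do it in plain Python; the house sizes and prices are made-up numbers that lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b with a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    m = covariance / variance  # slope: price change per extra square foot
    b = mean_y - m * mean_x    # intercept: predicted price at size zero
    return m, b


# Hypothetical data: size in square feet, price in dollars.
sizes = [1000, 1500, 2000, 2500]
prices = [200000, 300000, 400000, 500000]

m, b = fit_line(sizes, prices)
print(m, b)  # 200.0 0.0

# Predict the price of a house that was not in the data.
print(m * 1800 + b)  # 360000.0
```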
42. 6 May 2019
Deep Learning
Logistic Regression
41
Source: http://www.simafore.com/blog/bid/99443/Understand-3-critical-steps-in-developing-logistic-regression-models
43. 6 May 2019
Deep Learning
Logistic Regression
42
Higher-order mathematical
formulation
Sigmoid function
S-shaped and bounded
Maps the whole real axis into a finite
interval (0-1)
Non-linear
Can fit probability
Can apply optimization techniques
Deep Learning classification
predictions are in the form of a
probability value
Source: https://www.quora.com/Logistic-Regression-Why-sigmoid-function
Sigmoid Function
Unit Step Function
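To make the contrast between the two curves concrete, here is a small sketch of a unit step function and a sigmoid in plain Python. The sigmoid is written in a numerically stable form so that very large negative inputs do not overflow; that detail is an implementation choice for the example, not something the slide specifies.

```python
import math


def unit_step(z):
    """Hard threshold: output jumps from 0 to 1 at z = 0."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Smooth S-curve mapping any real number into the interval (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # avoids overflow for very negative z
    return e / (1.0 + e)


for z in (-5, -1, 0, 1, 5):
    print(z, unit_step(z), round(sigmoid(z), 4), round(math.tanh(z), 4))
```

Because the sigmoid changes smoothly, a small change in a weight gives a small change in the output, which is what allows optimization techniques such as gradient descent to be applied. The tanh column shows the same S-shape rescaled to the range (-1) to 1.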
44. 6 May 2019
Deep Learning
Sigmoid function: Taleb
43
Source: Swan, M. (2019). Blockchain Theory of Programmable Risk: Black Swan Smart Contracts. In Blockchain Economics: Implications
of Distributed Ledgers - Markets, communications networks, and algorithmic reality. London: World Scientific.
Thesis: mapping a phenomenon to an
s-curve (“convexify” it) means
its risk may be controlled
Antifragility = convexity = risk-manageable
Fragility = concavity
Non-linear dose response in medicine
suggests treatment optimality
U-shaped, j-shaped curves in hormesis
(biphasic response); Bell’s theorem
45. 6 May 2019
Deep Learning
Regression (summary)
Logistic regression
Predict binary outcomes:
Perceptron (0 or 1)
Predict probabilities:
Sigmoid Neuron (values 0-1)
Tanh Hyperbolic Tangent
Neuron (values (-1)-1)
44
Logistic Regression (Sigmoid function)
(0-1) or Tanh ((-1)-1)
Linear Regression
Linear regression
Predict continuous set
of values (house prices)
46. 6 May 2019
Deep Learning
2. Lego-like layers of processing units
Deep Learning Architecture
45
Source: Michael A. Nielsen, Neural Networks and Deep Learning
Modular Processing Units
47. 6 May 2019
Deep Learning
More complicated in actual use
Convolutional neural net scale-up for
number recognition
Example data: MNIST dataset
http://yann.lecun.com/exdb/mnist
46
Source: http://www.kdnuggets.com/2016/04/deep-learning-vs-svm-random-forest.html
48. 6 May 2019
Deep Learning
Node Structure: Computation Graph
47
Edge
(input value)
Architecture
Node
(operation)
Edge
(input value)
Edge
(output value)
Example 1: input values 3 and 4, operation Add, output value ??
Example 2: input values 3 and 4, operation Multiply, output value ??
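The two examples on this slide can be worked through with a very small computation graph. In this sketch the `Node` class is a hypothetical helper written for illustration: edges carry values, and each node applies one operation to its inputs.

```python
import operator


class Node:
    """A node applies one operation to the values arriving on its input edges."""

    def __init__(self, operation, *inputs):
        self.operation = operation
        self.inputs = inputs  # each input is a plain number or another Node

    def evaluate(self):
        values = [
            item.evaluate() if isinstance(item, Node) else item
            for item in self.inputs
        ]
        return self.operation(*values)


add_node = Node(operator.add, 3, 4)
multiply_node = Node(operator.mul, 3, 4)
print(add_node.evaluate())       # 7
print(multiply_node.evaluate())  # 12

# Nodes can feed other nodes, which is how larger graphs are built up.
print(Node(operator.mul, add_node, 2).evaluate())  # 14
```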
49. 6 May 2019
Deep Learning
Basic node with Weights and Bias
48
Edge
Input value = 4
Edge
Input value = 16
Edge
Output value = 20
Node
Operation =
Add
Input values have weights w; nodes have a bias b
Node computes N + b, where N = w1*x1 + w2*x2
Example: .25*4 = 1; .75*16 = 12; N = 1 + 12 = 13; N + b = 13 + 2 = 15
Input Processing Output Variable Weights and
Biases
Basic node structure is fixed: input-processing-output
Weight and bias are variable parameters that are
adjusted as the system iterates and “learns”
Source: http://neuralnetworksanddeeplearning.com/chap1.html
Mimics NAND gate
Basic Node Structure (fixed) Basic Node with Weights and Bias (variable)
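The arithmetic on this slide (weights .25 and .75, inputs 4 and 16, bias 2) can be checked with a short sketch. The second half shows how a thresholded node can mimic a NAND gate, using the weights (-2, -2) and bias 3 from the perceptron example in chapter 1 of the cited Nielsen book; the function names are assumptions made for the example.

```python
def node_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the node's bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Slide example: .25*4 + .75*16 + 2
print(node_output([4, 16], [0.25, 0.75], 2))  # 15.0


def perceptron(inputs, weights, bias):
    """Fire (1) when the weighted sum plus bias is positive, otherwise 0."""
    return 1 if node_output(inputs, weights, bias) > 0 else 0


# NAND gate: output is 0 only when both inputs are 1.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron([a, b], [-2, -2], 3))
```

The node structure stays fixed in both cases; only the weights and bias change, which is what the network adjusts as it learns.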
51. 6 May 2019
Deep Learning
Actual: same structure, more complicated
50
52. 6 May 2019
Deep Learning 51
Source: https://medium.com/@karpathy/software-2-0-a64152b37c35
Same structure, more complicated values
53. 6 May 2019
Deep Learning
Neural net: massive scale-up of nodes
52
Source: http://neuralnetworksanddeeplearning.com/chap1.html
55. 6 May 2019
Deep Learning
How does the neural net actually “learn”?
Vary the weights
and biases to see if
a better outcome is
obtained
Repeat until the net
correctly classifies
the data
54
Source: http://neuralnetworksanddeeplearning.com/chap2.html
Structural system based on cascading layers of
neurons with variable parameters: weight and bias
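The "vary the weights and biases, keep what works" idea can be sketched literally as trial and error on a single sigmoid neuron. This is a deliberately naive illustration, assuming a made-up task (learning the logical AND of two inputs) and random perturbations; real networks use backpropagation rather than random search, and all names and settings here are chosen for the example.

```python
import math
import random

# Hypothetical training data: the logical AND of two binary inputs.
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def sigmoid(z):
    # Written with tanh so that extreme values of z cannot overflow.
    return 0.5 * (1.0 + math.tanh(z / 2.0))


def predict(params, x):
    w1, w2, b = params
    return sigmoid(w1 * x[0] + w2 * x[1] + b)


def loss(params):
    """Mean squared error of the neuron over the training data."""
    return sum((predict(params, x) - y) ** 2 for x, y in DATA) / len(DATA)


def train(steps=2000, scale=0.5, seed=0):
    rng = random.Random(seed)
    params = [0.0, 0.0, 0.0]
    best = loss(params)
    for _ in range(steps):
        # Guess: nudge every parameter by a small random amount.
        candidate = [p + rng.uniform(-scale, scale) for p in params]
        candidate_loss = loss(candidate)
        # Check: keep the guess only if the outcome improved.
        if candidate_loss < best:
            params, best = candidate, candidate_loss
    return params, best


initial_loss = loss([0.0, 0.0, 0.0])
params, final_loss = train()
print(initial_loss, final_loss)
```

Random guessing like this becomes impractical as the number of parameters grows, which is the combinatorial problem that backpropagation addresses.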
56. 6 May 2019
Deep Learning
3. Loss function optimization
Backpropagation
Problem: Combinatorial complexity
Inefficient to test all possible parameter variations
Solution: Backpropagation (1986 Nature paper)
Optimization method used to calculate the error
contribution of each neuron after a batch of data is
processed
55
Source: http://neuralnetworksanddeeplearning.com/chap2.html
57. 6 May 2019
Deep Learning
Backpropagation of errors
1. Calculate the total error
2. Calculate the contribution to the error at each step
going backwards
Variety of Error Calculation methods: Mean Square Error
(MSE), sum of squared errors of prediction (SSE), Cross-
Entropy (Softmax), Softplus
Goal: identify which feature solutions have a higher
power of potential accuracy
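The error measures named on this slide are short formulas. The sketch below gives plain-Python versions of the sum of squared errors, mean squared error, and cross-entropy applied to softmax outputs; the function names and sample numbers are illustrative assumptions.

```python
import math


def sse(predictions, targets):
    """Sum of squared errors of prediction."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets))


def mse(predictions, targets):
    """Mean squared error: SSE averaged over the number of examples."""
    return sse(predictions, targets) / len(targets)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probabilities, true_index):
    """Penalty is small when the true class received a high probability."""
    return -math.log(probabilities[true_index])


print(sse([1, 2], [1, 4]))  # 4
print(mse([1, 2], [1, 4]))  # 2.0

probs = softmax([2.0, 1.0, 0.1])
print(cross_entropy(probs, 0))  # low penalty: class 0 has the highest score
print(cross_entropy(probs, 2))  # higher penalty: class 2 has the lowest score
```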
56
58. 6 May 2019
Deep Learning
Backpropagation
Heart of Deep Learning
Backpropagation: algorithm dynamically calculates
the gradient (derivative) of the loss function with
respect to the weights in a network to find the
minimum and optimize the function from there
Algorithms optimize the performance of the network by
adjusting the weights, e.g., in the gradient descent algorithm
Error and gradient are computed for each node
Intermediate errors transmitted backwards through the
network (backpropagation)
Objective: optimize the weights so the network can
learn how to correctly map arbitrary inputs to outputs
57
Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4,
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
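A minimal sketch of backpropagation, assuming the smallest possible network: one input, one hidden sigmoid unit, one output sigmoid unit, and a squared-error loss. The gradients are computed by passing the error backwards with the chain rule and then compared against a numerical estimate. The parameter values, input, and target are arbitrary numbers picked for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(params, x):
    w1, b1, w2, b2 = params
    hidden = sigmoid(w1 * x + b1)
    output = sigmoid(w2 * hidden + b2)
    return hidden, output


def loss(params, x, target):
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2


def backprop(params, x, target):
    """Gradient of the loss with respect to [w1, b1, w2, b2]."""
    w1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    # Error signal at the output unit (derivative of sigmoid is s * (1 - s)).
    delta_output = (output - target) * output * (1.0 - output)
    # Error signal transmitted backwards to the hidden unit.
    delta_hidden = delta_output * w2 * hidden * (1.0 - hidden)
    return [delta_hidden * x, delta_hidden, delta_output * hidden, delta_output]


def numerical_gradient(params, x, target, eps=1e-6):
    """Finite-difference estimate, used only to check the backprop result."""
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))
    return grads


params = [0.5, -0.3, 0.8, 0.1]
analytic = backprop(params, x=1.5, target=1.0)
numeric = numerical_gradient(params, x=1.5, target=1.0)
for a, n in zip(analytic, numeric):
    print(round(a, 8), round(n, 8))
```

The numerical check needs two forward passes per parameter, while backpropagation obtains all gradients from one forward and one backward pass, which is why it scales to networks with billions of parameters.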
59. 6 May 2019
Deep Learning
Gradient Descent
Gradient: derivative (slope) of a function, used to find its minimum
Gradient descent: optimization algorithm that steps repeatedly in the
direction of steepest descent to reach a minimum of the error most quickly
Error = MSE, log loss, cross-entropy; i.e., a measure of how far
predictions are from correctly identifying the data
Technophysics methods: spin glass, simulated
annealing
58
Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4
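Gradient descent itself fits in a few lines. The sketch below minimizes a simple one-variable function, f(x) = (x - 3)^2, whose minimum is known to be at x = 3; the function, learning rate, and step count are assumptions chosen so the result is easy to verify.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to move downhill."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x


# f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 6))  # 3.0
```

In a neural network the single variable x is replaced by all the weights and biases, and the gradient comes from backpropagation.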
60. 6 May 2019
Deep Learning
Optimization Technique
Mathematical tool used in statistics, finance, decision
theory, biological modeling, computational neuroscience
State as non-linear equation to optimize
Minimize loss or cost
Maximize reward, utility, profit, or fitness
Loss function links instance of an event to its cost
Accident (event) means $1,000 damage on average (cost)
5 cm height (event) confers 5% fitness advantage (reward)
Deep learning: system feedback loop
Apply cost penalty for incorrect classifications in training
Methods: CNN (classification): cross-entropy; RNN
(regression): MSE
Loss Function
59
Laplace
61. 6 May 2019
Deep Learning
Known problems: Overfitting
Regularization
Introduce additional information
such as a lambda parameter in the
cost function (to update the theta
parameters in the gradient descent
algorithm)
Dropout: prevent complex
adaptations on training data by
dropping out units (both hidden and
visible)
Test new datasets
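Both remedies on this slide can be sketched briefly. The first function adds a lambda-weighted penalty on the theta parameters to a squared-error cost (one common L2 form; the exact scaling convention varies between textbooks). The second applies dropout in its "inverted" variant, which rescales the surviving units; that variant and all names here are assumptions made for the example.

```python
import random


def l2_regularized_cost(errors, thetas, lam):
    """Squared-error cost plus a penalty that grows with the size of the parameters."""
    n = len(errors)
    data_cost = sum(e ** 2 for e in errors) / (2 * n)
    penalty = lam / (2 * n) * sum(t ** 2 for t in thetas)
    return data_cost + penalty


def dropout(activations, drop_probability, rng):
    """Randomly zero out units; survivors are scaled up to keep the expected sum."""
    keep = 1.0 - drop_probability
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


print(l2_regularized_cost([1, 1], [2], lam=0))  # 0.5  (no penalty)
print(l2_regularized_cost([1, 1], [2], lam=1))  # 1.5  (large weights cost extra)

rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], drop_probability=0.5, rng=rng))
```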
60
63. 6 May 2019
Deep Learning
Applications: Cats to Cancer to Cognition
62
Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ
Computational imaging: Machine learning for 3D microscopy
https://www.nature.com/nature/journal/v523/n7561/full/523416a.html
64. 6 May 2019
Deep Learning
Radiology: Tumor Image Recognition
63
Source: https://www.nature.com/articles/srep24454
Computer-Aided
Diagnosis with
Deep Learning
Breast tissue
lesions in images
Pulmonary nodules
in CT Scans
65. 6 May 2019
Deep Learning
Melanoma Image Recognition
64
Source: Nature volume 542, pages 115–118 (02 February 2017),
http://www.nature.com/nature/journal/v542/n7639/full/nature21056.html
66. 6 May 2019
Deep Learning
Melanoma Classification
65
Source: https://www.techemergence.com/machine-learning-medical-diagnostics-4-current-applications/
Diagnose skin cancer using deep learning CNNs
Algorithm trained to detect skin cancer (melanoma)
using 130,000 images of skin lesions representing over
2,000 different diseases
67. 6 May 2019
Deep Learning
DIY Image Recognition: use Contrast
66
Source: https://developer.clarifai.com/models
How many orange pixels?
Apple or Orange? Melanoma risk or healthy skin?
Degree of contrast in photo colors?
68. 6 May 2019
Deep Learning
Deep Learning and Genomics: RNNs
Large classes of hypothesized but unknown correlations
Genotype-phenotype disease linkage unknown
Computer-identifiable patterns in genomic data
RNN: textual analysis; CNN: genome symmetry
67
Source: http://ieeexplore.ieee.org/document/7347331
69. 6 May 2019
Deep Learning
AI Medical Diagnosis
Earlier stage diagnosis, personalized, world health clinic
Smartphone-based diagnostic tools with AI for optical
detection and EVA (enhanced visual assessment)
68
Source: https://spectrum.ieee.org/biomedical/devices/ai-medicine-comes-to-africas-rural-clinics
70. 6 May 2019
Deep Learning
Deep Learning World Clinic
WHO estimates 400 million people without
access to essential health services
6% in extreme poverty due to healthcare costs
Next leapfrog technology: Deep Learning
Last-mile build out of brick-and-mortar clinics
does not make sense in era of digital medicine
Medical diagnosis via image recognition, natural
language processing symptoms description
Convergence Solution: Digital Health Wallet
Deep Learning medical diagnosis + Blockchain-
based EMRs (electronic medical records)
Empowerment Effect: Deep learning = “tool I
use,” not hierarchically “doctor-administered”
69
Source: http://www.who.int/mediacentre/news/releases/2015/uhc-report/en/
Digital Health Wallet:
Deep Learning diagnosis
Blockchain-based EMRs
72. 6 May 2019
Deep Learning
Deep learning neural networks are inspired by the
structure of the cerebral cortex
The processing unit, perceptron, artificial neuron is the
mathematical representation of a biological neuron
In the cerebral cortex, there can be several layers of
interconnected perceptrons
71
Deep Qualia machine? General purpose AI
Mutual inspiration of neurological and computing research
73. 6 May 2019
Deep Learning
Brain is hierarchically organized
Visual cortex is hierarchical with intermediate layers
The ventral (recognition) pathway in the visual cortex has multiple
stages: Retina - LGN - V1 - V2 - V4 - PIT – AIT
Human brain simulation projects
Swiss Blue Brain project, European Human Brain Project
72
Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
75. 6 May 2019
Deep Learning 74
The farther future: a “better horse” turns out to be a car
New technology is first understood in terms of the old:
better horse => “horseless carriage” => car
76. 6 May 2019
Deep Learning
Autonomous Driving
Deep Learning
Identify what things are
CNNs: core element of machine
vision systems
Scenario-based decision-making
75
77. 6 May 2019
Deep Learning
The Very Small
Deep Learning in Cells
On-board pacemaker data security,
software updates, patient monitoring
Medical nanorobotics for cell repair
Deep Learning: identify what things are
(diagnosis)
Blockchain: secure automation technology
Bio-cryptoeconomics: secure automation
of medical nanorobotics for cell repair
Medical nanorobotics as coming-onboard
repair platform for the human body
High number of agents and “transactions”
Identification and automation is obvious
76
Sources: Swan, M. Blockchain Thinking: The Brain as a DAC (Decentralized Autonomous Corporation). IEEE 2015; 34(4): 41-52; Swan,
M. Forthcoming. Technophysics, Smart Health Networks, and the Bio-cryptoeconomy: Quantized Fungible Global Health Care Equivalency
Units for Health and Well-being. In Boehm, F. Ed., Nanotechnology, Nanomedicine, and AI. Boca Raton FL: CRC Press
78. 6 May 2019
Deep Learning
The Very Small
Human Brain/Cloud Interface
77
Sources: Martins, Swan, Freitas Jr., et al. 2019. Human Brain/Cloud Interface. Front. Neurosci.
79. 6 May 2019
Deep Learning
The Very Large
Deep Learning in Space
Satellite networks
Automated space
construction bots/agents
Deep Learning: identify
what things are
(classification)
Blockchain: secure
automation technology
Applications: asteroid
mining, terraforming,
radiation-monitoring,
space-based solar power,
debris tracking net
78
80. 6 May 2019
Deep Learning
Quantum Machine Learning
79
Quantum Computing: assign an amplitude (not a
probability) for possible states of the world
Amplitudes can interfere destructively and cancel out,
be complex numbers, not sum to 1
Feynman: “QM boils down to the minus signs”
QC: a device that maintains a state that is a
superposition for every configuration of bits
Turn amplitude into probabilities (event probability is
the squared absolute value of its amplitude)
Challenge: obtain speed advantage by exploiting
amplitudes, need to choreograph a pattern of
interference (not measure random configurations)
Sources: Scott Aaronson; and Biamonte, Lloyd, et al. (2017). Quantum machine learning. Nature. 549:195–202.
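The difference between amplitudes and probabilities can be shown with ordinary Python numbers. This sketch simulates a single qubit as a pair of amplitudes and uses the Hadamard operation (an assumption introduced here for illustration; the slide does not name a specific gate) to show destructive interference: applying it twice returns the qubit to a definite state because the two paths to the second outcome cancel.

```python
import math

S = 1.0 / math.sqrt(2.0)


def probabilities(amplitudes):
    """Event probability is the squared absolute value of its amplitude."""
    return [abs(a) ** 2 for a in amplitudes]


def hadamard(state):
    """Mix the two amplitudes; note the minus sign on the second output."""
    a, b = state
    return [(a + b) * S, (a - b) * S]


start = [1.0, 0.0]               # definitely in state 0
superposed = hadamard(start)     # equal amplitudes for states 0 and 1
print(probabilities(superposed)) # roughly [0.5, 0.5]

interfered = hadamard(superposed)
print(probabilities(interfered)) # roughly [1.0, 0.0]: the paths to state 1 cancel
```

With classical probabilities, mixing two 50/50 states could never restore certainty; the cancellation comes entirely from the minus sign.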
82. 6 May 2019
Deep Learning
Research Topics
Layer depth vs. height: (1x9, 3x3, etc.); L1/2 slow-downs
Dark knowledge: data compression, compress dark
(unseen) knowledge into a single summary model
Adversarial networks: two networks, a generator network produces
false (synthetic) data and a discriminator network tries to distinguish it from real data
Reinforcement networks: goal-oriented algorithm for
system to attain a complex objective over many steps
81
Source: http://cs231n.github.io/convolutional-networks, https://arxiv.org/abs/1605.09304,
https://www.iro.umontreal.ca/~bengioy/talks/LondonParisMeetup_15April2015.pdf
83. 6 May 2019
Deep Learning
Research Topics
82
Sources: Devlin et al. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,
http://prog3.com/sbdm/blog/zouxy09/article/details/8781396
Language representation models
BERT (Bidirectional Encoder
Representations from Transformers)
Deep Belief Network
Connections between layers not units
Find initial weighting guesses for units
as system pre-processing step
Deep Boltzmann Machine
Stochastic recurrent neural network
Internal representations of learning
Represent and solve combinatoric
problems
Deep
Boltzmann
Machine
Deep
Belief
Network
84. 6 May 2019
Deep Learning
Google Deep Dream net
Deep dream generated images
Not random pasting of dog snouts
System synthesizes every pixel in
context, and determines good places
for dog snouts
83
Source: Georges Seurat, Un dimanche après-midi à l'Île de la Grande Jatte, 1884-1886;
http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722; Google DeepDream uses algorithmic pareidolia (seeing an image
when none is present) to create a dream-like hallucinogenic appearance
85. 6 May 2019
Deep Learning
Hardware and Software Innovation
84
86. 6 May 2019
Deep Learning
Hardware advance
TPU and GPU clusters
Chip design and cloud data center
architecture
GPU chips (graphics processing unit): 3D
graphics cards for fast matrix multiplication
Google TPU chip (tensor processing unit):
flow through matrix multiplications without
storing interim values in memory (AlphaGo)
Chip design advances
Google Cloud TPUs: ML accelerators for
TensorFlow; TPU 3.0 pod (8x more
powerful, up to 100 petaflops (2018))
NVIDIA DGX-1 integrated deep learning
system (Eight Tesla P100 GPU
accelerators)
85
Image labels: Google TPU Cloud and Chip; NVIDIA DGX-1
Source: http://www.techradar.com/news/computing-components/processors/google-s-tensor-processing-unit-explained-this-is-what-the-future-of-computing-looks-like-1326915
87. 6 May 2019
Deep Learning
Software advance
What is TensorFlow?
86
Source: https://www.youtube.com/watch?v=uHaKOFPpphU
Python code invoking TensorFlow
TensorBoard (TensorFlow) visualization
Computation graph design in TensorFlow
“Tensor” = multidimensional arrays used in NN operations
“Flow” directly through tensor operations (matrix multiplications)
without needing to store intermediate values in memory
Google’s open-source
machine learning library
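Worked example (a minimal sketch, not TensorFlow's actual API): the computation graph idea can be shown in plain Python by first describing the operations as nodes and only then running them. The Node class, the helper functions, and all numbers are hypothetical and exist only for this illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Add two matrices of the same shape element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def relu(a):
    """Rectified Linear Unit applied to every element."""
    return [[max(0.0, x) for x in row] for row in a]

OPS = {"matmul": matmul, "add": add, "relu": relu}

class Node:
    """One operation in a toy computation graph."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def run(self):
        # Constants return their stored tensor; other nodes evaluate their inputs first.
        if self.op == "const":
            return self.value
        args = [node.run() for node in self.inputs]
        return OPS[self.op](*args)

# Step 1: design the graph (nothing is computed yet).
x = Node("const", value=[[1.0, 2.0]])                # 1x2 input tensor
w = Node("const", value=[[0.5, -1.0], [0.25, 0.5]])  # 2x2 weight tensor
b = Node("const", value=[[0.0, -1.0]])               # 1x2 bias tensor
product = Node("matmul", (x, w))
shifted = Node("add", (product, b))
layer = Node("relu", (shifted,))

# Step 2: run the graph; data flows through the tensor operations.
output = layer.run()
print(output)
```

Separating graph design from execution is what lets a real library optimize the operations and place them on GPU or TPU hardware before any data flows through.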
88. 6 May 2019
Deep Learning
Network advance
Edge Device-based Machine Learning
Surveillance camera, USB and
Browser-based Machine Learning
Intel: Movidius Vision Processing
Unit (VPU): USB ML for IoT
Security cameras, industrial
equipment, robots, drones
Apple: ML acquisition Turi (Dato)
Browser-based Deep Learning
ConvNetJS; TensorFire
JavaScript library to run Deep
Learning nets in a browser
Smart Network in a browser
JavaScript Deep Learning
Blockchain EtherWallets
87
Source: http://cs.stanford.edu/people/karpathy/convnetjs/, http://www.infoworld.com/article/3212884/machine-learning/machine-learning-comes-to-your-browser-via-javascript.html
89. 6 May 2019
Deep Learning
Risks and Limitations of Deep Learning
88
Complicated conceptually and technically
Requires a skilled workforce
Limited solution
So far, restricted to a specific range of applications (supervised
learning for image and text recognition)
Plateau: cheap hardware and already-labeled data sets; need
to model complex network science relationships between data
Non-generalizable intelligence
DeepMind's DQN learns each Atari arcade game from scratch
How does the “black box” system work?
Claim: no “learning,” just a clever mapping of the input data
vector space to output solution vector space
Source: Battaglia et al. 2018. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
90. 6 May 2019
Deep Learning
Conclusion
• Deep learning is not merely an
AI technique or a software
program, but a new class of
smart network information
technology that is changing the
concept of the modern
technology project by offering
real-time engagement with
reality
• Deep learning is a data
automation method that
replaces hard-coded software
with a capacity, in the form of a
learning network that is trained
to perform a task
89
Conclusion
Deep learning is an AI
software technology for
identifying objects
Applications: healthcare,
autonomous driving, robotics
Deep learning is a new class
of smart network information
technology that is replacing
hard-coded software with a
capacity, in the form of a
learning network that is
trained to perform a task
91. 6 May 2019
Deep Learning
Deep Learning Smart Network Thesis
90
(1) Deep learning (machine learning) is one of the
latest and most important Artificial Intelligence
technologies.
This is in the bigger context that
(2) Humanity is embarked on a Digital
Transformation Journey, evolving into a
Computation-harnessing Society with Smart
Network Technologies
(Smart networks: autonomous computing networks such as
deep learning nets, blockchains, and UAV fleets)
Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning.
https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning
92. 6 May 2019
Deep Learning
Possibility space of Intelligence
91
Sources: http://hplusmagazine.com/2015/09/02/the-space-of-mind-designs-and-the-human-mental-model/,
https://www.nature.com/articles/s41586-019-1138-y
Machine intelligence as its own species
93. 6 May 2019
Deep Learning
Smart networks
The network is the computer
92
Source: https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0
Computing networks
2015+
Computer networking
1970-1980
Computer networks
1990-2010
94. 6 May 2019
Deep Learning
Resources
93
Neural Networks and Deep Learning, Michael Nielsen, http://neuralnetworksanddeeplearning.com/
Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, http://www.deeplearningbook.org/
Machine learning and deep neural nets
Machine Learning Guide podcast, Tyler Renelle, http://ocdevel.com/podcasts/machine-learning
notMNIST dataset, http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html
Metacademy; Fast.ai; Keras.io
Distill (visual ML journal), http://distill.pub
Source: http://cs231n.stanford.edu, https://www.deeplearning.ai/
95. 6 May 2019
Deep Learning
Deep Learning frameworks and libraries
94
Source: http://www.infoworld.com/article/3163525/analytics/review-the-best-frameworks-for-machine-learning-and-deep-learning.html#tk.ifw-ifwsb
97. Melanie Swan
Purdue University
melanie@BlockchainStudies.org
Deep Learning Explained
The future of Artificial Intelligence and Smart Networks
Scientech
Indianapolis IN, May 6, 2019
Slides: http://slideshare.net/LaBlogga
Image credit: NVIDIA
Thank You! Questions?
98. 6 May 2019
Deep Learning
Technophysics Research Program:
Application of physics principles to technology
97
Technophysics: The application of physics principles to the study of technology
(particularly statistical physics and information theory for the control of complex networks)
Scientific Paradigms
Mechanics (16-17c); Steam (18-19c); Light and Electromagnetics (20c); Information (21c)
Data Science Method: Science Modules
Biophysics
• Disease causality: role of cellular dysfunction and environmental degradation
• Concentration limits in short and long range inter-cellular signaling
• Boltzmann distribution and diffusion limits in RNAi and siRNA delivery
Econophysics
• Path integrals extend point calculations in dynamical systems
• General (not only specialized) Schrödinger equation for Black-Scholes option pricing
• Quantum game theory (greater than fixed sum options), Quantum finance
Smart Networks (intelligent self-operating networks)
Technologies
• Blockchain
• Deep Learning
• UAV, HFT, RTB, IoT
• Satellite, nanorobot
Tools
• Smart network field theory
• Optimal control theory
Research Topics
• Apply complexity theory to blockchain and deep learning (dos Santos)
• Apply spin glass models to blockchain and deep learning (LeCun, Auffinger, Stein)
• Apply deep learning to particle physics (Radovic)
Quantum Computation
Computational Complexity, Black Holes, and Quantum Gravity (Aaronson, Susskind, Zenil)
General Topics
• Apply renormalization group to system criticality and phase transition detection
(Aygun, Goldenfeld) and extend tensor network renormalization (Evenbly, Vidal)
• Unifying principles: same probability functions used for spin glasses (statistical
physics), error-correcting (LDPC) codes (information theory), and randomized
algorithms (computer science) (Mézard)
• Define relationships between statistical physics and information theory: generalized
temperature and Fisher information, partition functions and free energy, and Gibbs'
inequality and entropy (Merhav)
99. 6 May 2019
Deep Learning
Deep Learning Timeline
98
Source: F. Vazquez, https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0
100. 6 May 2019
Deep Learning
What is a Neural Net?
99
Structure: input-processing-output
Mimic neuronal signal firing structure of brain with
computational processing units
Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning,
http://cs231n.github.io/convolutional-networks/
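Worked example (a minimal sketch): the input-processing-output structure can be written as a single computational unit that weights its inputs, adds a bias, and "fires" through an activation function. The sigmoid activation, the layer sizes, and all numbers are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1), a smooth stand-in for firing."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One processing unit: weighted sum of inputs plus bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Input: three hypothetical feature values.
features = [0.5, -1.0, 2.0]

# Processing: a hidden layer of two neurons.
hidden = layer(features, [[0.1, 0.2, 0.3], [-0.3, 0.1, 0.0]], [0.0, 0.1])

# Output: one neuron that combines the hidden activations into a prediction.
prediction = neuron(hidden, [1.0, -1.0], 0.0)
print(hidden, prediction)
```

Stacking more such layers between input and output is what makes a network "deep".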
101. 6 May 2019
Deep Learning
Deep Learning vocabulary
What do these terms mean?
Deep Learning, Machine Learning, Artificial Intelligence
Perceptron, Artificial Neuron, Logit
Deep Belief Net, Artificial Neural Net, Boltzmann Machine
Google DeepDream, Google Brain, Google DeepMind
Supervised and Unsupervised Learning
Convolutional Neural Nets
Recurrent NN & LSTM (Long Short Term Memory)
Activation Function ReLU (Rectified Linear Unit)
Deep Learning libraries and frameworks
TensorFlow, Caffe, Theano, Torch, DL4J
Backpropagation, gradient descent, loss function
100
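Worked example (a minimal sketch): three of the vocabulary terms above (loss function, gradient descent, and ReLU) fit in a few lines of plain Python. The one-weight model, the data, and the learning rate are hypothetical; backpropagation extends the same chain-rule derivative to every weight in a multi-layer network.

```python
def relu(z):
    """Rectified Linear Unit: pass positive values through, clip negatives to zero."""
    return max(0.0, z)

def loss(w, data):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def gradient(w, data):
    """Derivative of the loss with respect to w (chain rule)."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

# Hypothetical training data generated by the rule y = 2 * x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0               # initial weight guess
learning_rate = 0.05  # step size for each update
history = [loss(w, data)]

# Gradient descent: repeatedly step the weight against the gradient of the loss.
for _ in range(100):
    w -= learning_rate * gradient(w, data)
    history.append(loss(w, data))

print(round(w, 4), history[0], history[-1])
print(relu(-3.0), relu(3.0))
```

The loss shrinks at every step and the weight settles near 2, the value that generated the data.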