3. Approaches to Intelligence
- Neuroscience / Brain Science: observe brain activities
- Psychology: observe human behavior
- Economics: mathematical models (game theory, optimization)
- Mechanical Engineering: differential equations, control theory
- Computer Science (a.k.a. AI): mimic human intelligence
4. Technology Focus of AI Research Has Changed Over Time
Hiroshi Maruyama
1st Wave of A.I. (1956-1974)
• Symbol Processing (LISP)
• Means-End Analysis
• Language Parsing
→ Derived technologies: garbage collection, search algorithms, formal language theory, …
2nd Wave of A.I. (1980-1987)
• Knowledge Representation
• Expert Systems
• Ontology
→ Derived technologies: object-oriented languages, modeling, the Semantic Web, …
3rd Wave of A.I. (2008- )
• Statistical Machine Learning
• Deep Learning
• Blackbox Optimization
→ Derived technologies: inductive programming, blackbox computing
5. “AI” is the name of a research field, but …
Research Field → Derived Technologies → Applications
- Physics → internal combustion engine, semiconductor → automobile, computer (we do not call them “Physics”)
- AI → search algorithms, speech recognition, image recognition → car navigation, AI speakers, autonomous driving (some call them “AI”)
6. “Artificial Intelligence” is an Overloaded Term
1. For researchers, AI is a research activity (or field) that studies intelligence by simulating it with machines
— Search, Inference, Optimization, Recognition, NLP, …
2. For AI vendors, AI is ANY information system that utilizes ANY of the above research results
3. For the general public, AI is a human-like machine intelligence
8. What is Deep Learning? – A (Stateless) Function
Y = f(X)
X: very high-dimensional; any combination of continuous and categorical variables
Y: low-dimensional for classification, very high-dimensional for generation
9. Example: Converting Celsius to Fahrenheit
Requirements: given input C, output F, where F is the Fahrenheit equivalent of C in Celsius
Model (a priori knowledge): F = 1.8 * C + 32
Algorithm:
double c2f(double c) {
    return 1.8*c + 32.0;
}
The model must be known in advance, and the algorithm must be constructible.
10. Alternative Approach – Data-Driven, Inductive Programming (aka Statistical Modeling)
Observation → Training Data Set → Training (search for parameter θ) → Model
No knowledge of the model or algorithm is required!
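The data-driven approach above can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the linear model family, learning rate, and generated data are illustrative assumptions. The program is never told F = 1.8C + 32; it recovers the coefficients by searching for θ = (a, b) that fits observed pairs.

```python
# Inductive programming sketch: instead of hand-writing F = 1.8*C + 32,
# search for parameters theta = (a, b) that fit observed data.

# Observed training data (here generated from the true relation).
data = [(c, 1.8 * c + 32.0) for c in range(-40, 101, 10)]

a, b = 0.0, 0.0          # initial guess for theta
lr = 5e-4                # learning rate
for _ in range(50000):   # batch gradient descent on mean squared error
    grad_a = sum(((a * c + b) - f) * c for c, f in data) / len(data)
    grad_b = sum(((a * c + b) - f) for c, f in data) / len(data)
    a -= lr * grad_a
    b -= lr * grad_b

print(round(a, 3), round(b, 3))  # recovers roughly 1.8 and 32.0
```

Note that the only "a priori knowledge" left is the choice of model family; the constants themselves come from the data.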
11. Deep Neural Net as a Universal Computing Mechanism
⚫ Very large number of parameters
⚫ Can approximate ANY high-dimensional continuous function*
➔ Pseudo Turing Complete!
* G. Cybenko. Approximations by superpositions of sigmoidal functions. Mathematics of Control, Signals, and Systems, 2(4):303–314, 1989.
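Cybenko's theorem is about superpositions of sigmoids, and even a tiny hand-built sketch shows the flavor (the target interval and the steepness constant below are illustrative assumptions): two shifted sigmoids, i.e. a one-hidden-layer network with two units, already approximate the indicator function of an interval.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# One hidden unit per term: sum_i v_i * sigmoid(w_i*x + b_i).
# With a steep weight, sigmoid(w*(x - t)) approaches a step at t, so the
# difference of two such steps approximates the indicator of [0.2, 0.8].
def bump(x, steep=200.0):
    return sigmoid(steep * (x - 0.2)) - sigmoid(steep * (x - 0.8))

print(round(bump(0.5), 3))   # inside the interval: close to 1
print(round(bump(0.0), 3))   # outside the interval: close to 0
```

Stacking many such terms, with trained rather than hand-set weights, is what lets a network approximate arbitrary continuous functions.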
14. Fundamental Limitation of ML (1)
Statistical Machine Learning works only if the future is similar to the past
Timeline: data is sampled at some point in the past → training → model → inference (i.e., prediction) based on the trained model
15. Fundamental Limitation of ML (2)
⚫ Powerless on data in unseen regions
(Figure: inside the training data set, interpolation; outside it, extrapolation: ??)
Statistical Machine Learning does not improvise
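A minimal sketch of this limitation (the 1-nearest-neighbor model and the toy relation y = 2x are illustrative assumptions): predictions are reasonable near the training data, but outside the observed region the model can only repeat a boundary value rather than improvise the trend.

```python
# 1-nearest-neighbor model trained on y = 2x for x in [0, 3].
train = [(x, 2.0 * x) for x in [0.0, 1.0, 2.0, 3.0]]

def predict(x):
    # Return the label of the closest training point.
    return min(train, key=lambda p: abs(p[0] - x))[1]

print(predict(2.1))   # near the training data: 4.0, close to the true 4.2
print(predict(10.0))  # far outside: still 6.0, while the truth is 20.0
```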
16. Fundamental Limitation of ML (3)
⚫ Always works statistically
Original Distribution → (i.i.d. random sampling) → Training Data Set → Trained Model
No guarantee of “100% correctness”
17. What is Deep Learning – Recap
⚫ A new way of programming (inductive programming)
— No prior knowledge of the model or algorithm is required
⚫ Preparing the training dataset is the key
— A creative “teacher signal” enables innovative applications
⚫ It is statistical modeling
— Assumes i.i.d. (independent and identically distributed) data
— Approximation only (no exact answers)
20.
Y = f(X, θ)
X: Sensor Input
Y: Actuator Output
S: Current State
u(S, Y): Reward function
21. Blackbox Optimizers
Whitebox Optimization
- Simplex algorithm
- Interior point method
- The utility function is known in advance
Blackbox Optimization
- Reinforcement learning
- Bayesian optimization
- The utility function is not known in advance
- Use an external oracle that returns the utility value u(x) for an individual query point x
Optuna: “define-by-run” Bayesian optimizer, https://optuna.org/
(Source: Wikipedia)
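The oracle setting can be sketched without any library. Random search is used here as the simplest blackbox method (a stand-in for Bayesian optimization, which would propose points more cleverly); the hidden utility function and search range are illustrative assumptions. The key point: the optimizer never sees the formula of u, only the values the oracle returns.

```python
import random

random.seed(0)

def oracle(x):
    # Hidden utility function: the optimizer cannot inspect this formula,
    # it can only query it at chosen points.
    return -(x - 3.7) ** 2

best_x, best_u = None, float("-inf")
for _ in range(10000):
    x = random.uniform(-10.0, 10.0)   # propose a candidate
    u = oracle(x)                     # ask the external oracle
    if u > best_u:
        best_x, best_u = x, u

print(round(best_x, 1))  # close to the hidden optimum at 3.7
```

A smarter blackbox optimizer differs only in how it chooses the next query point, not in what it is allowed to see.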
22. “Programming by Optimization” – How to Optimize Your Program for a Particular Subset of Inputs
Parametric source code → weaving → blackbox optimizer → optimized code
Hoos, Holger H. “Programming by Optimization.” Communications of the ACM 55.2 (2012): 70-80.
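A minimal sketch of the idea (the hybrid sort, the candidate cutoffs, and the comparison-count cost measure are illustrative assumptions, not Hoos's system): a design choice is left open as a parameter in the source code, and a trivial "optimizer" picks the value that performs best on a sample workload.

```python
import random

# Parametric source code: the cutoff at which a hybrid sort switches
# from merge sort to insertion sort is left open as a parameter.
def hybrid_sort(xs, cutoff, stats):
    if len(xs) <= cutoff:                 # small inputs: insertion sort
        out = []
        for v in xs:
            i = len(out)
            while i > 0:
                stats[0] += 1             # count comparisons as the cost
                if out[i - 1] <= v:
                    break
                i -= 1
            out.insert(i, v)
        return out
    mid = len(xs) // 2                    # large inputs: merge sort
    left = hybrid_sort(xs[:mid], cutoff, stats)
    right = hybrid_sort(xs[mid:], cutoff, stats)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        stats[0] += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

random.seed(0)
workload = [[random.random() for _ in range(100)] for _ in range(20)]

def cost(cutoff):
    stats = [0]
    for xs in workload:
        assert hybrid_sort(xs, cutoff, stats) == sorted(xs)
    return stats[0]

# A trivial blackbox optimizer: try each candidate, keep the cheapest.
best_cutoff = min([1, 2, 4, 8, 16], key=cost)
print(best_cutoff)  # the cutoff tuned to this particular workload
```

In a real PbO setup the optimizer would be a serious blackbox optimizer and the cost would be measured runtime on the target input distribution.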
23. cf. Evolution of Science
Law of Gravitation
The model with the smaller number of parameters is considered the correct one
24. High-Dimensional Science: Cancer Diagnosis Based on ExRNA Expressions
⚫ Scientists tend to look for a small set of dominant parameters (simpler models)
⚫ A deep neural network (with a large number of parameters) gives much higher accuracy
https://www.preferred-networks.jp/en/news
26. Evolution of Computing
Whitebox Computing vs. Blackbox Computing
- Theoretical foundation: discrete mathematics (esp. Boolean logic) vs. probability theory
- Computational mechanism: Turing machine vs. deep learning, Bayesian optimization, …
- Problems to solve: well-defined, low-dimensional vs. ill-defined, very high-dimensional
- Programming: hand-crafted (constrained by human cognitive capacity) vs. inductive and/or search-based
- Accuracy: no error vs. approximation only
- Design principles: modularization, separation of concerns vs. integration
28. Maruyama’s Conjecture:
In 2020, more than half of newly developed software will have inductively-trained / blackbox-optimized components
This is the largest paradigm shift since the invention of the digital computer!
30. Myth 1: Deep Learning is Unsafe
Tesla accident, 2016 (Wall Street Journal, 7/7/2016)
http://jp.wsj.com/articles/SB11860788629023424577004582173882125060236
However, can you guarantee 100% safety if you follow conventional V-shaped development?
(Source: Wikipedia)
31. Typical Bug Density (per 1,000 LOC in equivalent assembly code)
http://www.softrel.com/Current%20defect%20density%20statistics.pdf
Do not pretend that there are “100% safe” programs!
32. Myth 2: BBC is Unexplainable, Uncontrollable
⚫ Is Deep Learning unexplainable?
— DL today runs on a digital computer
◆ The same input / training data set / hyperparameters / random number seeds yield exactly the same output
— You can trace the computation bit by bit
— However, whether a mere human can understand that trace is another story entirely
⚫ What is “explainability”?
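The determinism claim is easy to demonstrate (the seeded pseudo-random computation below is an illustrative stand-in for a training run): fix the seed, and the "random" computation reproduces its output exactly, so in principle every step can be traced.

```python
import random

# A stand-in for a stochastic training run: with the same seed (and the
# same code and data), the result is reproduced bit for bit.
def noisy_sum(seed):
    random.seed(seed)
    return sum(random.random() for _ in range(1000))

run1 = noisy_sum(42)
run2 = noisy_sum(42)
print(run1 == run2)  # True: identical seeds yield identical results
```

Reproducibility is thus not the hard part of explainability; making the trace intelligible to a human is.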
33. Could we explain how the Fukushima disaster occurred?
⚫ The Investigation Committee spent 14 months producing a report totaling 1,700 pages
— “Many points are still unclear”
Investigation Committee on the Accident at the Fukushima Nuclear Power Stations of Tokyo Electric Power Company, Final Report, “Summary,” p. 27
http://www.cas.go.jp/jp/seisaku/icanps/SaishyuGaiyou.pdf
34. Can you control a complex system?
⚫ Flipping a “kill switch” does not mean “control”
— You cannot simply shut down the system of a flying airplane, or of a surgical robot during an operation, …
⚫ W. R. Ashby’s Law of Requisite Variety (1958)
— “If a system is to be stable, the number of states of its control mechanism must be greater than or equal to the number of states in the system being controlled”
It is the problem’s complexity that makes a system unexplainable / uncontrollable, not Deep Learning or Blackbox Optimization!
35. Can you reduce the complexity of your system?
C. S. Holling, Resilience Cycle (Holling, C. S. and Lance H. Gunderson, 2002)
⚫ Reduction of complexity comes with collapse
→ We may need to keep the complexity
→ Anticipate big disturbances in your design
J. Casti, X-Events: The Collapse of Everything, ISBN-13: 978-4023311558
https://www.researchgate.net/publication/261338523_ICHIGAN_Security_-_A_Security_Architecture_That_Enables_Situation-Based_Policy_Switching
36. Myth 3: Optimization Gives What You Want
What happens if we increase the collision penalty to infinity? Cars that do not move!
You have to be explicit in stating the balance between utility and safety
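A minimal sketch of this effect (the speeds, risk model, and penalty values are all illustrative assumptions): with a moderate collision penalty the optimizer picks a useful speed, but as the penalty grows without bound the "optimal" policy degenerates to standing still.

```python
# Expected utility of driving at a given speed: progress minus the
# expected cost of a collision. All numbers here are illustrative.
def expected_utility(speed, collision_penalty):
    progress = speed                 # utility of getting somewhere
    p_collision = 0.001 * speed      # collision risk grows with speed
    return progress - p_collision * collision_penalty

speeds = [0, 10, 20, 30]
best_moderate = max(speeds, key=lambda s: expected_utility(s, 500))
best_extreme = max(speeds, key=lambda s: expected_utility(s, 1e12))
print(best_moderate, best_extreme)  # 30 0: an unbounded penalty freezes the car
```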
37. A Case of a Smart Robot
You: “Get me coffee”
The smart robot goes to the Starbucks downstairs, sees many people in line, kills everybody, and gets you your coffee
Precisely specifying the objective function is very hard
This is the “Frame Problem,” still an open problem in AI research
IJCAI 2017 Keynote by Stuart Russell, “Provably Beneficial AI”
38. BBC Makes Us Think
3 Myths:
1. BBC is not safe
2. BBC is unexplainable, uncontrollable
3. BBC gives what you want
We have to be explicit:
1. There is no such thing as “100% safe”
2. Complexity is the enemy, not BBC
3. You have to be careful when you say what you want
Think about what we really want!
39. The Role of Engineering
Why do we trust bridges? Because of the accumulated knowledge called Civil Engineering
— Theories (e.g., structural mechanics), safety factors (Civil Engineering Handbook, p. 999)
New technology is accepted by society only after it becomes an engineering discipline
40. We Started a SIG in JSSST (MLSE)
https://sites.google.com/view/sig-mlse
- Kick-off Symposium (5/17, ~500 participants)
- JSAI MLSE Session (6/8)
- MLSE Workshop (7/1-2)
- JSSST Annual Convention (8/29-31)
42. The Risk That Our Society Relies Too Much on Information Systems
⚫ The Enlightenment: our society’s fundamental assumption
— Every person can reason and choose with his/her free will
— The basis for democracy, capitalism, science, …
⚫ Because of studies in AI and cognitive science, the very existence of free will is in question
“Just as scientific study of the Bible inadvertently undermined faith in the Christian God, scientific study of the mind is inadvertently undermining faith in the liberal humanist God: the freely-choosing individual.”
http://quillette.com/2018/03/18/wizard-prophet-steven-pinker-yuval-noah-harari/
ISBN-13: 978-1784703936
43. Two Sides of “Digital Sovereignty”
⚫ Originally, the Internet was borderless
— Open, bottom-up (IETF, W3C, …)
⚫ Controlling the Internet means controlling people
— Getting people to buy: A/B testing, recommendation, …
— Getting people to vote: fake news, the echo-chamber effect
⚫ The threat of GAFA
— Fear of giants controlling everything
— GDPR: the EU’s “digital sovereignty”
⚫ China and Russia follow suit
— The Internet as a way to control citizens
— As a viable alternative to democracy!
This is one example of the implications of IT. Please think about them.
44. As IBM Technical Leaders, You Should …
Be true to the technologies
Don’t oversell or undersell
Think about, and discuss, their implications