Motivated Machine Learning for Water Resource Management
1. Motivated Machine Learning for Water Resource Management. Janusz Starzyk, School of Electrical Engineering and Computer Science, Ohio University, USA (www.ent.ohiou.edu/~starzyk). UNESCO Workshop on Integrated Modeling Approaches to Support Water Resource Decision Making: Crossing the Chasm.
15. [Agent architecture diagram, from Randolph M. Jones, www.soartech.com: an agent perceives, reasons, and acts on a task environment (simulation or real-world system) through input/output; long-term and short-term memory are connected by retrieval and learning, and the agent interacts with the environment.]
16. How to Motivate a Machine? The fundamental question is how to motivate a machine to do anything at all, and in particular to increase its "brain" complexity. How do we motivate it to explore the environment and learn to work effectively in that environment? Can a machine that only implements externally given goals be intelligent? If not, how can such goals be created?
19. [Primitive goal creation diagram: the primitive-level pain "dry soil" is decreased (−) by actions such as water w. can, open faucet, and refill tank, and increased (+) by conditions such as sit on garbage; relieving the primitive pain gives rise to a dual (abstract) pain at the next level.]
22. GCS vs. Reinforcement Learning: an RL actor-critic design is compared with the goal creation system (GCS). Case study: "How can Wall-E water his plants if the water resources are limited and hard to find?" [GCS block diagram: the environment feeds pain states along the sensory pathway; gate control produces the desired action and state; the action decision drives the motor pathway, whose action acts back on the environment.]
23. Goal Creation Experiment. Sensory-motor pairs and their effect on the environment:

Pair # | Sensory   | Motor           | Increases       | Decreases
-------|-----------|-----------------|-----------------|----------------
1      | water can | water the plant | moisture        | water in can
8      | faucet    | open            | water in can    | water in tank
15     | tank      | refill          | water in tank   | reservoir water
22     | pipe      | open            | reservoir water | lake water
29     | rain      | fall            | lake water      | –
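The table above can be read as a chain that moves water from the lake down to the soil. As a minimal sketch (the state names and unit-transfer dynamics are assumptions for illustration, not the presentation's actual code), each sensory-motor pair decreases its source resource and increases its sink:

```python
# Hypothetical sketch of the watering environment from the sensory-motor
# pair table: each action moves one unit of water one step down the chain
#   rain -> lake -> reservoir -> tank -> can -> soil moisture
ACTIONS = {
    "water the plant": ("water in can", "moisture"),          # pair 1
    "open faucet":     ("water in tank", "water in can"),     # pair 8
    "refill tank":     ("reservoir water", "water in tank"),  # pair 15
    "open pipe":       ("lake water", "reservoir water"),     # pair 22
}

def step(state, action):
    """Apply one sensory-motor pair: decrease the source, increase the sink."""
    src, dst = ACTIONS[action]
    if state[src] > 0:
        state[src] -= 1
        state[dst] += 1
    return state

state = {"lake water": 3, "reservoir water": 0, "water in tank": 0,
         "water in can": 0, "moisture": 0}

# Move one unit of water all the way from the lake to the soil.
for action in ["open pipe", "refill tank", "open faucet", "water the plant"]:
    step(state, action)

print(state["moisture"])   # one unit of moisture reaches the soil
print(state["lake water"]) # the lake is depleted by one unit
```

The GCS machine must discover this chain on its own: relieving the "dry soil" pain requires water in the can, which in turn requires water in the tank, and so on up to the lake.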
24. Results from the GCS scheme. [Five pain plots over 600 discrete time steps: primitive pain "dry soil" and abstract pains "no water in can", "no water in tank", "no water in reservoir", "no water in lake".]
25. GCS vs. Reinforcement Learning: performance averaged over 10 trials for GCS and RL. A machine using GCS learns to control all abstract pains and keeps the primitive pain signal at a low level under demanding environment conditions. [Plot of averaged performance over 600 iterations.]
27. Goal Creation Experiment. Average pain signals over 100 GCS simulations. [Five plots against discrete time (0–600): primitive pain "dry soil" and abstract pains "lack of water in can", "lack of water in tank", "lack of water in reservoir", "lack of water in lake".]
28. Compare RL (TDF) and GCS: mean primitive pain value Pp as a function of the number of iterations. Dashed lines mark the moment when Pp stabilizes: green for TDF, blue for GCS.
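For context on the RL baseline the slide labels "TDF" (temporal-difference), a generic textbook TD(0) value update can be sketched as follows. This is a standard illustration under assumed states and rewards, not the presentation's actual implementation:

```python
# Minimal TD(0) value update: V[s] <- V[s] + alpha * (r + gamma * V[s'] - V[s]).
# States "dry"/"wet" and the +1 reward for watering are illustrative assumptions.
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One temporal-difference backup of the state-value table V."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
    return V

V = {"dry": 0.0, "wet": 0.0}
# Watering moves the soil from "dry" to "wet" with reward +1 (less pain).
for _ in range(100):
    td0_update(V, "dry", 1.0, "wet")
print(V["dry"] > 0.9)  # the value of acting from "dry" converges toward 1
```

Unlike GCS, which builds its own hierarchy of abstract pains, a TD learner of this kind optimizes only the externally supplied reward signal.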
31. [Roadmap figure: "Biomimetics and Bio-inspired Systems — Impact on Space Transportation, Space Science and Earth Science", plotting mission complexity against biological mimicking from 2002 to 2030; milestones include embryonics, extremophiles, DNA computing, brain-like computing, self-assembled arrays, memristors, low-resolution biological and high-resolution artificial nanopores, a Mars in-situ life detector, sensor webs, skin and bone, self-healing structures and thermal protection systems, and biologically inspired aerospace systems.]
At first, the only pain the machine receives is the primitive pain. Once the machine learns that eating food reduces the primitive pain, the lack of food becomes an abstract pain. As food grows scarcer in the environment, the primitive pain increases again (since the machine cannot get food), and the machine must learn how to obtain it (buy groceries). Once it learns this, a new pain source is created, and so on. Notice that the primitive pain is eventually kept under control in spite of changing environment conditions. In the presented trial, the machine learns to create, develop, and resolve all abstract pains in this experiment within 300 iterations. In this experiment, the school opportunity is designed to be always available; therefore, as noted in Figure 6.18, the abstract pain for "lack of school opportunity", although created while solving lower-level pains, was never activated and stayed at zero.
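The goal-creation rule described above can be sketched compactly. This is a hedged assumption about the mechanism (names and the chain structure are invented for illustration), not the authors' code: whenever the machine learns that consuming some resource relieves an existing pain, the lack of that resource is registered as a new abstract pain one level up.

```python
# Sketch of hierarchical pain creation: relief_chain maps each pain to the
# resource whose consumption relieves it; each relieving resource spawns a
# new abstract pain for its own scarcity.
def create_abstract_pains(primitive_pain, relief_chain):
    """Return the pain hierarchy grown from the primitive pain."""
    pains = [primitive_pain]
    while pains[-1] in relief_chain:
        resource = relief_chain[pains[-1]]
        pains.append(f"lack of {resource}")
    return pains

# Food example from the notes above (chain contents are assumptions).
chain = {"hunger": "food", "lack of food": "grocery money"}
print(create_abstract_pains("hunger", chain))
# -> ['hunger', 'lack of food', 'lack of grocery money']
```

The same rule applied to the watering case study would grow the chain from "dry soil" up through "lack of water in can", "lack of water in tank", and so on to the lake.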