6. !4
All models are wrong, but some are useful
(George Box)
David Hand
Theory-driven models can be wrong
But data-driven models cannot be wrong
7. !4
All models are wrong, but some are useful
(George Box)
David Hand
Theory-driven models can be wrong
But data-driven models cannot be wrong
or right
8. !4
All models are wrong, but some are useful
(George Box)
David Hand
Theory-driven models can be wrong
But data-driven models cannot be wrong
or right
Data-driven are not trying to describe an underlying reality.
so they could be poor or useless, but not wrong
But are merely intended to be useful
35. !31
• the number of immediate neighbors who are
“heavy” (non-hydrogen) atoms
• the valence minus the number of hydrogens
• the atomic number
• the atomic mass
• the atomic charge
• the number of attached hydrogens
• whether the atom is contained in at least one ring
• hydrogen-bond acceptor or not?
• hydrogen-bond donor or not?
• negatively ionizable or not?
• positively ionizable or not?
• aromatic or not?
• halogen or not?
Rogers+, Extended-Connectivity Fingerprints. J. Chem. Inf. Model., 2010, 50 (5), pp 742–754
Faber+, Prediction Errors of Molecular Machine Learning Models Lower than Hybrid DFT Error. J. Chem. Theory Comput., 2017, 13 (11), pp 5255–5264
38. !34
v
xv<latexit sha1_base64="xMPgqSyD/mawPfrhGPiiW2hkCAw=">AAACi3ichVHNSgJRFD5Of2aZVpugjSRGKzlTQSEtpAha+pM/oCIz09UG54+ZUTLxBVq2aWGbghbRA/QAbXqBFj5CtDRo06Iz40CUZGe4c7/73fOd+917REORLRux7+MmJqemZ/yzgbn54EIovLiUt/SmKbGcpCu6WRQFiymyxnK2bCusaJhMUEWFFcTGgbNfaDHTknXt2G4brKIKdU2uyZJgE1Usi2rnrFttVcNRjKMbkVHAeyAKXqT08COU4QR0kKAJKjDQwCasgAAWfSXgAcEgrgId4kxCsrvPoAsB0jYpi1GGQGyD/nValTxWo7VT03LVEp2i0DBJGYEYvuA9DvAZH/AVP/+s1XFrOF7aNItDLTOqoYuV7Me/KpVmG06/VWM921CDXderTN4Nl3FuIQ31rfOrQTaRiXXW8RbfyP8N9vGJbqC13qW7NMv0xvgRyQu9GDWI/92OUZDfjPMY59Pb0eS+1yo/rMIabFA/diAJR5CCnNuHS+jBNRfktrgEtzdM5XyeZhl+BHf4BTUVky0=</latexit><latexit sha1_base64="xMPgqSyD/mawPfrhGPiiW2hkCAw=">AAACi3ichVHNSgJRFD5Of2aZVpugjSRGKzlTQSEtpAha+pM/oCIz09UG54+ZUTLxBVq2aWGbghbRA/QAbXqBFj5CtDRo06Iz40CUZGe4c7/73fOd+917REORLRux7+MmJqemZ/yzgbn54EIovLiUt/SmKbGcpCu6WRQFiymyxnK2bCusaJhMUEWFFcTGgbNfaDHTknXt2G4brKIKdU2uyZJgE1Usi2rnrFttVcNRjKMbkVHAeyAKXqT08COU4QR0kKAJKjDQwCasgAAWfSXgAcEgrgId4kxCsrvPoAsB0jYpi1GGQGyD/nValTxWo7VT03LVEp2i0DBJGYEYvuA9DvAZH/AVP/+s1XFrOF7aNItDLTOqoYuV7Me/KpVmG06/VWM921CDXderTN4Nl3FuIQ31rfOrQTaRiXXW8RbfyP8N9vGJbqC13qW7NMv0xvgRyQu9GDWI/92OUZDfjPMY59Pb0eS+1yo/rMIabFA/diAJR5CCnNuHS+jBNRfktrgEtzdM5XyeZhl+BHf4BTUVky0=</latexit><latexit sha1_base64="xMPgqSyD/mawPfrhGPiiW2hkCAw=">AAACi3ichVHNSgJRFD5Of2aZVpugjSRGKzlTQSEtpAha+pM/oCIz09UG54+ZUTLxBVq2aWGbghbRA/QAbXqBFj5CtDRo06Iz40CUZGe4c7/73fOd+917REORLRux7+MmJqemZ/yzgbn54EIovLiUt/SmKbGcpCu6WRQFiymyxnK2bCusaJhMUEWFFcTGgbNfaDHTknXt2G4brKIKdU2uyZJgE1Usi2rnrFttVcNRjKMbkVHAeyAKXqT08COU4QR0kKAJKjDQwCasgAAWfSXgAcEgrgId4kxCsrvPoAsB0jYpi1GGQGyD/nValTxWo7VT03LVEp2i0DBJGYEYvuA9DvAZH/AVP/+s1XFrOF7aNItDLTOqoYuV7Me/KpVmG06/VWM921CDXderTN4Nl3FuIQ31rfOrQTaRiXXW8RbfyP8N9vGJbqC13qW7NMv0xvgRyQu9GDWI/92OUZDfjPMY59Pb0eS+1yo/rMIabFA/diAJR5CCnNuHS+jBNRfktrgEtzdM5XyeZhl+BHf4BTUVky0=</latexit><latexit sha1_base64="xMPgqSyD/mawPfrhGPiiW2hkCAw=">AAACi3ichVHNSgJRFD5Of2aZVpugjSRGKzlTQSEtpAha+pM/oCIz09UG54+ZUTLxBVq2aWGbghbRA/QAbXqBFj5CtDRo06Iz40CUZGe4c7/73fOd+917REORLRux7+MmJqemZ/yzgbn54EIovLiUt/SmKbGcpCu6WRQFiymyxnK2bCusaJhMUEWFFcTGgbNfaDHTknXt2G4brKIKdU2uyZJgE1Usi2rnrFttVcNRjKMbkVHAeyAKXqT08COU4QR0kKAJKjDQwCasgAAWfSXgAcEgrgId4kxCsrvPoAsB0jYpi1GGQGyD/nValTxWo7VT03LVEp2i0DBJGYEYvuA9DvAZH/AVP/+s1XFrOF7aNItDLTOqoYuV7Me/KpVmG06/VWM921CDXderTN4Nl3FuIQ31rfOrQTaRiXXW8RbfyP8N9vGJbqC13qW7NMv0xvgRyQu9GDWI/92OUZDfjPMY59Pb0eS+1yo/rMIabFA/diAJR5CCnNuHS+jBNRfktrgEtzdM5XyeZhl+BHf4BTUVky0=</latexit>
hG<latexit sha1_base64="peoO2C4CZtQgbVCeJaYQ32lHrFU=">AAACl3ichVHLLgRBFD3aa7wHG2IjJsRqclskxMaEBEuDQWJk0t3KTEe/0l0zQWd+wA9YSCQkgvgAH2DjByx8gliS2Fi409OJILid6jp16p5bp+rqnmUGkuixQWlsam5pTbS1d3R2dfcke/vWA7fsGyJnuJbrb+paICzTETlpSktser7QbN0SG/refG1/oyL8wHSdNXngiW1bKzrmrmlokqlCsi+v22GpWsjbmiwZmhUuVgvJFKUpiuGfQI1BCnEsu8lb5LEDFwbKsCHgQDK2oCHgbwsqCB5z2wiZ8xmZ0b5AFe2sLXOW4AyN2T3+F3m1FbMOr2s1g0ht8CkWD5+VwxilB7qmF7qnG3qi919rhVGNmpcDnvW6VniFnqOB1bd/VTbPEqVP1Z+eJXYxHXk12bsXMbVbGHV95fD4ZXVmZTQco3N6Zv9n9Eh3fAOn8mpcZMXKyR9+dPbCL8YNUr+34ydYn0irlFazk6nMXNyqBIYwgnHuxxQyWMIyclx/H6e4xJUyqMwqC8pSPVVpiDX9+BJK9gPHYJex</latexit><latexit sha1_base64="peoO2C4CZtQgbVCeJaYQ32lHrFU=">AAACl3ichVHLLgRBFD3aa7wHG2IjJsRqclskxMaEBEuDQWJk0t3KTEe/0l0zQWd+wA9YSCQkgvgAH2DjByx8gliS2Fi409OJILid6jp16p5bp+rqnmUGkuixQWlsam5pTbS1d3R2dfcke/vWA7fsGyJnuJbrb+paICzTETlpSktser7QbN0SG/refG1/oyL8wHSdNXngiW1bKzrmrmlokqlCsi+v22GpWsjbmiwZmhUuVgvJFKUpiuGfQI1BCnEsu8lb5LEDFwbKsCHgQDK2oCHgbwsqCB5z2wiZ8xmZ0b5AFe2sLXOW4AyN2T3+F3m1FbMOr2s1g0ht8CkWD5+VwxilB7qmF7qnG3qi919rhVGNmpcDnvW6VniFnqOB1bd/VTbPEqVP1Z+eJXYxHXk12bsXMbVbGHV95fD4ZXVmZTQco3N6Zv9n9Eh3fAOn8mpcZMXKyR9+dPbCL8YNUr+34ydYn0irlFazk6nMXNyqBIYwgnHuxxQyWMIyclx/H6e4xJUyqMwqC8pSPVVpiDX9+BJK9gPHYJex</latexit><latexit sha1_base64="peoO2C4CZtQgbVCeJaYQ32lHrFU=">AAACl3ichVHLLgRBFD3aa7wHG2IjJsRqclskxMaEBEuDQWJk0t3KTEe/0l0zQWd+wA9YSCQkgvgAH2DjByx8gliS2Fi409OJILid6jp16p5bp+rqnmUGkuixQWlsam5pTbS1d3R2dfcke/vWA7fsGyJnuJbrb+paICzTETlpSktser7QbN0SG/refG1/oyL8wHSdNXngiW1bKzrmrmlokqlCsi+v22GpWsjbmiwZmhUuVgvJFKUpiuGfQI1BCnEsu8lb5LEDFwbKsCHgQDK2oCHgbwsqCB5z2wiZ8xmZ0b5AFe2sLXOW4AyN2T3+F3m1FbMOr2s1g0ht8CkWD5+VwxilB7qmF7qnG3qi919rhVGNmpcDnvW6VniFnqOB1bd/VTbPEqVP1Z+eJXYxHXk12bsXMbVbGHV95fD4ZXVmZTQco3N6Zv9n9Eh3fAOn8mpcZMXKyR9+dPbCL8YNUr+34ydYn0irlFazk6nMXNyqBIYwgnHuxxQyWMIyclx/H6e4xJUyqMwqC8pSPVVpiDX9+BJK9gPHYJex</latexit><latexit sha1_base64="peoO2C4CZtQgbVCeJaYQ32lHrFU=">AAACl3ichVHLLgRBFD3aa7wHG2IjJsRqclskxMaEBEuDQWJk0t3KTEe/0l0zQWd+wA9YSCQkgvgAH2DjByx8gliS2Fi409OJILid6jp16p5bp+rqnmUGkuixQWlsam5pTbS1d3R2dfcke/vWA7fsGyJnuJbrb+paICzTETlpSktser7QbN0SG/refG1/oyL8wHSdNXngiW1bKzrmrmlokqlCsi+v22GpWsjbmiwZmhUuVgvJFKUpiuGfQI1BCnEsu8lb5LEDFwbKsCHgQDK2oCHgbwsqCB5z2wiZ8xmZ0b5AFe2sLXOW4AyN2T3+F3m1FbMOr2s1g0ht8CkWD5+VwxilB7qmF7qnG3qi919rhVGNmpcDnvW6VniFnqOB1bd/VTbPEqVP1Z+eJXYxHXk12bsXMbVbGHV95fD4ZXVmZTQco3N6Zv9n9Eh3fAOn8mpcZMXKyR9+dPbCL8YNUr+34ydYn0irlFazk6nMXNyqBIYwgnHuxxQyWMIyclx/H6e4xJUyqMwqC8pSPVVpiDX9+BJK9gPHYJex</latexit>
v
h(1)
v =
xv
0<latexit sha1_base64="t0YIAtusNim9537WktegaHytwZo=">AAACxXichVFNS9xQFD1GW+3Y1lE3QjdDB4vdDDelYCkIgy506UdHBWOHvPgcH+aL5E1Qw+DeP+DClQWF0h/QH+DGhVsL/oTiUsFNF73JhJZWam9I3nnn3nNz3rsidFWsia56jN6+R4/7B56UBp8+ez5UHh5ZjoN25MiGE7hBtCrsWLrKlw2ttCtXw0jannDlitieyfIriYxiFfgf9G4o1z275atN5diaqWa5bgkv3eo0k4/phPm6U5mqWEK2lJ8Kz9aR2umUsoIdLrCsHBIz0t/4lW+Wq1SjPCr3gVmAKoqYD8pfYWEDARy04UHCh2bswkbMzxpMEELm1pEyFzFSeV6igxJr21wlucJmdpu/Ld6tFazP+6xnnKsd/ovLb8TKCsbpkj7TDZ3TF/pOP/7ZK817ZF52eRVdrQybQwdjS3f/VXm8amz9Vj3oWWMT73Kvir2HOZOdwunqk73Dm6X3i+PpK/pE1+z/mK7ojE/gJ7fOyYJcPHrAj2AvfGM8IPPvcdwHy29qJtXMhbfV+nQxqgG8wEtM8DwmUccc5tHg/qe4wCW+GbOGZ2gj6ZYaPYVmFH+Esf8T3nyqPg==</latexit><latexit sha1_base64="t0YIAtusNim9537WktegaHytwZo=">AAACxXichVFNS9xQFD1GW+3Y1lE3QjdDB4vdDDelYCkIgy506UdHBWOHvPgcH+aL5E1Qw+DeP+DClQWF0h/QH+DGhVsL/oTiUsFNF73JhJZWam9I3nnn3nNz3rsidFWsia56jN6+R4/7B56UBp8+ez5UHh5ZjoN25MiGE7hBtCrsWLrKlw2ttCtXw0jannDlitieyfIriYxiFfgf9G4o1z275atN5diaqWa5bgkv3eo0k4/phPm6U5mqWEK2lJ8Kz9aR2umUsoIdLrCsHBIz0t/4lW+Wq1SjPCr3gVmAKoqYD8pfYWEDARy04UHCh2bswkbMzxpMEELm1pEyFzFSeV6igxJr21wlucJmdpu/Ld6tFazP+6xnnKsd/ovLb8TKCsbpkj7TDZ3TF/pOP/7ZK817ZF52eRVdrQybQwdjS3f/VXm8amz9Vj3oWWMT73Kvir2HOZOdwunqk73Dm6X3i+PpK/pE1+z/mK7ojE/gJ7fOyYJcPHrAj2AvfGM8IPPvcdwHy29qJtXMhbfV+nQxqgG8wEtM8DwmUccc5tHg/qe4wCW+GbOGZ2gj6ZYaPYVmFH+Esf8T3nyqPg==</latexit><latexit sha1_base64="t0YIAtusNim9537WktegaHytwZo=">AAACxXichVFNS9xQFD1GW+3Y1lE3QjdDB4vdDDelYCkIgy506UdHBWOHvPgcH+aL5E1Qw+DeP+DClQWF0h/QH+DGhVsL/oTiUsFNF73JhJZWam9I3nnn3nNz3rsidFWsia56jN6+R4/7B56UBp8+ez5UHh5ZjoN25MiGE7hBtCrsWLrKlw2ttCtXw0jannDlitieyfIriYxiFfgf9G4o1z275atN5diaqWa5bgkv3eo0k4/phPm6U5mqWEK2lJ8Kz9aR2umUsoIdLrCsHBIz0t/4lW+Wq1SjPCr3gVmAKoqYD8pfYWEDARy04UHCh2bswkbMzxpMEELm1pEyFzFSeV6igxJr21wlucJmdpu/Ld6tFazP+6xnnKsd/ovLb8TKCsbpkj7TDZ3TF/pOP/7ZK817ZF52eRVdrQybQwdjS3f/VXm8amz9Vj3oWWMT73Kvir2HOZOdwunqk73Dm6X3i+PpK/pE1+z/mK7ojE/gJ7fOyYJcPHrAj2AvfGM8IPPvcdwHy29qJtXMhbfV+nQxqgG8wEtM8DwmUccc5tHg/qe4wCW+GbOGZ2gj6ZYaPYVmFH+Esf8T3nyqPg==</latexit><latexit sha1_base64="t0YIAtusNim9537WktegaHytwZo=">AAACxXichVFNS9xQFD1GW+3Y1lE3QjdDB4vdDDelYCkIgy506UdHBWOHvPgcH+aL5E1Qw+DeP+DClQWF0h/QH+DGhVsL/oTiUsFNF73JhJZWam9I3nnn3nNz3rsidFWsia56jN6+R4/7B56UBp8+ez5UHh5ZjoN25MiGE7hBtCrsWLrKlw2ttCtXw0jannDlitieyfIriYxiFfgf9G4o1z275atN5diaqWa5bgkv3eo0k4/phPm6U5mqWEK2lJ8Kz9aR2umUsoIdLrCsHBIz0t/4lW+Wq1SjPCr3gVmAKoqYD8pfYWEDARy04UHCh2bswkbMzxpMEELm1pEyFzFSeV6igxJr21wlucJmdpu/Ld6tFazP+6xnnKsd/ovLb8TKCsbpkj7TDZ3TF/pOP/7ZK817ZF52eRVdrQybQwdjS3f/VXm8amz9Vj3oWWMT73Kvir2HOZOdwunqk73Dm6X3i+PpK/pE1+z/mK7ojE/gJ7fOyYJcPHrAj2AvfGM8IPPvcdwHy29qJtXMhbfV+nQxqgG8wEtM8DwmUccc5tHg/qe4wCW+GbOGZ2gj6ZYaPYVmFH+Esf8T3nyqPg==</latexit>
h(t 1)
v
a(t)
v
h(t)
v
yv
zv
h
(T )
v
xv
tanh
X
v2V
(yv) tanh(zv)
!hG =<latexit sha1_base64="ARMGXVwsPnaLefmPlcb2NUBMhVE=">AAACmXichVHLSsNAFD3GV33Xx0JwUyyKq3IjgiIIPhaKK7VWBSsliVMbzItkWqihP+APuHCjggv1A/wAN/6ACz9BXCq4ceFtGhAV9YbJnDlzz50zc3XPMgNJ9NikNLe0trUnOjq7unt6+5L9A1uBW/YNkTNcy/V3dC0QlumInDSlJXY8X2i2bolt/XCpvr9dEX5gus6mrHpiz9YOHLNoGppkqpAcyut2WKoV8rYmS4Zmhcu11FwhmaYMRZH6CdQYpBHHmpu8RR77cGGgDBsCDiRjCxoC/nahguAxt4eQOZ+RGe0L1NDJ2jJnCc7QmD3k/wGvdmPW4XW9ZhCpDT7F4uGzMoUxeqAreqF7uqEnev+1VhjVqHup8qw3tMIr9B0PZ9/+Vdk8S5Q+VX96lihiJvJqsncvYuq3MBr6ytHJS3Z2Yywcpwt6Zv/n9Eh3fAOn8mpcrouN0z/86OyFX4wbpH5vx0+wNZlRKaOuT6XnF+NWJTCCUUxwP6YxjxWsIcf1j3CGK1wrI8qCsqKsNlKVplgziC+hZD8A53GYIg==</latexit><latexit sha1_base64="ARMGXVwsPnaLefmPlcb2NUBMhVE=">AAACmXichVHLSsNAFD3GV33Xx0JwUyyKq3IjgiIIPhaKK7VWBSsliVMbzItkWqihP+APuHCjggv1A/wAN/6ACz9BXCq4ceFtGhAV9YbJnDlzz50zc3XPMgNJ9NikNLe0trUnOjq7unt6+5L9A1uBW/YNkTNcy/V3dC0QlumInDSlJXY8X2i2bolt/XCpvr9dEX5gus6mrHpiz9YOHLNoGppkqpAcyut2WKoV8rYmS4Zmhcu11FwhmaYMRZH6CdQYpBHHmpu8RR77cGGgDBsCDiRjCxoC/nahguAxt4eQOZ+RGe0L1NDJ2jJnCc7QmD3k/wGvdmPW4XW9ZhCpDT7F4uGzMoUxeqAreqF7uqEnev+1VhjVqHup8qw3tMIr9B0PZ9/+Vdk8S5Q+VX96lihiJvJqsncvYuq3MBr6ytHJS3Z2Yywcpwt6Zv/n9Eh3fAOn8mpcrouN0z/86OyFX4wbpH5vx0+wNZlRKaOuT6XnF+NWJTCCUUxwP6YxjxWsIcf1j3CGK1wrI8qCsqKsNlKVplgziC+hZD8A53GYIg==</latexit><latexit sha1_base64="ARMGXVwsPnaLefmPlcb2NUBMhVE=">AAACmXichVHLSsNAFD3GV33Xx0JwUyyKq3IjgiIIPhaKK7VWBSsliVMbzItkWqihP+APuHCjggv1A/wAN/6ACz9BXCq4ceFtGhAV9YbJnDlzz50zc3XPMgNJ9NikNLe0trUnOjq7unt6+5L9A1uBW/YNkTNcy/V3dC0QlumInDSlJXY8X2i2bolt/XCpvr9dEX5gus6mrHpiz9YOHLNoGppkqpAcyut2WKoV8rYmS4Zmhcu11FwhmaYMRZH6CdQYpBHHmpu8RR77cGGgDBsCDiRjCxoC/nahguAxt4eQOZ+RGe0L1NDJ2jJnCc7QmD3k/wGvdmPW4XW9ZhCpDT7F4uGzMoUxeqAreqF7uqEnev+1VhjVqHup8qw3tMIr9B0PZ9/+Vdk8S5Q+VX96lihiJvJqsncvYuq3MBr6ytHJS3Z2Yywcpwt6Zv/n9Eh3fAOn8mpcrouN0z/86OyFX4wbpH5vx0+wNZlRKaOuT6XnF+NWJTCCUUxwP6YxjxWsIcf1j3CGK1wrI8qCsqKsNlKVplgziC+hZD8A53GYIg==</latexit><latexit sha1_base64="ARMGXVwsPnaLefmPlcb2NUBMhVE=">AAACmXichVHLSsNAFD3GV33Xx0JwUyyKq3IjgiIIPhaKK7VWBSsliVMbzItkWqihP+APuHCjggv1A/wAN/6ACz9BXCq4ceFtGhAV9YbJnDlzz50zc3XPMgNJ9NikNLe0trUnOjq7unt6+5L9A1uBW/YNkTNcy/V3dC0QlumInDSlJXY8X2i2bolt/XCpvr9dEX5gus6mrHpiz9YOHLNoGppkqpAcyut2WKoV8rYmS4Zmhcu11FwhmaYMRZH6CdQYpBHHmpu8RR77cGGgDBsCDiRjCxoC/nahguAxt4eQOZ+RGe0L1NDJ2jJnCc7QmD3k/wGvdmPW4XW9ZhCpDT7F4uGzMoUxeqAreqF7uqEnev+1VhjVqHup8qw3tMIr9B0PZ9/+Vdk8S5Q+VX96lihiJvJqsncvYuq3MBr6ytHJS3Z2Yywcpwt6Zv/n9Eh3fAOn8mpcrouN0z/86OyFX4wbpH5vx0+wNZlRKaOuT6XnF+NWJTCCUUxwP6YxjxWsIcf1j3CGK1wrI8qCsqKsNlKVplgziC+hZD8A53GYIg==</latexit>
40. !36
↑ “part position paper, part review, and part unification” (DeepMind)
11 Jun 2018 arXiv:1806.01261 [cs.LG]
41. !37
We argue that combinatorial generalization must be a top priority for AI to achieve human-like
abilities, and that structured representations and computations are key to realizing this objective.
•Combinatorial optimization
(Bello+ 2016; Nowak+ 2017; Dai+ 2017)
•Boolean satisfiability
(Selsam+ 2018)
•Program representation and verification
(Allamanis+ 2018; Li+ 2016)
•Modeling cellular automata and Turing
machines (Johnson, 2017)
•Performing inference in graphical
models (Yoon+ 2018)