This document provides an overview of where and how artificial intelligence (AI) is used in materials science. It discusses several key areas:
1) Hypothesis generation using archival data and machine learning to predict new materials.
2) Data acquisition, cleaning, and feature identification using AI techniques like denoising and artifact removal from experimental data.
3) Knowledge extraction from large datasets using unsupervised learning methods like non-negative matrix factorization to identify materials phases.
4) Closing the materials discovery loop with demonstrations of autonomous materials research systems that integrate computation with AI-driven autonomous synthesis and characterization.
Hattrick-Simpers MRS Webinar on AI in Materials
1. Where Exactly Does One Actually Use AI in Materials Science?
Brian DeCost, Zachary Trautt, Martin Green, Gilad Kusne,
Jason Hattrick-Simpers
NIST Gaithersburg
Jason.Hattrick-Simpers@nist.gov
@jae3goals
Any mention of commercial products within this talk is for information only; it does not imply recommendation or
endorsement by NIST.
2. Caveats
• These slides were presented as an MRS webinar on Machine Learning, AI, and Data-Driven Materials Development and Design
• They are intended as a quick-hit overview of where people apply AI in materials science
• The works presented are a mixture of my own work and the work of others.
• Do not infer from the inclusion of a citation with et al. that I participated in the work!!
• Follow the citation and attribute the work appropriately
• This is not an exhaustive review of the field; many great groups were not included, not as a slight on the magnitude of their contributions, but merely for the sake of brevity.
• Use with caution
3. Context for This Part of the Webinar
OEM
• High-level (non-application) breakthroughs
• Revolutionizing society with broadly applicable tool sets
Distributor
• Adapting OEM breakthroughs to domain-specific applications
• Revolutionizing communities with disruptive new tools
• Educating the larger materials community
Consumer
• Test-driving new tools and using them to discover new materials/science
• Making “minor” tool adaptations
4. Beware of the AI/ML Hyperbole!
• Did Google guess my last name with just my picture?
• Clearly no!
• ML models are often interpolative and correlative, but our INTERPRETATIONS can build in causation that doesn’t exist.
5. Who Is Your AI More Like?
Judea Pearl arXiv:1801.04016v1
“Modern AIs do not read, do not understand. They only disguise as if they do.”
-Noriko Arai, “Can a robot pass a university entrance exam?”
http://www.notablebiographies.com/Ni-Pe/Pauling-Linus.html
https://www.nist.gov/content/nist-and-nobel/nobel-moment-dan-shechtman
7. The Scientific Method Today
[Loop diagram: collect, clean, and process archival data; feed it to a predictive model; extract knowledge; recommend materials; synthesize and characterize them; collect the new data and feed it back into the archive.]
8. Let’s Talk About Where We Use A.I.
• Hypothesis generation: getting and cleaning the archival data; predicting materials of interest
• New data acquisition: synthesis optimization; measurement optimization
• Data processing: pre-processing; feature identification
• Knowledge extraction: data minimization; actionable information
• Closing the loop?
9. Hypothesis Generation
• Most techniques require data to seed the search
• Where is this data coming from?
• Curated archives (Landolt-Börnstein)
• Online databases
• Technical literature
• Lab Notebooks
• LIMS systems
• All models need effective descriptors
• Chemical
• Structural
• Do the ML models explicitly contain physics?
• Extrapolative versus interpolative
10. Chemical Descriptors in a (Random) Forest
Ward et al. npj Comp. Mater. (2016), 28.
[Workflow figure: experimental data (5,739 measurements) → composition-based representation (145 attributes: stoichiometric fractions x_H, x_He, …; statistics of elemental properties such as mean atomic number $\mu_Z$, electronegativity difference $\Delta\chi$, spread of melting temperature $\sigma_{T_m}$, and maximum covalent radius $\max r_{cov}$) → machine learning algorithm (random forest) predicting glass-forming ability, $GFA = f(x_H, x_{He}, \ldots)$, e.g. a split on $\sigma_r < 1.1$ Å separating MG from not MG. Screening 24 million ternary alloys yields 74,520 potential MGs.]
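The Ward et al. pipeline (composition → hand-built elemental-property attributes → random forest) can be illustrated with a minimal sketch. This is not the actual Magpie implementation: the property table, compositions, labels, and attribute choices below are toy placeholders, and scikit-learn is assumed to be available.

```python
# Sketch: featurize compositions with elemental-property statistics, then
# train a random forest to classify glass-forming ability (GFA).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy elemental property table (atomic radius in Å, electronegativity);
# illustrative values only, not a real attribute set.
PROPS = {
    "Cu": (1.28, 1.90), "Zr": (1.60, 1.33),
    "Al": (1.43, 1.61), "Ni": (1.24, 1.91),
}

def featurize(comp):
    """comp: dict element -> atomic fraction. Returns weighted-mean and
    range statistics of each elemental property."""
    feats = []
    for p in range(2):
        vals = np.array([PROPS[el][p] for el in comp])
        wts = np.array([comp[el] for el in comp])
        feats += [float(np.dot(wts, vals)),      # composition-weighted mean
                  float(vals.max() - vals.min())]  # range across elements
    return feats

# Toy training set: (composition, is_metallic_glass)
data = [({"Cu": 0.5, "Zr": 0.5}, 1),
        ({"Al": 0.9, "Ni": 0.1}, 0),
        ({"Cu": 0.6, "Zr": 0.4}, 1),
        ({"Al": 0.8, "Ni": 0.2}, 0)]
X = np.array([featurize(c) for c, _ in data])
y = np.array([label for _, label in data])

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
pred = model.predict([featurize({"Cu": 0.55, "Zr": 0.45})])
```

In the real work the representation has 145 attributes and the model is trained on thousands of measurements before screening millions of candidate alloys.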
11. JARVIS-ML
Websites:
https://jarvis.nist.gov
https://www.ctcms.nist.gov/jarvisml
• In addition to chemical descriptors, structural descriptors are also very important
• 1557 such chemo-structural descriptors (CFID)
• 1557 descriptors for more than 25000 materials is a huge multi-dimensional space
• Visualization with manifold-learning techniques, such as t-SNE
• Conventional data-science accuracy metrics (MAE, RMSE, etc.) are not enough from a materials-science perspective
• Integrating a genetic algorithm with a formation-energy regression model to map phase space and reproduce reality with ML
• We trained 12 regression and 2 classification models with gradient-boosted decision trees
• To enable others to use our trained ML models, we made a Flask-Python app for on-the-fly ML predictions on any arbitrary material
Publications:
Phys. Rev. Materials 2, 08380 (2018)
Scientific Data 4, 160125 (2017)
Scientific Reports 7, 5179 (2017)
Scientific Data 5, 180082 (2018)
Phys. Rev. B 98, 014107 (2018)
arXiv:1804.01024 (2018)
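The t-SNE visualization step mentioned above can be sketched as follows. The descriptor matrix here is random stand-in data, not real CFID descriptors, and scikit-learn is assumed to be available.

```python
# Sketch: embedding a high-dimensional descriptor space (e.g. 1557 CFID
# descriptors over ~25k materials) into 2-D with t-SNE for plotting.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for an (n_materials x n_descriptors) matrix.
X = rng.normal(size=(200, 50))

# perplexity must be smaller than the number of samples; "pca" init
# generally gives more stable embeddings than random init.
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(X)
# emb[:, 0], emb[:, 1] are the 2-D coordinates one would scatter-plot,
# colored by a property of interest (formation energy, band gap, ...).
```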
15. Data Acquisition – Synthesis Optimization
• A newly predicted material is not useful if I can’t make it
• If I can make it and no one else can, then my work isn’t impactful
• Options are:
• Learn from the literature what has worked before
• Ensure that a framework exists for the transfer of materials
16. Strangest Things – Dark Reactions
Friedler, Schrier, Norquist, et al., Nature 533, 73 (2016).
“Dark Reactions” refers to the many ‘failed’ reactions we attempt every day in the lab.
Build an interpretable model → predict new materials!!
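An interpretable-model workflow of this kind can be sketched with a shallow decision tree whose splits read off as testable hypotheses. The descriptors, the planted rule, and the data below are toy placeholders, not the actual chemistry descriptors from the paper; scikit-learn is assumed.

```python
# Sketch: fit a shallow (hence human-readable) decision tree on reaction
# descriptors and print its rules as candidate hypotheses.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(4)
# Toy descriptors: [pH, reaction time in hours]; outcome 1 = crystal formed.
X = rng.uniform([1, 1], [13, 72], size=(200, 2))
y = ((X[:, 0] > 7) & (X[:, 1] > 24)).astype(int)  # planted toy rule

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
# The printed rules ("pH <= ...", "time_h <= ...") are the hypotheses a
# chemist could take back into the lab and test.
rules = export_text(tree, feature_names=["pH", "time_h"])
```

The depth limit is the interpretability knob: a depth-2 tree is a pair of if/else statements a human can argue with, which is exactly what made the dark-reactions model actionable.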
19. Playing FAIR with Data
To be Findable:
• (meta)data are assigned a globally unique and
persistent identifier
• data are described with rich metadata
• metadata clearly and explicitly include the identifier
of the data it describes
• (meta)data are registered or indexed in a searchable
resource
To be Accessible:
• (meta)data are retrievable by their identifier using a
standardized communications protocol
– the protocol is open, free, and universally
implementable
– the protocol allows for an authentication and
authorization procedure, where necessary
• metadata are accessible, even when the data are no
longer available
To be Interoperable:
• (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge
representation.
• (meta)data use vocabularies that follow FAIR
principles
• (meta)data include qualified references to other
(meta)data
To be Reusable:
• meta(data) are richly described with a plurality of
accurate and relevant attributes
– (meta)data are released with a clear and accessible data
usage license
– (meta)data are associated with detailed provenance
– (meta)data meet domain-relevant community standards
Wilkinson, Mark D., et al. "The FAIR Guiding Principles for scientific data
management and stewardship." Scientific data 3 (2016). DOI:
10.1038/sdata.2016.18
20. Data Processing – Data Cleaning/Feature Identification
• Raw data are noisy, filled with
artifacts, and often contain little
information per pixel
• AI methods can help
• Denoise data
• Remove spurious artifacts
• Perform (preliminary) analysis
• Identify Features
21. Unsupervised De-Noising of Atomic Images
Vasudevan, R. K., et al., Applied Physics Letters 106(9), 091601 (2015); Belianinov, A., et al., Advanced Structural and Chemical Imaging 1(1), 6 (2015).
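One common unsupervised denoising scheme for periodic atomic-resolution images is patch-based PCA: slice the image into windows, flatten each window to a vector, keep only the leading principal components, and reassemble. Below is a rough NumPy sketch under those assumptions; the synthetic image, window size, and component count are arbitrary (in practice the window size can be chosen from the FFT's primary peak, and the method requires periodicity).

```python
# Sketch: patch-based PCA denoising of a periodic "atomic lattice" image.
import numpy as np

rng = np.random.default_rng(1)
# Synthetic periodic pattern (period 8 px) plus Gaussian noise.
x = np.arange(64)
img = np.cos(2 * np.pi * np.add.outer(x, x) / 8) + 0.5 * rng.normal(size=(64, 64))

w = 8  # window size, matched to the lattice period here
offsets = [(i, j) for i in range(0, 64, w) for j in range(0, 64, w)]
patches = np.array([img[i:i+w, j:j+w].ravel() for i, j in offsets])

# PCA via SVD on mean-centered patches; keep the top k components.
mean = patches.mean(axis=0)
U, S, Vt = np.linalg.svd(patches - mean, full_matrices=False)
k = 2
denoised_patches = (U[:, :k] * S[:k]) @ Vt[:k] + mean

# Stitch the low-rank patches back into an image.
den = np.zeros_like(img)
for n, (i, j) in enumerate(offsets):
    den[i:i+w, j:j+w] = denoised_patches[n].reshape(w, w)
```

Because every patch of a perfectly periodic image is (nearly) identical, the signal concentrates in the first few components while the noise spreads over all of them, which is why truncation denoises.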
22. Data Processing
[Figure: signal-to-noise in a diffraction pattern vs. Q (1/Å): high-quality data vs. noisier data; crystallinity vs. an experimental mishap (beam blocked by the sample holder); sampling ranging from sparse coverage to optimal density.]
Fang Ren, et al., ACS Comb. Sci., DOI: 10.1021/acscombsci.7b00015
24. Knowledge Extraction
• For HTE groups this has been the single most important step for the
past 15 years
• 20,000 diffraction spectra across 14 ternaries and various annealing profiles
• How does one simplify multi-dimensional data (either via visualization
or outright information extraction)?
• How can AI be used to optimize experimental measurement time?
• How can AI make redundant tasks easier so humans can focus on
high-level tasks?
• Pretty big rift between supervised and unsupervised data analysis
communities
26. Deep Learning for Microstructure Data
Deep learning: jointly learn features and predictive model (representation learning + classifier)
• microstructure → processing
• segmentation and quantification
DeCost, et al., accepted for publication in Microscopy & Microanalysis (arXiv:1805.08693 (2018)); DeCost, Acta Mater. 2017, DOI: 10.1016/j.actamat.2017.05.014
27. Data Processing – Knowledge Extraction Pipeline
Data acquisition → data processing and conditioning (convert and integrate; calculate a similarity measure) → phase attribution
Hattrick-Simpers, et al., APL Materials 4 (2016)
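The conditioning and similarity steps of such a pipeline can be sketched roughly as follows. The 1-D patterns below are synthetic Gaussian peaks, and cosine similarity stands in as one simple choice of similarity measure, not necessarily the one used in the cited work.

```python
# Sketch: compare 1-D diffraction intensity profiles with cosine similarity.
import numpy as np

q = np.linspace(1, 6, 500)  # scattering-vector axis, 1/Å

def pattern(peaks):
    """Synthetic 1-D pattern: sum of Gaussian peaks at the given q positions."""
    return sum(np.exp(-(q - p) ** 2 / (2 * 0.02)) for p in peaks)

def cosine_sim(a, b):
    """Cosine similarity between two intensity profiles (1 = identical shape)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = pattern([2.0, 3.5])  # spot 1
b = pattern([2.0, 3.5])  # spot 2, same peak set -> similarity ~ 1
c = pattern([2.6, 4.8])  # different peak set -> low similarity
```

Grouping spots whose pairwise similarity exceeds a threshold is one simple route from raw patterns to phase attribution.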
28. Supervised Knowledge Extraction
J. K. Bunn, et al., JMR 30 (2015)
[Figure: expert one-by-one human analysis and attribution vs. AutoPhase analysis and attribution; panels cover sample data selection, training data selection, and general feature analysis.]
29. Unsupervised Knowledge Extraction – NMF
[Figure: de-convolved XRD spectra (intensity, arb. units, vs. 2θ from 37° to 48°) and the resulting phase diagram for the Fe–Pd–Ga ternary (corners at Fe, Fe40Pd60, Fe40Ga60). The phase mixture for each spot is shown as a pie chart, e.g. Fe46Pd26Ga28: 22% FCC Fe, 41% BCC Fe, 31% FCC FePd.]
C. J. Long, et al., Rev. Sci. Instrum. 80, 103902 (2009); Stanev, et al., npj Computational Materials (2018)
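The NMF step can be sketched with scikit-learn: mixed patterns are factored into non-negative per-spot weights times non-negative endmember spectra, which is what makes the factors interpretable as phase fractions and pure-phase patterns. The endmembers and mixtures below are synthetic, not the Fe–Pd–Ga data.

```python
# Sketch: NMF deconvolution of mixed XRD patterns into endmember phases.
import numpy as np
from sklearn.decomposition import NMF

q = np.linspace(20, 50, 300)  # 2-theta axis, degrees
phase1 = np.exp(-(q - 28) ** 2 / 2) + np.exp(-(q - 41) ** 2 / 2)
phase2 = np.exp(-(q - 33) ** 2 / 2) + np.exp(-(q - 46) ** 2 / 2)

# Each measured spot is a non-negative mixture of the two endmembers.
rng = np.random.default_rng(2)
fracs = rng.uniform(0, 1, size=20)
X = np.outer(fracs, phase1) + np.outer(1 - fracs, phase2)

model = NMF(n_components=2, init="nndsvd", max_iter=1000, random_state=0)
W = model.fit_transform(X)  # per-spot phase weights (the "pie charts")
H = model.components_       # recovered endmember patterns
```

Normalizing each row of `W` to sum to one gives the phase-fraction pie chart for that spot.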
30. Why not perform the analysis during the experiment?
Kusne, et al., Scientific Reports 4, 6367 (2014)
31. Iterative ML – High-throughput Experimental Studies
Discover New Metallic Glasses 100x Faster
Are “deep” eutectics necessary for growing thin-film metallic glasses?
Ren, et al., Science Advances 4(4) (2018)
33. “In the next 5 years, AI-driven, autonomous
materials research is going to fundamentally
change how we do materials science.”
-Jim Warren, Technical Program Director for
Materials Genomics, NIST
38. Active clustering for autonomous XRD phase mapping
Think carefully about modeling to remove researcher degrees of freedom
DeCost, et al., to be submitted
39. Conclusions
• AI & ML are already prevalent in the design of new materials, materials synthesis, data capture/cleaning, and knowledge extraction
• Neither AI nor ML is a panacea that will replace human intuition and creativity; they are enablers
• In some cases an order-of-magnitude increase in materials exploration/discovery is possible
• Maybe a fairer metric of AI’s influence will be its effect on the rate of hypothesis generation and (in)validation
• AI needs FAIR data, including negative results, to be effective
• Not part of the solution = consigned to obscurity
• Full materials research autonomy (for specific problems) has already been demonstrated
40. Demonstrations and Talks by (confirmed speakers):
• Theory
• Computational Approaches
• Experimental Approaches
Confirmed speakers: Andrew Millis (Columbia), Antoine Georges (CCQ), Karin Rabe (Rutgers), Sasha Balatsky (LANL), Roger Melko (Waterloo), Shoucheng Zhang (Stanford), Stefano Curtarolo (Duke), Gus Hart (BYU), Ichiro Takeuchi (UMD), Sergei Kalinin (ORNL), Benji Maruyama (AFRL), Jiun-Haw Chu (Univ. Washington), Giuseppe Carleo (Flatiron), Miles Stoudenmire (Flatiron)
Bootcamp: Machine Learning for Materials Research & Workshop: Machine Learning Quantum Materials
• Dates: July 30 – Aug 3, 2018
• Location: IBBR (Gaithersburg, Maryland)
MLMR introduces researchers from industry, national labs, and academia to machine learning theory and tools for rapid data analysis.
https://nanocenter.umd.edu/events/mlmr/
Bootcamp
Three days of lectures and hands-on exercises covering a range of data analysis topics, from data pre-processing through advanced machine-learning analysis techniques. Example topics include:
• Identifying important features in complex/high-dimensional data
• Visualizing high-dimensional data to facilitate user analysis
• Identifying the fabrication ‘descriptors’ that best predict variance in functional properties
• Quantifying similarities between materials using complex/high-dimensional data
The hands-on exercises will demonstrate practical use of machine learning tools on real materials data (scalar values, spectra, micrographs, etc.).
41. Acknowledgements
USC: Travis Williams
SLAC: Dr. Apurva Mehta, Dr. Fang Ren, Dr. Suchismita
Northwestern: Prof. Wolverton, Dr. Logan Ward
UNSW: Prof. Kevin Laws
NIST: Dr. Martin Green, Dr. Zachary Trautt, Dr. Gilad Kusne, Dr. Brian DeCost, Mr. Ryan Smith (REU)
NREL: Dr. Andriy Zakutayev
CSM: Prof. Packard, Dr. Schoeppner
42. Acknowledgements for Providing Slides of
Their Work
• Kamal Choudhary and Francesca Tavazza (JARVIS/NIST)
• Alex Belianinov (PCA – STEM or STM/ORNL)
• Turab Lookman (Piezoelectric Bayesian slide/LANL)
• Benji Maruyama (AFRL – ARES)
• Jason Hein (UBC – ADA)
• Rama Vasudevan (Bayesian Local Imaging/ORNL)
• John Perkins (NREL – LIMS/NREL)
• Joshua Schrier (Interpretable models /Haverford College)
• Brian DeCost (GP phase diagram control and CNN microstructure / NIST with Liz Holm of
CMU)
• Gilad Kusne (Automated phase mapping & NMF / NIST with Ichiro Takeuchi of UMD)
• Apurva Mehta (Automated Synchrotron Data minimization/SLAC – SSRL)
• Chris Wolverton and Logan Ward (Magpie predicting metallic glasses/Northwestern)
Emphasize the use of ML in my work, but also rampant skepticism.
Maybe back off and say that these are just examples of where we use AI, that the list is not exhaustive, and that few studies do only one of these; often things are linked in a workflow.
Stoichiometric attributes capture the fraction but not the type of elements present.
Elemental-property attributes of atomic row, Mendeleev number, atomic weight, total # of unfilled states, etc., with weighted averages, max, min, range, average deviation, and mode.
Valence-orbital-occupation attributes.
Ionic-compound attributes…
AFLOWLIB and Materials Project
"dark reactions" are failed...*recorded* (in lab notebooks), but *not published* (in open literature)
The intepretable model gives us *hypotheses* about the underlying chemistry of crystal formation, in terms of physical attributes and reactions conditions. We've actually gone into the laboratory to test them, and 2.5 of the three hypotheses are correct...
Not just predict new materials, but predict the recipes to synthesize the new materials. Better than "human experts", even on a problem where the humans think they have a good strategy.
The idea: start with an arbitrarily sized image, cut it into a bunch of squares, and flatten them into linear vectors. With many repeated units in an image, you want to quantify how similar the patches are as a function of position. Window sizing can be automated by using an FFT to get a square the size of the primary peak. ABSOLUTELY requires periodicity.
Here they are actually plotting the calculated polarization for each of the spots (often the property of interest).
Disorder analysis from switching traces. To investigate the variability of the switching current profiles, we performed K-means analysis with 12 clusters, restricting the data to the switching current traces in the positive voltage window. The result of the K-means analysis is shown in the cluster label map in a, and the mean response of each cluster in b. Note that clusters with <20 members are not plotted on this graph. To obtain a more quantitative estimate on the disorder, a Gaussian fitting procedure was applied to each switching current profile.
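The K-means step described in this note can be sketched as follows, on synthetic switching-current traces. The real analysis also drops clusters with fewer than 20 members and follows up with Gaussian fitting, both omitted here; scikit-learn is assumed.

```python
# Sketch: K-means clustering of switching-current traces, then inspect
# the mean response of each cluster.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 100)
# Two synthetic families of traces with different switching onsets, plus noise.
group1 = 1 / (1 + np.exp(-(t - 0.3) * 40)) + 0.05 * rng.normal(size=(50, 100))
group2 = 1 / (1 + np.exp(-(t - 0.7) * 40)) + 0.05 * rng.normal(size=(50, 100))
traces = np.vstack([group1, group2])

# 12 clusters, as in the described analysis (our toy data has only 2 groups).
km = KMeans(n_clusters=12, n_init=10, random_state=0).fit(traces)
labels = km.labels_                  # cluster label map
cluster_means = km.cluster_centers_  # mean response of each cluster
```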
Top: Lots of recent (since 2012) breakthroughs in computer vision credited to deep learning -- interesting applications in analyzing microscopy data as well.
Key point: It's not 'automate the technicians and engineers jobs out of existence', it's 'automate the boring stuff so the technicians and engineers can be more efficient, work with better information, and ask more interesting/impactful questions of the microstructure.'
Deep learning: build high-level, complex features out of lower-level features. For image data, this is accomplished by stacking many layers of learnable convolution filters that do feature extraction.
2. Deep learning in 6 words: jointly learn features and predictive model
CNN cartoon (top): visualization of the activations in a convolutional neural network (CNN) looking at an ultrahigh carbon steel micrograph.
The layers of the network track with different types of features: Levels 1 & 2 respond to edges, Level 3 to carbide particles and the carbide network, and so on.
This particular CNN is designed for image segmentation -- it uses features from all five levels to classify individual pixels according to: proeutectoid carbide (cyan), ferrite (denuded zone, blue), ferrite with particles (yellow), and Widmanstätten carbide (green).
Bottom left: Deep learning can be used to relate image data to processing or properties metadata.
We should prepare interesting samples, automate the collection of many images, and have ML models flag interesting features for us to review!
figure: t-SNE visualization of CNN features for ~1000 ultrahigh carbon steel micrographs. Colormap indicates annealing temperature and relative marker size indicates annealing time for this dataset. (inset: colors show primary microconstituents. purple: network carbide, blue: ferrite w/ particles, yellow: pearlite). Similar annealing condition -> similar microstructure :: similar microstructure -> similar CNN features
Bottom right: Deep learning can also be used to automate/augment/scale up the kind of microstructure analyses we can do.
e.g. more powerful segmentation models --> broader scope of fast quantification of interesting microstructure features. Automate the image acquisition and you can get good statistics for things that are currently just subjective. For instance, imagine having a CNN automatically produce the denuded-zone width distribution; this would be very challenging to do by hand.
ORNL group doing atomistic defect analysis and tracking in the TEM
Dane Morgan's group identifying and segmenting individual dislocation loops in TEM images
Mean shift theory…
Love this slide, but perhaps you can merge it with the next one to show the LOOP?
If you’re interested in machine learning, we have an annual bootcamp at University of Maryland that teaches a wide variety of these techniques. You’ll learn things like how to identify important features in your data, how to visualize complex or high dimensional data, and how to identify descriptors.
Each morning there are lectures and the afternoons are hands on activities applying machine learning to real materials data.
And Ichiro Takeuchi, me and some collaborators have also organized an annual bootcamp.
At this bootcamp we teach an introduction to machine learning, the most common techniques. Half of each day is also hands on training where you learn how to write code to analyze real data. Some examples of stuff we teach. How to …