Where Exactly Does One
Actually Use AI in Materials
Science?
Brian DeCost, Zachary Trautt, Martin Green, Gilad Kusne,
Jason Hattrick-Simpers
NIST Gaithersburg
Jason.Hattrick-Simpers@nist.gov
@jae3goals
Any mention of commercial products within this talk is for information only; it does not imply recommendation or
endorsement by NIST.
Caveats
• These slides were presented as an MRS webinar on Machine Learning, AI,
and Data-Driven Materials Development and Design
• They are intended as a quick-hit overview of where people apply AI in
materials science
• The works presented are a mixture of my own work and the work of
others.
• Do not infer from the inclusion of a citation with et al. that I participated in the
work!
• Follow the citation and appropriately attribute the work
• This is not an exhaustive review of the field and many great groups were
not included, not as a slight on the magnitude of their contributions, but
merely for the sake of brevity.
• Use with caution
Context for This Part of the Webinar
OEM
• High level (non-application)
breakthroughs
• Revolutionizing society with
broadly applicable tool sets
Distributor
• Adapting OEM breakthroughs
to domain specific applications
• Revolutionizing communities
with disruptive new tools
• Educate the larger materials
community
Consumer
• Test-driving new tools and using
them to discover new
materials/science
• Make “minor” tool adaptations
Beware of the AI/ML Hyperbole!
• Did Google guess my last name
with just my picture????
• Clearly No!!
• ML models are often interpolative
and correlative but our
INTERPRETATIONS can build in
causation that doesn’t exist.
Who Is Your AI More Like?
Judea Pearl arXiv:1801.04016v1
Modern AIs do not read, do not understand. They only disguise as if they do.
-Noriko Arai “Can a robot pass a university entrance exam?”
http://www.notablebiographies.com/Ni-Pe/Pauling-Linus.html https://www.nist.gov/content/nist-and-nobel/nobel-moment-dan-shechtman
DeCost et al., to be submitted
The Scientific Method Today
Archival Data
Knowledge
Predictive
Model
Recommend
Materials
Synthesize
Characterize
Clean
Process
Collect
Feed
Extract
Knowledge
Collect
Feed
Let’s Talk About Where We Use A.I.
Hypothesis
Generation
Getting and
Cleaning the
Archival Data
Predicting
Material of
Interest
New Data
Acquisition
Synthesis
Optimization
Measurement
Optimization
Data
Processing
Pre-
processing
Feature
Identification
Knowledge
Extraction
Data
Minimization
Actionable
Information
Closing the Loop?
Hypothesis Generation
• Most techniques require data to seed the search
• Where is this data coming from?
• Curated archives (Landolt-Börnstein)
• Online databases
• Technical literature
• Lab Notebooks
• LIMS systems
• All models need effective descriptors
• Chemical
• Structural
• Do the ML models explicitly contain physics?
• Extrapolative versus interpolative
Chemical Descriptors in a (Random) Forest
Ward et al. npj Comp. Mater. (2016), 28.
Experimental
Data
Machine Learning
Algorithm
Composition-based
Representation
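Example decision-tree split: $\sigma_r < 1.1$ Å → MG / Not MG
Example attributes: $\mu_Z$, $\Delta\chi$, $\sigma_{T_m}$, $\max r_{cov}$, $\lVert (x_H, x_{He}, \ldots) \rVert_2$
$\mathrm{GFA} = f(x_H, x_{He}, \ldots)$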
24 Million Ternary Alloys
74520 potential MGs
5739 measurements
145 Attributes
Random Forest
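The composition-based representation above turns each alloy into a vector of attributes before any learning happens. A minimal sketch of that first step follows, assuming a tiny hand-entered property table and a hypothetical helper named composition_attributes. It computes only three of the attribute types named on the slide and is not the 145-attribute set or the code of Ward et al.

```python
import math

# Small illustrative property table (atomic number, Pauling electronegativity).
# Only three elements are included; a real featurizer covers the periodic table.
ELEMENT_DATA = {
    "Al": {"Z": 13, "chi": 1.61},
    "Cu": {"Z": 29, "chi": 1.90},
    "Zr": {"Z": 40, "chi": 1.33},
}


def composition_attributes(amounts):
    """Return a few composition-based attributes for a dict of element amounts."""
    total = sum(amounts.values())
    x = {el: amt / total for el, amt in amounts.items()}  # atomic fractions, sum to 1

    z = {el: ELEMENT_DATA[el]["Z"] for el in x}
    chi = {el: ELEMENT_DATA[el]["chi"] for el in x}

    mean_z = sum(x[el] * z[el] for el in x)                # fraction-weighted mean of Z
    range_chi = max(chi.values()) - min(chi.values())      # spread in electronegativity
    l2_norm = math.sqrt(sum(f ** 2 for f in x.values()))   # L2 norm of the fractions

    return {"mean_Z": mean_z, "range_chi": range_chi, "l2_norm": l2_norm}


# Hypothetical example composition: Zr50Cu40Al10
attrs = composition_attributes({"Zr": 50, "Cu": 40, "Al": 10})
print(attrs)
```

Vectors like this, one per measured composition, are the kind of input a random forest classifier would be trained on to label MG versus Not MG.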
JARVIS-ML
Websites:
https://jarvis.nist.gov
https://www.ctcms.nist.gov/jarvisml
• In addition to chemical descriptors, structural descriptors are also very
important
• 1557 such chemo-structural descriptors (CFID)
• 1557 descriptors for more than 25000 materials is a huge
multi-dimensional space
• Visualization with manifold-learning techniques, such as t-SNE
• Conventional data-science accuracy metrics (MAE,
RMSE, etc.) are not enough from a materials-science
perspective
• Integrating a genetic algorithm with a formation-energy
regression model to map phase space and reproduce
reality with ML
• We trained 12 regression and 2 classification models
with gradient boosting decision trees
• To enable others to use our trained ML models, we made
a Flask (Python) app for on-the-fly ML predictions for any
arbitrary material
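One way to see why MAE and RMSE alone can mislead is a toy comparison in which two models share the same MAE but distribute their errors very differently. The numbers below are made up for illustration and are not JARVIS-ML results; the function names are just local helpers.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Made-up formation energies (eV/atom), for illustration only.
y_true = [-1.0, -0.5, 0.0, 0.5]
model_a = [-0.9, -0.4, 0.1, 0.6]  # 0.1 error on every material
model_b = [-1.0, -0.5, 0.0, 0.9]  # exact except for a single 0.4 miss

print(mae(y_true, model_a), rmse(y_true, model_a))
print(mae(y_true, model_b), rmse(y_true, model_b))
```

Both models have an MAE of about 0.1, yet model_b hides one large miss. Neither number says whether the predicted ordering of competing phases is physically right, which is the materials-science check the bullet above asks for.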
Publications:
• Phys. Rev. Materials 2, 08380 (2018)
• Scientific Data 4, 160125 (2017)
• Scientific Reports 7, 5179 (2017)
• Scientific Data 5, 180082 (2018)
• Phys. Rev. B 98, 014107 (2018)
• arXiv:1804.01024 (2018)
Bayesian Inference for Materials Design
Gubernatis and Lookman, PRM (2018)
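A common ingredient in Bayesian approaches to design is an acquisition function that trades off a model's predicted value against its uncertainty. The sketch below uses expected improvement for a maximization problem under a Gaussian prediction. The slide does not say which acquisition function the cited work uses, so this choice and the candidate values are assumptions.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best_so_far):
    """Expected improvement over best_so_far for a Gaussian prediction (maximization)."""
    if sigma <= 0.0:
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    return (mu - best_so_far) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical candidates: (name, predicted mean, predicted standard deviation)
candidates = [("A", 1.00, 0.05), ("B", 0.90, 0.40), ("C", 1.05, 0.01)]
best_so_far = 1.0

ranked = sorted(
    candidates,
    key=lambda c: expected_improvement(c[1], c[2], best_so_far),
    reverse=True,
)
print([name for name, _, _ in ranked])
```

In this toy case the uncertain candidate B ranks first even though its predicted mean is the lowest, which is the exploration behavior that distinguishes this kind of search from simply picking the best prediction.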
Machine Learning Isn’t Always Enough
c/o Citrine Informatics
Layer Physical Models with Machine Learning
Models
Use Engineering Constraints to Limit
Search Space
Hattrick-Simpers et al. MSED (2018)
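Using engineering constraints to limit the search space can be as simple as filtering a composition grid before any model is run. The sketch below enumerates a ternary A-B-C grid and keeps only the points that satisfy two made-up constraints. The thresholds and function names are hypothetical and are not taken from the cited paper.

```python
from itertools import product


def ternary_grid(steps):
    """All (a, b, c) fractions on a regular grid with a + b + c = 1."""
    points = []
    for i, j in product(range(steps + 1), repeat=2):
        k = steps - i - j
        if k >= 0:
            points.append((i / steps, j / steps, k / steps))
    return points


def satisfies_constraints(point, max_fraction_a=0.2, min_fraction_c=0.5):
    """Hypothetical constraints: cap a costly element A, keep C the majority element."""
    a, _, c = point
    return a <= max_fraction_a and c >= min_fraction_c


grid = ternary_grid(10)  # 10 at.% steps
feasible = [p for p in grid if satisfies_constraints(p)]
print(len(grid), len(feasible))
```

Here the constraints cut the 66 grid points down to 15 before any synthesis or prediction is attempted.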
NREL Data Eco-System for Data Analytics
Zakutayev et al. Scientific Data, 2018
Data Acquisition – Synthesis Optimization
• A newly predicted material is not
useful if I can’t make it
• If I can make it and no one else
can then my work isn’t impactful
• Options are:
• Learn from the literature what has
worked before
• Ensure that a framework exists for
transferal of materials
Strangest Things – Dark Reactions
Friedler, Schrier, Norquist et al. Nature, 2016, 533, 73.
Dark Reactions refers to the many ‘failed’
reactions we attempt every day in the lab.
Build an interpretable model Predict new materials!!
Natural Language Processing for Synthesis
Optimizing Deposition Profiles (II)
J. K. Bunn et al., I&EC, 55 (2016)
Playing FAIR with Data
To be Findable:
• (meta)data are assigned a globally unique and
persistent identifier
• data are described with rich metadata
• metadata clearly and explicitly include the identifier
of the data it describes
• (meta)data are registered or indexed in a searchable
resource
To be Accessible:
• (meta)data are retrievable by their identifier using a
standardized communications protocol
– the protocol is open, free, and universally
implementable
– the protocol allows for an authentication and
authorization procedure, where necessary
• metadata are accessible, even when the data are no
longer available
To be Interoperable:
• (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge
representation.
• (meta)data use vocabularies that follow FAIR
principles
• (meta)data include qualified references to other
(meta)data
To be Reusable:
• (meta)data are richly described with a plurality of
accurate and relevant attributes
– (meta)data are released with a clear and accessible data
usage license
– (meta)data are associated with detailed provenance
– (meta)data meet domain-relevant community standards
Wilkinson, Mark D., et al. "The FAIR Guiding Principles for scientific data
management and stewardship." Scientific data 3 (2016). DOI:
10.1038/sdata.2016.18
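A few of the principles above can be checked mechanically on a metadata record. The sketch below is one possible minimal check, assuming a flat dictionary record. The field names, the placeholder identifier, and the example URL are all hypothetical and are not part of the FAIR specification.

```python
# Hypothetical field names mapped to the principle each one loosely supports.
REQUIRED_FIELDS = {
    "identifier": "Findable: globally unique, persistent identifier",
    "access_url": "Accessible: retrievable via a standardized protocol",
    "vocabulary": "Interoperable: shared vocabulary or schema",
    "license": "Reusable: clear data usage license",
    "provenance": "Reusable: detailed provenance",
}


def missing_fair_fields(record):
    """Return the required fields that are absent or empty in a metadata record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


record = {
    "identifier": "doi:10.0000/placeholder",  # placeholder, not a real DOI
    "access_url": "https://example.org/datasets/placeholder",
    "vocabulary": "hypothetical-schema-v1",
    "license": "CC-BY-4.0",
    "provenance": "",  # left empty on purpose
}

missing = missing_fair_fields(record)
for name in missing:
    print(f"missing {name}: {REQUIRED_FIELDS[name]}")
```

A check like this covers only the presence of fields; whether the identifier actually resolves or the license is appropriate still needs human review.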
Data Processing – Data Cleaning/Feature
Identification
• Raw data are noisy, filled with
artifacts, and often contain little
information per pixel
• AI methods can help
• Denoise data
• Remove spurious artifacts
• Perform (preliminary) analysis
• Identify Features
Unsupervised De-Noising of Atomic Images
Vasudevan, R. K., Applied Physics Letters 106(9), 091601 (2015); Belianinov, A., Advanced Structural and Chemical Imaging 1(1), 6 (2015)
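One unsupervised route to de-noising is to keep only the leading principal components of an image and discard the rest. The sketch below does this with a truncated SVD on a synthetic, periodic "lattice" image. The image, the noise level, and the choice of rank are assumptions, and this is not the exact procedure of the cited papers.

```python
import numpy as np


def low_rank_denoise(image, rank):
    """Reconstruct an image from its leading `rank` singular components."""
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


rng = np.random.default_rng(0)

# Synthetic rank-1 periodic pattern standing in for an atomic-resolution image.
rows = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
cols = np.cos(np.linspace(0.0, 4.0 * np.pi, 48))
clean = np.outer(rows, cols)
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

denoised = low_rank_denoise(noisy, rank=1)

noisy_error = np.linalg.norm(noisy - clean)
denoised_error = np.linalg.norm(denoised - clean)
print(noisy_error, denoised_error)
```

On real data the clean image is unknown, so the number of components to keep is usually chosen from the decay of the singular values rather than from an error against ground truth.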
Data Processing
Q (1/Å)
Signal to Noise in a Pattern
High Quality Data
Noisier Data
Crystallinity
Experimental Mishap
Beam blocked by sample holder
Sparse → Optimal Coverage Density
Fang Ren, ACS Comb. Sci. - DOI: 10.1021/acscombsci.7b00015
Feature Identification with MOnsters
Knowledge Extraction
• For HTE groups this has been the single most important step for the
past 15 years
• 20,000 diffraction spectra across 14 ternaries and various annealing profiles
• How does one simplify multi-dimensional data (either via visualization
or outright information extraction)?
• How can AI be used to optimize experimental measurement time?
• How can AI make redundant tasks easier so humans can focus on
high-level tasks?
• Pretty big rift between supervised and unsupervised data analysis
communities
Rapid IV Acquisition by Bayesian Inference
DeCost et al., accepted for publication in Microscopy & Microanalysis
(arXiv preprint arXiv:1805.08693 (2018))
Representation learning Classifier
Deep learning: jointly learn features and predictive model
DeCost Acta Mater. 2017 DOI: 10.1016/j.actamat.2017.05.014
microstructure --> processing
segmentation and quantification
Deep Learning for Microstructure Data
Data Processing – Knowledge Extraction
Pipeline
Data Acquisition
Data Processing and Conditioning
Phase Attribution
Convert and integrate
Calculate Similarity Measure
Hattrick-Simpers et al. APL Materials 4 (2016)
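The "Calculate Similarity Measure" step of the pipeline compares integrated diffraction patterns pairwise so that similar patterns can be grouped before phase attribution. The sketch below uses cosine similarity on synthetic single-peak patterns. The choice of metric, the Q grid, and the peak positions are assumptions and may differ from what the cited work uses.

```python
import math


def gaussian_peak(q_values, center, width=0.05):
    """Synthetic 1D pattern with a single Gaussian peak."""
    return [math.exp(-0.5 * ((q - center) / width) ** 2) for q in q_values]


def cosine_similarity(a, b):
    """Cosine of the angle between two intensity vectors (1.0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


q = [2.0 + 0.01 * i for i in range(201)]  # hypothetical Q grid from 2.0 to 4.0

pattern_1 = gaussian_peak(q, 3.00)
pattern_2 = gaussian_peak(q, 3.01)  # peak almost at the same position
pattern_3 = gaussian_peak(q, 3.50)  # peak far away

sim_12 = cosine_similarity(pattern_1, pattern_2)
sim_13 = cosine_similarity(pattern_1, pattern_3)
print(sim_12, sim_13)
```

Cosine similarity ignores overall intensity scale, which is convenient when film thickness varies, but it penalizes small peak shifts from lattice expansion; that trade-off is one reason the choice of measure matters.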
Supervised Knowledge Extraction
J. K. Bunn et al., JMR, 30 (2015)
Human
Analysis
AutoPhase
Analysis
Expert Analysis
One-by-one
Attribution
AutoPhase
Attribution
Sample Data Selection
Training Data Selection General Feature Analysis
Unsupervised Knowledge Extraction - NMF
De-convolved XRD Spectra (x-axis: 2θ angle, 37 to 48 deg.; y-axis: intensity, arb. units)
Phase
Diagram
Fe
Fe40Pd60
Fe40Ga60
C. J. Long, et al. 2009. Rev. Sci. Instrum. 80, 103902
Fe46Pd26Ga28
FCC Fe
22%
BCC Fe
41%
FCC FePd
31%
Phase mixture for each spot as a pie chart!
Stanev, et al. 2018. npj Computational Materials
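Non-negative matrix factorization (NMF) writes each measured pattern as a non-negative mixture of a few basis patterns, which is what makes the pie-chart view of phase mixtures possible. The sketch below uses the standard multiplicative-update rules on synthetic two-phase data. The peak positions, mixing weights, and iteration count are assumptions, and this is not the implementation used in the cited papers.

```python
import numpy as np


def nmf(X, n_components, n_iter=500, seed=0, eps=1e-9):
    """Factor non-negative X (samples x channels) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], n_components)) + eps
    H = rng.random((n_components, X.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    # Rescale so each basis pattern has unit maximum; W absorbs the scale.
    scale = H.max(axis=1)
    return W * scale, H / scale[:, None]


# Synthetic data: two "phases", each a single peak on a 37 to 48 degree axis.
angle = np.linspace(37.0, 48.0, 120)
phase_a = np.exp(-0.5 * ((angle - 41.0) / 0.3) ** 2)
phase_b = np.exp(-0.5 * ((angle - 44.5) / 0.3) ** 2)
true_weights = np.array([[1.0, 0.0], [0.7, 0.3], [0.3, 0.7], [0.0, 1.0]])
X = true_weights @ np.vstack([phase_a, phase_b])

W, H = nmf(X, n_components=2)
fractions = W / W.sum(axis=1, keepdims=True)  # per-sample mixture fractions
relative_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(np.round(fractions, 2), relative_error)
```

The recovered components come back in arbitrary order, and on real data peak shifts and texture break the simple linear-mixture assumption, so the fractions should be read as a first pass rather than a quantitative phase analysis.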
Why not perform the analysis during the
experiment?
Kusne, et al. Scientific Reports 4, 6367 (2014)
Iterative ML – High-throughput Experimental
Studies
Discover New Metallic Glasses 100x Faster
Are “deep” eutectics necessary for
growing thin film Metallic Glasses?
Ren, et al. Science Advances Vol 4 No. 4 (2018)
Closing the Loop
“In the next 5 years, AI-driven, autonomous
materials research is going to fundamentally
change how we do materials science.”
-Jim Warren, Technical Program Director for
Materials Genomics, NIST
Autonomous Research is Already Here
AFRL - ARES Hein - UBC NIST - UMD
Autonomous REsearch Systems (ARES)
Jason Hein
Autonomous robots
for experimental
chemistry
Alán Aspuru-Guzik
Artificial intelligence to
predict materials
performance
Ada: An Autonomous Discovery Accelerator
Automated
Synthesis
Automated
Characterization
Screen
computationally
(AI + simulation)
Curtis Berlinguette
Clean energy
materials & device
fabrication
Gilad’s Automated Experimentation Platform
Fe
Fe0.4Pd0.6
Fe0.4Ga0.6
Kusne et al., to be submitted
Active clustering for autonomous XRD phase
mapping
Think carefully about modeling to remove researcher degrees of freedom
DeCost et al., to be submitted
Conclusions
• AI & ML are already prevalent in the design of new materials, materials
synthesis, data capture/cleaning and knowledge extraction
• Neither AI nor ML is a panacea that will replace human intuition and
creativity; they are enablers
• In some cases an order of magnitude increase in materials
exploration/discovery is possible
• Maybe a fairer metric of AI’s influence will be on the rate of hypothesis
generation and (in)validation
• AI needs FAIR data including negative results to be effective
• Not part of the solution = consigned to obscurity
• Full materials research autonomy (for specific problems) has already been
demonstrated
Demonstrations and Talks by (confirmed speakers):
• Theory
• Computational Approaches
• Experimental Approaches
Andrew Millis (Columbia)
Antoine Georges (CCQ)
Karin Rabe (Rutgers)
Bootcamp: Machine Learning for Materials Research &
Workshop: Machine Learning Quantum Materials
• Dates: July 30 – Aug 3, 2018
• Location: IBBR (Gaithersburg, Maryland)
MLMR introduces researchers from industry, national labs, and academia to machine learning theory and tools for rapid data analysis.
https://nanocenter.umd.edu/events/mlmr/
Bootcamp
Three days of lectures and hands-on exercises covering a range of
data analysis topics from data pre-processing through advanced
machine learning analysis techniques. Example topics include:
• Identifying important features in complex/high dimensional
data
• Visualizing high dimensional data to facilitate user analysis.
• Identifying the fabrication ‘descriptors’ that best predict
variance in functional properties.
• Quantifying similarities between materials using complex/high
dimensional data
The hands-on exercises will demonstrate practical use of machine
learning tools on real materials data (scalar values, spectra,
micrographs, etc.).
Sasha Balatsky (LANL)
Roger Melko (Waterloo)
Shoucheng Zhang (Stanford)
Stefano Curtarolo (Duke)
Gus Hart (BYU)
Ichiro Takeuchi (UMD)
Sergei Kalinin (ORNL)
Benji Maruyama (AFRL)
Jiun-Haw Chu (Univ. Washington)
Giuseppe Carleo (Flatiron)
Miles Stoudenmire (Flatiron)
Acknowledgements
USC
Travis Williams
SLAC
Dr. Apurva Mehta
Dr. Fang Ren
Dr. Suchismita
Northwestern
Prof. Wolverton
Dr. Logan Ward
UNSW
Prof. Kevin Laws
NIST
Dr. Martin Green
Dr. Zachary Trautt
Dr. Gilad Kusne
Dr. Brian DeCost
Mr. Ryan Smith
(REU)
NREL
Dr. Andriy
Zakutayev
CSM
Prof. Packard
Dr. Schoeppner
Acknowledgements for Providing Slides of
Their Work
• Kamal Choudhary and Francesca Tavazza (JARVIS/NIST)
• Alex Belianinov (PCA – STEM or STM/ORNL)
• Turab Lookman (Piezoelectric Bayesian slide/LANL)
• Benji Maruyama (AFRL – ARES)
• Jason Hein (UBC – ADA)
• Rama Vasudevan (Bayesian Local Imaging/ORNL)
• John Perkins (NREL – LIMS/NREL)
• Joshua Schrier (Interpretable models /Haverford College)
• Brian DeCost (GP phase diagram control and CNN microstructure / NIST with Liz Holm of
CMU)
• Gilad Kusne (Automated phase mapping & NMF / NIST with Ichiro Takeuchi of UMD)
• Apurva Mehta (Automated Synchrotron Data minimization/SLAC – SSRL)
• Chris Wolverton and Logan Ward (Magpie predicting metallic glasses/Northwestern)
Unsupervised Knowledge Extraction
Hattrick-Simpers APL Materials 4 (2016)
Optimizing Deposition Profiles
J. K. Bunn et al., I&EC, 55 (2016)

 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspects
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 

Hattrick-Simpers MRS Webinar on AI in Materials

  • 1. Where Exactly Does One Actually Use AI in Materials Science? Brian DeCost, Zachary Trautt, Martin Green, Gilad Kusne, Jason Hattrick-Simpers NIST Gaithersburg Jason.Hattrick-Simpers@nist.gov @jae3goals Any mention of commercial products within this talk is for information only; it does not imply recommendation or endorsement by NIST.
  • 2. Caveats • These slides were presented as an MRS webinar on Machine Learning, AI, and Data-Driven Materials Development and Design • They are intended as a quick-hit overview of where people apply AI in materials science • The works presented are a mixture of my own work and the work of others. • Do not infer from the inclusion of a citation with et al. that I participated in the work!! • Follow the citation and appropriately attribute the work • This is not an exhaustive review of the field and many great groups were not included, not as a slight on the magnitude of their contributions, but merely for the sake of brevity. • Use with caution
  • 3. Context for This Part of the Webinar OEM • High level (non-application) breakthroughs • Revolutionizing society with broadly applicable tool sets Distributor • Adapting OEM breakthroughs to domain specific applications • Revolutionizing communities with disruptive new tools • Educate the larger materials community Consumer • Test driving new tools and use them to discover new materials/science • Make “minor” tool adaptations
  • 4. Beware of the AI/ML Hyperbole! • Did Google guess my last name with just my picture???? • Clearly No!! • ML models are often interpolative and correlative but our INTERPRETATIONS can build in causation that doesn’t exist.
  • 5. Who Is Your AI More Like? Judea Pearl arXiv:1801.04016v1 Modern AIs do not read, do not understand. They only disguise as if they do. -Noriko Arai “Can a robot pass a university entrance exam?” http://www.notablebiographies.com/Ni-Pe/Pauling-Linus.html https://www.nist.gov/content/nist-and-nobel/nobel-moment-dan-shechtman
  • 6. DeCost et al., to be submitted
  • 7. The Scientific Method Today Archival Data Knowledge Predictive Model Recommend Materials Synthesize Characterize Clean Process Collect Feed Extract Knowledge Collect Feed
  • 8. Let’s Talk About Where We Use A.I. Hypothesis Generation Getting and Cleaning the Archival Data Predicting Material of Interest New Data Acquisition Synthesis Optimization Measurement Optimization Data Processing Pre- processing Feature Identification Knowledge Extraction Data Minimization Actionable Information Closing the Loop?
  • 9. Hypothesis Generation • Most techniques require data to seed the search • Where is this data coming from? • Curated archives (Landolt-Börnstein) • Online databases • Technical literature • Lab notebooks • LIMS systems • All models need effective descriptors • Chemical • Structural • Do the ML models explicitly contain physics? • Extrapolative versus interpolative
  • 10. Chemical Descriptors in a (Random) Forest Ward et al. npj Comp. Mater. (2016), 28. [Figure labels: Experimental Data; Composition-based Representation; Machine Learning Algorithm] Example attributes: $\mu_Z$, $\Delta\chi$, $\sigma_{T_m}$, $\max r_{cov}$, $\lVert x_H, x_{He}, \ldots \rVert_2$. Example tree split: $\sigma_r < 1.1$ Å → MG / Not MG. Model: $GFA = f(x_H, x_{He}, \ldots)$. 24 Million Ternary Alloys 74520 potential MGs 5739 measurements 145 Attributes Random Forest
  • 11. JARVIS-ML Websites: https://jarvis.nist.gov https://www.ctcms.nist.gov/jarvisml • In addition to chemical descriptors, structural descriptors are also very important • 1557 such chemo-structural descriptors (CFID) • 1557 descriptors for more than 25000 materials is a huge multi-dimensional space • Visualization with manifold-learning techniques, such as t-SNE • Conventional data-science accuracy metrics (MAE, RMSE, etc.) are not enough from a materials-science perspective • Integrating a genetic algorithm with a formation energy regression model to map phase space and reproduce reality with ML • We trained 12 regression and 2 classification models with gradient boosting decision trees • To enable others to use our trained ML models, we made a Flask-python app for on-the-fly ML predictions for any arbitrary material Publications: • Phys. Rev. Materials 2, 08380 (2018) • Nature:Scientific Data 4, 160125 (2017) • Nature:Scientific Reports 7, 5179 (2017) • Nature:Scientific Data 5, 180082 (2018) • Phys. Rev. B 98, 014107 (2018) • arXiv:1804.01024 (2018)
  • 12. Bayesian Inference for Materials Design Gubernatis and Lookman, PRM (2018)
  • 13. Machine Learning Isn’t Always Enough c/o Citrine Informatics Layer Physical Models with Machine Learning Models Use Engineering Constraints to Limit Search Space Hattrick-Simpers et al. MSED (2018)
  • 14. NREL Data Eco-System for Data Analytics Zakutayev et al. Scientific Data, 2018
  • 15. Data Acquisition – Synthesis Optimization • A newly predicted material is not useful if I can’t make it • If I can make it and no one else can then my work isn’t impactful • Options are: • Learn from the literature what has worked before • Ensure that a framework exists for transferal of materials
  • 16. Strangest Things – Dark Reactions Friedler, Schrier, Norquist et al. Nature, 2016, 533, 73. Dark Reactions refers to the many ‘failed’ reactions we attempt every day in the lab. Build an interpretable model Predict new materials!!
  • 18. Optimizing Deposition Profiles (II) J. K. Bunn, et al., I&EC, 55 (2016)
  • 19. Playing FAIR with Data To be Findable: • (meta)data are assigned a globally unique and persistent identifier • data are described with rich metadata • metadata clearly and explicitly include the identifier of the data it describes • (meta)data are registered or indexed in a searchable resource To be Accessible: • (meta)data are retrievable by their identifier using a standardized communications protocol – the protocol is open, free, and universally implementable – the protocol allows for an authentication and authorization procedure, where necessary • metadata are accessible, even when the data are no longer available To be Interoperable: • (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. • (meta)data use vocabularies that follow FAIR principles • (meta)data include qualified references to other (meta)data To be Reusable: • (meta)data are richly described with a plurality of accurate and relevant attributes – (meta)data are released with a clear and accessible data usage license – (meta)data are associated with detailed provenance – (meta)data meet domain-relevant community standards Wilkinson, Mark D., et al. "The FAIR Guiding Principles for scientific data management and stewardship." Scientific data 3 (2016). DOI: 10.1038/sdata.2016.18
  • 20. Data Processing – Data Cleaning/Feature Identification • Raw data are noisy, filled with artifacts, and often contain little information per pixel • AI methods can help • Denoise data • Remove spurious artifacts • Perform (preliminary) analysis • Identify Features
  • 21. Unsupervised De-Noising of Atomic Images Vasudevan, R.K Applied Physics Letters, 106(9), p.091601, 2015, Belianinov, A., Advanced Structural and Chemical Imaging, 1(1), p.6., 2015
  • 22. Data Processing [Figure: diffraction patterns plotted against Q (1/Å). Panels: Signal to Noise in a Pattern (High Quality Data vs. Noisier Data); Crystallinity; Experimental Mishap (beam blocked by sample holder); Coverage Density (Sparse → Optimal)] Fang Ren, ACS Comb. Sci. - DOI: 10.1021/acscombsci.7b00015
  • 24. Knowledge Extraction • For HTE groups this has been the single most important step for the past 15 years • 20,000 diffraction spectra across 14 ternaries and various annealing profiles • How does one simplify multi-dimensional data (either via visualization or outright information extraction)? • How can AI be used to optimize experimental measurement time? • How can AI make redundant tasks easier so humans can focus on high-level tasks? • Pretty big rift between supervised and unsupervised data analysis communities
  • 25. Rapid IV Acquisition by Bayesian Inference
  • 26. DeCost et al accepted for publication in Microscopy & Microanalysis (arXiv preprint arXiv:1805.08693 (2018)) Representation learning Classifier Deep learning: jointly learn features and predictive model DeCost Acta Mater. 2017 DOI: 10.1016/j.actamat.2017.05.014 microstructure --> processing segmentation and quantification Deep Learning for Microstructure Data
  • 27. Data Processing – Knowledge Extraction Pipeline Data Acquisition Data Processing and Conditioning Phase Attribution Convert to - and  integrate Calculate Similarity Measure Hattrick-Simpers, et. al. APL Materials 4 (2016)
  • 28. Supervised Knowledge Extraction J. K. Bunn, et al., JMR, 30 (2015) Human Analysis AutoPhase Analysis Expert Analysis One-by-one Attribution AutoPhase Attribution Sample Data Selection Training Data Selection General Feature Analysis
  • 29. Unsupervised Knowledge Extraction - NMF [Figure: de-convolved XRD spectra, intensity (arb. units) vs. 2θ angle (deg.) from 37 to 48; phase diagram with corners Fe, Fe40Pd60, Fe40Ga60] C. J. Long, et al. 2009. Rev. Sci. Instrum. 80, 103902. Example spot Fe46Pd26Ga28: FCC Fe 22%, BCC Fe 41%, FCC FePd 31%. Phase mixture for each spot as a pie chart! Stanev, et al. 2018. npj Computational Materials
  • 30. Why not perform the analysis during the experiment? Kusne, et al. Scientific Reports 4, 6367 (2014)
  • 31. Iterative ML – High-throughput Experimental Studies Discover New Metallic Glasses 100x Faster Are “deep” eutectics necessary for growing thin film Metallic Glasses? Ren, et al. Science Advances Vol 4 No. 4 (2018)
  • 33. “In the next 5 years, AI-driven, autonomous materials research is going to fundamentally change how we do materials science.” -Jim Warren, Technical Program Director for Materials Genomics, NIST
  • 34. Autonomous Research is Already Here AFRL - ARES Hein - UBC NIST - UMD
  • 36. Ada: An Autonomous Discovery Accelerator • Jason Hein: Autonomous robots for experimental chemistry • Alán Aspuru-Guzik: Artificial intelligence to predict materials performance • Curtis Berlinguette: Clean energy materials & device fabrication • Automated Synthesis • Automated Characterization • Screen computationally (AI + simulation)
  • 37. Gilad’s Automated Experimentation Platform Fe Fe0.4Pd0.6 Fe0.4Ga0.6 Kusne, et al., to be submitted
  • 38. Active clustering for autonomous XRD phase mapping Think carefully about modeling to remove researcher degrees of freedom DeCost, et al., to be submitted
  • 39. Conclusions • AI & ML are already prevalent in the design of new materials, materials synthesis, data capture/cleaning and knowledge extraction • Neither AI nor ML is a panacea that will replace human intuition and creativity; they are enablers • In some cases an order of magnitude increase in materials exploration/discovery is possible • Maybe a fairer metric of AI’s influence will be the rate of hypothesis generation and (in)validation • AI needs FAIR data, including negative results, to be effective • Not part of the solution = consigned to obscurity • Full materials research autonomy (for specific problems) has already been demonstrated
  • 40. Bootcamp: Machine Learning for Materials Research & Workshop: Machine Learning Quantum Materials • Dates: July 30 – Aug 3, 2018 • Location: IBBR (Gaithersburg, Maryland) MLMR introduces researchers from industry, national labs, and academia to machine learning theory and tools for rapid data analysis. https://nanocenter.umd.edu/events/mlmr/ Bootcamp: Three days of lectures and hands-on exercises covering a range of data analysis topics from data pre-processing through advanced machine learning analysis techniques. Example topics include: • Identifying important features in complex/high dimensional data • Visualizing high dimensional data to facilitate user analysis • Identifying the fabrication ‘descriptors’ that best predict variance in functional properties • Quantifying similarities between materials using complex/high dimensional data The hands-on exercises will demonstrate practical use of machine learning tools on real materials data (scalar values, spectra, micrographs, etc.). Demonstrations and Talks by (confirmed speakers): • Theory • Computational Approaches • Experimental Approaches Speakers: Andrew Millis (Columbia), Antoine Georges (CCQ), Karin Rabe (Rutgers), Sasha Balatsky (LANL), Roger Melko (Waterloo), Shoucheng Zhang (Stanford), Stefano Curtarolo (Duke), Gus Hart (BYU), Ichiro Takeuchi (UMD), Sergei Kalinin (ORNL), Benji Maruyama (AFRL), Jiun-Haw Chu (Univ. Washington), Giuseppe Carleo (Flatiron), Miles Stoudenmire (Flatiron)
  • 41. Acknowledgements USC Travis Williams SLAC Dr. Apurva Mehta Dr. Fang Ren Dr. Suchismita Northwestern Prof. Wolverton Dr. Logan Ward UNSW Prof. Kevin Laws NIST Dr. Martin Green Dr. Zachary Trautt Dr. Gilad Kusne Dr. Brian DeCost Mr. Ryan Smith (REU) NREL Dr. Andriy Zakutayev CSM Prof. Packard Dr. Schoeppner
  • 42. Acknowledgements for Providing Slides of Their Work • Kamal Choudhary and Francesca Tavazza (JARVIS/NIST) • Alex Belianinov (PCA – STEM or STM/ORNL) • Turab Lookman (Piezoelectric Bayesian slide/LANL) • Benji Maruyama (AFRL – ARES) • Jason Hein (UBC – ADA) • Rama Vasudevan (Bayesian Local Imaging/ORNL) • John Perkins (NREL – LIMS/NREL) • Joshua Schrier (Interpretable models /Haverford College) • Brian DeCost (GP phase diagram control and CNN microstructure / NIST with Liz Holm of CMU) • Gilad Kusne (Automated phase mapping & NMF / NIST with Ichiro Takeuchi of UMD) • Apurva Mehta (Automated Synchrotron Data minimization/SLAC – SSRL) • Chris Wolverton and Logan Ward (Magpie predicting metallic glasses/Northwestern)
  • 44. Optimizing Deposition Profiles J. K. Bunn, et al., I&EC, 55 (2016)
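
Slide 29 describes using non-negative matrix factorization (NMF) to de-convolve XRD patterns into basis patterns plus per-spot phase weights. The sketch below illustrates that idea on synthetic data only. It is not the implementation used by Long et al. or Stanev et al.: the two Gaussian "phases", the 2θ grid, the helper names, and the choice of plain Lee-Seung multiplicative updates are all assumptions made for illustration.

```python
import math
import random


def gaussian_peak(grid, center, width=0.6):
    """Synthetic diffraction peak sampled on `grid` (illustrative only)."""
    return [math.exp(-0.5 * ((x - center) / width) ** 2) for x in grid]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Frobenius norm of the residual v - w @ h."""
    wh = matmul(w, h)
    return math.sqrt(sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)))


def nmf(v, rank, iters=500, seed=0, eps=1e-12):
    """Factor v (spots x angles) into w (spots x rank) and h (rank x angles).

    Lee-Seung multiplicative updates for the Frobenius objective; entries of
    w and h stay non-negative because each update multiplies by a ratio of
    non-negative numbers.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical composition spread: each spot is a mixture of two phases.
grid = [37 + 0.25 * i for i in range(45)]          # 2-theta, 37 to 48 degrees
phase_a = gaussian_peak(grid, 41.0)
phase_b = gaussian_peak(grid, 44.5)
mix = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
patterns = [[(1 - f) * a + f * b for a, b in zip(phase_a, phase_b)] for f in mix]

weights, basis = nmf(patterns, rank=2)
# Normalized weights per spot: the numbers one would draw as a pie chart.
phase_share = [[x / sum(row) for x in row] for row in weights]
```

Non-negativity is the reason NMF suits this task: diffraction intensities and phase fractions cannot be negative, so the factors stay physically interpretable. In practice the number of components is not known in advance and real patterns include peak shifts and noise, which this sketch ignores.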

Editor's Notes

  1. Emphasize use of ML in my work but rampant skepticism.
  2. Maybe back off and say that these are just examples of where we use AI and that it is not exhaustive and that there are few studies that ONLY do 1. often things are linked in a workflow.
  3. Stoichiometric attributes capture the fraction but not the type of elements present. Elemental property attributes: atomic row, Mendeleev number, atomic weight, total # of unfilled states, etc., each summarized by weighted average, max, min, range, average deviation, and mode. Valence orbital occupation attributes. Ionic compound attributes… AFLOWLIB and MaterialsProject
  4. "Dark reactions" are failed...*recorded* (in lab notebooks), but *not published* (in open literature).   The interpretable model gives us *hypotheses* about the underlying chemistry of crystal formation, in terms of physical attributes and reaction conditions.  We've actually gone into the laboratory to test them, and 2.5 of the three hypotheses are correct...   Not just predict new materials, but predict the recipes to synthesize the new materials.  Better than "human experts", even on a problem where the humans think they have a good strategy.
  5. The idea: start with an arbitrarily sized image, cut it into a bunch of arbitrary squares, and chunk them out as linear vectors. There are a lot of repeating units in an image; you want to chunk out how similar the sub-images are as a function of position. Window sizing can be automated by using an FFT to get a square the size of the primary peak. ABSOLUTELY requires periodicity.
  6. Here they are actually plotting the calculated polarization for each of the spots (often the property of interest). Disorder analysis from switching traces. To investigate the variability of the switching current profiles, we performed K-means analysis with 12 clusters, restricting the data to the switching current traces in the positive voltage window. The result of the K-means analysis is shown in the cluster label map in a, and the mean response of each cluster in b. Note that clusters with <20 members are not plotted on this graph. To obtain a more quantitative estimate of the disorder, a Gaussian fitting procedure was applied to each switching current profile.
  7. Top: Lots of recent (since 2012) breakthroughs in computer vision are credited to deep learning, with interesting applications in analyzing microscopy data as well.
     Key point: It's not 'automate the technicians' and engineers' jobs out of existence', it's 'automate the boring stuff so the technicians and engineers can be more efficient, work with better information, and ask more interesting/impactful questions of the microstructure.'
     Deep learning: build high-level, complex features out of lower-level features. For image data, this is accomplished by stacking many layers of learnable convolution filters that do feature extraction. Deep learning in 6 words: jointly learn features and predictive model.
     CNN cartoon (top): visualization of the activations in a convolutional neural network (CNN) looking at an ultrahigh carbon steel micrograph. The layers of the network track with different types of features: Levels 1 & 2 respond to edges, Level 3 to carbide particles and the network, and so on. This particular CNN is designed for image segmentation: it uses features from all five levels to classify individual pixels as proeutectoid carbide (cyan), ferrite (denuded zone, blue), ferrite with particles (yellow), or Widmanstätten carbide (green).
     Bottom left: Deep learning can be used to relate image data to processing or properties metadata. We should prepare interesting samples, automate the collection of many images, and have ML models flag interesting features for us to review!
     Figure: t-SNE visualization of CNN features for ~1000 ultrahigh carbon steel micrographs. Colormap indicates annealing temperature and relative marker size indicates annealing time for this dataset. (Inset: colors show primary microconstituents. Purple: network carbide, blue: ferrite w/ particles, yellow: pearlite.)
     Similar annealing condition -> similar microstructure :: similar microstructure -> similar CNN features
     Bottom right: Deep learning can also be used to automate/augment/scale up the kind of microstructure analyses we can do, e.g. more powerful segmentation models --> broader scope of fast quantification of interesting microstructure features. Automate the image acquisition and you can get good statistics for things that are currently just subjective. For instance, imagine having a CNN automatically produce the denuded zone width distribution. This would be very challenging
     ORNL group doing atomistic defect analysis and tracking in the TEM. Dane Morgan's group identifying and segmenting individual dislocation loops in TEM images.
  8. Mean shift theory…
  9. Love this slide, but perhaps you can merge it with the next one to show the LOOP?
  10. If you’re interested in machine learning, we have an annual bootcamp at University of Maryland that teaches a wide variety of these techniques. You’ll learn things like how to identify important features in your data, how to visualize complex or high dimensional data, and how to identify descriptors. Each morning there are lectures and the afternoons are hands on activities applying machine learning to real materials data. And Ichiro Takeuchi, me and some collaborators have also organized an annual bootcamp. At this bootcamp we teach an introduction to machine learning, the most common techniques. Half of each day is also hands on training where you learn how to write code to analyze real data. Some examples of stuff we teach. How to …
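
Note 3 above lists the kinds of composition-based attributes fed to the random forest: fraction-weighted statistics of elemental properties, plus stoichiometric attributes that depend only on the fractions. A minimal sketch of how such attributes could be computed follows. It is not the Magpie code. The function names, the tiny atomic-number table, and the exact definitions (fraction-weighted mean and average deviation, mode taken from the most abundant element, an L2 norm of the fractions) are assumptions chosen to match the note's wording.

```python
import re

# Atomic numbers for the few elements used here (standard periodic-table values).
ATOMIC_NUMBER = {"Fe": 26, "Ga": 31, "Pd": 46}


def parse_composition(formula):
    """Parse a simple formula such as 'Fe46Pd26Ga28' into normalized fractions."""
    amounts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    total = sum(amounts.values())
    return {symbol: value / total for symbol, value in amounts.items()}


def property_statistics(fractions, table):
    """Summarize one elemental property over a composition.

    Assumed definitions: the mean and average deviation are weighted by
    atomic fraction, and the mode is the property of the most abundant element.
    """
    values = {el: table[el] for el in fractions}
    mean = sum(fractions[el] * values[el] for el in fractions)
    avg_dev = sum(fractions[el] * abs(values[el] - mean) for el in fractions)
    most_abundant = max(fractions, key=fractions.get)
    return {
        "mean": mean,
        "min": min(values.values()),
        "max": max(values.values()),
        "range": max(values.values()) - min(values.values()),
        "avg_dev": avg_dev,
        "mode": values[most_abundant],
    }


def stoichiometric_norm(fractions, p=2):
    """Lp norm of the fractions: depends on how much, not on which elements."""
    return sum(x ** p for x in fractions.values()) ** (1.0 / p)


fractions = parse_composition("Fe46Pd26Ga28")
features = property_statistics(fractions, ATOMIC_NUMBER)
features["l2_norm"] = stoichiometric_norm(fractions)
```

Repeating `property_statistics` over many elemental property tables gives a fixed-length feature vector for any composition, which is what lets a single model be trained across chemically different systems.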