The MGI & Data-driven High-Throughput Synthesis and Characterization
Brian DeCost, Zachary Trautt, Martin Green, Gilad Kusne,
Jason Hattrick-Simpers
NIST Gaithersburg
Jason.Hattrick-Simpers@nist.gov
@jae3goals
Any mention of commercial products within this talk is for information only; it does not imply recommendation or
endorsement by NIST.
Outline
• The Materials Genome Initiative (MGI) and NIST’s Role
• The High-Throughput Experimental Materials Collaboratory (HTE-MC)
• Accelerated Discovery of (High-Hardness & Corrosion-Resistant) Metallic Glasses
• Iterative HTE and AI
• Vision for the Future
• Look Ma No Hands (Experimentation)!!
• Conclusions
Decrease time-to-market by 50% at a fraction of the cost
• Develop a Materials Innovation
Infrastructure
• Achieve National goals in energy,
security, and human welfare with
advanced materials
• Equip the next generation of
materials workforce
Materials Genome Initiative for
Global Competitiveness
Span the Continuum
Historical Perspective
The Materials Genome Initiative
Apple Watch – Announced September
2014
Examples of Cultural Implementation and
Successes of the MGI
• Argonne Collaboration – phase identification at aluminum interfaces
• Lund Boats – MGI on the plant floor
• Casting Simulation (MAGMA) – MGI in R&D, tool shop, & plant floor
• Timken Steel – Premium Air Melt Practice, putting premium quality,
cost conscious steel into the hands of our customers
• BASF – Foaming simulations based on first principles
• ERCo – Laser Induced Breakdown Spectroscopy for real-time melt
composition (ARPA-E)
Standards Are Important
• The NIST MGI Program is taking a very careful approach to consensus
standards for data representation
• There is a long track record of failure for most of the space
• Exception for highly structured data (e.g. ICSD)
• This should be done top-down, not bottom-up
MGI Directions to Date
Materials by Design
projects:
DOE EFRCs, EMNs
NSF DMREFs
HT computational
databases:
Need: High-throughput
experimental data
Workshop: “Fulfilling the Promise of the Materials
Genome Initiative via High-Throughput
Experimentation” – 2014
Workshop Conclusions
A large portion of the MGI program thus far has been devoted to modeling
and simulation. Prodigious amounts of experimental data will be required to
inform and validate modeling and simulation, to “power the MGI
computational engine.”
 HTE can rapidly establish relationships between composition, structure,
and properties for a wide variety of materials classes, and therefore is:
a) uniquely suited to rapidly generate high quality, consistent data
sets
b) the key enabling counterpart to modeling and simulation for
bringing the MGI to fruition
 “Enable broad access to HTE methodologies and data”
High Throughput Experimental Materials
Collaboratory (HTE-MC)
• Necessary because even one “brick and mortar” HTE facility would be
very costly, and multiple facilities dedicated to different materials
classes (e.g. catalysts, photovoltaics, lightweight structural materials,
etc.) are needed
• Enable researchers at national laboratories, universities, and industry
to have access to HTE facilities
• The HTE-MC would facilitate MGI-driven research while leveraging
investment
• Complement new science investments (EMNs, NNMI, MURI, etc.)
How?
• Collaboratory: a 1989 neologism (William A. Wulf, computer scientist
at the University of Virginia):
“defined by… a center without walls, ‘in which the nation’s
researchers can perform their research without regard to physical
locations, interacting with colleagues, accessing instrumentation,
sharing data and computational resources, … accessing information in
digital libraries’”
• A HTE-MC would consist of:
• An integrated, delocalized network of high-throughput synthesis and
characterization tools
• A best-in-class materials data management platform, consisting of NIST (and
other) software
HTE-MC 1st Steps: NIST – NREL Round Robin
Sample synthesis and measurements:
• Synthesize: Zn-Sn-Ti-O composition spread
sample libraries using combinatorial PLD
(@NIST) or sputtering (@NREL)
• Measure: Chemical composition, Crystal
structure, Electrical conductivity, Optical
transmittance, Band gap
• Exchange: Sample libraries and associated
data, repeat measurements
Zn-Sn-Ti-O:
• Chemical composition
• Crystal structure
• Electrical conductivity
• Optical transmittance
• Work function
Goal: test and improve the standards for exchange of data and samples among participant labs
NREL Samples NIST Sample
Addressing FAIR Principles
To be Findable:
• (meta)data are assigned a globally unique and
persistent identifier
• data are described with rich metadata
• metadata clearly and explicitly include the identifier
of the data it describes
• (meta)data are registered or indexed in a searchable
resource
To be Accessible:
• (meta)data are retrievable by their identifier using a
standardized communications protocol
– the protocol is open, free, and universally
implementable
– the protocol allows for an authentication and
authorization procedure, where necessary
• metadata are accessible, even when the data are no
longer available
To be Interoperable:
• (meta)data use a formal, accessible, shared, and
broadly applicable language for knowledge
representation.
• (meta)data use vocabularies that follow FAIR
principles
• (meta)data include qualified references to other
(meta)data
To be Reusable:
• (meta)data are richly described with a plurality of
accurate and relevant attributes
– (meta)data are released with a clear and accessible data
usage license
– (meta)data are associated with detailed provenance
– (meta)data meet domain-relevant community standards
Wilkinson, Mark D., et al. "The FAIR Guiding Principles for scientific data management and stewardship." Scientific Data 3 (2016). DOI: 10.1038/sdata.2016.18
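As a rough illustration of how the FAIR checklist above could be turned into a machine check, the sketch below defines a hypothetical metadata record and reports which fields are still empty. The class name, field names, and placeholder identifier are assumptions made for this example; they are not part of any NIST or HTE-MC schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one sample-library measurement."""
    identifier: str = ""   # globally unique, persistent ID (Findable)
    access_url: str = ""   # retrievable over a standard protocol (Accessible)
    vocabulary: str = ""   # name of a shared vocabulary or schema (Interoperable)
    license: str = ""      # clear data usage license (Reusable)
    provenance: dict = field(default_factory=dict)  # who/how/when (Reusable)


def missing_fair_fields(record: DatasetRecord) -> List[str]:
    """Return the names of metadata fields that are still empty."""
    return [name for name, value in vars(record).items() if not value]


record = DatasetRecord(
    identifier="hdl:EXAMPLE/0001",             # placeholder, not a real handle
    access_url="https://example.org/data/0001",  # placeholder URL
    license="CC-BY-4.0",
)
print(missing_fair_fields(record))  # ['vocabulary', 'provenance']
```

A check like this only confirms that fields are filled in; whether the metadata is actually rich, accurate, and community-compliant still needs human review.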
HTE-MC stakeholder model (diagram)
GOVERNMENT AGENCIES
MEMBERS
• Academia
• National Labs
• Industry
• Small Business
USERS
• Industry
• Small Business
• Academia
• National Labs
• Manufacturing USA Institutes
• Energy Materials Networks
CONTRIBUTORS
• Academia
• National Labs
• HTE-MC Users (after embargo period)
VISITORS / PUBLIC
• Industry
• Small Business
• Academia
• Educators
• National Labs
• Manufacturing USA Institutes
• Energy Materials Networks
Flows shown in the diagram: Provide Structural Funding; Receive Funding; Provide Students/Staff; Provide Science Infrastructure; Provide Data Infrastructure; Pay Tiered Access Fees; Generate New Data; Publish Open-Access Data; Receive Benefits; Access AI-ready Public Data
Outcomes: Next Generation Workforce; New Knowledge; Materials Solutions
HTE Materials Collaboratory
Problems
• Experimental databases
are not keeping pace with
computational databases
• HTE is out of reach to most
due to high startup and
operating costs
• Materials are diverse; no
single institution can have
all the necessary
equipment
Solution
• Integrate HTE laboratories
with materials
cyberinfrastructure
• HTE as a shared resource;
operate on demand by
access fees and core
funding
• HTE as a federated
resource; enable
connectivity via
cyberinfrastructure
• Member
• Provides infrastructure
• User
• Utilizes infrastructure
• Creates new data
• May choose to
publish data
• Contributor
• Publishes data
• Visitor
• Consumes public data
Technical Stakeholder Types and Population
Visitors
Contributors
Users
Members
(defines action, not access)
The Collaborative Economy
HTE-MC architecture (diagram)
• Member Institute: Laboratory Information Management System; Instruments/Computing; Data Transfer Grid; Database / Structured Data / Metadata; File/Collection Repository
• User Institute: Data Transfer Grid
• Data Dissemination: Data Transfer Grid; Database / Structured Data / Metadata; File/Collection Repository
• Registries: Materials Resource Registry; High-Throughput Experiment Resource Registry
High-Throughput Experimental Materials
Collaboratory (HTE-MC) Workshop
• Held: February 2018
• Workshop Goals:
• Socialize the HTE-MC concept among government, academic and industry stakeholders
• Expand HTE-MC membership
• Define technical, operational and business models for the HTE-MC
• Facilitated Breakout Sessions:
• Define the Vision of HTE-MC
• Define the value proposition for participation
• Identify major barriers to successful participation
• Identify and prioritize pilot use cases
• Identify and describe modes of interaction of users
• Define governance and business models for HTE-MC
• Workshop Report: In preparation
A Multi-Agency, Multi-Year Program Plan in
Advanced Energy Materials Discovery,
Development, and Process Design
• Held July 2018
• Workshop Goals
• Determine how best to coordinate next steps within the Federal Government
• Efficiently leverage the ongoing research in advanced materials conducted in
academia, industry, and government research laboratories
• Facilitated Breakout Sessions:
• Priorities in Energy Materials R&D: Barriers, Timeline, and Metrics
• Database infrastructure needs in AI and Energy Materials R&D: Moving Materials
Discovery through Materials Processes
• Expansion of the Collaboratory Network for Energy Materials Discovery and Process
Design
• Integration of AI, ML, and Experimentation for Energy Materials Design and
Processing
• Workshop Report: In preparation
Iterative Machine Learning – High
Throughput Experimental Approach to
Discovering Novel Amorphous Alloys
Fang Ren1, Logan Ward2, Travis Williams3, Kevin J. Laws4,
Christopher M. Wolverton2, Jason Hattrick-Simpers5, Apurva Mehta1
1SLAC National Accelerator Laboratory, 2Northwestern University, 3University of South Carolina,
4UNSW Australia, 5National Institute of Standards and Technology, 6 University of Chicago
Science Advances, Vol 4 No. 4 (2018)
Lightweight Structural Materials
http://corporate.exxonmobil.com/en/energy/research-and-development/innovating-energy-solutions/research-and-development-highlights
Wall Street Journal via Google Images
Metallic Glasses Are Interesting
http://vitreloy.caltech.edu/development.htm
West US 7998286 B2
E Ma. Nature Materials. 14, 2015.
Metallic glass (MG) is a solid metallic material, usually an alloy, with a disordered atomic-scale structure (amorphous).
The Palette of Potential Metallic Glasses
Usually Contain 3 or more elements
30 non-toxic, earth-friendly elements → > 4000 ternaries, > 4 million compositions
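The size of this search space follows from simple counting. The sketch below reproduces the number of ternary systems from 30 elements and then counts grid compositions per ternary. The 2 at.% grid spacing is an assumption chosen for illustration; the slide does not state which spacing gives its "> 4 million" figure.

```python
from math import comb

N_ELEMENTS = 30  # number of candidate elements, from the slide

# Number of distinct three-element systems.
n_ternaries = comb(N_ELEMENTS, 3)


def grid_points_per_ternary(step_percent: int) -> int:
    """Count compositions with all three fractions > 0 on a regular grid.

    Assumes step_percent divides 100 evenly. With n = 100 / step, the count
    of positive integer solutions to a + b + c = n is C(n - 1, 2).
    """
    n = 100 // step_percent
    return comb(n - 1, 2)


STEP = 2  # at.% grid spacing: an assumption, not stated on the slide
total = n_ternaries * grid_points_per_ternary(STEP)

print(n_ternaries)  # 4060
print(total)        # 4774560
```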
Building the Machine Learning Model
Ref: Ward et al. npj Comp. Mater. (2016), 28.
Experimental
Data
Machine Learning
Algorithm
Composition-based
Representation
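Example decision split: $\sigma_r < 1.1$ Å → MG / Not MG
Example attributes: $\mu_Z$, $\Delta\chi$, $\sigma_{T_m}$, $\max r_{cov}$, $\lVert (x_H, x_{He}, \ldots) \rVert_2$
$GFA = f(x_H, x_{He}, \ldots)$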
24 Million Ternary Alloys
74520 potential MGs
5739 measurements
145 Attributes
Random Forest
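To make the "composition-based representation" concrete, here is a minimal sketch that turns a composition into three attributes of the kind listed on the slide: a fraction-weighted mean atomic number, an electronegativity spread, and the L2 norm of the composition vector. It is not the Ward et al. implementation, which the slide says uses 145 attributes; the `featurize` function, the four-element lookup table, and the reading of the electronegativity attribute as a max-minus-min range are all assumptions for illustration.

```python
from math import sqrt

# Small lookup table for illustration: atomic number and Pauling electronegativity.
ELEMENTS = {
    "Al": {"Z": 13, "chi": 1.61},
    "Ni": {"Z": 28, "chi": 1.91},
    "Cu": {"Z": 29, "chi": 1.90},
    "Zr": {"Z": 40, "chi": 1.33},
}


def featurize(composition):
    """Map {element: amount} to a few composition-based attributes."""
    total = sum(composition.values())
    fractions = {el: amount / total for el, amount in composition.items()}

    # Fraction-weighted mean of the atomic number.
    mean_z = sum(x * ELEMENTS[el]["Z"] for el, x in fractions.items())
    # Spread of electronegativity across the elements present (assumed: range).
    chis = [ELEMENTS[el]["chi"] for el in fractions]
    # L2 norm of the vector of atomic fractions.
    l2_norm = sqrt(sum(x * x for x in fractions.values()))

    return {"mean_Z": mean_z, "range_chi": max(chis) - min(chis), "l2_norm": l2_norm}


features = featurize({"Zr": 50, "Cu": 50})
print(features)
```

A vector like this, computed for every labeled composition, would be the input to a classifier such as the random forest named on the slide.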
Select Experiments that Involve Contradiction
Selection Criteria
1.) None of the models 100% disagree
2.) Some experimental data existed
3.) Inexpensive, low vapor pressure materials
Yang Model
Efficient
Packing Model ML Predictions
(Split) Model Predictions
Melt Spun Predictions Sputtered Predictions
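One way to read "select experiments that involve contradiction" is to rank candidate systems by how much the models disagree. The sketch below does only that: it ignores the data-availability and cost criteria, and the system names, model keys, and predicted values are hypothetical placeholders rather than results from the talk.

```python
def rank_by_disagreement(predictions):
    """Sort systems by the spread between the most and least optimistic model."""
    spread = {
        system: max(by_model.values()) - min(by_model.values())
        for system, by_model in predictions.items()
    }
    return sorted(spread, key=spread.get, reverse=True)


# Hypothetical numbers: predicted fraction of each ternary that forms a glass.
predictions = {
    "A-B-C": {"yang": 0.10, "packing": 0.60, "ml": 0.35},
    "D-E-F": {"yang": 0.40, "packing": 0.45, "ml": 0.42},
    "G-H-I": {"yang": 0.05, "packing": 0.20, "ml": 0.80},
}
print(rank_by_disagreement(predictions))  # ['G-H-I', 'A-B-C', 'D-E-F']
```

Systems at the top of the list are where a high-throughput experiment is most informative, because at least one model must be wrong there.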
“Fail Fast” via HTE
[Figure: binary deposition schematic. Labels: Gun 1, Gun 2, Binary Deposition, Deposited Sample; plot axes: Sample Position, Deposition Ratio; 2D XRD Detector, Fluorescence Detector]
Temperature > 1200 K, ~5000 patterns/day
Negative Results >> Positive Results
We Can Rebuild It, We Have the Technology
Can the Model be Generalized?
Are There Any Interesting Generalizations?
Case Example X-Y-Al: Breaking from
Convention AND Property Prediction
No “deep” eutectics necessary!
Massalski “Binary Alloy Phase Diagrams” (1990)
But How to Create Property Models?
• There is no Landolt-Börnstein (L-B)-type data set for
properties of MG
• NLP/data extraction from
figures is in its infancy
• Manually scrape the literature
• 2000+ entries
• Errant measurements
• Many different groups
• Inconsistent definition of
“amorphous”
Feature Importance
• Average Ground State Volume: 0.37
• Minimum Ground State Volume: 0.24
• Minimum Covalent Radius: 0.12
• Mean Melting Temperature: 0.036
• Highest Melting Temperature: 0.017
Ternary Modulus Predictions
[Figure: parity plot of Predicted Modulus (GPa) vs. Measured Elastic Modulus (GPa), both axes 0 to 300 GPa]
Experimental Validation of Prediction
Er
Can A.I./M.L. Lead to Autonomous Materials
Discovery?
“In the next 5 years, AI-driven, autonomous
materials research is going to fundamentally
change how we do materials science.”
-Jim Warren, Technical Program Director for
Materials Genomics, NIST
Autonomous Research is Already Here
AFRL - ARES | Hein - UBC | NIST - UMD
Autonomous REsearch Systems (ARES)
Gilad’s Automated Experimentation Platform
Fe
Fe0.4Pd0.6
Fe0.4Ga0.6
Kusne et al., to be submitted
Active clustering for autonomous XRD phase
mapping
Think carefully about modeling to remove researcher degrees of freedom
DeCost et al., to be submitted
Conclusions
• AI & ML are already prevalent in the design of new materials, materials
synthesis, data capture/cleaning and knowledge extraction
• Neither AI nor ML is a panacea that will replace human intuition and
creativity; they are enablers
• In some cases an order of magnitude increase in materials
exploration/discovery is possible
• A fairer metric of AI’s influence may be the rate of hypothesis
generation and (in)validation
• AI needs FAIR data including negative results to be effective
• Not part of the solution = consigned to obscurity
• Full materials research autonomy (for specific problems) has already been
demonstrated
Acknowledgements
USC
Travis Williams
SLAC
Dr. Apurva Mehta
Dr. Fang Ren
Dr. Suchismita
Northwestern
Prof. Wolverton
Dr. Logan Ward
UNSW
Prof. Kevin Laws
NIST
Dr. James Warren
Dr. Martin Green
Dr. Zachary Trautt
Dr. Gilad Kusne
Dr. Brian DeCost
Mr. Ryan Smith
NREL
Dr. Andriy Zakutayev
CSM
Prof. Packard
Dr. Schoeppner
Demonstrations and Talks by (confirmed speakers):
• Theory
• Computational Approaches
• Experimental Approaches
Andrew Millis (Columbia)
Antoine Georges (CCQ)
Karin Rabe (Rutgers)
Bootcamp: Machine Learning for Materials Research &
Workshop: Machine Learning Quantum Materials
• Dates: July 30 – Aug 3, 2018
• Location: IBBR (Gaithersburg, Maryland)
MLMR introduces researchers from industry, national labs, and academia to machine learning theory and tools for rapid data analysis.
https://nanocenter.umd.edu/events/mlmr/
Bootcamp
Three days of lectures and hands-on exercises covering a range of
data analysis topics from data pre-processing through advanced
machine learning analysis techniques. Example topics include:
• Identifying important features in complex/high dimensional
data
• Visualizing high dimensional data to facilitate user analysis.
• Identifying the fabrication ‘descriptors’ that best predict
variance in functional properties.
• Quantifying similarities between materials using complex/high
dimensional data
The hands-on exercises will demonstrate practical use of machine
learning tools on real materials data (scalar values, spectra,
micrographs, etc.).
Sasha Balatsky (LANL)
Roger Melko (Waterloo)
Shoucheng Zhang (Stanford)
Stefano Curtarolo (Duke)
Gus Hart (BYU)
Ichiro Takeuchi (UMD)
Sergei Kalinin (ORNL)
Benji Maruyama (AFRL)
Jiun-Haw Chu (Univ. Washington)
Giuseppe Carleo (Flatiron)
Miles Stoudenmire (Flatiron)

Hattrick Simpers TMS Machine Learning Workshop Slides

  • 1. The MGI & Data-driven High-Throughput Synthesis and Characterization Brian DeCost, Zachary Trautt, Martin Green, Gilad Kusne, Jason Hattrick-Simpers NIST Gaithersburg Jason.Hattrick-Simpers@nist.gov @jae3goals Any mention of commercial products within this talk is for information only; it does not imply recommendation or endorsement by NIST.
  • 2. Outline • The Materials Genome Initiative (MGI) and NIST’s Role • The High-Throughput Experimental Materials Collaboratory (HTE-MC) • Accelerated Discovery of (High – Hardness & Corrosion Resistant) Metallic Glasses • Iterative HTE and AI • Vision for the Future • Look Ma No Hands (Experimentation)!! • Conclusions
  • 3. Decrease time-to-market by 50% while <<$$ • Develop a Materials Innovation Infrastructure • Achieve National goals in energy, security, and human welfare with advanced materials • Equip the next generation of materials workforce Materials Genome Initiative for Global Competitiveness
  • 6. The Materials Genome Initiative
  • 7.
  • 8. Apple Watch – Announced September 2014
  • 9. Examples of Cultural Implementation and Successes of the MGI • Argonne Collaboration – phase identification at aluminum interfaces • Lund Boats – MGI on the plant floor • Casting Simulation (MAGMA) – MGI in R&D, tool shop, & plant floor • Timken Steel – Premium Air Melt Practice, putting premium quality, cost conscious steel into the hands of our customers • BASF – Foaming simulations based on first principles • ERCo – Laser Induced Breakdown Spectroscopy for real-time melt composition (ARPA-E)
  • 10.
  • 11. Standards Are Important • The NIST MGI Program is taking a very careful approach to consensus standards for data representation • There is a long track record of failure for most of the space • Exception for highly structured data (e.g. ICSD) • This should be done top-down not bottom-up
  • 12. MGI Directions to Date Materials by Design projects: DOE EFRCs, EMNs NSF DMREFs HT computational databases: Need: High-throughput experimental data
  • 13.
  • 14. Workshop: “Fulfilling the Promise of the Materials Genome Initiative via High-Throughput Experimentation” – 2014
  • 15. Workshop Conclusions A large portion of the MGI program thus far has been devoted to modeling and simulation. Prodigious amounts of experimental data will be required to inform and validate modeling and simulation, to “power the MGI computational engine.”  HTE can rapidly establish relationships between composition, structure, and properties for a wide variety of materials classes, and therefore is: a) uniquely suited to rapidly generate high quality, consistent data sets b) the key enabling counterpart to modeling and simulation for bringing the MGI to fruition  “Enable broad access to HTE methodologies and data”
  • 16.
  • 17. High Throughput Experimental Materials Collaboratory (HTE-MC) • Necessary because even one “brick and mortar” HTE facility would be very costly, and multiple facilities dedicated to different materials classes (e.g. catalysts, photovoltaics, lightweight structural materials, etc.) are needed • Enable researchers at national laboratories, universities, and industry to have access to HTE facilities • The HTE-MC would facilitate MGI-driven research while leveraging investment • Complement new science investments (EMNs, NNMI, MURI, etc.)
  • 18. How? • Collaboratory: a 1989 neologism (William A. Wulf, Computer Scientist at University of Virginia): “defined by… a center without walls, ‘in which the nation’s researchers can perform their research without regard to physical locations, interacting with colleagues, accessing instrumentation, sharing data and computational resources, … accessing information in digital libraries’” • A HTE-MC would consist of: • An integrated, delocalized network of high-throughput synthesis and characterization tools • A best-in-class materials data management platform, consisting of NIST (and other) software
  • 19. HTE-MC 1st Steps: NIST – NREL Round Robin Sample synthesis and measurements: • Synthesize: Zn-Sn-Ti-O composition spread sample libraries using combinatorial PLD (@NIST) or sputtering (@NREL) • Measure: Chemical composition, Crystal structure, Electrical conductivity, Optical transmittance, Band gap • Exchange: Sample libraries and associated data, repeat measurements Zn-Sn-Ti-O: • Chemical composition • Crystal structure • Electrical conductivity • Optical transmittance • Work function Goal: test and improve the standards for exchange of data and sample among participant labs NREL Samples NIST Sample
  • 20. Addressing FAIR Principles To be Findable: • (meta)data are assigned a globally unique and persistent identifier • data are described with rich metadata • metadata clearly and explicitly include the identifier of the data it describes • (meta)data are registered or indexed in a searchable resource To be Accessible: • (meta)data are retrievable by their identifier using a standardized communications protocol – the protocol is open, free, and universally implementable – the protocol allows for an authentication and authorization procedure, where necessary • metadata are accessible, even when the data are no longer available To be Interoperable: • (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. • (meta)data use vocabularies that follow FAIR principles • (meta)data include qualified references to other (meta)data To be Reusable: • meta(data) are richly described with a plurality of accurate and relevant attributes – (meta)data are released with a clear and accessible data usage license – (meta)data are associated with detailed provenance – (meta)data meet domain-relevant community standards Wilkinson, Mark D., et al. "The FAIR Guiding Principles for scientific data management and stewardship." Scientific data 3 (2016). DOI: 10.1038/sdata.2016.18
  • 21. HTE-MC GOVERNMENT AGENCIES MEMBERS • Academia • National Labs • Industry • Small Business Provide Students/Staff Receive Funding $ Provide Structural Funding Provide Science Infrastructure USERS • Industry • Small Business • Academia • National Labs • Manufacturing USA Institutes • Energy Materials Networks Pay Tiered Access Fees $ $ Generate New Data CONTRIBUTORS • Academia • National Labs • HTE-MC Users (after embargo period) Receive Benefits Publish Open-Access Data VISITORS / PUBLIC • Industry • Small Business • Academia • Educators • National Labs • Manufacturing USA Institutes • Energy Materials Networks Access AI-ready Public Data Next Generation Workforce New Knowledge Materials Solutions +1 Provide Data Infrastructure
  • 22. HTE Materials Collaboratory Problems • Experimental databases are not keeping pace with computational databases • HTE is out of reach to most due to high startup and operating costs • Materials are diverse; no single institution can have all the necessary equipment Solution • Integrate HTE laboratories with materials cyberinfrastructure • HTE as a shared resource; operate on demand by access fees and core funding • HTE as a federated resource; enable connectivity via cyberinfrastructure
  • 23. • Member • Provides infrastructure • User • Utilizes infrastructure • Creates new data • May choose to publish data • Contributor • Publishes data • Visitor • Consumes public data Technical Stakeholder Types and Population Visitors Contributors Users Members (defines action, not access)
  • 25. HTE-MC Member Institute Laboratory Information Management System Data Transfer Grid Instruments/Computing Database / Structured Data / Metadata File/Collection Repository Member Institute Laboratory Information Management System Instruments/Computing File/Collection Repository Data Dissemination Data Transfer Grid Database / Structured Data / Metadata File/Collection Repository Registries Materials Resource Registry High-Throughput Experiment Resource Registry Member Institute User Institute Data Transfer Grid Laboratory Information Management System Data Transfer Grid Instruments/Computing Database / Structured Data / Metadata File/Collection Repository Data Transfer Grid Database / Structured Data / Metadata
  • 26. High-Throughput Experimental Materials Collaboratory (HTE-MC) Workshop • Held: February 2018 • Workshop Goals: • Socialize the HTE-MC concept among government, academic and industry stakeholders • Expand HTE-MC membership • Define technical, operational and business models for the HTE-MC • Facilitated Breakout Sessions: • Define the Vision of HTE-MC • Define the value proposition for participation • Identify major barriers to successful participation • Identify and prioritize pilot use cases • Identify and describe modes of interaction of users • Define governance and business models for HTE-MC • Workshop Report: In preparation
  • 27. A Multi-Agency, Multi-Year Program Plan in Advanced Energy Materials Discovery, Development, and Process Design • Held July 2018 • Workshop Goals • Determine how best to coordinate next steps within the Federal Government • Efficiently leverage the ongoing research in advanced materials conducted in academia, industry, and government research laboratories • Facilitated Breakout Sessions: • Priorities in Energy Materials R&D: Barriers, Timeline, and Metrics • Database infrastructure needs in AI and Energy Materials R&D: Moving Materials Discovery through Materials Processes • Expansion of the Collaboratory Network for Energy Materials Discovery and Process Design • Integration of AI, ML, and Experimentation for Energy Materials Design and Processing • Workshop Report: In preparation
  • 28. Iterative Machine Learning – High Throughput Experimental Approach to Discovering Novel Amorphous Alloys Fang Ren1, Logan Ward2, Travis Williams3, Kevin J. Laws4, Christopher M. Wolverton2, Jason Hattrick-Simpers5, Apurva Mehta1 1SLAC National Accelerator Laboratory, 2Northwestern University, 3University of South Carolina, 4UNSW Australia, 5National Institute of Standards and Technology, 6 University of Chicago Science Advances, Vol 4 No. 4 (2018)
  • 30. Metallic Glasses Are Interesting http://vitreloy.caltech.edu/development.htm West US 7998286 B2 E Ma. Nature Materials. 14, 2015. Metallic glass (MG) is a solid metallic material, usually an alloy, with a disordered atomic-scale structure (amorphous).
  • 31. The Palette of Potential Metallic Glasses Usually Contain 3 or more elements 30 non-toxic, earth friendly elements  > 4000 ternaries, > 4 Million compositions
  • 32.
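  • 33. Building the Machine Learning Model Ref: Ward et al. npj Comp. Mater. (2016), 28. Experimental Data Machine Learning Algorithm Composition-based Representation [Figure: example split $\sigma_r < 1.1$ Å separating MG from Not MG; example attributes $\mu_Z$, $\Delta\chi$, $\sigma_{T_m}$, $\max r_{cov}$ computed from the element fractions $x_{\mathrm{H}}, x_{\mathrm{He}}, \ldots$] $\mathrm{GFA} = f(x_{\mathrm{H}}, x_{\mathrm{He}}, \ldots)$ 24 Million Ternary Alloys 74520 potential MGs 5739 measurements 145 Attributes Random Forest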
  • 34. Select Experiments that Involve Contradiction Selection Criteria 1.) None of the models 100% disagree 2.) Some experimental data existed 3.) Inexpensive, low vapor pressure materials Yang Model Efficient Packing Model ML Predictions
  • 35. (Split) Model Predictions Melt Spun Predictions Sputtered Predictions
  • 36. “Fail Fast” via HTE Sample Position Deposition Ratio Deposited Sample Gun 1 Binary Deposition Gun 2 2D XRD Detector Fluorescence Detector Temperature > 1200K, ~ 5000 patterns/day
  • 37. Negative Results >> Positive Results
  • 38. We Can Rebuild It, We Have the Technology
  • 39. Can the Model be Generalized?
  • 40. Are There Any Interesting Generalizations?
  • 41. Case Example X-Y-Al: Breaking from Convention AND Property Prediction No “deep” eutectics necessary! Massalski “Binary Alloy Phase Diagrams” (1990)
  • 42. But How to Create Property Models? • There is no L-B-type data set for properties of MG • NLP/data extraction from figures is in its infancy • Manually scrape the literature • 2000+ entries • Errant measurements • Many different groups • Inconsistent definition of “amorphous” Feature Importance Average Ground State Volume 0.37 Minimum Ground State Volume 0.24 Minimum Covalent Radius 0.12 Mean Melting Temperature 0.036 Highest Melting Temperature 0.017
  • 43. Ternary Modulus Predictions 0 50 100 150 200 250 300 0 50 100 150 200 250 300 PredictedModulus(GPa) Measured Elastic Modulus (GPa)
  • 44. Experimental Validation of Prediction Er
  • 45. Can A.I./M.L. Lead to Autonomous Materials Discovery?
  • 46. “In the next 5 years, AI-driven, autonomous materials research is going to fundamentally change how we do materials science.” -Jim Warren, Technical Program Director for Materials Genomics, NIST
  • 47. Autonomous Research is Already Here AFRL - ARES Hein - UBC NIST - UMD
  • 49. Gilad’s Automated Experimentation Platform Fe Fe0.4Pd0.6 Fe0.4Ga0.6 Kusne et al., to be submitted
  • 50. Active clustering for autonomous XRD phase mapping Think carefully about modeling to remove researcher degrees of freedom DeCost et al., to be submitted
  • 51. Conclusions • AI & ML are already prevalent in the design of new materials, materials synthesis, data capture/cleaning and knowledge extraction • Neither AI nor ML are a panacea that will replace human intuition and creativity, they are enablers • In some cases an order of magnitude increase in materials exploration/discovery is possible • Maybe a fairer metric of AI’s influence will be on the rate of hypothesis generation and (in)validation • AI needs FAIR data including negative results to be effective • Not part of the solution = consigned to obscurity • Full materials research autonomy (for specific problems) has already been demonstrated
  • 52. Acknowledgements USC Travis Williams SLAC Dr. Apurva Mehta Dr. Fang Ren Dr. Suchismita Northwestern Prof. Wolverton Dr. Logan Ward UNSW Prof. Kevin Laws NIST Dr. James Warren Dr. Martin Green Dr. Zachary Trautt Dr. Gilad Kusne Dr. Brian DeCost Mr. Ryan Smith NREL Dr. Andriy Zakutayev CSM Prof. Packard Dr. Schoeppner
  • 53. Demonstrations and Talks by (confirmed speakers): • Theory • Computational Approaches • Experimental Approaches Andrew Millis (Columbia) Antoine Georges (CCQ) Karin Rabe (Rutgers) Bootcamp: Machine Learning for Materials Research & Workshop: Machine Learning Quantum Materials • Dates: July 30 – Aug 3, 2018 • Location: IBBR (Gaithersburg, Maryland) MLMR Introduces researchers from industry, national labs, and academia to machine learning theory and tools for rapid data analysis. https://nanocenter.umd.edu/events/mlmr/ Bootcamp Three days of lectures and hands-on exercises covering a range of data analysis topics from data pre-processing through advanced machine learning analysis techniques. Example topics include: • Identifying important features in complex/high dimensional data • Visualizing high dimensional data to facilitate user analysis. • Identifying the fabrication ‘descriptors’ that best predict variance in functional properties. • Quantifying similarities between materials using complex/high dimensional data The hands-on exercises will demonstrate practical use of machine learning tools on real materials data (scalar values, spectra, micrographs, etc.). Sasha Balatsky (LANL) Roger Melko (Waterloo) Shoucheng Zhang (Stanford) Stefano Curtarolo (Duke) Gus Hart (BYU) Ichiro Takeuchi (UMD) Sergei Kalinin (ORNL) Benji Maruyama (AFRL) Jiun-Haw Chu (Univ. Washington) Giuseppe Carleo (Flatiron) Miles Stoudenmire (Flatiron)
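
The counts quoted on slide 31 (30 elements, more than 4000 ternaries, more than 4 million compositions) can be checked with a short calculation. The sketch below is only an illustration: the 30-element palette comes from the slide, but the 2 at.% composition step is an assumption chosen here to show how a figure above 4 million can arise. The talk does not state which mesh it used, and the function names are hypothetical.

```python
from math import comb

N_ELEMENTS = 30   # palette size quoted on slide 31
STEP_AT_PCT = 2   # assumed composition step in at.% (not given in the talk)


def ternary_systems(n_elements: int) -> int:
    """Number of distinct three-element systems drawn from n_elements."""
    return comb(n_elements, 3)


def compositions_per_ternary(step_at_pct: int) -> int:
    """Grid points in one ternary with all three elements present.

    Counts positive-integer solutions of a + b + c = 100 / step.
    """
    if 100 % step_at_pct:
        raise ValueError("step must divide 100 evenly")
    n = 100 // step_at_pct
    return comb(n - 1, 2)


systems = ternary_systems(N_ELEMENTS)
per_system = compositions_per_ternary(STEP_AT_PCT)
total = systems * per_system
print(systems, per_system, total)
```

Under these assumptions there are 4060 ternary systems and about 4.8 million candidate compositions. A finer step increases the total quickly, which is the motivation for combining HTE with model-guided selection.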

Editor's Notes

  1. I think we have a great opportunity for you to give attendees an overview of your work in data-driven HT synthesis and characterization. You should also feel free to provide forward-looking vision, e.g., if you'd like to highlight the emerging HT collaboratory concept led by NIST. Finally, the audience may also find it interesting to hear a bit of introductory content about NIST's role in MGI and materials data broadly.
  2. Old story – how do we combine experiment, computation, and digital data to develop the materials that fit critical needs but do so cheaper and faster than ever before?
  3. Gist: We can’t forget that this is about the full discovery to deployment cycle, it doesn’t serve our purposes to compartmentalize and only focus on independent material discovery but to consider how it will eventually move into application.
  4. MGI ideas aren’t necessarily new, but are following a natural progression that began in 1988 with COTA. The idea is that through computation-guided experimentation we can achieve our goals more quickly than through only experimentation.
  5. The emphasis is that this was started as a multi-agency initiative with coordination through the agencies but with each agency taking their own approach to implementation.
  6. Materials are complex, multiple length scales are important. We use simulations to look up length scales and experiments to look down. These both generate and consume data that is used to inform models which generate data and inform the exp-sim loop. Outside of this loop we would like to arrive at new and outstanding materials. The data and the models can live anywhere, ideally somewhere FAIR, but in reality they are scattered between notebooks and hard drives and someone’s memory.
  7. When this method of producing materials works, it can be powerful. Alloys designed by Apple using QuesTek IP – centered around ICME/MGI technique for materials design and deployment.
  8. So let’s take the idea from the previous slide, abstract it a bit, and ask where does NIST fit into this MGI equation? DOC’s smiling face towards industry. If we think that there are hundreds (thousands) of such MGI loops in the country all producing data and models, then NIST’s fit is clear. First of all, we have to help industry, academia, and government labs exchange data. We can help set up repositories but first we have to ask what are meaningful (standard) ways of interchanging materials data from disparate sources? Secondly, NIST is measurement technology driven and UNCERTAINTY and QUALITY assessment and improvement are key directives in this space.
  9. But we are talking about materials data and models in repositories and a key question remains, “where are the curated, homogeneous and high-quality materials data for model development and validation coming from?”
  10. This is a big problem within the MGI, because a great deal of effort has gone to the top half of the Venn diagram. Our (and a number of others’) contention is that HT experimentation is the potential driving force for the MGI engine.
  11. This started with a review paper by Marty, Ichiro and myself talking about how HTE has really revolutionized the way people search for and optimize new materials. This caught the attention of OSTP White House and we were asked to organize a workshop bringing together some of the best in the field.
  12. This slide is just about one of the outcomes from the workshop (held in San Fran in May 2014)
  13. Can we turn me, the high-throughput experimentalist, into the rate limiting step in an intelligent search for new amorphous alloys?
  14. Emphasize this is a moonshot, but that my off ramp is in the field of coatings.
  15. Meshing is important but a reasonably dense sampling would take: ~1000 years bulk alloy (5/day); ~10 years via HTE alone; ~2 years
  16. Start at the bottom of this image and work my way clockwise.
  17. Stoichiometric attributes capture the fraction but not type of elements present. Elemental property attributes of atomic row, Mendeleev number, atomic weight, total # of unfilled states, etc. with both weighted averages, max, min, range, average deviation and mode. Valence orbital occupation attributes. Ionic compound attributes…
  18. Ask before getting into this, if anyone isn’t familiar with roughly how Random Forest works.
  19. Left melt-spun model, right stacked model
  20. Main data set is unbalanced
  21. The relationship between the liquidus, FWHM, and GFR, shown in fig. S7, suggests a strong correlation between glass formation and the C15 (MgCu2 prototype) Laves and B2 liquidus phase fields common to all four systems. These results indicate that these particular ordered phases are difficult to crystallize quickly, resulting in glass formation. For instance, for the Co-Zr–containing ternaries, despite the ZrCo2 C15 Laves phase having a high melting point relative to surrounding phases, the exceptional correlation between the GFR and the ZrCo2 liquidus phase field as it extends into the ternary composition space suggests a high kinetic barrier to crystallization. These correlations further suggest that a large mismatch in ionic sizes and the presence of larger atoms in these structures, such as Zr, hinder crystallization even more.
  22. What do the circle and/or the box mean?
  23. MAE 9.2 MPa MRE 10% Here are some thoughts that I have: 1.) how much of the scatter is due to repeats for a given entry? 2.)
  24. If you’re interested in machine learning, we have an annual bootcamp at University of Maryland that teaches a wide variety of these techniques. You’ll learn things like how to identify important features in your data, how to visualize complex or high dimensional data, and how to identify descriptors. Each morning there are lectures and the afternoons are hands on activities applying machine learning to real materials data. And Ichiro Takeuchi, me and some collaborators have also organized an annual bootcamp. At this bootcamp we teach an introduction to machine learning, the most common techniques. Half of each day is also hands on training where you learn how to write code to analyze real data. Some examples of stuff we teach. How to …
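
Note 17 describes elemental-property attributes built from fraction-weighted statistics (average, max, min, range, average deviation). A minimal sketch of that idea follows, using atomic number as the only elemental property. The function name and the example composition are illustrative assumptions; this is not the 145-attribute set or the code used in the talk.

```python
def property_statistics(fractions, values):
    """Composition-based statistics of one elemental property.

    fractions: element -> atomic fraction (normalized here if needed)
    values:    element -> property value for that element
    """
    total = sum(fractions.values())
    x = {el: f / total for el, f in fractions.items() if f > 0}
    # Fraction-weighted mean of the property
    mean = sum(x[el] * values[el] for el in x)
    # Fraction-weighted mean absolute deviation from that mean
    avg_dev = sum(x[el] * abs(values[el] - mean) for el in x)
    present = [values[el] for el in x]
    return {
        "mean": mean,
        "avg_dev": avg_dev,
        "max": max(present),
        "min": min(present),
        "range": max(present) - min(present),
    }


# Atomic numbers of the three elements in the example
ATOMIC_NUMBER = {"Co": 27, "V": 23, "Zr": 40}
# Hypothetical composition, chosen only to illustrate the calculation
composition = {"Co": 0.5, "V": 0.25, "Zr": 0.25}

stats = property_statistics(composition, ATOMIC_NUMBER)
print(stats)
```

Repeating the same statistics over many elemental properties yields a fixed-length vector for any composition, which is what allows a model such as a random forest to be trained across different alloy systems.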