1. Discovering new functional materials for clean energy and beyond using high-throughput computing and machine learning
Anubhav Jain
Lawrence Berkeley National Laboratory
Presentation given at Intel, Oct 2022
Slides (will be) posted to hackingmaterials.lbl.gov
2. Outline
• Introduction to group and overview of our projects
• The Materials Project and virtual materials design
• The Matbench protocol: benchmarking ML algorithms
• Natural language processing applied to materials design
• Automating materials synthesis and characterization
3. Overview of our research group
• Located at Lawrence Berkeley National Laboratory (Berkeley, CA)
• Group composition
• Usually about 10 people (e.g., 5 postdocs, 5 graduate students)
• Major funding from U.S. Dept. of Energy, some funding from industry (Toyota Research
Institutes)
• Areas of emphasis
• Computational design of new functional materials
• Typically semiconductors, ceramics, or alloys
• e.g., past work in Li-ion and multivalent batteries, thermoelectric materials, carbon capture
materials, catalysts for water purification, etc.
• Not really polymers, molecular systems, or organic systems – although some past work here, too
• Machine learning applied to materials science
• Automated laboratories (recent)
4. We develop software frameworks for performing materials simulations,
including automation at supercomputing centers
Summary
• We develop and maintain
several software packages for
computational design of
materials
• These include “FireWorks”
for automating calculations at
supercomputing centers,
“atomate” for defining
materials science workflows,
and “matminer” for
generating descriptors for
crystal structures
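As a concrete but minimal illustration of the descriptor generation mentioned on this slide, the sketch below uses matminer's Magpie preset to featurize a composition; the featurizers and presets used in any given project may differ.

# A minimal matminer sketch: generate composition-based descriptors (Magpie
# elemental-property statistics) that can feed downstream ML models.
from pymatgen.core import Composition
from matminer.featurizers.composition import ElementProperty

featurizer = ElementProperty.from_preset("magpie")

comp = Composition("Fe2O3")
features = featurizer.featurize(comp)   # list of descriptor values
labels = featurizer.feature_labels()    # matching descriptor names

print(len(features), labels[:3])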
5. We develop methods to calculate materials properties based on density
functional theory, often adapting methods for high-throughput applications
Summary
• Many materials properties are
either difficult to calculate or
require impractical amounts
of computer time
• We develop methods to
calculate materials properties
both accurately and
efficiently
• Examples include “AMSET”
(electron transport) and
ongoing work on thermal
properties of materials
Old method (BoltzTraP): screening is qualitative, with pitfalls. New method (AMSET): screening is more quantitative.

Ganose, A. M.; Park, J.; Faghaninia, A.; Woods-Robinson, R.; Persson, K. A.; Jain, A. Efficient Calculation of Carrier Scattering Rates from First Principles. Nat. Commun. 2021, 12 (1), 2222.

Scattering mechanisms and required inputs:
• acoustic deformation potential (ad): deformation potential, elastic tensor
• ionized impurity (ii): dielectric tensor
• piezoelectric (pi): dielectric tensor, piezoelectric tensor
• polar optical phonon (po): dielectric tensor, polar phonon frequency

(Figure: (a) force constant fitting and phonon renormalization at T > 0 K; (b) phonon dispersions of cubic SrTiO3 (Tc = 105 K) at T = 0, 100, and 200 K.)
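Scattering rates from different mechanisms are commonly combined by summing them (Matthiessen's rule). The sketch below only illustrates that idea with made-up numbers; it is not the AMSET API.

# Illustrative sketch: combine per-mechanism scattering rates into a total
# relaxation time via Matthiessen's rule (1/tau_total = sum_i 1/tau_i).
# The rate values are placeholders, not real calculation output.
import numpy as np

rates = {
    "adp": np.array([1e13, 2e13, 1.5e13]),   # acoustic deformation potential (1/s)
    "imp": np.array([5e12, 8e12, 6e12]),     # ionized impurity
    "pie": np.array([1e12, 1e12, 2e12]),     # piezoelectric
    "pop": np.array([3e13, 2.5e13, 4e13]),   # polar optical phonon
}

total_rate = sum(rates.values())   # total scattering rate per electronic state
tau_total = 1.0 / total_rate       # total relaxation time per state (s)
print(tau_total)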
6. We use a combination of density functional theory calculations and machine
learning to design materials for various functional applications
Summary
• We trained machine learning
models (on open benchmark
data sets) to determine
catalytic performance of
materials in removing nitrate
from drinking water
• The models were used to pre-
screen ~60,000 materials to
only 23 materials that were
subjected to expensive
physics calculations for
verification
“Funnel” diagram illustrating how an initial list of
~60,000 compounds was passed through a
workflow to identify 23 interesting compounds.
ML was used in the workflow to pre-screen
for high activity and N2/NH3 selectivity.
The ML models show good correspondence with
significantly more expensive physical simulations
(“DFT”), demonstrating that they can be swapped
into the screening workflow reliably while extending
the search to ~500 times more compounds than
would be possible without ML augmentation.
“Screening of bimetallic electrocatalysts for water purification with machine learning”
Tran et al., J. Chem. Phys. 2022
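A minimal sketch of the pre-screening idea described on this slide, assuming a scikit-learn random forest as the surrogate (the actual models and descriptors in the paper differ): train on labeled data, rank the large candidate pool, and send only the top handful to DFT.

# Illustrative ML "funnel": cheap surrogate model ranks ~60,000 candidates,
# only the best few go on to expensive physics calculations. Data are random
# placeholders standing in for real descriptors and labels.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_train = rng.random((500, 20))      # descriptors for materials with known labels
y_train = rng.random(500)            # e.g., computed catalytic activity
X_pool = rng.random((60000, 20))     # descriptors for the full candidate pool

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

scores = model.predict(X_pool)            # surrogate predictions for all candidates
top = np.argsort(scores)[::-1][:23]       # keep only the best 23 for DFT verification
print(top)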
7. We help develop and maintain a comprehensive database of materials
properties, with a user community of >250,000 registered users
Summary
• In general, only a small
fraction of materials have
available experimental
property measurements
• The Materials Project uses
massive supercomputing
resources to calculate the
properties of materials using
first principles calculations
• The data is disseminated to
large user community
Past year: average of
≈200 new registrations per day
8. We develop and maintain “matbench”, a machine learning benchmark for
materials science, uncovering what works and what’s needed
Summary
• We created a comprehensive
set of benchmark tests for ML
algorithms that aim to predict
materials properties
• The benchmarks clearly
reveal what community
algorithms work
• They also helped show the
field that more research was
needed into “small data set”
algorithms, motivating
external works
The Matbench benchmark contains 13 data sets
that vary in size and application. Community
algorithms compete for best performance on
each data set.
The full "leaderboard" of all algorithms to date
tested against all 13 data sets, organized by data
set size. Deep learning approaches typically excel
at large data problems but typically struggle with
small data; some hybrid approaches were
subsequently developed to address this.
https://doi.org/10.1038/s41524-020-00406-3
(Leaderboard figure axes: bigger datasets; better relative performance.)
9. We use natural language processing to parse scientific abstracts and articles
and generate data sets and hypotheses
Summary
• We used natural language
processing (NLP) to analyze
the text of several million
article abstracts
• With no domain-specific
training, the ML system
internalized a representation
of the periodic table
• More impressively, it could predict what materials researchers would study for "thermoelectrics" in the future

A representation of the periodic table generated automatically by analyzing >3 million abstracts.

Materials compositions for thermoelectrics applications as predicted by NLP ~3 years ago. Since then, approximately 1/3 of the predictions have been reported by researchers.

https://doi.org/10.1038/s41586-019-1335-8
Sponsor: SPP, Toyota Research Institute
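A minimal sketch of the underlying idea, assuming gensim word2vec-style embeddings trained on tokenized abstracts (in the spirit of this work, not the exact published pipeline); the toy corpus below is only for illustration.

# Train word embeddings on tokenized abstracts; materials whose embeddings lie
# close to an application keyword (e.g., "thermoelectric") become candidates.
from gensim.models import Word2Vec

tokenized_abstracts = [
    ["Bi2Te3", "is", "a", "well", "known", "thermoelectric", "material"],
    ["PbTe", "shows", "a", "high", "thermoelectric", "figure", "of", "merit"],
]

model = Word2Vec(
    sentences=tokenized_abstracts,
    vector_size=200,   # embedding dimension (gensim >= 4.0 uses `vector_size`)
    window=8,
    min_count=1,
    sg=1,              # skip-gram
)

print(model.wv.most_similar("thermoelectric", topn=5))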
10. Moving from the virtual world to the physical world: A-lab for automated synthesis of inorganic materials
Summary
• We are collaborating with
other groups at LBNL (G.
Ceder, H. Kim) to develop an
automated laboratory for
automated inorganic
materials synthesis
• In contrast to other similar
efforts, we work primarily
with powder-based synthesis
procedures
• Several aspects already
completed, but still a work in
progress
A-lab development timeline:
• Hardware development (April 2022): box furnace, XRD, & robots ready
• Platform integration (July 2022): tube furnaces and SEM ready
• Automated synthesis (November 2022): powder dosing system; first automated syntheses
• AI-guided synthesis: Summer 2023
• Closed-loop materials discovery: Summer 2024
11. Miscellaneous projects – analysis of large solar PV data sets,
data extraction from figures
Summary
• We also have various other
miscellaneous projects at any
given time
• For example, we recently
trained an ML algorithm to
classify electroluminescence
images from solar power
plants and use this to assess
fire damage
• We also developed software
to help parse data from
figures
Pipeline developed to process raw EL images
(bottom-left), extract modules, segment
individual cells, and classify cells into various
defect categories using deep learning models.
This open-source pipeline can replace tedious
human annotation of module EL images at a
large scale.
Examples of using machine learning to identify
portions of chart images and extract data
curves based on color
12. Outline
• Introduction to group and overview of our projects
• The Materials Project and virtual materials design
• The Matbench protocol: benchmarking ML algorithms
• Natural language processing applied to materials design
• Automating materials synthesis and characterization
13. The core of Materials Project is a free database of
calculated materials properties and crystal structures
Free, public resource
• www.materialsproject.org
Data on ~150,000 materials,
including information on:
• electronic structure
• phonon and thermal
properties
• elastic / mechanical properties
• magnetic properties
• ferroelectric properties
• piezoelectric properties
• dielectric properties
Powered by hundreds of millions
of CPU-hours invested into high-
quality calculations
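For orientation, a minimal sketch of programmatic access using pymatgen's legacy MPRester client (the newer mp-api client has a slightly different interface; "YOUR_API_KEY" is a placeholder that comes with free registration):

# Retrieve a calculated crystal structure from the Materials Project.
from pymatgen.ext.matproj import MPRester

with MPRester("YOUR_API_KEY") as mpr:
    structure = mpr.get_structure_by_material_id("mp-149")   # silicon
    print(structure.composition.reduced_formula, structure.get_space_group_info())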
15. Apps give insight into data
Materials Explorer
Phase Stability Diagrams
Pourbaix Diagrams
(Aqueous Stability)
Battery Explorer
16. The code powering the Materials Project is
available open source (BSD/MIT licenses)
custodian: just-in-time error correction, fixing your calculations so you don't have to
atomate: "recipes" for common materials science simulation tasks
Crystal Toolkit: making materials science web apps easy
FireWorks: workflow management software for high-throughput computing
pymatgen: materials science analysis code to make, transform, and analyze crystals, phase diagrams, and more
& more … MP team members also contribute to several other non-MP codes, e.g. matminer for machine learning featurization
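A minimal FireWorks "hello world" sketch showing the workflow-database pattern used for high-throughput computing (assumes a local MongoDB reachable with default LaunchPad settings):

# Define a one-step workflow, register it in the LaunchPad, and run it locally.
from fireworks import Firework, LaunchPad, ScriptTask
from fireworks.core.rocket_launcher import rapidfire

launchpad = LaunchPad()                       # connection to the workflow database
launchpad.reset("", require_password=False)   # wipe the test database (careful!)

task = ScriptTask.from_str('echo "hello from a FireWork"')
fw = Firework(task, name="hello_task")

launchpad.add_wf(fw)    # register the workflow
rapidfire(launchpad)    # pull jobs from the database and execute them locally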
17. The Materials Project is used heavily by the research
community
> 180,000 registered
users
> 40,000 new users last year
~100 new registrations/day
~10,000 users log on every day
2M+ records downloaded through the API each day; 1.8 TB of data served per
month
18. Today, the Materials Project has led to
many examples of “computer to lab”
success stories
MP for p-type transparent conductors
References
✦ Hautier, G., Miglio, A., Ceder, G., Rignanese, G.-M. & Gonze, X. Identification and
design principles of low hole effective mass p-type transparent conducting oxides.
Nature Communications 4, (2013)
✦ Bhatia, A. et al. High-Mobility Bismuth-based Transparent p-Type Oxide from High-
Throughput Material Screening. Chemistry of Materials 28, 30–34 (2015)
✦ Ricci, F. et al. An ab initio electronic transport database for inorganic materials.
Scientific Data 4, (2017)
Prediction
Screening based on band
gap, transport properties
and band alignments.
Experiment
Predictions revealed
material with s–p
hybridized valence band
(thought to correlate
well with dopability).
When synthesized, the
material shows excellent
transparency and is readily
dopable with K.
Ba2BiTaO6
MP for thermoelectrics
References
✦ Aydemir, U. et al. YCuTe2: a member of a new class of thermoelectric materials with
CuTe4-based layered structure. Journal of Materials Chemistry A 4, 2461–2472 (2016)
✦ Zhu, H. et al. Computational and experimental investigation of TmAgTe2 and
XYZ2 compounds, a new group of thermoelectric materials identified by first-principles
high-throughput screening. Journal of Materials Chemistry C 3, 10554–10565 (2015).
✦ Pöhls, J.-H. et al. Metal phosphides as potential thermoelectric materials. Journal of
Materials Chemistry C 5, 12441–12456 (2017).
Prediction
Screening of tens of
thousands of materials
with predicted electron
transport properties
revealed a family of
promising XYZ2
candidates
Experiment
Several materials made:
YCuTe2 (zT = 0.75),
TmAgTe2 (zT = 0.47, 1.8
theoretical), novel NiP2
phosphide
TmAgTe2
MP for phosphors
References
✦ Wang, Z. et al. Mining Unexplored Chemistries for Phosphors for High-Color-
Quality White-Light-Emitting Diodes. Joule 2, 914–926 (2018)
✦ Li, S. et al. Data-Driven Discovery of Full-Visible-Spectrum Phosphor. Chemistry of
Materials 31, 6286–6294 (2019)
✦ Ha, J. et al. Color tunable single-phase Eu2+ and Ce3+ co-activated Sr2LiAlO4
phosphors. Journal of Materials Chemistry C 7, 7734–7744 (2019)
Prediction
Statistical analysis of existing
materials that co-occur with
word ‘phosphor’ followed
by structure prediction for
new materials
Experiment
Predicted first known Sr-Li-
Al-O quaternary, showed
green-yellow/blue emission
with quantum efficiency of
25% (Eu), 40% (Ce), 55%
(co-activated Eu, Ce)
Sr2LiAlO4
19. One of the applications we looked into was
thermoelectric materials
• A thermoelectric material
generates a voltage from a
thermal gradient
• Applications
• Heat to electricity
• Refrigeration
• Advantages include:
• Reliability
• Easy to scale to different sizes
(including compact)
www.alphabetenergy.com
20. It is difficult to balance trade-offs in
thermoelectric properties, so use screening
zT = α²σT/κ

Screening criteria:
• power factor > 2 mW/(m·K²) (PbTe = 10 mW/(m·K²))
• Seebeck coefficient > 100 µV/K (band structure + BoltzTraP)
• electrical conductivity > 10³/(ohm·cm) (band structure + BoltzTraP)
• thermal conductivity < 1 W/(m·K)
  • κe from BoltzTraP
  • κl difficult (phonon-phonon scattering)

Band structure (E vs. k) trade-offs:
• Heavy band: ✓ large DOS (higher Seebeck and more carriers); ✗ large effective mass (poor mobility)
• Light band: ✓ small effective mass (improved mobility); ✗ small DOS (lower Seebeck, fewer carriers)
• Multiple bands, off symmetry: ✓ large DOS with small effective mass; ✗ difficult to design!

~50,000 crystal structures and band structures from the Materials Project are used as a source. We compute electronic transport properties with BoltzTraP and minimum thermal conductivity (Cahill-Pohl) for some compounds. About 300 GB of electronic transport data is generated; all data is available free for download.

F. Ricci, et al., An ab initio electronic transport database for inorganic materials, Sci. Data 4 (2017) 170085.
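A worked example with made-up but representative numbers, showing how the screening thresholds above relate to zT:

# zT = alpha^2 * sigma * T / kappa, with all quantities in SI units.
alpha = 200e-6   # Seebeck coefficient [V/K]  (200 µV/K, above the 100 µV/K target)
sigma = 1e5      # electrical conductivity [S/m]  (= 1e3 per ohm-cm)
kappa = 2.0      # total thermal conductivity [W/(m*K)]
T = 700          # absolute temperature [K]

power_factor = alpha**2 * sigma           # [W/(m*K^2)]
zT = power_factor * T / kappa

print(f"power factor = {power_factor*1e3:.1f} mW/(m*K^2)")   # 4.0 mW/(m*K^2)
print(f"zT = {zT:.2f}")                                       # 1.40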
21. We found several compounds with promising
figure-of-merit, but no breakthroughs
TmAgTe2
• Calculations: trigonal p-TmAgTe2 could have a power factor up to 8 mW/(m·K²)
  • requires 10²⁰/cm³ carriers
• Expt: p-type zT only 0.35 despite very low thermal conductivity (~0.25 W/(m·K))
  • limitation: carrier concentration (~10¹⁷/cm³), likely limited by TmAg defects, as determined by follow-up calculations
• Later, we achieved zT ~ 0.47 using Zn doping

YCuTe2
• Calculations: p-YCuTe2 could only reach a PF of 0.4 mW/(m·K²)
  • SOC inhibits the PF
  • if thermal conductivity is low (e.g., 0.4 W/(m·K)), we get zT ~ 1
• Expt: zT ~ 0.75, not too far from the calculation limit
  • carrier concentration of 10¹⁹/cm³
• Decent performance, but unlikely to be improved with further optimization

(Figure: experiment vs. computation comparison for both compounds.)
22. Outline
• Introduction to group and overview of our projects
• The Materials Project and virtual materials design
• The Matbench protocol: benchmarking ML algorithms
• Natural language processing applied to materials design
• Automating materials synthesis and characterization
23. There are many new algorithms being published
for ML in materials, with new ones constantly reported!
24. But it is very difficult to compare
algorithms
(Figure: study A, study B, and study C each use a different data set.)
• Different data sets
• Source (e.g., OQMD vs MP vs JARVIS)
• Quantity (e.g., MP 2019 vs MP 2022)
• Subset / data filtering (e.g., ehull<X)
• Different evaluation metrics
• Test set vs. cross validation?
• Different test set fraction?
• Can be difficult to install and retrain
many of these algorithms
(Example: "MAE, 5-fold CV = 0.102 eV" vs. "RMSE, test set = 0.098 eV": how do these compare?)
25. Can we design a standard test set for ML
algorithms for materials science?
• There is no single type of problem that materials scientists are trying
to solve
• For now, focus on materials property prediction (from structure or
composition)
• We want a test set that contains a diverse array of problems
• Smaller data versus larger data
• Different applications (electronic, mechanical, etc.)
• Composition-only or structure information available
• Experimental vs. Ab-initio
• Classification or regression
26. Matbench includes 13 different ML tasks
Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference
Algorithm. npj Comput Mater 2020, 6 (1), 138. https://doi.org/10.1038/s41524-020-00406-3.
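A minimal sketch of how a model is evaluated against Matbench, following the documented matbench Python API (details may vary between versions); the DummyRegressor is only a stand-in for a real featurizer + model.

# Run one Matbench task: train on each fold, predict the test split, record scores.
from matbench.bench import MatbenchBenchmark
from sklearn.dummy import DummyRegressor   # placeholder model

mb = MatbenchBenchmark(autoload=False, subset=["matbench_expt_gap"])

for task in mb.tasks:
    task.load()
    for fold in task.folds:
        train_inputs, train_outputs = task.get_train_and_val_data(fold)

        model = DummyRegressor()   # replace with a real materials ML model
        model.fit([[0]] * len(train_outputs), train_outputs)

        test_inputs = task.get_test_data(fold, include_target=False)
        predictions = model.predict([[0]] * len(test_inputs))

        task.record(fold, predictions)   # scores are computed and stored per fold

mb.to_file("my_benchmark_results.json.gz")   # shareable results file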
27. Models tested by Matbench to date
• Magpie + Sine Coulomb Matrix + Random Forest (composition or structure): hand-created chemical features coupled with a random forest ML algorithm
• Automatminer (composition or structure): hand-created chemical features with genetic-algorithm-based ML algorithm and hyperparameter selection
• MODNet (composition or structure): hand-created chemical features with various neural network layers
• CGCNN (structure only): graph convolution based neural networks with basic initial atom/bond features
• ALIGNN (structure only): graph based convolutional networks based on bonds/angles in addition to atoms/bonds
• CRABNet (composition only): transformer-based self-attention for composition; initialized using NLP-based embeddings
28. How to read the Matbench leaderboard
Leaderboard figure axes: bigger datasets; better relative performance.
• A scaled error of 0.0 means all
predictions are correct
• A scaled error of 1.0 is equal
to always predicting the
average value
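In code, the scaled error described above amounts to dividing the model's MAE by the MAE of always predicting the average value (illustrative numbers below):

# Scaled error: 0.0 = perfect, 1.0 = no better than guessing the mean.
import numpy as np

y_true = np.array([1.2, 0.8, 2.5, 1.9])
y_pred = np.array([1.1, 0.9, 2.2, 2.0])

mae_model = np.mean(np.abs(y_true - y_pred))
mae_mean_baseline = np.mean(np.abs(y_true - y_true.mean()))

scaled_error = mae_model / mae_mean_baseline
print(scaled_error)   # < 1 means the model beats the mean-value baseline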
29. Magpie + SCF Model
• Composition features using
chemical descriptors such as
averages/stdevs of elemental
properties such as melting
point, electronegativity
• Structure features using sine
Coulomb matrix
Ward, L., Agrawal, A., Choudhary, A. et al. A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput Mater 2, 16028 (2016).
Faber, Felix, et al. "Crystal structure representations for machine learning models of formation energies." International Journal of Quantum Chemistry 115.16 (2015): 1094-1101.
https://matbench.materialsproject.org
30. Automatminer Model
Dunn, A.; Wang, Q.; Ganose, A.; Dopp, D.; Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Comput
Mater 2020, 6 (1), 138.
https://matbench.materialsproject.org
31. MODNet Model
De Breuck, P.-P.; Evans, M. L.; Rignanese, G.-M. Robust Model Benchmarking and Bias-Imbalance in Data-Driven Materials Science: A Case Study on MODNet. Journal of Physics:
Condensed Matter, Volume 33, Number 40, 2021
https://matbench.materialsproject.org
32. CGCNN Model
Xie, T.; Grossman, J. C. Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties. Phys. Rev. Lett. 2018, 120 (14), 145301.
https://matbench.materialsproject.org
33. ALIGNN Model
Choudhary, Kamal, and Brian DeCost. "Atomistic Line Graph Neural Network for improved materials property predictions." npj Computational Materials 7.1 (2021): 1-8.
https://matbench.materialsproject.org
34. How much have we
improved overall?
• In some cases (e.g., Ef DFT) we
have made a lot of
improvement
• In contrast, for others (e.g., σy
steel alloys) we have barely
improved
• Possible reasons
• Amount of attention paid to
certain problems
• Small vs large data emphasis –
there is a lot more room for
improvement for small data
35. How could we improve Matbench?
• Additional tasks – but how to keep it manageable?
• Adding external conditions (temperature, reducing gas presence,
microstructural characterizations)
• Other materials classes (polymers, metal alloys, multi-material composites)
• Other types of properties (e.g., predicting spectra)
• More dynamic tests, e.g. update the test periodically and re-evaluate
• Other scoring metrics
• e.g., active learning searches
• cross-validation by leaving out chemical systems rather than random splits
36. Outline
• Introduction to group and overview of our projects
• The Materials Project and virtual materials design
• The Matbench protocol: benchmarking ML algorithms
• Natural language processing applied to materials design
• Automating materials synthesis and characterization
37. Literature data can be a key source of materials learning
(Figure: Automated Lab A, Conventional Lab B, and Automated Lab C each cycle through Plan, Synthesize, Characterize, and Analyze; the automated labs also feed a local db + ML.)

Data sources for learning:
• Literature data: + broad coverage; – difficult to parse; – lack of negative examples
• Other A-lab data: + structured data formats; + negative examples; – not much out there …
• Theory data: + readily available; – difficult to establish relevance to synthesis
38. The NLP Solution to Literature Data
• A lot of prior experimental data already exists in the literature that would take
untold costs and labor to replicate again
• Advantages to this data set are broad coverage of materials and techniques
• Disadvantages include:
• Getting access to the data
• lack of negative examples in the data
• missing / unreliable information
• difficulty of obtaining structured data from unstructured text
• Natural language processing can help with the last part, although considerable
difficulties are still involved
• Named entity recognition
• Identify precursors, amounts, characteristics, etc.
• Relationship modeling
• Relate the extracted entities to one another
39. Previous approach for extracting data from
text
Weston, L. et al. Named Entity Recognition
and Normalization Applied to Large-Scale
Information Extraction from the Materials
Science Literature. J. Chem. Inf. Model.
(2019)
Recently, we also tried BERT variants
Trewartha, A.; Walker, N.; Huo, H.; Lee, S.;
Cruse, K.; Dagdelen, J.; Dunn, A.; Persson,
K. A.; Ceder, G.; Jain, A. Quantifying the
Advantage of Domain-Specific Pre-Training
on Named Entity Recognition Tasks in
Materials Science. Patterns 2022, 3 (4),
100488.
40. Models were good for labeling entities, but
didn’t understand relationships
Named Entity Recognition
• Custom machine learning models to
extract the most valuable materials-related
information.
• Utilizes a long short-term memory (LSTM)
network trained on ~1000 hand-annotated
abstracts.
Trewartha, A.; Walker, N.; Huo, H.; Lee, S.;
Cruse, K.; Dagdelen, J.; Dunn, A.; Persson,
K. A.; Ceder, G.; Jain, A. Quantifying the
Advantage of Domain-Specific Pre-Training
on Named Entity Recognition Tasks in
Materials Science. Patterns 2022, 3 (4),
100488.
41. A Sequence-to-Sequence Approach
• Language model takes a sequence of tokens as input and
outputs a sequence of tokens
• Maximizes the likelihood of the output conditioned on the input
• Additionally includes task conditioning
• Capacity for “understanding” language as well as “world
knowledge”
• Task conditioning with arbitrary Seq2Seq provides extremely
flexible framework
• Large seq2seq models can generate text that naturally
completes a paragraph
42. How a sequence-to-sequence approach works
Seq2Seq model
(GPT3)
Text in (“prompt”) Text out (“completion”)
45. But it’s not perfect for technical data
Seq2Seq model
(GPT3)
Text in (“prompt”) Text out (“completion”)
46. A workflow for fine-tuning GPT-3
1. Initial training set of templates
filled mostly manually, as zero-
shot GPT is often poor for
technical tasks
2. Fine-tune model to fill
templates, use the model to
assist in annotation
3. Repeat as necessary until
desired inference accuracy is
achieved
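A minimal sketch of what one fine-tuning example might look like in the legacy OpenAI prompt/completion JSONL format; the recipe content and field names here are hypothetical illustrations, not the actual annotation templates.

# Write one (hypothetical) training example for prompt/completion fine-tuning.
import json

example = {
    "prompt": ("LiNiO2 was prepared by ball-milling Li2CO3 and NiO, then heating "
               "at 700 C for 12 h in flowing O2.\n\n###\n\n"),
    "completion": json.dumps({
        "target": "LiNiO2",
        "precursors": ["Li2CO3", "NiO"],
        "operations": [
            {"type": "mill"},
            {"type": "heat", "temperature_C": 700, "time_h": 12, "atmosphere": "O2"},
        ],
    }) + " END",   # a stop sequence helps the model terminate cleanly
}

with open("finetune_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")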
47. Templated extraction of synthesis recipes
• Annotate paragraphs to output
structured recipe templates
• JSON-format
• Designed using domain knowledge from
experimentalists
• Template is relation graph to be filled in
by model
• Note: we are still formally evaluating
performance
• various issues in getting an accurate
evaluation, e.g., predictions that are
functionally correct but written differently
49. Applied to solid state synthesis / doping
We have performed the first-principles calculations onto the structural,
electronic and magnetic properties of seven 3d transition-metal (TM=V, Cr,
Mn, Fe, Co, Ni and Cu) atom substituting cation Zn in both zigzag (10,0) and
armchair (6,6) zinc oxide nanotubes (ZnONTs). The results show that there
exists a structural distortion around 3d TM impurities with respect to the
pristine ZnONTs. The magnetic moment increases for V-, Cr-doped ZnONTs
and reaches maximum for Mn-doped ZnONTs, and then decreases for Fe-, Co-
, Ni- and Cu-doped ZnONTs successively, which is consistent with the
predicted trend of Hund’s rule for maximizing the magnetic moments of the
doped TM ions. However, the values of the magnetic moments are smaller than
the predicted values of Hund’s rule due to strong hybridization between p
orbitals of the nearest neighbor O atoms of ZnONTs and d orbitals of the TM
atoms. Furthermore, the Mn-, Fe-, Co-, Cu-doped (10,0) and (6,6) ZnONTs
with half-metal and thus 100% spin polarization characters seem to be good
candidates for spintronic applications.
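A hypothetical illustration (not actual model output) of the kind of structured record such an extraction could produce for the abstract above:

# Hypothetical structured extraction for the doped-ZnO-nanotube abstract.
extraction = {
    "host_material": "ZnO nanotubes",
    "structures": ["zigzag (10,0)", "armchair (6,6)"],
    "dopants": ["V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu"],
    "doping_site": "Zn (cation substitution)",
    "properties": ["structural distortion", "magnetic moment",
                   "half-metallicity", "100% spin polarization"],
    "application": "spintronics",
}
print(extraction["dopants"])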
50. Use in initial hypothesis generation
Use cases:
• classifying AuNP morphologies based on precursors used
• predicting new materials for functional applications
• predicting doping: if a material can be doped with A, can it be doped with B?

(Figure annotations: "Investigated as thermoelectrics (independently of our study)"; "Investigated by our own collaborators (as a result of our study)"; "done using an older method".)
51. Outline
• Introduction to group and overview of our projects
• The Materials Project and virtual materials design
• The Matbench protocol: benchmarking ML algorithms
• Natural language processing applied to materials design
• Automating materials synthesis and characterization
52. Developing an automated lab (“A-lab”) that makes use
of literature data is in progress
(Figure: Automated Lab A, Conventional Lab B, and Automated Lab C each cycle through Plan, Synthesize, Characterize, and Analyze; the automated labs also feed a local db + ML.)

Data sources for learning:
• Literature data: + broad coverage; – difficult to parse; – lack of negative examples
• Other A-lab data: + structured data formats; + negative examples; – not much out there …
• Theory data: + readily available; – difficult to establish relevance to synthesis
53. The A-lab facility is designed to handle inorganic
powders
In operation: XRD, robot, box furnaces
Setting up: tube furnace x 4
Dosing and mixing
LBNL bldg. 30
The facility will handle powder-based synthesis of inorganic materials, with automated characterization and experimental planning.
Collaboration w/ G. Ceder & H. Kim
A-lab development timeline:
• Hardware development (April 2022): box furnace, XRD, & robots ready
• Platform integration (July 2022): tube furnaces and SEM ready
• Automated synthesis (November 2022): powder dosing system; first automated syntheses
• AI-guided synthesis: Summer 2023
• Closed-loop materials discovery: Summer 2024
54. Lab starting to take shape …
Courtesy Y. Fei,
Ceder Group
The embedded video
shows a robotic arm
performing various
synthesis tasks, such
as loading a box
furnace and
performing multiple
steps needed to
prepare and load an
XRD sample.
Other videos (not
shown here) show ball
milling, interaction
with tube furnaces.
A powder doser is
expected to arrive in
1-2 months.
55. The continuing challenge – putting it all together!
Currently we are still working on various components
(Figure: components being integrated: historical data, initial hypotheses, data API, NLP and literature data, ML algorithms, and high-throughput DFT data.)
56. Acknowledgements
NLP
• Nick Walker
• John Dagdelen
• Alex Dunn
• Sanghoon Lee
• Amalie Trewartha
A-lab
• Rishi Kumar
• Yuxing Fei
• Haegyeom Kim
• Gerbrand Ceder
Funding provided by:
• U.S. Department of Energy, Basic Energy Science, “D2S2” program
• Toyota Research Institutes, Accelerated Materials Design program
• Lawrence Berkeley National Laboratory “LDRD” program
Slides (will be) posted to hackingmaterials.lbl.gov
Materials Project
• Kristin Persson
• Matthew Horton
• All MP collaborators,
too many to name
…