TY - JOUR
T1 - Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks
AU - Kim, Edward
AU - Jensen, Zach
AU - Van Grootel, Alexander
AU - Huang, Kevin
AU - Staib, Matthew
AU - Mysore, Sheshera
AU - Chang, Haw Shiuan
AU - Strubell, Emma
AU - McCallum, Andrew
AU - Jegelka, Stefanie
AU - Olivetti, Elsa
N1 - Publisher Copyright: Copyright © 2020 American Chemical Society.
PY - 2020/3/23
Y1 - 2020/3/23
AB - Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated, unsupervised method for connecting scientific literature to inorganic synthesis insights. Starting from the natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for any inorganic materials of interest. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties and that the model's behavior complements the existing thermodynamic knowledge. Finally, we apply the model to perform synthesizability screening for proposed novel perovskite compounds.
UR - http://www.scopus.com/inward/record.url?scp=85082148237&partnerID=8YFLogxK
U2 - 10.1021/acs.jcim.9b00995
DO - 10.1021/acs.jcim.9b00995
M3 - Article
C2 - 31909619
AN - SCOPUS:85082148237
SN - 1549-9596
VL - 60
SP - 1194
EP - 1201
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 3
ER -