TY - GEN
T1 - Etalumis: Bringing probabilistic programming to scientific simulators at scale
T2 - 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019
AU - Baydin, Atilim Güneş
AU - Shao, Lei
AU - Bhimji, Wahid
AU - Heinrich, Lukas
AU - Meadows, Lawrence
AU - Liu, Jialin
AU - Munk, Andreas
AU - Naderiparizi, Saeid
AU - Gram-Hansen, Bradley
AU - Louppe, Gilles
AU - Ma, Mingfei
AU - Zhao, Xiaohui
AU - Torr, Philip
AU - Lee, Victor
AU - Cranmer, Kyle
AU - Prabhat,
AU - Wood, Frank
N1 - Publisher Copyright:
© 2019 ACM.
PY - 2019/11/17
Y1 - 2019/11/17
N2 - Probabilistic programming languages (PPLs) are receiving widespread attention for performing Bayesian inference in complex generative models. However, applications to science remain limited because of the impracticability of rewriting complex scientific simulators in a PPL, the computational cost of inference, and the lack of scalable implementations. To address these, we present a novel PPL framework that couples directly to existing scientific simulators through a cross-platform probabilistic execution protocol and provides Markov chain Monte Carlo (MCMC) and deep-learning-based inference compilation (IC) engines for tractable inference. To guide IC inference, we perform distributed training of a dynamic 3DCNN-LSTM architecture with a PyTorch-MPI-based framework on 1,024 32-core CPU nodes of the Cori supercomputer with a global mini-batch size of 128k, achieving a performance of 450 Tflop/s through enhancements to PyTorch. We demonstrate a Large Hadron Collider (LHC) use case with the C++ Sherpa simulator and achieve the largest-scale posterior inference in a Turing-complete PPL.
KW - Deep learning
KW - Inference
KW - Probabilistic programming
KW - Simulation
UR - http://www.scopus.com/inward/record.url?scp=85076168418&partnerID=8YFLogxK
U2 - 10.1145/3295500.3356180
DO - 10.1145/3295500.3356180
M3 - Conference contribution
AN - SCOPUS:85076168418
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2019
PB - IEEE Computer Society
Y2 - 17 November 2019 through 22 November 2019
ER -