TY - JOUR
T1 - High-dimensional causal discovery under non-Gaussianity
AU - Wang, Y. Samuel
AU - Drton, Mathias
N1 - Publisher Copyright: © 2019 Biometrika Trust.
PY - 2020/3/1
Y1 - 2020/3/1
N2 - We consider graphical models based on a recursive system of linear structural equations. This implies that there is an ordering, σ, of the variables such that each observed variable Yv is a linear function of a variable-specific error term and the other observed variables Yu with σ(u) < σ(v). The causal relationships, i.e., which other variables the linear functions depend on, can be described using a directed graph. It has previously been shown that when the variable-specific error terms are non-Gaussian, the exact causal graph, as opposed to a Markov equivalence class, can be consistently estimated from observational data. We propose an algorithm that yields consistent estimates of the graph also in high-dimensional settings in which the number of variables may grow at a faster rate than the number of observations, but in which the underlying causal structure features suitable sparsity; specifically, the maximum in-degree of the graph is controlled. Our theoretical analysis is couched in the setting of log-concave error distributions.
AB - We consider graphical models based on a recursive system of linear structural equations. This implies that there is an ordering, σ, of the variables such that each observed variable Yv is a linear function of a variable-specific error term and the other observed variables Yu with σ(u) < σ(v). The causal relationships, i.e., which other variables the linear functions depend on, can be described using a directed graph. It has previously been shown that when the variable-specific error terms are non-Gaussian, the exact causal graph, as opposed to a Markov equivalence class, can be consistently estimated from observational data. We propose an algorithm that yields consistent estimates of the graph also in high-dimensional settings in which the number of variables may grow at a faster rate than the number of observations, but in which the underlying causal structure features suitable sparsity; specifically, the maximum in-degree of the graph is controlled. Our theoretical analysis is couched in the setting of log-concave error distributions.
KW - Causal discovery
KW - Directed graphical model
KW - High-dimensional statistics
KW - Non-Gaussian data
KW - Structural equation model
UR - http://www.scopus.com/inward/record.url?scp=85082106370&partnerID=8YFLogxK
U2 - 10.1093/biomet/asz055
DO - 10.1093/biomet/asz055
M3 - Article
AN - SCOPUS:85082106370
SN - 0006-3444
VL - 107
SP - 41
EP - 59
JO - Biometrika
JF - Biometrika
IS - 1
ER -