TY - GEN
T1 - Bayesian blind source separation applied to the lymphocyte pathway
AU - Illner, Katrin
AU - Fuchs, Christiane
AU - Theis, Fabian J.
N1 - Publisher Copyright: © 2014 Proceedings of COMPSTAT 2014 - 21st International Conference on Computational Statistics. All rights reserved.
PY - 2014
Y1 - 2014
N2 - In many biological applications one observes a multivariate mixture of signals, where both the mixing process and the signals are unknown. Blind source separation can extract such source signals. Often the data have additional structure, i.e., the variables (e.g., genes) are linked by an interaction network. Recently, we developed the probabilistic method emGrade that explicitly uses this network structure as a Bayesian network and thus performs a more appropriate separation of the data than standard methods. Here, we consider the application of emGrade to gene expression data together with a literature-derived pathway. Thanks to the probabilistic modeling, we can use model selection criteria and demonstrate the relevance of the pathway information for explaining the data. We further use estimates of missing observations to identify the most appropriate microarray probe sets for two genes that were not uniquely annotated after standard filtering. Finally, we identify genes relevant for the dynamics underlying the data; these genes were not detected without the network information.
AB - In many biological applications one observes a multivariate mixture of signals, where both the mixing process and the signals are unknown. Blind source separation can extract such source signals. Often the data have additional structure, i.e., the variables (e.g., genes) are linked by an interaction network. Recently, we developed the probabilistic method emGrade that explicitly uses this network structure as a Bayesian network and thus performs a more appropriate separation of the data than standard methods. Here, we consider the application of emGrade to gene expression data together with a literature-derived pathway. Thanks to the probabilistic modeling, we can use model selection criteria and demonstrate the relevance of the pathway information for explaining the data. We further use estimates of missing observations to identify the most appropriate microarray probe sets for two genes that were not uniquely annotated after standard filtering. Finally, we identify genes relevant for the dynamics underlying the data; these genes were not detected without the network information.
KW - expectation maximization
KW - gene expression data
KW - gene regulatory networks
KW - model selection
UR - http://www.scopus.com/inward/record.url?scp=85183589887&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85183589887
T3 - Proceedings of COMPSTAT 2014 - 21st International Conference on Computational Statistics
SP - 625
EP - 632
BT - Proceedings of COMPSTAT 2014 - 21st International Conference on Computational Statistics
A2 - Gilli, Manfred
A2 - Gonzalez-Rodriguez, Gil
A2 - Nieto-Reyes, Alicia
PB - The International Statistical Institute/International Association for Statistical Computing
T2 - 21st International Conference on Computational Statistics, COMPSTAT 2014
Y2 - 19 August 2014 through 22 August 2014
ER -