TY - JOUR
T1 - How little do we actually know? On the size of gene regulatory networks
AU - Röttger, Richard
AU - Rückert, Ulrich
AU - Taubert, Jan
AU - Baumbach, Jan
N1 - Funding Information:
The work of Jan Baumbach was supported by the Cluster of Excellence for Multimodal Computing (MMCI). Richard Röttger is grateful for financial support of the International Max Planck Research School (IMPRS) and the German Academic Exchange Service (DAAD). Jan Taubert receives funding from the Biotechnology and Biological Sciences Research Council (BBSRC).
PY - 2012
Y1 - 2012
N2 - The National Center for Biotechnology Information (NCBI) recently announced the availability of whole genome sequences for more than 1,000 species, and the number of sequenced individual organisms is growing. Ongoing improvement of DNA sequencing technology will further contribute to this, enabling large-scale evolution and population genetics studies. However, the availability of sequence information is only the first step in understanding how cells survive, reproduce, and adjust their behavior. The genetic control behind organized development and adaptation of complex organisms still remains largely undetermined. One major molecular control mechanism is transcriptional gene regulation. The contrast between the total number of sequenced species and the handful of model organisms with known regulations is surprising. Here, we investigate how little we know even about these model organisms. We aim to predict the sizes of the whole-organism regulatory networks of seven species. In particular, we provide statistical lower bounds for the expected number of regulations. For Escherichia coli we estimate that at most 37 percent of the expected gene regulatory interactions have already been discovered, 24 percent for Bacillus subtilis, and less than 3 percent for human. We conclude that even for our best-researched model organisms we still lack substantial understanding of fundamental molecular control mechanisms, at least on a large scale.
AB - The National Center for Biotechnology Information (NCBI) recently announced the availability of whole genome sequences for more than 1,000 species, and the number of sequenced individual organisms is growing. Ongoing improvement of DNA sequencing technology will further contribute to this, enabling large-scale evolution and population genetics studies. However, the availability of sequence information is only the first step in understanding how cells survive, reproduce, and adjust their behavior. The genetic control behind organized development and adaptation of complex organisms still remains largely undetermined. One major molecular control mechanism is transcriptional gene regulation. The contrast between the total number of sequenced species and the handful of model organisms with known regulations is surprising. Here, we investigate how little we know even about these model organisms. We aim to predict the sizes of the whole-organism regulatory networks of seven species. In particular, we provide statistical lower bounds for the expected number of regulations. For Escherichia coli we estimate that at most 37 percent of the expected gene regulatory interactions have already been discovered, 24 percent for Bacillus subtilis, and less than 3 percent for human. We conclude that even for our best-researched model organisms we still lack substantial understanding of fundamental molecular control mechanisms, at least on a large scale.
KW - Computational biology
KW - network statistics
KW - transcriptional gene regulatory networks
UR - http://www.scopus.com/inward/record.url?scp=84864925344&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2012.71
DO - 10.1109/TCBB.2012.71
M3 - Article
C2 - 22585140
AN - SCOPUS:84864925344
SN - 1545-5963
VL - 9
SP - 1293
EP - 1300
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 5
M1 - 6200261
ER -