TY - JOUR
T1 - Accurate promoter and enhancer identification in 127 ENCODE and roadmap epigenomics cell types and tissues by GenoSTAN
AU - Zacher, Benedikt
AU - Michel, Margaux
AU - Schwalb, Björn
AU - Cramer, Patrick
AU - Tresch, Achim
AU - Gagneur, Julien
N1 - Publisher Copyright:
© 2017 Zacher et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2017/1
Y1 - 2017/1
N2 - Accurate maps of promoters and enhancers are required for understanding transcriptional regulation. Promoters and enhancers are usually mapped by integration of chromatin assays charting histone modifications, DNA accessibility, and transcription factor binding. However, current algorithms are limited by unrealistic data distribution assumptions. Here we propose GenoSTAN (Genomic STate ANnotation), a hidden Markov model overcoming these limitations. We map promoters and enhancers for 127 cell types and tissues from the ENCODE and Roadmap Epigenomics projects, today's largest compendium of chromatin assays. Extensive benchmarks demonstrate that GenoSTAN generally identifies promoters and enhancers with significantly higher accuracy than previous methods. Moreover, GenoSTAN-derived promoters and enhancers showed significantly higher enrichment of complex trait-associated genetic variants than current annotations. Altogether, GenoSTAN provides an easy-to-use tool to define promoters and enhancers in any system, and our annotation of human transcriptional cis-regulatory elements constitutes a rich resource for future research in biology and medicine.
AB - Accurate maps of promoters and enhancers are required for understanding transcriptional regulation. Promoters and enhancers are usually mapped by integration of chromatin assays charting histone modifications, DNA accessibility, and transcription factor binding. However, current algorithms are limited by unrealistic data distribution assumptions. Here we propose GenoSTAN (Genomic STate ANnotation), a hidden Markov model overcoming these limitations. We map promoters and enhancers for 127 cell types and tissues from the ENCODE and Roadmap Epigenomics projects, today's largest compendium of chromatin assays. Extensive benchmarks demonstrate that GenoSTAN generally identifies promoters and enhancers with significantly higher accuracy than previous methods. Moreover, GenoSTAN-derived promoters and enhancers showed significantly higher enrichment of complex trait-associated genetic variants than current annotations. Altogether, GenoSTAN provides an easy-to-use tool to define promoters and enhancers in any system, and our annotation of human transcriptional cis-regulatory elements constitutes a rich resource for future research in biology and medicine.
UR - http://www.scopus.com/inward/record.url?scp=85008601761&partnerID=8YFLogxK
U2 - 10.1371/journal.pone.0169249
DO - 10.1371/journal.pone.0169249
M3 - Article
C2 - 28056037
AN - SCOPUS:85008601761
SN - 1932-6203
VL - 12
JO - PLoS ONE
JF - PLoS ONE
IS - 1
M1 - 0169249
ER -