TY - JOUR
T1 - METHimpute
T2 - Imputation-guided construction of complete methylomes from WGBS data
AU - Taudt, Aaron
AU - Roquis, David
AU - Vidalis, Amaryllis
AU - Wardenaar, René
AU - Johannes, Frank
AU - Colomé-Tatché, Maria
N1 - Publisher Copyright: © 2018 The Author(s).
PY - 2018/6/7
Y1 - 2018/6/7
AB - Background: Whole-genome bisulfite sequencing (WGBS) has become the standard method for interrogating plant methylomes at base resolution. However, deep WGBS measurements remain cost prohibitive for large, complex genomes and for population-level studies. As a result, most published plant methylomes are sequenced far below saturation, with a large proportion of cytosines having either missing data or insufficient coverage. Results: Here we present METHimpute, a Hidden Markov Model (HMM) based imputation algorithm for the analysis of WGBS data. Unlike existing methods, METHimpute enables the construction of complete methylomes by inferring the methylation status and level of all cytosines in the genome regardless of coverage. Application of METHimpute to maize, rice and Arabidopsis shows that the algorithm infers cytosine-resolution methylomes with high accuracy from data as low as 6X, compared to data with 60X, thus making it a cost-effective solution for large-scale studies. Conclusions: METHimpute provides methylation status calls and levels for all cytosines in the genome regardless of coverage, thus yielding complete methylomes even with low-coverage WGBS datasets. The method has been extensively tested in plants, but should also be applicable to other species. An implementation is available on Bioconductor.
KW - Hidden Markov Model
KW - Imputation
KW - Methylation
KW - Whole-genome bisulfite sequencing
UR - http://www.scopus.com/inward/record.url?scp=85048280464&partnerID=8YFLogxK
U2 - 10.1186/s12864-018-4641-x
DO - 10.1186/s12864-018-4641-x
M3 - Article
C2 - 29879918
AN - SCOPUS:85048280464
SN - 1471-2164
VL - 19
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 444
ER -