TY - JOUR
T1 - MiMiC: a bioinformatic approach for generation of synthetic communities from metagenomes
AU - Kumar, Neeraj
AU - Hitch, Thomas C.A.
AU - Haller, Dirk
AU - Lagkouvardos, Ilias
AU - Clavel, Thomas
N1 - Publisher Copyright: © 2021 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
PY - 2021/7
Y1 - 2021/7
N2 - Environmental and host-associated microbial communities are complex ecosystems, of which many members are still unknown. Hence, it is challenging to study community dynamics and important to create model systems of reduced complexity that mimic major community functions. Therefore, we developed MiMiC, a computational approach for data-driven design of simplified communities from shotgun metagenomes. We first built a comprehensive database of species-level bacterial and archaeal genomes (n = 22 627) consisting of binary (presence/absence) vectors of protein families (Pfam = 17 929). MiMiC predicts the composition of minimal consortia using an iterative scoring system based on maximal match-to-mismatch ratios between this database and the Pfam binary vector of any input metagenome. Pfam vectorization retained enough resolution to distinguish metagenomic profiles between six environmental and host-derived microbial communities (n = 937). The calculated number of species per minimal community ranged between 5 and 11, with MiMiC-selected communities better recapitulating the functional repertoire of the original samples than randomly selected species. The inferred minimal communities retained habitat-specific features and were substantially different from communities consisting of the most abundant members. The use of a mixture of known microbes revealed the ability to select 23 of 25 target species from the entire genome database. MiMiC is open source and available at https://github.com/ClavelLab/MiMiC.
AB - Environmental and host-associated microbial communities are complex ecosystems, of which many members are still unknown. Hence, it is challenging to study community dynamics and important to create model systems of reduced complexity that mimic major community functions. Therefore, we developed MiMiC, a computational approach for data-driven design of simplified communities from shotgun metagenomes. We first built a comprehensive database of species-level bacterial and archaeal genomes (n = 22 627) consisting of binary (presence/absence) vectors of protein families (Pfam = 17 929). MiMiC predicts the composition of minimal consortia using an iterative scoring system based on maximal match-to-mismatch ratios between this database and the Pfam binary vector of any input metagenome. Pfam vectorization retained enough resolution to distinguish metagenomic profiles between six environmental and host-derived microbial communities (n = 937). The calculated number of species per minimal community ranged between 5 and 11, with MiMiC-selected communities better recapitulating the functional repertoire of the original samples than randomly selected species. The inferred minimal communities retained habitat-specific features and were substantially different from communities consisting of the most abundant members. The use of a mixture of known microbes revealed the ability to select 23 of 25 target species from the entire genome database. MiMiC is open source and available at https://github.com/ClavelLab/MiMiC.
UR - http://www.scopus.com/inward/record.url?scp=85107051869&partnerID=8YFLogxK
U2 - 10.1111/1751-7915.13845
DO - 10.1111/1751-7915.13845
M3 - Article
C2 - 34081399
AN - SCOPUS:85107051869
SN - 1751-7907
VL - 14
SP - 1757
EP - 1770
JO - Microbial Biotechnology
JF - Microbial Biotechnology
IS - 4
ER -