Abstract
The wealth of genomic data has boosted the development of computational methods that predict the phenotypic outcomes of missense variants. The most accurate ones exploit multiple sequence alignments, which can be costly to generate. Recent efforts to democratize protein structure prediction have overcome this bottleneck by leveraging the fast homology search of MMseqs2. Here, we show the usefulness of this strategy for mutational outcome prediction through a large-scale assessment of 1.5M missense variants across 72 protein families. Our study demonstrates the feasibility of producing alignment-based mutational landscape predictions that are both high-quality and compute-efficient for entire proteomes. We provide the community with the mutational landscape of the whole human proteome and simplified access to our predictive pipeline.
Original language | English |
---|---|
Article number | evad201 |
Journal | Genome Biology and Evolution |
Volume | 15 |
Issue number | 11 |
DOIs | |
State | Published - 1 Nov 2023 |
Keywords
- deep mutational scan
- evolution
- genotype-phenotype relationship
- multiple sequence alignment
- protein mutation