TY - JOUR
T1 - Engineering indel and substitution variants of diverse and ancient enzymes using Graphical Representation of Ancestral Sequence Predictions (GRASP)
AU - Foley, Gabriel
AU - Mora, Ariane
AU - Ross, Connie M.
AU - Bottoms, Scott
AU - Sützl, Leander
AU - Lamprecht, Marnie L.
AU - Zaugg, Julian
AU - Essebier, Alexandra
AU - Balderson, Brad
AU - Newell, Rhys
AU - Thomson, Raine E.S.
AU - Kobe, Bostjan
AU - Barnard, Ross T.
AU - Guddat, Luke
AU - Schenk, Gerhard
AU - Carsten, Jörg
AU - Gumulya, Yosephine
AU - Rost, Burkhard
AU - Haltrich, Dietmar
AU - Sieber, Volker
AU - Gillam, Elizabeth M.J.
AU - Bodén, Mikael
N1 - Publisher Copyright: © 2022 Foley et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2022/10
Y1 - 2022/10
AB - Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.
UR - http://www.scopus.com/inward/record.url?scp=85141890324&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1010633
DO - 10.1371/journal.pcbi.1010633
M3 - Article
C2 - 36279274
AN - SCOPUS:85141890324
SN - 1553-734X
VL - 18
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 10
M1 - e1010633
ER -