FunFam protein families improve residue level molecular function prediction

Linus Scheibenreif, Maria Littmann, Christine Orengo, Burkhard Rost

Research output: Contribution to journal › Article › peer-review

17 Scopus citations

Abstract

BACKGROUND: The CATH database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams). We analyzed the similarity of binding site annotations in these FunFams and incorporated FunFams into the prediction of protein binding residues.

RESULTS: FunFam members agreed, on average, in 36.9 ± 0.6% of their binding residue annotations. This constituted a 6.7-fold increase over randomly grouped proteins and a 1.2-fold increase (1.1-fold on the same dataset) over proteins with the same enzymatic function (identical Enzyme Commission, EC, number). Mapping de novo binding residue prediction methods (BindPredict-CCS, BindPredict-CC) onto FunFam resulted in consensus predictions for those residues that were aligned and predicted alike (binding/non-binding) within a FunFam. This simple consensus increased the F1-score (for binding) 1.5-fold over the original prediction method. Variation of the threshold for how many proteins in the consensus prediction had to agree provided a convenient control of accuracy/precision and coverage/recall, e.g. reaching a precision as high as 60.8 ± 0.4% for a stringent threshold.

CONCLUSIONS: The FunFams outperformed even the carefully curated EC numbers in terms of agreement of binding site residues. Additionally, we assume that our proof-of-principle through the prediction of protein binding residues will be relevant for many other solutions profiting from FunFams to infer functional information at the residue level.
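The consensus idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`consensus_binding`, `precision_recall_f1`), the fraction-based agreement rule, the handling of gaps, and the toy data are all assumptions made for illustration. Each alignment column holds the per-protein predictions (binding/non-binding) of the FunFam members, with `None` marking a gap; a column is called binding when the fraction of aligned members predicted as binding reaches a threshold.

```python
from typing import List, Optional, Tuple


def consensus_binding(
    columns: List[List[Optional[bool]]], threshold: float = 0.5
) -> List[bool]:
    """Hypothetical consensus rule: one call per alignment column.

    A column is called binding if the fraction of aligned (non-gap)
    members predicted as binding is at least `threshold`.
    """
    consensus = []
    for column in columns:
        votes = [p for p in column if p is not None]  # ignore gaps
        if not votes:
            consensus.append(False)  # assumption: no aligned residue, no call
            continue
        consensus.append(sum(votes) / len(votes) >= threshold)
    return consensus


def precision_recall_f1(
    predicted: List[bool], observed: List[bool]
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the binding (positive) class."""
    tp = sum(p and o for p, o in zip(predicted, observed))
    fp = sum(p and not o for p, o in zip(predicted, observed))
    fn = sum(o and not p for p, o in zip(predicted, observed))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (invented): four alignment columns, three FunFam members each.
columns = [
    [True, True, True],
    [True, False, None],
    [False, False, True],
    [None, None, None],
]
observed = [True, False, False, False]  # invented annotation

lenient = consensus_binding(columns, threshold=0.5)
strict = consensus_binding(columns, threshold=1.0)
print(lenient, precision_recall_f1(lenient, observed))
print(strict, precision_recall_f1(strict, observed))
```

In this toy case, raising the threshold drops the column on which members disagree, which shows the kind of precision/recall control the abstract describes; the numbers have no relation to the reported results.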

Original language: English
Article number: 400
Pages (from-to): 400
Number of pages: 1
Journal: BMC Bioinformatics
Volume: 20
Issue number: 1
State: Published - 18 Jul 2019

Keywords

  • Binding residue prediction
  • CATH
  • Functional families
  • Protein binding sites
  • Protein families
  • Protein function
