FunFam protein families improve residue level molecular function prediction

Linus Scheibenreif, Maria Littmann, Christine Orengo, Burkhard Rost

Publication: Contribution to journal, Article, peer-reviewed

15 citations (Scopus)

Abstract

BACKGROUND: The CATH database provides a hierarchical classification of protein domain structures including a sub-classification of superfamilies into functional families (FunFams). We analyzed the similarity of binding site annotations in these FunFams and incorporated FunFams into the prediction of protein binding residues. RESULTS: FunFam members agreed, on average, in 36.9 ± 0.6% of their binding residue annotations. This constituted a 6.7-fold increase over randomly grouped proteins and a 1.2-fold increase (1.1-fold on the same dataset) over proteins with the same enzymatic function (identical Enzyme Commission, EC, number). Mapping de novo binding residue prediction methods (BindPredict-CCS, BindPredict-CC) onto FunFam resulted in consensus predictions for those residues that were aligned and predicted alike (binding/non-binding) within a FunFam. This simple consensus increased the F1-score (for binding) 1.5-fold over the original prediction method. Variation of the threshold for how many proteins in the consensus prediction had to agree provided a convenient control of accuracy/precision and coverage/recall, e.g. reaching a precision as high as 60.8 ± 0.4% for a stringent threshold. CONCLUSIONS: The FunFams outperformed even the carefully curated EC numbers in terms of agreement of binding site residues. Additionally, we assume that our proof-of-principle through the prediction of protein binding residues will be relevant for many other solutions profiting from FunFams to infer functional information at the residue level.
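
The consensus step described in the abstract can be illustrated with a minimal sketch. It assumes that per-residue binding predictions for all FunFam members have already been mapped onto the columns of the family alignment, and it interprets the agreement threshold as the fraction of aligned members that must predict "binding" in a column. The function names, the data layout, and this exact voting rule are hypothetical choices made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple


def consensus_binding(
    aligned_preds: Dict[str, List[Optional[bool]]],
    threshold: float,
) -> List[Optional[bool]]:
    """Per-column consensus over one family alignment (hypothetical helper).

    aligned_preds maps a protein id to its prediction per alignment column:
    True = binding, False = non-binding, None = gap in that column.
    A column is called binding if the fraction of non-gap members that
    predict binding is at least `threshold` (assumed voting rule).
    """
    n_cols = len(next(iter(aligned_preds.values())))
    consensus: List[Optional[bool]] = []
    for col in range(n_cols):
        votes = [p[col] for p in aligned_preds.values() if p[col] is not None]
        if not votes:
            consensus.append(None)  # nothing aligned here, no consensus
            continue
        frac_binding = sum(votes) / len(votes)
        consensus.append(frac_binding >= threshold)
    return consensus


def precision_recall_f1(
    pred: List[Optional[bool]], truth: List[bool]
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for the binding class, skipping None columns."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p is None:
            continue
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data, invented for illustration only (three members, four columns).
toy_family = {
    "protA": [True, True, False, None],
    "protB": [True, False, False, None],
    "protC": [True, True, None, None],
}
toy_truth = [True, False, False, False]

lenient = consensus_binding(toy_family, threshold=0.5)
strict = consensus_binding(toy_family, threshold=1.0)
```

On this toy input, raising the threshold from 0.5 to 1.0 removes the one column on which the members disagree, which is the kind of precision/coverage control the abstract describes; the numbers themselves carry no meaning beyond the example.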

Original language: English
Article number: 400
Pages (from - to): 400
Number of pages: 1
Journal: BMC Bioinformatics
Volume: 20
Issue number: 1
Publication status: Published - 18 July 2019
