TY - JOUR
T1 - TMSEG
T2 - Novel prediction of transmembrane helices
AU - Bernhofer, Michael
AU - Kloppmann, Edda
AU - Reeb, Jonas
AU - Rost, Burkhard
N1 - Publisher Copyright:
© 2016 Wiley Periodicals, Inc.
PY - 2016/11/1
Y1 - 2016/11/1
N2 - Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non-redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state-of-the-art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best method available, and 4400 fewer mistakes than by a simple hydrophobicity-based method. Second, TMSEG provides an add-on improvement for any existing method to benefit from. Proteins 2016; 84:1706–1716.
AB - Transmembrane proteins (TMPs) are important drug targets because they are essential for signaling, regulation, and transport. Despite important breakthroughs, experimental structure determination remains challenging for TMPs. Various methods have bridged the gap by predicting transmembrane helices (TMHs), but room for improvement remains. Here, we present TMSEG, a novel method identifying TMPs and accurately predicting their TMHs and their topology. The method combines machine learning with empirical filters. Testing it on a non-redundant dataset of 41 TMPs and 285 soluble proteins, and applying strict performance measures, TMSEG outperformed the state-of-the-art in our hands. TMSEG correctly distinguished helical TMPs from other proteins with a sensitivity of 98 ± 2% and a false positive rate as low as 3 ± 1%. Individual TMHs were predicted with a precision of 87 ± 3% and recall of 84 ± 3%. Furthermore, in 63 ± 6% of helical TMPs the placement of all TMHs and their inside/outside topology was correctly predicted. There are two main features that distinguish TMSEG from other methods. First, the errors in finding all helical TMPs in an organism are significantly reduced. For example, in human this leads to 200 and 1600 fewer misclassifications compared to the second and third best method available, and 4400 fewer mistakes than by a simple hydrophobicity-based method. Second, TMSEG provides an add-on improvement for any existing method to benefit from. Proteins 2016; 84:1706–1716.
KW - membrane protein
KW - protein structure prediction
KW - transmembrane helices
KW - transmembrane helix prediction
KW - transmembrane protein prediction
KW - α-helical integral membrane protein
UR - http://www.scopus.com/inward/record.url?scp=84992145729&partnerID=8YFLogxK
U2 - 10.1002/prot.25155
DO - 10.1002/prot.25155
M3 - Article
C2 - 27566436
AN - SCOPUS:84992145729
SN - 0887-3585
VL - 84
SP - 1706
EP - 1716
JO - Proteins: Structure, Function and Bioinformatics
JF - Proteins: Structure, Function and Bioinformatics
IS - 11
ER -