Predicting protein-protein interactions by a supervised learning classifier

Yang Huang, Dmitrij Frishman, Ilya Muchnik

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Reliable prediction of protein-protein interactions based on sequence information represents a major challenge in computational biology. Based on the assumption that the likelihood that two proteins interact is associated with their structural domain composition and functional role, we transformed the problem of predicting protein interactions into a classification problem. We developed a heuristic to generate training pairs and test pairs, and then designed a new feature space to represent the training data. In particular, we propose a new method to construct a negative data set such that the functional and structural properties of putative non-interacting proteins strongly resemble the properties of proteins known to interact. The support vector machine algorithm was used to classify interacting and non-interacting protein pairs in Saccharomyces cerevisiae and to search for optimal training parameters. In a 10-fold cross-validation experiment, the system predicted whether two yeast proteins interact with an accuracy of 79%.
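The evaluation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The domain names, the pair encoding in `pair_features`, and the toy data are all hypothetical, and a nearest-centroid classifier stands in for the support vector machine so that the example needs only the standard library. The only part taken from the abstract is the idea of encoding a protein pair by its domain composition and estimating accuracy by 10-fold cross-validation.

```python
from typing import Callable, List, Sequence, Set, Tuple

Vector = List[float]


def pair_features(domains_a: Set[str], domains_b: Set[str],
                  vocabulary: Sequence[str]) -> Vector:
    """Hypothetical order-independent encoding of a protein pair:
    for each domain, how many of the two proteins carry it (0, 1 or 2)."""
    return [float((d in domains_a) + (d in domains_b)) for d in vocabulary]


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k disjoint test folds (strided assignment)."""
    splits = []
    for i in range(k):
        test = list(range(i, n, k))
        held_out = set(test)
        train = [j for j in range(n) if j not in held_out]
        splits.append((train, test))
    return splits


def fit_centroids(X: List[Vector], y: List[int]) -> dict:
    """Stand-in classifier (NOT an SVM): one mean vector per class label."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        model[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return model


def predict_centroid(model: dict, x: Vector) -> int:
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    def sq_dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: sq_dist(model[label]))


def cross_validated_accuracy(X: List[Vector], y: List[int], k: int,
                             fit: Callable, predict: Callable) -> float:
    """Fraction of held-out pairs classified correctly across all k folds."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)


# Toy data with made-up domain names; labels: 1 = interacting, -1 = non-interacting.
vocabulary = ["dom_A", "dom_B", "dom_C", "dom_D"]
pairs, labels = [], []
for i in range(10):  # 10 hypothetical interacting pairs
    extra = {"dom_C"} if i % 2 == 0 else set()
    pairs.append(({"dom_A"}, {"dom_B"} | extra))
    labels.append(1)
for i in range(10):  # 10 hypothetical non-interacting pairs
    extra = {"dom_A"} if i % 2 == 0 else set()
    pairs.append(({"dom_C"} | extra, {"dom_D"}))
    labels.append(-1)

X = [pair_features(a, b, vocabulary) for a, b in pairs]
accuracy = cross_validated_accuracy(X, labels, 10, fit_centroids, predict_centroid)
print(f"10-fold cross-validated accuracy on toy data: {accuracy:.2f}")
```

Because `fit` and `predict` are passed in as callables, an actual SVM implementation could be substituted without changing the cross-validation loop. The toy accuracy says nothing about the 79% reported for yeast.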

Original language: English
Pages (from-to): 291-301
Number of pages: 11
Journal: Computational Biology and Chemistry
Issue number: 4
State: Published - Oct 2004
Externally published: Yes


  • Genome analysis
  • Protein domains
  • Protein-protein interactions
  • SVM learning

