Abstract
Background: Twenty years of improved technology and growing sequence data now render residue-residue contact constraints, inferred from correlated mutations in large protein families, accurate enough to drive de novo predictions of protein three-dimensional structure. The method EVfold broke new ground using mean-field Direct Coupling Analysis (EVfold-mfDCA); the method PSICOV applied a related concept by estimating a sparse inverse covariance matrix. Both methods (EVfold-mfDCA and PSICOV) are publicly available, but both require too much CPU time for interactive applications. In addition, EVfold-mfDCA depends on proprietary software.

Results: Here, we present FreeContact, a fast, open source implementation of EVfold-mfDCA and PSICOV. On a test set of 140 proteins, FreeContact was almost eight times faster than PSICOV without decreasing prediction performance. The EVfold-mfDCA implementation of FreeContact was over 220 times faster than PSICOV with negligible performance decrease. The original EVfold-mfDCA was unavailable for testing due to its dependency on proprietary software. FreeContact is implemented as the free C++ library "libfreecontact", complete with the command line tool "freecontact", as well as Perl and Python modules. All components are available as Debian packages. FreeContact supports the BioXSD format for interoperability.

Conclusions: FreeContact provides the opportunity to compute reliable contact predictions in any environment (desktop or cloud).
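To make the idea shared by the two methods concrete, the following is a minimal sketch of inverse-covariance contact scoring on a toy alignment. It is an illustration only and not FreeContact's implementation or API: the function name `contact_scores`, the pseudocount value, and the toy alignment are assumptions, sequence reweighting is omitted, and pairs are scored with a Frobenius norm plus average product correction rather than the direct information used in mean-field DCA.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap


def contact_scores(msa, pseudocount=0.5):
    """Toy inverse-covariance contact scores (hypothetical helper, not FreeContact)."""
    q = len(ALPHABET)
    n_seq, length = len(msa), len(msa[0])
    index = {aa: k for k, aa in enumerate(ALPHABET)}

    # One-hot encode each alignment column as q indicator variables.
    x = np.zeros((n_seq, length, q))
    for s, seq in enumerate(msa):
        for i, aa in enumerate(seq):
            x[s, i, index[aa]] = 1.0

    # Drop the last state of every column to remove the linear dependency.
    r = q - 1
    x = x[:, :, :r].reshape(n_seq, length * r)

    # Single-site and pairwise frequencies, smoothed with a pseudocount.
    fi = (1 - pseudocount) * x.mean(axis=0) + pseudocount / q
    fij = (1 - pseudocount) * (x.T @ x) / n_seq + pseudocount / q**2
    for i in range(length):
        blk = slice(i * r, (i + 1) * r)
        fij[blk, blk] = np.diag(fi[blk])  # same-column block is diagonal

    # Mean-field approximation: couplings are the negative inverse covariance.
    cov = fij - np.outer(fi, fi)
    couplings = -np.linalg.inv(cov)

    # Collapse each (r x r) coupling block to one number per column pair.
    blocks = couplings.reshape(length, r, length, r)
    raw = np.sqrt((blocks**2).sum(axis=(1, 3)))
    np.fill_diagonal(raw, 0.0)

    # Average product correction to reduce background signal.
    row_mean = raw.sum(axis=1) / (length - 1)
    total_mean = raw.sum() / (length * (length - 1))
    scores = raw - np.outer(row_mean, row_mean) / total_mean
    np.fill_diagonal(scores, 0.0)
    return scores


# Made-up alignment of eight sequences with five columns.
msa = ["ACDEF", "ACDEF", "GCDKF", "GCHKF", "ACHEF", "GCDKW", "ACDEW", "GCHKW"]
scores = contact_scores(msa)
```

Higher entries in `scores` suggest more strongly coupled column pairs. On real protein families the alignment has to be far deeper than this toy example for the ranking to be meaningful.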
Original language | English |
---|---|
Article number | 85 |
Journal | BMC Bioinformatics |
Volume | 15 |
Issue number | 1 |
DOIs | |
State | Published - 26 Mar 2014 |
Keywords
- 2D prediction
- BioXSD
- Debian package
- EVcouplings
- EVfold
- Fast protein contact prediction
- Open-source software
- PSICOV
- Protein sequence analysis
- Protein structure prediction
- mfDCA