FunPred-1: Protein function prediction from a protein interaction network using neighborhood analysis

Abstract

Proteins are responsible for all biological activities in living organisms. Thanks to genome sequencing projects, large amounts of DNA and protein sequence data are now available, but the biological functions of many proteins remain unannotated. The function of such a non-annotated protein may be inferred from its neighbors in a protein interaction network. In this paper, we propose two new methods to predict protein functions based on network neighborhood properties. FunPred 1.1 combines three simple yet effective scoring techniques: the neighborhood ratio, the protein path connectivity and the relative functional similarity. FunPred 1.2 applies a heuristic approach that uses the edge clustering coefficient to reduce the search space by identifying densely connected neighborhood regions. The overall accuracy achieved with FunPred 1.2 over 8 functional groups involving hetero-interactions in 650 yeast proteins is approximately 87%, which is higher than the accuracy of FunPred 1.1. It is also higher than the accuracy of many of the state-of-the-art protein function prediction methods described in the literature. The test datasets and the complete source code of the developed software are freely available at http://code.google.com/p/cmaterbioinfo/.
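To make two of the neighborhood measures named in the abstract concrete, the following is a minimal Python sketch. It is not the FunPred implementation. The abstract does not give the formulas, so both definitions are assumptions: the edge clustering coefficient is taken in one commonly used form (the number of triangles through an edge divided by the maximum number possible given the endpoint degrees), and the neighborhood ratio is taken as the fraction of a protein's direct neighbors annotated with a given functional group. The function names, the toy network and the annotations are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def edge_clustering_coefficient(adj, u, v):
    """Assumed form: ECC(u, v) = z / min(deg(u) - 1, deg(v) - 1),
    where z is the number of common neighbors of u and v, i.e. the
    number of triangles that contain the edge (u, v)."""
    if v not in adj.get(u, set()):
        raise ValueError(f"{u!r} and {v!r} do not interact")
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles == 0:
        return 0.0  # an endpoint of degree 1 cannot close a triangle
    return len(adj[u] & adj[v]) / max_triangles


def neighborhood_ratio(adj, annotations, protein, group):
    """Assumed definition: fraction of the direct neighbors of `protein`
    that are annotated with the functional group `group`."""
    neighbors = adj.get(protein, set())
    if not neighbors:
        return 0.0
    hits = sum(1 for n in neighbors if group in annotations.get(n, set()))
    return hits / len(neighbors)


# Hypothetical toy network; protein "D" is the unannotated target.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
annotations = {
    "A": {"transport"},
    "B": {"transport"},
    "C": {"metabolism"},
    "E": {"metabolism"},
}
adj = build_graph(edges)

print(edge_clustering_coefficient(adj, "B", "D"))  # 0.5
print(edge_clustering_coefficient(adj, "D", "E"))  # 0.0
for group in ("transport", "metabolism"):
    print(group, neighborhood_ratio(adj, annotations, "D", group))
```

In this toy network, the edge B-D lies in one of two possible triangles, while D-E lies in none, so a filter based on this coefficient would keep B-D as part of a dense region and could discard D-E. Under the assumed ratio, two of the three neighbors of D carry the "metabolism" label, which would make it the higher-scoring candidate group for D.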

Abbreviations

BIND: Biomolecular Interaction Network Database

DIP: Database of Interacting Proteins

ECC: edge clustering coefficient

HCS: highly connected subgraphs

LNPC: Laplacian network partitioning correlations

MCODE: molecular complex detection

MIPS: Munich Information Center for Protein Sequences

NMF: non-negative matrix factorization

PPI: protein-protein interactions

RNCS: restricted neighborhood search clustering algorithm

SVM: support vector machine

Author information

Corresponding author

Correspondence to Subhadip Basu.

Cite this article

Saha, S., Chatterjee, P., Basu, S. et al. FunPred-1: Protein function prediction from a protein interaction network using neighborhood analysis. Cell Mol Biol Lett 19, 675–691 (2014). https://doi.org/10.2478/s11658-014-0221-5

  • DOI: https://doi.org/10.2478/s11658-014-0221-5

Keywords

  • Protein interaction network
  • Protein function prediction
  • Functional groups
  • Neighborhood analysis
  • Relative functional similarity
  • Edge clustering coefficient