Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure.

Garg, Aarti and Kaur, Harpreet and Raghava, G.P.S. (2005) Real value prediction of solvent accessibility in proteins using multiple sequence alignment and secondary structure. Proteins, 61 (2). pp. 318-24. ISSN 1097-0134

Official URL: http://onlinelibrary.wiley.com/doi/10.1002/prot.20...

Abstract

The present study is an attempt to develop a neural network-based method for predicting the real value of solvent accessibility from the sequence, using evolutionary information in the form of multiple sequence alignments. In this method, two feed-forward networks with a single hidden layer have been trained with standard back-propagation as the learning algorithm. The Pearson's correlation coefficient increases from 0.53 to 0.63, and the mean absolute error (MAE) decreases from 18.2% to 16%, when a multiple sequence alignment obtained from PSI-BLAST is used as input instead of a single sequence. The performance of the method further improves from a correlation coefficient of 0.63 to 0.67 when secondary structure information predicted by PSIPRED is incorporated in the prediction. The final network yields an MAE of 15.2% between the experimental and predicted values when tested on two different nonhomologous and nonredundant datasets of varying sizes. The method consists of two steps: (1) in the first step, a sequence-to-structure network is trained with the multiple alignment profiles in the form of PSI-BLAST-generated position-specific scoring matrices, and (2) in the second step, the output obtained from the first network and the PSIPRED-predicted secondary structure information are used as input to the second, structure-to-structure network. Based on the present study, a server, SARpred (http://www.imtech.res.in/raghava/sarpred/), has been developed that predicts the real value of solvent accessibility of residues for a given protein sequence. We have also evaluated the performance of SARpred on 47 proteins used in CASP6 and achieved a correlation coefficient of 0.68 and an MAE of 15.9% between predicted and observed values.
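The two figures of merit quoted in the abstract, Pearson's correlation coefficient and mean absolute error, can be illustrated with a minimal sketch. It assumes that observed and predicted accessibilities are given as per-residue percentages in two equal-length lists; the function names and the five sample values are hypothetical and are not taken from the paper, which does not state here whether scores were pooled over all residues or averaged per protein.

```python
import math


def pearson_correlation(observed, predicted):
    """Pearson's r between two equal-length sequences of numbers."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Mean of |observed - predicted|, in the same units as the inputs."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical relative solvent accessibility values (%), one per residue.
observed = [10.0, 45.0, 80.0, 25.0, 60.0]
predicted = [20.0, 40.0, 70.0, 35.0, 50.0]

r = pearson_correlation(observed, predicted)
mae = mean_absolute_error(observed, predicted)
print(f"r = {r:.2f}, MAE = {mae:.1f}%")
```

Because the inputs are percentages, the MAE is read directly in percentage points, which is how values such as 15.2% in the abstract are expressed.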

Item Type: Article
Additional Information: Copyright of this article belongs to Wiley
Uncontrolled Keywords: solvent accessibility; prediction; real value; neural network; multiple alignment; secondary structure
Subjects: Q Science > QR Microbiology
Depositing User: Dr. K.P.S.Sengar
Date Deposited: 09 Jan 2012 04:25
Last Modified: 09 Jan 2012 04:25
URI: http://crdd.osdd.net/open/id/eprint/167
