NAGbinder: An approach for identifying N-acetylglucosamine interacting residues of a protein from its primary sequence.

Patiyal, Sumeet and Agrawal, Piyush and Kumar, Vinod and Dhall, Anjali and Kumar, Rajesh and Mishra, Gaurav and Raghava, G.P.S. (2020) NAGbinder: An approach for identifying N-acetylglucosamine interacting residues of a protein from its primary sequence. Protein science : a publication of the Protein Society, 29 (1). pp. 201-210. ISSN 1469-896X

Full text not available from this repository.
Official URL: https://onlinelibrary.wiley.com/doi/full/10.1002/p...

Abstract

N-acetylglucosamine (NAG) is one of the eight essential saccharides required to maintain the optimal health and precise functioning of systems ranging from bacteria to humans. In the present study, we have developed a method, NAGbinder, which predicts the NAG-interacting residues in a protein from its primary sequence information. We extracted 231 NAG-interacting nonredundant protein chains from the Protein Data Bank, where no two sequences share more than 40% sequence identity. All prediction models were trained, validated, and evaluated on these 231 protein chains. At first, prediction models were developed on balanced data consisting of 1,335 NAG-interacting and noninteracting residues, using various window sizes. The Random Forest model that used binary profiles as the input feature for identifying NAG-interacting residues, with a window size of 9, performed best among all models. It achieved the highest Matthews Correlation Coefficient (MCC) of 0.31 and 0.25, and Area Under the Receiver Operating Characteristic curve (AUROC) of 0.73 and 0.70, on the training and validation data sets, respectively. We also developed prediction models on a realistic data set (1,335 NAG-interacting and 47,198 noninteracting residues) using the same approach, where the model achieved MCC of 0.26 and 0.27, and AUROC of 0.70 and 0.71, on the training and validation data sets, respectively. The success of our method can be appraised by the fact that, if a sequence of 1,000 amino acids is analyzed with our approach, 10 residues will be predicted as NAG-interacting, of which five are correct. The best models were incorporated in the standalone version and in the web server available at https://webs.iiitd.edu.in/raghava/nagbinder/.
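
To make the terms in the abstract concrete, the following is a minimal sketch of how a window-based binary profile and the MCC could be computed. It is not the authors' implementation. The function names, the padding symbol "X", and the choice to encode padding as an all-zero block (giving 9 x 20 = 180 features per residue) are assumptions made for illustration; the published method may encode terminal positions differently.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # assumed padding symbol for positions beyond the chain termini


def window_patterns(sequence, window=9):
    """Return one pattern per residue, centred on that residue."""
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    return [padded[i:i + window] for i in range(len(sequence))]


def binary_profile(pattern):
    """One-hot encode a pattern: 20 values per position, all zeros for padding."""
    vector = []
    for residue in pattern:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    patterns = window_patterns("ACDEFGHIK", window=9)
    features = [binary_profile(p) for p in patterns]
    print(patterns[0], len(features[0]))
    # The abstract's example: 10 residues predicted, five of them correct.
    print(precision(tp=5, fp=5))
```

Feature vectors of this kind, one per residue and labelled as interacting or noninteracting, would then be passed to a classifier such as Random Forest. The precision call mirrors the abstract's own example of five correct predictions out of ten.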

Item Type: Article
Additional Information: Copyright of this article belongs to Wiley.
Uncontrolled Keywords: Binary profile; Machine learning techniques; N-acetylglucosamine; NAG; PSSM profile
Subjects: Q Science > QR Microbiology
Depositing User: Dr. K.P.S.Sengar
Date Deposited: 05 Feb 2020 10:22
Last Modified: 05 Feb 2020 10:22
URI: http://crdd.osdd.net/open/id/eprint/2540
